Product Requirements Document (PRD): Applesauce Unified Licensing API

  1. Project Overview

Project Name: Applesauce

Vision: A unified architectural framework for audio licensing that bifurcates into two distinct products (SKUs) to address the diverging needs of human-to-human media licensing and human-to-machine AI training.

Status: Alpha / Proof of Concept (POC)

  2. Problem Statement

Current audio licensing is fragmented and fails to address two distinct market needs:

  1. Granular Media Licensing: Creators need to sell specific usage rights (TV, Social, Ads) with price discrimination.

  2. Mass AI Training: Developers need access to massive aggregate datasets without negotiating individual licenses, while creators deserve a fair share of the value.

  3. The Two-SKU Architecture

The system is divided into two distinct licensing engines:

3.1 SKU 1: Dataset Licensing (The "Weighted" Model)

Target: Media Producers, Game Developers, Ad Agencies

A user-specific, high-value license for a specific asset. Prices are determined by the "Five Knobs + Quality" framework.

The Four-Domain Engine: Categorizes every license request into one of four distinct domains:

  • A: Promotion: Use of the voice to sell, promote, or advertise (e.g., ads, influencer content).
  • B: Entertainment: Use of the voice in narrative or artistic content (e.g., games, film, TV).
  • C: Information & Utility: Functional or educational use (e.g., GPS, screen readers).
  • D: Synthetic & Transformative: Use as data to train models or create deep fakes/clones.

The Pricing Calculator (Base * Domain * Modifiers): Calculates the price from six variables defined in the codebase: Quality plus the five Knobs (Duration, Reach, Volume, Exclusivity, Identity):

  1. Quality: Q0 (Raw) to Q3 (Premium/Mastered).
  2. Duration: Temporal validity (30 Days to Perpetual).
  3. Reach: Geographic/Audience definition (Internal to Unlimited Digital).
  4. Volume: Intensity of use (Single Project to Unlimited Reuse).
  5. Exclusivity: Competitive restrictions (Non-Exclusive to Fully Exclusive).
  6. Identity: Trust/Verification level (Tier 1 OAuth to Tier 3 KYC).
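
The Base * Domain * Modifiers formula above can be sketched as follows. This is a minimal illustration, not the codebase's implementation: every multiplier value, the option keys (such as `30_days` or `tier_1`), and the names `LicenseRequest` and `calculate_price` are hypothetical placeholders chosen only to show how the domain and the six variables could combine.

```python
from dataclasses import dataclass
from decimal import Decimal

# Hypothetical multipliers; the real values live in the service config.
DOMAIN_MULTIPLIERS = {
    "A": Decimal("1.50"),  # Promotion
    "B": Decimal("1.25"),  # Entertainment
    "C": Decimal("1.00"),  # Information & Utility
    "D": Decimal("2.00"),  # Synthetic & Transformative
}

# One lookup table per variable (Quality plus the five Knobs).
# Only the endpoints named in the PRD are listed; values are assumptions.
KNOB_MULTIPLIERS = {
    "quality": {"Q0": Decimal("0.50"), "Q1": Decimal("1.00"),
                "Q2": Decimal("1.50"), "Q3": Decimal("2.00")},
    "duration": {"30_days": Decimal("1.00"), "perpetual": Decimal("3.00")},
    "reach": {"internal": Decimal("0.50"), "unlimited_digital": Decimal("2.00")},
    "volume": {"single_project": Decimal("1.00"), "unlimited_reuse": Decimal("2.50")},
    "exclusivity": {"non_exclusive": Decimal("1.00"), "fully_exclusive": Decimal("4.00")},
    "identity": {"tier_1": Decimal("1.00"), "tier_3": Decimal("1.00")},
}


@dataclass(frozen=True)
class LicenseRequest:
    base_price: Decimal
    domain: str
    quality: str
    duration: str
    reach: str
    volume: str
    exclusivity: str
    identity: str


def calculate_price(req: LicenseRequest) -> Decimal:
    """Return base * domain multiplier * product of the six modifiers."""
    price = req.base_price * DOMAIN_MULTIPLIERS[req.domain]
    for knob, table in KNOB_MULTIPLIERS.items():
        # Field names on LicenseRequest match the keys of KNOB_MULTIPLIERS.
        price *= table[getattr(req, knob)]
    return price.quantize(Decimal("0.01"))


example = LicenseRequest(
    base_price=Decimal("100"), domain="B", quality="Q3", duration="30_days",
    reach="internal", volume="single_project",
    exclusivity="non_exclusive", identity="tier_1",
)
print(calculate_price(example))  # 125.00
```

An unknown domain or option raises `KeyError` in this sketch; a production service would more likely validate the request and return a structured error.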

3.2 SKU 2: ML Training Licensing (The "Egalitarian" Model)

Target: Foundation Model Builders, AI Companies

A broad, aggregate license for using part of the platform's library for machine learning training.

  • Model: Flat/Egalitarian revenue share.
  • Split: 70% to Human Creators / 30% to Platform.
  • Logic: Revenue is distributed pro-rata based on the volume of data contributed to the training set, not the specific commercial value of an individual voice.
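
The pro-rata logic above can be illustrated with a short sketch. It assumes that contributed volume is measured in a single unit (for example, seconds of audio) and that payouts are rounded down to the cent, with the rounding remainder staying with the platform; neither assumption comes from this document, and `distribute_revenue` is a hypothetical name.

```python
from decimal import Decimal, ROUND_DOWN

CREATOR_SHARE = Decimal("0.70")  # 70% to creators, 30% to the platform


def distribute_revenue(total_revenue: Decimal, contributions: dict):
    """Split revenue pro-rata by contributed volume.

    contributions maps a creator id to the volume of data that creator
    contributed to the training set (unit is an assumption).
    Returns (payouts per creator, platform amount).
    """
    total_volume = sum(contributions.values())
    if total_volume <= 0:
        raise ValueError("training set has no contributed volume")

    creator_pool = total_revenue * CREATOR_SHARE
    payouts = {
        creator_id: (creator_pool * volume / total_volume).quantize(
            Decimal("0.01"), rounding=ROUND_DOWN
        )
        for creator_id, volume in contributions.items()
    }
    # Whatever is not paid out (the 30% share plus rounding remainders)
    # is retained by the platform in this sketch.
    platform_amount = total_revenue - sum(payouts.values())
    return payouts, platform_amount


payouts, platform = distribute_revenue(
    Decimal("1000"), {"alice": 300, "bob": 100}
)
print(payouts)   # {'alice': Decimal('525.00'), 'bob': Decimal('175.00')}
print(platform)  # 300.00
```

Note that the payout depends only on volume: two creators who contribute the same amount of data receive the same amount, regardless of the commercial value of either voice.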

  4. Functional Requirements

4.1 Revenue & Transparency

  • Dataset SKU: Transactional revenue. Creator sets base price (or algorithms suggest it); Platform takes a fee.
  • ML Training SKU: Aggregate revenue, split 70/30 between Creators and the Platform.
  • PDR (Personal Data Receipts): Machine-readable logs for creators detailing exactly which SKU was purchased and usage terms.
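
As an illustration of what a machine-readable PDR might contain, the sketch below serializes a receipt to JSON. The field names (`receipt_id`, `creator_id`, `sku`, `usage_terms`, `issued_at`), the SKU identifiers, and the `build_receipt` helper are assumptions made for this example; the document does not define the receipt schema.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional

# Hypothetical identifiers for the two SKUs.
VALID_SKUS = {"dataset", "ml_training"}


@dataclass(frozen=True)
class PersonalDataReceipt:
    receipt_id: str
    creator_id: str
    sku: str            # which SKU was purchased
    usage_terms: dict   # e.g. domain and knob settings, or training cohort
    issued_at: str      # ISO 8601 timestamp in UTC


def build_receipt(
    receipt_id: str,
    creator_id: str,
    sku: str,
    usage_terms: dict,
    issued_at: Optional[datetime] = None,
) -> str:
    """Return the receipt as a JSON string with stable key order."""
    if sku not in VALID_SKUS:
        raise ValueError(f"unknown SKU: {sku!r}")
    issued = issued_at or datetime.now(timezone.utc)
    receipt = PersonalDataReceipt(
        receipt_id=receipt_id,
        creator_id=creator_id,
        sku=sku,
        usage_terms=usage_terms,
        issued_at=issued.isoformat(),
    )
    return json.dumps(asdict(receipt), sort_keys=True)


print(build_receipt(
    "pdr-001", "creator-42", "dataset",
    {"domain": "B", "duration": "30_days", "exclusivity": "non_exclusive"},
))
```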

  5. Technical Requirements

5.1 Architecture

  • Umbrella API: Unified FastAPI entry point (main_api.py) routing to sub-services.
  • Service A (datasetsLicensing): Handles SKU 1 logic, pricing math, and config.
  • Service B (MLTrainingLicensing): Handles SKU 2 logic, cohort analysis, and viability checks.

5.2 Data Architecture

  • Metadata: Implementation of MLCommons Croissant for embedding legal/provenance data.
  • Storage: Distinction between specific asset delivery (Dataset SKU) and bulk corpus access (Training SKU).

  6. Legal & Compliance Framework

6.1 Minimum Contract Modules

  • Grant of Rights: Specific to the SKU (Individual License vs Data Consent).
  • ELVIS Act: Consent clauses for Voice Identity use (Critical for Synthetic Domain).
  • Liability: Capped at 12 months of fees.

  7. Roadmap & Success Metrics

  • Alpha Goal: Deploy both SKUs via the Unified API.
  • Metric: 100% PDR delivery rate.
  • Latency: Pricing calculation < 200ms.