Documentation

Build with Yellowforge

Everything you need to generate images and video, wire up a workflow, pick the right model, and spend credits well. Read top to bottom, or jump to a section from the sidebar.

Getting started

Yellowforge is an AI generation studio. Describe what you want, pick a model, and get back an image or a video. This section takes you from a fresh account to your first result.

Create your account

You can sign up two ways:

  • Email and password. Register at the sign up page with a username, your email, and a password.
  • Google. One click, no password to remember.

Claim your free credits

Every new account gets a one-time grant of 25 credits at signup. No card required. That is enough for a handful of images or a short video, depending on the model and resolution you choose. Your live balance shows in the top right of the nav at all times.

Your first generation

  1. Open the Image studio

    Go to Image in the top nav.

  2. Write a prompt

    Describe the shot in plain language. A clear subject and concrete nouns beat long adjective lists.

  3. Pick a model

    The model chip sits next to the prompt. To generate from text alone, choose a text-to-image model. The Models section below explains which is which.

  4. Set resolution and count, then Generate

    The cost in credits is shown before you confirm. Press Generate and your result appears in a moment.

Where things live

Everything you generate is saved to your Library, along with any images you upload. From there you can preview at full size, reuse a result as a reference, or retry a generation, which brings back the exact settings you used. The studios are at Image and Video; the node-based workspace and Metaforge each get their own section below.

The studios

The studios are where you generate. The Image studio makes stills, the Video studio makes motion. Both share the same idea: a prompt, an optional set of references, a model, and a Generate button.

Image studio

Open Image. The prompt bar holds everything you need for a single generation: the prompt, the model chip, aspect ratio and resolution, and a count for how many variations to make at once.

The Image studio: prompt bar at the bottom, results on the canvas.

Video studio

Open Video. The left rail is where you pick a source and model; the prompt describes the motion.

  • Image to video. Start from a still, then write how it should move, leading with verbs: how the camera moves, how the subject moves.
  • Video to video. Start from a short clip and restyle or edit it while keeping the original motion. A negative prompt helps here: name what you do not want to see.

Prompts and references

Many models let you attach reference images to steer the result. There are two roles: a character, the subject to carry into the output, and an element, a scene, background, or style to draw from. Pull references from your Library or upload them on the spot.

Some models require at least one reference to run, and others ignore references because they generate from text only. The Models section spells out which is which. If you want a detailed, structured prompt built from your references, the JSON prompt generator can build one and attach the references back to the prompt bar in a single step.

Resolution and batches

Resolution affects both quality and credit cost, and the ratios on offer depend on the model. Where a model supports it, set a count above one to generate several variations in a single run. The credit cost for the current settings is always shown before you confirm, and it updates live as you change the model or resolution.

Retries and failures

Each studio keeps a running history of your generations, and everything is saved to your Library. Open any result at full size in the preview, or use Retry to rerun it. Retry rehydrates the exact settings behind a generation: the prompt, the model, the references, and the resolution. Tweak one thing and run it again without rebuilding the whole setup.

If a generation fails, you are not charged for the failed attempt. Read the error shown on the result, adjust, and run again.

Characters

A character is a saved, reusable identity: a face Yellowforge keeps consistent across generations. Build it once, then use it in any prompt and get the same person back, scene after scene, without re-attaching reference photos. Every character is private to your account.

What a character is

A character is two things: a name and a reference sheet, the set of photos that define the look. When you use a character, Yellowforge supplies those photos to the model and weaves the name and description into your prompt, so the identity carries from one generation to the next. The AI-influencer case is the obvious one: the same face across hundreds of scenes.

Creating a character

Open Characters in the top nav and choose New character. A few ways to build one:

  1. From your library

    Pick clear face photos you have already generated or uploaded. No cost. A mix of front and 3/4 angles works best.

  2. Generate in the studio

    No photos yet? Head to the Image studio, generate face shots, then save them as a character.

  3. Save from a generation

    On any image you have made in a studio, open the cell menu and choose Save as character. Zero cost: the image already exists.

The photos become the reference sheet; the first is the primary and carries the most weight. Names are 2 to 30 characters long and may use letters, numbers, underscores, and dots. The dice button rolls an invented name for you.
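If you keep a list of character names outside Yellowforge and want to check them before typing them in, the naming rule can be sketched as a small check. This is an illustration only, not Yellowforge's own validation: the function name is made up, and it assumes "letters" means ASCII letters, which the rule above does not spell out.

```python
import re

# Assumption: "letters" means ASCII A-Z and a-z. The docs do not say
# whether accented or non-Latin letters are accepted.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_.]{2,30}")


def is_valid_character_name(name: str) -> bool:
    """Return True if the name is 2 to 30 characters long and uses only
    letters, digits, underscores, or dots."""
    return NAME_PATTERN.fullmatch(name) is not None


print(is_valid_character_name("mira_vale.01"))  # True
print(is_valid_character_name("x"))             # False: too short
print(is_valid_character_name("mira vale"))     # False: spaces are not allowed
```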

Using one in a generation

In the Image or Video studio, click the Character pill next to the model chip and pick your character. A “Character: [name]” chip shows it is locked in. Now prompt as normal, for example “[name] on a beach at golden hour”, and generate. You get the same person without attaching anything. Press the × on the chip to clear it.

Retry on a result keeps the character locked, so you can tweak the prompt and run it again without re-picking.

Which models use it

You pick a character the same way for every model; Yellowforge routes it automatically:

  • Reference-capable image models (Nano Banana Pro, Seedream, Wan, Qwen) receive the character’s photos directly, primary first, up to that model’s reference limit, plus the name and description in the prompt.
  • Text-to-image models (GPT Images 2, Z-Image-Turbo) have no reference slots, so the character’s name and description are woven into the prompt instead.
  • Video uses the name and description to steer the motion prompt; for video-to-video the reference sheet is added to the optional references.

Nano Banana Pro holds a face the most reliably; reach for it when identity matters most. See Models for the full lineup.
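To make the routing rules above concrete, here is a rough sketch of what each kind of model ends up receiving. It is a mental model, not Yellowforge's implementation: the `Character` class, the `route_character` function, the model-kind labels, and the `prompt_hint` field are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Character:
    name: str
    description: str
    photos: List[str] = field(default_factory=list)  # primary photo first


def route_character(character: Character, model_kind: str,
                    ref_limit: Optional[int] = None) -> dict:
    """Return what a model would receive for a character.

    model_kind is a hypothetical label: "reference_image", "text_to_image",
    "image_to_video", or "video_to_video". A ref_limit of None means no cap.
    """
    # The name and description reach the prompt in every case.
    payload = {
        "prompt_hint": f"{character.name}: {character.description}",
        "references": [],
    }
    if model_kind in ("reference_image", "video_to_video"):
        # Photos go in primary first, trimmed to the model's reference limit.
        payload["references"] = character.photos[:ref_limit]
    elif model_kind not in ("text_to_image", "image_to_video"):
        raise ValueError(f"unknown model kind: {model_kind}")
    return payload


mira = Character("mira_vale", "short dark hair, green jacket",
                 ["front.png", "three_quarter.png", "profile.png"])
print(route_character(mira, "reference_image", ref_limit=2)["references"])
# ['front.png', 'three_quarter.png']
print(route_character(mira, "text_to_image")["references"])
# []
```

The point of the sketch is the ordering: because the list is cut from the front, the primary photo always survives a tight reference limit.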

Tips for a consistent face

  • Use three or more clear, well-lit face shots.
  • Mix front and 3/4 angles; avoid sunglasses, heavy filters, and extreme expressions.
  • Make the primary image your best, sharpest shot; it carries the most weight.
  • Keep the name invented so the model does not bias toward a real person.

The Forge (coming soon)

The Forge is where Yellowforge is headed, and the part we are most excited to ship. Instead of one prompt and one output, you build a visual pipeline: nodes that generate, edit, and combine media, wired together so the result of one step feeds the next. A workflow you build once can be saved, reused, and shared. Here is a preview of how it will work.

What The Forge is

When it lands, you will reach it from Studio in the top nav. The canvas is an infinite board you pan and zoom: drop nodes onto it, connect them left to right, and run the whole graph at once. Each node does one job, and the power comes from chaining them.

This is what will set Yellowforge apart from a single prompt box. A model is chosen inside a node, not wired in, so the graph stays about the work: where media comes from, what transforms it, and what comes out the other end. One canvas, from first idea to finished shot.

Starting from a preset

The fastest way into The Forge is a preset. A preset is a ready-made graph, already wired and ready to run. Open The Forge on a blank canvas and you are met with a gallery of them, grouped by what you are making: portraits, product shots, editorial sets, and video.

Pick one and the whole graph drops onto the canvas in a single click. From there you swap in your own prompt or reference image and run it, so you are never staring at an empty board wondering where to start. You can reopen the gallery any time from the presets button in the canvas toolbar.

  • Cinematic portrait. A prompt into a single hero image with film lighting.
  • Character turnaround. One description into four matching angles of the same person.
  • Product hero and product variations. Drop in a product photo and place it on clean studio sets, one or several at once.
  • Editorial spread. A prompt into three variations to choose from.
  • Image to video and brand sting. Bring a still to life with a camera move, or turn a logo into a short animated sting.

Node types

Every node fits one of four roles. Source nodes bring data in, generation nodes make new media, edit nodes transform existing media, and prompt nodes shape the instructions the generators read.

Node        | What it does                                               | In              | Out
Text        | Holds a written prompt. A starting point for a generation. | None            | Prompt
Upload      | Brings your own image or video onto the canvas.            | None            | Media
JSON Prompt | Turns a plain prompt into a structured, detailed one.      | Prompt          | Prompt
Image       | Generates an image with the model you pick inside it.      | Media or prompt | Media
Video       | Generates video from a still or a prompt.                  | Media or prompt | Media
Inpaint     | Replaces a masked region of an image.                      | Media           | Media
Remove BG   | Cuts the background out of an image.                       | Media           | Media
Upscale     | Enlarges media to a higher resolution.                     | Media           | Media

Spawn a node fast with a single key: T for Text, I for Image, V for Video, J for JSON Prompt, and U for Upload. You can also right-click the canvas for the full node menu.

Connecting nodes

Drag from a node's output port to another node's input port to wire them. Data flows left to right, and each node transforms what reached it. Only two kinds of data travel a wire:

  • Media. An image or a video.
  • A prompt. A text instruction.

A wire is allowed only when the source's output matches what the target accepts. A Text node's prompt can feed an Image node; an Image node's media can feed an Upscale node. The canvas refuses a connection that would not type-check, so you cannot wire a graph that can never run.

A prompt becomes an image, and the image is enlarged. Data flows left to right.
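The connection rule is easy to reason about once the node table is written down as data. The sketch below transcribes the In and Out columns from that table and checks a wire, or a whole chain, against them. The Forge has not shipped, so treat this as a way to think about the rule rather than a description of the canvas's code; `NODE_PORTS`, `can_connect`, and `chain_is_valid` are names invented for the example.

```python
from typing import List

# Ports per node type, transcribed from the node table.
# Each entry is (input kinds the node accepts, output kind it produces).
NODE_PORTS = {
    "Text":        (set(),               "prompt"),
    "Upload":      (set(),               "media"),
    "JSON Prompt": ({"prompt"},          "prompt"),
    "Image":       ({"media", "prompt"}, "media"),
    "Video":       ({"media", "prompt"}, "media"),
    "Inpaint":     ({"media"},           "media"),
    "Remove BG":   ({"media"},           "media"),
    "Upscale":     ({"media"},           "media"),
}


def can_connect(source: str, target: str) -> bool:
    """A wire is allowed when the source's output is a kind the target accepts."""
    _, produced = NODE_PORTS[source]
    accepted, _ = NODE_PORTS[target]
    return produced in accepted


def chain_is_valid(chain: List[str]) -> bool:
    """Check every adjacent pair in a left-to-right chain of nodes."""
    return all(can_connect(a, b) for a, b in zip(chain, chain[1:]))


print(can_connect("Text", "Image"))     # True: a prompt can feed an Image node
print(can_connect("Text", "Upscale"))   # False: Upscale only accepts media
print(chain_is_valid(["Text", "JSON Prompt", "Image", "Upscale"]))  # True
```

Source nodes have an empty input set, which is why nothing can be wired into a Text or Upload node.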

Saving workflows

A workflow is the whole graph: every node, its settings, and the wires between them. Workflows are saved as you work and each one has its own address, so you can keep several on the go and reopen any of them later from your Library. Reopening a workflow restores the canvas exactly as you left it.

The handful of shortcuts worth knowing:

  • Ctrl+D duplicates the selected nodes.
  • Del removes the selected nodes.
  • Ctrl+Shift+Enter runs the whole workflow.

Sharing workflows

Because a saved workflow is a self-contained recipe, it is the natural unit to pass to other people. A workflow that reliably turns a brief into a finished shot is worth far more shared than kept. That is where the community comes in: a place to publish workflows and start from ones other builders have already proven out.

Common patterns

A few workflows you will reach for often, each just a short chain of nodes:

  • Idea to finished still. Text into Image into Upscale. Write the shot, generate it, then enlarge the result to a print-ready resolution.
  • Product into a new scene. Upload into Remove BG into Image. Bring in a product photo, cut its background, then composite it into a fresh setting.
  • Still to motion. Image (or Upload) into Video. Generate or import a still, then animate it in the Video node.
  • Richer prompts. Text into JSON Prompt into Image. Turn a loose brief into a structured prompt before it reaches the generator.

Models

Yellowforge runs several image and video models, each with its own strengths and its own prompting style. This section is the map: what each one is for, and whether it needs reference images.

The image lineup

Model                   | Type            | References | Best for
Nano Banana Pro         | Edit, with refs | Up to 14   | Premium quality and identity. The strongest choice when a face or subject must hold across a scene.
GPT Images 2            | Text to image   | None       | Clean, instruction-following images from a prompt alone.
Z-Image-Turbo           | Text to image   | None       | The fastest option. Sub-second and photorealistic.
Seedream 4.5            | Edit, with refs | Required   | Cinematic, atmospheric results from a reference.
Wan 2.7 / Wan 2.7 Pro   | Edit, with refs | Required   | Literal, controllable edits. Pro trades a little speed for more fidelity.
Qwen 2.0 / Qwen 2.0 Pro | Edit, with refs | Required   | Concise prompts, dependable edits. Pro is the higher tier.

The video lineup

  • Wan 2.2 Turbo Spicy I2V. Image to video. Give it a source still and lead the prompt with motion: how the subject and camera move.
  • Wan 2.7. Video to video. Takes an existing clip and restyles or edits it while preserving the original motion.

Picking a model

  • Starting from text? Z-Image-Turbo for speed, GPT Images 2 for instruction following.
  • Keeping a face or product consistent? Nano Banana Pro, with reference images attached.
  • Want a cinematic look from a reference? Seedream 4.5.
  • Need a precise, literal edit? Wan or Qwen, and the Pro tier when fidelity matters more than speed.

Reference-image models

The image models split into two groups. Text to image models generate from a written prompt alone and ignore any references you attach. Edit models are built to transform reference images and need at least one to run. Reach for a text-to-image model when starting from nothing, and an edit model when you have a face, product, or scene to build on.
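The split between the two groups can be summed up as a lookup. The sketch below encodes the Type and References columns of the image lineup and reports what would happen to a given number of attached references. It is illustrative only: `MODELS` and `reference_status` are invented names, treating Nano Banana Pro as needing at least one reference follows the general statement about edit models rather than an explicit line in the table, and the "over the limit" outcome is an assumption about how the cap of 14 is enforced.

```python
from typing import Optional

# (kind, reference cap) per model, transcribed from the image lineup.
# A cap of None means the docs state no upper limit.
MODELS = {
    "Nano Banana Pro": ("edit", 14),
    "GPT Images 2":    ("text", 0),
    "Z-Image-Turbo":   ("text", 0),
    "Seedream 4.5":    ("edit", None),
    "Wan 2.7":         ("edit", None),
    "Wan 2.7 Pro":     ("edit", None),
    "Qwen 2.0":        ("edit", None),
    "Qwen 2.0 Pro":    ("edit", None),
}


def reference_status(model: str, attached: int) -> str:
    """Describe how a model treats a number of attached references."""
    kind, cap = MODELS[model]
    if kind == "text":
        # Text-to-image models generate from the prompt alone.
        return "references ignored" if attached else "ok"
    if attached == 0:
        return "needs at least one reference"
    if cap is not None and attached > cap:
        return f"over the limit of {cap}"  # assumed behavior, see lead-in
    return "ok"


print(reference_status("Z-Image-Turbo", 2))    # references ignored
print(reference_status("Seedream 4.5", 0))     # needs at least one reference
print(reference_status("Nano Banana Pro", 3))  # ok
```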

Resolution support

The resolutions and aspect ratios available depend on the model, and you set them per generation in the studio. Higher resolution costs more credits. For video, duration and resolution both factor into the cost. Whatever the model and settings, the exact cost is shown before you generate; see Credits and billing.

Motion control

Motion-control models take two inputs: a character image and a driving video. The model transfers the driver's movement onto the character, so you can animate any subject by acting out the motion in front of a camera, or by using any existing clip whose motion you want to reuse.

Think of the character image as who or what and the driving video as how they move. The output runs as long as the driver, sounds like the driver if you keep its audio, and looks like the character. Two motion-control models are live: Seedance 2.0 (Fast and Standard tiers, 4 to 15 seconds) and Kling v2.6 Pro (3 to 30 seconds).

Seedance 2.0

Reference-to-video with native audio synthesis and phoneme-level lip-sync in eight or more languages. The character can speak, sing, or perform any motion the driver shows, and the output carries clean synced audio out of the box. Two tiers ship: Fast for quick iteration and lower credit cost, Standard for higher quality at a small premium.

Best for. Talking-head clips, performance transfers, character animation with lip-sync, and any case where you want native sound on the output rather than the driver's original audio.

Limits. Output length is 4 to 15 seconds, set in the studio. Resolution is 480p or 720p. The character image and driver should each be a single clear subject; complex multi-subject sources transfer poorly.

  1. Pick the tier

    In the studio, choose Seedance 2.0 Fast or Seedance 2.0 Standard. Standard costs slightly more and looks slightly better; Fast is the right default for first drafts.
  2. Upload the character image

    The subject who will perform the motion. A clear, well-lit full-body or torso shot works best.
  3. Upload the driving video

    Any clip whose motion you want to transfer. Up to 15 seconds.
  4. Set the duration

    Pick a number of seconds between 4 and 15. The live cost estimate updates as you change it.
  5. Optional: refine with a prompt

    Add a short prompt to nudge the scene, lighting, or style. The prompt is optional; the model already knows the character and motion from the inputs.
  6. Toggle audio

    Native audio with lip-sync is on by default. Turn it off in the Motion control card for a silent output.
  7. Click Generate

    The cell shows queued, then running, then the finished clip.

Kling v2.6 Pro

Long-form motion transfer up to 30 seconds. The driving video sets the output length directly, and the model can match either the character image's framing or the driver's framing depending on which you pick. Driver audio is preserved by default so the output sounds like the source clip.

Best for. Longer performance transfers, dance and movement clips, anything where you want the output framed the same as the driver (camera angle, aspect ratio) but with a different subject. Strong choice when the driver already has the audio you want on the output.

Limits. 3 to 30 seconds, taken from the driver clip's length. No separate text-to-video generation; both inputs are required.

  1. Upload the character image

    A single clear subject. Same rules as Seedance: clear lighting, uncluttered background.
  2. Upload the driving video

    Up to 30 seconds. The cost estimate uses this clip's duration, so the price you see is the price you pay.
  3. Pick a character orientation

    Match video keeps the driver's framing and aspect ratio. Best when the driver is well-composed footage and you want the output to look like the same shot with a different subject.

    Match image uses the character image's framing. Best when the driver is a rough motion reference but the output should be styled and framed like the character.

  4. Decide on driver audio

    Keep driver audio is on by default. Turn it off to strip the source audio for a silent output.
  5. Optional: positive and negative prompts

    A positive prompt refines the scene; a negative prompt (up to 2500 characters) lists things to avoid. Both are optional.
  6. Click Generate

    Output is one mp4 at the driver's duration.
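The two models handle clip length differently, which is worth checking before you prepare a driver. The sketch below restates the documented limits: Seedance 2.0 takes a duration you set (4 to 15 seconds) with a driver of up to 15 seconds, while Kling v2.6 Pro takes its length from the driver (3 to 30 seconds). It is a planning aid under those stated limits, not the studio's validation code, and `MotionModel` and `output_seconds` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MotionModel:
    name: str
    min_seconds: float
    max_seconds: float
    duration_from_driver: bool  # True when the driver clip sets the output length


SEEDANCE = MotionModel("Seedance 2.0", 4, 15, duration_from_driver=False)
KLING = MotionModel("Kling v2.6 Pro", 3, 30, duration_from_driver=True)


def output_seconds(model: MotionModel, driver_seconds: float,
                   requested_seconds: Optional[float] = None) -> float:
    """Return the output length in seconds, or raise ValueError if the
    inputs fall outside the model's documented limits."""
    if driver_seconds > model.max_seconds:
        raise ValueError(f"{model.name}: driver is longer than {model.max_seconds} s")
    if model.duration_from_driver:
        seconds = driver_seconds
    elif requested_seconds is None:
        raise ValueError(f"{model.name}: pick a duration in the studio")
    else:
        seconds = requested_seconds
    if not model.min_seconds <= seconds <= model.max_seconds:
        raise ValueError(
            f"{model.name}: {seconds} s is outside "
            f"{model.min_seconds} to {model.max_seconds} s"
        )
    return seconds


print(output_seconds(KLING, driver_seconds=22))                          # 22
print(output_seconds(SEEDANCE, driver_seconds=10, requested_seconds=8))  # 8
```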

Credits and billing

Credits are how you pay for generations. Every image and video spends credits; you get some free to start, and you top up with a one-time pack or a subscription when you need more.

How credits work

Each generation costs a number of credits set by the model and the resolution or duration you choose. A fast text-to-image model costs little; a long, high-resolution video costs more. The exact cost is shown before you confirm, and credits are spent at the moment you generate. Your live balance sits in the top right of the nav.

Each generation draws from your balance and returns a result.

The free signup grant

Each new account receives a one-time grant of 25 credits at signup. It is a single grant, not a daily refill, and it needs no card. Use it to try the models and find your workflow before you pay anything.

One-time packs

Credit packs are a single purchase that adds credits to your balance. Buy one when you want more credits without an ongoing commitment. Packs never expire, so a pack you buy today is still there next month. Sizes and prices are on the pricing section of the home page, and checkout is handled securely by Stripe.

Subscriptions

A plan grants credits on a recurring basis and suits regular use. Pay monthly and your plan's credits are granted each month when the invoice is paid; pay yearly and you receive a full year of credits up front, twelve times the monthly amount, at a lower effective rate.
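The difference between the two billing cycles is only in when the credits arrive. A short worked example, using a made-up plan of 500 credits a month (real plan sizes are on the pricing section of the home page) and a hypothetical `credits_granted` helper:

```python
def credits_granted(monthly_credits: int, billing: str, invoices_paid: int) -> int:
    """Total credits granted after a number of paid invoices.

    Monthly billing grants one month of credits per paid invoice.
    Yearly billing grants twelve months of credits per paid invoice, up front.
    """
    if invoices_paid < 0:
        raise ValueError("invoices_paid cannot be negative")
    if billing == "monthly":
        return monthly_credits * invoices_paid
    if billing == "yearly":
        return monthly_credits * 12 * invoices_paid
    raise ValueError(f"unknown billing cycle: {billing}")


PLAN = 500  # hypothetical plan size, for illustration only

print(credits_granted(PLAN, "monthly", 3))  # 1500 after three monthly invoices
print(credits_granted(PLAN, "yearly", 1))   # 6000 on day one of a yearly plan
```

After twelve paid monthly invoices the totals match; the yearly plan simply puts the whole amount in your balance at the start.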

Subscriptions are managed through the Stripe customer portal, where you update your card, download receipts, and cancel. Upgrading to a higher tier takes effect immediately. Downgrading or cancelling takes effect at the end of your current billing period, and you keep your access and credits until then.

When credits expire

They do not. There is no rolling expiry and no cap on how many you can hold. Credits from packs and plans stack and stay until you spend them.
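Put together, the credit rules behave like a simple running balance: grants stack, nothing expires, and only successful generations reduce the total. The sketch below models that net effect. It is not how Yellowforge stores balances; the `CreditBalance` class is invented for the example, the pack size and costs are made-up numbers, and refusing a generation that costs more than the balance is an assumption the docs do not state.

```python
class CreditBalance:
    """A running credit total with no expiry and no cap."""

    def __init__(self, signup_grant: int = 25):
        # Every new account starts with the one-time signup grant.
        self.balance = signup_grant

    def add(self, credits: int) -> None:
        """Add credits from a pack or a plan grant. Grants stack."""
        if credits <= 0:
            raise ValueError("credits must be positive")
        self.balance += credits

    def generate(self, cost: int, succeeded: bool) -> bool:
        """Record a generation. A failed attempt leaves the balance unchanged."""
        if cost > self.balance:
            # Assumption: a generation you cannot afford does not start.
            raise ValueError("not enough credits")
        if succeeded:
            self.balance -= cost
        return succeeded


account = CreditBalance()
account.add(200)                       # hypothetical pack size
account.generate(4, succeeded=True)    # hypothetical cost of one image
account.generate(30, succeeded=False)  # failed video: not charged
print(account.balance)                 # 221
```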

Metaforge

Metaforge prepares a file for export. It writes a realistic set of capture metadata onto an image or video so the exported file carries the kind of details a camera would record.

What Metaforge does

When you export a file, Metaforge re-encodes it and writes a fresh metadata block: the device and lens it should read as, a capture time, and a location if you set one. The result is a single file you download, ready to use.

Supported metadata fields

  • Device and lens. Pick from preset profiles so the metadata matches a known camera.
  • Capture time. Set when the file should read as having been captured.
  • Location. Optionally write latitude and longitude.
  • Output quality and size. Control the encode quality, or aim for a target file size or maximum dimension.

How to use it

Open Metaforge. The page has three panels: Source, where you pick the image or video and its dimensions are read in the browser; Settings, where you choose the device profile and the details to write; and Report, where you run the forge, review what was written, and download the result.

FAQ

Short answers to the questions people ask most. If something is not here, the sections above go deeper, and you can always reach support.

Account and access

What does it cost to start?

Nothing. New accounts get a one-time grant of 25 free credits, with no card required. You only pay when you want more.

Do I need to install anything?

No. Yellowforge runs in the browser. Sign in and you are ready to generate.

Is there an API?

Not yet. Yellowforge is used through the studios and the workspace in the browser today.

Credits and billing

Do my credits expire?

No. There is no rolling expiry and no cap. Credits from packs and plans stack and stay until you spend them.

What is the difference between a pack and a plan?

A pack is a one-time purchase that adds credits to your balance. A plan is a subscription that grants credits each billing period. Both are covered under Credits and billing.

How do I cancel?

Through the Stripe customer portal. Cancellation takes effect at the end of your current period, and you keep the credits already in your balance.

Generations and quality

Which model should I use?

Use a text-to-image model when starting from a written idea, and an edit model when you have a reference to build on. The Models section breaks down each one.

Can I use my own images as references?

Yes. Upload them in the studio or pick from your Library. Note that text-to-image models do not use references.

Who owns what I generate?

Ownership and usage are governed by the Terms of service. Read those for the full terms that apply to your generations.

Workflows and sharing

What is a workflow?

A workflow is a graph of connected nodes in the workspace: where media comes from, what transforms it, and what comes out. It is saved as you build it and can be reopened later.

Can I share a workflow?

Community sharing is taking shape alongside the workspace redesign. Build your pipeline as a clean, reusable workflow now and it will be ready to share when that lands.

Troubleshooting

A generation failed. Was I charged?

No. You are not charged for a failed attempt. Read the error on the result, adjust, and run again.

My video upload was rejected.

Check the format, file size, and clip length against the limits the Video studio shows for the selected model. Sources are validated before upload.

Still stuck?

Email contact@yellowforge.app and we will help.
