Forge Tools Mastery Codex: Advanced Prompting Strategies for AI Image and Video Generation

  • Writer: Albert Landry
  • Mar 15
  • 5 min read

Effective prompting is the foundation of professional-grade AI images and videos. This expanded codex covers prompting strategies for each forge, including multimodal capabilities, audio integration, and advanced techniques derived from research. The common thread is structured frameworks, iterative refinement, and tool-specific features, which together give users tighter control over output quality. The guide focuses on practical use, including complex inputs such as reference images, videos, and audio, and provides examples and best practices for each tool.


OpenForge — OpenAI: gpt-image-1.5


OpenForge, powered by OpenAI's gpt-image-1.5 (aligned with DALL-E 3 capabilities), excels in generating detailed images from text. It supports high-resolution outputs and handles complex compositions, but performs best with clear, structured prompts that avoid ambiguity.


Advanced Prompting Principles

  • Clarity and Specificity: Use precise language with descriptive adjectives for subjects, avoiding vague terms. Prioritize key elements first, as order influences emphasis.

  • Structural Framework: Organize as subject + details + mood + style + technical specs. Include composition (e.g., rule of thirds) and lighting (e.g., golden hour) for cinematic results.

  • Text Integration: For embedded text, use short phrases in quotes and specify style (e.g., "bold sans-serif"). Limit to simple elements to avoid distortion.

  • Iteration and Experimentation: Start with a basic prompt and refine it iteratively, varying keywords to explore alternatives. Newer models such as this one respond better to plain natural language than to flowery prose.

  • Limitations and Workarounds: Avoid overloading with too many details; balance for coherence. For consistent results, reference artistic styles or real-world analogies.
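
The structural framework above (subject + details + mood + style + technical specs) can be sketched as a small Python helper. This is only an illustration, not part of any OpenAI SDK: the `ImagePrompt` class and its field names are hypothetical, and the result is a plain string to paste into the forge.

```python
from dataclasses import dataclass


@dataclass
class ImagePrompt:
    """Hypothetical container for the parts of a structured image prompt."""
    subject: str
    details: str = ""
    mood: str = ""
    style: str = ""
    technical: str = ""
    embedded_text: str = ""  # short phrase to render inside the image
    text_style: str = ""     # e.g. "bold sans-serif"

    def render(self) -> str:
        # Key elements go first, because order influences emphasis.
        parts = [self.subject, self.details, self.mood, self.style, self.technical]
        if self.embedded_text:
            # Embedded text stays short and is wrapped in quotes.
            text = f'the text "{self.embedded_text}"'
            if self.text_style:
                text += f" in {self.text_style}"
            parts.append(text)
        # Skip empty fields so the prompt has no stray commas.
        return ", ".join(p.strip() for p in parts if p.strip()) + "."


prompt = ImagePrompt(
    subject="A serene mountain lake at dawn",
    details="mist rising from the water",
    style="photorealistic",
    technical="4K",
)
print(prompt.render())
```

Keeping each part in its own field makes iteration easier: change one field, re-render, and compare results without retyping the whole prompt.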


Example Prompts

  • Basic: "A serene mountain lake at dawn, mist rising from the water, photorealistic, 4K."

  • Advanced: "Hyper-detailed portrait of an elderly inventor in a cluttered workshop, warm incandescent lighting casting long shadows, steampunk style with intricate gears, shallow depth of field, cinematic composition."


DreamForge — ByteDance: Seedream 4.5 (Default) and Seedream 5 (Toggle)


DreamForge uses ByteDance's Seedream models for high-fidelity image generation and advanced editing. Seedream 4.5 balances speed and quality, while Seedream 5 introduces multi-step reasoning and example-based edits for finer control.


Advanced Prompting Principles

  • Natural Language Structure: Use complete sentences (30-100 words) in the format: subject + action + environment + style + technical details. Prioritize key elements early for emphasis.

  • Editing Techniques: For inpainting/outpainting, use natural instructions like "replace the sky with a starry night." In Seedream 5, incorporate examples: "edit like [reference image], changing colors to autumn tones."

  • Multimodal Enhancements (Seedream 5): Leverage real-time web retrieval for current events or styles. Use reference images for consistency in multi-image grids or sequential creations.

  • Optimization Tips: Avoid keyword lists; opt for prose. For posters/grids, specify layout (e.g., 2x2) and text placement. Negate unwanted elements if supported.

  • Iteration Strategies: Build prompts modularly; test with shorter versions before expanding. For complex edits in Seedream 5, spell out the steps in order (Chain-of-Thought style).

Feature  | Seedream 4.5                 | Seedream 5
Editing  | Basic inpainting/outpainting | Example-based, multi-reference
Length   | 30-100 words optimal         | Supports longer for reasoning
Strength | Balanced performance         | Deep domain knowledge, retrieval
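
The length and "prose over keyword lists" guidance for Seedream can be checked before a prompt is submitted. The sketch below is a rough pre-flight check in Python. The function name is hypothetical, and the keyword-list heuristic (many short comma-separated fragments) and its thresholds are assumptions made for illustration, not anything documented by ByteDance.

```python
def check_seedream_prompt(prompt: str, min_words: int = 30, max_words: int = 100) -> list[str]:
    """Return a list of warnings; an empty list means the prompt passes these checks."""
    warnings = []
    words = prompt.split()
    if len(words) < min_words:
        warnings.append(f"too short: {len(words)} words (aim for at least {min_words})")
    elif len(words) > max_words:
        warnings.append(f"too long: {len(words)} words (aim for at most {max_words})")

    # Assumed heuristic: four or more comma-separated fragments averaging
    # fewer than three words each look like tags rather than sentences.
    fragments = [f for f in prompt.split(",") if f.strip()]
    if len(fragments) >= 4 and len(words) / len(fragments) < 3:
        warnings.append("reads like a keyword list; rewrite as full sentences")
    return warnings


for warning in check_seedream_prompt("portrait, 4k, sharp, moody, neon"):
    print(warning)
```

For Seedream 5, where longer prompts are supported, `max_words` can simply be raised when calling the function.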

Example Prompts

  • Seedream 4.5: "A professional headshot of a young executive in a modern office, soft natural lighting, neutral gray background, high detail, realistic."

  • Seedream 5: "Generate a triptych: left panel like [reference sunrise image], center hiking scene, right summit view; warm tones, hyperrealistic."


FluxForge — Black Forest Labs: Flux 2 Pro (Default) and Flux 2 Max (Toggle)


FluxForge employs Black Forest Labs' Flux 2 models for state-of-the-art image synthesis. Pro offers efficient generation, while Max provides enhanced detail and multi-reference consistency.


Advanced Prompting Principles

  • Prose Over Keywords: Craft descriptive narratives without negatives; focus on desired elements. Structure: Main subject first, then attributes, style, and specs.

  • Composition and Guidance: Specify asymmetry or golden ratio. Adjust guidance scale: higher (3.5+) for adherence, lower for creativity.

  • For Flux 2 Max: Use up to 10 references for style/character consistency. Include physics/world knowledge (e.g., "realistic gravity").

  • Text Rendering: Quote text precisely; describe placement and font for accuracy.

  • Advanced Iteration: Experiment with variants like [klein] for speed or [flex] for parameter tuning. Combine with Kontext for image-to-image (i2i) edits.
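
The reference limit and guidance advice above can be bundled into a small pre-submission check. This Python sketch does not call any Black Forest Labs API: the function name and the returned dictionary shape are hypothetical, and the "lower" guidance value of 2.0 is an assumed placeholder, since the text only gives 3.5+ for stronger adherence.

```python
MAX_REFERENCES = 10  # cap stated above for Flux 2 Max


def build_flux_settings(prompt: str, references: list[str], strict: bool = True) -> dict:
    """Bundle a prompt with numbered references and a guidance value (hypothetical shape)."""
    if len(references) > MAX_REFERENCES:
        raise ValueError(f"{len(references)} references given, limit is {MAX_REFERENCES}")

    # Number the references so the prompt can cite "[reference 1]", "[reference 2]", ...
    labelled = {f"reference {i}": path for i, path in enumerate(references, start=1)}

    # Flag references that were uploaded but never mentioned in the prompt.
    unused = [name for name in labelled if f"[{name}]" not in prompt]

    return {
        "prompt": prompt,
        "references": labelled,
        "unused_references": unused,
        # 3.5 for prompt adherence; 2.0 is an assumed "more creative" setting.
        "guidance": 3.5 if strict else 2.0,
    }


settings = build_flux_settings(
    "Portrait like [reference 1] with lighting from [reference 2]",
    ["face.png", "light.png", "extra.png"],
)
print(settings["unused_references"])
```

Listing unused references is a cheap way to catch a prompt that uploads a style image but never tells the model what to do with it.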


Example Prompts

  • Flux 2 Pro: "A weathered fisherman casting his line at dawn, misty river background, cinematic realism, 8K, golden hour lighting."

  • Flux 2 Max: "Portrait like [reference 1] with lighting from [reference 2], ethereal woman in neon forest, hyper-maximalist, octane render."


NanoForge — Google: Nano Banana 2 (Default) and Nano Banana Pro (Toggle)


NanoForge harnesses Google's Nano Banana models (based on Gemini/Imagen tech) for fast, editable images. Nano Banana 2 emphasizes speed, while Pro excels in text rendering and consistency.


Advanced Prompting Principles

  • Descriptive Narratives: Use full sentences like a director's brief: subject + composition + action + location + style. Include camera/lighting specifics.

  • For Nano Banana Pro: Put on-image text in quotes and ground the prompt with real-time search (e.g., "current weather in LA"). Edit iteratively with instructions such as "enhance contrast like [reference]."

  • Multimodal Aspects: Upload references for style transfer or editing. For diagrams and timelines, specify the structure explicitly.

  • Optimization: Be specific on mood/atmosphere. Avoid ambiguity; refine with Gemini for prompt enhancement.


Example Prompts

  • Nano Banana 2: "Close-up of a confident CEO in a high-rise office, morning sunlight, photorealistic, 4K."

  • Nano Banana Pro: "Poster with title 'AI Summit' in bold font, background like current Tokyo skyline, cinematic lighting."


OmniForge — Kling / Kuaishou: Kling v3 Omni

OmniForge uses Kuaishou's Kling v3 Omni for versatile video generation, with strong multimodal support for text, images, videos, and elements.

Advanced Prompting Principles

  • Core Structure: Subject + action + environment + camera + lighting/atmosphere. Use cinematic terms (e.g., dolly zoom).

  • Multimodal Inputs: Reference images/videos for consistency (@Image/@Video). Create elements from uploads for reusable subjects. Combine with text for edits (e.g., "add [@Element] to [@Video]"). Supports up to 15s with audio.

  • Multi-Shot and Audio: Label shots (Shot 1: ...); enable multi-shot mode. Use quotes for dialogue, describe accents/sounds. Negate elements for refinement.

  • Editing Workflows: Remove/add via prompts (e.g., "remove background"). Use Chain-of-Thought for complex narratives.
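
The shot-labelling convention above ("Shot 1: ...") is easy to generate from a list of shots. The Python sketch below is illustrative only: the function name is hypothetical, and it assumes the 15-second limit applies to the combined length of all shots in one generation.

```python
MAX_TOTAL_SECONDS = 15  # clip length limit mentioned above (assumed to cover all shots combined)


def build_multishot_prompt(shots: list[tuple[int, str]]) -> str:
    """Label each (seconds, description) pair as "Shot N (Ns): description"."""
    total = sum(seconds for seconds, _ in shots)
    if total > MAX_TOTAL_SECONDS:
        raise ValueError(f"shots total {total}s, limit is {MAX_TOTAL_SECONDS}s")
    lines = [
        f"Shot {i} ({seconds}s): {description}"
        for i, (seconds, description) in enumerate(shots, start=1)
    ]
    return "\n".join(lines)


print(build_multishot_prompt([
    (3, "Close-up of the character speaking"),
    (5, "Dolly out to reveal the cityscape"),
]))
```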


Example Prompt

"Shot 1 (3s): Close-up of [@Character] speaking 'Hello world' in English accent, @Image background; dolly out to reveal cityscape, ambient urban sounds."


VeoForge — Google: Veo 3.1 Fast

VeoForge leverages Google's Veo 3.1 for rapid video creation, blending realism with audio-visual sync.


Advanced Prompting Principles

  • Five-Part Formula: Cinematography + subject + action + context + style/ambiance. Specify camera motion and mood.

  • Audio Handling: Use quotes for dialogue; describe SFX/music (e.g., "birds chirping").

  • Optimization: Short clips (4-8s) for quality; stitch later. Use Gemini for meta-prompts to generate detailed inputs.

  • Iteration: Refine for coherence; favor specific over abstract.
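
The "short clips, stitch later" tip implies planning a longer sequence as several 4-8 second generations. The Python sketch below shows one way to split a target duration evenly. The function name is hypothetical, and the even-split strategy is an assumption rather than anything Veo prescribes.

```python
import math

MIN_CLIP_SECONDS = 4
MAX_CLIP_SECONDS = 8


def plan_clips(total_seconds: int) -> list[int]:
    """Split a target duration into clip lengths that each fall within 4-8 seconds."""
    if total_seconds < MIN_CLIP_SECONDS:
        raise ValueError(f"need at least {MIN_CLIP_SECONDS}s, got {total_seconds}s")
    # Use as few clips as possible, then spread the seconds evenly across them.
    count = math.ceil(total_seconds / MAX_CLIP_SECONDS)
    base, extra = divmod(total_seconds, count)
    # The first `extra` clips get one additional second each.
    return [base + 1] * extra + [base] * (count - extra)


print(plan_clips(20))
```

Each planned clip then gets its own five-part prompt, and the results are stitched together in an editor afterwards.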


Example Prompt

"Tracking shot of a golden retriever chasing a ball in a park, sunny afternoon, cinematic style, upbeat music with dog barks."


WanForge — Wan Video: Wan 2.5 i2v Fast

WanForge uses Alibaba's Wan 2.5 for image-to-video generation, with native audio support (capped at 10 seconds in user builds).


Advanced Prompting Principles

  • Prompt Formula: Subject + scene + motion (80-120 words). Describe actions, camera, and modifiers explicitly.

  • Audio Integration: Upload a 3-30s audio clip (user builds cap this at 10s); the model syncs lip movement to it and generates sound effects. Put dialogue in quotes and describe ambient sounds in plain language.

  • i2v Specifics: From input image, layer motions (foreground/background). Use negatives for refinement.

  • Advanced Features: Bilingual prompts; audio-driven for lip-sync. Iterate with templates.
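
The audio limits above can be checked before uploading. The Python sketch below is a minimal illustration with a hypothetical function name. It assumes that audio longer than the 10-second build cap is trimmed rather than rejected, which the text does not confirm.

```python
MIN_AUDIO_SECONDS = 3.0   # shortest accepted upload, per the range above
MAX_AUDIO_SECONDS = 30.0  # longest accepted upload, per the range above
BUILD_CAP_SECONDS = 10.0  # cap in the user build


def usable_audio_seconds(duration: float) -> float:
    """Return how many seconds of an uploaded clip would be used."""
    if not MIN_AUDIO_SECONDS <= duration <= MAX_AUDIO_SECONDS:
        raise ValueError(
            f"audio must be {MIN_AUDIO_SECONDS:g}-{MAX_AUDIO_SECONDS:g}s, got {duration:g}s"
        )
    # Assumption: anything beyond the build cap is cut off, not rejected.
    return min(duration, BUILD_CAP_SECONDS)


print(usable_audio_seconds(24.0))
```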


Example Prompt

"A red convertible drives along a coastal road, waves crashing, camera panning right, slow-motion, ambient ocean sounds with engine rumble; upload audio clip for narration."


CinemaForge — OpenAI: Sora 2

CinemaForge harnesses OpenAI's Sora 2 for immersive video synthesis, with integrated audio and physics simulation.


Advanced Prompting Principles

  • Cinematic Breakdown: Shot type + subject + action + setting + lighting. Use sensory details for immersion.

  • Audio and Dialogue: Quotes for speech; describe effects/music. Balance for sync.

  • Optimization: Keep shots short for reliability and structure longer sequences as storyboards. Where a literal description is ambiguous, use a descriptive analogy to make the intent clear.

  • Iteration: Use meta-prompts; refine for mood/reactions.


Example Prompt

"Medium shot of a detective entering a dimly lit room, saying 'The clues lead here,' suspenseful music, rain pattering outside, realistic physics."


DirectorForge — Kling / Kuaishou: Kling 2.6

DirectorForge builds on Kuaishou's Kling 2.6 for video editing, focusing on multi-shot and consistency.


Advanced Prompting Principles

  • Hierarchical Framework: Scene + characters + action + camera + audio/style. Use sliders for precision.

  • Multimodal and Editing: Reference images for stability; natural language for changes. Short clips for quality.

  • Audio/Continuity: Describe pacing, and use negative prompts to exclude unwanted elements. Build narratives from modular, reusable prompt pieces.
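
One way to keep a multi-clip narrative consistent is to hold the shared parts of the hierarchical framework constant and vary only what changes per clip. The Python sketch below illustrates that idea. The function names and the lowercase field labels are hypothetical, not a Kling feature.

```python
FIELD_ORDER = ["scene", "characters", "action", "camera", "audio", "style"]


def render_clip(fields: dict[str, str]) -> str:
    """Render the non-empty fields as "label: value" pairs in a fixed order."""
    return ", ".join(f"{key}: {fields[key]}" for key in FIELD_ORDER if fields.get(key))


def clip_prompts(shared: dict[str, str], per_clip: list[dict[str, str]]) -> list[str]:
    """Merge the shared fields into every clip; per-clip values win on conflict."""
    return [render_clip({**shared, **changes}) for changes in per_clip]


shared = {"scene": "Knight in forest", "audio": "hoofbeats", "style": "golden light"}
per_clip = [
    {"action": "riding horse", "camera": "tracking left"},
    {"action": "dismounting", "camera": "static wide"},
]
for line in clip_prompts(shared, per_clip):
    print(line)
```

Because scene, audio, and style are written once, every clip describes the same world, which supports continuity when the clips are cut together.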


Example Prompt

"Scene: Knight in forest, action: riding horse, camera: tracking left, audio: hoofbeats, golden light."

Mastery comes through practice: experiment with these strategies to elevate your creations across all forges.
