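<strong>MagicQuill</strong> is an intelligent, interactive image editing system jointly developed by Ant Group (the company behind Alipay) and the University of Hong Kong. It combines an intuitive interface with AI-driven features for fast, precise image modification. The system integrates a multimodal large language model (MLLM) that <strong>predicts the user's intent in real time, so no elaborate text input is needed.</strong> Based on the user's actions, the system <strong>generates relevant prompts automatically</strong> and <strong>supports a continuous editing</strong> workflow. A few simple brush strokes are enough for otherwise complex edits such as adding new elements, removing objects, or adjusting colors. In short, you scribble a few strokes on the part of the image you want to change, and the tool uses AI to turn the picture into what you had in mind. [video width="1920" height="1080" mp4="https://img.xiaohu.ai/2024/11/magicquill.mp4"][/video] For example: <ul> <li><strong>Add something</strong>: Want to give a person in the picture a hat? Sketch a rough shape with the brush, enter a prompt, and a hat is generated automatically.</li> <li><strong>Remove something</strong>: Don't want an object in the picture? Paint over it with the erase brush, and the AI fills in the background so it looks as if the object was never there.</li> <li><strong>Change a color</strong>: Don't like a color in the picture? Paint over it with the color brush, for example to turn pink flowers blue.</li> </ul> The tool also guesses your intent. If you draw a line, for instance, it may ask "Is this a path or a vine?" If the guess is wrong, you can correct it. <h5>Workflow</h5> <ol> <li><strong>Upload an image</strong>: Choose the image to edit, or start from the built-in canvas.</li> <li><strong>Choose a brush</strong>: <ul> <li>Pick the add, subtract, or color brush as needed.</li> <li>Draw on the canvas with the brush.</li> </ul> </li> <li><strong>AI generation in real time</strong>: <ul> <li>The system generates the edit from the <strong>strokes</strong> and the <strong>prompt</strong>.</li> <li>The user can revise the prompt to improve the result.</li> </ul> </li> <li><strong>Adjust parameters</strong>: <ul> <li>Use the advanced parameters, such as edge strength and color range, to fine-tune the result.</li> </ul> </li> <li><strong>Save or keep editing</strong>: <ul> <li>Save the result once you are satisfied, or continue with further edits.</li> </ul> </li> </ol> MagicQuill's main features fall into the following core modules, each designed around a key image editing need: <hr /> <h3><strong>1. 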
Editing Processor</strong></h3> The Editing Processor is the system's core module. It carries out the actual edits, <strong>interpreting the user's brush strokes to modify the image precisely.</strong> <h5><strong>Adding elements (Add Brush)</strong></h5> <ul> <li><strong>Purpose</strong>: Draw an outline on the image with the brush, and the AI generates the new element.</li> <li><strong>Highlights</strong>: <ul> <li>A few strokes are enough; the AI infers the intent from the strokes and the prompt.</li> <li>It can generate animals, accessories, or other objects, matching the style and detail of the image.</li> </ul> </li> </ul> <strong>Example use cases</strong> <ul> <li><strong>Case 1: Adding an accessory to a person</strong> Upload a portrait and draw a simple loop with the add brush; the AI generates a realistic necklace that sits naturally on the neck.</li> </ul> <img class="aligncenter size-full wp-image-15627" src="https://img.xiaohu.ai/2024/11/necklace.gif" alt="" width="500" height="409" /> <ul> <li><strong>Case 2: Adding an element to a landscape</strong> Sketch a few strokes in a forest scene as the outline of a deer, and the AI completes a lifelike deer that blends into the background.</li> </ul> <img class="aligncenter size-full wp-image-15628" src="https://img.xiaohu.ai/2024/11/deer.gif" alt="" width="500" height="420" /> <h5><strong>Removing elements (Subtract Brush)</strong></h5> <ul> <li><strong>Purpose</strong>: Erase unwanted parts of the image with the subtract brush; the gap is filled in automatically.</li> <li><strong>Highlights</strong>: <ul> <li>The AI fills the removed area from its surroundings, keeping the image consistent.</li> <li>Well suited to removing stray objects, faulty details, or distracting elements.</li> </ul> </li> </ul> <strong>Example use cases</strong> <ul> <li><strong>Case 1: Removing an unwanted item</strong> "Let's take Mr. Skeleton's hat off to help him cool down."</li> </ul> <img class="aligncenter size-full wp-image-15625" src="https://img.xiaohu.ai/2024/11/skeleton-cowboy.gif" alt="" width="500" height="409" /> <ul> <li><strong>Case 2: Fixing a detail</strong> Upload a picture of a dolphin that has an extra tail fin. Paint over the extra fin with the subtract brush, and the AI redraws the tail so that nothing looks out of place.</li> </ul> <img class="aligncenter size-full wp-image-15626" src="https://img.xiaohu.ai/2024/11/dolphin.gif" alt="" width="500" height="411" /> <div align="center"><img class="icon" src="https://magicquill.art/demo/tutorials/brush_edge_add.svg" alt="add brush" width="100" />&amp;<img class="icon" src="https://magicquill.art/demo/tutorials/brush_edge_remove.svg" alt="subtract brush" width="100" /></div> <div align="center" 
data-immersive-translate-paragraph="1" data-immersive-translate-walked="db4eb60d-a09d-4a3e-bb14-97558c6566e6"><br />Combine the <b>add and subtract brushes</b> for striking combined effects!</div> <div align="center"><img class="aligncenter size-full wp-image-15624" src="https://img.xiaohu.ai/2024/11/mona-lisa-cat.gif" alt="" width="500" height="417" /></div> <div align="center">"Let's give Mona Lisa a pet cat~"</div> <div align="center"><img class="aligncenter size-full wp-image-15623" src="https://img.xiaohu.ai/2024/11/handsome-bowtie.gif" alt="" width="500" height="418" /></div> <div align="center">"Let's swap this handsome guy's necktie for a bow tie!"</div> <h5><strong>Color adjustment (Color Brush)</strong></h5> <ul> <li><strong>Purpose</strong>: Use the color brush to colorize a chosen region or change its existing color.</li> <li><strong>Highlights</strong>: <ul> <li>Precise coloring with any color the user picks.</li> <li>Adjustable color strength for subtler results.</li> <li>Lighting and style are matched automatically, avoiding the flat look of manual recoloring.</li> </ul> </li> </ul> <strong>Example use cases</strong> <ul> <li><strong>Case 
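1: Changing flower colors</strong> Recolor the flowers on a cake: "Don't you think blue flowers look dreamier than pink ones?"</li> </ul> <img class="aligncenter size-full wp-image-15621" src="https://img.xiaohu.ai/2024/11/cake-flowers.gif" alt="" width="500" height="411" /> <ul> <li><strong style="font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen-Sans, Ubuntu, Cantarell, 'Helvetica Neue', sans-serif;">Case 2: Adjusting a person's look</strong> Precise color highlighting: paint exactly where you want the color, changing the color of part of the hair while also cutting it shorter.</li> </ul> <img class="aligncenter size-full wp-image-15622" src="https://img.xiaohu.ai/2024/11/beautiful-hair.gif" alt="" width="500" height="379" /> <hr /> <h3><strong>2. Painting Assistor</strong></h3> The Painting Assistor is the intelligent core of MagicQuill. It <strong>interprets the user's intent in real time, which greatly simplifies the editing workflow.</strong> <h5><strong>Smart guessing and correction</strong></h5> <ul> <li><strong>Purpose</strong>: The AI guesses the intended edit from the user's strokes and generates content accordingly; the user can override the guess manually.</li> <li><strong>Highlights</strong>: <ul> <li>Faster editing, with no need to write a detailed text prompt from scratch.</li> <li>If the guess is wrong, the user can correct the prompt to improve the result.</li> </ul> </li> </ul> <strong>Example use cases</strong> <ul> <li><strong>Case 1: Drawing a path</strong> The user draws a line on a picture of a garden, and the AI generates a footpath. If the user actually wanted a vine, they can change the prompt and have the AI regenerate content that matches.</li> </ul> <img class="aligncenter size-full wp-image-15644" src="https://img.xiaohu.ai/2024/11/path.gif" alt="" width="500" height="413" /> <h5><strong>Feature details</strong>:</h5> <ol> <li><strong>Real-time intent prediction (Draw&amp;Guess)</strong>: <ul> <li>Analyzes the user's strokes together with the surrounding image content to predict the intended edit.</li> <li>Generates a semantically fitting prompt automatically; for example, after the user sketches a headpiece, the system suggests "flower crown".</li> </ul> </li> <li><strong>Multimodal large language model (MLLM)</strong>: <ul> <li>Fine-tuned from LLaVA, with a focus on interpreting the meaning of user strokes.</li> <li>Supports continuous editing, sparing the user from typing a text prompt at every step.</li> </ul> </li> <li><strong>Automatic prompt generation</strong>: <ul> <li>Through the Draw&amp;Guess mode, the system converts what the user sketches into an editing instruction.</li> <li>For example, the user draws a circle, and the system predicts "this is a plate" and performs the corresponding edit.</li> </ul> </li> <li><strong>Data augmentation and semantic tuning</strong>: <ul> <li>A dedicated dataset simulates user drawing scenarios, making the model better at handling hand-drawn input.</li> </ul> </li> <li><strong>Handling ambiguity</strong>: <ul> <li>For vague or ambiguous input, such as a plain circle, the model can offer several context-dependent guesses.</li> </ul> </li> </ol> <hr /> <h3><strong>3. 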
Idea Collector</strong></h3> The Idea Collector provides a simple yet capable interface that keeps the learning curve low: <ul> <li><strong>Modular design</strong>: <ul> <li>It consists of a toolbar (brush selection and parameter adjustment), a canvas (live drawing and editing), and a preview area (for viewing generated results).</li> </ul> </li> <li><strong>Cross-platform support</strong>: <ul> <li>It runs on platforms such as Gradio and ComfyUI and adapts to a range of devices.</li> </ul> </li> </ul> <strong>Highlights:</strong> <ul> <li><strong>Quick to learn</strong>: Suitable for professionals and non-professionals alike.</li> <li><strong>Feature-rich</strong>: Layer management and parameter adjustment give users room to experiment.</li> </ul> <h4><strong>Feature details</strong>:</h4> <ol> <li><strong>Intuitive tools</strong>: <ul> <li><strong>Brushes</strong>: <ul> <li>A scribble brush and a color brush let users modify the image by painting freely.</li> </ul> </li> <li><strong>Eraser</strong>: <ul> <li>Used to touch up strokes for more accurate edits.</li> </ul> </li> <li><strong>Layer management</strong>: <ul> <li>Tracks multiple editing steps so that the user can undo or redo changes at any time.</li> </ul> </li> </ul> </li> <li><strong>Cross-platform compatibility</strong>: <ul> <li>Compatible with generative AI platforms such as Gradio and ComfyUI.</li> <li>Built as modular ReactJS components, which eases future extension and integration.</li> <li><img class="aligncenter size-full wp-image-15652" src="https://img.xiaohu.ai/2024/11/handsome-hair.gif" alt="" width="500" height="419" /></li> </ul> </li> <li><strong>Live preview of results</strong>: <ul> <li>The edited image is displayed as soon as it is generated, and the user can review the change in the result area.</li> <li>Accept and discard controls ensure that every step matches the user's intent.</li> </ul> </li> <li><strong>Flexible parameters</strong>: <ul> <li>The user can tune parameters such as edge strength and color opacity to suit different edits.</li> </ul> </li> </ol> <h3 class="heading" align="center" data-immersive-translate-paragraph="1" data-immersive-translate-walked="db4eb60d-a09d-4a3e-bb14-97558c6566e6"><span class="notranslate immersive-translate-target-wrapper" lang="zh-CN" data-immersive-translate-translation-element-mark="1"><span class="notranslate immersive-translate-target-translation-theme-none immersive-translate-target-translation-block-wrapper-theme-none immersive-translate-target-translation-block-wrapper" data-immersive-translate-translation-element-mark="1"><span class="notranslate immersive-translate-target-inner immersive-translate-target-translation-theme-none-inner" data-immersive-translate-translation-element-mark="1">III. 
Really handy canvas tools!</span></span></span></h3> <div align="center"><br /><img class="icon" src="https://magicquill.art/demo/tutorials/upload.svg" alt="upload" width="48" height="48" /></div> <div align="center">Click this button to upload the photo you want to edit~</div> <div align="center"><br /><img class="icon" src="https://magicquill.art/demo/tutorials/eraser.svg" alt="eraser" width="48" height="48" /></div> <div align="center" data-immersive-translate-paragraph="1" data-immersive-translate-walked="db4eb60d-a09d-4a3e-bb14-97558c6566e6"><span class="notranslate immersive-translate-target-wrapper" lang="zh-CN" data-immersive-translate-translation-element-mark="1"><span class="notranslate immersive-translate-target-translation-theme-none immersive-translate-target-translation-block-wrapper-theme-none immersive-translate-target-translation-block-wrapper" data-immersive-translate-translation-element-mark="1"><span 
class="notranslate immersive-translate-target-inner immersive-translate-target-translation-theme-none-inner" data-immersive-translate-translation-element-mark="1">Just rub it out with the eraser tool!</span></span></span></div> <div align="center"><br /><img class="icon" src="https://magicquill.art/demo/tutorials/cursor.svg" alt="cursor" width="48" height="48" /></div> <div align="center">Use the cursor to drag, rotate, and resize your strokes, just like working in PowerPoint!</div> <div align="center"><img class="icon" src="https://magicquill.art/demo/tutorials/undo.svg" alt="undo" width="100" />&amp;<img class="icon" src="https://magicquill.art/demo/tutorials/redo.svg" alt="redo" width="100" /></div> <div align="center" data-immersive-translate-paragraph="1" data-immersive-translate-walked="db4eb60d-a09d-4a3e-bb14-97558c6566e6"><span class="notranslate immersive-translate-target-wrapper" lang="zh-CN" data-immersive-translate-translation-element-mark="1"><span 
class="notranslate immersive-translate-target-translation-theme-none immersive-translate-target-translation-block-wrapper-theme-none immersive-translate-target-translation-block-wrapper" data-immersive-translate-translation-element-mark="1">The left one is Ctrl+Z and the right one is Ctrl+Y. You know what that means! 😊</span></span><br />For Mac users, the left one is Command+Z and the right one is Command+Shift+Z! 😝</div> <div align="center"><br /><img class="icon" src="https://magicquill.art/demo/tutorials/delete.svg" alt="delete" width="48" height="48" /></div> <div align="center" data-immersive-translate-paragraph="1" data-immersive-translate-walked="db4eb60d-a09d-4a3e-bb14-97558c6566e6"><span class="notranslate immersive-translate-target-wrapper" lang="zh-CN" data-immersive-translate-translation-element-mark="1"><span class="notranslate immersive-translate-target-translation-theme-none immersive-translate-target-translation-block-wrapper-theme-none immersive-translate-target-translation-block-wrapper" 
data-immersive-translate-translation-element-mark="1">Oops! That doesn't look right 😵. Click the trash can to delete the stroke.</span></span></div> <div align="center"><br /><img class="icon" src="https://magicquill.art/demo/tutorials/eye.svg" alt="show or hide strokes" width="48" height="48" /></div> <div align="center">The strokes are blocking my view. How am I supposed to see the image 😡?! Try clicking this button to hide your strokes temporarily.</div> <div align="center"><img class="icon" src="https://magicquill.art/demo/tutorials/accept.svg" alt="accept" width="100" />&amp;<img class="icon" src="https://magicquill.art/demo/tutorials/discard.svg" alt="discard" width="100" /></div> <div align="center" data-immersive-translate-walked="db4eb60d-a09d-4a3e-bb14-97558c6566e6" data-immersive-translate-paragraph="1"><span class="notranslate immersive-translate-target-wrapper" 
lang="zh-CN" data-immersive-translate-translation-element-mark="1">These two icons appear once an image has been generated...</span><br />I love this generated image 😍 and want to keep editing! ➡️ Click ✅ to continue editing<br /><span class="notranslate immersive-translate-target-wrapper" lang="zh-CN" data-immersive-translate-translation-element-mark="1"><span class="notranslate immersive-translate-target-translation-theme-none immersive-translate-target-translation-block-wrapper-theme-none immersive-translate-target-translation-block-wrapper" data-immersive-translate-translation-element-mark="1"><span class="notranslate immersive-translate-target-inner immersive-translate-target-translation-theme-none-inner" data-immersive-translate-translation-element-mark="1">What is this thing 😡? I don't want to see it! 
➡️ Click ❎ to discard the result</span></span></span><br /><br /></div> <hr /> <h3 class="heading" align="center">IV. Notes</h3> <div align="center"><br /><img class="icon" src="https://magicquill.art/demo/tutorials/loading.svg" alt="loading" width="48" height="48" /></div> <div align="center" data-immersive-translate-walked="db4eb60d-a09d-4a3e-bb14-97558c6566e6" data-immersive-translate-paragraph="1"><span class="notranslate immersive-translate-target-wrapper" lang="zh-CN" data-immersive-translate-translation-element-mark="1"><span class="notranslate immersive-translate-target-translation-theme-none immersive-translate-target-translation-block-wrapper-theme-none immersive-translate-target-translation-block-wrapper" data-immersive-translate-translation-element-mark="1"><span class="notranslate immersive-translate-target-inner immersive-translate-target-translation-theme-none-inner" 
data-immersive-translate-translation-element-mark="1">When you see the spinning icon in the bottom-left corner, MagicQuill is still charging up 💪 Wait for it to disappear before clicking the run button!</span></span></span></div> <div align="center"><br /><img class="icon" src="https://magicquill.art/demo/tutorials/wand.svg" alt="magic wand" width="48" height="48" /></div> <div align="center">When the magic wand is blinking, the brush is busy guessing what you want to draw 🤔 Please be patient! 🙏</div> <hr /> <h3><strong>4. Multiple style models</strong></h3> <ul> <li>Several generation models are available, and the user can switch between them at any time to get a different artistic style.</li> <li><strong>Available models and when to use them</strong>: <ul> <li><strong>SD1.5/realisticVisionV60B1_v51VAE.safetensors</strong>: <ul> <li><strong>Purpose</strong>: Generates realistic, photographic images.</li> <li><strong>Recommended for</strong>: Most everyday edits, such as photo restoration or background changes.</li> </ul> </li> <li><strong>SD1.5/DreamShaper.safetensors</strong>: <ul> <li><strong>Purpose</strong>: Generates dreamlike images.</li> <li><strong>Recommended for</strong>: Scenes with a fantasy or artistic feel.</li> </ul> </li> <li><strong>SD1.5/majicMIX_realistic</strong>: <ul> <li><strong>Purpose</strong>: Good at generating lifelike portraits.</li> <li><strong>Recommended for</strong>: Avatar design or detailed retouching of portrait photos.</li> </ul> </li> <li><strong>SD1.5/MeinaMix.safetensors</strong>: <ul> <li><strong>Purpose</strong>: Good at generating anime-style images.</li> 
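<li><strong>Recommended for</strong>: Anime illustration and character design.</li> </ul> </li> <li><strong>SD1.5/ghostmix_v20Bakedvae.safetensors</strong>: <ul> <li><strong>Purpose</strong>: Another model suited to anime images.</li> <li><strong>Recommended for</strong>: Anime images that call for a soft style and fine detail.</li> </ul> </li> </ul> </li> </ul> <h3>5. <strong>Advanced parameters</strong></h3> <ul> <li>These give experienced users finer control over generation.</li> <li><strong>Common parameters</strong>: <ul> <li><strong>Fine edge detail</strong>: When enabled, edges are processed at a finer level of detail.</li> <li><strong>Stroke influence range</strong>: Sets, in pixels, how far the area around a stroke is grown or shrunk, which determines how tightly the edit is confined.</li> <li><strong>Color strength</strong>: Sets how strongly the color brush steers the result, affecting the extent and saturation of the applied color.</li> <li><strong>Negative prompt</strong>: Lets the user list content the model should avoid generating.</li> <li><strong>Edge strength:</strong> Sets how strongly the edges from the add and subtract brushes steer the result.</li> </ul> </li> </ul> <strong>Advanced editing</strong> MagicQuill also exposes some standard diffusion model parameters. Most users can leave them at their defaults, but advanced users and domain experts can explore them. <ul> <li><strong>Noise level</strong>: Adjusts the randomness of the generation process, which affects the style and detail of the result.</li> <li><strong>Sampling steps</strong>: More steps can improve quality but increase computation time.</li> <li><strong>Control strength</strong>: Sets how much weight the user-supplied conditions carry in the result.</li> </ul> <h5>Feature summary</h5> <table> <thead> <tr> <th><strong>Feature</strong></th> <th><strong>Description</strong></th> <th><strong>Typical use</strong></th> </tr> </thead> <tbody> <tr> <td>Add brush</td> <td>Adds new elements, such as accessories, or generates content from a sketch</td> <td>Adding a hat to a portrait, outlining a plant</td> </tr> <tr> <td>Subtract brush</td> <td>Removes unwanted parts, such as background items or clutter</td> <td>Clearing stray elements from a photo</td> </tr> <tr> <td>Color brush</td> <td>Changes the color of a region, such as dyeing hair or recoloring clothing</td> <td>Changing clothing colors, improving the overall palette</td> </tr> <tr> <td>Intent prediction</td> <td>Generates prompts automatically from strokes and context, reducing text input</td> <td>Making complex changes from a simple outline</td> </tr> <tr> <td>Continuous editing</td> <td>Supports repeated adjustment and multi-step edits for gradual refinement</td> <td>Working through an image from the background to the clothing</td> </tr> <tr> <td>User-friendly interface</td> <td>Simple toolbar and live preview, with cross-platform support</td> <td>Quick to pick up for professionals and non-professionals</td> </tr> <tr> <td>Precise generation control</td> <td>Uses edge and color conditions for high-quality generation</td> <td>Fine local edits with natural transitions</td> </tr> </tbody> </table> <h3>More examples</h3> <img class="aligncenter size-full wp-image-15614" src="https://img.xiaohu.ai/2024/11/hao.gif" alt="" width="500" height="647" /> <img class="aligncenter size-full wp-image-15615" src="https://img.xiaohu.ai/2024/11/yue.gif" 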
alt="" width="500" height="647" /> <img class="aligncenter size-full wp-image-15613" src="https://img.xiaohu.ai/2024/11/qiuyu.gif" alt="" width="500" height="647" /> <img class="aligncenter size-full wp-image-15612" src="https://img.xiaohu.ai/2024/11/ka-leong.gif" alt="" width="500" height="647" /> <img class="aligncenter size-full wp-image-15616" src="https://img.xiaohu.ai/2024/11/zichen.gif" alt="" width="500" height="647" /> <img class="aligncenter size-full wp-image-15617" src="https://img.xiaohu.ai/2024/11/yujun.gif" alt="" width="500" height="647" /> <img class="aligncenter size-full wp-image-15618" src="https://img.xiaohu.ai/2024/11/qifeng.gif" alt="" width="500" height="647" /> <img class="aligncenter size-full wp-image-15619" src="https://img.xiaohu.ai/2024/11/zhiheng.gif" alt="" width="500" height="647" /> <img class="aligncenter size-full wp-image-15620" src="https://img.xiaohu.ai/2024/11/wen.gif" alt="" width="500" height="647" /> <h3>How MagicQuill works: principles and technical approach</h3> MagicQuill is an intelligent image editing system built on <strong>diffusion models</strong> and a <strong>multimodal large language model (MLLM)</strong>. It pairs a user-friendly interface with advanced generation techniques to make image edits precise and efficient. <img class="aligncenter size-full wp-image-15646" src="https://img.xiaohu.ai/2024/11/Jietu20241122-200946@2x.jpg" alt="" width="2280" height="1074" /> <h5><strong>Overall system architecture</strong></h5> MagicQuill's architecture consists of three main modules, the <strong>Editing Processor</strong>, the <strong>Painting Assistor</strong>, and the <strong>Idea Collector</strong>, which work together: <ul> <li><strong>Image editing (Editing Processor)</strong>: <ul> <li>Core function: performs precise image generation and modification with a diffusion model.</li> <li>Implementation: a dual-branch architecture processes the edge and color conditions supplied by the user and generates a high-quality image that matches them.</li> </ul> </li> <li><strong>Intent prediction (Painting Assistor)</strong>: <ul> <li>Core function: uses a multimodal large language model to predict the user's intent in real time and generate an editing prompt.</li> <li>Implementation: LLaVA is fine-tuned with LoRA on a custom dataset to improve its understanding of what user strokes mean.</li> </ul> </li> <li><strong>User interface 
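(Idea Collector)</strong>: <ul> <li>Core function: provides a user-friendly interface with a range of tools and adjustable parameters.</li> <li>Implementation: a modular design that runs across platforms and offers a live image preview.</li> </ul> </li> </ul> <h4>I. How it works</h4> <ol> <li><strong>User input</strong> <ul> <li>The user draws directly on the image with the brush tools (add, subtract, color).</li> <li>The system captures the edge, color, and mask signals from the drawing.</li> </ul> </li> <li><strong>Multimodal intent parsing</strong> <ul> <li>The multimodal large language model (MLLM) predicts the user's editing intent in real time.</li> <li>It generates a fitting text prompt or supplementary context automatically, reducing the need for manual input.</li> </ul> </li> <li><strong>Conditional generation</strong> <ul> <li>Given the user's conditions (edges, colors, or masks), the diffusion model generates image content that matches the intent.</li> <li>The dual-branch architecture provides precise control over the edge and color conditions.</li> </ul> </li> <li><strong>Live feedback and adjustment</strong> <ul> <li>The system generates a result after each operation, which the user can refine further or undo.</li> <li>Results appear in the interface as they are produced, supporting continuous, repeated adjustment.</li> </ul> </li> </ol> <hr /> <h4>II. Technical approach</h4> <h5>1. <strong>Conditional generation with a diffusion model</strong></h5> MagicQuill is built around a diffusion model and edits images under two kinds of condition, edges and colors: <ul> <li><strong>Diffusion models in brief</strong>: <ul> <li>A diffusion model generates an image by denoising step by step, and can reproduce fine detail at high quality.</li> <li>MagicQuill builds on Stable Diffusion v1.5 and extends it with a control branch and an inpainting branch.</li> </ul> </li> </ul> <strong>(1) Dual-branch architecture</strong> <ul> <li><strong>Inpainting branch</strong>: <ul> <li>Generates content inside the masked region.</li> <li>The region masked by the user is fed to the diffusion model, which generates pixels there in a content-aware way.</li> </ul> </li> <li><strong>Control branch</strong>: <ul> <li>Interprets the edge and color signals supplied by the user.</li> <li>Uses the ControlNet mechanism so that the generated content closely follows the user's input conditions.</li> </ul> </li> </ul> <img class="aligncenter size-full wp-image-15650" src="https://img.xiaohu.ai/2024/11/Jietu20241122-201018@2x.jpg" alt="" width="2212" height="1446" /> <strong>(2) Edge and color conditions</strong> <ul> <li><strong>Edge condition</strong>: <ul> <li>Edge information is extracted from the original image with a pretrained CNN and then modified according to the user's brush strokes.</li> <li>The add brush inserts new edges into the edge map, while the subtract brush removes the edges in the painted region.</li> </ul> </li> <li><strong>Color condition</strong>: <ul> <li>The user's color strokes are downsampled into color blocks, which form a global color guidance signal.</li> <li>A condition-injection mechanism combines the color blocks with the edge condition to generate content with fine color variation.</li> </ul> </li> </ul> <strong>(3) Condition control mechanism</strong> <ul> <li>ControlNet extends the diffusion model by injecting the control signal into its intermediate layers, so that the result closely follows the user's input.</li> <li>During generation, control strength parameters (such as wC and wI) are adjusted to trade off structure against content.</li> </ul> <img class="aligncenter size-full wp-image-15651" src="https://img.xiaohu.ai/2024/11/Jietu20241122-201000@2x.jpg" alt="" 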
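width="2202" height="1256" /> <img class="aligncenter size-full wp-image-15649" src="https://img.xiaohu.ai/2024/11/Jietu20241122-201038@2x-scaled.jpg" alt="" width="2560" height="797" /> <hr /> <h5>2. <strong>Multimodal large language model (MLLM) integration</strong></h5> <strong>(1) Intent prediction</strong> MagicQuill's built-in Painting Assistor uses a multimodal large language model (LLaVA) to interpret the user's drawing intent in real time. <ul> <li><strong>The Draw&amp;Guess task</strong>: <ul> <li>The editing target is predicted from the strokes the user draws and the surrounding context.</li> <li>The task is framed as question answering (Q&amp;A): asked, for example, "What do these strokes represent?", the model answers with a short phrase as its prediction.</li> </ul> </li> <li><strong>Automatic prompt generation</strong>: <ul> <li>From the outline or color the user provides, the system fills in a matching text prompt, so nothing has to be typed by hand.</li> </ul> </li> </ul> <strong>(2) Dataset and model tuning</strong> <ul> <li><strong>Simulated editing dataset</strong>: <ul> <li>Simulated user input, comprising edges, masks, and colors, is generated from the public Densely Captioned Images (DCI) dataset.</li> <li>The dataset is annotated in detail at several semantic levels, which helps the model learn user intent in complex scenes.</li> </ul> </li> <li><strong>Fine-tuning method</strong>: <ul> <li>LLaVA is fine-tuned with LoRA (Low-Rank Adaptation), a lightweight technique, to improve its performance on the drawing task.</li> </ul> </li> </ul> <strong>(3) Intent inference logic</strong> <ul> <li>The system maps the user's strokes onto its existing visual and semantic knowledge, then narrows down the editing target using derived cues such as bounding boxes and color information.</li> </ul> <hr /> <h5>3. <strong>Live user interface and interaction</strong></h5> <strong>(1) Idea Collector</strong> <ul> <li>Provides an intuitive interface: the user draws directly on the canvas and sees the edit as soon as it is generated.</li> <li><strong>Main features</strong>: <ul> <li><strong>Brushes</strong>: add, subtract, and color.</li> <li><strong>Layer management</strong>: each stroke operation is kept on its own layer, which makes step-by-step adjustment easier.</li> <li><strong>Live preview</strong>: after clicking run, the user sees the result in the generation area and can keep or discard it.</li> </ul> </li> </ul> <strong>(2) Cross-platform support</strong> <ul> <li>The MagicQuill interface is built with ReactJS and can be embedded in platforms such as Gradio and ComfyUI, across devices and browsers.</li> </ul> <hr /> <h5>4. 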
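<strong>Data processing and model training</strong></h5> <strong>(1) Input signal processing</strong> <ul> <li><strong>Edge extraction</strong>: A CNN extracts the image's edges, which are combined with the brush signals to form the new edge condition.</li> <li><strong>Color simplification</strong>: The user's color strokes are converted to low-resolution color blocks, which keep the global color information while discarding local detail.</li> </ul> <strong>(2) Training</strong> <ul> <li><strong>Data augmentation</strong>: Edge masks are randomly dilated and varied to improve generalization.</li> <li><strong>Optimization targets</strong>: <ul> <li>Reconstruction-quality measures such as LPIPS and SSIM are used to optimize generation quality.</li> <li>Semantic-similarity measures such as BERT and CLIP similarity are used to improve intent-prediction accuracy.</li> </ul> </li> </ul> <hr /> <h4>III. Strengths of the system</h4> <ol> <li><strong>Precise editing</strong>: <ul> <li>The dual-branch architecture, combined with the diffusion model, gives accurate control over edges and colors.</li> </ul> </li> <li><strong>Simple operation</strong>: <ul> <li>The user only draws simple strokes; no elaborate text input is needed.</li> </ul> </li> <li><strong>Live feedback</strong>: <ul> <li>Results appear quickly and can be adjusted continuously.</li> </ul> </li> <li><strong>Cross-platform compatibility</strong>: <ul> <li>The interface runs on multiple platforms, covering a wide range of use cases.</li> </ul> </li> </ol> By combining a strong generative model with user-friendly interaction design, MagicQuill offers an efficient, intuitive approach to image editing that is particularly suited to fine-grained modifications. Project page: <a href="https://magicquill.art/demo/" target="_blank" rel="noopener">https://magicquill.art/demo/</a> GitHub: <a href="https://github.com/magic-quill/magicquill" target="_blank" rel="noopener">https://github.com/magic-quill/magicquill</a> Paper: <a href="https://arxiv.org/abs/2411.09703" target="_blank" rel="noopener">https://arxiv.org/abs/2411.09703</a> Online demos: <a href="https://huggingface.co/spaces/AI4Editing/MagicQuill" target="_blank" rel="noopener">https://huggingface.co/spaces/AI4Editing/MagicQuill</a> <a href="http://magic.chenjunfeng.xyz/" target="_blank" rel="noopener">http://magic.chenjunfeng.xyz/</a>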
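As an illustration of the edge and color conditions described in the technical section, here is a toy sketch in plain Python. It is not MagicQuill's code: the function names <code>edit_edge_map</code> and <code>color_blocks</code>, the binary-grid representation, and the rule that an add stroke wins where it overlaps a subtract stroke are all assumptions made for this example. The real system works on images with a CNN edge extractor and a diffusion model.

```python
from typing import List

Grid = List[List[int]]  # binary map: 1 = edge or stroke pixel, 0 = empty


def edit_edge_map(edges: Grid, add_strokes: Grid, sub_strokes: Grid) -> Grid:
    """Apply brush strokes to an edge map (hypothetical helper).

    Subtract-brush pixels remove existing edges; add-brush pixels insert
    new ones. Where both overlap, this sketch lets the add stroke win.
    """
    height, width = len(edges), len(edges[0])
    result = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            kept = edges[y][x] == 1 and sub_strokes[y][x] == 0
            drawn = add_strokes[y][x] == 1
            result[y][x] = 1 if (kept or drawn) else 0
    return result


def color_blocks(channel: List[List[float]], block: int) -> List[List[float]]:
    """Average one color channel over block x block tiles (hypothetical helper).

    This mimics downsampling color strokes into coarse color blocks:
    global color is kept, local detail is discarded.
    """
    height, width = len(channel), len(channel[0])
    blocks = []
    for top in range(0, height, block):
        row = []
        for left in range(0, width, block):
            tile = [
                channel[y][x]
                for y in range(top, min(top + block, height))
                for x in range(left, min(left + block, width))
            ]
            row.append(sum(tile) / len(tile))
        blocks.append(row)
    return blocks


if __name__ == "__main__":
    edges = [[1, 1, 0, 0], [0, 0, 0, 0]]
    add = [[0, 0, 0, 0], [0, 0, 1, 1]]
    sub = [[1, 0, 0, 0], [0, 0, 0, 0]]
    print(edit_edge_map(edges, add, sub))
    print(color_blocks([[0, 2, 4, 4], [2, 4, 4, 4]], 2))
```

In a real pipeline the two outputs would be passed to the control branch as conditioning images; here they are only printed so the effect of each brush is easy to inspect.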