| Title: | Stable Diffusion Image Generation |
|---|---|
| Description: | Provides Stable Diffusion image generation using the 'ggmlR' library, with no 'Python' or external API dependencies. Supports text-to-image and image-to-image generation for SD 1.x, SD 2.x, 'SDXL', and Flux. A single sd_generate() function handles the entire pipeline, including sampling and high-resolution output. Features multi-GPU support, a 'Shiny' GUI, and runs on CPU or 'Vulkan' GPU across Linux, macOS, and Windows. |
| Authors: | Yuri Baramykov [aut, cre] (ORCID: <https://orcid.org/0009-0000-7627-4217>), Georgi Gerganov [ctb, cph] (Author of the GGML library), leejet [ctb, cph] (Author of stable-diffusion.cpp), stduhpf [ctb] (Core contributor to stable-diffusion.cpp), Green-Sky [ctb] (Contributor to stable-diffusion.cpp), wbruna [ctb] (Contributor to stable-diffusion.cpp), akleine [ctb] (Contributor to stable-diffusion.cpp), Martin Raiber [cph] (Copyright holder in miniz.h), Rich Geldreich [cph] (Author of miniz.h), RAD Game Tools [cph] (Copyright holder in miniz.h), Valve Software [cph] (Copyright holder in miniz.h), Alex Evans [cph] (PNG writing code in miniz.h), Sean Barrett [cph] (Author of stb_image.h), Jorge L Rodriguez [cph] (Author of stb_image_resize.h), Niels Lohmann [cph] (Author of json.hpp (nlohmann/json)), Susumu Yata [cph] (Author of darts.h (darts-clone)), Kuba Podgorski [cph] (Author of zip.h/zip.c (kuba--/zip)), Meta Platforms Inc. [cph] (rng_mt19937.hpp (ported from PyTorch)), Google Inc. [cph] (Sentencepiece tokenizer code in t5.hpp) |
| Maintainer: | Yuri Baramykov <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.2.1 |
| Built: | 2026-06-02 21:51:28 UTC |
| Source: | https://github.com/zabis13/sd2r |
LoRA apply modes
LORA_APPLY_MODE
An object of class list of length 3.
Prediction types
PREDICTION
An object of class list of length 6.
Mode strings accepted by the preview callback (see
sd_preview_start). "proj" is a fast linear projection
of the latent (cheap, rough), "tae" uses the tiny autoencoder
(needs taesd_path), "vae" runs the full VAE (slow, accurate).
PREVIEW
An object of class list of length 4.
Sampling methods
SAMPLE_METHOD
An object of class list of length 12.
Schedulers
SCHEDULER
An object of class list of length 10.
Launches a plumber-based REST API for image generation. Optionally pre-loads a model at startup.
sd_api_start(
  model_path = NULL,
  model_type = "sd1",
  model_id = NULL,
  vae_decode_only = TRUE,
  host = "0.0.0.0",
  port = 8080L,
  api_key = NULL,
  ...
)
model_path |
Optional path to model file to load at startup |
model_type |
Model type for the pre-loaded model (default "sd1") |
model_id |
Identifier for the pre-loaded model (default: basename of model_path) |
vae_decode_only |
VAE decode only for the pre-loaded model (default TRUE) |
host |
Host to bind to (default "0.0.0.0") |
port |
Port to listen on (default 8080) |
api_key |
Optional API key string. When set, non-localhost requests
must include |
... |
Additional arguments passed to |
Invisibly returns the plumber router object
## Not run:
# Start with a pre-loaded model
sd_api_start("model.safetensors", model_type = "flux", port = 8080)

# Start empty, load models via API
sd_api_start(port = 8080)

# With API key
sd_api_start("model.safetensors", api_key = "my-secret-key")
## End(Not run)
Stops the running plumber server and unloads all models.
sd_api_stop()
No return value, called for side effects.
Opens an interactive Shiny application for text-to-image generation. Requires the shiny and base64enc packages.
sd_app(model_dir = NULL, launch.browser = TRUE, port = NULL, ...)
model_dir |
Path to folder with model files. If provided, the app scans the folder on startup and auto-assigns model roles. |
launch.browser |
Open in browser (default TRUE) |
port |
Port number (default NULL = random) |
... |
Additional arguments passed to |
This function does not return; it runs the Shiny app until stopped.
## Not run:
sd_app()
sd_app(model_dir = "/path/to/models")
## End(Not run)
Cache modes
SD_CACHE_MODE
An object of class list of length 6.
Constructs a list of cache parameters for fine-tuning step caching behavior.
Pass the result as cache_config to generation functions.
sd_cache_params(
  mode = SD_CACHE_MODE$EASYCACHE,
  threshold = 1,
  start_percent = 0.15,
  end_percent = 0.95
)
mode |
Cache mode integer from |
threshold |
Reuse threshold (default 1.0). Lower = more aggressive caching |
start_percent |
Start caching after this fraction of steps (default 0.15) |
end_percent |
Stop caching after this fraction of steps (default 0.95) |
Named list of cache parameters
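As a rough illustration of how start_percent and end_percent bound the caching window, the sketch below lists the steps of a run that would fall inside it. The helper cache_window_steps() is hypothetical and not part of the package, and the rule for mapping fractions to step indices is an assumption; the library may round differently.

```r
# Hypothetical helper, not part of sd2R: which 1-based steps fall inside
# the caching window defined by start_percent and end_percent.
# Assumption: step i of n is eligible when start_percent <= i / n <= end_percent.
cache_window_steps <- function(n_steps, start_percent = 0.15, end_percent = 0.95) {
  stopifnot(n_steps >= 1, start_percent >= 0, end_percent <= 1,
            start_percent <= end_percent)
  frac <- seq_len(n_steps) / n_steps
  which(frac >= start_percent & frac <= end_percent)
}

# Steps of a 20-step run that would be candidates for cache reuse.
steps <- cache_window_steps(20)
print(steps)
```

Under this assumed rule, the first and last few steps are always computed in full, which is the usual motivation for leaving margins at both ends of the schedule.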
Converts a model to a different quantization format.
sd_convert(
  input_path,
  output_path,
  output_type = SD_TYPE$F16,
  vae_path = NULL,
  tensor_type_rules = NULL
)
input_path |
Path to input model file |
output_path |
Path for output model file |
output_type |
Target quantization type (see |
vae_path |
Optional path to separate VAE model |
tensor_type_rules |
Optional tensor type rules string |
TRUE on success
Loads a model and creates a context for image generation.
sd_ctx(
  model_path = NULL,
  vae_path = NULL,
  taesd_path = NULL,
  clip_l_path = NULL,
  clip_g_path = NULL,
  t5xxl_path = NULL,
  llm_path = NULL,
  diffusion_model_path = NULL,
  control_net_path = NULL,
  n_threads = 0L,
  wtype = SD_TYPE$COUNT,
  tensor_type_rules = NULL,
  vae_decode_only = TRUE,
  free_params_immediately = FALSE,
  keep_clip_on_cpu = FALSE,
  keep_vae_on_cpu = FALSE,
  enable_mmap = FALSE,
  vae_conv_direct = TRUE,
  diffusion_conv_direct = FALSE,
  diffusion_flash_attn = TRUE,
  rng_type = RNG_TYPE$CUDA,
  prediction = NULL,
  lora_apply_mode = LORA_APPLY_MODE$AUTO,
  flow_shift = 0,
  model_type = "sd1",
  vram_gb = NULL,
  device_layout = "mono",
  diffusion_gpu = -1L,
  clip_gpu = -1L,
  vae_gpu = -1L,
  verbose = FALSE
)
model_path |
Path to the model file (safetensors, gguf, or checkpoint) |
vae_path |
Optional path to a separate VAE model |
taesd_path |
Optional path to TAESD model for preview |
clip_l_path |
Optional path to CLIP-L model |
clip_g_path |
Optional path to CLIP-G model |
t5xxl_path |
Optional path to T5-XXL model |
llm_path |
Optional path to an LLM text encoder (Qwen3 / Mistral-Small).
Required for models that use an LLM conditioner, e.g. FLUX.2 Klein (Qwen3),
FLUX.2 (Mistral-Small), Z-Image and Qwen-Image. Loaded into the
|
diffusion_model_path |
Optional path to separate diffusion model |
control_net_path |
Optional path to ControlNet model |
n_threads |
Number of CPU threads (0 = auto-detect) |
wtype |
Weight type for quantization (see |
tensor_type_rules |
Optional per-component weight type override, as a
comma-separated string of
Type names match ggml type names ( |
vae_decode_only |
If TRUE, only load VAE decoder (saves memory) |
free_params_immediately |
Free model params after first computation. If TRUE, the context can only be used for a single generation — subsequent calls will crash. Set to TRUE only when you need to save memory and will not reuse the context. Default is FALSE. |
keep_clip_on_cpu |
Keep CLIP model on CPU even when using GPU |
keep_vae_on_cpu |
Keep VAE on CPU even when using GPU |
enable_mmap |
Memory-map model weights from disk instead of reading them into a malloc'd buffer (default FALSE). Lowers RAM footprint for large models (e.g. Flux); pages are loaded on demand by the OS and shared across processes. Ignored for zip-archived weights. May slow the first generation slightly as pages fault in. |
vae_conv_direct |
Use direct Conv2d implementation in VAE (default TRUE). Faster on GPU; skips im2col and uses direct convolution kernels. |
diffusion_conv_direct |
Use direct Conv2d in diffusion model (default FALSE). |
diffusion_flash_attn |
Enable flash attention for diffusion model (default TRUE). Set to FALSE if you experience issues with specific GPU drivers or backends. |
rng_type |
RNG type (see |
prediction |
Prediction type override (see |
lora_apply_mode |
LoRA application mode (see |
flow_shift |
Flow shift value for Flux models |
model_type |
Model architecture hint: |
vram_gb |
Override available VRAM in GB. When set, disables auto-detection
and uses this value for strategy routing. Default |
device_layout |
GPU layout preset for multi-GPU systems. One of:
Ignored when |
diffusion_gpu |
Vulkan GPU device index for the diffusion model.
Default |
clip_gpu |
Vulkan GPU device index for CLIP/T5 text encoders.
Default |
vae_gpu |
Vulkan GPU device index for VAE encoder/decoder.
Default |
verbose |
If |
An external pointer to the SD context (class "sd_ctx") with
attributes model_type, vae_decode_only, vram_gb,
vram_total_gb, and vram_device.
## Not run:
ctx <- sd_ctx("model.safetensors")
imgs <- sd_txt2img(ctx, "a cat sitting on a chair")
sd_save_image(imgs[[1]], "cat.png")
## End(Not run)
Decode a latent into a pixel image (low-level VAE decode)
sd_decode_latent(ctx, latent)
ctx |
SD context |
latent |
An sd_tensor list (e.g. the output of |
An sd_image list (width, height, channel,
data).
Returns a named list of all per-generation defaults used by
sd_generate. Edit the returned list and pass it back via the
params argument to set a reusable baseline; any explicit argument to
sd_generate() overrides the matching field.
sd_default_params()
This is the R-level analogue of IRIS_PARAMS_DEFAULT. It covers
generation knobs only; context-construction options (model paths, devices,
offload, etc.) belong to sd_ctx.
A named list with fields: negative_prompt, width,
height, strength, sample_method, sample_steps,
cfg_scale, seed, batch_count, scheduler,
clip_skip, eta, hr_strength, vae_mode,
vae_tile_size, vae_tile_overlap, cache_mode,
cache_config.
p <- sd_default_params()
p$sample_steps <- 30
p$cfg_scale <- 4.0

## Not run:
ctx <- sd_ctx("model.safetensors", model_type = "auto")
imgs <- sd_generate(ctx, "a cat", params = p)
## End(Not run)
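The precedence described above (explicit arguments override the params baseline, which in turn overrides the built-in defaults) can be pictured with a small stand-alone sketch. The helper resolve_params() is hypothetical and only illustrates the layering; it is not how sd_generate() is implemented.

```r
# Hypothetical helper, not part of sd2R: layer three sources of settings.
# Explicit arguments win over the params baseline, which wins over built-ins.
resolve_params <- function(builtin, params = NULL, explicit = list()) {
  out <- builtin
  if (!is.null(params)) out <- utils::modifyList(out, params)
  utils::modifyList(out, explicit)
}

builtin  <- list(sample_steps = 20L, cfg_scale = 7, seed = 42L)  # toy defaults
baseline <- list(sample_steps = 30L, cfg_scale = 4)              # edited baseline
resolved <- resolve_params(builtin, baseline, explicit = list(cfg_scale = 5))
str(resolved)
```

One caveat of this sketch: utils::modifyList() treats a NULL value as "remove this field", so a real implementation would need to handle NULL-valued settings separately.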
Runs the diffusion model once on x at sigma and returns the
denoised x_0 estimate. The Euler update of x is done by the caller
(see sd_sample_stepwise for the full loop). Must be called
between sd_sampler_begin and sd_sampler_end.
sd_denoise_step(
  ctx,
  x,
  sigma,
  cond,
  uncond = list(crossattn = NULL, vector = NULL, concat = NULL),
  cfg_scale = 7,
  step = 1L,
  total_steps = 1L
)
ctx |
SD context |
x |
Current latent sd_tensor |
sigma |
Current sigma (scalar) |
cond |
Positive conditioning from |
uncond |
Negative conditioning; empty (all |
cfg_scale |
CFG scale (1 disables CFG) |
step, total_steps
|
1-based step index / total, for progress hooks |
An sd_tensor list — the denoised x_0 estimate.
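Because the Euler update is left to the caller, it may help to see the arithmetic on plain numeric vectors. The sketch below uses the standard textbook forms of classifier-free guidance and the Euler step; the function names are hypothetical, the inputs are toy numbers rather than sd_tensor lists, and it is an assumption that the package follows exactly these formulas internally.

```r
# Classifier-free guidance in its usual form: blend the conditional and
# unconditional predictions. With cfg_scale = 1 the result equals the
# conditional prediction, which is why a scale of 1 disables CFG.
cfg_combine <- function(cond_out, uncond_out, cfg_scale) {
  uncond_out + cfg_scale * (cond_out - uncond_out)
}

# One Euler step: move x from sigma to sigma_next along the direction
# implied by the denoised x_0 estimate.
euler_update <- function(x, denoised, sigma, sigma_next) {
  d <- (x - denoised) / sigma
  x + d * (sigma_next - sigma)
}

x        <- c(1.0, -2.0, 0.5)   # toy latent values
denoised <- c(0.2, -0.4, 0.1)   # toy x_0 estimate for this step
x_next   <- euler_update(x, denoised, sigma = 2, sigma_next = 1)
print(x_next)
```

Stepping all the way to sigma_next = 0 returns the denoised estimate itself, which is what the final step of a schedule ending at zero does.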
Encode an image into a latent (low-level VAE encode)
sd_encode_image(ctx, image)
ctx |
SD context (must be built with |
image |
An sd_image list ( |
An sd_tensor list (type, ne, data) — the latent.
Runs only the text-encoder stage of the pipeline, returning the
conditioning tensors (analogue of SDCondition). Building block for
custom pipelines; most users want sd_generate.
sd_encode_text(ctx, prompt, clip_skip = -1L, width = -1L, height = -1L)
ctx |
SD context from |
prompt |
Text prompt |
clip_skip |
CLIP layers to skip (-1 = model default) |
width, height
|
Intended generation size (affects size-conditioning for some models, e.g. SDXL). -1 lets the model decide. |
A conditioning list with elements crossattn, vector,
concat; each is an sd_tensor list (type, ne,
data) or NULL when the model does not produce it.
Automatically selects the best generation strategy based on output resolution
and available VRAM (set via vram_gb in sd_ctx). For
txt2img, routes between direct generation, tiled sampling (MultiDiffusion),
or highres fix. For img2img (when init_image is provided), routes
between direct and tiled img2img.
sd_generate(
  ctx,
  prompt,
  negative_prompt = "",
  width = 512L,
  height = 512L,
  init_image = NULL,
  strength = 0.75,
  sample_method = SAMPLE_METHOD$EULER,
  sample_steps = 20L,
  cfg_scale = 7,
  seed = 42L,
  batch_count = 1L,
  scheduler = SCHEDULER$DISCRETE,
  clip_skip = -1L,
  eta = 0,
  hr_strength = 0.4,
  vae_mode = "auto",
  vae_tile_size = 64L,
  vae_tile_overlap = 0.25,
  cache_mode = c("off", "easy", "ucache"),
  cache_config = NULL,
  params = NULL,
  preview = FALSE,
  preview_path = NULL,
  preview_mode = PREVIEW$PROJ,
  preview_interval = 1L
)
ctx |
SD context created by |
prompt |
Text prompt describing desired image |
negative_prompt |
Negative prompt (default "") |
width |
Image width in pixels (default 512) |
height |
Image height in pixels (default 512) |
init_image |
Optional init image for img2img. If provided, runs img2img
instead of txt2img. Requires |
strength |
Denoising strength for img2img (default 0.75). Ignored for txt2img. |
sample_method |
Sampling method (see |
sample_steps |
Number of sampling steps (default 20) |
cfg_scale |
Classifier-free guidance scale (default 7.0) |
seed |
Random seed (default 42; -1 for random) |
batch_count |
Number of images to generate (default 1) |
scheduler |
Scheduler type (see |
clip_skip |
Number of CLIP layers to skip (-1 = auto) |
eta |
Eta parameter for DDIM-like samplers |
hr_strength |
Denoising strength for highres fix refinement pass (default 0.4). Only used when auto-routing selects highres fix. |
vae_mode |
VAE processing mode: |
vae_tile_size |
Tile size for VAE tiling (default 64) |
vae_tile_overlap |
Overlap for VAE tiling (default 0.25) |
cache_mode |
Step caching mode: |
cache_config |
Optional fine-tuned cache config from
|
params |
Optional baseline list from |
preview |
If |
preview_path |
File path for the preview PPM. Defaults to a tempfile
when |
preview_mode |
Preview decode mode (see |
preview_interval |
Emit a preview every N steps (default 1). |
When vram_gb is not set on the context, defaults to direct generation
(equivalent to calling sd_txt2img or sd_img2img
directly).
List of SD images (or single image for highres fix path).
## Not run:
# Simple: auto-routes based on detected VRAM
ctx <- sd_ctx("model.safetensors", model_type = "sd1", vae_decode_only = FALSE)
imgs <- sd_generate(ctx, "a cat", width = 2048, height = 2048)

# Manual override: force a 4 GB VRAM limit
ctx4 <- sd_ctx("model.safetensors", model_type = "sd1", vram_gb = 4, vae_decode_only = FALSE)
imgs <- sd_generate(ctx4, "a cat", width = 2048, height = 2048)
## End(Not run)
Distributes prompts across available Vulkan GPUs, running one process per
GPU via callr. Each process creates its own sd_ctx and
calls sd_generate. Requires the callr package.
sd_generate_multi_gpu(
  model_path = NULL,
  prompts,
  negative_prompt = "",
  devices = NULL,
  seeds = NULL,
  width = 512L,
  height = 512L,
  model_type = "sd1",
  vram_gb = NULL,
  vae_decode_only = TRUE,
  progress = TRUE,
  diffusion_model_path = NULL,
  vae_path = NULL,
  clip_l_path = NULL,
  t5xxl_path = NULL,
  ...
)
model_path |
Path to the model file (single-file models like SD 1.x/2.x/SDXL) |
prompts |
Character vector of prompts (one image per prompt) |
negative_prompt |
Negative prompt applied to all images (default "") |
devices |
Integer vector of Vulkan device indices (0-based). Default
|
seeds |
Integer vector of seeds, same length as |
width |
Image width (default 512) |
height |
Image height (default 512) |
model_type |
Model type (default "sd1") |
vram_gb |
VRAM per GPU for auto-routing (default NULL) |
vae_decode_only |
VAE decode only (default TRUE) |
progress |
Print progress messages (default TRUE) |
diffusion_model_path |
Path to diffusion model (Flux/multi-file models) |
vae_path |
Path to VAE model |
clip_l_path |
Path to CLIP-L model |
t5xxl_path |
Path to T5-XXL model |
... |
Additional arguments passed to |
List of SD images, one per prompt, in original order.
Release any existing SD context (rm(ctx); gc()) before calling
this function. Holding a Vulkan context in the main process while
subprocesses try to use the same GPU can produce corrupted (grey) images.
## Not run:
# Single-file model (SD 1.x/2.x/SDXL)
imgs <- sd_generate_multi_gpu(
  "model.safetensors",
  prompts = c("a cat", "a dog", "a bird", "a fish"),
  devices = 0:1
)

# Multi-file model (Flux)
imgs <- sd_generate_multi_gpu(
  diffusion_model_path = "flux1-dev-Q4_K_S.gguf",
  vae_path = "ae.safetensors",
  clip_l_path = "clip_l.safetensors",
  t5xxl_path = "t5-v1_1-xxl-encoder-Q5_K_M.gguf",
  prompts = c("a cat", "a dog"),
  model_type = "flux",
  devices = 0:1
)
## End(Not run)
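To make the "one process per GPU, results in original order" behaviour concrete, here is a stand-alone sketch of one way prompts could be split across devices and stitched back together. The round-robin assignment and the helper assign_prompts() are assumptions made for illustration; the actual distribution strategy used by sd_generate_multi_gpu() is not documented here.

```r
# Hypothetical sketch, not the package's scheduler: assign prompt indices to
# devices round-robin, then restore the original prompt order afterwards.
assign_prompts <- function(prompts, devices) {
  device_for_prompt <- rep_len(devices, length(prompts))
  split(seq_along(prompts), device_for_prompt)
}

prompts <- c("a cat", "a dog", "a bird", "a fish")
jobs    <- assign_prompts(prompts, devices = 0:1)

# Pretend each device returned its results in job order; toupper() stands in
# for the per-process generation call.
results_by_device <- lapply(jobs, function(idx) toupper(prompts[idx]))

# Reassemble in original prompt order using the saved indices.
ordered <- character(length(prompts))
for (dev in names(jobs)) ordered[jobs[[dev]]] <- results_by_device[[dev]]
print(ordered)
```

Keeping the index vectors alongside each job is what allows the final list to line up with the input prompts regardless of which process finishes first.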
Runs generation with one or more reference images, as used by edit /
reference-conditioned models (e.g. Qwen-Image, FLUX control/edit variants).
The references are passed straight through to the underlying
generate_image C-API (ref_images); the active model decides how
to use them, so this only has effect on models that support reference
conditioning.
sd_generate_multiref(
  ctx,
  prompt,
  refs,
  negative_prompt = "",
  width = 512L,
  height = 512L,
  auto_resize_ref_image = TRUE,
  increase_ref_index = FALSE,
  sample_method = SAMPLE_METHOD$EULER,
  scheduler = SCHEDULER$DISCRETE,
  sample_steps = 20L,
  cfg_scale = 7,
  seed = 42L,
  clip_skip = -1L,
  eta = 0,
  batch_count = 1L
)
ctx |
SD context from |
prompt |
Text prompt |
refs |
A list of sd_image lists (each with |
negative_prompt |
Negative prompt (default "") |
width, height
|
Output size in pixels |
auto_resize_ref_image |
If |
increase_ref_index |
If |
sample_method, scheduler
|
Sampler / scheduler (name or enum value) |
sample_steps, cfg_scale, seed, clip_skip, eta
|
Standard sampling controls |
batch_count |
Number of images (default 1) |
List of sd_image lists.
Converts the raw uint8 SD image format to a [height, width, channels] numeric array with values in [0, 1] suitable for R image processing.
sd_image_to_array(image)
image |
SD image list (width, height, channel, data) |
3D numeric array [height, width, channels] in [0, 1]
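The conversion can be sketched in base R as follows. This assumes that data stores pixels row by row with interleaved channels (R, G, B, R, G, B, ...), which is the common layout for stb_image-style buffers but is an assumption here; image_to_array_sketch() is a hypothetical stand-in, not the package function.

```r
# Sketch of the conversion, assuming `data` holds pixels row by row with
# channels interleaved. The layout is an assumption, not confirmed by the docs.
image_to_array_sketch <- function(image) {
  values <- as.integer(image$data) / 255
  # R fills arrays with the first index varying fastest, so read the buffer
  # as [channel, x, y] and then reorder the axes.
  cwh <- array(values, dim = c(image$channel, image$width, image$height))
  aperm(cwh, c(3, 2, 1))  # -> [height, width, channels]
}

# A 2x1 toy image: one red pixel followed by one green pixel.
img <- list(width = 2L, height = 1L, channel = 3L,
            data = as.raw(c(255, 0, 0,  0, 255, 0)))
arr <- image_to_array_sketch(img)
print(dim(arr))
```

An array in this shape can be handed to functions that expect [height, width, channels] input, such as grDevices::as.raster().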
Generate images with img2img
sd_img2img(
  ctx,
  prompt,
  init_image,
  negative_prompt = "",
  width = NULL,
  height = NULL,
  sample_method = SAMPLE_METHOD$EULER,
  sample_steps = 20L,
  cfg_scale = 7,
  seed = 42L,
  batch_count = 1L,
  scheduler = SCHEDULER$DISCRETE,
  clip_skip = -1L,
  strength = 0.75,
  eta = 0,
  vae_mode = "auto",
  vae_auto_threshold = 1048576L,
  vae_tile_size = 64L,
  vae_tile_overlap = 0.25,
  vae_tile_rel_x = NULL,
  vae_tile_rel_y = NULL,
  vae_tiling = NULL,
  cache_mode = c("off", "easy", "ucache"),
  cache_config = NULL
)
ctx |
SD context created by |
prompt |
Text prompt describing desired image |
init_image |
Init image in sd_image format. Use |
negative_prompt |
Negative prompt (default "") |
width |
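Image width in pixels. The usage above lists the default as NULL rather than a fixed size. |
height |
Image height in pixels. The usage above lists the default as NULL rather than a fixed size. |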
sample_method |
Sampling method (see |
sample_steps |
Number of sampling steps (default 20) |
cfg_scale |
Classifier-free guidance scale (default 7.0) |
seed |
Random seed (default 42; -1 for random) |
batch_count |
Number of images to generate (default 1) |
scheduler |
Scheduler type (see |
clip_skip |
Number of CLIP layers to skip (-1 = auto) |
strength |
Denoising strength (0.0 = no change, 1.0 = full denoise, default 0.75) |
eta |
Eta parameter for DDIM-like samplers |
vae_mode |
VAE processing mode: |
vae_auto_threshold |
Pixel area fallback threshold for
|
vae_tile_size |
Tile size in latent pixels for tiled VAE (default 64).
Ignored when |
vae_tile_overlap |
Overlap ratio between tiles, 0.0-0.5 (default 0.25) |
vae_tile_rel_x |
Relative tile width as fraction of latent width (0-1)
or number of tiles (>1). NULL = use |
vae_tile_rel_y |
Relative tile height as fraction of latent height (0-1)
or number of tiles (>1). NULL = use |
vae_tiling |
Deprecated. Use |
cache_mode |
Step caching mode: |
cache_config |
Optional fine-tuned cache config from
|
List of SD images
Applies the denoiser's inverse noise scaling after the last step. A no-op for discrete CompVis denoisers (SD1/SD2/SDXL).
sd_inverse_noise_scale(ctx, x, sigma_last)
ctx |
SD context |
x |
Latent sd_tensor after the last step |
sigma_last |
Last sigma of the schedule (typically 0) |
An sd_tensor.
Returns a data frame of all models recorded in the sd2R model registry, with a column indicating which are currently loaded in memory.
sd_list_models()
Data frame with columns: id, model_type, loaded, diffusion_path
Reads a PNG file and converts it to the SD image format (list with width, height, channel, data) suitable for img2img.
sd_load_image(path, channels = 3L)
path |
Path to image file (PNG) |
channels |
Number of output channels (3 for RGB, default) |
SD image list (width, height, channel, data as raw vector)
Loads a model by its registry id. Returns a cached context if already
loaded, otherwise creates a new sd_ctx. Additional
arguments override registry defaults.
sd_load_model(id, ...)
id |
Model identifier from registry |
... |
Additional arguments passed to |
If loading fails due to insufficient VRAM, automatically unloads the least recently used model and retries.
SD context (external pointer)
## Not run:
ctx <- sd_load_model("flux-dev")
imgs <- sd_txt2img(ctx, "a cat in space")

# Override defaults
ctx <- sd_load_model("flux-dev", vae_decode_only = FALSE, verbose = TRUE)
## End(Not run)
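The retry-after-eviction behaviour described for sd_load_model can be pictured with a toy cache. Everything in the sketch below (model_cache, tick(), load_with_eviction(), toy_loader()) is hypothetical and uses plain strings in place of real contexts; it only illustrates the least-recently-used idea and is not the package's implementation.

```r
# Hypothetical sketch, not sd2R internals.
model_cache <- new.env()   # id -> list(ctx, last_used)
clock <- new.env()
clock$now <- 0

# Monotonic counter used as a "last used" stamp.
tick <- function() {
  clock$now <- clock$now + 1
  clock$now
}

load_with_eviction <- function(id, loader) {
  entry <- model_cache[[id]]
  if (!is.null(entry)) {               # already loaded: refresh and reuse
    entry$last_used <- tick()
    model_cache[[id]] <- entry
    return(entry$ctx)
  }
  repeat {
    # A real version would react only to out-of-memory errors.
    ctx <- tryCatch(loader(id), error = function(e) NULL)
    if (!is.null(ctx)) break
    ids <- ls(model_cache)
    if (length(ids) == 0L) stop("cannot load '", id, "': nothing left to evict")
    last_used <- vapply(ids, function(k) model_cache[[k]]$last_used, numeric(1))
    rm(list = ids[which.min(last_used)], envir = model_cache)  # evict the LRU entry
  }
  model_cache[[id]] <- list(ctx = ctx, last_used = tick())
  ctx
}

# Toy loader: pretend only two models fit in VRAM at once.
toy_loader <- function(id) {
  if (length(ls(model_cache)) >= 2L) stop("out of VRAM")
  paste("ctx for", id)
}

for (id in c("sd15-base", "sdxl", "flux-dev")) load_with_eviction(id, toy_loader)
print(sort(ls(model_cache)))
```

In this toy run the third load fails once, the oldest entry is dropped, and the retry succeeds.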
Load pipeline from JSON
sd_load_pipeline(path)
path |
Path to a JSON file saved by |
An sd_pipeline object.
Create a pipeline node
sd_node(type, ...)
type |
Node type: |
... |
Parameters for the node (passed to the corresponding function). |
A list with class "sd_node".
Applies the denoiser's noise scaling for the first sigma, producing the
starting x for the sampling loop. For txt2img pass init_latent
= NULL.
sd_noise_scale(ctx, noise, sigma0, init_latent = NULL)
ctx |
SD context |
noise |
Noise sd_tensor (defines geometry) |
sigma0 |
First sigma of the schedule |
init_latent |
Optional starting latent (img2img); |
An sd_tensor — the scaled starting latent.
Nodes are executed sequentially. The image output of each node is passed as input to the next node.
sd_pipeline(...)
... |
|
A list with class "sd_pipeline".
Installs the preview callback so that, during the next generation, the most
recent intermediate frame is written to path (a single PPM file,
updated atomically). Poll it with sd_read_preview. Call
sd_preview_stop when done.
sd_preview_start(path, mode = PREVIEW$PROJ, interval = 1L, denoised = TRUE)
path |
File path for the preview PPM (e.g. a tempfile). |
mode |
Decode mode, one of |
interval |
Emit a preview every N sampling steps (default 1). |
denoised |
If |
Most users pass preview = TRUE to sd_generate instead,
which wires this up automatically.
Invisibly, path.
sd_read_preview, sd_preview_stop
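An atomic update of a single preview file is typically done by writing the whole frame to a temporary sibling and renaming it over the target, so a polling reader never sees a half-written image. The sketch below shows that pattern for a binary PPM (P6) in base R. The helper write_ppm_atomic() is hypothetical, and the ".tmp" suffix is borrowed from the temporary file mentioned for sd_preview_stop; how the native callback actually writes the file is an assumption.

```r
# Sketch of an atomic file update: write the full frame to "<path>.tmp",
# then rename it over the target in one step.
write_ppm_atomic <- function(path, pixels, width, height) {
  stopifnot(is.raw(pixels), length(pixels) == width * height * 3L)
  tmp <- paste0(path, ".tmp")
  con <- file(tmp, "wb")
  # P6 header: magic number, dimensions, maximum channel value.
  writeBin(charToRaw(sprintf("P6\n%d %d\n255\n", width, height)), con)
  writeBin(pixels, con)
  close(con)
  file.rename(tmp, path)   # TRUE on success
}

ppm <- tempfile(fileext = ".ppm")
ok  <- write_ppm_atomic(ppm, as.raw(rep(128, 2 * 2 * 3)), width = 2L, height = 2L)
print(ok)
```

A reader that opens the target path sees either the previous complete frame or the new one, never a partial write.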
Removes the preview callback and cleans up the temporary .tmp file.
sd_preview_stop()
Invisibly NULL.
Returns a data frame of captured events with columns stage,
kind ("start"/"end"), and timestamp_ms.
Data frame of profile events.
Clears the event buffer and begins capturing stage timings from sd.cpp.
No return value, called for side effects.
Stops capturing stage events. Call sd_profile_get to retrieve.
No return value, called for side effects.
Matches start/end events by stage and computes durations.
sd_profile_summary(events)
events |
Data frame from |
Data frame with columns stage, start_ms,
end_ms, duration_ms, duration_s.
Has class "sd_profile" for pretty printing.
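The start/end matching can be illustrated on a hand-made events table with the documented columns. The sketch assumes each stage appears exactly once; the stage names are made up, and summarise_events() is a hypothetical stand-in rather than the package's own code.

```r
# Sketch of pairing start/end events, assuming each stage appears exactly
# once; repeated stages would need an extra occurrence index.
summarise_events <- function(events) {
  starts <- events[events$kind == "start", c("stage", "timestamp_ms")]
  ends   <- events[events$kind == "end",   c("stage", "timestamp_ms")]
  m <- merge(starts, ends, by = "stage", suffixes = c("_start", "_end"))
  data.frame(
    stage       = m$stage,
    start_ms    = m$timestamp_ms_start,
    end_ms      = m$timestamp_ms_end,
    duration_ms = m$timestamp_ms_end - m$timestamp_ms_start,
    duration_s  = (m$timestamp_ms_end - m$timestamp_ms_start) / 1000
  )
}

# Toy events; real stage names come from the profiler.
events <- data.frame(
  stage        = c("text_encode", "text_encode", "sampling", "sampling"),
  kind         = c("start", "end", "start", "end"),
  timestamp_ms = c(0, 120, 120, 2120)
)
summary_df <- summarise_events(events)
print(summary_df)
```

Note that merge() sorts rows by stage name, so the output order may differ from the order in which stages ran.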
Reads the latest preview PPM written by the running generation and returns
it as an sd_image list. Returns NULL if no preview exists yet (e.g.
generation has not produced a frame). Optionally writes a PNG copy.
sd_read_preview(path, png_path = NULL)
path |
The preview PPM path passed to |
png_path |
Optional path; if set, the frame is also written there as
PNG via |
An sd_image list (width, height, channel,
data), or NULL if unavailable.
Adds or updates a model entry in the sd2R model registry file. The
registry lives in tools::R_user_dir("sd2R", "config") by default
and can be overridden via the SD2R_REGISTRY_DIR environment
variable. The directory is created only when a model is actually
registered. Paths and defaults are stored for later use by
sd_load_model.
sd_register_model(id, model_type, paths, defaults = list(), overwrite = FALSE)
id |
Unique model identifier (e.g. "flux-dev", "sd15-base") |
model_type |
Model architecture: "sd1", "sd2", "sdxl", "flux", "flux2", "sd3" |
paths |
Named list of file paths. Recognized names:
|
defaults |
Named list of generation defaults (optional). Recognized:
|
overwrite |
If FALSE (default), error when id already exists |
Invisibly returns the model id.
## Not run:
sd_register_model(
  id = "flux-dev",
  model_type = "flux",
  paths = list(
    diffusion = "models/flux1-dev-Q4_K_S.gguf",
    vae = "models/ae.safetensors",
    clip_l = "models/clip_l.safetensors",
    t5xxl = "models/t5xxl_fp16.safetensors"
  ),
  defaults = list(steps = 25, cfg_scale = 3.5, width = 1024, height = 1024)
)
## End(Not run)
Removes the model entry from the sd2R model registry and unloads it from memory if loaded.
sd_remove_model(id)
id |
Model identifier |
No return value, called for side effects.
Executes nodes sequentially. The first node must be "txt2img"
(produces an image from nothing). Subsequent nodes receive the previous
node's image output.
sd_run_pipeline(pipeline, ctx, upscaler_ctx = NULL, verbose = FALSE)
pipeline |
An |
ctx |
A Stable Diffusion context created by |
upscaler_ctx |
Optional upscaler context created by
|
verbose |
Logical. Print progress messages. Default |
The final image (sd_image list), or the path string if the last
node is "save".
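Sequential execution, where each node consumes the previous node's output, maps naturally onto a fold. The sketch below models nodes as plain functions and threads a value through them with Reduce(). The names run_nodes(), toy_txt2img() and toy_upscale() are hypothetical, the "image" is a plain list rather than a real sd_image, and this is not how sd_run_pipeline() is implemented.

```r
# Hypothetical sketch, not sd2R internals: each node is modelled as a function
# of the previous node's output, and Reduce() threads the value through.
run_nodes <- function(nodes, init = NULL) {
  Reduce(function(image, node) node(image), nodes, init)
}

# Toy nodes operating on a plain list instead of a real sd_image.
# The first node ignores its input, mirroring a txt2img node that
# produces an image from nothing.
toy_txt2img <- function(image) list(width = 512L, height = 512L)
toy_upscale <- function(image) {
  image$width  <- image$width * 2L
  image$height <- image$height * 2L
  image
}

final <- run_nodes(list(toy_txt2img, toy_upscale))
str(final)
```

The fold makes the ordering constraint visible: swapping the two toy nodes would hand NULL to the upscale step, which is why the first node has to be one that creates an image.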
Runs the full denoising loop given pre-computed conditioning and an explicit
noise tensor. Noise is supplied by the caller for determinism; use
seed to generate it reproducibly, or pass noise directly.
sd_sample(
  ctx,
  cond,
  uncond = list(crossattn = NULL, vector = NULL, concat = NULL),
  latent_shape = NULL,
  init_latent = NULL,
  noise = NULL,
  strength = 1,
  sample_method = SAMPLE_METHOD$EULER,
  scheduler = SCHEDULER$DISCRETE,
  sample_steps = 20L,
  cfg_scale = 7,
  eta = 0,
  seed = 42L,
  custom_sigmas = NULL
)
ctx |
SD context |
cond |
Positive conditioning from |
uncond |
Negative conditioning from |
latent_shape |
Integer vector |
init_latent |
Optional starting latent for img2img (from
|
noise |
Optional explicit noise sd_tensor. When |
strength |
img2img denoising strength (ignored for txt2img) |
sample_method |
Sampling method (name or |
scheduler |
Scheduler (name or |
sample_steps |
Number of steps |
cfg_scale |
CFG scale |
eta |
Eta for DDIM-like samplers |
seed |
Seed for noise generation when |
custom_sigmas |
Optional explicit sigma schedule (overrides scheduler) |
An sd_tensor list — the denoised latent x_0. Pass to
sd_decode_latent.
sd_encode_text, sd_decode_latent
Equivalent to sd_sample for the Euler / Euler-a samplers, but
runs the loop in R so a callback can observe or interrupt each step (e.g.
live preview). For Euler (no ancestral noise) the result is bit-for-bit equal
to sd_sample; Euler-a differs (R RNG vs ggml RNG for the ancestral
term). Other samplers are not supported here — use sd_sample.
sd_sample_stepwise(ctx, cond, uncond = list(crossattn = NULL, vector = NULL, concat = NULL), latent_shape = NULL, init_latent = NULL, noise = NULL, width = 512L, height = 512L, sample_method = SAMPLE_METHOD$EULER, scheduler = SCHEDULER$DISCRETE, sample_steps = 20L, cfg_scale = 7, seed = 42L, custom_sigmas = NULL, on_step = NULL)

| Argument | Description |
|---|---|
| ctx | SD context |
| cond | Positive conditioning from |
| uncond | Negative conditioning; empty (all |
| latent_shape | Integer |
| init_latent | Optional starting latent (img2img); |
| noise | Optional explicit noise sd_tensor; generated from |
| width, height | Generation size in PIXELS (for the sigma schedule) |
| sample_method | |
| scheduler | Scheduler (name or |
| sample_steps | Number of steps |
| cfg_scale | CFG scale |
| seed | Seed for noise generation when |
| custom_sigmas | Optional explicit sigma schedule (overrides scheduler) |
| on_step | Optional callback |

An sd_tensor — the denoised latent x_0.
Between begin and end the diffusion model keeps its GPU compute buffer alive
across sd_denoise_step calls, avoiding a large realloc per
step. Must be paired; sd_sampler_end frees the buffer. Not reentrant.
sd_sample_stepwise manages this for you.
sd_sampler_begin(ctx)
sd_sampler_end(ctx)

| Argument | Description |
|---|---|
| ctx | SD context |

Invisibly NULL.
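Because the begin and end calls must be paired, a common base-R pattern is to register the closing call with `on.exit()` so that it runs even when a step fails. The sketch below shows that pattern with stand-in functions; `with_paired`, `begin_stub`, and `end_stub` are hypothetical names for illustration and are not part of sd2R.

```r
# Run `body` between a begin/end pair, guaranteeing `end` runs even on error.
with_paired <- function(begin, end, body) {
  begin()
  on.exit(end(), add = TRUE)   # registered only after begin() succeeded
  body()
}

# Stand-in functions that just record calls. With sd2R these would be
# function() sd_sampler_begin(ctx) and function() sd_sampler_end(ctx).
calls <- character()
begin_stub <- function() calls <<- c(calls, "begin")
end_stub   <- function() calls <<- c(calls, "end")

res <- tryCatch(
  with_paired(begin_stub, end_stub, function() stop("step failed")),
  error = function(e) "recovered"
)
```

After the simulated failure, `calls` still records both the begin and the end call, which is the property needed to avoid leaving the compute buffer allocated.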
Returns the sigma schedule that sd_sample_stepwise iterates
over, for a given scheduler / step count / generation size.
sd_sampler_sigmas(ctx, scheduler = SCHEDULER$DISCRETE, sample_steps = 20L, width = 512L, height = 512L, sample_method = SAMPLE_METHOD$EULER)

| Argument | Description |
|---|---|
| ctx | SD context from |
| scheduler | Scheduler (name or |
| sample_steps | Number of steps |
| width, height | Generation size in PIXELS (same as passed to generation) |
| sample_method | Sampling method (name or |

Numeric vector of length sample_steps + 1; the last value is 0.
sd_denoise_step, sd_sample_stepwise
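The schedule has `sample_steps + 1` entries because each step moves the latent from one sigma to the next, and the final 0 marks the fully denoised result. The base-R sketch below illustrates how an Euler loop consumes consecutive sigma pairs. It uses a made-up geometric schedule and a toy denoiser; `toy_sigmas`, `euler_loop`, and the `on_step(i, x)` callback shape are assumptions for illustration, not the package's scheduler, `sd_denoise_step`, or its callback signature.

```r
# Illustrative sigma schedule: `steps` decreasing values plus a final 0.
# (A made-up geometric schedule, not one of the package's SCHEDULER types.)
toy_sigmas <- function(steps, sigma_max = 14.6, sigma_min = 0.03) {
  c(exp(seq(log(sigma_max), log(sigma_min), length.out = steps)), 0)
}

# Euler sampling over consecutive sigma pairs.
# `denoise(x, sigma)` must return an estimate of the clean signal.
euler_loop <- function(x, sigmas, denoise, on_step = NULL) {
  for (i in seq_len(length(sigmas) - 1L)) {
    d <- (x - denoise(x, sigmas[i])) / sigmas[i]   # derivative estimate
    x <- x + d * (sigmas[i + 1L] - sigmas[i])      # Euler update
    # Hypothetical callback: returning FALSE stops the loop early.
    if (!is.null(on_step) && isFALSE(on_step(i, x))) break
  }
  x
}

target <- c(0.5, -1, 2)          # what the toy denoiser always predicts
sigmas <- toy_sigmas(20L)
set.seed(42)
x0 <- rnorm(3) * sigmas[1]       # start from noise scaled by the first sigma
seen <- integer()
out <- euler_loop(x0, sigmas, function(x, s) target,
                  on_step = function(i, x) { seen <<- c(seen, i); TRUE })
```

With a denoiser that always predicts the same clean signal, the last step (to sigma 0) lands on that signal, which makes the role of the trailing 0 easy to see.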
Save SD image to PNG file
sd_save_image(image, path)

| Argument | Description |
|---|---|
| image | SD image (list with width, height, channel, data) as returned by |
| path | Output file path (should end in .png) |

The file path (invisibly).
Save pipeline to JSON
sd_save_pipeline(pipeline, path)

| Argument | Description |
|---|---|
| pipeline | An |
| path | File path (should end in |

The file path, invisibly.
Scans for .safetensors and .gguf files, guesses component
roles and model types from filenames, groups multi-file models (Flux),
and registers them.
sd_scan_models(dir, overwrite = FALSE, recursive = FALSE)

| Argument | Description |
|---|---|
| dir | Directory to scan |
| overwrite | If TRUE, overwrite existing entries (default FALSE) |
| recursive | Scan subdirectories (default FALSE) |

Single-file models (SD 1.5, SDXL) are registered individually. Multi-file Flux models are grouped when diffusion + supporting files (VAE, CLIP, T5) are found in the same directory.
Character vector of registered model ids, returned invisibly.
## Not run:
sd_scan_models("/mnt/models/")
sd_list_models()
## End(Not run)
Reports whether the model in ctx consumes reference images (edit /
control / DiT families: Flux, Flux.2, SD3, Qwen-Image, Z-Image). Passing
refs to other models aborts inside ggml, so sd_generate_multiref
uses this to fail cleanly first.
sd_supports_ref_images(ctx)

| Argument | Description |
|---|---|
| ctx | SD context from |

Logical scalar.
Returns information about the stable-diffusion.cpp backend.
sd_system_info()
List with system info, version, and core count
Generate images from text prompt
sd_txt2img(ctx, prompt, negative_prompt = "", width = 512L, height = 512L, sample_method = SAMPLE_METHOD$EULER, sample_steps = 20L, cfg_scale = 7, seed = 42L, batch_count = 1L, scheduler = SCHEDULER$DISCRETE, clip_skip = -1L, eta = 0, control_image = NULL, control_strength = 0.9, vae_mode = "auto", vae_auto_threshold = 1048576L, vae_tile_size = 64L, vae_tile_overlap = 0.25, vae_tile_rel_x = NULL, vae_tile_rel_y = NULL, vae_tiling = NULL, cache_mode = c("off", "easy", "ucache"), cache_config = NULL)

| Argument | Description |
|---|---|
| ctx | SD context created by |
| prompt | Text prompt describing desired image |
| negative_prompt | Negative prompt (default "") |
| width | Image width in pixels (default 512) |
| height | Image height in pixels (default 512) |
| sample_method | Sampling method (see |
| sample_steps | Number of sampling steps (default 20) |
| cfg_scale | Classifier-free guidance scale (default 7.0) |
| seed | Random seed (-1 for random) |
| batch_count | Number of images to generate (default 1) |
| scheduler | Scheduler type (see |
| clip_skip | Number of CLIP layers to skip (-1 = auto) |
| eta | Eta parameter for DDIM-like samplers |
| control_image | Optional control image for ControlNet (sd_image format) |
| control_strength | ControlNet strength (default 0.9) |
| vae_mode | VAE processing mode: |
| vae_auto_threshold | Pixel area fallback threshold for |
| vae_tile_size | Tile size in latent pixels for tiled VAE (default 64). Ignored when |
| vae_tile_overlap | Overlap ratio between tiles, 0.0-0.5 (default 0.25) |
| vae_tile_rel_x | Relative tile width as fraction of latent width (0-1) or number of tiles (>1). NULL = use |
| vae_tile_rel_y | Relative tile height as fraction of latent height (0-1) or number of tiles (>1). NULL = use |
| vae_tiling | Deprecated. Use |
| cache_mode | Step caching mode: |
| cache_config | Optional fine-tuned cache config from |

List of SD images. Each image is a list with
width, height, channel, and data (raw vector of RGB pixels).
Use sd_save_image to save or sd_image_to_array to convert.
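To make the returned list shape concrete, the base-R sketch below builds a tiny synthetic image with the same fields and reshapes its raw bytes into a height x width x channel array. The byte layout (interleaved R, G, B per pixel, rows from top to bottom) is an assumption made for this example, and `image_to_array` is a hypothetical helper; for real images the package's own `sd_image_to_array` is the function to use.

```r
# Synthetic 3x2 RGB image in the documented list shape.
# ASSUMPTION: bytes are interleaved (R,G,B per pixel), rows top to bottom.
img <- list(
  width = 3L, height = 2L, channel = 3L,
  data = as.raw(c(
    255, 0, 0,    0, 255, 0,      0, 0, 255,       # row 1: red, green, blue
    0, 0, 0,      128, 128, 128,  255, 255, 255    # row 2: black, grey, white
  ))
)

# Hypothetical helper, not the package's sd_image_to_array().
image_to_array <- function(img) {
  stopifnot(length(img$data) == img$width * img$height * img$channel)
  # Fill as channel x width x height (first index varies fastest),
  # then reorder to height x width x channel and scale to [0, 1].
  a <- array(as.integer(img$data),
             dim = c(img$channel, img$width, img$height))
  aperm(a, c(3, 2, 1)) / 255
}

arr <- image_to_array(img)
```

Here `arr[1, 1, ]` holds the red pixel in the top-left corner and `arr[2, 3, ]` the white pixel in the bottom-right corner.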
Generates a large image by independently rendering overlapping patches at
the model's native resolution, then stitching them with linear blending.
An optional img2img harmonization pass can smooth seams further.
sd_txt2img_highres(ctx, prompt, negative_prompt = "", width = 2048L, height = 2048L, tile_size = NULL, overlap = 0.125, img2img_strength = NULL, sample_method = SAMPLE_METHOD$EULER, sample_steps = 20L, cfg_scale = 7, seed = 42L, scheduler = SCHEDULER$DISCRETE, clip_skip = -1L, eta = 0, vae_mode = "auto", vae_auto_threshold = 1048576L, vae_tile_size = 64L, vae_tile_overlap = 0.25)

| Argument | Description |
|---|---|
| ctx | SD context created by |
| prompt | Text prompt |
| negative_prompt | Negative prompt (default "") |
| width | Target image width in pixels |
| height | Target image height in pixels |
| tile_size | Patch size in pixels. |
| overlap | Overlap between patches as fraction of |
| img2img_strength | If not |
| sample_method | Sampling method (see |
| sample_steps | Number of sampling steps (default 20) |
| cfg_scale | Classifier-free guidance scale (default 7.0) |
| seed | Base random seed. Each patch gets |
| scheduler | Scheduler type (see |
| clip_skip | Number of CLIP layers to skip (-1 = auto) |
| eta | Eta parameter for DDIM-like samplers |
| vae_mode | VAE tiling mode for the harmonization pass (default |
| vae_auto_threshold | Pixel area fallback threshold for auto VAE tiling when VRAM query is unavailable |
| vae_tile_size | Tile size for VAE tiling (default 64) |
| vae_tile_overlap | Overlap for VAE tiling (default 0.25) |

SD image (list with width, height, channel, data)
## Not run:
ctx <- sd_ctx("sd15.safetensors", model_type = "sd1")
img <- sd_txt2img_highres(ctx, "a panoramic mountain landscape",
                          width = 2048, height = 1024)
sd_save_image(img, "panorama.png")
## End(Not run)
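The patch-and-stitch idea can be illustrated along one axis: choose overlapping patch offsets that cover the target width, then cross-fade linearly wherever two patches share pixels. The base-R sketch below is an illustration under stated assumptions, not the package's layout code: `tile_starts` and `blend_weights` are hypothetical helpers, the overlap is assumed to be a fraction of the patch size, and a 512-pixel patch is assumed for the demo.

```r
# Start offsets (0-based pixels) of patches covering `total` with overlap.
tile_starts <- function(total, tile, overlap = 0.125) {
  if (tile >= total) return(0L)
  stride <- max(1L, as.integer(round(tile * (1 - overlap))))
  starts <- seq(0L, total - tile, by = stride)
  # Make sure the last patch touches the far edge.
  if (starts[length(starts)] != total - tile) {
    starts <- c(starts, total - tile)
  }
  as.integer(starts)
}

# Linear cross-fade weights for the pixels two neighbouring patches share.
blend_weights <- function(n_overlap) {
  w_right <- seq_len(n_overlap) / (n_overlap + 1)
  list(left = 1 - w_right, right = w_right)
}

starts <- tile_starts(total = 2048L, tile = 512L, overlap = 0.125)
w <- blend_weights(64L)
```

In this sketch the final patch is shifted back to end exactly at the image edge, so it overlaps its neighbour by more than the nominal amount; the two weight vectors always sum to 1, which keeps brightness constant across a seam.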
Generates images at any resolution using tiled sampling: at each denoising step the latent is split into overlapping tiles, each tile is denoised independently by the UNet, and results are merged with Gaussian weighting. VRAM usage is bounded by tile size, not output resolution.
sd_txt2img_tiled(ctx, prompt, negative_prompt = "", width = 2048L, height = 2048L, sample_tile_size = NULL, sample_tile_overlap = 0.25, sample_method = SAMPLE_METHOD$EULER, sample_steps = 20L, cfg_scale = 7, seed = 42L, batch_count = 1L, scheduler = SCHEDULER$DISCRETE, clip_skip = -1L, eta = 0, vae_mode = "auto", vae_auto_threshold = 1048576L, vae_tile_size = 64L, vae_tile_overlap = 0.25, vae_tile_rel_x = NULL, vae_tile_rel_y = NULL, cache_mode = c("off", "easy", "ucache"), cache_config = NULL)

| Argument | Description |
|---|---|
| ctx | SD context created by |
| prompt | Text prompt describing desired image |
| negative_prompt | Negative prompt (default "") |
| width | Target image width in pixels (can exceed model native resolution) |
| height | Target image height in pixels |
| sample_tile_size | Tile size in latent pixels (default |
| sample_tile_overlap | Overlap between tiles as fraction of tile size, 0.0-0.5 (default 0.25). |
| sample_method | Sampling method (see |
| sample_steps | Number of sampling steps (default 20) |
| cfg_scale | Classifier-free guidance scale (default 7.0) |
| seed | Random seed (-1 for random) |
| batch_count | Number of images to generate (default 1) |
| scheduler | Scheduler type (see |
| clip_skip | Number of CLIP layers to skip (-1 = auto) |
| eta | Eta parameter for DDIM-like samplers |
| vae_mode | VAE processing mode: |
| vae_auto_threshold | Pixel area fallback threshold for |
| vae_tile_size | Tile size in latent pixels for tiled VAE (default 64). Ignored when |
| vae_tile_overlap | Overlap ratio between tiles, 0.0-0.5 (default 0.25) |
| vae_tile_rel_x | Relative tile width as fraction of latent width (0-1) or number of tiles (>1). NULL = use |
| vae_tile_rel_y | Relative tile height as fraction of latent height (0-1) or number of tiles (>1). NULL = use |
| cache_mode | Step caching mode: |
| cache_config | Optional fine-tuned cache config from |

Requires tiled VAE (enabled automatically via vae_mode = "auto").
List of SD images
## Not run:
ctx <- sd_ctx("sd15.safetensors", model_type = "sd1")
imgs <- sd_txt2img_tiled(ctx, "a vast mountain landscape",
                         width = 2048, height = 1024)
sd_save_image(imgs[[1]], "landscape.png")
## End(Not run)
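The Gaussian-weighted merge used by tiled sampling can be shown in one dimension: every tile contributes its values multiplied by a bell-shaped weight, and the accumulated sum is divided by the accumulated weight. The base-R sketch below is only an illustration of that weighting scheme; `gauss_weights`, `merge_tiles`, and the `sd_frac` width parameter are hypothetical and not taken from the package.

```r
# Gaussian weight profile across one tile (peak in the centre).
gauss_weights <- function(n, sd_frac = 0.25) {
  pos <- seq_len(n) - (n + 1) / 2
  exp(-pos^2 / (2 * (sd_frac * n)^2))
}

# Merge 1-D tiles into a signal of length `total`:
# weighted sum of tile values divided by the sum of weights.
merge_tiles <- function(tiles, starts, total) {
  acc <- numeric(total)
  wsum <- numeric(total)
  for (k in seq_along(tiles)) {
    idx <- starts[k] + seq_along(tiles[[k]])   # starts are 0-based offsets
    w <- gauss_weights(length(tiles[[k]]))
    acc[idx] <- acc[idx] + w * tiles[[k]]
    wsum[idx] <- wsum[idx] + w
  }
  acc / wsum
}

# Two overlapping tiles holding different constant values.
tiles <- list(rep(1, 8), rep(3, 8))
merged <- merge_tiles(tiles, starts = c(0L, 4L), total = 12L)
```

Where only one tile covers a position the merged value equals that tile's value; inside the overlap the result slides smoothly from one tile's value to the other's, because each tile's weight falls off toward its own edge.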
Weight types (ggml quantization types)
SD_TYPE
An object of class list of length 15.
Removes all cached contexts. Registry is preserved.
sd_unload_all()
No return value, called for side effects.
Removes the cached context for the given model id. The model remains
in the registry and can be reloaded with sd_load_model.
sd_unload_model(id)

| Argument | Description |
|---|---|
| id | Model identifier |

No return value, called for side effects.
Upscale an image using ESRGAN
sd_upscale_image(esrgan_path, image, upscale_factor = 4L, n_threads = 0L)

| Argument | Description |
|---|---|
| esrgan_path | Path to ESRGAN model file |
| image | SD image to upscale (list with width, height, channel, data) |
| upscale_factor | Upscale factor (default 4) |
| n_threads | Number of CPU threads (0 = auto-detect) |

Upscaled SD image
Returns the number of Vulkan-capable GPU devices available on the system.
Useful for deciding whether to use sd_generate_multi_gpu.
sd_vulkan_device_count()
Integer, number of Vulkan devices (0 if Vulkan is not available)