Unified Container → WIS → JIT on GPU nodes
One image runs on both GPU vendors: no config conflicts, no rebuilds.
Execute from CPU-only dev/CI machines while kernels run on a shared GPU pool.
More workloads per GPU with consistent performance.
Wooly Controller to manage client kernel requests across multiple GPU clusters – the controller routes client CUDA kernels to available GPUs based on live utilization and saturation metrics.
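A minimal Python sketch of that routing idea, purely for illustration: the GpuStats type, its fields, and the scoring rule are assumptions for this sketch, not the actual Wooly Controller logic.

```python
# Hypothetical sketch of utilization-aware GPU selection (not Wooly's code).
from dataclasses import dataclass

@dataclass
class GpuStats:
    cluster: str
    gpu_id: int
    utilization: float  # assumed live busy fraction, 0.0-1.0
    saturation: float   # assumed queue-pressure metric, 0.0-1.0

def pick_gpu(pool: list[GpuStats]) -> GpuStats:
    # Route to the least-saturated, then least-utilized, GPU in the pool.
    return min(pool, key=lambda g: (g.saturation, g.utilization))

# Example: this kernel request would be routed to GPU 1 on cluster "b".
pool = [GpuStats("a", 0, 0.9, 0.7), GpuStats("b", 1, 0.2, 0.1)]
print(pick_gpu(pool))
```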
Integration with Kubernetes – Use the Wooly Client Docker image with your existing K8s workflow to spin up and manage ML dev environments. K8s pods are not bound to specific GPUs.
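For illustration, a minimal sketch using the Kubernetes Python client to launch such a dev pod. The image tag woolyai/client:latest, the namespace, and the resource figures are placeholders, not published values; note that no GPU resource is requested, since the pod is not bound to a specific GPU.

```python
from kubernetes import client, config

config.load_kube_config()  # assumes a working kubeconfig

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="ml-dev", labels={"app": "wooly-dev"}),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="wooly-client",
                image="woolyai/client:latest",  # placeholder image tag
                # CPU/memory only: no nvidia.com/gpu request, because the
                # pod is not bound to a specific GPU.
                resources=client.V1ResourceRequirements(
                    requests={"cpu": "4", "memory": "8Gi"},
                ),
            )
        ],
    ),
)
client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```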
Ray for orchestration, Wooly for all GPU work – the Ray head and workers run on CPU instances (or a mix); each worker uses the Wooly Client container, so Ray never binds a real GPU.
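A minimal sketch of this pattern, assuming Ray and PyTorch are installed in the worker image. The key point is that the task requests only CPUs; the device="cuda" work is assumed to be forwarded by the Wooly Client container to the shared GPU pool rather than a locally attached device.

```python
import ray

ray.init(address="auto")  # join the existing CPU-only Ray cluster

@ray.remote(num_cpus=1)  # no num_gpus: Ray schedules on CPU resources only
def matmul_sum() -> float:
    import torch
    # Assumes this worker runs inside the Wooly Client container, where CUDA
    # calls are forwarded to the shared GPU pool instead of a local device.
    x = torch.randn(1024, 1024, device="cuda")
    return (x @ x).sum().item()

print(ray.get(matmul_sum.remote()))
```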