Your agent stopped suggesting code and started running it.

Today's launches are all fence, no engine. The shift that makes them matter is "code mode": agents like VLM Run's Orion 2 now write a whole program and execute it end to end instead of asking permission one tool call at a time. The unit of risk used to be a function call you could approve; now it's a script that already ran, plus whatever it installed and whatever it read on the way. So the interesting work this week isn't a smarter agent. It's the perimeter around a dumb one you can't fully trust: a box to run it in, a leash on what it installs, a blindfold over what it sees. (There's even a benchmark now, islo-labs' RewardHackBench, for measuring whether the box actually holds when the agent tries to cheat its way out.)

Vpod — a Linux sandbox that boots in WASM

An open-source CLI plus Python SDK that spins up a throwaway Linux sandbox to run untrusted code, built by emulating a RISC-V machine inside WebAssembly. You pull a snapshot (Alpine by default) and it boots in under a second; everything runs inside the WASM boundary with only the filesystem, network, and I/O you hand it through WASI. Reach for it when your agent wants to run code you didn't write and you'd rather not give it your actual machine — `pip install vpod`, `Sandbox.create()`, done, no daemon and no root. Delete the "I'll just run it in a Docker container I forgot to lock down" habit, and the heavier Firecracker setup for cases where you only need process-level isolation. The tradeoff is honest: emulating a CPU in WASM means CPU-bound work runs slower than native, so this is a fence for untrusted code, not a place to do real compute. Apache-2.0, 42 stars, actively pushed.

→ github.com/capsulerun/vpod

VELA — the same problem, solved with microVMs instead

The opposite bet from Vpod, shipped the same week. Where Vpod isolates with WebAssembly, VELA wraps agent code in Firecracker microVMs (~150ms cold start) and gates it with HMAC capability tokens: scoped, time-bound grants for filesystem, network, and memory, with a JSONL audit trail of everything the code did. It ships Python wrappers and LangChain adapters, so you route an agent's tool calls into the sandbox without rewriting the agent. That two projects independently shipped "a box for code my agent wrote" this week, with opposite isolation philosophies, is the tell that this is a real category forming. Delete the bare `exec()` at the end of your LangChain tool. The tradeoff deserves bluntness for once: this is one developer's project, one star, last pushed two weeks ago, and the repo carries no license file despite the launch page claiming MIT. Read it for the capability-token design, which is the cleanest part, not as a dependency to adopt today. If you need this in production now, the microVM idea is sound but the maintenance isn't proven.
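
If "HMAC capability token" is new to you, here is a minimal sketch of the general idea in plain Python. It is not VELA's actual token format or API: the scope strings, the `issue_token` and `check_token` names, and the demo secret are all made up for illustration. The point is only that a grant carries its own scopes and expiry, and the signature makes both tamper-evident.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"demo-secret-change-me"  # hypothetical key; a real system would load this securely


def issue_token(scopes, ttl_seconds, now=None, secret=SECRET):
    """Return 'payload.signature' granting `scopes` until now + ttl_seconds."""
    now = time.time() if now is None else now
    claims = {"scopes": sorted(scopes), "exp": now + ttl_seconds}
    body = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode()).decode()
    sig = hmac.new(secret, body.encode(), hashlib.sha256).hexdigest()
    return f"{body}.{sig}"


def check_token(token, needed_scope, now=None, secret=SECRET):
    """True only if the signature verifies, the token is unexpired, and the scope was granted."""
    now = time.time() if now is None else now
    try:
        body, sig = token.rsplit(".", 1)
    except ValueError:
        return False  # no separator, so not a token we issued
    expected = hmac.new(secret, body.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison, and it happens before the payload is parsed at all.
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return False
    claims = json.loads(base64.urlsafe_b64decode(body))
    return now < claims["exp"] and needed_scope in claims["scopes"]


if __name__ == "__main__":
    token = issue_token(["fs:read:/tmp"], ttl_seconds=60)
    print(check_token(token, "fs:read:/tmp"))  # granted scope
    print(check_token(token, "net:connect"))   # never granted
```

The sandbox would call something like `check_token` before every filesystem or network operation, which is also the natural place to append a line to an audit log.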

→ github.com/karnati-praveen/VELA

Refuse — block the CVE before your agent installs it

A single binary that sits in front of npm, pip, cargo, go, and a dozen-plus other package managers and checks each package against live CVE data at install time — before it touches disk, not after. If something's vulnerable it blocks the install and prints the safe version (`blocked — CVE-2019-10744 (critical); safe: 4.17.21`). The pitch names the actual new threat directly: coding agents like Claude Code and Cursor cheerfully `npm install` a pinned version with a known critical CVE because that's the version that was in the training data, and nothing stops them. Refuse does. Delete the post-hoc `npm audit` you run after the dependency is already in your lockfile — this moves the check left, to install time, where it can still say no. The tradeoff: it's self-hostable as one Docker container with no telemetry, but the project is brand new (the CLI repo is at two stars), so you're trusting a fresh tool to sit in your install path. Apache-2.0, hosted tier available if you don't want to run it yourself.
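
The install-time gate is easier to picture as code. What follows is a toy sketch, not Refuse's implementation: the advisory table is hand-written rather than live CVE data, `some-package` is a placeholder name, the version parser only handles plain dotted numbers, and the message format is a loose imitation of the sample above.

```python
# Hand-written stand-in for a live advisory feed (illustrative only).
ADVISORIES = {
    "some-package": [
        {"cve": "CVE-2019-10744", "severity": "critical", "safe": "4.17.21"},
    ],
}


def parse_version(text):
    """Turn '4.17.21' into (4, 17, 21) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in text.split("."))


def check_install(package, version, advisories=ADVISORIES):
    """Return (allowed, message); meant to run before anything is written to disk."""
    for adv in advisories.get(package, []):
        if parse_version(version) < parse_version(adv["safe"]):
            return False, f"blocked: {adv['cve']} ({adv['severity']}); safe: {adv['safe']}"
    return True, "ok"


if __name__ == "__main__":
    print(check_install("some-package", "4.17.11"))
    print(check_install("some-package", "4.17.21"))
```

The design choice that matters is the return value: the gate answers before the package manager proceeds, so "no" is still an available answer.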

→ refuse.dev

pii-gui — redact it locally before any model sees it

A cross-platform desktop app (Tauri, so a real native binary for macOS, Windows, and Linux — not a web tab) that finds and redacts personal data in your documents entirely on-device before you paste them into an AI tool. Detection runs in a Rust backend with no network calls except an optional one-time model download: regex for emails, phone numbers, account numbers, and secrets, plus optional ONNX models (OpenAI's privacy filter, a multilingual EU-PII model) for the fuzzier cases. You review and toggle each match, then export — and for PDFs the redaction is burned in as opaque rectangles, so the text underneath is actually gone, not just visually covered. Delete the habit of pasting a customer email thread straight into a chat window and hoping. The tradeoff is that it's a manual review step you have to actually run, not a transparent proxy that catches leaks automatically — and at 14 stars it's early, though the feature set (PDF burn-in, persistence, localization) is further along than the star count suggests. AGPL-3.0, installers on the releases page.
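
The regex half of that pipeline, plus the review-then-export step, fits in a few lines. This is a rough Python sketch of the pattern, not pii-gui's code (which is Rust): the two patterns are deliberately simple assumptions that will miss real-world formats, and the function names are hypothetical.

```python
import re

# Simplified, illustrative patterns; production detectors need far more care.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def find_matches(text):
    """Return (start, end, label) for every pattern hit, sorted by position."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), match.end(), label))
    return sorted(hits)


def redact(text, accepted_hits):
    """Replace each accepted span with a [LABEL] placeholder.

    Works right to left so earlier offsets stay valid; assumes spans do not overlap.
    """
    for start, end, label in sorted(accepted_hits, reverse=True):
        text = text[:start] + f"[{label}]" + text[end:]
    return text


if __name__ == "__main__":
    sample = "Reach Ana at ana@example.com or +1 (555) 010-9999."
    hits = find_matches(sample)
    # The "review" step: keep only the matches the user left toggled on.
    print(redact(sample, hits))
```

Passing a filtered list to `redact` is the stand-in for toggling matches off in the review screen.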

→ github.com/sophia486/pii-gui
