Pp Agent Capture

Purpose

macOS screen capture, window recording, GIF conversion, and agent evidence bundles from the terminal. Built on ScreenCaptureKit for window-level targeting ffmpeg cannot do. Use when the user wants a screenshot of a specific window or app, a screen recording, a GIF conversion, a before/after diff, an evidence bundle for a PR, OCR text from a window, a terminal VHS recording, a Remotion render, or wants to watch a UI for changes. Requires macOS Screen Recording permission on first run.

Description

Agent Capture - Printing Press CLI

Prerequisites: Install the CLI

This skill drives the agent-capture-pp-cli binary. You must verify the CLI is installed before invoking any command from this skill. If it is missing, install it first:

  1. Install via the Printing Press installer:
    npx -y @mvanhorn/printing-press install agent-capture --cli-only
    
  2. Verify: agent-capture-pp-cli --version
  3. Ensure $GOPATH/bin (or $HOME/go/bin) is on $PATH.

If the npx install fails (no Node, offline, etc.), fall back to a direct Go install (requires Go 1.26.3 or newer):

go install github.com/mvanhorn/printing-press-library/library/developer-tools/agent-capture/cmd/agent-capture-pp-cli@latest

If --version reports "command not found" after install, the install step did not put the binary on $PATH. Do not proceed with skill commands until verification succeeds.

When to Use This CLI

Reach for this when the user wants:

  • screenshot a specific window or app (screenshot, batch for multiple)
  • record video of a window, app, display, or region (record)
  • convert a recording to an optimized GIF (convert)
  • do a full capture + record + GIF pipeline in one command (pipeline)
  • diff against a baseline screenshot (diff) for before/after evidence
  • bundle screenshots + recording + GIF as evidence for a PR or bug report (evidence)
  • find the right window by fuzzy-matching its title (find)
  • stitch multiple screenshots into an animated GIF (stitch)
  • extract text from a window using macOS Vision OCR (ocr)
  • record a terminal session via VHS tape files (vhs)
  • render Remotion compositions to video or stills (remotion)
  • monitor a UI by periodic screenshots (watch)
  • save and replay capture configs (preset)

Skip it on non-macOS hosts; the CLI uses ScreenCaptureKit (macOS only). On first run it will prompt for Screen Recording permission; the permissions command guides that flow.

Argument Parsing

Parse $ARGUMENTS:

  1. Empty, help, or --help -> show agent-capture --help
  2. Starts with install -> CLI installation (no MCP server ships today)
  3. Anything else -> Direct Use (map to the best command and run it)

Direct Use

  1. Check installed: which agent-capture. If missing, offer CLI installation.
  2. If permissions aren't granted, run agent-capture permissions first.
  3. Use agent-capture list to see available capture targets (open windows, displays).
  4. Use agent-capture find <text> to fuzzy-match a window title before capturing.
  5. Execute with --json for structured output (agent-native default):
    agent-capture <command> [args] --json
    

Notable Commands

CommandWhat it does
screenshotCapture a window, app, display, or region
recordRecord video of a window, app, display, or region
pipelineRecord + convert + optimize in one command
convertVideo -> optimized GIF (two-pass palette)
diffCapture + diff against a baseline
evidenceFull bundle (screenshots + recording + GIF) for a PR
batchScreenshot multiple apps in one invocation
findFuzzy search open window titles
listList available capture targets
ocrExtract text from a window using macOS Vision
stitchCombine screenshots into an animated GIF
vhsRun a VHS tape file for terminal recording
remotionRender Remotion compositions
watchPeriodic capture for UI monitoring
presetSave / load capture configs
permissionsGuide Screen Recording permission setup
healthMachine-readable CI / agent preflight

Run any command with --help for full flag documentation.

Agent Mode

Add --agent to any command. Expands to: --json --compact --no-input --no-color --yes.

  • Pipeable — JSON on stdout, errors on stderr
  • Filterable--select keeps a subset of fields, with dotted-path support (see below)
  • Previewable--dry-run shows the request without sending
  • Cacheable — GET responses cached for 5 minutes, bypass with --no-cache
  • Non-interactive — never prompts, every input is a flag

Filtering output

--select accepts dotted paths to descend into nested responses; arrays traverse element-wise:

<cli>-pp-cli <command> --agent --select id,name
<cli>-pp-cli <command> --agent --select items.id,items.owner.name

Use this to narrow huge payloads to the fields you actually need — critical for deeply nested API responses.

Exit Codes

CodeMeaning
0Success
2Usage error (wrong arguments)
3Target not found (no matching window or display)
4Permissions missing (Screen Recording not granted)
5Capture error (ScreenCaptureKit failure, ffmpeg failure)

Source

Codex

Capabilities

  • See original source for capabilities.

Inputs

Inputs depend on the skill's trigger and arguments. See the source SKILL.md.

Outputs

Outputs depend on the skill. Typical: files written, reports generated, agent actions performed.

When To Use

When the user invokes /pp-agent-capture or describes a task the skill's description matches.

Dependencies

See the source skill's references and scripts folders.

Related Systems

  • Claude (if synced from ~/.claude/skills/pp-agent-capture)
  • HQ Project — landing page Skills section
  • MD Project (md.sgnk.ai) — Skills/Pp Agent Capture/
  • Obsidian Vault — Skills/Pp Agent Capture/

Examples

See workflow.md.


Source: ~/.codex/skills/pp-agent-capture/SKILL.md Category: Printing Press