paper-fetch

DOI → PDF resolver with 7-source fallback (Unpaywall, S2, arXiv, PMC, bioRxiv, publisher, Sci-Hub). Multi-agent, zero-deps Python.

Stars: 96
License: MIT
Last updated: 2026-05-20
Source: GitHub

paper-fetch — Download scientific paper PDFs by DOI

Resolve a DOI (or title) to a PDF via a 7-source fallback chain: Unpaywall → Semantic Scholar → arXiv → PubMed Central → bioRxiv/medRxiv → publisher direct → Sci-Hub mirrors. Pure Python stdlib, agent-native CLI with stable JSON envelopes.

What it does

Resolve a DOI (or title) to a PDF

  • 7-source fallback chain: Unpaywall → Semantic Scholar → arXiv → PubMed Central → bioRxiv/medRxiv → publisher direct (institutional opt-in) → Sci-Hub mirrors (last resort, on by default)
  • Title-only input via --title — Crossref + Semantic Scholar resolution with confidence flags
  • Auto-named output: {first_author}_{year}_{journal_abbrev}_{short_title}.pdf
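
The ordered fallback described above can be pictured as a list of resolvers tried one after another until one returns a URL. The following is a minimal sketch of that pattern, not the skill's actual code: `resolve_pdf_url`, `fake_unpaywall`, and `fake_arxiv` are hypothetical names, and the stand-in resolvers make no network calls.

```python
from typing import Callable, Iterable, Optional, Tuple

# A resolver takes a DOI and returns a candidate PDF URL, or None if it has nothing.
Resolver = Callable[[str], Optional[str]]


def resolve_pdf_url(
    doi: str, resolvers: Iterable[Tuple[str, Resolver]]
) -> Optional[Tuple[str, str]]:
    """Try each (name, resolver) pair in order; return (source_name, url) for the first hit."""
    for name, resolver in resolvers:
        try:
            url = resolver(doi)
        except Exception:
            # A failing source should not abort the chain; move on to the next one.
            continue
        if url:
            return name, url
    return None


# Hypothetical stand-in resolvers, for illustration only (no network access).
def fake_unpaywall(doi: str) -> Optional[str]:
    return None  # pretend this source has no open-access copy


def fake_arxiv(doi: str) -> Optional[str]:
    return "https://example.org/paper.pdf"


chain = [("unpaywall", fake_unpaywall), ("arxiv", fake_arxiv)]
result = resolve_pdf_url("10.1000/example", chain)
```

Keeping the chain as plain data (a list of name and function pairs) makes the order explicit and lets a source be dropped or added without touching the loop.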

Batch + agent-friendly

  • --batch dois.txt or --batch - (stdin) for bulk download
  • --idempotency-key replays the exact envelope on retry without network I/O
  • --stream emits one NDJSON result per line as each DOI resolves
  • Skips already-downloaded files unless --overwrite
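
One way to get the replay behaviour of --idempotency-key is to store the finished envelope under a hash of the key and return the stored copy on any later call. This is a sketch under assumptions: the cache layout, the `run_idempotent` helper, and the envelope fields are all hypothetical and are not taken from the skill's source.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Callable


def run_idempotent(key: str, cache_dir: Path, work: Callable[[], dict]) -> dict:
    """Return the stored envelope for `key` if present; otherwise run `work` and store it."""
    # Hash the key so any user-supplied string maps to a safe file name.
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    cache_file = cache_dir / f"{digest}.json"
    if cache_file.exists():
        # Replay path: `work` is never called, so no network I/O happens.
        return json.loads(cache_file.read_text(encoding="utf-8"))
    envelope = work()
    cache_file.write_text(json.dumps(envelope), encoding="utf-8")
    return envelope


calls = []


def fake_batch() -> dict:
    calls.append(1)
    return {"ok": True, "results": ["10.1000/example"]}  # hypothetical envelope shape


with tempfile.TemporaryDirectory() as tmp:
    first = run_idempotent("monday-review-batch", Path(tmp), fake_batch)
    second = run_idempotent("monday-review-batch", Path(tmp), fake_batch)
```

Here `fake_batch` runs once even though `run_idempotent` is called twice with the same key, which is the property an orchestrator relies on when it retries a step.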

Built-in correctness

  • Stable JSON envelope on stdout, NDJSON progress on stderr, machine-readable schema subcommand
  • TTY-aware format default, typed exit codes (0/1/3/4) for orchestrator routing
  • SSRF defense + %PDF magic-byte check + 50 MB size cap on every fetch
  • Zero runtime dependencies — pure Python stdlib
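
The %PDF magic-byte check and the size cap can be illustrated with a small reader over any binary stream. This is a sketch, not the skill's implementation: `read_validated_pdf` is a hypothetical helper, and treating 50 MB as 50 × 1024 × 1024 bytes is an assumption.

```python
import io
from typing import BinaryIO

MAX_PDF_BYTES = 50 * 1024 * 1024  # assumption: "50 MB" interpreted as 50 MiB
PDF_MAGIC = b"%PDF"


def read_validated_pdf(stream: BinaryIO, max_bytes: int = MAX_PDF_BYTES) -> bytes:
    """Read a PDF from `stream`, rejecting non-PDF bodies and bodies over `max_bytes`."""
    head = stream.read(len(PDF_MAGIC))
    if head != PDF_MAGIC:
        # Typical case: a publisher redirect that served an HTML landing page.
        raise ValueError("not a PDF: missing %PDF magic bytes")
    # Ask for one byte beyond the cap so an oversized body can be detected
    # without reading it in full.
    rest = stream.read(max_bytes - len(head) + 1)
    if len(head) + len(rest) > max_bytes:
        raise ValueError("PDF exceeds size cap")
    return head + rest


pdf_bytes = read_validated_pdf(io.BytesIO(b"%PDF-1.7 example body"))
```

Checking the first four bytes before reading the rest means an HTML page is rejected after a 4-byte read rather than after a full download.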

Works with Claude Code, Codex, Hermes, OpenClaw, ClawHub, pi-mono, and SkillsMP — any agent that supports the Agent Skills format.

Discipline coverage

The skill is discipline-agnostic — it works for any field, not just life sciences or CS.

| Source | Discipline scope |
|---|---|
| Unpaywall | ✅ All disciplines (every Crossref DOI: humanities, social sciences, physics, chemistry, economics) |
| Semantic Scholar | ✅ All disciplines (cross-domain academic graph) |
| arXiv | Physics, math, CS, statistics, quant finance, economics, EE |
| PubMed Central | Biomedical only |
| bioRxiv / medRxiv | Biology / medicine preprints only |
| Sci-Hub | ✅ All disciplines (last resort) |

In practice, Unpaywall + Semantic Scholar alone cover OA papers in chemistry, materials, economics, psychology, humanities, and every other field via institutional repositories, SSRN, RePEc, and publisher-hosted OA copies.

Comparison

vs. native agent (no skill)

| Feature | Native agent | This skill |
|---|---|---|
| Resolve DOI to PDF | Ad-hoc web search | Deterministic 7-source chain |
| Title → DOI resolution | Manual | --title (Crossref + S2 fallback, confidence flags) |
| Batch download | | --batch dois.txt or --batch - |
| Consistent filenames | | author_year_journal_title.pdf |
| Machine-readable schema | | fetch.py schema |
| Structured output | | ✅ JSON envelope + NDJSON progress |
| Idempotent retries | | --idempotency-key |
| Typed exit codes | | 0/1/3/4 |
| SSRF + %PDF + size cap | | ✅ enforced |

Prerequisites

  • python3 (3.8+, stdlib only — no pip install needed)

  • (Recommended) An Unpaywall contact email:

    export UNPAYWALL_EMAIL=you@example.com

Without it, Unpaywall is skipped and the remaining 6 sources still work.
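
For context, the Unpaywall REST API takes the DOI in the path and a contact email as a query parameter, which is why the source cannot be queried without one. The sketch below shows how a request URL might be built and how a missing email leads to a skip; `unpaywall_url` is a hypothetical helper, not a function from the skill.

```python
import os
from typing import Optional
from urllib.parse import quote, urlencode

UNPAYWALL_API = "https://api.unpaywall.org/v2/"


def unpaywall_url(doi: str, email: Optional[str] = None) -> Optional[str]:
    """Build the Unpaywall lookup URL, or return None when no contact email is set."""
    if email is None:
        email = os.environ.get("UNPAYWALL_EMAIL")
    if not email:
        return None  # the caller skips Unpaywall and moves to the next source
    # Keep "/" unescaped because DOIs contain it as a separator.
    return f"{UNPAYWALL_API}{quote(doi, safe='/')}?{urlencode({'email': email})}"


url = unpaywall_url("10.1038/s41586-021-03819-2", email="you@example.com")
```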

Installation

# Any agent (Claude Code, Cursor, Copilot, etc.)
npx skills add Agents365-ai/365-skills -g

# Claude Code only
> /plugin marketplace add Agents365-ai/365-skills
> /plugin install paper-fetch

Also published on SkillsMP and ClawHub — each handles updates through its own marketplace.

Usage

Just describe what you want:

> Download the AlphaFold2 paper PDF to ~/papers

> Fetch DOI 10.1038/s41586-020-2649-2

> Batch-download every DOI from dois.txt

> Find a PDF for "Attention Is All You Need" and save it

> Preview the resolved PDF URL for 10.1126/science.abj8754 without downloading

Or call the script directly:

# Single DOI
python skills/paper-fetch/scripts/fetch.py 10.1038/s41586-021-03819-2

# By title (resolved to DOI via Crossref + S2 fallback)
python skills/paper-fetch/scripts/fetch.py --title "Highly accurate protein structure prediction with AlphaFold"

# Dry-run preview (no download)
python skills/paper-fetch/scripts/fetch.py 10.1038/s41586-020-2649-2 --dry-run

# Batch with idempotency
python skills/paper-fetch/scripts/fetch.py --batch dois.txt --out ~/papers \
    --idempotency-key monday-review-batch

# Pipe DOIs from another tool
echo 10.1038/s41586-021-03819-2 | python skills/paper-fetch/scripts/fetch.py --batch -

# Agent discovery
python skills/paper-fetch/scripts/fetch.py schema --pretty

Full flag reference and JSON envelope schema in skills/paper-fetch/SKILL.md.
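
An orchestrator written in Python might wrap the script as shown below, capturing the exit code and parsing stdout as JSON. This is a sketch under assumptions: `run_cli` is a hypothetical helper, it assumes JSON is emitted when stdout is not a TTY, and it assigns no meaning to the individual exit codes or envelope fields, which are documented in SKILL.md. A stand-in command is used so the example runs without the skill installed.

```python
import json
import subprocess
import sys
from typing import Any, List, Optional, Tuple


def run_cli(argv: List[str]) -> Tuple[int, Optional[Any]]:
    """Run a command and return (exit_code, parsed stdout JSON or None)."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    try:
        envelope = json.loads(proc.stdout)
    except json.JSONDecodeError:
        envelope = None  # stdout was empty or not JSON
    return proc.returncode, envelope


# Stand-in command for illustration; a real call would pass something like
# [sys.executable, "skills/paper-fetch/scripts/fetch.py", "<doi>", "--dry-run"].
stand_in = [sys.executable, "-c", "import json; print(json.dumps({'ok': True}))"]
code, envelope = run_cli(stand_in)
```

Returning the exit code alongside the parsed envelope lets the caller route on the code first and inspect the payload only when it needs detail.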

Institutional access (opt-in)

If your institution has a subscription, set PAPER_FETCH_INSTITUTIONAL=1 to enable the publisher-direct fallback. Your IP / cookies / EZproxy authorize the fetch; the skill adds a 1 req/s rate limiter to keep batch jobs within publisher ToS.

export PAPER_FETCH_INSTITUTIONAL=1

See plan/institutional-access.md for design details.
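
A 1 req/s limit can be enforced with a small gate that sleeps until the next request slot opens. The class below is a sketch of that idea, not the skill's limiter: `RateLimiter` and its injectable `clock` and `sleep` parameters are hypothetical and exist mainly to make the logic testable without real delays.

```python
import time
from typing import Callable


class RateLimiter:
    """Allow at most one call per `min_interval` seconds (1.0 gives 1 req/s)."""

    def __init__(
        self,
        min_interval: float = 1.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0

    def wait(self) -> float:
        """Block until the next request is allowed; return the seconds slept."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay:
            self._sleep(delay)
        # Schedule the following slot relative to when this request is released.
        self._next_allowed = max(now, self._next_allowed) + self.min_interval
        return delay


limiter = RateLimiter(min_interval=1.0)
first_delay = limiter.wait()  # the first call goes through immediately
```

In a batch loop, calling `limiter.wait()` before each publisher request spaces the requests at least one second apart.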

Known limitations

  • Some publisher redirects return an HTML landing page; the %PDF header check rejects them
  • No browser automation — no CAPTCHA solving, no Playwright, no stealth
  • SSRF defense rejects private IPs, non-http(s) schemes, non-80/443 ports, cloud metadata hosts
  • 50 MB cap per PDF download
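
The SSRF rules listed above can be sketched as a URL pre-check using only the standard library. This is an illustration under assumptions, not the skill's code: `is_safe_url` is a hypothetical helper, the metadata host list is an example, and "private IP" is approximated here by rejecting any address that `ipaddress` does not classify as global.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}
ALLOWED_PORTS = {80, 443}
# Example cloud metadata hostname; the exact list is an assumption.
METADATA_HOSTS = {"metadata.google.internal"}


def is_safe_url(url: str) -> bool:
    """Return True only for http(s) URLs on port 80/443 that point at public addresses."""
    parts = urlsplit(url)
    host = parts.hostname
    if parts.scheme not in ALLOWED_SCHEMES or not host or host in METADATA_HOSTS:
        return False
    try:
        port = parts.port  # raises ValueError on a malformed port
    except ValueError:
        return False
    if port is None:
        port = 443 if parts.scheme == "https" else 80
    if port not in ALLOWED_PORTS:
        return False
    try:
        addresses = [ipaddress.ip_address(host)]
    except ValueError:
        # Not an IP literal: resolve the name and check every address it maps to.
        try:
            infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
            addresses = [ipaddress.ip_address(i[4][0].split("%")[0]) for i in infos]
        except (OSError, ValueError):
            return False
    # Loopback, private, and link-local ranges (including 169.254.169.254) are not global.
    return bool(addresses) and all(ip.is_global for ip in addresses)


ok = is_safe_url("https://8.8.8.8/paper.pdf")
```

A pre-check like this does not by itself cover redirects or DNS changes between the check and the fetch, so each redirect hop would need the same check.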

Part of the Agents365-ai research-skill family — pick the right tool for the job:

| Skill | Niche | When to use |
|---|---|---|
| semanticscholar-skill | Semantic Scholar API search | When you need to FIND papers before fetching |
| asta-skill | Same corpus via Ai2 Asta MCP | When your host supports MCP and you have an Asta API key |
| scholar-deep-research | 8-phase literature review pipeline | When you want a structured cited report, not just PDFs |
| zotero-research-assistant | Zotero library workflows | When references go into Zotero |

💬 Community

WeChat Community Group

❤️ Support

If this skill helps you, consider supporting the author:

WeChat Pay
Alipay
Buy Me a Coffee
Give a Reward

👤 Author

Agents365-ai

📄 License

MIT