Convert DOCX to PDF with an AI agent
Turn a Word .docx into a PDF in two steps — safe-docx exports semantic HTML, then Chrome, LibreOffice, or Pandoc renders the PDF. All local, no upload.
Most DOCX-to-PDF routes mean uploading the document to a converter site or installing a
dedicated library. If the document is a contract, the upload is the problem. safe-docx does not
ship a PDF renderer — pagination and font metrics are a layout engine's job, not a document
editor's — but its export tool emits semantic HTML, and every machine already has
something that turns HTML into PDF. The recipe: safe-docx exports the HTML in your agent
session, then Chrome, LibreOffice, or Pandoc renders the PDF. Two commands, nothing leaves
your machine.
Choosing the PDF renderer for step two
| Consideration | Chrome / Chromium (headless) | LibreOffice (headless) | Pandoc |
|---|---|---|---|
| Already on the machine | Almost always | Sometimes | Rarely (plus a PDF engine) |
| HTML/CSS rendering quality | Best (browser engine) | Good (own HTML import) | Depends on PDF engine |
| Page setup control | Print CSS (@page) | Export filter defaults | Engine variables |
| One-line command | Yes | Yes | Yes |
| Best for | One-off conversions on a laptop | Office-style output, batch jobs | Pipelines that already use Pandoc |
The PDF renders safe-docx's semantic HTML — structural, not a pixel-exact clone of the Word layout.
For a print-exact reproduction, export the PDF from Word or LibreOffice using the original .docx.
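The "Already on the machine" row can be checked from the shell before picking a route. What follows is a minimal sketch, assuming a POSIX shell and the usual executable names (`google-chrome`, `chromium`, `soffice`, `pandoc`). `pick_renderer` is a hypothetical helper, not part of safe-docx, and the `CHROME` variable is assumed to hold a full path when Chrome is not on `PATH`, as is typical on macOS.

```shell
# Hypothetical helper (not part of safe-docx): print which renderer is available.
# Assumes the usual executable names; set CHROME to a full path if Chrome is not on PATH.
pick_renderer() {
  if [ -x "${CHROME:-}" ] \
     || command -v google-chrome >/dev/null 2>&1 \
     || command -v chromium >/dev/null 2>&1; then
    echo chrome
  elif command -v soffice >/dev/null 2>&1; then
    echo libreoffice
  elif command -v pandoc >/dev/null 2>&1; then
    echo pandoc
  else
    echo none
    return 1
  fi
}

# Usage: renderer=$(pick_renderer) || echo "install Chrome, LibreOffice, or Pandoc" >&2
```

The order mirrors the table: Chrome first because it is the most likely to be present, Pandoc last because it also needs a PDF engine. Reorder the branches if office-style output matters more to you.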
The workflow, step by step
1. Install safe-docx for your agent
Add the MIT-licensed safe-docx MCP server to your agent once. For Claude Code:

    claude mcp add safe-docx -- npx -y @usejunior/safe-docx

The same server works with Gemini CLI, Cursor, and Codex.
2. Export the .docx to semantic HTML
Ask in plain language: “Export bonterms-mutual-nda.docx to HTML.” The agent calls the safe-docx export tool with the HTML format:

    export(file_path="bonterms-mutual-nda.docx", format="html")

The tool writes bonterms-mutual-nda.html next to the source: paragraphs, headings, nested lists, tables with merged cells, images, and footnotes, as structural HTML.
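Before handing off to the renderer, it can help to confirm that the export actually produced a file. This is a minimal sketch that assumes only what the step states: the HTML is written next to the source with the same base name. `check_export` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if the exported HTML exists and is non-empty.
# Takes the path of the source .docx and prints the path of the HTML on success.
check_export() {
  html="${1%.docx}.html"   # same directory and base name as the source
  if [ -s "$html" ]; then
    echo "$html"
  else
    echo "missing or empty: $html" >&2
    return 1
  fi
}

# Usage: html=$(check_export bonterms-mutual-nda.docx) || exit 1
```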
3. Render the HTML to PDF with a tool you already have
Any HTML-to-PDF renderer finishes the job. If Chrome or Chromium is on the machine (it almost always is), the agent finds the executable and runs one command:

    CHROME="/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"  # macOS
    "$CHROME" --headless --no-pdf-header-footer \
      --print-to-pdf=bonterms-mutual-nda.pdf bonterms-mutual-nda.html

On Linux the binary is usually google-chrome or chromium; set CHROME accordingly.
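Finding the executable can be scripted rather than set by hand. The sketch below tries the macOS path and the two Linux names mentioned in this step, in that order. `find_chrome` is a hypothetical helper, and other install locations (Windows, Snap, Flatpak) are not covered.

```shell
# Hypothetical helper: print the path of the first Chrome/Chromium binary found.
# Candidates are the macOS app path and the usual Linux names from this step.
find_chrome() {
  for candidate in \
    "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" \
    google-chrome chromium
  do
    case "$candidate" in
      /*) [ -x "$candidate" ] && { echo "$candidate"; return 0; } ;;  # absolute path
      *)  command -v "$candidate" 2>/dev/null && return 0 ;;          # look up on PATH
    esac
  done
  echo "no Chrome or Chromium found" >&2
  return 1
}

# Usage, with the same flags as the command above:
#   CHROME=$(find_chrome) && "$CHROME" --headless --no-pdf-header-footer \
#     --print-to-pdf=bonterms-mutual-nda.pdf bonterms-mutual-nda.html
```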
4. Or use Pandoc, or convert directly with LibreOffice
If you already script with Pandoc, point it at the exported HTML. Pandoc needs a PDF engine on the machine: a LaTeX install by default, or pass --pdf-engine=weasyprint:

    pandoc bonterms-mutual-nda.html -o bonterms-mutual-nda.pdf

And if LibreOffice is installed, you can skip the HTML step entirely: it converts the original Word file straight to PDF, with the closest layout fidelity. The PDF lands in the directory you run it from (pass --outdir to choose another):

    soffice --headless --convert-to pdf bonterms-mutual-nda.docx
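For batch jobs, the LibreOffice command extends naturally to a whole directory. This is a minimal sketch using the same `--headless`, `--convert-to pdf`, and `--outdir` flags. `batch_to_pdf` and the `SOFFICE` override are hypothetical names introduced here, on the assumption that `soffice` is on `PATH` unless `SOFFICE` says otherwise.

```shell
# Hypothetical helper: convert every .docx in a directory to PDF with LibreOffice.
# Set SOFFICE to a full path if the soffice binary is not on PATH.
batch_to_pdf() {
  src_dir=$1
  out_dir=$2
  mkdir -p "$out_dir" || return 1
  for docx in "$src_dir"/*.docx; do
    [ -e "$docx" ] || continue   # no matches: the glob pattern stays literal
    "${SOFFICE:-soffice}" --headless --convert-to pdf \
      --outdir "$out_dir" "$docx" || return 1
  done
}

# Usage: batch_to_pdf ./contracts ./contracts/pdf
```

The loop stops at the first failed conversion so that a broken file is noticed rather than silently skipped; drop the `|| return 1` on the `soffice` line if you would rather continue.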
5. Tune the page setup with print CSS (optional)
The exported HTML is deliberately unstyled, so the renderer's defaults decide margins and page size. For Chrome, ask the agent to add a small print stylesheet to the HTML head before rendering:

    <style>
      @page { size: Letter; margin: 1in; }
      body { font-family: Georgia, serif; line-height: 1.5; max-width: 42em; }
    </style>

That one block controls page size, margins, and typography for the whole document.
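If you would rather script the insertion than ask the agent, something like the following could work. It is a sketch under one loud assumption: the exported file contains a literal lowercase `</head>` tag, which this guide implies but does not show. `add_print_css` is a hypothetical helper, and the CSS is the stylesheet above collapsed onto one line.

```shell
# Hypothetical helper: insert the print stylesheet just before the first </head>.
# Assumes the exported HTML contains a literal lowercase </head> tag.
add_print_css() {
  html=$1
  css='<style>@page { size: Letter; margin: 1in; } body { font-family: Georgia, serif; line-height: 1.5; max-width: 42em; }</style>'
  grep -q '</head>' "$html" || { echo "no </head> in $html" >&2; return 1; }
  # Rewrite into a temporary file, then replace the original only on success.
  awk -v css="$css" '
    !seen && index($0, "</head>") { sub("</head>", css "</head>"); seen = 1 }
    { print }
  ' "$html" > "$html.tmp" && mv "$html.tmp" "$html"
}

# Usage: add_print_css bonterms-mutual-nda.html   # run before the Chrome command in step 3
```

Running it twice adds the block twice, so apply it to a fresh export each time.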
Related guides
Convert DOCX to Markdown
Turn an existing Word .docx into clean Markdown with an AI agent, using the safe-docx export tool — headings, lists, links, tables, and footnotes preserved.
Read the guide →
Convert DOCX to HTML
Render a Word .docx to semantic HTML (paragraphs, headings, lists, tables, images) with the safe-docx export tool — ready for previews, web rendering, and content extraction.
Read the guide →
Frequently asked questions
Can safe-docx export a PDF directly?
No, and that is a deliberate scope decision. A PDF renderer needs a full layout engine — line breaking, pagination, font metrics, headers and footers — which is a different problem from reading and editing DOCX. The supported path is the two-step recipe: export semantic HTML with safe-docx, then render the PDF with Chrome, LibreOffice, or Pandoc.
Will the PDF look exactly like the original Word document?
No. safe-docx exports the semantic tier — structural HTML that mirrors the document outline, not a pixel-faithful clone — so the PDF reflects that structure with the renderer's typography. If you need a print-exact reproduction of the original layout, open the .docx in Word or LibreOffice and export the PDF from there.
Which PDF renderer should I pick?
Chrome or Chromium if it is already on the machine — it renders modern HTML and CSS best and supports print CSS for page setup. LibreOffice headless gives office-style output and also converts the original .docx directly when layout fidelity matters more than the HTML route. Pandoc fits pipelines that already use it, but needs a separate PDF engine.
Can the agent run the whole conversion by itself?
Yes. Ask for the outcome — convert this .docx to PDF — and the agent chains both steps: the safe-docx export tool call, then the renderer command in the shell. Everything runs locally, so the document never leaves your machine.
Convert and edit DOCX from your agent
Install safe-docx, point your agent at a Word file, and export the HTML your PDF renderer needs — or edit the document in place.