mbeckerle commented on code in PR #193:
URL: https://github.com/apache/daffodil-site/pull/193#discussion_r2469107921

##########
site/_pandoc/README.md:
##########
@@ -0,0 +1,223 @@
+---
+layout: page
+title: Pandoc + Jekyll Integration
+pdf: false
+---
+# 🧭 Pandoc + Jekyll Integration
+
+This directory contains tools for generating **PDF versions** of selected Jekyll pages while keeping the same Markdown files usable by Jekyll for the website.
+
+The goal is to have **one Markdown source** that:
+- renders cleanly in the Jekyll site (for HTML),
+- and can also be converted into a polished PDF using **Pandoc + LaTeX**.
+
+---
+
+## 🏗️ Directory Layout
+
+```
+_pandoc/
+│
+├── README.md            ← this file
+├── Makefile             ← builds all PDFs
+├── unwrap-pandoc.awk    ← preprocessor that removes comment wrappers
+├── template.latex       ← (optional) custom LaTeX template
+├── header.tex           ← (optional) extra LaTeX header content
+└── ../pdf/              ← generated PDFs appear here
+```
+
+At the root of the Jekyll site:
+
+```
+_config.yml
+_posts/
+pages/
+assets/
+_pandoc/
+pdf/
+```
+
+---
+
+## 🧩 How It Works
+
+### 1. Mark pages that should have PDFs
+
+Any Markdown file (in `_posts`, `pages/`, or elsewhere) can be tagged with:
+
+```yaml
+---
+title: Example Page
+layout: page
+pdf: true
+---
+```
+
+The Makefile will scan the entire Jekyll project and automatically detect these files.
+
+---
+
+### 2. Use HTML comment wrappers for Pandoc-only content
+
+Pandoc sometimes needs LaTeX code for things like custom tables, math, or page layout.
+We hide that LaTeX from Jekyll using **HTML comments**, which Jekyll ignores but our AWK preprocessor removes before running Pandoc.

Review Comment:
This was needed in the relnotes for some DFDL Schema project that I cloned this approach from. Even just doing the table of revisions at the start of the document required more than just markdown tables because there is a notes column that is a paragraph.

This is needed if you want to create almost any sort of rich table like the ones in the DFDL specification, where a table cell contains rich structure such as bulleted or numbered lists. Markdown is really quite restrictive about tables. Even putting a basic paragraph into a table cell is problematic. I think that is because a complex table cell is neither readable as ".md" text nor easy to write, so this kind of rich nesting goes against the Markdown philosophy. Jekyll lets you use HTML for tables, but pandoc does not: raw HTML in the Markdown is dropped when the output is LaTeX/PDF.

This is actually the primary issue that is making me question this whole approach. If I can't make a PDF that is stylistically similar to the DFDL spec, then I'm not sure what the point is. We could easily package the whole daffodil site so that offline users can run a local copy of the whole thing. In fact, a git clone of daffodil-site followed by the podman command will do this (though there may be a simpler way, using wget for example).
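
For anyone following along, here is a minimal sketch of the comment-wrapper idea, written in Python rather than AWK so it is easy to try. The `<!--pandoc` and `-->` marker lines are an assumption made for illustration only; the markers that `unwrap-pandoc.awk` actually recognizes may differ. The sample wraps a LaTeX `longtable` with a paragraph (`p{...}`) column, which is the kind of revisions table with a notes column described above.

```python
# Hypothetical wrapper markers (assumed for this sketch; the real
# unwrap-pandoc.awk may use different ones).
OPEN = "<!--pandoc"
CLOSE = "-->"


def unwrap(markdown: str) -> str:
    """Drop the wrapper marker lines and keep the LaTeX between them.

    Jekyll sees the wrapped block as an HTML comment and renders nothing;
    after unwrapping, Pandoc sees plain LaTeX embedded in the Markdown.
    """
    out = []
    inside = False
    for line in markdown.splitlines():
        stripped = line.strip()
        if not inside and stripped == OPEN:
            inside = True   # skip the opening marker line
            continue
        if inside and stripped == CLOSE:
            inside = False  # skip the closing marker line
            continue
        out.append(line)    # everything else passes through unchanged
    return "\n".join(out)


SAMPLE = r"""Intro text.

<!--pandoc
\begin{longtable}{l p{10cm}}
Rev & Notes \\
1.0 & First release. A paragraph of notes that wraps inside the cell. \\
\end{longtable}
-->

Closing text.
"""

if __name__ == "__main__":
    print(unwrap(SAMPLE))
```

The sketch only unwraps Pandoc-only content. It does not hide a Jekyll-only HTML table from Pandoc, so a page would still need two versions of each rich table, which is part of the cost being questioned here.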
