[ https://issues.apache.org/jira/browse/SOLR-10934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hoss Man updated SOLR-10934:
----------------------------
    Attachment: SOLR-10934.patch

bq. What we might want to consider is refactoring our build.xml, so that the same <asciidoctor:convert/> task options used to generate the PDF could also be used to generate a bare-bones version of the html-site -- ie: not using jekyll, just using raw asciidoctor with the "html5" output option. Then we could (in theory) run the same HTML link checking code we currently have against that output dir -- just for the purpose of checking the links, not with any plan to ever publish it.

I'm attaching a patch that takes this approach -- i think it works pretty well.

Unfortunately, refactoring just the build.xml file proved to be insufficient to re-use the existing {{<asciidoctor:convert>}} in a macro because of how the underlying Task class works -- it makes hard assumptions that XML element attributes like "sourceDocumentName" are not used at all, even when they are only set to the empty string as a result of ant property expansion -- but i was able to deal with that by adding our own little AntTask subclass into the tools jar.

i also did a little more refactoring of the build.xml file so that building both the PDF & jekyll site via {{ant}} wouldn't waste time redundantly building & validating the bare-bones HTML version. (unfortunately, if you explicitly run {{ant build-pdf build-site}} this still happens, but hey: baby steps)

like the previous patch, this includes some "nocommit" annotated intentional anchor/link errors in the {{*.adoc}} files. If you apply the patch as is, and run {{ant}} or {{ant build-pdf}} or {{ant build-site}}, you'll get all the same validation errors that we want to see happen with this kind of bad content. If you revert the {{solr/solr-ref-guide/src}} changes then everything will start building happily.

what do folks think of this approach?
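For readers unfamiliar with the kind of check being discussed, here is a minimal sketch of a link+anchor validation pass over generated HTML. It is not the real CheckLinksAndAnchors.java (which uses JSoup); the class name {{LinkCheckSketch}}, the method {{findBrokenLinks}}, the in-memory map of pages, and the regex-based parsing are all assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch only; NOT the real CheckLinksAndAnchors.java. */
public class LinkCheckSketch {
  // Naive patterns: adequate only for simple, well-formed generated HTML.
  private static final Pattern ID = Pattern.compile("\\bid=\"([^\"]+)\"");
  private static final Pattern HREF = Pattern.compile("\\bhref=\"([^\"]+)\"");

  /** pages maps a file name to its HTML; returns one message per broken link. */
  public static List<String> findBrokenLinks(Map<String, String> pages) {
    // Pass 1: record every anchor in every page as "file#id".
    Set<String> anchors = new HashSet<>();
    for (Map.Entry<String, String> page : pages.entrySet()) {
      Matcher m = ID.matcher(page.getValue());
      while (m.find()) {
        anchors.add(page.getKey() + "#" + m.group(1));
      }
    }
    // Pass 2: resolve each relative link against the known files and anchors.
    List<String> problems = new ArrayList<>();
    for (Map.Entry<String, String> page : pages.entrySet()) {
      Matcher m = HREF.matcher(page.getValue());
      while (m.find()) {
        String href = m.group(1);
        if (href.contains("://") || href.startsWith("mailto:")) {
          continue; // external links are out of scope for this sketch
        }
        int hash = href.indexOf('#');
        String file = hash < 0 ? href : href.substring(0, hash);
        if (file.isEmpty()) {
          file = page.getKey(); // "#foo" points into the current page
        }
        String frag = hash < 0 ? "" : href.substring(hash + 1);
        boolean ok = pages.containsKey(file)
            && (frag.isEmpty() || anchors.contains(file + "#" + frag));
        if (!ok) {
          problems.add(page.getKey() + ": bad link " + href);
        }
      }
    }
    return problems;
  }

  public static void main(String[] args) {
    Map<String, String> pages = new LinkedHashMap<>();
    pages.put("a.html", "<h2 id=\"intro\">Intro</h2>"
        + "<a href=\"b.html#setup\">ok</a><a href=\"#missing\">bad</a>");
    pages.put("b.html", "<h2 id=\"setup\">Setup</h2><a href=\"a.html#intro\">ok</a>");
    for (String problem : findBrokenLinks(pages)) {
      System.out.println(problem);
    }
  }
}
```

Running {{main}} reports only the intentionally broken {{#missing}} link in {{a.html}}. A real checker would presumably read the files from the output dir and use a proper HTML parser rather than regexes.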
> create a link+anchor checker for the ref-guide PDF using PDFBox
> ---------------------------------------------------------------
>
>                 Key: SOLR-10934
>                 URL: https://issues.apache.org/jira/browse/SOLR-10934
>             Project: Solr
>          Issue Type: Sub-task
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: documentation
>            Reporter: Hoss Man
>            Priority: Major
>         Attachments: SOLR-10934.patch, SOLR-10934.patch
>
>
> We currently have CheckLinksAndAnchors.java which is automatically run
> against the ref-guide HTML as part of the build to use JSoup to find bad
> links/anchors that asciidoctor doesn't complain about -- but not everyone
> does/can build the HTML version of the ref-guide since it requires
> manually installing jekyll.
> The PDF build only requires things installed by ivy (via JRuby) and we
> already have some PDFBox based code in ReducePDFSize.java that operates on
> this PDF every time it's run -- so if we can find a way to do similar checks
> using the PDFBox API we could catch these broken links faster.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)