Re: [PLUG] alternatives for creating scaled, searchable pdf from plain text?

Ted Mittelstaedt Thu, 10 Jul 2025 07:27:47 -0700

Hmm, that doesn't look quite right.

Check this page out:


https://tech.surveypoint.com/posts/printing-text-files-to-pdf-with-enscript/

enscript has a whole lot of options you might need to apply.  Also, my 
understanding is that its output is PostScript, not PDF, and although PDF is 
closely related to PostScript, usually you post-process the output with a 
program like ps2pdf.
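
A minimal sketch of that post-processing step, assuming enscript and Ghostscript's ps2pdf are both installed. The names text_to_pdf and file_kind are made up for illustration and are not part of either tool. The second helper only looks at the first four bytes of a file, which is enough to tell a real PDF from PostScript saved under a .pdf name:

```shell
# Hypothetical helper: read plain text on stdin, write a PDF to "$1".
# "enscript -o -" sends PostScript to stdout; "ps2pdf -" reads it from stdin.
text_to_pdf() {
    enscript -o - | ps2pdf - "$1"
}

# Hypothetical helper: classify a file by its leading magic bytes.
file_kind() {
    case "$(head -c 4 "$1")" in
        '%PDF') echo pdf ;;
        '%!PS') echo postscript ;;
        *)      echo unknown ;;
    esac
}
```

Usage would be something like "sdiff -w100 old new | text_to_pdf sdiff.pdf", followed by "file_kind sdiff.pdf" to see what actually landed on disk.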

However, as for searchability, note the following:

"Be aware that while these approaches will produce PostScript which renders 
correctly, and then can be used to create a PDF file which displays correctly, 
it will not be possible to copy/search the resulting PDF file.

In order to search a PDF file the font must have an associated ToUnicode CMap, 
this is a PDF-only construct, it does not exist in PostScript and there is no 
PostScript equivalent. So there's no way to embed that information in the 
PostScript program, which means it can't be embedded in the PDF file."

This is from a post here:

https://stackoverflow.com/questions/57447046/how-to-convert-txt-to-pdf-with-utf-8

They were talking about mucking about with Unicode, but the same thing applies 
to what you are doing: you're converting text to PostScript, PostScript lacks 
what's needed to make a PDF searchable, and thus even running a 
PostScript-to-PDF conversion tool in post-processing isn't going to produce a 
searchable PDF.
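
One way to check whether a given PDF really is searchable, sketched on the assumption that pdftotext from poppler-utils is installed (pdf_has_text is a made-up name): pull out whatever text the PDF exposes and grep it for a string you know is on the page.

```shell
# Hypothetical helper: succeed if the PDF's extractable text contains "$2".
# Assumes pdftotext (poppler-utils); "-" sends the extracted text to stdout.
pdf_has_text() {
    pdftotext "$1" - | grep -F -q -- "$2"
}
```

For example, "pdf_has_text sdiff.pdf 'some string' && echo searchable". If that fails for text you can plainly see in a viewer, the file has the problem described above.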

Probably, enscript isn't the right program for doing this in the first place.  
You might try loading the text into OpenOffice then save it out as a pdf.

However, I will quickly state it's been years since I've mucked about with PDFs 
and those conversion tools.  PDF was touted by Adobe as the be-all and end-all 
for documents, but after using it for a while I realized how incredibly 
proprietary and difficult to work with it is.  The PDF format is, in fact, 
designed to make documents so complex to work with that it takes hundreds of 
hours of programming time to actually do anything with them, short of the 
extremely crude kind of PDFs you have been generating, or the crude ones that 
people generate using Microsoft's default PDF printer or simple tools like 
that.  Adobe did it this way so they could continue selling expensive software 
that works with PDFs to organizations full of dumb users who think a PDF 
document is just like a Word document and end up pushing for Adobe's commercial 
Acrobat program when they discover, as you have, that the basic free tools that 
work with PDFs don't cut the mustard.

Nowadays I value text highly, and if I am lucky enough to get text output from 
something I keep it text, and if it's ASCII text then I'm in hog heaven.

Text is easily searchable by every tool out there, and easy to load into the 
"vi" editor and search, which gives you the surrounding context that is often 
critical in a search anyway, and only very simple formatting is needed to make 
the meaning clear.  It is also very easy on modern monitors to create terminal 
sessions using a font like Cascadia Monospace 8pt that will allow you to stretch 
out the window to over 250 columns and still remain very readable, which makes 
it extremely easy to deal with wide output text documents.
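
The same kind of context search works from the command line without opening an editor. A small sketch, where search_ctx is a made-up name and two lines of context is an arbitrary choice:

```shell
# Hypothetical helper: show each match for the fixed string "$1" in file "$2",
# with line numbers and two lines of surrounding context.
search_ctx() {
    grep -n -C 2 -F -- "$1" "$2"
}
```

For example, "sdiff -w250 old new > sdiff.txt" and then "search_ctx 'some string' sdiff.txt".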

Narrow-width 8.5x11 and 80-column terminal output is dinosaur technology; it's 
for printing on dead trees, and PDF was designed for that model.

Ted

-----Original Message-----
From: PLUG <[email protected]> On Behalf Of Galen Seitz
Sent: Wednesday, July 9, 2025 11:01 AM
To: Portland Linux/Unix Group <[email protected]>
Subject: [PLUG] alternatives for creating scaled, searchable pdf from plain 
text?

Hi,

I often create a pdf of sdiff output using a command like this:

sdiff -w100 old new | enscript -o sdiff.pdf

This works okay, but the resulting pdf is not searchable.  Can someone suggest 
an alternative that will properly scale the wide output (-w100) to a letter 
size pdf, yet retain the ability to search for text strings?

thanks,
galen
--
Galen Seitz
[email protected]
