Re: Needed tool for vision-impaired - was [Re: PDF Editor for Debian]
I wouldn't say PDFs are bad for visually impaired users. In fact, as bitmap fonts are thankfully a thing of the past almost everywhere, you can zoom any document to your heart's desire. Though sometimes you need some tricks, e.g. Evince is configured to use only 50 MB for caching by default, vastly limiting zoom capabilities. So you'll have to dig into dconf to change that.

What you are looking for is ways to reflow text, but as a fixed-layout format, PDFs are just not meant for that. Not even the PDF/UA standard [1] requires this; it only lays the ground rules for screen readers.

Supposedly the Swiss-made "VIP PDF-Reader" was able to help, yet it seems to have been abandoned, as there don't seem to be any download options anymore. Other than that, PDF readers with that capability are very rare on any platform. I have no idea whether anybody besides Adobe is doing that, as PDF is such a terribly complicated format.

In theory, this should all be doable with Tesseract, as it already does the OCR part. It's just that nobody has bothered yet to support such use cases and an output format that can handle more than just text.

Best
Richard

[1]: https://en.wikipedia.org/wiki/PDF/UA
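As a rough illustration of the Evince cache tweak mentioned above, here is a minimal Python sketch that builds and optionally runs the corresponding `gsettings` command. The schema `org.gnome.Evince` and the key `page-cache-size` are assumptions on my part and are not named in the message; check them in dconf-editor before relying on this. The function names are made up for the example.

```python
import shutil
import subprocess

# Assumption: Evince stores its page cache limit (in megabytes) under this
# schema and key. Verify with dconf-editor or `gsettings list-keys` first.
SCHEMA = "org.gnome.Evince"
KEY = "page-cache-size"


def build_set_command(size_mb):
    """Return the gsettings command line that would set the cache limit."""
    if size_mb <= 0:
        raise ValueError("cache size must be a positive number of megabytes")
    return ["gsettings", "set", SCHEMA, KEY, str(size_mb)]


def set_cache_size(size_mb):
    """Run the command if gsettings is installed; return True on success."""
    if shutil.which("gsettings") is None:
        return False  # gsettings is not available on this system
    result = subprocess.run(build_set_command(size_mb), check=False)
    return result.returncode == 0
```

Calling `set_cache_size(400)` would then try to raise the limit to 400 MB; the value is only an example, not a recommendation.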
Re: Needed tool for vision-impaired - was [Re: PDF Editor for Debian]
Karen Lewellen (12024-06-24):
> Good afternoon.
> I am providing another option that might help here.
> robobraille,
>
> www.robobraille.org
> Provides services, free of charge, that will convert pdf files to a number
> of different formats, including .html
> They provide audio, mobi, and convert epub files too..but I digress.
> As a test, consider sending your file to
> convert at robobraille.org
> correctly of course.
> in the subjectline put html
> leaving the body blank, and attach the file.
> See if the .html file returned meets your needs.

Interesting. Do you know how they fare with math? I mean real, non-trivial formulas produced by LaTeX, like you would find in https://arxiv.org/abs/1803.05929 ?

(I know, I could test. I will if you do not know the answer.)

Regards,

-- 
Nicolas George
Re: Needed tool for vision-impaired - was [Re: PDF Editor for Debian]
Good afternoon.
I am providing another option that might help here: robobraille,

www.robobraille.org

It provides services, free of charge, that will convert PDF files to a number of different formats, including .html.
They provide audio, mobi, and convert epub files too... but I digress.
As a test, consider sending your file to
convert at robobraille.org
(written correctly, of course).
In the subject line put html, leave the body blank, and attach the file.
See if the .html file returned meets your needs.
Best,
Karen

On Mon, 24 Jun 2024, Richard Owlett wrote:

> On 06/24/2024 12:35 AM, Richard wrote:
> > Hello,
> > this very much depends on what you are expecting it to do. In general, PDFs are only meant to be viewed - and printed - they were never meant for anything else. ...
>
> Second sentence should read:
> ... only meant to be viewed by those with *NORMAL* vision ...
>
> I'm attempting to read a USDA document.[1]
> The printed version of this document is marginally readable.
> Tools such as "Atril Document Viewer" provide selected magnification. For this particular document and monitor, 150% is comfortable. Requires re-positioning the viewpoint 500 to 600 times to read document.
>
> For _this_ document, Atril can select all the text on a page in a manner that can be pasted in a "reasonable" manner to a Pluma document.
>
> It will:
>   a. ignore actual graphics.
>   b. put title/headings/??? on a separate line.
>   c. all text between full page-width title/headings/??? will be treated as a logical unit.
>
> It will not:
>   1. put a blank line between paragraphs.
>   2. put a blank line above/below lines containing title/headings/???.
>   3. identify superscripts in some manner.
>
> All this suggests that it should be able to extract text from a PDF and create a HTML document likely using only , , , and in its .
>
> [1] https://fns-prod.azureedge.us/sites/default/files/resource-files/TFP2021.pdf
>     _Thrifty Food Plan, 2021_
>     Food and Nutrition Service
>     August 2021
>     FNS-916
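For anyone who would rather script the robobraille submission described in this message, the following is a minimal Python sketch that only builds the email (format name in the subject, no text body, file attached). It does not send anything. The function name and its parameters are hypothetical, and the sender and recipient addresses have to be supplied by the caller.

```python
from email.message import EmailMessage


def build_conversion_request(sender, recipient, filename, pdf_bytes,
                             target_format="html"):
    """Build a message with the target format as subject and the PDF attached."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    # The requested output format goes into the subject line.
    msg["Subject"] = target_format
    # No text body is set: the message carries only the attachment.
    msg.add_attachment(pdf_bytes, maintype="application", subtype="pdf",
                       filename=filename)
    return msg
```

The resulting message could be handed to `smtplib.SMTP.send_message()` on a machine with a configured mail server. Whether the service accepts scripted submissions is not something this thread confirms.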
Needed tool for vision-impaired - was [Re: PDF Editor for Debian]
On 06/24/2024 12:35 AM, Richard wrote:
> Hello,
> this very much depends on what you are expecting it to do. In general, PDFs are only meant to be viewed - and printed - they were never meant for anything else. ...

Second sentence should read:
... only meant to be viewed by those with *NORMAL* vision ...

I'm attempting to read a USDA document.[1]
The printed version of this document is marginally readable.
Tools such as "Atril Document Viewer" provide selected magnification. For this particular document and monitor, 150% is comfortable. Reading the document then requires re-positioning the viewpoint 500 to 600 times.

For _this_ document, Atril can select all the text on a page in a manner that can be pasted in a "reasonable" manner to a Pluma document.

It will:
  a. ignore actual graphics.
  b. put title/headings/??? on a separate line.
  c. treat all text between full page-width title/headings/??? as a logical unit.

It will not:
  1. put a blank line between paragraphs.
  2. put a blank line above/below lines containing title/headings/???.
  3. identify superscripts in some manner.

All this suggests that a tool should be able to extract text from a PDF and create an HTML document likely using only , , , and in its .

[1] https://fns-prod.azureedge.us/sites/default/files/resource-files/TFP2021.pdf
    _Thrifty Food Plan, 2021_
    Food and Nutrition Service
    August 2021
    FNS-916
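To make the idea above a little more concrete, here is a minimal Python sketch that turns text pasted from a viewer into a bare-bones HTML page. It is only a sketch under assumptions: the heading test (a short line without closing punctuation) and the choice of `<h2>` and `<p>` tags are my own guesses, not something stated in the message. Since the pasted text has no blank lines between paragraphs, everything between two headings ends up in one paragraph, and superscripts are not detected.

```python
import html

# Hypothetical heuristic: lines at most this long that do not end in
# punctuation are treated as titles/headings.
MAX_HEADING_LENGTH = 60
CLOSING_PUNCTUATION = (".", ",", ";", ":", "?", "!")


def looks_like_heading(line):
    """Guess whether a stripped, non-empty line is a heading."""
    return (len(line) <= MAX_HEADING_LENGTH
            and not line.endswith(CLOSING_PUNCTUATION))


def text_to_html(text, title="Extracted text"):
    """Convert pasted page text into a small HTML document."""
    body = []
    paragraph = []

    def flush():
        # Join the wrapped lines collected so far into one reflowable paragraph.
        if paragraph:
            body.append("<p>" + html.escape(" ".join(paragraph)) + "</p>")
            paragraph.clear()

    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            flush()
        elif looks_like_heading(line):
            flush()
            body.append("<h2>" + html.escape(line) + "</h2>")
        else:
            paragraph.append(line)
    flush()

    return ('<!DOCTYPE html>\n<html><head><meta charset="utf-8"><title>'
            + html.escape(title) + "</title></head>\n<body>\n"
            + "\n".join(body) + "\n</body></html>\n")
```

The output can be opened in any browser, where zooming reflows the text to the window width. A short wrapped line in the middle of a paragraph will be mistaken for a heading, so the threshold would need tuning per document.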