Re: PDF Editor for Debian

2024-06-26 Thread Richard
qpdf is good for things like removing password protection, provided you know
the password. But I doubt that's what is meant by "editor". And quite
frankly, you can do most of what qpdf does more comfortably with tools like
PDFSam or PDF Arranger. The latter even lets you crop pages or change the
document title (stored inside the PDF). If you want a reason to go CLI,
that's definitely Ghostscript. It can compress (losslessly), decompress,
resize images and pages, make a file conform to various PDF standards (to my
knowledge it is pretty much the only free piece of software that writes PDF
2.0 compatible files), merge files, embed fonts or font subsets or convert
them to outlines, convert to images... and that's far from a complete list.
Of course it's quite complex, but there are many pages out there that
explain how to achieve each of these. I doubt there's a single program as
capable as Ghostscript, with the possible exception of Acrobat Pro.
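
As a rough illustration of the merge case, a Ghostscript call could be
assembled like this. It is only a sketch in Python under assumptions: the
helper name and the file names are made up, and the code merely builds the
command line instead of running gs. The switches themselves (-dBATCH,
-dNOPAUSE, -sDEVICE=pdfwrite, -sOutputFile, -dPDFSETTINGS) are ordinary
Ghostscript options.

```python
import shlex


def gs_merge_command(inputs, output, pdf_settings=None):
    """Build (but do not run) a Ghostscript command that merges PDFs.

    The function name and parameters are hypothetical, for illustration only.
    """
    if not inputs:
        raise ValueError("need at least one input file")
    cmd = [
        "gs",
        "-dBATCH",            # exit when done instead of waiting at the prompt
        "-dNOPAUSE",          # do not pause between pages
        "-sDEVICE=pdfwrite",  # produce PDF output
    ]
    if pdf_settings is not None:
        # Presets such as /screen or /ebook downsample images, so they are lossy.
        cmd.append(f"-dPDFSETTINGS={pdf_settings}")
    cmd.append(f"-sOutputFile={output}")
    cmd.extend(inputs)
    return cmd


# File names are placeholders.
print(shlex.join(gs_merge_command(["1.pdf", "2.pdf"], "merged.pdf")))
```

To actually run it, the list could be passed to subprocess.run(cmd, check=True).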

Richard

On Wed, 26 June 2024 at 21:48, Franco Martelli <
martelli...@gmail.com> wrote:

> On 24/06/24 at 00:50, Arbol One wrote:
> > Hello.
> > Is there a PDF editor that would work with Debian 12?
> >
>
> Some time ago I used qpdf to delete some pages from a .pdf. For a quick
> description:
>
> ~$ apt show qpdf
>
> The manual contains some command examples. I used these commands to
> edit a PDF:
>
> - To delete the last two pages of a pdf:
>
> ~$ qpdf 1.pdf --pages . 1-r3 -- test.pdf
>
> - To merge two .pdf files:
>
> ~$ qpdf --empty --pages 1.pdf 2.pdf -- test.pdf
>
> If you are interested in qpdf, once it is installed, read the
> /usr/share/doc/qpdf/README-doc.txt file for a list of URLs where you can
> find documentation.
>
> Cheers,
> --
> Franco Martelli
>
>


Re: PDF Editor for Debian

2024-06-26 Thread Franco Martelli

On 24/06/24 at 00:50, Arbol One wrote:

Hello.
Is there a PDF editor that would work with Debian 12?



Some time ago I used qpdf to delete some pages from a .pdf. For a quick
description:


~$ apt show qpdf

The manual contains some command examples. I used these commands to
edit a PDF:


- To delete the last two pages of a pdf:

~$ qpdf 1.pdf --pages . 1-r3 -- test.pdf

- To merge two .pdf files:

~$ qpdf --empty --pages 1.pdf 2.pdf -- test.pdf
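
For scripting, the two invocations above could be generated as follows. This
is only a sketch in Python: the helper names are made up for illustration,
and the code just builds the argument lists (matching the commands shown
above) without running qpdf. In qpdf's page ranges, rN counts from the end,
so r1 is the last page and 1-r3 keeps everything up to the third page from
the end.

```python
def qpdf_drop_last_pages(src, dst, count):
    """Arguments for qpdf that keep all but the last `count` pages of src.

    Hypothetical helper; it only mirrors the command quoted in this message.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    # r1 is the last page, so the last page to keep is r(count + 1).
    page_range = f"1-r{count + 1}"
    return ["qpdf", src, "--pages", ".", page_range, "--", dst]


def qpdf_merge(sources, dst):
    """Arguments for qpdf that concatenate all pages of the given files."""
    if not sources:
        raise ValueError("need at least one source file")
    return ["qpdf", "--empty", "--pages", *sources, "--", dst]


print(" ".join(qpdf_drop_last_pages("1.pdf", "test.pdf", 2)))
print(" ".join(qpdf_merge(["1.pdf", "2.pdf"], "test.pdf")))
```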

If you are interested in qpdf, once it is installed, read the
/usr/share/doc/qpdf/README-doc.txt file for a list of URLs where you can
find documentation.


Cheers,
--
Franco Martelli



Re: PDF Editor for Debian

2024-06-25 Thread tomas
On Tue, Jun 25, 2024 at 08:01:26PM +0200, Detlef Vollmann wrote:
> On Mon, 24 Jun 2024 04:26:47 -0400
> Timothy M Butterworth  wrote:
> 
> > I use Master PDF Editor. It works great.
> > https://code-industry.net/free-pdf-editor/
> 
> It looks nice.
> But since it is closed-source software from Russia, I'd be careful about
> running it outside of an isolated VM (which is actually true for most
> closed-source software).

Yes, not just from Russia. Here's some old, old story which might
amuse you :-)

  https://en.wikipedia.org/wiki/Crypto_AG

Cheers
-- 
tomás




Re: PDF Editor for Debian

2024-06-25 Thread Detlef Vollmann
On Mon, 24 Jun 2024 04:26:47 -0400
Timothy M Butterworth  wrote:

> I use Master PDF Editor. It works great.
> https://code-industry.net/free-pdf-editor/

It looks nice.
But since it is closed-source software from Russia, I'd be careful about
running it outside of an isolated VM (which is actually true for most
closed-source software).

  Detlef



Re: Needed tool for vision-impaired - was [Re: PDF Editor for Debian]

2024-06-24 Thread Richard
I wouldn't say PDFs are bad for visually impaired users. In fact, as bitmap
fonts are thankfully a thing of the past almost everywhere, you can
zoom any document to your heart's desire. Though sometimes you need some
tricks, e.g. Evince is configured to use only 50 MB of storage by default
for caching, which vastly limits zoom capabilities. So you'll have to dig into
dconf to change that.
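
Instead of clicking through dconf-editor, the setting can probably also be
changed with gsettings. The sketch below (Python, helper name made up)
assumes that the key is page-cache-size in the org.gnome.Evince schema and
that its value is given in megabytes; it is worth checking with
"gsettings list-keys org.gnome.Evince" before relying on it. The code only
builds the command and does not change anything.

```python
def evince_cache_command(size_mb):
    """Build a gsettings command that raises Evince's page cache size.

    Assumptions (not confirmed in this thread): schema org.gnome.Evince,
    key page-cache-size, value in megabytes.
    """
    if size_mb <= 0:
        raise ValueError("size_mb must be positive")
    return ["gsettings", "set", "org.gnome.Evince", "page-cache-size", str(size_mb)]


# 500 is an arbitrary example value.
print(" ".join(evince_cache_command(500)))
```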

What you are looking for is a way to reflow text, but as a fixed-layout
format, PDF is just not meant for that. Not even the PDF/UA standard [1]
requires this; it only lays the ground rules for screen readers.
Supposedly the Swiss-made "VIP PDF-Reader" was able to help, yet it seems
to have been abandoned, as there don't seem to be any download options
anymore. Other than that, PDF readers with that capability are very
rare on any platform. No idea if anybody besides Adobe is doing that,
because PDF is such a terribly complicated format.

In theory, this should all be doable with Tesseract, as it already does the
OCR part. It's just that nobody has bothered yet to support such use cases
and an output format that can handle more than just text.

Best
Richard

[1]: https://en.wikipedia.org/wiki/PDF/UA


Re: Needed tool for vision-impaired - was [Re: PDF Editor for Debian]

2024-06-24 Thread Nicolas George
Karen Lewellen (12024-06-24):
> Good afternoon.
> I am providing another option that might help here.
> robobraille,
> 
> www.robobraille.org
> provides services, free of charge, that convert PDF files to a number
> of different formats, including .html.
> They provide audio and mobi, and convert epub files too... but I digress.
> As a test, consider sending your file to
> convert at robobraille.org
> (written correctly, of course).
> In the subject line put html,
> leave the body blank, and attach the file.
> See if the .html file returned meets your needs.

Interesting.

Do you know how they fare with math? I mean real, non-trivial formulas
produced by LaTeX like you would find in
https://arxiv.org/abs/1803.05929 ?

(I know, I could test. I will if you do not know the answer.)

Regards,

-- 
  Nicolas George



Re: Needed tool for vision-impaired - was [Re: PDF Editor for Debian]

2024-06-24 Thread Karen Lewellen

Good afternoon.
I am providing another option that might help here.
robobraille,

www.robobraille.org
provides services, free of charge, that convert PDF files to a
number of different formats, including .html.

They provide audio and mobi, and convert epub files too... but I digress.
As a test, consider sending your file to
convert at robobraille.org
(written correctly, of course).
In the subject line put html,
leave the body blank, and attach the file.
See if the .html file returned meets your needs.
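
For anyone who would rather script this, such a request could be put
together roughly as follows. This is only a sketch in Python using the
standard email package: the helper name and the sender address are made up,
and the code builds the message without sending it.

```python
from email.message import EmailMessage


def build_conversion_request(sender, recipient, pdf_bytes, filename,
                             target_format="html"):
    """Build an email with the target format as subject and the PDF attached.

    Hypothetical helper: it follows the steps described above (format in the
    subject line, empty body, file attached) and does not send anything.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = target_format  # e.g. "html"
    msg.set_content("")             # the body stays blank
    msg.add_attachment(pdf_bytes, maintype="application", subtype="pdf",
                       filename=filename)
    return msg


# Placeholder sender and content; the recipient is the address given above,
# written out in the usual form.
request = build_conversion_request(
    "me@example.org", "convert@robobraille.org", b"%PDF-1.4 placeholder",
    "document.pdf")
print(request["Subject"], [part.get_filename() for part in request.iter_attachments()])
```

Sending would then be a matter of handing the message to smtplib or to your
usual mail client.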
Best,
Karen



On Mon, 24 Jun 2024, Richard Owlett wrote:


On 06/24/2024 12:35 AM, Richard wrote:

 Hello,
 this very much depends on what you are expecting it to do. In general,
 PDFs
 are only meant to be viewed - and printed - they were never meant for
 anything else. ...


Second sentence should read:

 ... only meant to be viewed by those with *NORMAL* vision ...


I'm attempting to read a USDA document.[1]
The printed version of this document is marginally readable.

Tools such as "Atril Document Viewer" provide selected magnification.
For this particular document and monitor, 150% is comfortable. That requires 
repositioning the view 500 to 600 times to read the document.


For _this_ document, Atril can select all the text on a page in a manner that 
can be pasted in a "reasonable" manner to a Pluma document.


It will:
   a. ignore actual graphics.
   b. put title/headings/??? on a separate line.
   c. all text between full page-width title/headings/??? will be
 treated as a logical unit.
It will not:
   1. put a blank line between paragraphs.
   2. put a blank line above/below lines containing title/headings/???.
   3. identify superscripts in some manner.

All this suggests that it should be able to extract text from a PDF and 
create a HTML document likely using only , , , and  in its 
.



[1] 
https://fns-prod.azureedge.us/sites/default/files/resource-files/TFP2021.pdf

_Thrifty Food Plan, 2021_
Food and Nutrition Service
August 2021
FNS-916






Needed tool for vision-impaired - was [Re: PDF Editor for Debian]

2024-06-24 Thread Richard Owlett

On 06/24/2024 12:35 AM, Richard wrote:

Hello,
this very much depends on what you are expecting it to do. In general, PDFs
are only meant to be viewed - and printed - they were never meant for
anything else. ...


Second sentence should read:

... only meant to be viewed by those with *NORMAL* vision ...


I'm attempting to read a USDA document.[1]
The printed version of this document is marginally readable.

Tools such as "Atril Document Viewer" provide selected magnification.
For this particular document and monitor, 150% is comfortable. That requires 
repositioning the view 500 to 600 times to read the document.


For _this_ document, Atril can select all the text on a page in a manner 
that can be pasted in a "reasonable" manner to a Pluma document.


It will:
   a. ignore actual graphics.
   b. put title/headings/??? on a separate line.
   c. all text between full page-width title/headings/??? will be
  treated as a logical unit.
It will not:
   1. put a blank line between paragraphs.
   2. put a blank line above/below lines containing title/headings/???.
   3. identify superscripts in some manner.

All this suggests that it should be able to extract text from a PDF and 
create a HTML document likely using only , , , and  in 
its .
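
To make the idea concrete, here is a minimal sketch in Python of turning such
pasted text into bare-bones HTML. Everything in it is an assumption for
illustration: the function names, the choice of <h2> and <p> as the only
tags, and the heuristic that a short line without closing punctuation is a
title or heading. It does not read the PDF itself and does not handle
superscripts.

```python
import html


def looks_like_heading(line):
    """Heuristic (assumption): short line that does not end a sentence."""
    return 0 < len(line) <= 60 and line[-1] not in ".,;:!?"


def text_to_html(text):
    """Wrap pasted plain text in minimal HTML: headings and paragraphs only."""
    parts = []
    paragraph = []

    def flush():
        # Emit the collected lines as one paragraph.
        if paragraph:
            parts.append("<p>" + html.escape(" ".join(paragraph)) + "</p>")
            paragraph.clear()

    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            flush()
        elif looks_like_heading(line):
            flush()
            parts.append("<h2>" + html.escape(line) + "</h2>")
        else:
            paragraph.append(line)
    flush()
    return "<html><body>\n" + "\n".join(parts) + "\n</body></html>"


sample = (
    "Thrifty Food Plan, 2021\n"
    "This line stands in for body text copied from a page of the document,\n"
    "and the paragraph ends here.\n"
)
print(text_to_html(sample))
```

Because headings end up on their own line in the pasted text, this already
gives visible separation above and below them; splitting paragraphs within a
block would need more information than the pasted text contains.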



[1] 
https://fns-prod.azureedge.us/sites/default/files/resource-files/TFP2021.pdf

_Thrifty Food Plan, 2021_
Food and Nutrition Service
August 2021
FNS-916



Publishing Formats (was: PDF Editor for Debian)

2024-06-24 Thread Richard

Since it's quite OT, starting a new thread for this.

I would most certainly never call formats like OOXML or ODF “publishing formats”; they are content creation or editing formats. From a publishing format I expect the content to be shown as intended. Admittedly, neither of them can do that 100 %; the probability of messing up just isn't that big. Either you want a fixed format, e.g. for printing, which is what you get with the likes of PDF, PS, SVG or the various raster graphics formats, or you want your content to adapt to the viewer in a foreseeable way, i.e. HTML, usually with the help of CSS and, in the worst case, JS.

Sure, OOXML and ODF want to be the former, but due to technical caveats that's not necessarily possible. With OOXML, you have several incompatible versions that you can't easily tell apart, and identical display is often impossible because proprietary fonts are used but not embedded by default. On top of that, the format is an abomination spanning around 5500 pages, plus another 1000 pages for its transitional mode, and it was only standardized through world-wide corruption.

ODF usually does things much better, but support in software beyond LibreOffice is still often lacking, though that's not the format's fault, since it is much simpler, being documented in only around 1000 pages. Still, it doesn't communicate fixed positions and, as far as I can tell, doesn't imply them by telling the software explicitly how to render fonts so that the result always looks identical. It also doesn't embed fonts (or the needed subset) by default, so it doesn't really fulfill the need either.


And no, editing a PDF as docx isn't the easiest, let alone the best, way to edit a PDF, especially not with some dubious web tool. Maybe someone can write an AI for that, but even then it's most likely much easier to just go the OCR route to derive the content and extract the layout from the document.

I don't know how strictly PDF defines things; I only ever hear that PDF is at least as much of an unholy mess as OOXML (which PDF 2.0 was supposed to fix, though still pretty much no software creates PDF 2.0 by default, even if most software seems to support it) and that writing tools like Ghostscript or Poppler is a royal pain. LaTeX can probably only get around this because it just has to create a PDF from a predefined set of functions, and be able to embed other PDFs into those PDFs.

But the most reliable way to edit PDFs that I know of, given that I have little to no experience with most commercial solutions, is Inkscape. If the internal importer succeeds, you get great text editing features, which obviously can't rival office suites, but at least you don't almost inevitably mess up the whole layout.


Richard


On 24.06.24 10:31, jeremy ardley wrote:

In my view, pdf and docx should be regarded as publication formats for content 
managed in a professional content management system. HTML, odt and 
PostScript also fall into the category of publication formats.

Word documents suffer because back in the dim ages of the late 1980s Microsoft 
decided to merge content management with content editing and content publishing, 
and failed abysmally at all of them.

However, the easiest way to edit a PDF is to convert it to Word using, say, 
https://pdf2docx.com/ There are also plenty of ways to do that on Linux, but 
they all take time and effort to get working.



Re: PDF Editor for Debian

2024-06-24 Thread jeremy ardley



On 24/6/24 13:35, Richard wrote:
So your best bet is just to try to never have to edit a PDF at all. 
Always try to get a hand on the original file the PDF was delivered 
from. Even if it's a docx 



In my view, pdf and docx should be regarded as publication formats for 
content managed in a professional content management system. HTML, 
odt and PostScript also fall into the category of publication formats.


Word documents suffer because back in the dim ages of the late 1980s 
Microsoft decided to merge content management with content editing and 
content publishing, and failed abysmally at all of them.


However, the easiest way to edit a PDF is to convert it to Word using, say, 
https://pdf2docx.com/ There are also plenty of ways to do that on Linux, 
but they all take time and effort to get working.




Re: PDF Editor for Debian

2024-06-24 Thread Timothy M Butterworth
On Mon, Jun 24, 2024 at 2:23 AM Arbol One  wrote:

> Hello.
> Is there a PDF editor that would work with Debian 12?
>

I use Master PDF Editor. It works great.
https://code-industry.net/free-pdf-editor/

Thanks.
> --
> *ArbolOne.ca* Using Fire Fox and Thunderbird. ArbolOne is composed of
> students and volunteers dedicated to providing free services to charitable
> organizations. ArbolOne on Java Development is in progress [ í ]
>


-- 
⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢠⠒⠀⣿⡁ Debian - The universal operating system
⢿⡄⠘⠷⠚⠋⠀ https://www.debian.org/
⠈⠳⣄⠀⠀


Re: PDF Editor for Debian

2024-06-24 Thread Klaus Singvogel
Arbol One wrote:
> Is there a PDF editor that would work with Debian 12?

It depends on what you mean by "edit", and whether you expect to 
use Free Open Source Software (FOSS) or not.

If you just want to fill out forms (JavaScript), then I'd recommend the FOSS 
programs Chromium (not Google Chrome) or Evince.

If you want to edit the PDF itself, like moving lines, editing text or 
rearranging elements (like pictures), you can either use LibreOffice (but for 
me it wasn't quite usable), or buy a license for a commercial program.

For commercial programs, I've had good experiences with Master PDF Editor. But 
I'd also give Qoppa a try, because a lot of people say that Qoppa is the better 
choice.
My experience with Master PDF Editor: I've been running it since 2019 with the 
same purchased license. Meanwhile my Debian has gone from Jessie, via Stretch 
and Buster, to Bullseye (now), and it's still running. I have to admit that I 
needed to reinstall the DEB package every now and then (not after a Debian 
upgrade), due to issues with Qt5. So it was a good idea to download the DEB 
package after I bought it and to keep the packages to this day. As far as I 
can tell, all required dependencies are included in the downloaded DEB package.

Best regards,
Klaus.
-- 
Klaus Singvogel
GnuPG-Key-ID: 1024R/5068792D  1994-06-27



Re: PDF Editor for Debian

2024-06-23 Thread Richard
Hello,
this very much depends on what you are expecting it to do. In general, PDFs
are only meant to be viewed - and printed - they were never meant for
anything else. Even filling out forms is just a bad hack job done through
JavaScript. That being said, there is software with PDF editing
capabilities on Linux, though it's much more basic than what you'll find on
Windows.

If you just want to make comments, Okular has some neat capabilities,
including signing PDFs. For handwritten notes on a PDF, Xournal++ is a
great tool. If you just want to reorder pages, rotate, delete or
add them, there are some tools like PDFSam. There's also the quite powerful
Ghostscript, though that's CLI only. At least I don't know of any GUI. For
more "editing" features, LibreOffice can import PDFs, but in my experience
it struggles quite a lot with layout. OnlyOffice also has that capability,
but I have never used it. Inkscape can do that as well. It can also import
multiple pages at once, but I recommend importing only single pages,
otherwise Inkscape quickly reaches its limits. It has two import modes, an
internal one and Poppler. Use the internal one and see if that works for
you. It's easier to edit text boxes in there, but it's quite likely it
won't be able to use the right font, which will break the whole look. The
Poppler import can preserve the look, but that's because letters aren't
imported as letters but as paths. So you can't just edit text, you'd have
to delete letters and try to insert text in a way that looks decent.

Other than that, there are a few commercial tools, but they are not that
well known. So your best bet is to try never to have to edit a PDF at
all. Always try to get hold of the original file the PDF was generated
from. Even if it's a docx - Microsoft's infamous wannabe open source format
that nobody can handle properly, including their own software - it
will most likely be handled better by the software you use than a PDF made
editable.

Best
Richard

On Mon, Jun 24, 2024, 07:13 Arbol One  wrote:

> Hello.
> Is there a PDF editor that would work with Debian 12?
>
> Thanks.
> --
> *ArbolOne.ca* Using Fire Fox and Thunderbird. ArbolOne is composed of
> students and volunteers dedicated to providing free services to charitable
> organizations. ArbolOne on Java Development is in progress [ í ]
>


PDF Editor for Debian

2024-06-23 Thread Arbol One

Hello.
Is there a PDF editor that would work with Debian 12?

Thanks.

--
*/ArbolOne.ca/* Using Fire Fox and Thunderbird. ArbolOne is composed of 
students and volunteers dedicated to providing free services to 
charitable organizations. ArbolOne on Java Development is in progress [ í ]