On Mon, Aug 20, 2018 at 10:10:16AM +0200, Félix Sipma wrote:
> Are you aware of "libreoffice --convert-to pdf --outdir . document.doc"?

Yes.

> It seems to work well, so I'm not sure what lloconv brings to the table :-).

It's faster for a start - on a random .doc I have to hand:

$ time lloconv disclosure.doc tmp.pdf
No language whitelisted, turning off the language support.

real    0m0.431s
user    0m0.369s
sys     0m0.069s
$ time libreoffice --convert-to pdf --outdir . disclosure.doc > /dev/null

real    0m0.670s
user    0m0.548s
sys     0m0.098s

If you're just converting a single document then either is likely fine,
but for converting a lot of documents the extra overhead often starts to
matter.
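
As a rough sketch of how that per-file overhead could be measured over
a set of documents, using only the two invocations shown above.  The
function names and the LLOCONV/LIBREOFFICE variables are made up for
illustration and are not part of either tool:

```shell
# Hypothetical helpers; the command names can be overridden so the
# loops can be tried out with stand-ins.
LLOCONV=${LLOCONV:-lloconv}
LIBREOFFICE=${LIBREOFFICE:-libreoffice}

# Convert each given file to PDF, starting one lloconv per file.
# The output is written next to the input with a .pdf extension.
each_lloconv() {
    for f in "$@"; do
        "$LLOCONV" "$f" "${f%.*}.pdf" || return 1
    done
}

# The same, but starting one libreoffice process per file.
each_libreoffice() {
    for f in "$@"; do
        "$LIBREOFFICE" --convert-to pdf --outdir "$(dirname "$f")" "$f" \
            > /dev/null || return 1
    done
}
```

In bash, "time each_lloconv *.doc" and "time each_libreoffice *.doc"
would then give the totals to compare.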

You can batch convert many files to a single format with --convert-to,
but lloconv's server mode allows reusing a single LibreOfficeKit
instance to perform conversions to any format you want.  A batch
conversion with --convert-to also needs enough scratch space for all
the output files, whereas lloconv in server mode can be used to
process files on demand.
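
For comparison, a minimal sketch of the --convert-to batch case, which
passes several input files to a single libreoffice process.  The
function name and the LIBREOFFICE variable are hypothetical; note that
every output file ends up in the one output directory:

```shell
# Hypothetical wrapper; the command name can be overridden.
LIBREOFFICE=${LIBREOFFICE:-libreoffice}

# batch_to_pdf OUTDIR FILE...
# Convert all the given files to PDF in one libreoffice run, writing
# the results into OUTDIR (created if it does not exist yet).
batch_to_pdf() {
    outdir=$1
    shift
    mkdir -p "$outdir" || return 1
    "$LIBREOFFICE" --convert-to pdf --outdir "$outdir" "$@"
}
```

For example: batch_to_pdf converted *.doc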

With lloconv you can specify options to the conversion (maybe
--convert-to supports that too, but if so it seems to be undocumented).
Sadly the LibreOfficeKit options are also rather poorly documented.  I
know of "SkipImages" (which is handy if you just want to extract text
for indexing, for example) as that's mentioned in the LOK headers.  If
you can figure out the names, I think this should also let you set the
options that you can set when exporting by hand.

Also lloconv seems to be more reliable - on the trivial testcase from the
autopkgtest in the lloconv package, libreoffice --convert-to gives a
blank PDF for no obvious reason:

$ echo '<html><title>foo</title><body>hello world</body></html>' > in.html
$ libreoffice --convert-to pdf --outdir . in.html
[...]
$ pdftotext in.pdf
$ ls -l in.txt
-rw-r--r-- 1 olly olly 1 Aug 21 08:46 in.txt
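
A check like this could be scripted along the following lines.  This is
only a sketch: the function names are made up, and it assumes an empty
page comes out of pdftotext as nothing but whitespace (such as the
single byte seen above):

```shell
# Succeed if stdin contains at least one non-whitespace character.
has_text() {
    grep -q '[^[:space:]]'
}

# Succeed if pdftotext extracts any visible text from the PDF in $1.
# The "-" makes pdftotext write the text to stdout instead of a file.
pdf_has_text() {
    pdftotext "$1" - | has_text
}
```

For example: pdf_has_text in.pdf || echo "in.pdf looks blank"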

Cheers,
    Olly
