On 11/30/2018 8:41 AM, Werner LEMBERG wrote:

About a month ago I wrote:

We have a new pdf parser (pplib from Paweł Jackowski) that replaces
poppler.  It is much smaller, a bit faster and it's written in
pure C [...]

Is there a project page for pplib?  The source code of this library
contained in TeXLive is very, very uncommented – in particular, a
description of the API is completely missing, AFAICS.  It also comes
with overly long lines and extremely densely written C code; it
almost feels as if the original source has been written with cweb or
something like that.

I would be glad if someone could answer my question.

During BachoTeX 2018 Pawel Jackowski (son of Jacko -- tex gyre project) showed me some code, and after looking at it we realized that it could be used as a drop-in replacement for poppler.

In luatex the pdf library is actually not used that much: it is used to open a pdf file and traverse the object tree. It has no further role in the backend, which copies and creates objects itself. So a lightweight drop-in replacement was considered quite doable. Pawel explicitly limited the functionality to a bare minimum: opening a file and traversing objects. (But it's quite advanced; for instance, we can also access password-protected files.)
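To illustrate what "opening a file and traversing objects" amounts to, here is a minimal sketch in Python. It is not the pplib or luatex interface: the `Ref` class, the `XREF` table and the helper names are all made up for illustration, and the parsed file is replaced by a small hand-written object table. Only the general pdf layout (trailer -> catalog -> page tree with Kids) is taken from the pdf format itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """Hypothetical indirect reference, like '3 0 R' in a pdf file."""
    num: int


def resolve(obj, xref):
    """Follow indirect references until a direct object is reached."""
    seen = set()
    while isinstance(obj, Ref):
        if obj.num in seen:  # guard against reference loops in bad files
            raise ValueError(f"circular reference at object {obj.num}")
        seen.add(obj.num)
        obj = xref[obj.num]
    return obj


def iter_pages(node, xref):
    """Yield leaf page dictionaries in document order, depth first."""
    node = resolve(node, xref)
    if node.get("Type") == "Page":
        yield node
        return
    for kid in resolve(node.get("Kids", []), xref):
        yield from iter_pages(kid, xref)


# Toy object table standing in for a parsed file (an assumption, not real data).
XREF = {
    1: {"Type": "Catalog", "Pages": Ref(2)},
    2: {"Type": "Pages", "Count": 3, "Kids": [Ref(3), Ref(4)]},
    3: {"Type": "Page", "MediaBox": [0, 0, 595, 842]},
    4: {"Type": "Pages", "Count": 2, "Kids": [Ref(5), Ref(6)]},
    5: {"Type": "Page", "MediaBox": [0, 0, 612, 792]},
    6: {"Type": "Page", "MediaBox": [0, 0, 420, 595]},
}
TRAILER = {"Root": Ref(1)}


def page_boxes(trailer, xref):
    """Walk trailer -> catalog -> page tree and collect each MediaBox."""
    catalog = resolve(trailer["Root"], xref)
    return [resolve(page["MediaBox"], xref)
            for page in iter_pages(catalog["Pages"], xref)]


if __name__ == "__main__":
    for index, box in enumerate(page_boxes(TRAILER, XREF), start=1):
        print(index, box)
```

The point of the sketch is that the reader side only has to resolve references and walk dictionaries and arrays; everything else (copying and creating objects) stays in the backend.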

So, basically it went this way: Pawel wrote the code, and I replaced the inclusion code and rewrote the pdf access library (so that one got a different interface; the old one was way more complex and even had issues, so we're not compatible here). Then Luigi spent quite some time on integrating the library into the luatex source tree.

The final integration involved dealing with cross-platform issues. Especially the arm platform, with its different alignment rules, took some work (Luigi and Pawel sorted that out eventually). We had some feedback from context testers (it's also always debatable to what extent one should support fuzzy cases, bad documents, etc.).

There might still be corner cases to cover, but we expect everything to be ready in time for tex live 2019. The biggest advantage is that we got rid of a c++ dependency and that the code (which is unlikely to change much) is part of the luatex code base. So it is in fact a library made specially for luatex, originating in the tex community.

I hope that explains it a bit (there is not much more to tell, I guess; normally this kind of progress gets reported in status articles).

Hans

-----------------------------------------------------------------
                                          Hans Hagen | PRAGMA ADE
              Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
       tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------
