Hello PoDoFo devs and users,

I have a quick update on the pdfmm/PoDoFo-next merge status:
- I checked in the text extraction API, with a small test[1];
- I checked in a full review/revamp of the IO subsystem.

The latter review turned out to be more complex than I initially thought, but it was eventually successful. The aim of the IO system review was to clean it up, making it simpler and more powerful at the same time. One issue with the previous hierarchy was, for example, that PdfInputDevice and PdfOutputDevice did not inherit from PdfInputStream and PdfOutputStream respectively, so adapter classes were needed to interchange instances. The naming choices were also sometimes weak: for example, PdfOutputDevice in fact had a read-write contract.

The new hierarchy takes inspiration from the C++ and .NET hierarchies and tries to take the best of both worlds: non-overlapping Read/Write contracts/interfaces exist, but most implementers of these just inherit a merged Read/Write StreamDevice[2] class, similar to the Stream class in .NET[3]. This is a major cleanup/simplification, as it makes the hierarchy simpler and more balanced: there is no requirement to have specialized implementations for all the Read/Write/ReadWrite combinations. Many of those were missing previously, which made the API look incomplete. The trade-off is giving up a bit of type enforcement (which wasn't fully enforced before anyway) in exchange for fewer implementations to maintain. Specialized Read-only or Write-only implementations are still possible, and a few notable examples exist (e.g. PdfCanvasInputDevice).

Attached is the UML diagram of the new hierarchy. The naming choices have been weighed very carefully so that the class names are not excessively long (the "Pdf" prefix has been sacrificed for these classes only, also because they are of very generic use).

Even though I have quite some experience with big refactors, it's always surprising how long it takes to accomplish them: I reached the current model after more than 8 iterations/reversals.

With this news, my list of TODOs is shortening quickly[4]. A couple of medium API reviews plus the porting of the tools and I will be ready to integrate into PoDoFo-next.
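
To make the "separate contracts, one merged implementation" idea above more concrete, here is a minimal sketch. It is not the pdfmm code: every name in it (InputStream, OutputStream, DeviceAccess, StreamDevice as written here, MemoryDevice, CopyAll) is hypothetical and chosen only for illustration, and it assumes that access rights are checked at run time rather than by the type system.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical names throughout: an illustration, not the pdfmm API.

// Read-only contract.
class InputStream {
public:
    virtual ~InputStream() = default;
    // Reads up to `size` bytes into `buffer`; returns the bytes actually read.
    virtual std::size_t Read(char* buffer, std::size_t size) = 0;
};

// Write-only contract.
class OutputStream {
public:
    virtual ~OutputStream() = default;
    virtual void Write(std::string_view data) = 0;
};

enum class DeviceAccess { Read = 1, Write = 2, ReadWrite = 3 };

// Merged read/write base: the access mode is enforced at run time.
class StreamDevice : public InputStream, public OutputStream {
public:
    explicit StreamDevice(DeviceAccess access) : m_access(access) {}
    bool CanRead() const { return (static_cast<int>(m_access) & 1) != 0; }
    bool CanWrite() const { return (static_cast<int>(m_access) & 2) != 0; }

    std::size_t Read(char* buffer, std::size_t size) override final {
        if (!CanRead()) throw std::logic_error("device is not readable");
        return readBuffer(buffer, size);
    }
    void Write(std::string_view data) override final {
        if (!CanWrite()) throw std::logic_error("device is not writable");
        writeBuffer(data);
    }

protected:
    virtual std::size_t readBuffer(char* buffer, std::size_t size) = 0;
    virtual void writeBuffer(std::string_view data) = 0;

private:
    DeviceAccess m_access;
};

// A single concrete class covers the Read, Write and ReadWrite cases.
class MemoryDevice final : public StreamDevice {
public:
    explicit MemoryDevice(DeviceAccess access, std::string initial = {})
        : StreamDevice(access), m_data(std::move(initial)) {}
    const std::string& Data() const { return m_data; }

protected:
    std::size_t readBuffer(char* buffer, std::size_t size) override {
        assert(m_pos <= m_data.size());
        std::size_t count = std::min(size, m_data.size() - m_pos);
        std::memcpy(buffer, m_data.data() + m_pos, count);
        m_pos += count;
        return count;
    }
    void writeBuffer(std::string_view data) override { m_data.append(data); }

private:
    std::string m_data;
    std::size_t m_pos = 0;
};

// Consumers depend only on the narrow contract they need.
std::size_t CopyAll(InputStream& input, OutputStream& output) {
    char chunk[4];
    std::size_t total = 0;
    while (std::size_t n = input.Read(chunk, sizeof(chunk))) {
        output.Write(std::string_view(chunk, n));
        total += n;
    }
    return total;
}
```

In this sketch, CopyAll accepts any InputStream/OutputStream pair, so a Read-only specialization can still be passed without going through the merged base. The cost is the one mentioned above: calling Read on a device opened for Write compiles fine and only fails at run time.
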
The plan is still this summer, hopefully before the end of August.

Cheers,
Francesco

[1] https://github.com/pdfmm/pdfmm/blob/f2be85e365a186f51fd13147cc6a0f1bc6ce0aa6/test/unit/TextExtraction.cpp#L15
[2] https://github.com/pdfmm/pdfmm/blob/675c03a872c0d8969ae5e123940c88712107a03b/src/pdfmm/base/PdfStreamDevice.h#L27
[3] https://docs.microsoft.com/en-us/dotnet/api/system.io.stream?view=net-6.0
[4] https://github.com/pdfmm/pdfmm/blob/master/TODO.md

_______________________________________________
Podofo-users mailing list
Podofo-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/podofo-users