In the case of NAPS2, for instance, it's not so much the import functions 
(though those would be good for other software) as the writing (export) 
functions, since it currently saves to image formats and PDF formats.

That got me thinking that the Document Liberation project would be helpful for 
saving in different formats. LibreOffice can save in different formats, but for 
some software a full editor may be too heavy for what it needs to do. Having 
the Document Liberation code as a set of shared libraries for reading and 
saving in different formats would be helpful. That way, software that doesn't 
need to edit documents, just save them in different formats, can benefit.

For instance, open source scanning software, especially when performing OCR 
operations, then needs to save the result in an editable format.

> On 2 Sep 2025, at 07:29, Miklos Vajna <[email protected]> wrote:
> 
> Hi,
> 
> On Tue, Sep 02, 2025 at 04:15:44PM +1000, Chris Sherlock 
> <[email protected]> wrote:
>>> To make it easier to maintain the code and enable other open source 
>>> software to use it, it can be useful to split off the import/export code 
>>> as libraries. This way other software, such as NAPS2, can then use the 
>>> libraries to scan and save to doc or docx files. The files produced can 
>>> then be edited in LibreOffice, for instance, which would complement its 
>>> OCR functions.
>>> 
>>> What would it take to split off this code into independent libraries which 
>>> can be installed and used by other software, as well as LibreOffice?
> 
> This is not as easy as it sounds, because naturally the DOC & DOCX
> import code maps from the specific formats to Writer's doc model. So in
> case other software would want to use this, then you would need to map
> to that different document model, so there is not much to share.
> 
> Additionally, given that e.g. DOCX can embed XLSX or PPTX files, you
> still need all of LibreOffice to handle these documents properly; you
> can't just split off some of the import code to a separate library.
> 
> If other software wants to reuse LibreOffice's import filters, perhaps
> use LibreOffice to convert to (flat) ODF, then only handle that one
> format in your application?
> 
>> Does anyone know if there is a library based on librevenge that handles doc 
>> and docx files?
> 
> I'm not aware of anything like that;
> https://www.documentliberation.org/projects/ has a list of existing
> importers.
> 
> Regards,
> 
> Miklos
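
The conversion route suggested above (let LibreOffice do the conversion to
flat ODF, then handle only that one format) could look roughly like the
sketch below. It is only an illustration under stated assumptions: that a
LibreOffice install exposes a "soffice" binary on PATH, and that its
--headless, --convert-to and --outdir command line options are used. The
function names and the file names are hypothetical and not part of
LibreOffice or NAPS2.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(source, target_format, out_dir, soffice="soffice"):
    """Return the argv list for a headless LibreOffice conversion.

    `soffice` is assumed to be the LibreOffice launcher on PATH.
    """
    return [
        soffice,
        "--headless",                   # run without a user interface
        "--convert-to", target_format,  # e.g. "fodt" for flat ODF text
        "--outdir", str(out_dir),       # directory for the converted file
        str(source),
    ]


def convert(source, target_format, out_dir, soffice="soffice"):
    """Run the conversion and return the path where output is expected."""
    if shutil.which(soffice) is None:
        raise FileNotFoundError(f"{soffice!r} not found on PATH")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(
        build_convert_command(source, target_format, out_dir, soffice),
        check=True,  # raise CalledProcessError if soffice exits non-zero
    )
    # Assumption: the output keeps the input's base name and takes the
    # target extension (the part before any ":filter" suffix).
    extension = target_format.split(":")[0]
    return out_dir / f"{Path(source).stem}.{extension}"
```

A call such as convert("report.docx", "fodt", "converted") would then leave
the application with a single format to parse. The same call with a target
such as "docx" might cover the export direction for scanning software, at
the cost of needing a LibreOffice install rather than a small library.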
