On Thursday, 29 January 2026, at 20:29:45 (Central European Standard Time), Gans, Jason David wrote:
> Hello Poppler project,
>
> I have been working towards a solution for extracting text from PDF files
> that contain embedded Unicode values that do not match rendered glyphs.
> This idea was mentioned in the Poppler mailing lists back in 2012
> (https://lists.freedesktop.org/archives/poppler/2012-April/009035.html),
> but I couldn’t find any information suggesting that it was implemented and
> tested.
>
> I have posted an experimental version of Poppler (“Poppler-science”;
> https://github.com/lanl/poppler-science) that has been modified to include
> a multilayer perceptron to decode font glyph symbols that are commonly used
> in the scientific literature. I would appreciate any feedback from the
> Poppler community and any suggestions for improvements!
Let's follow up in https://gitlab.freedesktop.org/poppler/poppler/-/merge_requests/2111

Cheers,
  Albert

>
> Regards,
>
> Jason Gans
>
> Bioscience Division
> Los Alamos National Laboratory
