Salut Pascal, I'm glad that you were able to determine that Tesseract is working correctly.
On Wednesday, September 17, 2025 at 6:57:45 AM UTC-4 [email protected] wrote:

> Do you have any idea what's going on?

You are working with two different applications that folks in this forum are unlikely to know much about: 1) gscan2pdf, which uses/embeds Tesseract, and 2) Okular, which I'm guessing is a PDF viewer. That leaves a number of areas where things could go awry, including the way the PDF is constructed and the way the text is selected and formatted on the clipboard.

I suspect that the text is being split into multiple text blocks and that each of those blocks is getting a new line added for "free" at the end. Where in the processing chain this is happening isn't clear.

If your goal is simply to get the best rendition of the text, it sounds like you've discovered what is needed. If you want to get that specific combination of programs to work better, you're probably going to need to address it with whoever supports them.

Bonne chance!

Tom

