# Steps to reproduce

```bash
tesseract pngs.txt "$name" -l ita pdf
```

# Error

```
Page 0 : 
/storage/emulated/0/Download/tmp/CIRCOLARE-SPOSTAMENTO-CLASSI-DAL-30-09-2024-_-1D-AG-EN-_-2D-AG-EN-_-1LA-_-2LA-_-3LA.pdf-1.png
pdf to convert: 
/storage/emulated/0/Download/tmp/CIRCOLARE-SPOSTAMENTO-CLASSI-DAL-30-09-2024-_-1D-AG-EN-_-2D-AG-EN-_-1LA-_-2LA-_-3LA.pdf.txt
Syntax Warning: May not be a PDF file (continuing anyway)
Syntax Error: Couldn't find trailer dictionary
Syntax Error: Couldn't find trailer dictionary
Syntax Error: Couldn't read xref table
Error in findFileFormatStream: truncated file
Error during processing.
pdf to convert: 
/storage/emulated/0/Download/tmp/CIRCOLARE-SPOSTAMENTO-CLASSI-DAL-30-09-2024-_-1D-AG-EN-_-2D-AG-EN-_-1LA-_-2LA-_-3LA.pdf.pdf
converting 
CIRCOLARE-SPOSTAMENTO-CLASSI-DAL-30-09-2024-_-1D-AG-EN-_-2D-AG-EN-_-1LA-_-2LA-_-3LA.pdf
.pdf to png...
Syntax Error: Document stream is empty
Error, could not create PDF output file: Operation not permitted
```

# Thoughts

I'm providing tesseract a list of PNG files, and this worked while I was 
outputting text (`tesseract pngs.txt "$name" -l ita txt`), but when I tried 
doing the same for a PDF it didn't work :/
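
For reference, here is a minimal sketch of the setup described above, not a confirmed fix. It assumes the list file should contain only PNG paths, one per line in page order, and that the output directory is writable. The helper names `make_png_list` and `ocr_to_pdf` are hypothetical, and `sort -V` (version sort) assumes GNU or BusyBox `sort`.

```bash
#!/usr/bin/env bash

# Hypothetical helper: write the paths of all PNG files directly inside
# a directory to a list file, one per line. Version sort keeps page
# order (-1, -2, ..., -10); this assumes GNU or BusyBox sort.
make_png_list() {
  local dir=$1 list=$2
  find "$dir" -maxdepth 1 -type f -name '*.png' | sort -V > "$list"
}

# Hypothetical helper: run tesseract on the list and produce a PDF.
# tesseract appends ".pdf" to the output base itself, so the base
# should not already end in ".pdf".
ocr_to_pdf() {
  local list=$1 out_base=$2
  local out_dir
  out_dir=$(dirname "$out_base")
  if [ ! -w "$out_dir" ]; then
    echo "output directory is not writable: $out_dir" >&2
    return 1
  fi
  tesseract "$list" "$out_base" -l ita pdf
}
```

Usage would look like `make_png_list "$src_dir" pngs.txt` followed by `ocr_to_pdf pngs.txt "$out_dir/$name"`, where `src_dir` and `out_dir` are placeholder variables. Keeping the list limited to `*.png` rules out stray `.txt` or `.pdf` entries, and the writability check separates permission problems from OCR problems.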

I know there are lots of tools that use tesseract for this, but I prefer 
doing it with tesseract + combo of other tools if necessary so that I get 
better/easier control over tesseract itself.

Thank you in advance, I'm sure this must be something common but I just 
can't seem to get it right!

