[image: KoreanOCRExample.PNG]

Hi all,
As a Tesseract/OCR newbie, I am currently working on deepening my understanding of Tesseract's foundations and OCR basics. While doing so, I came across the following strange result: when running Tesseract on some Korean Wikipedia pages (related to mathematics), it is completely unable to recognize the text correctly in this one case. Most of the other sample passages I tested showed no problems, and pre-processing had no impact on the result here. So what am I missing? Are Korean-language corpora so rare that the NN hasn't seen these particular letters? I have attached the scanned passage. Thank you in advance.