How big was your training text? How many iterations did you run? Did the fonts you used for training support the plus-minus sign (±)?
You can run training with --debug_interval -1 so that the console messages show whether the plus-minus sign is being picked up for training.

On Mon, 17 Jun 2019, 23:29 Jingjing Lin, <joejoeu...@gmail.com> wrote:

> Thanks. It works. The new character I added was there.
>
> Do you have any idea why, after fine-tuning, tesseract still couldn't
> recognize the new character I added? When I tried to add '±' to eng it
> worked, but when I tried to add '±' to chi_sim it didn't (explained
> below). Is there anything we need to pay attention to when fine-tuning
> languages other than eng?
>
> I used
>
> lstmeval --model ~/tesstutorial/trainplusminus/plusminus_checkpoint \
>   --traineddata ~/tesstutorial/trainplusminus/chi_sim/chi_sim.traineddata \
>   --eval_listfile ~/tesstutorial/evalplusminus/chi_sim.training_files.txt 2>&1 |
>   grep ±
>
> to check, and ± only shows up in Truth but not in OCR.
>
> On Monday, June 17, 2019 at 11:31:24 AM UTC-4, shree wrote:
>>
>> combine_tessdata -u new.traineddata new.
>>
>> will unpack the traineddata file. Check new.lstm-unicharset in it.
>>
>> On Monday, June 17, 2019 at 8:20:24 PM UTC+5:30, Jingjing Lin wrote:
>>>
>>> I tried to fine-tune the model and add a new character via training, but
>>> it seems the newly generated traineddata still can't recognize this new
>>> character. To debug, I want to check whether the new character is in the
>>> .unicharset of the new traineddata. Is there a way to do this?
>>>
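For the unicharset check described above, a small shell helper along these lines may save some scrolling. It is only a sketch: it assumes the usual unicharset layout (first line is the entry count, and every later line starts with the character followed by space-separated properties), and has_unichar is a hypothetical name, not a Tesseract tool.

```shell
# has_unichar FILE CHAR
# Succeeds (exit 0) if CHAR is the first field of any entry line in FILE.
# Assumption: line 1 holds the entry count, so it is skipped.
has_unichar() {
  awk -v ch="$2" '
    # Force a string comparison so digits such as "5" are not compared as numbers.
    NR > 1 && ($1 "") == (ch "") { found = 1 }
    END { exit !found }
  ' "$1"
}
```

After unpacking with combine_tessdata -u new.traineddata new. you could then run: has_unichar new.lstm-unicharset '±' && echo present || echo missing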
--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscr...@googlegroups.com.
To post to this group, send email to tesseract-ocr@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/CAG2NduVjKKD%2B%3DPGNQB249yrndmQH_fo4P%2BtxHfvCbO-2hnH5_g%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.