[tesseract-ocr] Re: Training Metrics

Des Bw Thu, 23 Nov 2023 01:34:45 -0800

I think they are abbreviations:
best char error = BCER
character error = CER
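
If you want to pull these numbers out of the training output, here is a rough sketch. It assumes that "char train" in the progress line is the CER on the training data and that "New best char error" is the BCER; the function name parse_metrics and the dictionary keys are made up for illustration and are not part of Tesseract. The log text is the one from the quoted message below, with the wrapped lines joined.

```python
import re

# Progress line copied from the quoted message, wrapped lines re-joined.
LOG = (
    "At iteration 8350/10000/10000, Mean rms=2.701%, delta=2.491%, "
    "char train=10.385%, word train=24.4%, skip ratio=0%,  "
    "New best char error = 10.385 wrote best "
    "model:data/Common_num/checkpoints/Common_num10.385_8350.checkpoint "
    "wrote checkpoint."
)

NUMBER = r"(\d+(?:\.\d+)?)"

# Hypothetical key names; the right-hand sides match the text in the log.
PATTERNS = {
    "cer": r"char train=" + NUMBER + r"%",        # assumed: training CER
    "wer": r"word train=" + NUMBER + r"%",        # assumed: training word error
    "bcer": r"New best char error = " + NUMBER,   # assumed: BCER
}


def parse_metrics(line):
    """Return the metrics found in one progress line as a dict."""
    metrics = {}
    match = re.search(r"At iteration (\d+)/", line)
    if match:
        metrics["iteration"] = int(match.group(1))
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, line)
        if match:
            metrics[name] = float(match.group(1))
    return metrics


if __name__ == "__main__":
    print(parse_metrics(LOG))
```

The "bcer" key only shows up on lines where a new best model was written, so it will be missing for most progress lines.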


There are no signs in that output that tell you whether the model is
overfitted, and I know of no diagnostics for that. For fine-tuning, running
more than 400 iterations is always problematic because it destroys the base
model.

- So, the common strategy is to increase your data and run just 300
iterations. The BCER is not that important in that case.
But for training from scratch or from a layer (network), you should try to
get the BCER (error rate) as low as possible. Overfitting happens when the
data set is too small and the iterations are too many. From my experience,
running 2-5 epochs seems to generate good results. But I have seen
experienced people training for hundreds, even thousands, of epochs.
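
To relate epochs to iterations, a small arithmetic sketch. It assumes that one iteration processes one training line, so one epoch equals as many iterations as there are lines in the training set; that is my understanding of lstmtraining, so please check it against your own setup. The function names and the data set sizes are hypothetical.

```python
def iterations_for_epochs(num_lines, epochs):
    """Iterations needed to pass over num_lines training lines `epochs` times.

    Assumes one iteration consumes one training line.
    """
    return num_lines * epochs


def epochs_for_iterations(num_lines, iterations):
    """Number of passes over the data that `iterations` amounts to."""
    return iterations / num_lines


if __name__ == "__main__":
    # Hypothetical data set of 2000 lines, trained for 5 epochs.
    print(iterations_for_epochs(2000, 5))
    # 300 fine-tuning iterations on a hypothetical 1000-line data set.
    print(epochs_for_iterations(1000, 300))
```

Under that assumption, 300 iterations on a data set of a thousand lines is well under one epoch, which is why the iteration count alone says little without the data set size.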


On Thursday, November 23, 2023 at 12:28:35 PM UTC+3 smon...@gmail.com wrote:

> Alright, 
>
> this might be a little bit of a dumb question, but where exactly can I see 
> the CER?
>
> 2 Percent improvement time=56, best error was 12.49 @ 8294
> At iteration 8350/10000/10000, Mean rms=2.701%, delta=2.491%, char 
> train=10.385%, word train=24.4%, skip ratio=0%,  New best char error = 
> 10.385 wrote best 
> model:data/Common_num/checkpoints/Common_num10.385_8350.checkpoint wrote 
> checkpoint.
>
> Is it the "best char error"? Where do I have to look to find CER? Is the 
> CER in the above example?
>
> Also, what are the signs that my model is overfitted? Is there any way to 
> recognize this in the above output?
>
>
