I sent this a few days ago, but I got no answer :-((  :

To train a tokenizer I can use a dictionary, but
where is the dictionary that was used to train the current English model? And
where can I find information about the dictionary format, so that I can at least generate my own?

Thanks,
Joan Codina
