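No.  The tokenizer and the LM are separate tools, so KenLM will not split
the text into characters for you.  You can of course replace each space with
a placeholder token such as <spc> and put a space between the characters
before training.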
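
A minimal sketch of that preprocessing step, in case it helps.  It writes a
converted copy, so the original input file stays untouched.  The function
names and the <spc> placeholder are only assumptions for illustration; any
token that never occurs in the corpus would do.

```python
# Hypothetical placeholder; pick any string that never occurs in the corpus.
SPACE_TOKEN = "<spc>"


def to_char_tokens(line, space_token=SPACE_TOKEN):
    """Return `line` with every character as its own space-separated token.

    Whitespace characters inside the line are replaced by `space_token`
    so that word boundaries survive the conversion.
    """
    text = line.rstrip("\r\n")
    return " ".join(space_token if ch.isspace() else ch for ch in text)


def convert_file(src_path, dst_path):
    """Write a character-tokenized copy of src_path to dst_path."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(to_char_tokens(line) + "\n")
```

The converted copy can then be given to lmplz as usual, for example
`lmplz -o 5 < corpus.char > corpus.char.arpa`, where the file names and the
order are placeholders.  Each "word" KenLM sees is then a single character,
so the resulting ARPA file holds character n-grams.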

On November 9, 2016 6:04:07 AM GMT+00:00, Nat Gillin <nat.gil...@gmail.com> 
wrote:
>Dear Moses community,
>
>Other than manually replacing each space with an unused character and
>putting a space between all characters before training a language model
>with KenLM, is it possible for KenLM to generate character n-grams and
>output them in ARPA format without altering the input file?
>
>Regards,
>Nat
>
>
>------------------------------------------------------------------------
>
>_______________________________________________
>Moses-support mailing list
>Moses-support@mit.edu
>http://mailman.mit.edu/mailman/listinfo/moses-support
