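No. The tokenizer and the LM are separate tools: KenLM treats each whitespace-separated token as a word and will not split text into characters for you. You can of course preprocess the text yourself, replacing each space with a placeholder token like <spc> and putting a space between the characters.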
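A minimal sketch of that preprocessing step, in case it is useful. It writes a character-tokenized copy, so the original input file stays untouched. The <spc> token, the function names, and the file names are only illustrative assumptions, not anything KenLM provides.

```python
# Sketch only: SPACE_TOKEN, to_char_tokens and convert are hypothetical names.
SPACE_TOKEN = "<spc>"  # assumed placeholder; pick any symbol absent from the corpus


def to_char_tokens(line, space_token=SPACE_TOKEN):
    """Return the line as space-separated characters, marking original whitespace."""
    line = line.rstrip("\n")
    # Every whitespace character becomes the placeholder so it survives
    # whitespace tokenization; every other character becomes its own token.
    return " ".join(space_token if ch.isspace() else ch for ch in line)


def convert(src_path, dst_path):
    """Write a character-tokenized copy of src_path to dst_path."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(to_char_tokens(line) + "\n")
```

The converted copy can then be given to KenLM as usual, for example `lmplz -o 5 < chars.txt > chars.arpa` (file names made up here).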
On November 9, 2016 6:04:07 AM GMT+00:00, Nat Gillin <nat.gil...@gmail.com> wrote:
>Dear Moses community,
>
>Other than manually replacing space with an unused character and adding
>spaces to each character before training a language model with KenLM.
>Is it possible for KenLM to generate character ngrams and output in arpa
>format without altering the input file?
>
>Regards,
>Nat
>
>_______________________________________________
>Moses-support mailing list
>Moses-support@mit.edu
>http://mailman.mit.edu/mailman/listinfo/moses-support