Re: Catogorising strings into random versus non-random

Paul Rubin Mon, 21 Dec 2015 09:27:07 -0800

Steven D'Aprano <st...@pearwood.info> writes:
> Does anyone have any suggestions for how to do this? Preferably something
> already existing. I have some thoughts and/or questions:


I think I'd just look at the set of digraphs or trigraphs in each name
and see if there are a lot that aren't found in English.
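
A minimal sketch of that idea, assuming a reference word list is available
to supply the "known English" digraphs. The names ngrams, build_reference,
unseen_fraction and SAMPLE_WORDS are hypothetical, and the tiny word list is
only for illustration; a real run would want a full dictionary, and the
cut-off for "a lot" would have to be tuned by hand.

```python
def ngrams(text, n):
    """Return the set of character n-grams in text (lowercased, letters only)."""
    letters = "".join(ch for ch in text.lower() if ch.isalpha())
    return {letters[i:i + n] for i in range(len(letters) - n + 1)}


def build_reference(words, n=2):
    """Collect every n-gram that occurs in the reference word list."""
    known = set()
    for word in words:
        known |= ngrams(word, n)
    return known


def unseen_fraction(name, known, n=2):
    """Fraction of the name's distinct n-grams that are absent from `known`."""
    grams = ngrams(name, n)
    if not grams:
        # Too short (or no letters at all): nothing to judge.
        return 0.0
    return len(grams - known) / len(grams)


# Assumption: a stand-in for a real English word list.
SAMPLE_WORDS = [
    "categorising", "strings", "random", "language", "detection",
    "python", "network", "filter", "training",
]

known = build_reference(SAMPLE_WORDS, n=2)

for name in ["string", "xqzjvk"]:
    print(name, unseen_fraction(name, known, n=2))
```

A name whose unseen fraction is high would be flagged as probably random.
Passing n=3 to both functions gives the trigraph variant, which needs a
larger reference list to avoid flagging ordinary words.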

> - I think nltk has a "language detection" function, would that be suitable?
> - If not nltk, are there any suitable language detection libraries?

I suspect these need longer strings to work.

> - Is this the sort of problem that neural networks are good at solving?
> Anyone know a really good tutorial for neural networks in Python?
> - How about Bayesian filters, e.g. SpamBayes?

You want large training sets for these approaches.
