Steven D'Aprano <st...@pearwood.info> writes:
> Does anyone have any suggestions for how to do this? Preferably something
> already existing. I have some thoughts and/or questions:

I think I'd just look at the set of digraphs or trigraphs in each name
and see if there are a lot that aren't found in English.

> - I think nltk has a "language detection" function, would that be suitable?
> - If not nltk, are there any suitable language detection libraries?

I suspect these need longer strings to work.

> - Is this the sort of problem that neural networks are good at solving?
>   Anyone know a really good tutorial for neural networks in Python?
> - How about Bayesian filters, e.g. SpamBayes?

You want large training sets for these approaches.
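
Here is a minimal sketch of the trigraph idea, in case it helps. Everything in it is an assumption for illustration: the function names, the space padding at word boundaries, and above all the tiny ENGLISH_SAMPLE list, which stands in for a real reference list of English names or words. It scores a name by the fraction of its trigraphs that never occur in the reference list.

```python
def ngrams(text, n=3):
    """Return the set of letter n-grams in text, lowercased.

    The text is padded with one space on each side so that
    word-initial and word-final sequences count as n-grams too.
    """
    padded = " " + text.lower() + " "
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def build_known(words, n=3):
    """Collect every n-gram seen in a reference word list."""
    known = set()
    for word in words:
        known |= ngrams(word, n)
    return known


def unseen_fraction(name, known, n=3):
    """Fraction of the name's n-grams that are absent from `known`."""
    grams = ngrams(name, n)
    if not grams:
        return 0.0
    return len(grams - known) / len(grams)


# Hypothetical stand-in data: a real run would need a much larger list.
ENGLISH_SAMPLE = [
    "steven", "stephen", "john", "johnson", "smith",
    "peter", "anderson", "thomas", "richard", "michael",
]
KNOWN = build_known(ENGLISH_SAMPLE)

for candidate in ["smith", "johnson", "xqzvk"]:
    print(candidate, unseen_fraction(candidate, KNOWN))
```

A name would be flagged when its score goes over some threshold, which you would have to tune against real data. With a reference list this small, plenty of ordinary English names will score high, so treat the numbers as a demonstration of the mechanics only.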