Bradford,

If I may:
Have a look at
http://www.sematext.com/products/language-identifier/index.html

And/or
http://www.sematext.com/products/multilingual-indexer/index.html

Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR

----- Original Message ----
> From: Bradford Stephens <bradfordsteph...@gmail.com>
> To: solr-u...@lucene.apache.org; java-user@lucene.apache.org
> Sent: Thursday, August 6, 2009 3:46:21 PM
> Subject: Language Detection for Analysis?
>
> Hey there,
>
> We're trying to add foreign language support into our new search
> engine -- languages like Arabic, Farsi, and Urdu (that don't work with
> standard analyzers). But our data source doesn't tell us which
> languages we're actually collecting -- we just get blocks of text. Has
> anyone here worked on language detection so we can figure out what
> analyzers to use? Are there commercial solutions?
>
> Much appreciated!
>
> --
> http://www.roadtofailure.com -- The Fringes of Scalability, Social
> Media, and Computer Science
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org