date:20131010

RE: Naive bayes and character n-grams

2013-10-10 Thread simon.2.thompson

Hey Dean, what do you mean by character n-grams? If you mean things like ab or ui2 then given that there are so few characters compared to words is there a problem that can't be solved without a look-up table for ny (where y 4ish ) Or are you looking at y 4 ish because if so then do you run

Re: Naive bayes and character n-grams

2013-10-10 Thread Dean Jones

Hi Suneel, On 9 October 2013 14:27, Suneel Marthi suneel_mar...@yahoo.com wrote: an example of a Naive-Bayes classifier trained on character n-grams is the LangDetect library. (see http://code.google.com/p/language-detection/) Agree with Ted that it should be relatively easy to build one.
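For readers unfamiliar with the idea discussed in this thread, here is a minimal sketch of a Naive Bayes classifier over character n-grams. It is not taken from LangDetect or Mahout; the names `char_ngrams` and `CharNgramNB`, the trigram default, and the add-one smoothing are all assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=3):
    """Return overlapping character n-grams, padding the text with one space each side."""
    padded = " " + text.lower() + " "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class CharNgramNB:
    """Hypothetical multinomial Naive Bayes over character n-grams (illustrative only)."""

    def __init__(self, n=3, alpha=1.0):
        self.n = n                          # n-gram length (assumed default: trigrams)
        self.alpha = alpha                  # additive (Laplace) smoothing constant
        self.counts = defaultdict(Counter)  # label -> n-gram counts
        self.doc_counts = Counter()         # label -> number of training texts
        self.vocab = set()                  # all n-grams seen in training

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            grams = char_ngrams(text, self.n)
            self.counts[label].update(grams)
            self.doc_counts[label] += 1
            self.vocab.update(grams)
        return self

    def predict(self, text):
        grams = char_ngrams(text, self.n)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, gram_counts in self.counts.items():
            total = sum(gram_counts.values())
            # log prior plus the sum of smoothed log likelihoods
            score = math.log(self.doc_counts[label] / total_docs)
            for g in grams:
                score += math.log(
                    (gram_counts[g] + self.alpha) / (total + self.alpha * vocab_size)
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

Usage would look like `CharNgramNB().fit(texts, labels).predict("some text")`, where `texts` and `labels` are parallel lists. A real language detector needs far more training data and care than this toy, which is consistent with the advice elsewhere in the thread to prefer a standard package.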

Re: Naive bayes and character n-grams

2013-10-10 Thread Dean Jones

Hi Si, On 10 October 2013 07:59, simon.2.thomp...@bt.com wrote: what do you mean by character n-grams? If you mean things like ab or ui2 then given that there are so few characters compared to words is there a problem that can't be solved without a look-up table for ny (where y 4ish ) Or are

Re: Naive bayes and character n-grams

2013-10-10 Thread Ted Dunning

For language detection, you are going to have a hard time doing better than one of the standard packages for the purpose. See here: http://blog.mikemccandless.com/2011/10/accuracy-and-performance-of-googles.html On Thu, Oct 10, 2013 at 1:01 AM, Dean Jones dean.m.jo...@gmail.com wrote: Hi Si,

Re: Naive bayes and character n-grams

2013-10-10 Thread Dean Jones

On 10 October 2013 12:46, Ted Dunning ted.dunn...@gmail.com wrote: For language detection, you are going to have a hard time doing better than one of the standard packages for the purpose. See here: http://blog.mikemccandless.com/2011/10/accuracy-and-performance-of-googles.html Thanks for

Re: Naive bayes and character n-grams

2013-10-10 Thread Ted Dunning

Cool. Sounds like you are ahead of the game. Sent from my iPhone On Oct 10, 2013, at 13:15, Dean Jones dean.m.jo...@gmail.com wrote: On 10 October 2013 12:46, Ted Dunning ted.dunn...@gmail.com wrote: For language detection, you are going to have a hard time doing better than one of the

Re: Naive bayes and character n-grams

2013-10-10 Thread Suneel Marthi

Dean, just a thought: you should be able to create new language models (with LangDetect) if there's Wikipedia content for the specific language. I had to do it in the past for Pashto and Malaysian. On Thursday, October 10, 2013 8:16 AM, Dean Jones dean.m.jo...@gmail.com wrote: On 10

Re: Solr-recommender

2013-10-10 Thread Pat Ferrel

The issue of offline tests is often misunderstood, I suspect. While I agree with Ted, it might do to explain a bit. For myself, I'd say offline testing is a requirement, but not for comparing two disparate recommenders. Companies like Amazon and Netflix, as well as others on record, have a
