Editing Dictionary Vector Generated

2013-10-04 Thread Puneet Arora
Hello All, I am currently working on sentiment analysis of social media, for which I am using Mahout to generate vectors from bigrams, but when classifying them into categories, some unigrams that I don't want are also coming through. For example, I classified anti English as negative now in
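The behaviour described above is what one would expect if the vectorizer emits every n-gram up to a maximum size instead of bigrams alone. The sketch below is plain Python, not Mahout code; the function names and the `keep_unigrams` flag are hypothetical and only illustrate how unigrams end up in the dictionary next to bigrams, and how they could be filtered out.

```python
def ngrams(tokens, n):
    """Return all contiguous n-grams of the token list, joined by a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def features(text, max_n=2, keep_unigrams=True):
    """Emit n-grams of size 1..max_n; optionally drop the unigrams.

    Hypothetical helper for illustration only, not a Mahout API.
    """
    tokens = text.lower().split()
    feats = []
    for n in range(1, max_n + 1):
        if n == 1 and not keep_unigrams:
            continue  # skip single words, keep only the longer n-grams
        feats.extend(ngrams(tokens, n))
    return feats


with_unigrams = features("anti English protest")
bigrams_only = features("anti English protest", keep_unigrams=False)
```

With `keep_unigrams=True` the word english becomes a feature of its own as well as part of anti english, which matches the situation described in the message.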

Re: Editing Dictionary Vector Generated

2013-10-04 Thread Ted Dunning
Why do you say that this is unacceptable? If the phrase is the most common way that the word English is used, this isn't such a bad thing. In general, with machine learning, the idea is to let the data speak. If the data say something you don't like, you have to be careful about
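The point about letting the data speak can be made concrete with a toy calculation. The sketch below assumes a plain multinomial naive Bayes with add-one smoothing, which is not necessarily what Mahout computes; the corpus, labels, and function name are made up for illustration.

```python
import math
from collections import Counter


def log_odds(docs_by_label, target, other, alpha=1.0):
    """Per-term log ratio of smoothed P(term | target) to P(term | other)."""
    counts = {
        label: Counter(t for doc in docs for t in doc.split())
        for label, docs in docs_by_label.items()
    }
    vocab = set().union(*counts.values())
    totals = {label: sum(c.values()) for label, c in counts.items()}

    def prob(term, label):
        # Add-alpha (Laplace) smoothing over the shared vocabulary.
        return (counts[label][term] + alpha) / (totals[label] + alpha * len(vocab))

    return {t: math.log(prob(t, target) / prob(t, other)) for t in vocab}


# Toy, invented corpus: "english" appears mostly inside "anti english".
toy_corpus = {
    "negative": ["anti english", "anti english slogans", "anti english"],
    "positive": ["great match", "english breakfast great"],
}
weights = log_odds(toy_corpus, "negative", "positive")
```

In this toy corpus english leans negative simply because most of its occurrences are in negative documents, so the weight reflects how the word is actually used in the training data.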

Re: Editing Dictionary Vector Generated

2013-10-04 Thread Puneet Arora
Thank you, Sir, for your reply. Yes, you guessed correctly that I am using naive Bayes, but how can I handle this type of problem without switching to another algorithm? With Regards On Fri, Oct 4, 2013 at 4:21 PM, Ted Dunning ted.dunn...@gmail.com wrote: Why do you say that this is

Re: Editing Dictionary Vector Generated

2013-10-04 Thread Ted Dunning
On Fri, Oct 4, 2013 at 6:13 AM, Puneet Arora arorapuneet2...@gmail.com wrote: yes you guessed correct that I am using naive bayes, but how can I handle this type of problem. I didn't hear about a problem. You said you didn't like weights on words like English to reflect the fact that they