Hi Jason,

Thanks for the advice. I will take a look at the code.
--Thanks and Regards
Vaijanath N. Rao
________________________________________
From: Jason Baldridge [[email protected]]
Sent: Thursday, April 21, 2011 7:01 PM
To: Rao, Vaijanath
Cc: [email protected]
Subject: Re: Merging different models

What I'm saying is that what you are trying to do with merging models isn't
even coherent, so AFAIK it doesn't even have a chance of working. You might
try a label propagation approach -- you can see some software here:
http://code.google.com/p/junto/

On Thu, Apr 21, 2011 at 7:29 AM, Rao, Vaijanath <[email protected]> wrote:

> Hi Jason,
>
> Thanks for the reply.
>
> I have already tried out naive Bayes and was wondering if and how to use
> maxent in this scenario.
>
> If you can guide me in getting the merging part correct, it will be of
> great help. I am currently trying to use random projection to project
> documents into a smaller dimension and then use them for classification.
>
> --Thanks and Regards
> Vaijanath N. Rao
> ________________________________________
> From: Jason Baldridge [[email protected]]
> Sent: Thursday, April 21, 2011 5:35 PM
> To: [email protected]
> Subject: Re: Merging different models
>
> I've been very busy, so I haven't been able to respond to this in detail
> yet. But, briefly, based on a quick read, what you describe here shouldn't
> work at all. You could train different models and combine them as an
> ensemble (majority vote, average, product). You'll need to make sure that
> the label vectors are comparable for each model, as they will vary from
> dataset to dataset with so many labels.
>
> I'd also recommend trying out a simple naive Bayes classifier here, at
> least as a first pass.
>
> On Wed, Apr 20, 2011 at 7:35 AM, Rao, Vaijanath <[email protected]> wrote:
>
> > Hi All,
> >
> > I am trying to use maxent for the Large Scale Hierarchical challenge
> > ( http://lshtc.iit.demokritos.gr:10000/ ).
> >
> > However, I could not get maxent to work on a large number of
> > classes/categories (the dmoz test data has something like 28K classes
> > and 580K+ features). So I decided to split the training data and merge
> > the models after every few iterations. The split is decided by the
> > category/class, so that all the instances belonging to one class reside
> > in one split.
> >
> > Every few iterations, the model generated by each of these splits is
> > merged (I merge all of the model data structures) and the estimated
> > parameters are averaged.
> >
> > But even after something like 1000 iterations I don't see accuracy going
> > beyond 70%, as after every merge there is a dip in overall accuracy. So
> > I was wondering if there is a better way to merge.
> >
> > Can someone guide me in getting the split / incremental training right,
> > or should I try the perceptron model?
> >
> > --Thanks and Regards
> > Vaijanath N. Rao
> >
>
>
> --
> Jason Baldridge
> Assistant Professor, Department of Linguistics
> The University of Texas at Austin
> http://www.jasonbaldridge.com
> http://twitter.com/jasonbaldridge

--
Jason Baldridge
Assistant Professor, Department of Linguistics
The University of Texas at Austin
http://www.jasonbaldridge.com
http://twitter.com/jasonbaldridge
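For reference, the ensemble combination Jason suggests (majority vote, average, or product of the per-model distributions) can be sketched as below. This is a minimal illustration, not OpenNLP maxent API code: it assumes each model's output has already been converted to a probability list indexed by a shared label ordering, which is exactly the alignment Jason warns about.

```python
from collections import Counter
import math

def combine_predictions(prob_rows, method="average"):
    """Combine per-model probability distributions over a SHARED label set.

    prob_rows: one list of probabilities per model, all indexed by the same
    label ordering (the label vectors must be comparable across models).
    """
    n_models = len(prob_rows)
    n_labels = len(prob_rows[0])
    if method == "average":
        # Mean of the per-label probabilities across models.
        return [sum(row[j] for row in prob_rows) / n_models
                for j in range(n_labels)]
    elif method == "product":
        # Product of experts: multiply per-label probabilities, renormalize.
        raw = [math.prod(row[j] for row in prob_rows)
               for j in range(n_labels)]
        total = sum(raw)
        return [p / total for p in raw]
    elif method == "vote":
        # Majority vote: each model votes for its highest-probability label.
        votes = Counter(max(range(n_labels), key=row.__getitem__)
                        for row in prob_rows)
        return [votes[j] / n_models for j in range(n_labels)]
    raise ValueError(method)

# Toy example: three models scoring the same instance over three labels.
models = [[0.6, 0.3, 0.1],
          [0.5, 0.4, 0.1],
          [0.2, 0.7, 0.1]]
print(combine_predictions(models, "average"))
print(combine_predictions(models, "vote"))   # two of three vote for label 0
```

The product rule tends to be sharper than the average (a label must score well under every model), while voting discards the probability mass entirely; which works best is an empirical question.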
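The periodic parameter averaging Vaijanath describes can be sketched as follows. The dict-of-weights layout here is a hypothetical stand-in for "all of the model data structures", not the actual maxent internals; it does, however, make one plausible source of the per-merge accuracy dip visible: because the splits are partitioned by class, a weight learned on only one split is diluted toward zero at every merge.

```python
def average_models(models):
    """Average parameter values from per-split models, key by key.

    Each model is assumed (hypothetically) to be a dict mapping a
    (feature, label) pair to a real-valued weight. A key missing from a
    model is treated as weight 0.0, so a weight that only one split ever
    estimated shrinks by a factor of len(models) after each merge.
    """
    keys = set().union(*models)
    n = len(models)
    return {k: sum(m.get(k, 0.0) for m in models) / n for k in keys}

# Toy example: two per-split models sharing one parameter.
split_a = {("word:ball", "Sports"): 1.5, ("word:goal", "Sports"): 2.0}
split_b = {("word:ball", "Sports"): 0.5, ("word:vote", "News"): 4.0}
merged = average_models([split_a, split_b])
# Shared key: (1.5 + 0.5) / 2 = 1.0; unshared keys are halved.
```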
