On Feb 9, 2010, at 12:43 PM, Robin Anil wrote:

> Oops. The ARFF Driver writes only vectors, not the tab-separated format the
> Bayes Classifier reads. I will try to add that as a flag.
>
> @Grant: For batch classification, yes, we can go with vectors, but I don't see
> how we can classify documents on the fly if the dictionary can't fit in
> memory. Maybe randomizers can help. We will have to wait for that.

I know Lucene is slower, but I still think that is the way to go. We can discuss that over on dev.
