Re: Train Lucene with topic-defined files

Koji Sekiguchi Sun, 22 Jun 2014 07:00:03 -0700

Hi benglish,

You are almost there. Since it seems you already have an index,
all you need to do to train is call the classifier's train() method.

> for(int i = 0; i < NumberOfTraningFiles; i++)
> {
>       classifier.train(ar, bodyTextOfTheFile, categoryOfTheFile, new
> JapaneseAnalyzer(Version.LUCENE_46));
> }

But you should call train() once, outside the loop, not once per file.
And also, you need to use an appropriate Analyzer for your text field,
e.g. StandardAnalyzer for English.
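
To illustrate the "index inside the loop, train once outside it" pattern,
here is a minimal, self-contained sketch. It is only a toy: Doc and
ToyNaiveBayes are hypothetical stand-ins written for this example, not
Lucene classes, and the sample texts are made up. In Lucene itself the loop
body would add each file to the index (for example with
IndexWriter.addDocument), and, as far as I know, the second and third
arguments of train() in 4.6 are the names of the text and category fields
rather than the file's body and category values.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class TrainOnceSketch {

    /** Hypothetical stand-in for an indexed document with two fields. */
    static final class Doc {
        final String text;
        final String category;
        Doc(String text, String category) { this.text = text; this.category = category; }
    }

    /** Toy multinomial naive Bayes; NOT Lucene's SimpleNaiveBayesClassifier. */
    static final class ToyNaiveBayes {
        private final Map<String, Integer> docsPerClass = new HashMap<>();
        private final Map<String, Map<String, Integer>> termCounts = new HashMap<>();
        private final Map<String, Integer> termsPerClass = new HashMap<>();
        private final Set<String> vocabulary = new HashSet<>();
        private int totalDocs;

        /** Reads the whole collection in one pass, which is why one call is enough. */
        void train(List<Doc> index) {
            docsPerClass.clear(); termCounts.clear(); termsPerClass.clear(); vocabulary.clear();
            totalDocs = index.size();
            for (Doc d : index) {
                docsPerClass.merge(d.category, 1, Integer::sum);
                Map<String, Integer> counts =
                        termCounts.computeIfAbsent(d.category, k -> new HashMap<>());
                for (String term : tokenize(d.text)) {
                    if (term.isEmpty()) continue;
                    counts.merge(term, 1, Integer::sum);
                    termsPerClass.merge(d.category, 1, Integer::sum);
                    vocabulary.add(term);
                }
            }
        }

        /** Returns the category with the highest log-probability (add-one smoothing). */
        String assignClass(String text) {
            String best = null;
            double bestScore = Double.NEGATIVE_INFINITY;
            for (String c : docsPerClass.keySet()) {
                double score = Math.log(docsPerClass.get(c) / (double) totalDocs);
                Map<String, Integer> counts = termCounts.get(c);
                int total = termsPerClass.getOrDefault(c, 0);
                for (String term : tokenize(text)) {
                    if (term.isEmpty()) continue;
                    int n = counts.getOrDefault(term, 0);
                    score += Math.log((n + 1.0) / (total + vocabulary.size()));
                }
                if (score > bestScore) { bestScore = score; best = c; }
            }
            return best;
        }

        /** Crude tokenizer for the sketch; a real setup would use an Analyzer. */
        private static String[] tokenize(String text) {
            return text.toLowerCase().split("\\W+");
        }
    }

    static String demo() {
        // Made-up training files: {body text, category}.
        String[][] files = {
            {"the striker scored a late goal", "sports"},
            {"the team won the match", "sports"},
            {"parliament passed the budget vote", "politics"},
            {"the minister lost the election", "politics"},
        };

        // Inside the loop: only add each file to the index.
        List<Doc> index = new ArrayList<>();
        for (String[] f : files) {
            index.add(new Doc(f[0], f[1]));
        }

        // Outside the loop: train once over the whole index.
        ToyNaiveBayes classifier = new ToyNaiveBayes();
        classifier.train(index);
        return classifier.assignClass("a goal in the final match");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because train() walks every document in the index, calling it again for
each file would only repeat the same work and would not add any training
data.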

koji
--
http://soleami.com/blog/comparing-document-classification-functions-of-lucene-and-mahout.html

(2014/06/22 16:27), benglish wrote:

Dear Koji,

Firstly, thank you so much.

I have a number of files and their categories. Each file can have just 2
attributes: category and text. Unfortunately, I could not understand how you
inserted your training data into the SimpleNaiveBayes classifier. In other
words, I did not get the .xml file. I guess it is something related to Solr,
but I have no experience with that. Would you mind helping me by explaining
how to insert my files into the training part of the
classifier. Is it possible to do something like this:

for(int i = 0; i < NumberOfTraningFiles; i++)
{
      classifier.train(ar, bodyTextOfTheFile, categoryOfTheFile, new
JapaneseAnalyzer(Version.LUCENE_46));
}



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Train-Lucene-with-topic-defined-files-tp4141979p4143296.html
Sent from the Lucene - General mailing list archive at Nabble.com.
