End of line whitespaces in Eclipse
When I save a .java file in Eclipse, it removes the whitespace at the end of each line. I am using the formatter from http://opennlp.apache.org/code-formatter/OpenNLP-Eclipse-Formatter.xml. This causes a lot of changes in files where I only needed to change one line. Does anybody know how I can avoid it?

Thank you,
William
Re: Doccat evaluator
Yes, I just finished implementing the confusion matrix report, just like the one I did for the POS Tagger. I will commit it today.

I could not test it properly with the Leipzig corpus. For some reason Doccat never fails with this corpus! To test it effectively I used the 20news corpus.

2014-04-10 19:37 GMT-03:00 Jörn Kottmann:

> I thought it should be done similar to the way pos tags are measured when
> I implemented that.
>
> A confusion matrix might also be helpful to see which categories are more
> difficult to classify for the system.
>
> Jörn
>
> On 04/10/2014 03:00 PM, William Colen wrote:
>
>> Actually, since we always add a tag to each document, accuracy makes
>> sense. We could implement F-1 for the individual categories.
>>
>> 2014-04-09 17:23 GMT-03:00 William Colen:
>>
>>> Hello,
>>>
>>> I was checking if there is any open issue related to Doccat, and I found
>>> this one -
>>>
>>> OPENNLP-81: Add a cli tool for the doccat evaluation support
>>>
>>> I noticed that there is already a class named
>>> DocumentCategorizerEvaluator, which is not used anywhere internally.
>>> This is evaluating performance in terms of accuracy, but I believe it
>>> would be better to do it in terms of F-Measure.
>>>
>>> Any thoughts?
>>>
>>> As we are working in a major version, I think it would be OK to change
>>> it.
>>>
>>> Thank you,
>>> William
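To illustrate the kind of confusion matrix report discussed in this message, here is a minimal sketch in Java. It is not the code that was committed to OpenNLP: the class `ConfusionMatrixSketch` and its methods are hypothetical names chosen for this example, and it assumes each document has exactly one reference category and one predicted category.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper (not an OpenNLP class): counts (reference, predicted) category pairs. */
final class ConfusionMatrixSketch {

    // reference category -> (predicted category -> count); TreeMap keeps the report sorted
    private final Map<String, Map<String, Integer>> counts = new TreeMap<>();

    /** Records one evaluated document. */
    void add(String reference, String predicted) {
        counts.computeIfAbsent(reference, k -> new TreeMap<>())
              .merge(predicted, 1, Integer::sum);
    }

    /** Returns how often documents of category {@code reference} were tagged as {@code predicted}. */
    int count(String reference, String predicted) {
        Map<String, Integer> row = counts.getOrDefault(reference, Collections.emptyMap());
        return row.getOrDefault(predicted, 0);
    }

    /** One line per non-empty cell, in the form "reference -> predicted: count". */
    String report() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Map<String, Integer>> row : counts.entrySet()) {
            for (Map.Entry<String, Integer> cell : row.getValue().entrySet()) {
                sb.append(row.getKey()).append(" -> ").append(cell.getKey())
                  .append(": ").append(cell.getValue()).append('\n');
            }
        }
        return sb.toString();
    }
}
```

Cells where the reference and predicted categories differ are the errors. A corpus on which the categorizer never fails would produce only diagonal cells, which is why it is of little use for testing such a report.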
Re: Doccat evaluator
I thought it should be done similar to the way pos tags are measured when I implemented that.

A confusion matrix might also be helpful to see which categories are more difficult to classify for the system.

Jörn

On 04/10/2014 03:00 PM, William Colen wrote:

> Actually, since we always add a tag to each document, accuracy makes
> sense. We could implement F-1 for the individual categories.
>
> 2014-04-09 17:23 GMT-03:00 William Colen:
>
>> Hello,
>>
>> I was checking if there is any open issue related to Doccat, and I found
>> this one -
>>
>> OPENNLP-81: Add a cli tool for the doccat evaluation support
>>
>> I noticed that there is already a class named
>> DocumentCategorizerEvaluator, which is not used anywhere internally.
>> This is evaluating performance in terms of accuracy, but I believe it
>> would be better to do it in terms of F-Measure.
>>
>> Any thoughts?
>>
>> As we are working in a major version, I think it would be OK to change
>> it.
>>
>> Thank you,
>> William
Re: Doccat evaluator
Actually, since we always add a tag to each document, accuracy makes sense. We could implement F-1 for the individual categories.

2014-04-09 17:23 GMT-03:00 William Colen:

> Hello,
>
> I was checking if there is any open issue related to Doccat, and I found
> this one -
>
> OPENNLP-81: Add a cli tool for the doccat evaluation support
>
> I noticed that there is already a class named
> DocumentCategorizerEvaluator, which is not used anywhere internally.
> This is evaluating performance in terms of accuracy, but I believe it would
> be better to do it in terms of F-Measure.
>
> Any thoughts?
>
> As we are working in a major version, I think it would be OK to change it.
>
> Thank you,
> William
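The difference between the two measures discussed in this thread can be sketched as follows. This is only an illustration, not the `DocumentCategorizerEvaluator` implementation: the class `DoccatMetricsSketch` and its methods are hypothetical, and the sketch assumes two parallel lists of equal length holding one reference and one predicted category per document.

```java
import java.util.List;

/** Hypothetical helper (not an OpenNLP class): overall accuracy and per-category F1. */
final class DoccatMetricsSketch {

    /** Fraction of documents whose predicted category equals the reference category. */
    static double accuracy(List<String> reference, List<String> predicted) {
        if (reference.isEmpty()) {
            return 0.0;
        }
        int correct = 0;
        for (int i = 0; i < reference.size(); i++) {
            if (reference.get(i).equals(predicted.get(i))) {
                correct++;
            }
        }
        return (double) correct / reference.size();
    }

    /** F1 for a single category: 2*TP / (2*TP + FP + FN); 0.0 if the category never occurs. */
    static double f1(String category, List<String> reference, List<String> predicted) {
        int tp = 0, fp = 0, fn = 0;
        for (int i = 0; i < reference.size(); i++) {
            boolean isReference = category.equals(reference.get(i));
            boolean isPredicted = category.equals(predicted.get(i));
            if (isReference && isPredicted) {
                tp++;   // correctly tagged with this category
            } else if (isPredicted) {
                fp++;   // tagged with this category, but it belongs elsewhere
            } else if (isReference) {
                fn++;   // belongs to this category, but tagged otherwise
            }
        }
        int denominator = 2 * tp + fp + fn;
        return denominator == 0 ? 0.0 : 2.0 * tp / denominator;
    }
}
```

Because every document receives exactly one category, a single accuracy figure summarizes the whole run, while calling `f1` once per category shows which categories are harder to classify.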