Hi Jason,

Thanks for that.

What I am trying to do is something like this:

005-67 is the code for "hardware materials and house electrics"
005-68 is the code for "plumbing materials and hydraulic seals"

And so on. So if I have a string that says "plumbing seals", then the most likely code for it is 005-68.

I have a CSV file with lines of the form code ; description.

I will take a look at that. I wonder if I have to save those into separate files, with the respective codes as filenames, so that I can run the trainer.

On Mon, Mar 21, 2011 at 9:28 PM, Jason Baldridge wrote:

> You should be able to use a RealBasicEventStream, with your events
> specified in text files as
>
> feature1=value1,feature2=value2, ... featureN=valueN,outcome
>
> Note, you can just give feature42 (without =value42) and it assumes the
> value is 1.
>
> You can then train a model by calling opennlp.maxent.ModelTrainer and
> include the -real option.
>
> Or, if you are doing it all with the API, you can create Events with
> this constructor:
>
> public Event(String outcome, String[] context, float[] values) {
>
> where values[i] is the tf-idf value for context[i]
>
> Hope this helps!
>
> -Jason
>
> On Fri, Mar 4, 2011 at 8:56 AM, Francesco Serra wrote:
>
> > Hello, I'm trying to make text classification with maximum entropy model.
> > I implemented some code that processes text files in input and calculates
> > TF and IDF terms.
> > I wanna ask if someone has idea how to use these terms to make
> > classification with maximum entropy.
> > Thanks to everyone.
>
> --
> Jason Baldridge
> Assistant Professor, Department of Linguistics
> The University of Texas at Austin
> http://www.jasonbaldridge.com
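
For reference, here is a rough sketch of how the code ; description CSV could be turned into event lines of the shape quoted above (features first, outcome last). Since each line carries its own outcome, a single events file is probably enough, with no need for one file per code. It is only a preprocessing sketch in Python, not part of OpenNLP: the function names, the bag-of-words features, and the comma separator are assumptions for illustration, and the exact separator the trainer expects should be checked against the OpenNLP documentation.

```python
import csv
import io


def description_to_event(code, description, sep=","):
    """Build one event line: one binary feature per distinct word, then the code.

    Features are written without "=value", which the quoted message says
    is read as a value of 1. The separator is an assumption taken from
    that message, so it is left configurable.
    """
    features = []
    for word in description.lower().split():
        # Keep only letters and digits so a token cannot contain the
        # separator or an "=" sign.
        token = "".join(ch for ch in word if ch.isalnum())
        if token and token not in features:
            features.append(token)
    return sep.join(features + [code])


def csv_to_events(csv_text, sep=","):
    """Convert 'code ; description' rows into a list of event lines."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=";")
    lines = []
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or malformed rows
        code, description = row[0].strip(), row[1].strip()
        lines.append(description_to_event(code, description, sep))
    return lines


# The two example rows from this message.
sample = (
    "005-67 ; hardware materials and house electrics\n"
    "005-68 ; plumbing materials and hydraulic seals\n"
)
events = csv_to_events(sample)
for line in events:
    print(line)
```

The printed lines could be saved to one text file and used as training events. A new string such as "plumbing seals" would be tokenized the same way before asking the trained model for its most likely code.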
