Hi Jason,

Thanks for that.

What I am trying to do is something like:

005-67 is the code for "hardware materials and house electrics"

005-68 is the code for "plumbing materials and hydraulic seals"

And so on. So if I have a string that says "plumbing seals", then the most
likely code for that is 005-68.

I have a CSV file in the format code ; description.

I will take a look at that.

I wonder if I have to save those into separate files with the respective
codes as filenames so that I can run the trainer.
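
As a rough sketch of an alternative to separate files: each CSV row could be
turned into one event line, with the words of the description as features and
the code as the outcome. This assumes the comma-separated layout Jason
describes below (features first, outcome last), which should be checked
against the OpenNLP version in use. The class name CsvToEvents and the method
toEventLine are hypothetical names for illustration, not part of OpenNLP.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of OpenNLP: converts "code ; description"
// rows into event lines of the form "word1,word2,...,code".
class CsvToEvents {

    // Returns the event line for one CSV row, or null if the row has no ';'
    // separator, an empty code, or no usable words in the description.
    static String toEventLine(String csvRow) {
        int sep = csvRow.indexOf(';');
        if (sep < 0) {
            return null;
        }
        String code = csvRow.substring(0, sep).trim();
        String description =
                csvRow.substring(sep + 1).trim().toLowerCase(Locale.ROOT);
        if (code.isEmpty()) {
            return null;
        }

        List<String> features = new ArrayList<>();
        // Split on anything that is not a letter or digit, so commas and
        // equals signs can never end up inside a feature name.
        for (String word : description.split("[^\\p{L}\\p{N}]+")) {
            if (!word.isEmpty()) {
                features.add(word);
            }
        }
        if (features.isEmpty()) {
            return null;
        }

        // Assumed layout: features first, outcome (the code) last.
        features.add(code);
        return String.join(",", features);
    }
}
```

Writing the result of toEventLine for every row into a single text file would
give one training file for all codes, so the codes would not need to be
encoded as filenames.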

On Mon, Mar 21, 2011 at 9:28 PM, Jason Baldridge
<[email protected]> wrote:

> You should be able to use a RealBasicEventStream, with your events specified
> in text files as
>
> feature1=value1,feature2=value2, ... featureN=valueN,outcome
>
> Note, you can just give feature42 (without =value42) and it assumes the
> value is 1.
>
> You can then train a model by calling opennlp.maxent.ModelTrainer and
> include the -real option.
>
> Or, if you are doing it all with the API, you can create Events with
> this constructor:
>
> public Event(String outcome, String[] context, float[] values) {
>
> where values[i] is the tf-idf value for context[i]
>
> Hope this helps!
>
> -Jason
>
> On Fri, Mar 4, 2011 at 8:56 AM, Francesco Serra <[email protected]> wrote:
>
> >
> > Hello, I'm trying to make text classification with maximum entropy model.
> > I implemented some code that processes text files in input and calculates
> > TF and IDF terms..
> > I wanna ask if someone has idea how to use these terms to make
> > classification with maximum entropy..
> > Thanks to everyone.
> >
>
>
>
>
> --
> Jason Baldridge
> Assistant Professor, Department of Linguistics
> The University of Texas at Austin
> http://www.jasonbaldridge.com
>
