On Sun, Sep 11, 2011 at 9:16 PM, Sebastian Schelter <[email protected]> wrote:
> On 11.09.2011 17:41, Grant Ingersoll wrote:
> >
> > On Sep 11, 2011, at 10:35 AM, Sebastian Schelter wrote:
> >
> >> On 11.09.2011 16:19, Grant Ingersoll wrote:
> >>
> >>> For instance, how do the labels get associated with the training
> >>> examples? I see the --labels option, but it isn't clear how it relates
> >>> to the training data.
> >>
> >> The training data must already be labeled; it consists of
> >> <Text,VectorWritable> tuples that represent labeled vectors. The
> >> --labels option specifies which labels (and thereby which parts of the
> >> training data) to use.
> >>
> >
> > So, it's just used as a filter?
>
> Seems so to me.
>

Yes, it's a filter on top of the data. Usually I run into cases where I
want to build a model with just two classes from the whole set.

>
> >>
> >> Both naive bayes implementations are based on the same paper, with the
> >> old one still including the text-specific preprocessing.
> >>
> >> --sebastian
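To make the input format concrete, here is a minimal sketch of writing such
labeled training data as a SequenceFile of <Text,VectorWritable> pairs. The
class name, the output path and the "/label/docId" key layout (borrowed from
seq2sparse-style output) are illustrative assumptions, not something the
trainer is guaranteed to require:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;
import org.apache.mahout.math.RandomAccessSparseVector;
import org.apache.mahout.math.Vector;
import org.apache.mahout.math.VectorWritable;

// Sketch: write a tiny labeled training set as <Text,VectorWritable> pairs.
// The label is carried in the Text key; the "/label/docId" layout is an
// assumption for illustration only.
public class LabeledVectorWriter {

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path output = new Path("labeled-vectors/part-r-00000");

    String[] keys = { "/spam/doc1", "/ham/doc2" };       // two toy classes
    double[][] features = { { 1, 0, 3 }, { 0, 2, 1 } };  // three features each

    SequenceFile.Writer writer =
        SequenceFile.createWriter(fs, conf, output, Text.class, VectorWritable.class);
    try {
      for (int i = 0; i < keys.length; i++) {
        Vector vector = new RandomAccessSparseVector(features[i].length);
        for (int j = 0; j < features[i].length; j++) {
          vector.set(j, features[i][j]);
        }
        writer.append(new Text(keys[i]), new VectorWritable(vector));
      }
    } finally {
      writer.close();
    }
  }
}

With the data laid out like this, passing only a subset of the labels via
--labels should, per the discussion above, act as a filter and restrict
training to the matching part of the data (e.g. just two classes of interest).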
