On 11.09.2011 16:19, Grant Ingersoll wrote:
> For instance, how do the labels get associated with the training examples? I
> see the --labels option, but it isn't clear how it relates to the training
> data.

The training data must already be labeled. It consists of <Text,VectorWritable> tuples that represent labeled vectors. The --labels option specifies which labels (and therefore which parts of the training data) to use.

Both naive Bayes implementations are based on the same paper; the old one still includes the text-specific preprocessing.

--sebastian
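
To make the relationship between the labeled tuples and the --labels option concrete, here is a minimal Python sketch. It is not Mahout code: the (label, vector) pairs only stand in for the <Text,VectorWritable> tuples, it assumes the key carries the label itself, and the names training_data, select_by_labels and sum_vectors_per_label are hypothetical.

```python
# Hypothetical stand-in for <Text,VectorWritable> tuples: (label, feature vector).
training_data = [
    ("sports", [1.0, 0.0, 2.0]),
    ("politics", [0.0, 3.0, 1.0]),
    ("sports", [2.0, 1.0, 0.0]),
    ("weather", [0.0, 0.0, 4.0]),
]


def select_by_labels(examples, labels):
    """Keep only the examples whose label is in the requested set."""
    wanted = set(labels)
    return [(label, vector) for label, vector in examples if label in wanted]


def sum_vectors_per_label(examples):
    """Sum the feature vectors of each label (a typical per-class aggregate)."""
    totals = {}
    for label, vector in examples:
        if label not in totals:
            totals[label] = list(vector)
        else:
            totals[label] = [a + b for a, b in zip(totals[label], vector)]
    return totals


# Analogous to passing a list of labels: only matching examples are used.
selected = select_by_labels(training_data, ["sports", "politics"])
totals = sum_vectors_per_label(selected)
```

In this sketch the "weather" example is dropped because its label was not requested, which mirrors how restricting the labels also restricts the part of the training data that is used.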
