On Tue, Jan 19, 2010 at 10:58 AM, Jeff Eastman
<[email protected]>wrote:

>
> Looking in MAHOUT-228-3.patch, I don't see any sparse vectorizer. Did you
> have another patch in mind?
>

There should have been one.  Let me check and figure out the name.


> I'm trying to wrap my mind around "L-1 model distribution".


For the classifier learning, what we have is a prior distribution over
classifiers with probability proportional to exp(-sum(abs(w_i))).  The
log of this probability is -sum(abs(w_i)) = -L_1(w), which gives the name.
This log probability is what is used as the regularization term in the
optimization of the classifier.
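To make that concrete, here is a minimal sketch of the resulting objective
(logistic log-loss plus L1 penalty).  This is just an illustration, not the
Mahout code; the function name and the `lam` strength parameter are mine:

```python
import math

def regularized_l1_loss(w, data, lam):
    # Negative log-likelihood (logistic loss) plus the L1 penalty.
    # Minimizing this is MAP estimation under the prior
    # p(w) proportional to exp(-lam * sum(|w_i|)) described above.
    # data is a list of (x, y) pairs with labels y in {-1, +1}.
    nll = 0.0
    for x, y in data:
        margin = y * sum(wi * xi for wi, xi in zip(w, x))
        nll += math.log(1.0 + math.exp(-margin))
    return nll + lam * sum(abs(wi) for wi in w)
```

With w = 0 the penalty vanishes and the loss is just log(2) per example.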

It isn't obvious from this definition, but this prior/regularizer has the
effect of preferring sparse models (for classification).  Where an L_2 prior
prefers lots of small weights in ambiguous conditions, because its penalty
grows quadratically and thus punishes large coefficients heavily, an L_1
prior prefers to concentrate the weight on one or a few larger coefficients
and drive the rest to exactly zero.
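One way to see the sparsity effect is through the update steps the two
penalties induce.  This is a hand-rolled illustration (the function names
are mine, not from any library): the L1 proximal step is soft-thresholding,
which zeroes out small weights exactly, while the L2 step only scales
weights toward zero, so they stay nonzero.

```python
import math

def l1_prox(w, t):
    # Soft-thresholding: the proximal step for an L1 penalty with
    # threshold t.  Any coefficient with |w_i| <= t becomes exactly
    # zero, which is why L1 regularization yields sparse models.
    return [math.copysign(max(abs(wi) - t, 0.0), wi) for wi in w]

def l2_shrink(w, t):
    # The analogous L2 step merely shrinks every weight multiplicatively,
    # so small weights get smaller but never reach exactly zero.
    return [wi / (1.0 + t) for wi in w]
```

For example, l1_prox([0.05, -0.3, 1.2], 0.1) kills the 0.05 entry outright,
while l2_shrink leaves all three entries nonzero.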


> .... Would an L-1 model vector only have integer-valued elements?
>

In the sense that 0 is an integer, yes.  :-)

But what it prefers is zero-valued coefficients.
