+10
Love the academics but I agree with this. Recently saw a VP from Netflix plead
with the audience (mostly academics) to move past RMSE--focus on maximizing
correct ranking, not rating prediction.
Anyway I have a pipeline that does the following:
ingests logs either TSV or CSV of arbitrary c
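The RMSE-versus-ranking point can be made concrete with a small sketch (hypothetical numbers, not from any real system): a model can win on RMSE while putting the wrong items at the top of the list, which is exactly why ranking quality is the better target for a recommender.

```python
import math

def rmse(pred, truth):
    # root mean squared error of predicted vs. true ratings
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(pred))

def precision_at_k(pred, truth, k, like=4.0):
    # fraction of the top-k predicted items the user actually liked
    top = sorted(range(len(pred)), key=lambda i: -pred[i])[:k]
    return sum(1 for i in top if truth[i] >= like) / k

truth = [5, 5, 1, 1]        # user loves items 0 and 1
a = [3.4, 3.3, 3.6, 3.5]    # small errors, but the order is wrong
b = [9.0, 8.0, -1.0, -2.0]  # wild rating values, but the order is right

# a "wins" on RMSE yet ranks the wrong items first; b is the opposite
```

Here model `a` has the lower RMSE but precision@2 of 0, while model `b` has a much worse RMSE and perfect precision@2.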
On 07/22/2013 12:20 PM, Pat Ferrel wrote:
My understanding of the Solr proposal puts B's row similarity matrix in a vector per
item. That means each row is turned into "terms" = external IDs--not sure how
the weights of each term are encoded.
This is the key question for me. The best idea I've
Just to make sure I understood correctly, Ted, could you please correct
me? :)
1. Using a search engine, I will treat items as documents, where each
document vector consists of other items (similar to "words of documents")
with co-occurrence (LLR) weights (instead of tf-idf weights in a search
engine)
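For reference, a plain-Python sketch of the LLR weight discussed here, in the usual 2x2 co-occurrence formulation (the same G-squared statistic that Mahout's LogLikelihood class computes); the example counts are illustrative:

```python
from math import log

def xlogx(x):
    return 0.0 if x == 0 else x * log(x)

def entropy(*counts):
    # unnormalized x*log(x) entropy used in the G-squared statistic
    return xlogx(sum(counts)) - sum(xlogx(k) for k in counts)

def llr(k11, k12, k21, k22):
    """Log-likelihood ratio for a 2x2 co-occurrence table.
    k11: users who interacted with both items
    k12: item A only;  k21: item B only;  k22: neither."""
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    return max(0.0, 2.0 * (row + col - mat))
```

A strongly associated pair scores high, while a pair whose co-occurrence matches the independence expectation scores essentially zero.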
My experience is that TFIDF works just fine, especially as a first cut.
Adding different kinds of data, building out backend A/B testing, tuning
the UI, and weighting the query all come before the next round of weighting
changes. Typically, the priority stack never empties enough for that task
to rise to the top.
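As a sketch of the "TFIDF as a first cut" idea: weight each indicator "term" in an item "document" by tf * log(N/df), one common idf variant, so rare indicators get higher weight. The dict shapes and item names here are assumptions for illustration:

```python
from math import log
from collections import Counter

def tfidf_weights(docs):
    """docs: {item_id: list of indicator item-ids ("terms")}.
    Returns {item_id: {term: tf * idf}} with idf = log(N / df),
    so rare indicators weigh more, ubiquitous ones weigh nothing."""
    n = len(docs)
    df = Counter(t for terms in docs.values() for t in set(terms))
    return {
        item: {t: tf * log(n / df[t]) for t, tf in Counter(terms).items()}
        for item, terms in docs.items()
    }
```

An indicator appearing in every document gets weight 0; a rare one gets a positive, log-scaled weight.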
Inline ... slightly redundant relative to other answers, but that shouldn't
be a problem.
On Mon, Jul 22, 2013 at 11:56 AM, Gokhan Capan wrote:
> Just to make sure I understood correctly, Ted, could you please correct
> me? :)
>
>
> 1. Using a search engine, I will treat items as documents, w
On Mon, Jul 22, 2013 at 9:20 AM, Pat Ferrel wrote:
> +10
>
> Love the academics but I agree with this. Recently saw a VP from Netflix
> plead with the audience (mostly academics) to move past RMSE--focus on
> maximizing correct ranking, not rating prediction.
>
> Anyway I have a pipeline that doe
So you are proposing just grabbing the top-N-scoring related items and
indexing them without regard to weight? Effectively quantizing the weights
to 1 for those items, and 0 for everything else? I guess LLR tends to do
that anyway.
-Mike
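The quantization Mike describes is a one-liner; a minimal sketch (names illustrative):

```python
def top_n_indicators(weights, n):
    """Keep only the n highest-weighted related items, dropping the weights:
    effectively quantizes the row to 1 for the top n, 0 elsewhere."""
    return set(sorted(weights, key=weights.get, reverse=True)[:n])
```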
On 07/22/2013 02:57 PM, Ted Dunning wrote:
My experience is
inline
BTW if there is an LLR cross-similarity job (replacing [B'A]) it is easy to
integrate.
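The cross-similarity [B'A] mentioned here is just a matrix product: if A and B are user-by-item 0/1 matrices for two different action types, entry [j][i] of B'A counts users who did action B on item j and action A on item i. A toy pure-Python version (shapes and names are illustrative, not the Mahout job):

```python
def cross_cooccurrence(A, B):
    """Cross-cooccurrence [B'A] for user x item 0/1 matrices A and B
    (e.g. A = purchases, B = views). Entry [j][i] counts users who
    viewed item j and purchased item i."""
    users = len(A)
    nb, na = len(B[0]), len(A[0])
    return [[sum(B[u][j] * A[u][i] for u in range(users))
             for i in range(na)] for j in range(nb)]
```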
On Jul 22, 2013, at 12:09 PM, Ted Dunning wrote:
On Mon, Jul 22, 2013 at 9:20 AM, Pat Ferrel wrote:
> +10
>
> Love the academics but I agree with this. Recently saw a VP from Netflix
> plead with t
On Mon, Jul 22, 2013 at 12:40 PM, Pat Ferrel wrote:
> Yes. And the combined recommender would query on both at the same time.
>
> Pat-- doesn't it need ensemble type weighting for each recommender
> component? Probably a wishlist item for later?
Yes. Weighting different fields differently is
Not entirely without regard to weight. Just without regard to designing
weights specific to this application. The weights that Solr uses natively
are intuitively what we want (rare indicators have higher weights in a
log-ish kind of way).
Frankly, I doubt the effectiveness here of mathematical r
Fair enough - thanks for clarifying. I wondered whether that would be
worth the trouble, also. Maybe one of the academics Pat mentioned will
test and find out for us :)
On 7/22/13 6:45 PM, Ted Dunning wrote:
Not entirely without regard to weight. Just without regard to
designing weights spec
This is pulled out of one of Ted's inline responses to the recent "Setting
up a recommender" thread, and I was hoping to confirm some things... Most of
which may end up being a restatement of what he and others have said in the
first place.
It seems that you would have a "document" in solr for each th
Exactly what I was trying to say. Excellently clear way to put it all.
On Mon, Jul 22, 2013 at 8:38 PM, B Lyon wrote:
> This is pulled out of one of Ted's inline responses to the recent Setting
> up a recommender thread, and was hoping to confirm some things... Most of
> which may end up being
Could I ask everybody who had trouble with my prose to help me out by
commenting on the design document? That way I can record the improvements
that would make it clear.
My apologies for only allowing commenting, but I find it easier to make
sure all comments get in because there is a very nice trac
Hi,
in the Mahout examples, the (org.apache.mahout.classifier.sgd.) RunLogistic
and TrainLogistic classes are a great example of classifying content with
SGD and getting a nice confusion matrix.
I'm trying to adapt this to classify data into more than 2 categories.
The algorithm uses the classifyScalar method
Classify is the call that you want.
The command line logistic regression programs were originally written more
as demonstrations and weren't written to handle multiple target values.
It shouldn't be hard to adapt them. It would be great to get a patch to do
so.
On Mon, Jul 22, 2013 at 9:08 PM
Hi Ted,
Thanks. Can you please tell me which class I have to use for multinomial
classification (like AdaptiveLogisticRegression or OnlineLogisticRegression)?
How do I give the target category values to this class? Is it via the
constructor? If you could give a small code snippet, that would be helpful.
-
OLR supports a train method with the target being an integer. That allows
multi-class training.
I can't remember if ALR does as well. It may not, since AUC is used to
select hyperparameters, and AUC is not uniquely defined for multiple
classes.
Calling the classifyFull method is the easiest way
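For the multinomial case, here is a plain-Python sketch of what a train(target, features) / classifyFull pair does conceptually: softmax regression updated by SGD. It mirrors the shape of Mahout's OnlineLogisticRegression API (integer class id in, one probability per class out) but is not its implementation; class and parameter names are invented for illustration:

```python
import math

class TinySoftmaxSGD:
    """Multinomial logistic (softmax) regression trained by plain SGD.
    train() takes an integer class id, like OLR's train(int, Vector);
    classify_full() returns one probability per class, like classifyFull."""

    def __init__(self, num_classes, num_features, rate=0.5):
        self.w = [[0.0] * num_features for _ in range(num_classes)]
        self.rate = rate

    def classify_full(self, x):
        # softmax over per-class dot products, shifted by max for stability
        scores = [sum(wi * xi for wi, xi in zip(row, x)) for row in self.w]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        return [e / z for e in exps]

    def train(self, target, x):
        # gradient of cross-entropy loss: (indicator - probability) * x
        p = self.classify_full(x)
        for c, row in enumerate(self.w):
            grad = (1.0 if c == target else 0.0) - p[c]
            for i, xi in enumerate(x):
                row[i] += self.rate * grad * xi
```

Usage is a loop over (integer target, feature vector) pairs, then argmax over the classify_full probabilities to pick a class.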