The confusion here may be over the term "supervised".
Supervised classification assumes you know which group each user is in, and the
classifier builds a model to classify new users into the predefined groups.
Usually there is a classifier for each group that, when given a user vector,
returns h
Using a boolean data model and log likelihood similarity I get recommendations
with strengths.
If I were using preference rating magnitudes, the recommendation strength would
be interpreted as the likely rating a user would give the recommended item.
Using the boolean model I get values approach
* however since it means results can't be ranked by preference value (all
are 1). So instead this returns a
* sum of similarities to any other user in the neighborhood who has also
rated the item.
*/
On Nov 15, 2012, at 9:59 AM, Pat Ferrel wrote:
Using a boolean data model and log l
ighted by count -- which is to say, it's a
sum of similarities. This isn't terribly principled but works reasonably in
practice. A simple average tends to over-weight unpopular items, but there
are likely better ways to account for that.
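To make the sum-of-similarities estimator described above concrete, here is a toy sketch (all names and data made up, not Mahout code): in a boolean model the strength for a candidate item is just the sum of the similarities of neighborhood users who also have that item.

```python
def estimate_strength(neighbor_sims, neighbor_items, item):
    """Boolean-model strength: sum of similarities of all neighbors
    in the neighborhood who have also interacted with the item."""
    return sum(sim for neighbor, sim in neighbor_sims.items()
               if item in neighbor_items[neighbor])

# Toy neighborhood: log-likelihood similarities to the target user.
sims = {"u2": 0.5, "u3": 0.25, "u4": 0.125}
items = {"u2": {"ipad", "iphone"}, "u3": {"iphone"}, "u4": {"galaxy"}}

print(estimate_strength(sims, items, "iphone"))  # 0.75
print(estimate_strength(sims, items, "galaxy"))  # 0.125
```

This is the "sum of similarities, weighted by count" behavior: popular items rated by many similar neighbors accumulate larger sums.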
On Thu, Nov 15, 2012 at 5:59 PM,
Trying to catch up.
Isn't the sum of similarities actually a globally comparable number for
strength of preference in a boolean model? I was thinking it wasn't, but it
really is. It may not be ideal, but as an ordinal it should work, right?
Is the logic behind the IDF idea that very popular items
I'm doing a very simple recommender based on binary data. Using
GenericRecommenderIRStatsEvaluator I get nDCG = NaN for each user. My data is
still very incomplete, which means an extremely low cooccurrence rate but there
are some since otherwise I'd expect P and R to be 0 and they are not. For
will have to decide what NaN
means.
I am happy to change that -- but would not pay attention to these
tests at this scale.
On Mon, Dec 3, 2012 at 7:55 PM, Pat Ferrel wrote:
> I'm doing a very simple recommender based on binary data. Using
> GenericRecommenderIRStatsEvaluator I g
does anyone know if mahout/examples/bin/factorize-movielens-1M.sh is still
working? CLI version of splitDataset is crashing in my build (latest trunk).
Even as in "mahout splitDataset" to get the params. Wouldn't be the first time
I mucked up a build though.
it complete correctly. Not
exactly sure how this is supposed to be done, it doesn't look like the options
get parsed in the super class automatically.
This will cause any invocation of splitDataset or DatasetSplitter to crash
running the current trunk.
On Dec 5, 2012, at 1:58 PM, Pat Ferre
What is the intuition regarding the choice or tuning of the ALS params?
Job-Specific Options:
--lambda lambda                    regularization parameter
--implicitFeedback implicitFeedback
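For intuition about lambda (this is a toy, not Mahout's ALS implementation): each alternating step solves a regularized least-squares problem, and lambda shrinks the factors to guard against overfitting. A rank-1 sketch where the per-factor solve is a closed-form scalar:

```python
# Toy rank-1 ALS: factorize R ~ u v' with L2 regularization lambda.
R = [[5.0, 3.0],
     [4.0, 2.0]]
lam = 0.1
u = [1.0, 1.0]          # user factors
v = [1.0, 1.0]          # item factors

def solve(fixed, R, lam, by_row):
    """Closed-form regularized least squares for one side of a rank-1 model:
    x_i = sum_j(r_ij * f_j) / (sum_j(f_j^2) + lambda)."""
    out = []
    n = len(R) if by_row else len(R[0])
    for i in range(n):
        ratings = R[i] if by_row else [row[i] for row in R]
        num = sum(r * f for r, f in zip(ratings, fixed))
        den = sum(f * f for f in fixed) + lam
        out.append(num / den)
    return out

def sq_err(R, u, v):
    return sum((R[i][j] - u[i] * v[j]) ** 2
               for i in range(len(R)) for j in range(len(R[0])))

before = sq_err(R, u, v)
for _ in range(10):      # alternate: fix v and solve u, then fix u and solve v
    u = solve(v, R, lam, by_row=True)
    v = solve(u, R, lam, by_row=False)
after = sq_err(R, u, v)
print(before, ">", after)
```

Larger lambda pulls the factors toward zero (more bias, less variance); it is usually tuned by holdout error.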
+1 this, found the same problems, same fixes. Haven't seen your last problem
On Jan 11, 2013, at 1:41 PM, Ying Liao wrote:
I am trying factorize-movielens-1M.sh. I first find a bug in the sh file.
Then I find a bug in org.apache.mahout.cf.taste.hadoop.als.DatasetSplitter,
the argMap is not mapped
elter wrote:
Which version/distribution of Hadoop are you using?
On 17.01.2013 16:08, Pat Ferrel wrote:
> +1 this, found the same problems, same fixes. Haven't seen your last problem
>
> On Jan 11, 2013, at 1:41 PM, Ying Liao wrote:
>
> I am trying factorize-movielens-1M.sh.
RE: Temporal effects. In CF you are interested in similarities. For instance in
a User-based CF recommender you want to detect users similar to a given user.
The time decay of the similarities is likely to be very slow. In other words, if
I bought an iPad 1 and you bought an iPad 1, the similarity
mporal dynamics.
On Sat, Feb 2, 2013 at 9:54 AM, Pat Ferrel wrote:
> RE: Temporal effects. In CF you are interested in similarities. For
> instance in a User-based CF recommender you want to detect users similar to
> a given user. The time decay of the similarities is likely to be ve
2013 at 1:03 PM, Pat Ferrel wrote:
> Indeed, please elaborate. Not sure what you mean by "this is an important
> effect"
>
> Do you disagree with what I said re temporal decay?
>
No. I agree with it. Human relatedness decays much more quickly than item
popularity.
On Tue, Feb 5, 2013 at 11:29 AM, Pat Ferrel wrote:
> I think you meant: "Human relatedness decays much slower than item
> popularity."
>
Yes. Oops.
> So to make sure I understand the implications of using IDF… For
> boolean/implicit preferences the sum of all pref
The effect of downweighting the popular items is very similar to removing them
from recommendations so I still suspect precision will go down using IDF.
Obviously this can pretty easily be tested, I just wondered if anyone had
already done it.
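For reference, the IDF idea I'm describing can be sketched in a few lines (my own formulation, not Mahout code; data made up): weight each item by log(numUsers / numUsersWithItem), so an item everyone touches contributes nothing.

```python
import math

interactions = {            # user -> set of items (boolean prefs)
    "u1": {"ipad", "iphone"},
    "u2": {"iphone"},
    "u3": {"iphone", "galaxy"},
    "u4": {"iphone"},
}

num_users = len(interactions)
item_count = {}
for items in interactions.values():
    for it in items:
        item_count[it] = item_count.get(it, 0) + 1

# IDF-style weight: universally popular items are downweighted toward 0.
idf = {it: math.log(num_users / c) for it, c in item_count.items()}
print(idf["iphone"])   # log(4/4) = 0.0
print(idf["galaxy"])   # log(4/1) ~ 1.386
```

Since the most popular item gets weight 0, the result resembles removing it from recommendations entirely, which is exactly why I suspect precision in holdout tests would drop.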
This brings up a problem with holdout based precisi
I'd like to experiment with using several types of implicit preference values
with recommenders. I have purchases as an implicit pref of high strength. I'd
like to see if add-to-cart, view-product-details, impressions-seen, etc. can
increase offline precision in holdout tests. These less than ob
nt for the effect: you looked at certain items
and eventually purchased one and I looked at the same items so I might like
what you purchased. It also seems to work better in the existing mahout
framework.
On Feb 9, 2013, at 11:50 AM, Pat Ferrel wrote:
I'd like to experiment with using s
together but not as strongly as ought to
> be obvious from the fact that they're the same. Still, interesting
thought.
>
> There ought to be some 'signal' in this data, just a question of how much
> vs noise. A purchase means much more than a page view of course; it'
There are several methods for recommending things given a shopping cart
contents. At the risk of using the same tool for every problem I was thinking
about a recommender's use here.
I'd do something like train on shopping cart purchases so row = cartID, column
= itemID.
Given cart contents I co
53 AM, Pat Ferrel wrote:
> There are several methods for recommending things given a shopping cart
> contents. At the risk of using the same tool for every problem I was
> thinking about a recommender's use here.
>
> I'd do something like train on shopping cart purch
eas you've mentioned here. Given N items in a cart,
which next item most frequently occurs in a purchased cart?
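That "which next item most frequently occurs in a purchased cart" question can be sketched directly (toy data, names hypothetical):

```python
from collections import Counter

purchased_carts = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"bread", "milk"},
]

def next_item(cart_contents, carts):
    """Count items that co-occur with the current cart contents in
    previously purchased carts; recommend the most frequent one."""
    counts = Counter()
    for cart in carts:
        if cart_contents & cart:          # cart shares something with history
            counts.update(cart - cart_contents)
    return counts.most_common(1)[0][0] if counts else None

print(next_item({"bread"}, purchased_carts))  # 'butter' (co-occurs twice)
```

At scale this is what the cooccurrence-based recommender computes anyway, with cartID standing in for userID.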
On Thu, Feb 14, 2013 at 6:30 PM, Pat Ferrel wrote:
> I thought you might say that but we don't have the add-to-cart action. We
> have to calculate cart purchases by ma
own version of it. Yes you are computing similarity
for k carted items by all N items, but is N so large? hundreds of
thousands of products? this is still likely pretty fast even if the
similarity is over millions of carts. Some smart precomputation and caching
goes a long way too.
On Thu, Feb 14
2013, at 6:09 PM, Ted Dunning wrote:
Do you see the contents of the cart?
Is the cart ID opaque? Does it persist as a surrogate for a user?
On Thu, Feb 14, 2013 at 10:30 AM, Pat Ferrel wrote:
> I thought you might say that but we don't have the add-to-cart action. We
> have t
Time splits are fine but may contain anomalies that bias the data. If you are
going to compare two recommenders based on time splits, make sure the data is
exactly the same for each recommender. One time split we did to create a 90-10
training to test set had a split date of 12/24! Some form of
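A sketch of the kind of deterministic time split I mean (assumed tuple layout), so that two recommenders being compared see exactly the same train/test data:

```python
def time_split(interactions, train_fraction=0.9):
    """Split (user, item, timestamp) tuples at a single cut point so every
    evaluated recommender trains and tests on exactly the same sets."""
    ordered = sorted(interactions, key=lambda x: x[2])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]

data = [("u1", "ipad", 3), ("u2", "iphone", 1), ("u1", "nexus", 2),
        ("u3", "ipad", 5), ("u2", "galaxy", 4)]
train, test = time_split(data, train_fraction=0.8)
print([t[2] for t in train], [t[2] for t in test])  # [1, 2, 3, 4] [5]
```

You still need to eyeball the cut date the split lands on; a 12/24 boundary like ours puts holiday-anomaly behavior entirely on one side.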
be some 'signal' in this data, just a question of how much
> vs noise. A purchase means much more than a page view of course; it's not
> as subject to noise. Finding a means to use that info is probably going to
> help.
>
>
>
>
> On Sat, Feb 9, 2
My plan was to NOT use lucene to start with though I see the benefits. This is
because I want to experiment with weighting--doing idf, no weighting, and with
a non-log idf. Also I want to experiment with temporal decay of recommendability
and maybe blend item similarity based results in certain c
combined item recommendation matrix which is
roughly twice as much work as you need to do and it also doesn't let you
adjust weightings separately.
But it is probably the simplest way to get going with cross recommendation.
On Fri, Feb 22, 2013 at 9:48 AM, Pat Ferrel wrote:
> There
rm set of users to connect the items
together. When you compute the cooccurrence matrix you get A_1' A_1 + A_2'
A_2 which gives you recommendations from 1=>1 and from 2=>2, but no
recommendations 1=>2 or 2=>1. Thus, no cross recommendations.
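This is easy to verify with tiny matrices (toy example): if no user row touches both item sets, A'A comes out block diagonal and the 1=>2 block is all zeros.

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

# Users x items; items 0-1 are "type 1", items 2-3 are "type 2".
# Stacked as separate user rows: no row touches both types.
A = [[1, 1, 0, 0],   # user row from action 1
     [0, 1, 0, 0],   # user row from action 1
     [0, 0, 1, 1],   # user row from action 2
     [0, 0, 1, 0]]   # user row from action 2

C = matmul(transpose(A), A)   # cooccurrence = A'A = A_1'A_1 + A_2'A_2
# The off-diagonal 1=>2 block is all zeros: no cross recommendations.
print([row[2:] for row in C[:2]])  # [[0, 0], [0, 0]]
```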
On Sat, Feb 23, 2013 at 10
To pick up an old thread…
A = views items x users
B = purchases items x users
A cross recommender B'A h_v + B'B h_p = r_p
The B'B h_p is the basic boolean mahout recommender trained on purchases and
we'll use that implementation I assume.
B'A gives cooccurrences of views and purchases multiplyi
rong since view similarity unfiltered by
purchase is not ideal) or the cooccurrences in [B'A] and since this is not
symmetric it will matter whether I look at columns or rows. Either corresponds
to item ids, but the similarities will be different.
Has anyone tried this sort of thing?
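To make the algebra concrete, a toy sketch of r_p = [B'B] h_p + [B'A] h_v with dense lists (data made up; note I'm using the users x items orientation here so that B'B and B'A come out item x item):

```python
def transpose(X):
    return [list(c) for c in zip(*X)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

B = [[1, 0],      # purchases: user1 bought item0
     [0, 1],      # user2 bought item1
     [1, 1]]      # user3 bought both
A = [[1, 1],      # views by the same users
     [1, 1],
     [0, 1]]

BtB = matmul(transpose(B), B)   # item-item purchase cooccurrence
BtA = matmul(transpose(B), A)   # cross: views co-occurring with purchases

h_p = [1, 0]                    # this user's purchase history
h_v = [1, 1]                    # this user's view history
r_p = [p + v for p, v in zip(matvec(BtB, h_p), matvec(BtA, h_v))]
print(r_p)  # [5, 4]
```

The [B'B] h_p term is the basic boolean recommender trained on purchases; [B'A] h_v folds view history into the same purchase-space scores.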
On Mar 19, 2
Taking an idea from Ted, I'm working on a cross recommender starting from
mahout's m/r implementation of an item-based recommender. We have purchases and
views for items by user. It is straightforward to create a recommender on
purchases but using views as a predictor of purchases does not work
to each row of the
>> input matrix. You can think of it as computing A'A and sparsifying the
>> result afterwards. Furthermore it allows to plug in a similarity measure
>> of your choice.
>>
>> If you want to have a cooccurrence matrix, you can use
>>
>
ed it. I will need to pass in the size of the matrices as the size of the
user and item space, Correct?
On Apr 3, 2013, at 9:15 AM, Pat Ferrel wrote:
The non-symmetry of the [B'A] and the fact that it is calculated from two
models leads me to a rather heavy handed approach at least for a
I guess I don't understand this issue.
In my case both the item ids and user ids of the separate DistributedRow Matrix
will match and I know the size for the entire space from a previous step where
I create id maps. I suppose you are saying that the m/r code would be super
simple if a row of B'
I need to do the equivalent of the xrecommender.mostSimilarItems(long[]
itemIDs, int howMany)
To over simplify this, in the standard Item-Based Recommender this is
equivalent to looking at the item similarities from the preference matrix
(similarity of item purchases by user). In the xrecommen
like views and
purchases?
On Apr 8, 2013, at 2:31 PM, Ted Dunning wrote:
On Sat, Apr 6, 2013 at 3:26 PM, Pat Ferrel wrote:
> I guess I don't understand this issue.
>
> In my case both the item ids and user ids of the separate DistributedRow
> Matrix will match and I know th
to use Wikipedia articles (Myrrix, GraphLab).
Another idea is to use StackOverflow tags (Myrrix examples).
Although they are only good for emulating implicit feedback.
On Wed, Apr 10, 2013 at 6:48 PM, Ted Dunning wrote:
> On Wed, Apr 10, 2013 at 10:38 AM, Pat Ferrel
> wrote:
>
Getting this running with co-occurrence rather than using a similarity calc on
user rows finally forced me to understand what is going on in the base
recommender. And the answer implies further work.
[B'B] is usually not calculated in the usual item based recommender. The matrix
that comes out
Or you may want to look at recording purchases by user ID. Then use the
standard recommender to train on (userID, itemsID, boolean). Then query the
trained recommender thus: recommender.mostSimilarItems(long itemID, int
howMany) This does what you want but uses more data than just what items wer
Do you not have a user ID? No matter (though if you do I'd use it) you can use
the item ID as a surrogate for a user ID in the recommender. And there will be
no filtering if you ask for recommender.mostSimilarItems(long itemID, int
howMany), which has no user ID in the call and so will not filte
That looks like the best shortcut. It is one of the few places where the rows
of one and the columns of the other are seen together. Now I know why you
transpose the first input :-)
But, I have begun to wonder whether it is the right thing to do for a cross
recommender because you are comparing
esource.
>
> Robin
>
>
> On 4/10/13 8:37 PM, "Pat Ferrel" wrote:
>
>> I have retail data but can't publish results from it. If I could get a
>> public sample I'd share how the technique worked out.
>>
>> Not sure how to simulate
om/api-profiles/products-api
http://www.kaggle.com/c/acm-sf-chapter-hackathon-big/data
On Mon, Apr 15, 2013 at 2:03 PM, Pat Ferrel wrote:
> MAJOR may be too tame a word.
>
> Furthermore there are several enhancements the community could make to
> support retail data and retail recommen
k to
> view.
>
>
> On Tue, Apr 16, 2013 at 4:53 PM, Pat Ferrel wrote:
>
>> For the cross-recommender we need some replacement for a primary
>> action--purchases and a secondary action--views, clicks, impressions,
>> something.
>>
>> To use this da
u can infer
the search from the data, just not all search results.
On Apr 16, 2013, at 1:24 PM, Pat Ferrel wrote:
I think Ted is talking about a different application of this idea:
http://www.slideshare.net/tdunning/search-as-recommendation
The IDs in my case must be in the same space, at very
You always will have a "cold start" problem for a subset of users--the new ones
to a site. Popularity doesn't always work either. Sometimes you have a flat
purchase frequency distribution, as I've seen. In these cases a metadata or
content based recommender is nice to fill in. If you have no met
I'm doing an experiment creating a recommender from a Pinterest crawl I have
going. I have at least three actions that relate to recommendations:
Goal: recommend people you (a pinterest user) might want to follow
Actions mined by crawling:
follows (user, user)
followed by (user, user)
repinned
Using a Hadoop version of a Mahout recommender will create some number of recs
for all users as its output. Sean is talking about Myrrix I think which uses
factorization to get much smaller models and so can calculate the recs at
runtime for fairly large user sets.
However if you are using Maho
On May 19, 2013 6:27 PM, "Pat Ferrel" wrote:
> Using a Hadoop version of a Mahout recommender will create some number of
> recs for all users as its output. Sean is talking about Myrrix I think
> which uses factorization to get much smaller models and so can calculate
> the
no user data in the matrix. Or are you talking about using the user history as
the query? in which case you have to remember somewhere all users' history and
look it up for the query, no?
On May 19, 2013, at 8:09 PM, Ted Dunning wrote:
On Sun, May 19, 2013 at 8:04 PM, Pat Ferrel wrote:
I certainly have questions about this architecture mentioned below but first
let me make sure I understand.
You use the user history vector as a query? This will be a list of item IDs and
strength-of-preference values (maybe 1s for purchases). The cooccurrence matrix
has columns treated like t
In the interest of getting some empirical data out about various architectures:
On Mon, May 20, 2013 at 9:46 AM, Pat Ferrel wrote:
>> ...
>> You use the user history vector as a query?
>
> The most recent suffix of the history vector. How much is used varies by
> the
This data was for a mobile shopping app. Other answers below.
> On May 21, 2013, at 5:42 PM, Ted Dunning wrote:
>
> Inline
>
>
> On Tue, May 21, 2013 at 8:59 AM, Pat Ferrel wrote:
>
>> In the interest of getting some empirical data out about various
>> arc
e averages for
all clusters?
I don't think I've heard of this before. Seems interesting; is there a paper?
On May 21, 2013, at 9:53 PM, Ted Dunning wrote:
On Tue, May 21, 2013 at 8:47 PM, Pat Ferrel wrote:
> For this sample it looks like about 20-40 clusters is "best"? Loo
proportional to log-likelihood (with an offset) for the mixture of
Gaussian model that underlies k-means clustering.
See this paper for a use of mean squared distance to nearest cluster.
On Fri, May 24, 2013 at 9:46 AM, Pat Ferrel wrote:
> I'm trying to automate something like a hier
I've got a cross-recommender too. It was originally conceived to do a
multi-action ensemble from Ted's notes. I'm now gathering a new data set and
building the meta-model learner.
Even with the same scale you need to learn the weighting factors. Consider a
simple ensemble case:
R_p is the matr
u really perform gradient descent learning of the weights
using hadoop/mahout? Isn't this too costly to perform due to the overheads of
the JVM and hadoop?
On Jun 1, 2013, at 1:21 AM, Pat Ferrel wrote:
> I've got a cross-recommender too. It was originally conceived to do a
&g
Am I losing my mind, or did the --outputPath option get removed from the
MatrixMultiplicationJob recently? It looks like it is now in 'productWith-xxx'
so I'll have to search for the most recent dir of that name? And why isn't
there a --outputPath option to transpose? I have to search for the mo
and, of course, eventually A/B test it.
You don't always have time associated with actions. In the data I'm mining from
Pinterest, for example, the date one user followed another user is not
available. So there is no reasonable way to do truncation. Maybe Pinterest
could do better.
cky and will require some
sort of grid search for good parameters (which might be sped up by using an
evolutionary algorithm picking the best intermediate solutions).
Since, most of what I wrote above about evaluation is still in the planning
stage, any suggestions are welcome!
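A sketch of the brute-force grid search I have in mind for the weighting parameters (the objective here is a stand-in; in practice it would be an offline precision run over a holdout set):

```python
from itertools import product

def offline_precision(w_purchase, w_view):
    """Stand-in objective; a real run would score a holdout test with
    these ensemble weights applied to the two action matrices."""
    return -(w_purchase - 0.8) ** 2 - (w_view - 0.3) ** 2

grid = [i / 10 for i in range(11)]          # 0.0, 0.1, ..., 1.0
best = max(product(grid, grid), key=lambda w: offline_precision(*w))
print(best)  # (0.8, 0.3)
```

Each grid point is independent, so they can be evaluated in parallel; an evolutionary search would just replace the exhaustive `product` with mutation around the best intermediate solutions.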
On Jun 4, 2013, at
In the case where you know a user did not like an item, how should the
information be treated in a recommender? Normally for retail recommendations
you have an implicit 1 for a purchase and no value otherwise. But what if you
knew the user did not like an item? Maybe you have records of "I want
lidation search, which is initially quite expensive (even for
>> distributed machine cluster tech), but could incrementally bail out much
>> sooner after a previous good guess is already known.
>>
>> MR doesn't work well for this though since it requires A LOT of
>
distributed machine cluster tech), but could incrementally bail out much
sooner after a previous good guess is already known.
MR doesn't work well for this though since it requires A LOT of iterations.
On Mon, Jun 17, 2013 at 5:51 PM, Pat Ferrel wrote:
> In the case where you know a user did
I think https://issues.apache.org/jira/browse/MAHOUT-1030 may be the wrong
issue #.
The problem is that the Names from NamedVectorWritable are not used in the
cluster map after kmeans. You need to maintain your own map of your vector name
to internal Mahout id ints. NamedVectors work all the w
ther processing.
On Jul 5, 2013, at 10:28 PM, Andrew Musselman
wrote:
I want to have the core feature of k-means which is to find out which vectors
landed in what cluster, and I'm open to discussion beyond that.
Best
Andrew
On Jul 5, 2013, at 5:43 PM, Pat Ferrel wrote:
> I think ht
Read the paper, and the preso.
As to the 'offline to Solr' part. It sounds like you are suggesting an item
item similarity matrix be stored and indexed in Solr. One would have to create
the action matrix from user profile data (preference history), do a
rowsimilarity job on it (using LLR similar
no reason to not use a real data set. There is a strong reason to
use a synthetic dataset, however, in that it can be trivially scaled up and
down both in items and users. Also, the synthetic dataset doesn't require
that the real data be found and downloaded.
On Sun, Jul 21, 2013 at 2:17 PM
can handle two kinds of input, then
> RowSimilarity can be easily modified to be CrossRowSimilarity. Likewise,
> if we have two DRM's with the same row id's in the same order, we can do a
> map-side merge. Such a merge can be very efficient on a system like MapR
> where
inline
BTW if there is an LLR cross-similarity job (replacing [B'A] it is easy to
integrate.
On Jul 22, 2013, at 12:09 PM, Ted Dunning wrote:
On Mon, Jul 22, 2013 at 9:20 AM, Pat Ferrel wrote:
> +10
>
> Love the academics but I agree with this. Recently saw a VP from Netfli
lans for the next couple weeks as it
happens anyway.
Let me know if you want me to proceed.
On Jul 22, 2013, at 3:42 PM, Ted Dunning wrote:
On Mon, Jul 22, 2013 at 12:40 PM, Pat Ferrel wrote:
> Yes. And the combined recommender would query on both at the same time.
>
> Pat-- does
w.
On Jul 23, 2013, at 10:37 AM, Ted Dunning wrote:
This sounds great. Go for it. Put a comment on the design doc with a pointer
to text that I should import.
On Tue, Jul 23, 2013 at 9:39 AM, Pat Ferrel wrote:
I can supply:
1) a Maven based project in a public github repo as a baseline
arity rank is not something
we want to lose so unless someone has a better idea I'll just order the IDs in
the fields and call it good for now.
On Jul 23, 2013, at 12:03 PM, Pat Ferrel wrote:
Will do.
For what it's worth…
The project I'm working on is an online recommender
looks like similarity and TFIDF are pluggable in Solr and seem pretty
easy to change. Planning to use cosine for the first cut since it's default.
On Jul 24, 2013, at 4:10 AM, Michael Sokolov
wrote:
On 7/23/13 7:26 PM, Pat Ferrel wrote:
> Honestly not trying to make this more co
"elegant" or "home-style" might be
good indicators for different restaurants even if those terms don't appear in a
restaurant description.
Sent from my iPhone
On Jul 23, 2013, at 18:26, Pat Ferrel wrote:
> Honestly not trying to make this more complicated but…
>
On Jul 24, 2013, at 8:32 PM, Pat Ferrel wrote:
Understood, catalog categories, tags, etc will make good metadata to be
included in the query and putting in separate fields allows us to separately
boost each in the query. UserIDs that have interacted with the item is an
interesting idea.
Howe
Well its a work in progress but you can see it here:
https://github.com/pferrel/solr-recommender
There is no Solr integration yet, it is just ingest, create id indexes, run
RecommenderJob, and XRecommenderJob. These create the item similarity matrices,
which will be put into Solr. They also cre
ter this week.
On 23.07.2013 19:38, Ted Dunning wrote:
> On Tue, Jul 23, 2013 at 9:39 AM, Pat Ferrel wrote:
>
>> This pipeline lacks downsampling since I had to replace
>> PreparePreferenceMatrixJob and potentially LLR for [B'A]. I assume
>> Sebastian is the person to
In the cross-recommender the similarity matrix is calculated doing [B'A]. We
want the rows to be stored as the item-item similarities in Solr right? [B'B]
is symmetric so just want to make sure I have it straight for [B'A].
B = purchases
iphone  ipad  nexus  galaxy  surface
u1 1
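The asymmetry is easy to see with toy matrices (made-up data): [B'B] equals its own transpose, [B'A] generally does not, so rows and columns of [B'A] give different similar-item lists and we have to pick one to index.

```python
def transpose(X):
    return [list(c) for c in zip(*X)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

B = [[1, 0],      # purchases, users x items
     [1, 1]]
A = [[0, 1],      # views
     [1, 1]]

BtB = matmul(transpose(B), B)
BtA = matmul(transpose(B), A)
print(BtB == transpose(BtB))   # True: symmetric, rows == columns
print(BtA == transpose(BtA))   # False: must choose rows or columns
```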
A few architectural questions: http://bit.ly/18vbbaT
I created a local instance of the LucidWorks Search on my dev machine. I can
quite easily save the similarity vectors from the DRMs into docs at special
locations and index them with LucidWorks. But to ingest the docs and put them
in separate
Jul 31, 2013, at 11:20 AM, Pat Ferrel wrote:
A few architectural questions: http://bit.ly/18vbbaT
I created a local instance of the LucidWorks Search on my dev machine. I can
quite easily save the similarity vectors from the DRMs into docs at special
locations and index them with LucidWorks. But
OK and yes. The docs will look like:
ipad
iphone
iphone nexus
iphone
ipad
ipad galaxy
On Jul 31, 2013, at 11:42 AM, B Lyon wrote:
I'm interested in helping as well.
Btw I thought that what was stored in the solr fields were the
isioned in the design
> doc, although I could be wrong on this. Anyway I'm pretty open to helping
> wherever needed.
>
> Thanks,
> Andrew
>
>
>
>
>
> On 7/31/13 12:20 PM, "Pat Ferrel" wrote:
>
>> A few architectural questions: http://
I'd vote for csv then.
On Jul 31, 2013, at 12:00 PM, Ted Dunning wrote:
On Wed, Jul 31, 2013 at 11:20 AM, Pat Ferrel wrote:
A few architectural questions: http://bit.ly/18vbbaT
I created a local instance of the LucidWorks Search on my dev machine. I can
quite easily save the simil
to be retrieved. Better to have the tags for the
single doc on all the related docs so that a single retrieval will pull
them all in with their details.
On Wed, Jul 31, 2013 at 11:51 AM, Pat Ferrel wrote:
> OK and yes. The docs will look like:
>
>
>
> ipad
>
oops, mistyped…
If the LLR created DRM has a row:
Key: 1, Value { 0:1.0,}
where 0 -> iphone and 1 -> ipad then wouldn't the doc look like
ipad
iphone
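The mapping I mean, as a sketch (dictionary and function names hypothetical): translate the DRM row's internal int keys through the id map into item tokens for the doc's field.

```python
id_to_item = {0: "iphone", 1: "ipad"}

def row_to_doc(row_key, row_values):
    """A DRM row like Key: 1, Value {0: 1.0} becomes a doc for item 'ipad'
    whose indexed field holds the similar-item tokens ('iphone')."""
    doc_id = id_to_item[row_key]
    tokens = " ".join(id_to_item[k] for k in sorted(row_values))
    return doc_id, tokens

print(row_to_doc(1, {0: 1.0}))  # ('ipad', 'iphone')
```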
On Jul 31, 2013, at 12:14 PM, Pat Ferrel wrote:
Sorry not sure what you are saying.
If the LLR created DRM has a r
o find a system--free tier AWS,
Ted's box, etc. Then install all the needed stuff.
I'll get the output working to csv.
On Jul 31, 2013, at 11:51 AM, Pat Ferrel wrote:
OK and yes. The docs will look like:
ipad
iphone
iphone nexus
iphone
larities there is no need to do more than fetch one doc that
contains the similarities, right? I've successfully used this method with the
Mahout recommender but please correct me if something above is wrong.
On Jul 31, 2013, at 4:52 PM, Ted Dunning wrote:
Pat,
See inline
O

cross-validation tests.
On Aug 1, 2013, at 9:49 AM, Ted Dunning wrote:
On Thu, Aug 1, 2013 at 8:46 AM, Pat Ferrel wrote:
>
> For item similarities there is no need to do more than fetch one doc that
> contains the similarities, right? I've successfully used this method with
agreed to store the rows there too
because they were from Bs items. This was the discussion about having different
items for cross actions. The excerpt below is Ted responding to my question. So
do we want the columns of [B'A]? It's only a transpose away.
> On Tue, Jul 30, 2013 at
o maybe someone else can check this reasoning. Have a look at the data here
https://github.com/pferrel/solr-recommender/blob/master/src/test/resources/Recommender%20Math.xlsx
On Aug 1, 2013, at 6:00 PM, Pat Ferrel wrote:
Yes, storing the similar_items in a field, cross_action_similar_items in
a
that, the rows do.
Going from rows to columns is the trivial addition of a transpose so I'm going
to go ahead with rows for now. This affects the cross_action_similar_items and
so only the cross-recommender part of the whole.
On Aug 2, 2013, at 8:00 AM, Pat Ferrel wrote:
I put so
ns of the google matrix (
https://googledrive.com/host/0B2GQktu-wcTiaWw5OFVqT1k3bDA/). There are lots
of other different pieces here of course,
but show connections soup-to-nuts as much as possible.
On Friday, August 2, 2013, Pat Ferrel wrote:
> I put some thought into this (actually I sle
We doing a hangout at 2 on the Solr recommender?
Assuming Ted needs to call it, not sure if an invite has gone out, I haven't
seen one.
On Aug 2, 2013, at 12:49 PM, B Lyon wrote:
I am planning on sitting in as flaky connection allows.
On Aug 2, 2013 3:21 PM, "Pat Ferrel" wrote:
> We doing a hangout at 2 on the Solr recommender?
>
sed on composite behavior composed of h_a and h_b
query is [b-a-links: h_a b-b-links: h_b]
Does this make sense by being more explicit?
Now, it is pretty clear that we could have an index of A objects as well
but the link fields would have to be a-a-links and a-b-links, of course.
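A toy rendering of that two-field query (field names from the note above; the scoring is a crude stand-in for Solr's, just counting term matches per field): each doc holds b-b-links and b-a-links token sets, and the query matches h_b against one field and h_a against the other.

```python
docs = {
    "ipad":   {"b-b-links": {"iphone"}, "b-a-links": {"iphone", "nexus"}},
    "galaxy": {"b-b-links": {"nexus"},  "b-a-links": {"surface"}},
}

def score(doc, h_a, h_b):
    """Stand-in for Solr's OR-query scoring: count field/term matches."""
    return len(doc["b-a-links"] & h_a) + len(doc["b-b-links"] & h_b)

h_a = {"iphone"}            # history of A-type actions (e.g. views)
h_b = {"iphone", "nexus"}   # history of B-type actions (e.g. purchases)
ranked = sorted(docs, key=lambda d: -score(docs[d], h_a, h_b))
print(ranked)  # ['ipad', 'galaxy']
```

A real deployment would boost the two fields separately, which is the whole point of keeping the cross-action links in their own field.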
On
Got away with that stupid comment. All doc ids will be from B items even in the
general case.
On Aug 2, 2013, at 2:39 PM, Pat Ferrel wrote:
Thanks, well put.
In order to have the ultimate impl with two id spaces for A and B would we have
to create different docs for A'B and B'B?
I'll refresh my copy of the trunk and look into it. If this happens a lot I'll
put my version of Mahout on github until it settles down.
Had to copy the code for a couple Mahout classes like Recommender and
ToItemsVectorReducer to get access to private statics, no substantive changes.
I haven't