Concur here. Obviously CrossRowSimilarityJob and RowSimilarityJob will be able to share some downstream code. But there are economies in RowSimilarityJob (RSJ) that probably can't apply to CrossRowSimilarityJob (CRSJ).
On Mon, Aug 5, 2013 at 7:20 AM, Sebastian Schelter <s...@apache.org> wrote:

> I think the downsampling belongs in RowSimilarityJob. But I also think
> that we need a special CrossRowSimilarityJob that computes B'A
> and also downsamples them during the computation. Furthermore, it should
> compute LLR similarities between the rows, not dot products.
>
> --sebastian
>
> On 05.08.2013 16:14, Pat Ferrel wrote:
> > OK, I see it in my build now. Also not sufficient repos in the pom.
> >
> > Looks like some major refactoring of RowSimilarity is in progress.
> >
> > Sebastian, are you sure downsampling belongs in RowSimilarity? It won't
> > be applied to [B'A]?
> >
> > If so I'll update to the latest Mahout trunk.
> >
> > On Aug 4, 2013, at 8:57 PM, B Lyon <bradfl...@gmail.com> wrote:
> >
> > Hi Pat
> >
> > Below is the compilation error - it's what led me to look at the
> > SAMPLE_SIZE stuff in the first place, where I confirmed via javap that
> > the downloaded mahout jar did not have it any more and then I started
> > looking at the svn source. Mebbe I've got something else misconfigured
> > somehow, although I don't see how it would compile if it's looking for
> > that static field that's removed.
> >
> > [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.5.1:compile (default-compile) on project solr-recommender: Compilation failure: Compilation failure:
> > [ERROR] /Users/bradflyon/Documents/solr-recommender/src/main/java/finderbots/recommenders/hadoop/PrepareActionMatrixesJob.java:[120,71] cannot find symbol
> > [ERROR] symbol : variable SAMPLE_SIZE
> > [ERROR] location: class org.apache.mahout.cf.taste.hadoop.preparation.ToItemVectorsMapper
> > [ERROR] /Users/bradflyon/Documents/solr-recommender/src/main/java/finderbots/recommenders/hadoop/PrepareActionMatrixesJob.java:[168,71] cannot find symbol
> > [ERROR] symbol : variable SAMPLE_SIZE
> > [ERROR] location: class org.apache.mahout.cf.taste.hadoop.preparation.ToItemVectorsMapper
> > [
> >
> > On Sun, Aug 4, 2013 at 8:57 PM, Pat Ferrel <pat.fer...@gmail.com> wrote:
> >
> > Just updated to today's Mahout trunk and everything works for me.
> >
> > Can you send me the error?
> >
> > Sebastian, do we really want this limit in RowSimilarity? It will not
> > be applied to [B'A] unless you also do a mod to give us RowSimilarity
> > on two matrices. Now that would be very nice indeed…
> >
> > On Aug 3, 2013, at 9:48 PM, B Lyon <bradfl...@gmail.com> wrote:
> >
> > Hi Pat
> >
> > I was going to just play with building the solr-recommender stuff in
> > its current wip state and noticed a compile error (running mvn install),
> > I think because the 0.9 snapshot has some changes on July 30th
> >
> > http://svn.apache.org/viewvc?view=revision&revision=1508302
> >
> > Basically, back on June 18, Ted noticed that the downsampling might not
> > be being done at the right place to actually avoid overwork due to
> > "perversely prolific users" (thread is here:
> > http://web.archiveorange.com/archive/v/z6zxQatCzHoFxbdLF0of), and
> > someone else (Sebastian Schelter) has already acted on this (July 30)
> > to move the downsampling to somewhere else (MAHOUT-1289 -
> > https://issues.apache.org/jira/browse/MAHOUT-1289), which (among other
> > things) removes the SAMPLE_SIZE static variable from ToItemVectorsMapper.
> > I don't know how the general changes affect what you were setting
> > up/playing with. Let me know if I've missed something here.
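
For readers following the "LLR similarities between the rows, not dot products" point in the quoted thread, here is a minimal sketch of the difference for one row of B against one row of A. It is not the CrossRowSimilarityJob discussed above (which does not exist yet) and it does not call any Mahout API. The class name LlrSketch, the method names, and the representation of a binary row as a set of user ids are all assumptions made for illustration. The score is the usual entropy-based form of the log-likelihood ratio (G²) statistic on a 2x2 co-occurrence table.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only: names and data layout are assumptions,
// not taken from Mahout or from the solr-recommender project.
final class LlrSketch {

    // x * ln(x), with the convention 0 * ln(0) = 0.
    static double xLogX(long x) {
        return x == 0 ? 0.0 : x * Math.log(x);
    }

    // Unnormalized Shannon entropy of a set of counts (N times the entropy in nats).
    static double entropy(long... counts) {
        long sum = 0;
        double parts = 0.0;
        for (long c : counts) {
            sum += c;
            parts += xLogX(c);
        }
        return xLogX(sum) - parts;
    }

    // Log-likelihood ratio for the 2x2 table [[k11, k12], [k21, k22]].
    static double llr(long k11, long k12, long k21, long k22) {
        double rowEntropy = entropy(k11 + k12, k21 + k22);
        double colEntropy = entropy(k11 + k21, k12 + k22);
        double matrixEntropy = entropy(k11, k12, k21, k22);
        // Clamp at zero to absorb floating-point round-off.
        return Math.max(0.0, 2.0 * (rowEntropy + colEntropy - matrixEntropy));
    }

    // Dot product of two binary rows: the number of users present in both.
    static long dot(Set<Integer> rowB, Set<Integer> rowA) {
        Set<Integer> both = new HashSet<>(rowB);
        both.retainAll(rowA);
        return both.size();
    }

    // LLR between one row of B and one row of A.
    // Assumes numUsers is at least the number of distinct users in both rows.
    static double crossRowLlr(Set<Integer> rowB, Set<Integer> rowA, long numUsers) {
        long k11 = dot(rowB, rowA);            // users in both rows
        long k12 = rowB.size() - k11;          // users only in the B row
        long k21 = rowA.size() - k11;          // users only in the A row
        long k22 = numUsers - k11 - k12 - k21; // users in neither row
        return llr(k11, k12, k21, k22);
    }

    public static void main(String[] args) {
        Set<Integer> rowB = new HashSet<>(Arrays.asList(1, 2, 3));
        Set<Integer> rowA = new HashSet<>(Arrays.asList(1, 2, 3));
        System.out.println("dot = " + dot(rowB, rowA));
        System.out.println("llr = " + crossRowLlr(rowB, rowA, 6));
    }
}
```

The dot product only counts co-occurrences, so a row touched by very many users scores high against nearly everything. The LLR score also uses the row totals and the overall user count, so overlap that is no larger than chance scores close to zero.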