Re: svn commit: r896922 [1/3] - in /lucene/mahout/trunk: core/src/main/java/org/apache/mahout/common/ core/src/main/java/org/apache/mahout/fpm/pfpgrowth/ core/src/main/java/org/apache/mahout/fpm/pfp

2010-01-08 Thread Robin Anil
Try Now

compareTo() issue

2010-01-08 Thread Sean Owen
I see some compareTo() methods with logic like this -- int a = object1.foo(); int b = object2.foo(); if (a == b) { return 1; // order randomly } else { return a - b; } Three problems here: - This does not produce a random ordering when used with a sort; it's quite deterministic - This

Re: svn commit: r896922 [1/3] - in /lucene/mahout/trunk: core/src/main/java/org/apache/mahout/common/ core/src/main/java/org/apache/mahout/fpm/pfpgrowth/ core/src/main/java/org/apache/mahout/fpm/pfp

2010-01-08 Thread deneche abdelhakim
the build is successful, thanks =D On Fri, Jan 8, 2010 at 9:23 AM, Robin Anil robin.a...@gmail.com wrote: Try Now

Re: What index structure does kNN algorithm use in mahout?

2010-01-08 Thread Grant Ingersoll
Do you mean K-Means? On Jan 7, 2010, at 3:50 AM, xiao yang wrote: Like R-tree. Or it compares each record for every query? Thanks! Xiao -- Grant Ingersoll http://www.lucidimagination.com/ Search the Lucene ecosystem using Solr/Lucene:

MapReduce Unit Testing

2010-01-08 Thread zhao zhendong
Hi, Does anybody know a out-off-shift Unit testing package for Mapreduce framework? MRUnit is good, but this package only can be found in Cloudera own Hadoop. Cheers, Zhendong -- - Zhen-Dong Zhao (Maxim) Department of Computer

Re: MapReduce Unit Testing

2010-01-08 Thread Sean Owen
Off-the-shelf? MRUnit is open source and Apache licensed, so I don't see why you can't use it. On Fri, Jan 8, 2010 at 12:27 PM, zhao zhendong zhaozhend...@gmail.com wrote: Hi, Does anybody know a out-off-shift Unit testing package for Mapreduce framework? MRUnit is good, but this package only

Re: [jira] Commented: (MAHOUT-238) Further Dependency Cleanup

2010-01-08 Thread Drew Farris
First, apologies for propagating a problem here. Since 0.20.2 is a snapshot, there's no release of hadoop that corresponds directly to it. In maven terms, a snapshot could be anything after 0.20.1 but prior to a formal release of 0.20.2. In the repo, they are timestamped. We're pulling this jar

Re: [jira] Commented: (MAHOUT-238) Further Dependency Cleanup

2010-01-08 Thread zhao zhendong
Thanks Drew, +1 for me to maintain a stable hadoop release, such as 0.20.1. The reason is obvious :) Cheers, Zhendong On Fri, Jan 8, 2010 at 10:23 PM, Drew Farris drew.far...@gmail.com wrote: First, apologies for propagating a problem here. Since 0.20.2 is a snapshot, there's no release of

Re: compareTo() issue

2010-01-08 Thread Robin Anil
That one was specific to ordering of sub patterns in Fpgrowth stage. I did that as an optimisation wherein the objects need to be at a random place in the heap if they are of equal length and support. Since it is the most called function in the entire algorithm, I got some performance benefit from

Re: compareTo() issue

2010-01-08 Thread Sean Owen
If the placement doesn't matter, why is returning 0 a problem? I'm just wondering if this doesn't introduce some subtle bugs in the way that not implementing hashCode/equals does. It may happen to work here but later... The overflow problem is remote, but not trivial... can support be large? like
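
Editor's sketch for the compareTo() thread above (not code from the Mahout source tree): a minimal Java illustration of two of the concerns raised, under the assumption that the compared values are plain ints. The class name CompareToSketch and both method names are hypothetical. It shows that returning 1 for equal values makes the comparison asymmetric, and that `a - b` can overflow and flip sign, while `Integer.compare` avoids both.

```java
public class CompareToSketch {

    // Mirrors the pattern quoted in the thread: never returns 0 for equal
    // values and uses subtraction for the unequal case.
    public static int subtractingCompare(int a, int b) {
        if (a == b) {
            return 1; // says "a > b" even when called as (b, a): not symmetric
        }
        return a - b; // wraps around when the true difference exceeds int range
    }

    // Contract-respecting alternative: returns 0 on ties and cannot overflow.
    public static int safeCompare(int a, int b) {
        return Integer.compare(a, b);
    }

    public static void main(String[] args) {
        // Equal values: both argument orders report "greater than".
        System.out.println(subtractingCompare(5, 5));

        // Integer.MIN_VALUE - 1 wraps to Integer.MAX_VALUE, so the smaller
        // value is reported as the larger one.
        System.out.println(subtractingCompare(Integer.MIN_VALUE, 1) > 0);

        // The safe version orders the same pair correctly.
        System.out.println(safeCompare(Integer.MIN_VALUE, 1) < 0);
    }
}
```

Whether a tie result of 0 costs measurable performance in the FPGrowth heap is a separate question that this sketch does not address.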

Re: MapReduce Unit Testing

2010-01-08 Thread Ted Dunning
Mostly I depend on very strong unit tests for the mapper and reducer separately. As far as I have heard, MRUnit is the only game in town for creating simple unit tests for combining the mapper and reducer. On Fri, Jan 8, 2010 at 4:27 AM, zhao zhendong zhaozhend...@gmail.com wrote: Does anybody

Re: What index structure does kNN algorithm use in mahout?

2010-01-08 Thread Ted Dunning
kNN stands for k-nearest neighbor. On Fri, Jan 8, 2010 at 3:34 AM, Grant Ingersoll gsing...@apache.org wrote: Do you mean K-Means? On Jan 7, 2010, at 3:50 AM, xiao yang wrote: Like R-tree. Or it compares each record for every query? Thanks! Xiao -- Grant

Re: What index structure does kNN algorithm use in mahout?

2010-01-08 Thread Grant Ingersoll
On Jan 8, 2010, at 2:17 PM, Ted Dunning wrote: kNN stands for k-nearest neighbor. Yeah, I know. Just wasn't sure on the context of the question. On Fri, Jan 8, 2010 at 3:34 AM, Grant Ingersoll gsing...@apache.org wrote: Do you mean K-Means? On Jan 7, 2010, at 3:50 AM, xiao yang