Re: Regarding Collaborative Filtering.

2011-01-20 Thread Sean Owen
I think it's far from complete or done. I think it would be interesting to take any of the MapReduce-based jobs, set it up, run it, and benchmark/profile it to locate some bottlenecks, then propose optimizations. It is a good way to get familiar with the packages. You might also investigate

Re: Regarding Collaborative Filtering.

2011-01-20 Thread Sebastian Schelter
I'd be very interested in benchmark data for and/or performance increases of RecommenderJob (as well as ItemSimilarityJob and RowSimilarityJob which are used internally), if you feel like working on that. A good starting point to get familiar with the functionality might be Sean's talk from

[jira] Commented: (MAHOUT-293) Add more tunable parameters to PFPGrowth implementation

2011-01-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12984074#action_12984074 ] Sean Owen commented on MAHOUT-293: -- Cleaning house and updating some issues -- Robin,

Re: Regarding Collaborative Filtering.

2011-01-20 Thread Kasun Lakpriya
Thanks Sean and Sebastian. Yes, it's still far away, just finished documentation stuff. I will go through this material (thanks for the links, Sebastian) and try to get familiar with Mahout. After that I can go into your suggestions one by one. On Thu, Jan 20, 2011 at 1:46 PM, Sebastian Schelter

[jira] Updated: (MAHOUT-535) mahout seqdirectory reads only from the local filesystem, even when running over Hadoop

2011-01-20 Thread Shige Takeda (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shige Takeda updated MAHOUT-535: Affects Version/s: (was: 0.4) (was: 0.3) 0.5

[jira] Updated: (MAHOUT-535) mahout seqdirectory reads only from the local filesystem, even when running over Hadoop

2011-01-20 Thread Shige Takeda (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shige Takeda updated MAHOUT-535: Attachment: 0001-added-HDFS-support-to-seqdirectory.patch patch attached. mahout seqdirectory

[jira] Updated: (MAHOUT-535) mahout seqdirectory reads only from the local filesystem, even when running over Hadoop

2011-01-20 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated MAHOUT-535: - Resolution: Fixed Status: Resolved (was: Patch Available) There are some minor issues with the

Build failed in Hudson: Mahout-Quality #574

2011-01-20 Thread Apache Hudson Server
See https://hudson.apache.org/hudson/job/Mahout-Quality/574/ -- [...truncated 8785 lines...] [INFO] Clover: Open Source License registered to Apache. [INFO] Loading coverage database from:

[jira] Updated: (MAHOUT-524) DisplaySpectralKMeans example fails

2011-01-20 Thread Shannon Quinn (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shannon Quinn updated MAHOUT-524: - Attachment: spectralkmeans.png In case anyone was interested, I wrote a quick script that

[jira] Commented: (MAHOUT-535) mahout seqdirectory reads only from the local filesystem, even when running over Hadoop

2011-01-20 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12984537#action_12984537 ] Hudson commented on MAHOUT-535: --- Integrated in Mahout-Quality #575 (See

Hudson build is back to normal : Mahout-Quality #575

2011-01-20 Thread Apache Hudson Server
See https://hudson.apache.org/hudson/job/Mahout-Quality/575/changes