[jira] Commented: (MAHOUT-479) Streamline classification/ clustering data structures

2010-08-13 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898505#action_12898505 ] Ted Dunning commented on MAHOUT-479: Regarding vectorization strategies in general, I k

[jira] Commented: (MAHOUT-479) Streamline classification/ clustering data structures

2010-08-13 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898504#action_12898504 ] Ted Dunning commented on MAHOUT-479: Unification of the resulting models is probably mu

[jira] Updated: (MAHOUT-472) some of the pom.xml referencing old svn repository url

2010-08-13 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated MAHOUT-472: - Status: Resolved (was: Patch Available) Fix Version/s: 0.4 Resolution: Fixed Looks rig

[jira] Updated: (MAHOUT-472) some of the pom.xml referencing old svn repository url

2010-08-13 Thread Joe Prasanna Kumar (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe Prasanna Kumar updated MAHOUT-472: -- Attachment: MAHOUT-472.patch sorry I forgot to upload the patch earlier... > some of t

[jira] Commented: (MAHOUT-477) SimilarityMatrixEntryKeyPartitioner sometimes produces illegal partition numbers

2010-08-13 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898456#action_12898456 ] Hudson commented on MAHOUT-477: --- Integrated in Mahout-Quality #186 (See [http://hudson.zones

[jira] Commented: (MAHOUT-429) Add timestamp info to DataModel

2010-08-13 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898455#action_12898455 ] Hudson commented on MAHOUT-429: --- Integrated in Mahout-Quality #186 (See [http://hudson.zones

[jira] Commented: (MAHOUT-470) Knock out some checkstyle warnings

2010-08-13 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898451#action_12898451 ] Hudson commented on MAHOUT-470: --- Integrated in Mahout-Quality #186 (See [http://hudson.zones

[jira] Commented: (MAHOUT-462) RecommenderJob should use simple cooccurrence as default similarity measure

2010-08-13 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898454#action_12898454 ] Hudson commented on MAHOUT-462: --- Integrated in Mahout-Quality #186 (See [http://hudson.zones

[jira] Commented: (MAHOUT-463) Boolean Data can not get any recommendation by running RecommnenderJob

2010-08-13 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898452#action_12898452 ] Hudson commented on MAHOUT-463: --- Integrated in Mahout-Quality #186 (See [http://hudson.zones

Hudson build is still unstable: Mahout-Quality #186

2010-08-13 Thread Apache Hudson Server
See

[jira] Updated: (MAHOUT-480) Replace manual precondition checking with Precondition utility class from Guava

2010-08-13 Thread Eugen Paraschiv (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugen Paraschiv updated MAHOUT-480: --- Attachment: MAHOUT-480_v2.patch This patch covers a much wider portion of the system, in acco

[jira] Updated: (MAHOUT-480) Replace manual precondition checking with Precondition utility class from Guava

2010-08-13 Thread Eugen Paraschiv (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugen Paraschiv updated MAHOUT-480: --- Attachment: (was: MAHOUT-480.patch) > Replace manual precondition checking with Precondit

[jira] Issue Comment Edited: (MAHOUT-480) Replace manual precondition checking with Precondition utility class from Guava

2010-08-13 Thread Eugen Paraschiv (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898432#action_12898432 ] Eugen Paraschiv edited comment on MAHOUT-480 at 8/13/10 5:48 PM:

[jira] Commented: (MAHOUT-479) Streamline classification/ clustering data structures

2010-08-13 Thread Drew Farris (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898429#action_12898429 ] Drew Farris commented on MAHOUT-479: Thanks for getting the ball rolling Isabel More d

[jira] Commented: (MAHOUT-478) Do we need normalize SimilarityMatrixEntryKey?

2010-08-13 Thread Sebastian Schelter (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898408#action_12898408 ] Sebastian Schelter commented on MAHOUT-478: --- I'd say so too. > Do we need norma

[jira] Updated: (MAHOUT-477) SimilarityMatrixEntryKeyPartitioner sometimes produces illegal partition numbers

2010-08-13 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated MAHOUT-477: - Status: Resolved (was: Patch Available) Assignee: Sean Owen Fix Version/s: 0.4

[jira] Resolved: (MAHOUT-478) Do we need normalize SimilarityMatrixEntryKey?

2010-08-13 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved MAHOUT-478. -- Fix Version/s: (was: 0.4) Resolution: Not A Problem Am I right we think this is "not a probl

[jira] Commented: (MAHOUT-472) some of the pom.xml referencing old svn repository url

2010-08-13 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898373#action_12898373 ] Sean Owen commented on MAHOUT-472: -- Sounds good -- I don't see a patch attached though? >

[jira] Resolved: (MAHOUT-429) Add timestamp info to DataModel

2010-08-13 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved MAHOUT-429. -- Resolution: Fixed I like this enough that I committed it. > Add timestamp info to DataModel >

[jira] Resolved: (MAHOUT-463) Boolean Data can not get any recommendation by running RecommnenderJob

2010-08-13 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved MAHOUT-463. -- Assignee: Sean Owen Resolution: Fixed Submitted Sebastian's patch as it seemed reasonable and he'

[jira] Commented: (MAHOUT-480) Replace manual precondition checking with Precondition utility class from Guava

2010-08-13 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898349#action_12898349 ] Sean Owen commented on MAHOUT-480: -- Good one, though the thrust of my message on the dev@

[jira] Commented: (MAHOUT-474) Should compress output of Job pairwiseSimilarity and Job asMatrix

2010-08-13 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898346#action_12898346 ] Sean Owen commented on MAHOUT-474: -- I don't doubt that compression is a good idea. But it

[jira] Commented: (MAHOUT-473) add parameter -Dmapred.reduce.tasks when call job RowSimilarityJob in RecommenderJob

2010-08-13 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898343#action_12898343 ] Sean Owen commented on MAHOUT-473: -- I am not sure what you mean. Settings like "-Dmapred.r

[jira] Commented: (MAHOUT-475) Replace Mapper with MultithreadedMapper to run job pairwiseSimilarity

2010-08-13 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898340#action_12898340 ] Sean Owen commented on MAHOUT-475: -- Yes but you should get the same effect by simply runni

[jira] Updated: (MAHOUT-477) SimilarityMatrixEntryKeyPartitioner sometimes produces illegal partition numbers

2010-08-13 Thread Sebastian Schelter (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Schelter updated MAHOUT-477: -- Status: Patch Available (was: Open) I had used a stupid custom Partitioner, simply swi

[jira] Updated: (MAHOUT-477) SimilarityMatrixEntryKeyPartitioner sometimes produces illegal partition numbers

2010-08-13 Thread Sebastian Schelter (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Schelter updated MAHOUT-477: -- Attachment: MAHOUT-477.patch > SimilarityMatrixEntryKeyPartitioner sometimes produces i

Re: question about condition checking in Mahout

2010-08-13 Thread Sean Owen
Sounds good. I'm not worried about different behavior, just about inconsistent implementation of that behavior internally. I think you are likely welcome to be a little aggressive in adding argument checks. If you flag a precondition that shouldn't be restricted, it is easy to discover and may well

Re: BSD Jail (like Solaris Zone)

2010-08-13 Thread Grant Ingersoll
On Aug 13, 2010, at 11:45 AM, Isabel Drost wrote: > On Fri, 13 Aug 2010 Grant Ingersoll wrote: >> Infrastructure is setting up FreeBSD jails for projects, do we want >> one? Typically, in Lucene land, we used them for running nightlies, >> etc. > > Isn't that something we currently do via Huds

Re: BSD Jail (like Solaris Zone)

2010-08-13 Thread Isabel Drost
On Fri, 13 Aug 2010 Grant Ingersoll wrote: > Infrastructure is setting up FreeBSD jails for projects, do we want > one? Typically, in Lucene land, we used them for running nightlies, > etc. Isn't that something we currently do via Hudson? Or could having our own jail mean that we could somehow t

[jira] Commented: (MAHOUT-396) Proposal for Implementing Hidden Markov Model

2010-08-13 Thread Isabel Drost (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898285#action_12898285 ] Isabel Drost commented on MAHOUT-396: - Hmpf - please add a "good" between: "Rest of the

[jira] Commented: (MAHOUT-396) Proposal for Implementing Hidden Markov Model

2010-08-13 Thread Isabel Drost (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898284#action_12898284 ] Isabel Drost commented on MAHOUT-396: - Patch applies cleanly with "-p1" (was generated

RE: BSD Jail (like Solaris Zone)

2010-08-13 Thread Saikat Kanjilal
How does this compare with EC2, for instance, where we could potentially spin up a Mahout instance running on Hadoop for testing, etc.? Has there been any discussion of running Mahout on top of Hadoop using something like EC2? I realize it's an open source project, so I'm not sure if there are funds allo

[jira] Updated: (MAHOUT-240) Parallel version of Perceptron

2010-08-13 Thread Isabel Drost (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Isabel Drost updated MAHOUT-240: Fix Version/s: 0.5 (was: 0.4) > Parallel version of Perceptron > ---

[jira] Updated: (MAHOUT-241) Example for perceptron

2010-08-13 Thread Isabel Drost (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Isabel Drost updated MAHOUT-241: Fix Version/s: 0.5 (was: 0.4) > Example for perceptron > ---

[jira] Updated: (MAHOUT-480) Replace manual precondition checking with Precondition utility class from Guava

2010-08-13 Thread Eugen Paraschiv (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugen Paraschiv updated MAHOUT-480: --- Attachment: MAHOUT-480.patch This patch contains initial work on this issue - it does not mod

[jira] Issue Comment Edited: (MAHOUT-479) Streamline classification/ clustering data structures

2010-08-13 Thread Isabel Drost (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898267#action_12898267 ] Isabel Drost edited comment on MAHOUT-479 at 8/13/10 11:06 AM: --

[jira] Commented: (MAHOUT-479) Streamline classification/ clustering data structures

2010-08-13 Thread Isabel Drost (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898267#action_12898267 ] Isabel Drost commented on MAHOUT-479: - Some thoughts that come to my mind: * Algorithm

[jira] Created: (MAHOUT-480) Replace manual precondition checking with Precondition utility class from Guava

2010-08-13 Thread Eugen Paraschiv (JIRA)
Replace manual precondition checking with Precondition utility class from Guava --- Key: MAHOUT-480 URL: https://issues.apache.org/jira/browse/MAHOUT-480 Project: Mahout

[jira] Updated: (MAHOUT-480) Replace manual precondition checking with Precondition utility class from Guava

2010-08-13 Thread Eugen Paraschiv (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugen Paraschiv updated MAHOUT-480: --- Priority: Minor (was: Major) > Replace manual precondition checking with Precondition utilit

[jira] Created: (MAHOUT-479) Streamline classification/ clustering data structures

2010-08-13 Thread Isabel Drost (JIRA)
Streamline classification/ clustering data structures - Key: MAHOUT-479 URL: https://issues.apache.org/jira/browse/MAHOUT-479 Project: Mahout Issue Type: Improvement Components: C

Re: question about condition checking in Mahout

2010-08-13 Thread Eugen Paraschiv
Sure, makes sense to do this according to the boyscout principle and based on patches. I will start working on such a patch for the code area I'm working with, not for the whole project. As for an inconsistent state for the condition checking logic, it should not be an issue, as the Preconditions c

[jira] Commented: (MAHOUT-478) Do we need normalize SimilarityMatrixEntryKey?

2010-08-13 Thread Sebastian Schelter (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898245#action_12898245 ] Sebastian Schelter commented on MAHOUT-478: --- {quote} But if so ,why not Similarit

[jira] Commented: (MAHOUT-478) Do we need normalize SimilarityMatrixEntryKey?

2010-08-13 Thread Han Hui Wen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898241#action_12898241 ] Han Hui Wen commented on MAHOUT-478: - Sorry,I confused it. But if so ,why not Similar

Re: Contributions to mahout

2010-08-13 Thread Grant Ingersoll
On Aug 13, 2010, at 2:01 AM, Ted Dunning wrote: > >> I am looking for recommendations from the community on the process to go >> about this, should I just start with the Jira tasks and assign myself some >> tasks pertaining to the above areas or start with number 4. >> > > JIRA's tend to be fi

[jira] Commented: (MAHOUT-478) Do we need normalize SimilarityMatrixEntryKey?

2010-08-13 Thread Sebastian Schelter (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898238#action_12898238 ] Sebastian Schelter commented on MAHOUT-478: --- Grouping is done by SimilarityMatrix

[jira] Updated: (MAHOUT-478) Do we need normalize SimilarityMatrixEntryKey?

2010-08-13 Thread Han Hui Wen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Han Hui Wen updated MAHOUT-478: Description: In org.apache.mahout.math.hadoop.similarity.SimilarityMatrixEntryKey {code} public sta

[jira] Updated: (MAHOUT-478) Do we need normalize SimilarityMatrixEntryKey?

2010-08-13 Thread Han Hui Wen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Han Hui Wen updated MAHOUT-478: Summary: Do we need normalize SimilarityMatrixEntryKey? (was: Do we need normo normalize Similari

[jira] Created: (MAHOUT-478) Do we need normo normalize SimilarityMatrixEntryKey?

2010-08-13 Thread Han Hui Wen (JIRA)
Do we need normo normalize SimilarityMatrixEntryKey? Key: MAHOUT-478 URL: https://issues.apache.org/jira/browse/MAHOUT-478 Project: Mahout Issue Type: Question Components: Collab

Re: Contributions to mahout

2010-08-13 Thread Saikat Kanjilal
Thanks for the updates Ted, I'll take a look at some of these topics and pick an area to start with. My apologies for my name not showing up, my name is Saikat Kanjilal. Sent from my iPhone On Aug 12, 2010, at 11:01 PM, Ted Dunning wrote: > On Thu, Aug 12, 2010 at 10:00 PM, Hotmail Email A

[jira] Created: (MAHOUT-477) SimilarityMatrixEntryKeyPartitioner sometimes produces illegal partition numbers

2010-08-13 Thread Sebastian Schelter (JIRA)
SimilarityMatrixEntryKeyPartitioner sometimes produces illegal partition numbers Key: MAHOUT-477 URL: https://issues.apache.org/jira/browse/MAHOUT-477 Project: Mahout

Re: Documentation / Help for Beginners

2010-08-13 Thread Isabel Drost
On Fri, 13 Aug 2010 Joe Kumar wrote: > Once the example steps are cleaned out for the current version > of Mahout, I'll start on each of quickstart/clustering , > quickstart/classifying and so on. Thanks for taking up this work. As the project is moving towards its next release, help with cleanin

BSD Jail (like Solaris Zone)

2010-08-13 Thread Grant Ingersoll
Infrastructure is setting up FreeBSD jails for projects; do we want one? Typically, in Lucene land, we used them for running nightlies, etc. They could also be used for setting up online demos (it would be nice to have a recommendation demo online). -Grant

[jira] Commented: (MAHOUT-475) Replace Mapper with MultithreadedMapper to run job pairwiseSimilarity

2010-08-13 Thread Han Hui Wen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898223#action_12898223 ] Han Hui Wen commented on MAHOUT-475: - About MultithreadedMapper,here is the source: h

Re: Documentation / Help for Beginners

2010-08-13 Thread Drew Farris
Joe, Thanks for getting started on this work. On Fri, Aug 13, 2010 at 8:38 AM, Joe Kumar wrote: > > For wikipedia bayes example, I am assuming that we need to download data > (like how we are doing for Twenty Newsgroup example). can someone plz > reference me the link or the process of getting

[jira] Commented: (MAHOUT-474) Should compress output of Job pairwiseSimilarity and Job asMatrix

2010-08-13 Thread Han Hui Wen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898215#action_12898215 ] Han Hui Wen commented on MAHOUT-474: - I have done test for this, Before using patch,th

[jira] Commented: (MAHOUT-475) Replace Mapper with MultithreadedMapper to run job pairwiseSimilarity

2010-08-13 Thread Han Hui Wen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898210#action_12898210 ] Han Hui Wen commented on MAHOUT-475: - For Job RowSimilarityJob-CooccurrencesMapper-Sim

[jira] Updated: (MAHOUT-475) Replace Mapper with MultithreadedMapper to run job pairwiseSimilarity

2010-08-13 Thread Han Hui Wen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Han Hui Wen updated MAHOUT-475: Attachment: after_patch_20100813.jpg > Replace Mapper with MultithreadedMapper to run job pairwis

[jira] Updated: (MAHOUT-475) Replace Mapper with MultithreadedMapper to run job pairwiseSimilarity

2010-08-13 Thread Han Hui Wen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Han Hui Wen updated MAHOUT-475: Attachment: after_patch_20100813.jpg > Replace Mapper with MultithreadedMapper to run job pairwis

[jira] Updated: (MAHOUT-475) Replace Mapper with MultithreadedMapper to run job pairwiseSimilarity

2010-08-13 Thread Han Hui Wen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Han Hui Wen updated MAHOUT-475: Attachment: (was: after_patch_20100813.jpg) > Replace Mapper with MultithreadedMapper to run

[jira] Updated: (MAHOUT-474) Should compress output of Job pairwiseSimilarity and Job asMatrix

2010-08-13 Thread Han Hui Wen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Han Hui Wen updated MAHOUT-474: Attachment: (was: after_patch_20100813.jpg) > Should compress output of Job pairwiseSimilarity

[jira] Updated: (MAHOUT-474) Should compress output of Job pairwiseSimilarity and Job asMatrix

2010-08-13 Thread Han Hui Wen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Han Hui Wen updated MAHOUT-474: Attachment: after_patch_20100813.jpg > Should compress output of Job pairwiseSimilarity and Job asM

Re: Documentation / Help for Beginners

2010-08-13 Thread Joe Kumar
Thanks Sean. I'll check with you for questions regarding Recommenders. Thanks for the pointer Isabel. I'll probably start off with https://cwiki.apache.org/MAHOUT/quickstart.html and make sure the examples and steps mentioned there work well. For example, the wikipedia bayes example references a

[jira] Issue Comment Edited: (MAHOUT-473) add parameter -Dmapred.reduce.tasks when call job RowSimilarityJob in RecommenderJob

2010-08-13 Thread Han Hui Wen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898193#action_12898193 ] Han Hui Wen edited comment on MAHOUT-473 at 8/13/10 8:34 AM: --

[jira] Commented: (MAHOUT-473) add parameter -Dmapred.reduce.tasks when call job RowSimilarityJob in RecommenderJob

2010-08-13 Thread Han Hui Wen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898193#action_12898193 ] Han Hui Wen commented on MAHOUT-473: - Because RowSimilarityJob run a separated process

[jira] Resolved: (MAHOUT-474) Should compress output of Job pairwiseSimilarity and Job asMatrix

2010-08-13 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved MAHOUT-474. -- Assignee: Sean Owen Fix Version/s: (was: 0.4) Resolution: Not A Problem AbstractJob

[jira] Resolved: (MAHOUT-473) add parameter -Dmapred.reduce.tasks when call job RowSimilarityJob in RecommenderJob

2010-08-13 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved MAHOUT-473. -- Assignee: Sean Owen Fix Version/s: (was: 0.4) Resolution: Not A Problem It is up to

[jira] Updated: (MAHOUT-475) Replace Mapper with MultithreadedMapper to run job pairwiseSimilarity

2010-08-13 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated MAHOUT-475: - Status: Resolved (was: Patch Available) Assignee: Sean Owen Resolution: Not A Problem Peop

[jira] Commented: (MAHOUT-473) add parameter -Dmapred.reduce.tasks when call job RowSimilarityJob in RecommenderJob

2010-08-13 Thread Han Hui Wen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898176#action_12898176 ] Han Hui Wen commented on MAHOUT-473: - added patch https://issues.apache.org/jira/secu

[jira] Updated: (MAHOUT-475) Replace Mapper with MultithreadedMapper to run job pairwiseSimilarity

2010-08-13 Thread Han Hui Wen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Han Hui Wen updated MAHOUT-475: Comment: was deleted (was: Patch : 1) using MultithreadedMapper 2) compress the output 3) add --nu

[jira] Updated: (MAHOUT-475) Replace Mapper with MultithreadedMapper to run job pairwiseSimilarity

2010-08-13 Thread Han Hui Wen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Han Hui Wen updated MAHOUT-475: Status: Patch Available (was: Open) https://issues.apache.org/jira/secure/attachment/12452011/pat

[jira] Updated: (MAHOUT-475) Replace Mapper with MultithreadedMapper to run job pairwiseSimilarity

2010-08-13 Thread Han Hui Wen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Han Hui Wen updated MAHOUT-475: Attachment: patch_985097.txt Patch : 1) using MultithreadedMapper 2) compress the output 3) add --

[jira] Updated: (MAHOUT-475) Replace Mapper with MultithreadedMapper to run job pairwiseSimilarity

2010-08-13 Thread Han Hui Wen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Han Hui Wen updated MAHOUT-475: Summary: Replace Mapper with MultithreadedMapper to run job pairwiseSimilarity (was: Replac

[jira] Commented: (MAHOUT-476) bug when running org.apache.mahout.classifier.bayes.WikipediaDatasetCreatorDriver on hadoop

2010-08-13 Thread leon lee (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898159#action_12898159 ] leon lee commented on MAHOUT-476: - same problem with trunk version. So I changed hadoop ma

Re: mahout hudson cleanup?

2010-08-13 Thread Isabel Drost
On Thu, 12 Aug 2010 Drew Farris wrote: > On Thu, Aug 12, 2010 at 5:43 AM, Isabel Drost > wrote: > > As soon as someone tells me the right URLs to access each Mahout > > module's java doc I am +1 here as well. So far I could only make out > > the documentation for the core module... > > Does Maho

Re: Documentation / Help for Beginners

2010-08-13 Thread Isabel Drost
On Fri, 13 Aug 2010 Joe Kumar wrote: > I am thinking of starting off with 1 classification (probably Naive > Bayes) and create a template for the documentation like > 1. Overview of the Algo > 2. I/P data set (how to prepare and sample data set) > 3. Maybe a sequence diagram explaining how the cod

[jira] Commented: (MAHOUT-476) bug when running org.apache.mahout.classifier.bayes.WikipediaDatasetCreatorDriver on hadoop

2010-08-13 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898116#action_12898116 ] Ted Dunning commented on MAHOUT-476: Have you tried working with the trunk version? Th

[jira] Commented: (MAHOUT-476) bug when running org.apache.mahout.classifier.bayes.WikipediaDatasetCreatorDriver on hadoop

2010-08-13 Thread leon lee (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898113#action_12898113 ] leon lee commented on MAHOUT-476: - similar error happened when running 20news group dataset