Re: Jenkins build is back to normal : Mahout-Quality #2128
Build still seems to be failing (see https://builds.apache.org/job/mahout-nightly/1286/consoleFull), maybe just less frequently. Will have to look into this after the release. Kind regards, Stevo Slavić.
On Sun, Jul 7, 2013 at 11:02 PM, Grant Ingersoll gsing...@apache.org wrote:
On Jul 6, 2013, at 4:38 PM, Stevo Slavić ssla...@gmail.com wrote:
What did the trick (as of r1500216) for last two builds to be successful was serializing unit tests. At least some of them it seems are not designed to run in parallel (they very likely share some state), and they were running in parallel (1.5 per CPU core of Jenkins node on which build is running), causing each other to fail randomly. Now it's all sequential.
So, we undid the parallel builds? Do you have a sense of the ones that were causing problems? -G
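A minimal sketch of the failure mode suspected above: two tests that share state pass when run one after the other but can fail when their steps interleave. The class and method names here are hypothetical and not taken from the Mahout test suite, and the interleaving is replayed by hand (no real threads) so the outcome is reproducible.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration; none of these names come from the Mahout tests. */
class SharedStateSketch {

    /** State shared by every "test", e.g. a static field or a fixed temp path. */
    static final List<String> SHARED = new ArrayList<>();

    /** Test setup: reset the scratch state and write this test's marker. */
    static void setUp(List<String> scratch, String marker) {
        scratch.clear();
        scratch.add(marker);
    }

    /** The test passes only if it still sees exactly what it wrote. */
    static boolean check(List<String> scratch, String marker) {
        return scratch.size() == 1 && scratch.get(0).equals(marker);
    }

    /** Sequential run: test A finishes before test B starts. */
    static boolean sequentialPasses() {
        setUp(SHARED, "A");
        boolean a = check(SHARED, "A");
        setUp(SHARED, "B");
        boolean b = check(SHARED, "B");
        return a && b;
    }

    /** One possible parallel interleaving on shared state, replayed by hand. */
    static boolean interleavedSharedPasses() {
        setUp(SHARED, "A");
        setUp(SHARED, "B"); // B's setup lands between A's setup and A's check
        boolean a = check(SHARED, "A");
        boolean b = check(SHARED, "B");
        return a && b;
    }

    /** The same interleaving, but each test owns its scratch state. */
    static boolean interleavedIsolatedPasses() {
        List<String> scratchA = new ArrayList<>();
        List<String> scratchB = new ArrayList<>();
        setUp(scratchA, "A");
        setUp(scratchB, "B");
        return check(scratchA, "A") && check(scratchB, "B");
    }

    public static void main(String[] args) {
        System.out.println("sequential, shared:    " + sequentialPasses());
        System.out.println("interleaved, shared:   " + interleavedSharedPasses());
        System.out.println("interleaved, isolated: " + interleavedIsolatedPasses());
    }
}
```

Serializing the tests removes the interleaving; giving each test its own state would be the other way out, at the cost of touching the tests themselves.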
[jira] [Created] (MAHOUT-1275) Drop some of the Release Artifact File Types
Grant Ingersoll created MAHOUT-1275: --- Summary: Drop some of the Release Artifact File Types Key: MAHOUT-1275 URL: https://issues.apache.org/jira/browse/MAHOUT-1275 Project: Mahout Issue Type: Task Reporter: Grant Ingersoll Assignee: Grant Ingersoll Priority: Minor Fix For: 0.9 There really is no reason why we need so many release artifacts for the distribution. We run on *NIX machines. Zip and Gzip are standard tools, let's save a few bits, along with Release Manager upload times, and drop the BZ2 format. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Pilot Mentoring Programme with India ICFOSS - Mentor Request Mail
Hi, I am currently pursuing my Master's degree (2nd year) at the Indian Institute of Information Technology and Management-Kerala, India. I am one of the 50 student candidates who participated in the Pilot Mentoring Programme with India ICFOSS under Mr. Luciano Resende (https://issues.apache.org/jira/secure/ViewProfile.jspa?name=luciano%20resende), 21st to 23rd Jun 2013. I am interested in learning more about open source and would like to contribute the main project of my Master's degree to the Apache community. I would like to do my project in Mahout: Implementation of Apriori Algorithm in Mahout. From the list of unassigned mentors I saw that Mr. Dan Filimon (https://issues.apache.org/jira/secure/ViewProfile.jspa?name=dfilimon) is working on Mahout projects. Could either Mr. Dan Filimon or anyone else who is working on Mahout be my mentor, so that I can do my project successfully and contribute it to Apache? -- Thanks and Regards, Reshmi Raji, Master of Science-Information Technology, Indian Institute of Information Technology and Management-Kerala, India.
Re: 0.8 progress
Hi Sebastian, I'm sorry for the entirely noobish questions: where can I download the judging.txt ground truth set? (Netflix is pulling it off everywhere; so far I can only get the legacy trainingSet and qualifying.txt.) And how do I inject the ParallelAlsFactorizationJob into a common recommender class? I was trying to reproduce your result (I own a small cluster), but don't even know where to start. The only related thing I found in mahout-example is a format converter. Thanks a lot if you can give me a hint. - Yours Peng
On 13-07-01 01:24 AM, Sebastian Schelter wrote:
I successfully ran the ALS and cooccurrence-based recommenders on the Netflix dataset on a 26 machine cluster using Hadoop 1.0.4. --sebastian
On 28.06.2013 21:31, Jake Mannix wrote:
I can run LDA on Twitter's cluster, on both reuters and some real data, as well as LR/SGD.
On Fri, Jun 28, 2013 at 11:51 AM, Grant Ingersoll gsing...@apache.org wrote:
We really should set up a VM that we can run a couple of nodes (perhaps at ASF?) on that we can share w/ everyone that makes it easy to test our stuff on Hadoop for the specific version that we ship.
On Jun 28, 2013, at 2:41 PM, Robin Anil robin.a...@gmail.com wrote:
Can someone (if you have time and experience) write a small shim to run all examples one after the other on a cluster and write up instructions on how to do it? Robin Anil | Software Engineer | +1 312 869 2602 | Google Inc.
On Fri, Jun 28, 2013 at 1:11 PM, Sebastian Schelter s...@apache.org wrote:
It's crucial that we retest everything on a real cluster before the release. I will do this for the recommenders code next week. --sebastian
On 28.06.2013 14:03, Grant Ingersoll gsing...@apache.org wrote:
I should have time next week to do the release, if we can get these knocked out. If not next week, the following.
On Jun 28, 2013, at 5:46 AM, Suneel Marthi suneel_mar...@yahoo.com wrote:
1. Could someone look at Mahout-1257? There is a patch that's been submitted but I am not sure if this has been superseded by Sean's against Mahout-1239.
2. Stevo, I am for fixing the findbugs excludes as part of 0.8 release, I see that the number of warnings has gone up over the last few builds.
3. I am more concerned about the cause of the mysterious cosmic rays that randomly fail unit tests (since we have moved to running parallel tests). I see that happening on my local repository too.
From: Stevo Slavić ssla...@gmail.com To: dev@mahout.apache.org Sent: Friday, June 28, 2013 3:21 AM Subject: Re: 0.8 progress
Well done team! Build is unstable, oscillates, IMO regardless of changes made. Judging from logs I suspect that some of the Jenkins nodes are not configured well: /tmp directory security related issues, and file size constraints. Could also be an issue with our tests. Javadoc was reported earlier not to be OK (not all modules in aggregated javadoc), and code quality reports are not working OK, e.g. findbugs doesn't respect excludes - plan to work on this during weekend. Do we want to fix these before or after 0.8 release? Kind regards, Stevo Slavić.
On Fri, Jun 28, 2013 at 12:32 AM, Robin Anil robin.a...@gmail.com wrote:
All Done. Robin Anil | Software Engineer | +1 312 869 2602 | Google Inc.
On Sun, Jun 23, 2013 at 11:36 PM, Robin Anil robin.a...@gmail.com wrote:
I sent the comments. The code is good. But without the matrix/vector input we can't ship it in the release. Hope Yiqun and Da Zhang can make those changes quickly. Robin Anil | Software Engineer | +1 312 869 2602 | Google Inc.
On Sun, Jun 23, 2013 at 8:46 PM, Grant Ingersoll gsing...@apache.org wrote:
I see 1 issue left: MAHOUT-1214. It is assigned to Robin. Any chance we can finish this up this week? -Grant
On Jun 23, 2013, at 9:26 AM, Suneel Marthi suneel_mar...@yahoo.com wrote:
Finally got to finishing up M-833, the changes can be reviewed at https://reviews.apache.org/r/11774/diff/3/.
From: Grant Ingersoll gsing...@apache.org To: dev@mahout.apache.org Sent: Tuesday, June 11, 2013 10:09 AM Subject: Re: 0.8 progress
I pushed M-1030 and M-1233. If we can get M-833 and M-1214 in by Thursday, I can roll an RC on Thursday. -Grant
On Jun 11, 2013, at 8:56 AM, Grant Ingersoll gsing...@apache.org wrote:
Down to 4 issues! I would say what they are, but JIRA is flaking out again. My instinct is that 1030 and 1233 can be pushed. Suneel has been working hard to get M-833 in. Not sure on M-1214, Robin? -G
On Jun 9, 2013, at 6:10 PM, Grant Ingersoll gsing...@apache.org wrote:
On Jun 9, 2013, at 6:02 PM, Grant Ingersoll gsing...@apache.org wrote:
M-1067 -- Dmitriy -- This is an enhancement, should we push?
Looks like this was committed already.
Grant Ingersoll | @gsingers http://www.lucidworks.com
Re: 0.8 progress
Hi Peng, You cannot inject the ParallelALSFactorizationJob into a recommender class. Have a look at factorize-netflix.sh in examples to see how to use it for hold out tests. Best, Sebastian
2013/7/8 Peng Cheng pc...@uowmail.edu.au
Hi Sebastian, I'm sorry for the entirely noobish questions: where can I download the judging.txt ground truth set? (netflix is pulling it off everywhere, so far I can only get the legacy trainingSet and qualifying.txt) and how do I inject the ParallelAlsFactorizationJob into a common recommender class? I was trying to reproduce your result (I own a small cluster), but don't even know where to start. The only related thing i found in mahout-example is a format converter. Thanks a lot if you can give me a hint. - Yours Peng
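For readers unfamiliar with the term, a hold-out test sets aside part of the known ratings as a probe set, trains on the rest, and measures the prediction error on the probe set. The sketch below shows only that idea; it is not what factorize-netflix.sh does. The split rule (every k-th rating goes to the probe set) and the global-mean predictor standing in for a trained model are assumptions made for illustration, and all names are hypothetical.

```java
/** Hypothetical sketch of a hold-out evaluation; not taken from Mahout. */
class HoldOutSketch {

    /** Returns {training, probe}: every k-th rating (index % k == 0) is held out. */
    static double[][] split(double[] ratings, int k) {
        int probeCount = (ratings.length + k - 1) / k;
        double[] probe = new double[probeCount];
        double[] training = new double[ratings.length - probeCount];
        int p = 0;
        int t = 0;
        for (int i = 0; i < ratings.length; i++) {
            if (i % k == 0) {
                probe[p++] = ratings[i];
            } else {
                training[t++] = ratings[i];
            }
        }
        return new double[][] {training, probe};
    }

    /** Arithmetic mean; NaN for an empty array. */
    static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    /** RMSE of a constant prediction against the held-out ratings. */
    static double rmse(double[] probe, double prediction) {
        double sum = 0.0;
        for (double actual : probe) {
            double err = actual - prediction;
            sum += err * err;
        }
        return Math.sqrt(sum / probe.length);
    }

    public static void main(String[] args) {
        double[] ratings = {1, 2, 3, 4, 5, 3};
        double[][] parts = split(ratings, 3);
        // The "model" here is just the training mean (assumption for the sketch).
        double prediction = mean(parts[0]);
        System.out.println("hold-out RMSE: " + rmse(parts[1], prediction));
    }
}
```

With a real factorization, the constant prediction would be replaced by the model's per-user, per-item estimate; the split and the error measure stay the same.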
[jira] [Commented] (MAHOUT-1272) Parallel SGD matrix factorizer for SVDrecommender
[ https://issues.apache.org/jira/browse/MAHOUT-1272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13702175#comment-13702175 ]
Peng Cheng commented on MAHOUT-1272:
Hey Sebastian, Hudson, thank you so much for pushing things that hard. I owe you for this. Testing on the Netflix dataset has run into some trouble, namely, I don't know where to download it. I would greatly appreciate it if anyone could share their judging.txt. In the meantime I'll try more GroupLens data. Since Sebastian has taken over the code, new test cases will only be posted as code snippets.
Parallel SGD matrix factorizer for SVDrecommender
Key: MAHOUT-1272
URL: https://issues.apache.org/jira/browse/MAHOUT-1272
Project: Mahout
Issue Type: New Feature
Components: Collaborative Filtering
Reporter: Peng Cheng
Assignee: Sean Owen
Labels: features, patch, test
Fix For: 0.8
Attachments: GroupLensSVDRecomenderEvaluatorRunner.java, mahout.patch, ParallelSGDFactorizer.java, ParallelSGDFactorizer.java, ParallelSGDFactorizerTest.java, ParallelSGDFactorizerTest.java
Original Estimate: 336h
Remaining Estimate: 336h
A parallel factorizer based on MAHOUT-1089 may achieve better performance on multicore processors. The existing code is single-threaded and may still be outperformed by the default ALS-WR. In addition, its hardcoded online-to-batch conversion prevents it from being used by an online recommender. An online SGD implementation may help build a high-performance online recommender as a replacement for the outdated slope-one. The new factorizer can implement either DSGD (http://www.mpi-inf.mpg.de/~rgemulla/publications/gemulla11dsgd.pdf) or Hogwild! (www.cs.wisc.edu/~brecht/papers/hogwildTR.pdf). Related discussion has been carried on for a while but remains inconclusive: http://web.archiveorange.com/archive/v/z6zxQUSahofuPKEzZkzl
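As background for the issue above, here is a minimal sketch of plain, single-threaded SGD matrix factorization on a toy rating set. It is not the ParallelSGDFactorizer attached to the issue, and it implements neither DSGD nor Hogwild!; it only shows the per-rating update that those schemes distribute across threads or machines. The data, the hyperparameters, and all names are assumptions chosen for illustration.

```java
import java.util.Random;

/** Hypothetical sketch; not the ParallelSGDFactorizer attached to the issue. */
class SgdFactorizationSketch {

    static double dot(double[] a, double[] b) {
        double s = 0.0;
        for (int f = 0; f < a.length; f++) {
            s += a[f] * b[f];
        }
        return s;
    }

    /** Root mean squared error over {user, item, rating} triples. */
    static double rmse(double[][] ratings, double[][] users, double[][] items) {
        double sum = 0.0;
        for (double[] r : ratings) {
            double err = r[2] - dot(users[(int) r[0]], items[(int) r[1]]);
            sum += err * err;
        }
        return Math.sqrt(sum / ratings.length);
    }

    /** Small positive random factors so training does not start at zero. */
    static double[][] randomFactors(int n, int k, Random rnd) {
        double[][] m = new double[n][k];
        for (double[] row : m) {
            for (int f = 0; f < k; f++) {
                row[f] = 0.1 * rnd.nextDouble();
            }
        }
        return m;
    }

    /** One SGD step per rating, with L2 regularization weight lambda. */
    static void train(double[][] ratings, double[][] users, double[][] items,
                      int epochs, double learningRate, double lambda) {
        for (int e = 0; e < epochs; e++) {
            for (double[] r : ratings) {
                double[] u = users[(int) r[0]];
                double[] v = items[(int) r[1]];
                double err = r[2] - dot(u, v);
                for (int f = 0; f < u.length; f++) {
                    double uf = u[f]; // keep old values so both updates use them
                    double vf = v[f];
                    u[f] += learningRate * (err * vf - lambda * uf);
                    v[f] += learningRate * (err * uf - lambda * vf);
                }
            }
        }
    }

    /** Trains on a toy data set; returns {RMSE before, RMSE after}. */
    static double[] demo() {
        double[][] ratings = {
            {0, 0, 5}, {0, 1, 3}, {1, 0, 4}, {1, 2, 1},
            {2, 1, 2}, {2, 2, 5}, {3, 0, 3}, {3, 2, 4}
        };
        Random rnd = new Random(42L); // fixed seed keeps the run reproducible
        double[][] users = randomFactors(4, 2, rnd);
        double[][] items = randomFactors(3, 2, rnd);
        double before = rmse(ratings, users, items);
        train(ratings, users, items, 500, 0.02, 0.01);
        double after = rmse(ratings, users, items);
        return new double[] {before, after};
    }

    public static void main(String[] args) {
        double[] r = demo();
        System.out.println("training RMSE before: " + r[0] + ", after: " + r[1]);
    }
}
```

Each update touches only one user row and one item row, which is why lock-free or block-partitioned parallel variants are attractive: two ratings with different users and items never write to the same factors.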
[jira] [Comment Edited] (MAHOUT-1272) Parallel SGD matrix factorizer for SVDrecommender
[ https://issues.apache.org/jira/browse/MAHOUT-1272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13702175#comment-13702175 ]
Peng Cheng edited comment on MAHOUT-1272 at 7/8/13 6:06 PM:
Hey Sebastian, Hudson, thank you so much for pushing things that hard. I owe you for this. I'll test more GroupLens data. Since Sebastian has taken over the code, new test cases will only be posted as code snippets.
was (Author: peng):
Hey Sebastian, Hudson, thank you so much for pushing things that hard. I owe you for this. Testing on the Netflix dataset has run into some trouble, namely, I don't know where to download it. I would greatly appreciate it if anyone could share their judging.txt. In the meantime I'll try more GroupLens data. Since Sebastian has taken over the code, new test cases will only be posted as code snippets.
Parallel SGD matrix factorizer for SVDrecommender
Key: MAHOUT-1272
URL: https://issues.apache.org/jira/browse/MAHOUT-1272
Project: Mahout
Issue Type: New Feature
Components: Collaborative Filtering
Reporter: Peng Cheng
Assignee: Sean Owen
Labels: features, patch, test
Fix For: 0.8
Attachments: GroupLensSVDRecomenderEvaluatorRunner.java, mahout.patch, ParallelSGDFactorizer.java, ParallelSGDFactorizer.java, ParallelSGDFactorizerTest.java, ParallelSGDFactorizerTest.java
Original Estimate: 336h
Remaining Estimate: 336h
A parallel factorizer based on MAHOUT-1089 may achieve better performance on multicore processors. The existing code is single-threaded and may still be outperformed by the default ALS-WR. In addition, its hardcoded online-to-batch conversion prevents it from being used by an online recommender. An online SGD implementation may help build a high-performance online recommender as a replacement for the outdated slope-one. The new factorizer can implement either DSGD (http://www.mpi-inf.mpg.de/~rgemulla/publications/gemulla11dsgd.pdf) or Hogwild! (www.cs.wisc.edu/~brecht/papers/hogwildTR.pdf). Related discussion has been carried on for a while but remains inconclusive: http://web.archiveorange.com/archive/v/z6zxQUSahofuPKEzZkzl
Build failed in Jenkins: Mahout-Examples-Cluster-Reuters-II #536
See https://builds.apache.org/job/Mahout-Examples-Cluster-Reuters-II/536/changes
Changes:
[gsingers] [maven-release-plugin] prepare for next development iteration
[gsingers] [maven-release-plugin] prepare release mahout-0.8
[ssc] MAHOUT-1272 Parallel SGD matrix factorizer for SVDrecommender
--
[...truncated 2219 lines...]
[INFO] Writing to /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/ObjectLongProcedure.java
[INFO] Writing to /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/ObjectFloatProcedure.java
[INFO] Writing to /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/ObjectDoubleProcedure.java
[INFO] Writing to /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/ByteObjectProcedure.java
[INFO] Writing to /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/CharObjectProcedure.java
[INFO] Writing to /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/IntObjectProcedure.java
[INFO] Writing to /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/ShortObjectProcedure.java
[INFO] Writing to /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/LongObjectProcedure.java
[INFO] Writing to /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/FloatObjectProcedure.java
[INFO] Writing to /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/DoubleObjectProcedure.java
[INFO] Writing to /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/ByteProcedure.java
[INFO] Writing to /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/CharProcedure.java
[INFO] Writing to /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/IntProcedure.java
[INFO] Writing to /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/ShortProcedure.java
[INFO] Writing to /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/LongProcedure.java
[INFO] Writing to /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/FloatProcedure.java
[INFO] Writing to /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/DoubleProcedure.java
[INFO] Writing to /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/ByteByteProcedure.java
[INFO] Writing to /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/ByteCharProcedure.java
[INFO] Writing to /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/ByteIntProcedure.java
[INFO] Writing to /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/ByteShortProcedure.java
[INFO] Writing to
[jira] [Work started] (MAHOUT-1275) Drop some of the Release Artifact File Types
[ https://issues.apache.org/jira/browse/MAHOUT-1275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on MAHOUT-1275 started by Stevo Slavic. Drop some of the Release Artifact File Types Key: MAHOUT-1275 URL: https://issues.apache.org/jira/browse/MAHOUT-1275 Project: Mahout Issue Type: Task Reporter: Grant Ingersoll Assignee: Stevo Slavic Priority: Minor Fix For: 0.9 There really is no reason why we need so many release artifacts for the distribution. We run on *NIX machines. Zip and Gzip are standard tools, let's save a few bits, along with Release Manager upload times, and drop the BZ2 format.
[jira] [Assigned] (MAHOUT-1275) Drop some of the Release Artifact File Types
[ https://issues.apache.org/jira/browse/MAHOUT-1275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stevo Slavic reassigned MAHOUT-1275: Assignee: Stevo Slavic (was: Grant Ingersoll) Drop some of the Release Artifact File Types Key: MAHOUT-1275 URL: https://issues.apache.org/jira/browse/MAHOUT-1275 Project: Mahout Issue Type: Task Reporter: Grant Ingersoll Assignee: Stevo Slavic Priority: Minor Fix For: 0.9 There really is no reason why we need so many release artifacts for the distribution. We run on *NIX machines. Zip and Gzip are standard tools, let's save a few bits, along with Release Manager upload times, and drop the BZ2 format.
[jira] [Updated] (MAHOUT-1193) We may want a BlockSparseMatrix
[ https://issues.apache.org/jira/browse/MAHOUT-1193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saleem Ansari updated MAHOUT-1193: Attachment: MAHOUT-1193-all-tests-pass.patch Patch that fixes all the tests for BlockSparseMatrix against trunk codebase. We may want a BlockSparseMatrix Key: MAHOUT-1193 URL: https://issues.apache.org/jira/browse/MAHOUT-1193 Project: Mahout Issue Type: Bug Reporter: Ted Dunning Fix For: Backlog Attachments: MAHOUT-1193-all-tests-pass.patch, MAHOUT-1193-fix-compile-errors-tests-still-fail.patch, MAHOUT-1193.patch Here is an implementation. Is it good enough to commit? Is it useful? Is it redundant?
[jira] [Commented] (MAHOUT-1193) We may want a BlockSparseMatrix
[ https://issues.apache.org/jira/browse/MAHOUT-1193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13702344#comment-13702344 ]
Saleem Ansari commented on MAHOUT-1193:
Hello Ted, I have fixed the test cases. The central issue was that the class members rows and columns were conflicting with the members of the parent class (AbstractMatrix). Fixing that made all test cases pass except two:
* testClone() -- this failed because of a missing clone() method
* testViewColumnIndexOver() -- this was failing because BlockSparseMatrix has extensible rows
I have added the clone() method and also fixed the remaining test cases in the BlockSparseMatrixTest class. Now all tests are passing. Please have a look at the patch attached in the previous comment: [^MAHOUT-1193-all-tests-pass.patch] Thanks, Saleem
We may want a BlockSparseMatrix
Key: MAHOUT-1193 URL: https://issues.apache.org/jira/browse/MAHOUT-1193 Project: Mahout Issue Type: Bug Reporter: Ted Dunning Fix For: Backlog Attachments: MAHOUT-1193-all-tests-pass.patch, MAHOUT-1193-fix-compile-errors-tests-still-fail.patch, MAHOUT-1193.patch
Here is an implementation. Is it good enough to commit? Is it useful? Is it redundant?
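One way a conflict between subclass and parent members can arise in Java is field hiding: a subclass that redeclares a field gets its own copy, and methods inherited from the parent keep reading the parent's copy. Whether this is exactly what happened in the patch is an assumption; the classes below are hypothetical stand-ins and not Mahout's AbstractMatrix or BlockSparseMatrix.

```java
/** Hypothetical illustration of Java field hiding; not Mahout code. */
class FieldHidingSketch {

    /** Stand-in for a parent class that owns the matrix dimensions. */
    static class BaseMatrix {
        protected int rows;
        protected int columns;

        BaseMatrix(int rows, int columns) {
            this.rows = rows;
            this.columns = columns;
        }

        /** Reads BaseMatrix.rows, whatever a subclass may declare. */
        int rowSize() {
            return rows;
        }
    }

    /** Redeclares 'rows', which hides the inherited field instead of reusing it. */
    static class HidingMatrix extends BaseMatrix {
        private int rows; // separate copy; BaseMatrix.rows stays at 0

        HidingMatrix(int columns) {
            super(0, columns);
        }

        void addRow() {
            rows++; // updates only the subclass copy
        }
    }

    /** Uses the inherited field, so parent methods see the growth. */
    static class FixedMatrix extends BaseMatrix {
        FixedMatrix(int columns) {
            super(0, columns);
        }

        void addRow() {
            rows++; // updates BaseMatrix.rows
        }
    }

    public static void main(String[] args) {
        HidingMatrix hiding = new HidingMatrix(3);
        hiding.addRow();
        hiding.addRow();
        FixedMatrix fixed = new FixedMatrix(3);
        fixed.addRow();
        fixed.addRow();
        System.out.println("hiding.rowSize() = " + hiding.rowSize());
        System.out.println("fixed.rowSize()  = " + fixed.rowSize());
    }
}
```

Here the hiding subclass reports 0 rows after two additions while the fixed one reports 2, which is the kind of mismatch that makes inherited test cases fail for a matrix with extensible rows.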
[jira] [Commented] (MAHOUT-1275) Drop some of the Release Artifact File Types
[ https://issues.apache.org/jira/browse/MAHOUT-1275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13702370#comment-13702370 ] Grant Ingersoll commented on MAHOUT-1275: Stevo, just FYI, please don't commit anything right now, as we are under code freeze until 0.8 is out (unless you know how to deal w/ this in Maven release plugin). Drop some of the Release Artifact File Types Key: MAHOUT-1275 URL: https://issues.apache.org/jira/browse/MAHOUT-1275 Project: Mahout Issue Type: Task Reporter: Grant Ingersoll Assignee: Stevo Slavic Priority: Minor Fix For: 0.9 There really is no reason why we need so many release artifacts for the distribution. We run on *NIX machines. Zip and Gzip are standard tools, let's save a few bits, along with Release Manager upload times, and drop the BZ2 format.
[jira] [Commented] (MAHOUT-1275) Drop some of the Release Artifact File Types
[ https://issues.apache.org/jira/browse/MAHOUT-1275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13702441#comment-13702441 ] Hudson commented on MAHOUT-1275: Integrated in Mahout-Quality #2137 (See [https://builds.apache.org/job/Mahout-Quality/2137/]) MAHOUT-1275 Dropped bz2 distribution format for source and binaries (Revision 1500898) Result = SUCCESS sslavic : Files : * /mahout/trunk/CHANGELOG * /mahout/trunk/distribution/src/main/assembly/bin.xml * /mahout/trunk/distribution/src/main/assembly/src.xml Drop some of the Release Artifact File Types Key: MAHOUT-1275 URL: https://issues.apache.org/jira/browse/MAHOUT-1275 Project: Mahout Issue Type: Task Reporter: Grant Ingersoll Assignee: Stevo Slavic Priority: Minor Fix For: 0.9 There really is no reason why we need so many release artifacts for the distribution. We run on *NIX machines. Zip and Gzip are standard tools, let's save a few bits, along with Release Manager upload times, and drop the BZ2 format.
[jira] [Commented] (MAHOUT-1275) Drop some of the Release Artifact File Types
[ https://issues.apache.org/jira/browse/MAHOUT-1275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13702458#comment-13702458 ] Grant Ingersoll commented on MAHOUT-1275: [~sslavic] Please revert this. We are under code freeze right now on trunk.
Jenkins build is back to normal : mahout-nightly » Mahout Core #1287
See https://builds.apache.org/job/mahout-nightly/org.apache.mahout$mahout-core/1287/changes
Build failed in Jenkins: mahout-nightly » Mahout Integration #1287
See https://builds.apache.org/job/mahout-nightly/org.apache.mahout$mahout-integration/1287/changes
Changes:
[gsingers] [maven-release-plugin] prepare for next development iteration
[gsingers] [maven-release-plugin] prepare release mahout-0.8
--
[INFO]
[INFO]
[INFO] Building Mahout Integration 0.9-SNAPSHOT
[INFO]
[INFO]
[INFO] Deleting https://builds.apache.org/job/mahout-nightly/org.apache.mahout$mahout-integration/ws/target
[INFO] --- maven-clean-plugin:2.4.1:clean (default-clean) @ mahout-integration ---
[INFO]
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ mahout-integration ---
[INFO] Copying 0 resource
[INFO]
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ mahout-integration ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 131 source files to https://builds.apache.org/job/mahout-nightly/org.apache.mahout$mahout-integration/ws/target/classes
[WARNING] Note: Some input files use or override a deprecated API.
[WARNING] Note: Recompile with -Xlint:deprecation for details.
[WARNING] Note: https://builds.apache.org/job/mahout-nightly/org.apache.mahout$mahout-integration/ws/src/main/java/org/apache/mahout/cf/taste/impl/model/mongodb/MongoDBDataModel.java uses unchecked or unsafe operations.
[WARNING] Note: Recompile with -Xlint:unchecked for details.
[INFO]
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ mahout-integration ---
[INFO] Copying 10 resources
[INFO]
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ mahout-integration ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 39 source files to https://builds.apache.org/job/mahout-nightly/org.apache.mahout$mahout-integration/ws/target/test-classes
[WARNING] Note: Some input files use or override a deprecated API.
[WARNING] Note: Recompile with -Xlint:deprecation for details.
[INFO]
[INFO] --- maven-surefire-plugin:2.15:test (default-test) @ mahout-integration ---
[INFO] Surefire report directory: https://builds.apache.org/job/mahout-nightly/org.apache.mahout$mahout-integration/ws/target/surefire-reports
[INFO] parallel='classes', perCoreThreadCount=false, threadCount=1, useUnlimitedThreads=false
--- T E S T S ---
--- T E S T S ---
Running org.apache.mahout.cf.taste.impl.similarity.jdbc.MySQLJDBCInMemoryItemSimilarityTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.179 sec - in org.apache.mahout.cf.taste.impl.similarity.jdbc.MySQLJDBCInMemoryItemSimilarityTest
Running org.apache.mahout.clustering.TestClusterEvaluator
Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 13.962 sec - in org.apache.mahout.clustering.TestClusterEvaluator
Running org.apache.mahout.clustering.TestClusterDumper
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.839 sec - in org.apache.mahout.clustering.TestClusterDumper
Running org.apache.mahout.clustering.dirichlet.TestL1ModelClustering
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.948 sec - in org.apache.mahout.clustering.dirichlet.TestL1ModelClustering
Running org.apache.mahout.clustering.cdbw.TestCDbwEvaluator
Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.028 sec - in org.apache.mahout.clustering.cdbw.TestCDbwEvaluator
Running org.apache.mahout.utils.TestConcatenateVectorsJob
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.42 sec - in org.apache.mahout.utils.TestConcatenateVectorsJob
Running org.apache.mahout.utils.vectors.lucene.LuceneIterableTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.519 sec - in org.apache.mahout.utils.vectors.lucene.LuceneIterableTest
Running org.apache.mahout.utils.vectors.lucene.DriverTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.978 sec - in org.apache.mahout.utils.vectors.lucene.DriverTest
Running org.apache.mahout.utils.vectors.lucene.CachedTermInfoTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.376 sec - in org.apache.mahout.utils.vectors.lucene.CachedTermInfoTest
Running org.apache.mahout.utils.vectors.csv.CSVVectorIteratorTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.156 sec - in org.apache.mahout.utils.vectors.csv.CSVVectorIteratorTest
Running org.apache.mahout.utils.vectors.VectorHelperTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time
Build failed in Jenkins: mahout-nightly #1287
See https://builds.apache.org/job/mahout-nightly/1287/changes
Changes:
[sslavic] MAHOUT-1275 Dropped bz2 distribution format for source and binaries
[gsingers] [maven-release-plugin] prepare for next development iteration
[gsingers] [maven-release-plugin] prepare release mahout-0.8
[ssc] MAHOUT-1272 Parallel SGD matrix factorizer for SVDrecommender
--
[...truncated 1728 lines...]
[INFO]
[INFO] --- maven-jar-plugin:2.4:test-jar (default) @ mahout-core ---
[INFO] Building jar: https://builds.apache.org/job/mahout-nightly/ws/trunk/core/target/mahout-core-0.9-SNAPSHOT-tests.jar
[WARNING] Artifact org.apache.mahout:mahout-core:test-jar:tests:0.9-SNAPSHOT already attached to project, ignoring duplicate
[INFO]
[INFO] Reading assembly descriptor: src/main/assembly/job.xml
[INFO] --- maven-assembly-plugin:2.4:single (job) @ mahout-core ---
[INFO] Building jar: https://builds.apache.org/job/mahout-nightly/ws/trunk/core/target/mahout-core-0.9-SNAPSHOT-job.jar
[WARNING] Artifact org.apache.mahout:mahout-core:jar:job:0.9-SNAPSHOT already attached to project, ignoring duplicate
[INFO]
[INFO] --- maven-source-plugin:2.2.1:jar-no-fork (attach-sources) @ mahout-core ---
[WARNING] Artifact org.apache.mahout:mahout-core:java-source:sources:0.9-SNAPSHOT already attached to project, ignoring duplicate
[INFO]
[INFO] Installing https://builds.apache.org/job/mahout-nightly/ws/trunk/core/target/mahout-core-0.9-SNAPSHOT.jar to /home/jenkins/jenkins-slave/maven-repositories/0/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-SNAPSHOT.jar
[INFO] --- maven-install-plugin:2.4:install (default-install) @ mahout-core ---
[INFO] Installing https://builds.apache.org/job/mahout-nightly/ws/trunk/core/pom.xml to /home/jenkins/jenkins-slave/maven-repositories/0/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-SNAPSHOT.pom
[INFO] Installing https://builds.apache.org/job/mahout-nightly/ws/trunk/core/target/mahout-core-0.9-SNAPSHOT-tests.jar to /home/jenkins/jenkins-slave/maven-repositories/0/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-SNAPSHOT-tests.jar
[INFO] Installing https://builds.apache.org/job/mahout-nightly/ws/trunk/core/target/mahout-core-0.9-SNAPSHOT-job.jar to /home/jenkins/jenkins-slave/maven-repositories/0/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-SNAPSHOT-job.jar
[INFO] Installing https://builds.apache.org/job/mahout-nightly/ws/trunk/core/target/mahout-core-0.9-SNAPSHOT-sources.jar to /home/jenkins/jenkins-slave/maven-repositories/0/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-SNAPSHOT-sources.jar
[INFO]
[INFO] --- maven-deploy-plugin:2.5:deploy (default-deploy) @ mahout-core ---
Downloading: https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/maven-metadata.xml
Uploading: https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-20130708.234336-1.jar
Uploaded: https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-20130708.234336-1.jar (1605 KB at 7391.8 KB/sec)
Uploading: https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-20130708.234336-1.pom
Uploaded: https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-20130708.234336-1.pom (7 KB at 96.6 KB/sec)
Downloading: https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/maven-metadata.xml
Downloaded: https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/maven-metadata.xml (344 B at 2.4 KB/sec)
Uploading: https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/maven-metadata.xml
Uploaded: https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/maven-metadata.xml (772 B at 13.2 KB/sec)
Uploading: https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/maven-metadata.xml
Uploaded: https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/maven-metadata.xml (382 B at 4.8 KB/sec)
Uploading: https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-20130708.234336-1-tests.jar
Uploaded: https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-20130708.234336-1-tests.jar (2446 KB at 9264.2 KB/sec)
Uploading: https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/maven-metadata.xml
Uploaded: https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/maven-metadata.xml (982 B at
Re: (Bi-)Weekly/Monthly Dev Sessions
Hmm, seems like that old link doesn't work. Here's a new one: https://plus.google.com/hangouts/_/899b63ca1b3864c749886348cdddfcd80d00bb0b?hl=en -Grant
On Jul 7, 2013, at 5:24 PM, Grant Ingersoll gsing...@apache.org wrote: How about tomorrow (Monday) night at 8:30 pm EDT? Anyone who wants to join can browse to https://plus.google.com/hangouts/_/1aa32da8d1f9b1669cf6b5ec8bce123d12aec409?hl=en If for some reason that doesn't work, ping me on IRC (gsingers) in the #mahout channel on Freenode. Agenda: 0.8 Release Testing -Grant
On Jun 25, 2013, at 6:17 PM, Suneel Marthi suneel_mar...@yahoo.com wrote: Is today's Hangout happening?
On Wed, Jun 12, 2013 at 4:26 AM, Grant Ingersoll gsing...@apache.org wrote: Hi, One of the things we kicked around at Buzzwords was having a weekly/bi-weekly/monthly dev session via Google Hangout (Drill does this with good success, I believe). Since we are so spread out, I thought I would throw out a Doodle (a scheduling tool, for those unfamiliar) to see what times work best for the majority of people interested in such a thing. Anyone is free to participate, but this is not a Q&A session; it is instead focused on writing code, fixing bugs, triaging JIRA, releasing, etc. If you are interested, please fill out http://doodle.com/gatxxkm7f25fq5y8 (note, all times are Eastern Time Zone since I did the poll!). I just grabbed a sampling of hours throughout the day. I also picked 1 week as being representative of this being on a repeating schedule. If none of the times work for you, but you are still interested, please respond here. I would imagine we would meet for 1-2 hours. Also, please reply with the frequency at which you would like to meet: [] Weekly [] Bi-weekly (every 2 weeks) [] Monthly My vote is every two weeks. -Grant
-- Thanks, Pradeep
Grant Ingersoll | @gsingers http://www.lucidworks.com
Re: Mahout vectors/matrices/solvers on spark
Does anybody know how good (or bad) our performance is on matrix transpose? How long will it take to transpose a matrix with 10M non-zeros with Mahout (if I wanted to set up a fully distributed but single-node MR cluster)? I'm trying to figure out whether the numbers I see with Bagel-based Mahout matrix transposition are any good.
Re: (Bi-)Weekly/Monthly Dev Sessions
I'm getting an error when I build after doing svn up:
$ mvn package
[INFO] Scanning for projects...
[ERROR] The build could not read 1 project -> [Help 1]
[ERROR]
[ERROR] The project (/home/akm/mahout/pom.xml) has 1 error
[ERROR] Non-readable POM /home/akm/mahout/pom.xml: no more data available - expected end tag </project> to close start tag <project> from line 2, parser stopped on END_TAG seen ...</reporting>\n</project>\n... @1030:1
But there's a </project> tag at the end of that.
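One way to narrow down a "Non-readable POM" error like the one reported above is to run the file through a strict XML parser and print where parsing stops. The following is only a minimal sketch using Python's standard library; the helper name `check_well_formed` and the two miniature POM strings are made up for illustration and are not the real Mahout pom.xml, and the position a different parser reports may not match Maven's.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return None if xml_text parses, else (line, column, message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, column = err.position  # position where the parser gave up
        return line, column, str(err)
    return None


# Hypothetical miniature POMs, not the real Mahout pom.xml.
good_pom = "<project>\n  <reporting></reporting>\n</project>\n"
# <build> is never closed, so the parser fails when it reaches </project>.
bad_pom = "<project>\n  <build>\n  <reporting></reporting>\n</project>\n"

print(check_well_formed(good_pom))
print(check_well_formed(bad_pom))
```

For a file on disk, the same check could be applied to the contents of pom.xml; an unclosed element earlier in the file would explain why the parser complains even though a closing project tag is present at the end.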
Re: Mahout vectors/matrices/solvers on spark
Transpose of that small a matrix should happen in memory.
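To put a rough number on "that small", here is a back-of-the-envelope sketch. It assumes 4-byte integer indices and 8-byte double values per (row, column, value) triple and ignores any per-object overhead, so the real footprint in a JVM would be larger; the function names and the dict-of-rows layout are illustrative only and are not Mahout's API.

```python
from collections import defaultdict


def transpose(rows):
    """Transpose a sparse matrix stored as {row: {col: value}}."""
    result = defaultdict(dict)
    for r, cols in rows.items():
        for c, v in cols.items():
            result[c][r] = v
    return dict(result)


def coo_bytes(nnz, index_bytes=4, value_bytes=8):
    """Rough size of nnz (row, col, value) triples, ignoring object overhead."""
    return nnz * (2 * index_bytes + value_bytes)


# Tiny example matrix: 2 rows, 3 non-zeros.
small = {0: {1: 2.0, 2: 3.0}, 1: {0: 4.0}}
print(transpose(small))

# 10M non-zeros under the stated size assumptions.
print(coo_bytes(10_000_000) / 1e6, "MB")
```

Under those assumptions, 10M non-zeros come to about 160 MB of raw triples, which is consistent with the suggestion that the transpose fits in memory on one machine.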
Re: Mahout vectors/matrices/solvers on spark
Yes, but it is just a test, and I am trying to extrapolate the results I see to a bigger volume, sort of, to get some taste of the programming model's performance. I do get CPU-bound behavior and I hit the Spark cache 100% of the time. So in theory, since I am not having spills and I am not doing sorts, it should be fairly fast.
I have two algorithms. One just sends elementwise messages to the vertex representing the row each element should be in. The other uses the same set of initial messages but also uses Bagel combiners which, the way I understand it, combine elements into partial vectors before shipping them off to the remote vertex partition. The reasoning here is apparently that, since elements are combined, there is less I/O. Well, perhaps not so much in this case, since we are not really doing any sort of information aggregation. On a single-node Spark setup I of course don't have actual I/O, so it should approach the speed of an in-core copy-by-serialization.
What I am seeing is that elementwise messages work almost two times faster, in CPU-bound behavior, than the version with combiners. It would seem the culprit is that VectorWritable serialization and then deserialization of vectorized fragments is considerably slower than serialization of elementwise messages containing only primitive types (target row, index, value), even though the latter is a significantly larger number of objects as well as more data.
Still, I am trying to convince myself that even using combiners should be OK compared to shuffle-and-sort overhead. But I think in reality it still looks a bit slower than I expected. Well, I guess I should not be lazy and should benchmark it against Mahout's MR-based transpose as well as Spark's version of RDD shuffle-and-sort.
Anyway, map-only tasks on Spark distributed matrices are lightning fast, but Bagel serialize/deserialize scatter/gather seems to be much slower than just map-only processing. Perhaps I am doing it wrong somehow.
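The two transpose strategies described in this message can be sketched in plain Python to show how they differ in message count. This is not Bagel or Mahout code: the function names, the dict-of-rows layout, and the two-partition example are all assumptions made for illustration, and the sketch does not model serialization cost, which is the part the message identifies as the likely culprit.

```python
from collections import defaultdict


def elementwise_messages(partition):
    """One (target_row, index, value) message per non-zero element."""
    return [(c, r, v) for r, cols in partition.items() for c, v in cols.items()]


def combine(messages):
    """Combiner step: merge messages bound for the same target row
    into one partial vector before it leaves the partition."""
    partial = defaultdict(dict)
    for target, index, value in messages:
        partial[target][index] = value
    return dict(partial)


def gather(partials_per_partition):
    """Merge the partial vectors received from every partition."""
    out = defaultdict(dict)
    for partials in partials_per_partition:
        for target, vec in partials.items():
            out[target].update(vec)
    return dict(out)


# Hypothetical 2-partition split of a 3x2 sparse matrix, {row: {col: value}}.
partitions = [{0: {0: 1.0, 1: 2.0}, 1: {0: 3.0}}, {2: {1: 4.0}}]

msgs = [elementwise_messages(p) for p in partitions]
combined = [combine(m) for m in msgs]

print(sum(len(m) for m in msgs), "elementwise messages")
print(sum(len(c) for c in combined), "combined partial vectors")
print(gather(combined))
```

On this toy input the combiner reduces four elementwise messages to three partial vectors. Because a transpose does not aggregate values, the saving is only in the number of messages, not in the number of elements shipped, which matches the observation above that combining may not buy much here.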