Re: Jenkins build is back to normal : Mahout-Quality #2128

2013-07-08 Thread Stevo Slavić
Build still seems to be failing (see
https://builds.apache.org/job/mahout-nightly/1286/consoleFull ), maybe just
less frequently.
Will have to look into this, after release.

Kind regards,
Stevo Slavić.


On Sun, Jul 7, 2013 at 11:02 PM, Grant Ingersoll gsing...@apache.org wrote:


 On Jul 6, 2013, at 4:38 PM, Stevo Slavić ssla...@gmail.com wrote:

  What did the trick (as of r1500216) for the last two builds to succeed
  was serializing the unit tests. At least some of them, it seems, are not
  designed to run in parallel (they very likely share some state), and they
  were running in parallel (1.5 per CPU core of the Jenkins node on which the
  build is running), causing each other to fail randomly. Now it's all
  sequential.

 So, we undid the parallel builds?  Do you have a sense of the ones that
 were causing problems?

 -G
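
The failure mode described above (tests that share state pass one after another but fail when run side by side) can be sketched as follows. This is only an illustration under assumptions: the class `SharedStateDemo`, the static list `SCRATCH`, and the two "tests" A and B are hypothetical and do not come from the Mahout test suite. The interleaving is written out by hand rather than produced by real threads, so the outcome is deterministic.

```java
import java.util.ArrayList;
import java.util.List;

public class SharedStateDemo {
    // Hypothetical shared mutable state, standing in for e.g. a common
    // temp directory or a static cache used by several tests.
    static final List<String> SCRATCH = new ArrayList<>();

    // Each "test" is split into setup and check so that the schedule
    // of the two tests can be spelled out explicitly.
    static void setUp(String name) {
        SCRATCH.clear();
        SCRATCH.add(name);
    }

    static boolean check(String name) {
        return SCRATCH.size() == 1 && SCRATCH.get(0).equals(name);
    }

    // Sequential schedule: each test's setup and check run back to back.
    static boolean runSequential() {
        setUp("A");
        boolean a = check("A");
        setUp("B");
        boolean b = check("B");
        return a && b;
    }

    // One possible parallel schedule: both setups run before either check.
    static boolean runInterleaved() {
        setUp("A");
        setUp("B");
        boolean a = check("A"); // fails: B's setup overwrote A's state
        boolean b = check("B");
        return a && b;
    }

    public static void main(String[] args) {
        System.out.println("sequential passes:  " + runSequential());
        System.out.println("interleaved passes: " + runInterleaved());
    }
}
```

With real threads the bad schedule only shows up some of the time, which matches the random failures reported in this thread.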


[jira] [Created] (MAHOUT-1275) Drop some of the Release Artifact File Types

2013-07-08 Thread Grant Ingersoll (JIRA)
Grant Ingersoll created MAHOUT-1275:
---

 Summary: Drop some of the Release Artifact File Types
 Key: MAHOUT-1275
 URL: https://issues.apache.org/jira/browse/MAHOUT-1275
 Project: Mahout
  Issue Type: Task
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Priority: Minor
 Fix For: 0.9


There really is no reason why we need so many release artifacts for the 
distribution.  We run on *NIX machines.  Zip and Gzip are standard tools, let's 
save a few bits, along with Release Manager upload times, and drop the BZ2 
format.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Pilot Mentoring Programme with India ICFOSS - Mentor Request Mail

2013-07-08 Thread Reshmi Raji
Hi,


I am currently pursuing my Master's degree (2nd year) at the Indian
Institute of Information Technology and Management-Kerala, India.

I am one of the 50 student candidates who participated in the Pilot Mentoring
Programme with India ICFOSS under Mr. Luciano Resende
(https://issues.apache.org/jira/secure/ViewProfile.jspa?name=luciano%20resende),
21st to 23rd Jun 2013. I am interested in learning more about Open
Source and would like to contribute the main project of my Master's degree to
the Apache community.


I would like to do my project in Mahout: Implementation of the Apriori
Algorithm in Mahout. From the list of unassigned mentors I saw that Mr. Dan
Filimon (https://issues.apache.org/jira/secure/ViewProfile.jspa?name=dfilimon)
is working on Mahout projects. Could either Mr. Dan Filimon or anyone else
who is working on Mahout be my mentor, so that I can complete my
project successfully and contribute it to Apache?


-- 

Thanks & Regards,

Reshmi Raji,
Master Of Science-Information Technology,
Indian Institute Of Information Technology And Management-Kerala ,India.


Re: 0.8 progress

2013-07-08 Thread Peng Cheng

Hi Sebastian,

I'm sorry for the entirely noobish questions: where can I download the
judging.txt ground truth set? (Netflix is pulling it from everywhere, so
far I can only get the legacy trainingSet and qualifying.txt.)
And how do I inject the ParallelALSFactorizationJob into a common
recommender class?
I was trying to reproduce your result (I own a small cluster), but don't
even know where to start. The only related thing I found in
mahout-example is a format converter.


Thanks a lot if you can give me a hint.

- Yours Peng

On 13-07-01 01:24 AM, Sebastian Schelter wrote:

I successfully ran the ALS and cooccurrence-based recommenders on the
Netflix dataset on a 26 machine cluster using Hadoop 1.0.4.

--sebastian


On 28.06.2013 21:31, Jake Mannix wrote:

I can run LDA on Twitter's cluster, on both reuters and some real data,
as well as LR/SGD.


On Fri, Jun 28, 2013 at 11:51 AM, Grant Ingersoll gsing...@apache.org wrote:


We really should set up a VM on which we can run a couple of nodes (perhaps
at ASF?) and that we can share w/ everyone, making it easy to test our stuff
on Hadoop for the specific version that we ship.

On Jun 28, 2013, at 2:41 PM, Robin Anil robin.a...@gmail.com wrote:


Can someone (if you have time and experience) write a small shim to run
all examples one after the other on a cluster and write up instructions on
how to do it?

Robin Anil | Software Engineer | +1 312 869 2602 | Google Inc.


On Fri, Jun 28, 2013 at 1:11 PM, Sebastian Schelter s...@apache.org wrote:

It's crucial that we retest everything on a real cluster before the release.
I will do this for the recommenders code next week.

--sebastian
On 28.06.2013 14:03, Grant Ingersoll gsing...@apache.org wrote:


I should have time next week to do the release, if we can get these
knocked out.  If not next week, the following.

On Jun 28, 2013, at 5:46 AM, Suneel Marthi suneel_mar...@yahoo.com wrote:


1. Could someone look at Mahout-1257? There is a patch that's been
submitted but I am not sure if this has been superseded by Sean's against
Mahout-1239.

2. Stevo, I am for fixing the findbugs excludes as part of 0.8 release,
I see that the number of warnings has gone up over the last few builds.

3. I am more concerned about the cause of the mysterious cosmic rays
that randomly fail unit tests (since we have moved to running parallel
tests).  I see that happening on my local repository too.





From: Stevo Slavić ssla...@gmail.com
To: dev@mahout.apache.org
Sent: Friday, June 28, 2013 3:21 AM
Subject: Re: 0.8 progress


Well done team!

Build is unstable, oscillates, IMO regardless of changes made. Judging from
logs I suspect that some of the Jenkins nodes are not configured well, /tmp
directory security related issues, and file size constraints. Could be also
issue with our tests.

Javadoc was reported earlier not to be OK (not all modules in aggregated
javadoc), and code quality reports are not working OK, e.g. findbugs
doesn't respect excludes - plan to work on this during weekend.

Do we want to fix these before or after 0.8 release?

Kind regards,
Stevo Slavić.


On Fri, Jun 28, 2013 at 12:32 AM, Robin Anil robin.a...@gmail.com wrote:

All Done

Robin Anil | Software Engineer | +1 312 869 2602 | Google Inc.


On Sun, Jun 23, 2013 at 11:36 PM, Robin Anil robin.a...@gmail.com wrote:

I sent the comments. The code is good. But without the matrix/vector input
we can't ship it in the release. Hope Yiqun and Da Zhang can make those
changes quickly.


Robin Anil | Software Engineer | +1 312 869 2602 | Google Inc.


On Sun, Jun 23, 2013 at 8:46 PM, Grant Ingersoll gsing...@apache.org wrote:


I see 1 issue left: MAHOUT-1214.  It is assigned to Robin.  Any chance we
can finish this up this week?

-Grant

On Jun 23, 2013, at 9:26 AM, Suneel Marthi suneel_mar...@yahoo.com wrote:


Finally got to finishing up M-833, the changes can be reviewed at
https://reviews.apache.org/r/11774/diff/3/.






From: Grant Ingersoll gsing...@apache.org
To: dev@mahout.apache.org
Sent: Tuesday, June 11, 2013 10:09 AM
Subject: Re: 0.8 progress


I pushed M-1030 and M-1233.  If we can get M-833 and M-1214 in by
Thursday, I can roll an RC on Thursday.

-Grant

On Jun 11, 2013, at 8:56 AM, Grant Ingersoll gsing...@apache.org wrote:

Down to 4 issues!  I would say what they are, but JIRA is flaking out
again.

My instinct is that 1030 and 1233 can be pushed.  Suneel has been
working hard to get M-833 in.  Not sure on M-1214, Robin?

-G

On Jun 9, 2013, at 6:10 PM, Grant Ingersoll gsing...@apache.org wrote:

On Jun 9, 2013, at 6:02 PM, Grant Ingersoll gsing...@apache.org wrote:

M-1067 -- Dmitriy  --  This is an enhancement, should we push?

Looks like this was committed already.





Grant Ingersoll | @gsingers
http://www.lucidworks.com


Grant Ingersoll | 

Re: 0.8 progress

2013-07-08 Thread Sebastian Schelter
Hi Peng,

You cannot inject  the ParallelALSFactorizationJob into a recommender
class. Have a look at factorize-netflix.sh in examples to see how to use it
for hold out tests.

Best,
Sebastian


2013/7/8 Peng Cheng pc...@uowmail.edu.au

 Hi Sebastian,

 I'm sorry for the entirely noobish questions: where can I download the
 judging.txt ground truth set? (Netflix is pulling it from everywhere, so far
 I can only get the legacy trainingSet and qualifying.txt.)
 And how do I inject the ParallelALSFactorizationJob into a common
 recommender class?
 I was trying to reproduce your result (I own a small cluster), but don't
 even know where to start. The only related thing I found in mahout-example
 is a format converter.

 Thanks a lot if you can give me a hint.

 - Yours Peng


 On 13-07-01 01:24 AM, Sebastian Schelter wrote:

 I successfully ran the ALS and cooccurrence-based recommenders on the
 Netflix dataset on a 26 machine cluster using Hadoop 1.0.4.

 --sebastian


 On 28.06.2013 21:31, Jake Mannix wrote:

 I can run LDA on Twitter's cluster, on both reuters and some real data,
 as well as LR/SGD.


 On Fri, Jun 28, 2013 at 11:51 AM, Grant Ingersoll gsing...@apache.org
 wrote:

 We really should set up a VM on which we can run a couple of nodes (perhaps
 at ASF?) and that we can share w/ everyone, making it easy to test our stuff
 on Hadoop for the specific version that we ship.

 On Jun 28, 2013, at 2:41 PM, Robin Anil robin.a...@gmail.com wrote:

 Can someone (if you have time and experience) write a small shim to run
 all examples one after the other on a cluster and write up instructions on
 how to do it?

 Robin Anil | Software Engineer | +1 312 869 2602 | Google Inc.


 On Fri, Jun 28, 2013 at 1:11 PM, Sebastian Schelter s...@apache.org wrote:

 It's crucial that we retest everything on a real cluster before the release.
 I will do this for the recommenders code next week.

 --sebastian
 On 28.06.2013 14:03, Grant Ingersoll gsing...@apache.org wrote:

  I should have time next week to do the release, if we can get these
 knocked out.  If not next week, the following.

 On Jun 28, 2013, at 5:46 AM, Suneel Marthi suneel_mar...@yahoo.com wrote:

 1. Could someone look at Mahout-1257? There is a patch that's been
 submitted but I am not sure if this has been superseded by Sean's against
 Mahout-1239.

 2. Stevo, I am for fixing the findbugs excludes as part of 0.8 release,
 I see that the number of warnings has gone up over the last few builds.

 3. I am more concerned about the cause of the mysterious cosmic rays
 that randomly fail unit tests (since we have moved to running parallel
 tests).  I see that happening on my local repository too.




 From: Stevo Slavić ssla...@gmail.com
 To: dev@mahout.apache.org
 Sent: Friday, June 28, 2013 3:21 AM
 Subject: Re: 0.8 progress


 Well done team!

 Build is unstable, oscillates, IMO regardless of changes made. Judging from
 logs I suspect that some of the Jenkins nodes are not configured well, /tmp
 directory security related issues, and file size constraints. Could be also
 issue with our tests.

 Javadoc was reported earlier not to be OK (not all modules in aggregated
 javadoc), and code quality reports are not working OK, e.g. findbugs
 doesn't respect excludes - plan to work on this during weekend.

 Do we want to fix these before or after 0.8 release?

 Kind regards,
 Stevo Slavić.


 On Fri, Jun 28, 2013 at 12:32 AM, Robin Anil robin.a...@gmail.com wrote:

 All Done

 Robin Anil | Software Engineer | +1 312 869 2602 | Google Inc.


 On Sun, Jun 23, 2013 at 11:36 PM, Robin Anil robin.a...@gmail.com wrote:

 I sent the comments. The code is good. But without the matrix/vector input
 we can't ship it in the release. Hope Yiqun and Da Zhang can make those
 changes quickly.


 Robin Anil | Software Engineer | +1 312 869 2602 | Google Inc.


 On Sun, Jun 23, 2013 at 8:46 PM, Grant Ingersoll gsing...@apache.org wrote:

 I see 1 issue left: MAHOUT-1214.  It is assigned to Robin.  Any chance we
 can finish this up this week?

 -Grant

 On Jun 23, 2013, at 9:26 AM, Suneel Marthi suneel_mar...@yahoo.com wrote:

 Finally got to finishing up M-833, the changes can be reviewed at
 https://reviews.apache.org/r/11774/diff/3/.





 From: Grant Ingersoll gsing...@apache.org
 To: dev@mahout.apache.org
 Sent: Tuesday, June 11, 2013 10:09 AM
 Subject: Re: 0.8 progress


 I pushed M-1030 and M-1233.  If we can get M-833 and M-1214 in by
 Thursday, I can roll an RC on Thursday.

 -Grant

 On Jun 11, 2013, at 8:56 AM, Grant Ingersoll gsing...@apache.org wrote:

 Down to 4 issues!  I would say what they are, but JIRA is flaking out
 again.

 My instinct is that 1030 and 1233 can be pushed.  Suneel has been
 working hard to get M-833 in.  Not sure on M-1214, Robin?

 -G

 On Jun 9, 

[jira] [Commented] (MAHOUT-1272) Parallel SGD matrix factorizer for SVDrecommender

2013-07-08 Thread Peng Cheng (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13702175#comment-13702175
 ] 

Peng Cheng commented on MAHOUT-1272:


Hey Sebastian, Hudson, thank you so much for pushing things that hard. I owe 
you this.
Testing on the Netflix dataset has run into some trouble, namely, I don't know 
where to download it :-(. I would greatly appreciate it if anyone could share 
their judging.txt. In the meantime I'll try more GroupLens data.
Since Sebastian has taken over the code, new test cases will only be posted as 
code snippets.

 Parallel SGD matrix factorizer for SVDrecommender
 -

 Key: MAHOUT-1272
 URL: https://issues.apache.org/jira/browse/MAHOUT-1272
 Project: Mahout
  Issue Type: New Feature
  Components: Collaborative Filtering
Reporter: Peng Cheng
Assignee: Sean Owen
  Labels: features, patch, test
 Fix For: 0.8

 Attachments: GroupLensSVDRecomenderEvaluatorRunner.java, 
 mahout.patch, ParallelSGDFactorizer.java, ParallelSGDFactorizer.java, 
 ParallelSGDFactorizerTest.java, ParallelSGDFactorizerTest.java

   Original Estimate: 336h
  Remaining Estimate: 336h

 A parallel factorizer based on MAHOUT-1089 may achieve better performance on 
 multicore processors.
 The existing code is single-threaded and may still be outperformed by the 
 default ALS-WR.
 In addition, its hardcoded online-to-batch conversion prevents it from being 
 used by an online recommender. An online SGD implementation may help build a 
 high-performance online recommender as a replacement for the outdated 
 slope-one.
 The new factorizer can implement either DSGD 
 (http://www.mpi-inf.mpg.de/~rgemulla/publications/gemulla11dsgd.pdf) or 
 hogwild! (www.cs.wisc.edu/~brecht/papers/hogwildTR.pdf).
 Related discussion has been going on for a while but remains inconclusive:
 http://web.archiveorange.com/archive/v/z6zxQUSahofuPKEzZkzl
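
For readers unfamiliar with the SGD factorization mentioned in the description, here is a minimal single-threaded sketch of the per-rating update that such a factorizer repeats. It is an assumption-laden illustration, not the code of the attached ParallelSGDFactorizer.java: the class `SgdStepSketch`, the learning rate, the regularization value, and the toy rating are all made up. Hogwild! would run updates like `step` from several threads without locking, while DSGD would partition the rating matrix so that blocks can be processed independently; neither is implemented here.

```java
public class SgdStepSketch {
    /** Dot product of two equally sized factor vectors. */
    static double dot(double[] a, double[] b) {
        double s = 0.0;
        for (int k = 0; k < a.length; k++) {
            s += a[k] * b[k];
        }
        return s;
    }

    /**
     * One regularized SGD update for a single observed rating.
     * Both vectors are modified in place; the prediction error
     * measured before the update is returned.
     */
    static double step(double[] user, double[] item, double rating,
                       double lr, double lambda) {
        double err = rating - dot(user, item);
        for (int k = 0; k < user.length; k++) {
            double u = user[k];
            double v = item[k];
            user[k] = u + lr * (err * v - lambda * u);
            item[k] = v + lr * (err * u - lambda * v);
        }
        return err;
    }

    public static void main(String[] args) {
        // Hypothetical toy data: one user, one item, one rating of 4.0.
        double[] user = {0.1, 0.1};
        double[] item = {0.1, 0.1};
        for (int epoch = 0; epoch < 200; epoch++) {
            step(user, item, 4.0, 0.05, 0.01);
        }
        System.out.println("prediction after training: " + dot(user, item));
    }
}
```

Because of the regularization term the prediction settles slightly below the observed rating rather than exactly on it.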



[jira] [Comment Edited] (MAHOUT-1272) Parallel SGD matrix factorizer for SVDrecommender

2013-07-08 Thread Peng Cheng (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13702175#comment-13702175
 ] 

Peng Cheng edited comment on MAHOUT-1272 at 7/8/13 6:06 PM:


Hey Sebastian, Hudson, thank you so much for pushing things that hard. I owe 
you this.
I'll test on more GroupLens data. Since Sebastian has taken over the code, new 
test cases will only be posted as code snippets.

  was (Author: peng):
Hey Sebastian, Hudson, thank you so much for pushing things that hard. I 
owe you this.
Testing on the Netflix dataset has run into some trouble, namely, I don't know 
where to download it :-(. I would greatly appreciate it if anyone could share 
their judging.txt. In the meantime I'll try more GroupLens data.
Since Sebastian has taken over the code, new test cases will only be posted as 
code snippets.
  
 Parallel SGD matrix factorizer for SVDrecommender
 -

 Key: MAHOUT-1272
 URL: https://issues.apache.org/jira/browse/MAHOUT-1272
 Project: Mahout
  Issue Type: New Feature
  Components: Collaborative Filtering
Reporter: Peng Cheng
Assignee: Sean Owen
  Labels: features, patch, test
 Fix For: 0.8

 Attachments: GroupLensSVDRecomenderEvaluatorRunner.java, 
 mahout.patch, ParallelSGDFactorizer.java, ParallelSGDFactorizer.java, 
 ParallelSGDFactorizerTest.java, ParallelSGDFactorizerTest.java

   Original Estimate: 336h
  Remaining Estimate: 336h

 A parallel factorizer based on MAHOUT-1089 may achieve better performance on 
 multicore processors.
 The existing code is single-threaded and may still be outperformed by the 
 default ALS-WR.
 In addition, its hardcoded online-to-batch conversion prevents it from being 
 used by an online recommender. An online SGD implementation may help build a 
 high-performance online recommender as a replacement for the outdated 
 slope-one.
 The new factorizer can implement either DSGD 
 (http://www.mpi-inf.mpg.de/~rgemulla/publications/gemulla11dsgd.pdf) or 
 hogwild! (www.cs.wisc.edu/~brecht/papers/hogwildTR.pdf).
 Related discussion has been going on for a while but remains inconclusive:
 http://web.archiveorange.com/archive/v/z6zxQUSahofuPKEzZkzl



Build failed in Jenkins: Mahout-Examples-Cluster-Reuters-II #536

2013-07-08 Thread Apache Jenkins Server
See 
https://builds.apache.org/job/Mahout-Examples-Cluster-Reuters-II/536/changes

Changes:

[gsingers] [maven-release-plugin] prepare for next development iteration

[gsingers] [maven-release-plugin] prepare release mahout-0.8

[ssc] MAHOUT-1272 Parallel SGD matrix factorizer for SVDrecommender

--
[...truncated 2219 lines...]
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/ObjectLongProcedure.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/ObjectFloatProcedure.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/ObjectDoubleProcedure.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/ByteObjectProcedure.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/CharObjectProcedure.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/IntObjectProcedure.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/ShortObjectProcedure.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/LongObjectProcedure.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/FloatObjectProcedure.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/DoubleObjectProcedure.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/ByteProcedure.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/CharProcedure.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/IntProcedure.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/ShortProcedure.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/LongProcedure.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/FloatProcedure.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/DoubleProcedure.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/ByteByteProcedure.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/ByteCharProcedure.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/ByteIntProcedure.java
[INFO] Writing to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Mahout-Examples-Cluster-Reuters-II/trunk/math/target/generated-sources/mahout/org/apache/mahout/math/function/ByteShortProcedure.java
[INFO] Writing to 

[jira] [Work started] (MAHOUT-1275) Drop some of the Release Artifact File Types

2013-07-08 Thread Stevo Slavic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on MAHOUT-1275 started by Stevo Slavic.

 Drop some of the Release Artifact File Types
 

 Key: MAHOUT-1275
 URL: https://issues.apache.org/jira/browse/MAHOUT-1275
 Project: Mahout
  Issue Type: Task
Reporter: Grant Ingersoll
Assignee: Stevo Slavic
Priority: Minor
 Fix For: 0.9


 There really is no reason why we need so many release artifacts for the 
 distribution.  We run on *NIX machines.  Zip and Gzip are standard tools, 
 let's save a few bits, along with Release Manager upload times, and drop the 
 BZ2 format.



[jira] [Assigned] (MAHOUT-1275) Drop some of the Release Artifact File Types

2013-07-08 Thread Stevo Slavic (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stevo Slavic reassigned MAHOUT-1275:


Assignee: Stevo Slavic  (was: Grant Ingersoll)

 Drop some of the Release Artifact File Types
 

 Key: MAHOUT-1275
 URL: https://issues.apache.org/jira/browse/MAHOUT-1275
 Project: Mahout
  Issue Type: Task
Reporter: Grant Ingersoll
Assignee: Stevo Slavic
Priority: Minor
 Fix For: 0.9


 There really is no reason why we need so many release artifacts for the 
 distribution.  We run on *NIX machines.  Zip and Gzip are standard tools, 
 let's save a few bits, along with Release Manager upload times, and drop the 
 BZ2 format.



[jira] [Updated] (MAHOUT-1193) We may want a BlockSparseMatrix

2013-07-08 Thread Saleem Ansari (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAHOUT-1193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saleem Ansari updated MAHOUT-1193:
--

Attachment: MAHOUT-1193-all-tests-pass.patch

Patch that fixes all the tests for BlockSparseMatrix against trunk codebase.

 We may want a BlockSparseMatrix
 ---

 Key: MAHOUT-1193
 URL: https://issues.apache.org/jira/browse/MAHOUT-1193
 Project: Mahout
  Issue Type: Bug
Reporter: Ted Dunning
 Fix For: Backlog

 Attachments: MAHOUT-1193-all-tests-pass.patch, 
 MAHOUT-1193-fix-compile-errors-tests-still-fail.patch, MAHOUT-1193.patch


 Here is an implementation.
 Is it good enough to commit?
 Is it useful?
 Is it redundant?



[jira] [Commented] (MAHOUT-1193) We may want a BlockSparseMatrix

2013-07-08 Thread Saleem Ansari (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13702344#comment-13702344
 ] 

Saleem Ansari commented on MAHOUT-1193:
---

Hello Ted,

I have fixed the test cases. The central issue was that the class members 
rows and columns conflicted with the parent class members 
(AbstractMatrix).

That fixed all test cases except two:
 * testClone() -- this failed because of a missing clone() method 
 * testViewColumnIndexOver() -- this was failing because BlockSparseMatrix has 
extensible rows

I have added a clone() method and also fixed the remaining test cases in the 
BlockSparseMatrixTest class.

Now all tests are passing. Please have a look at the patch attached in the 
previous comment: [^MAHOUT-1193-all-tests-pass.patch]


Thanks,
Saleem
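
The member conflict described above reads like Java field hiding, where a subclass redeclares a field that its parent already defines. The sketch below illustrates that general mechanism only, on the assumption that this is what happened; `BaseMatrix`, `HidingMatrix`, and `FixedMatrix` are hypothetical stand-ins, and the sketch does not reproduce the attached patch or the real AbstractMatrix.

```java
public class FieldHidingDemo {
    /** Hypothetical stand-in for a base class such as AbstractMatrix. */
    static class BaseMatrix {
        protected int rows;
        protected int columns;

        BaseMatrix(int rows, int columns) {
            this.rows = rows;
            this.columns = columns;
        }

        int rowSize() {
            return rows; // always reads the base-class field
        }
    }

    /** Redeclares 'rows', which hides the inherited field. */
    static class HidingMatrix extends BaseMatrix {
        private int rows; // hides BaseMatrix.rows

        HidingMatrix(int columns) {
            super(0, columns);
        }

        void addRow() {
            rows++; // updates only the subclass copy
        }

        int ownRows() {
            return rows;
        }
    }

    /** Reuses the inherited field, so base-class methods see the change. */
    static class FixedMatrix extends BaseMatrix {
        FixedMatrix(int columns) {
            super(0, columns);
        }

        void addRow() {
            rows++;
        }
    }

    public static void main(String[] args) {
        HidingMatrix hiding = new HidingMatrix(3);
        hiding.addRow();
        System.out.println("hiding: own=" + hiding.ownRows()
                + ", rowSize()=" + hiding.rowSize());

        FixedMatrix fixed = new FixedMatrix(3);
        fixed.addRow();
        System.out.println("fixed: rowSize()=" + fixed.rowSize());
    }
}
```

In the hiding variant, inherited methods keep reporting the stale base-class value, which is the kind of mismatch that generic matrix tests tend to catch.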


 We may want a BlockSparseMatrix
 ---

 Key: MAHOUT-1193
 URL: https://issues.apache.org/jira/browse/MAHOUT-1193
 Project: Mahout
  Issue Type: Bug
Reporter: Ted Dunning
 Fix For: Backlog

 Attachments: MAHOUT-1193-all-tests-pass.patch, 
 MAHOUT-1193-fix-compile-errors-tests-still-fail.patch, MAHOUT-1193.patch


 Here is an implementation.
 Is it good enough to commit?
 Is it useful?
 Is it redundant?



[jira] [Commented] (MAHOUT-1275) Drop some of the Release Artifact File Types

2013-07-08 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13702370#comment-13702370
 ] 

Grant Ingersoll commented on MAHOUT-1275:
-

Stevo, just FYI, please don't commit anything right now, as we are under code 
freeze until 0.8 is out (unless you know how to deal w/ this in Maven release 
plugin)

 Drop some of the Release Artifact File Types
 

 Key: MAHOUT-1275
 URL: https://issues.apache.org/jira/browse/MAHOUT-1275
 Project: Mahout
  Issue Type: Task
Reporter: Grant Ingersoll
Assignee: Stevo Slavic
Priority: Minor
 Fix For: 0.9


 There really is no reason why we need so many release artifacts for the 
 distribution.  We run on *NIX machines.  Zip and Gzip are standard tools, 
 let's save a few bits, along with Release Manager upload times, and drop the 
 BZ2 format.



[jira] [Commented] (MAHOUT-1275) Drop some of the Release Artifact File Types

2013-07-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13702441#comment-13702441
 ] 

Hudson commented on MAHOUT-1275:


Integrated in Mahout-Quality #2137 (See 
[https://builds.apache.org/job/Mahout-Quality/2137/])
MAHOUT-1275 Dropped bz2 distribution format for source and binaries 
(Revision 1500898)

 Result = SUCCESS
sslavic : 
Files : 
* /mahout/trunk/CHANGELOG
* /mahout/trunk/distribution/src/main/assembly/bin.xml
* /mahout/trunk/distribution/src/main/assembly/src.xml


 Drop some of the Release Artifact File Types
 

 Key: MAHOUT-1275
 URL: https://issues.apache.org/jira/browse/MAHOUT-1275
 Project: Mahout
  Issue Type: Task
Reporter: Grant Ingersoll
Assignee: Stevo Slavic
Priority: Minor
 Fix For: 0.9


 There really is no reason why we need so many release artifacts for the 
 distribution.  We run on *NIX machines.  Zip and Gzip are standard tools, 
 let's save a few bits, along with Release Manager upload times, and drop the 
 BZ2 format.



[jira] [Commented] (MAHOUT-1275) Drop some of the Release Artifact File Types

2013-07-08 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13702458#comment-13702458
 ] 

Grant Ingersoll commented on MAHOUT-1275:
-

[~sslavic] Please revert this.  We are under code freeze right now on trunk.

 Drop some of the Release Artifact File Types
 

 Key: MAHOUT-1275
 URL: https://issues.apache.org/jira/browse/MAHOUT-1275
 Project: Mahout
  Issue Type: Task
Reporter: Grant Ingersoll
Assignee: Stevo Slavic
Priority: Minor
 Fix For: 0.9


 There really is no reason why we need so many release artifacts for the 
 distribution.  We run on *NIX machines.  Zip and Gzip are standard tools, 
 let's save a few bits, along with Release Manager upload times, and drop the 
 BZ2 format.



Jenkins build is back to normal : mahout-nightly » Mahout Core #1287

2013-07-08 Thread Apache Jenkins Server
See 
https://builds.apache.org/job/mahout-nightly/org.apache.mahout$mahout-core/1287/changes



Build failed in Jenkins: mahout-nightly » Mahout Integration #1287

2013-07-08 Thread Apache Jenkins Server
See 
https://builds.apache.org/job/mahout-nightly/org.apache.mahout$mahout-integration/1287/changes

Changes:

[gsingers] [maven-release-plugin] prepare for next development iteration

[gsingers] [maven-release-plugin] prepare release mahout-0.8

--
[INFO] 
[INFO] 
[INFO] Building Mahout Integration 0.9-SNAPSHOT
[INFO] 
[INFO] [INFO] Deleting 
https://builds.apache.org/job/mahout-nightly/org.apache.mahout$mahout-integration/ws/target

[INFO] --- maven-clean-plugin:2.4.1:clean (default-clean) @ mahout-integration 
---
[INFO] [INFO] Using 'UTF-8' encoding to copy filtered resources.

[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ 
mahout-integration ---
[INFO] Copying 0 resource
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ 
mahout-integration ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 131 source files to 
https://builds.apache.org/job/mahout-nightly/org.apache.mahout$mahout-integration/ws/target/classes
[WARNING] Note: Some input files use or override a deprecated API.
[WARNING] Note: Recompile with -Xlint:deprecation for details.
[WARNING] Note: 
https://builds.apache.org/job/mahout-nightly/org.apache.mahout$mahout-integration/ws/src/main/java/org/apache/mahout/cf/taste/impl/model/mongodb/MongoDBDataModel.java
 uses unchecked or unsafe operations.
[WARNING] Note: Recompile with -Xlint:unchecked for details.
[INFO] [INFO] Using 'UTF-8' encoding to copy filtered resources.

[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ 
mahout-integration ---
[INFO] Copying 10 resources
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ 
mahout-integration ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 39 source files to 
https://builds.apache.org/job/mahout-nightly/org.apache.mahout$mahout-integration/ws/target/test-classes
[WARNING] Note: Some input files use or override a deprecated API.
[WARNING] Note: Recompile with -Xlint:deprecation for details.
[INFO] 
[INFO] --- maven-surefire-plugin:2.15:test (default-test) @ mahout-integration 
---
[INFO] Surefire report directory: 
https://builds.apache.org/job/mahout-nightly/org.apache.mahout$mahout-integration/ws/target/surefire-reports
[INFO] parallel='classes', perCoreThreadCount=false, threadCount=1, 
useUnlimitedThreads=false

---
 T E S T S
---

Running 
org.apache.mahout.cf.taste.impl.similarity.jdbc.MySQLJDBCInMemoryItemSimilarityTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.179 sec - in 
org.apache.mahout.cf.taste.impl.similarity.jdbc.MySQLJDBCInMemoryItemSimilarityTest
Running org.apache.mahout.clustering.TestClusterEvaluator
Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 13.962 sec - 
in org.apache.mahout.clustering.TestClusterEvaluator
Running org.apache.mahout.clustering.TestClusterDumper
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.839 sec - in 
org.apache.mahout.clustering.TestClusterDumper
Running org.apache.mahout.clustering.dirichlet.TestL1ModelClustering
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.948 sec - in 
org.apache.mahout.clustering.dirichlet.TestL1ModelClustering
Running org.apache.mahout.clustering.cdbw.TestCDbwEvaluator
Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.028 sec - in 
org.apache.mahout.clustering.cdbw.TestCDbwEvaluator
Running org.apache.mahout.utils.TestConcatenateVectorsJob
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.42 sec - in 
org.apache.mahout.utils.TestConcatenateVectorsJob
Running org.apache.mahout.utils.vectors.lucene.LuceneIterableTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.519 sec - in 
org.apache.mahout.utils.vectors.lucene.LuceneIterableTest
Running org.apache.mahout.utils.vectors.lucene.DriverTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.978 sec - in 
org.apache.mahout.utils.vectors.lucene.DriverTest
Running org.apache.mahout.utils.vectors.lucene.CachedTermInfoTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.376 sec - in 
org.apache.mahout.utils.vectors.lucene.CachedTermInfoTest
Running org.apache.mahout.utils.vectors.csv.CSVVectorIteratorTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.156 sec - in 
org.apache.mahout.utils.vectors.csv.CSVVectorIteratorTest
Running org.apache.mahout.utils.vectors.VectorHelperTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time 

Build failed in Jenkins: mahout-nightly #1287

2013-07-08 Thread Apache Jenkins Server
See https://builds.apache.org/job/mahout-nightly/1287/changes

Changes:

[sslavic] MAHOUT-1275 Dropped bz2 distribution format for source and binaries

[gsingers] [maven-release-plugin] prepare for next development iteration

[gsingers] [maven-release-plugin] prepare release mahout-0.8

[ssc] MAHOUT-1272 Parallel SGD matrix factorizer for SVDrecommender

--
[...truncated 1728 lines...]
[INFO] 
[INFO] --- maven-jar-plugin:2.4:test-jar (default) @ mahout-core ---
[INFO] Building jar: 
https://builds.apache.org/job/mahout-nightly/ws/trunk/core/target/mahout-core-0.9-SNAPSHOT-tests.jar
[WARNING] Artifact org.apache.mahout:mahout-core:test-jar:tests:0.9-SNAPSHOT 
already attached to project, ignoring duplicate
[INFO] [INFO] Reading assembly descriptor: src/main/assembly/job.xml

[INFO] --- maven-assembly-plugin:2.4:single (job) @ mahout-core ---
[INFO] Building jar: 
https://builds.apache.org/job/mahout-nightly/ws/trunk/core/target/mahout-core-0.9-SNAPSHOT-job.jar
[WARNING] Artifact org.apache.mahout:mahout-core:jar:job:0.9-SNAPSHOT already 
attached to project, ignoring duplicate
[INFO] 
[INFO] --- maven-source-plugin:2.2.1:jar-no-fork (attach-sources) @ mahout-core 
---
[WARNING] Artifact 
org.apache.mahout:mahout-core:java-source:sources:0.9-SNAPSHOT already attached 
to project, ignoring duplicate
[INFO] [INFO] Installing 
https://builds.apache.org/job/mahout-nightly/ws/trunk/core/target/mahout-core-0.9-SNAPSHOT.jar
 to 
/home/jenkins/jenkins-slave/maven-repositories/0/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-SNAPSHOT.jar

[INFO] --- maven-install-plugin:2.4:install (default-install) @ mahout-core ---
[INFO] Installing 
https://builds.apache.org/job/mahout-nightly/ws/trunk/core/pom.xml to 
/home/jenkins/jenkins-slave/maven-repositories/0/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-SNAPSHOT.pom
[INFO] Installing 
https://builds.apache.org/job/mahout-nightly/ws/trunk/core/target/mahout-core-0.9-SNAPSHOT-tests.jar
 to 
/home/jenkins/jenkins-slave/maven-repositories/0/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-SNAPSHOT-tests.jar
[INFO] Installing 
https://builds.apache.org/job/mahout-nightly/ws/trunk/core/target/mahout-core-0.9-SNAPSHOT-job.jar
 to 
/home/jenkins/jenkins-slave/maven-repositories/0/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-SNAPSHOT-job.jar
[INFO] Installing 
https://builds.apache.org/job/mahout-nightly/ws/trunk/core/target/mahout-core-0.9-SNAPSHOT-sources.jar
 to 
/home/jenkins/jenkins-slave/maven-repositories/0/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-SNAPSHOT-sources.jar
[INFO] 
[INFO] --- maven-deploy-plugin:2.5:deploy (default-deploy) @ mahout-core ---
Downloading: 
https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/maven-metadata.xml
Uploading: 
https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-20130708.234336-1.jar
Uploaded: 
https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-20130708.234336-1.jar
 (1605 KB at 7391.8 KB/sec)
Uploading: 
https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-20130708.234336-1.pom
Uploaded: 
https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-20130708.234336-1.pom
 (7 KB at 96.6 KB/sec)
Downloading: 
https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/maven-metadata.xml
Downloaded: 
https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/maven-metadata.xml
 (344 B at 2.4 KB/sec)
Uploading: 
https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/maven-metadata.xml
Uploaded: 
https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/maven-metadata.xml
 (772 B at 13.2 KB/sec)
Uploading: 
https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/maven-metadata.xml
Uploaded: 
https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/maven-metadata.xml
 (382 B at 4.8 KB/sec)
Uploading: 
https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-20130708.234336-1-tests.jar
Uploaded: 
https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-20130708.234336-1-tests.jar
 (2446 KB at 9264.2 KB/sec)
Uploading: 
https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/maven-metadata.xml
Uploaded: 
https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/maven-metadata.xml
 (982 B at 

Re: (Bi-)Weekly/Monthly Dev Sessions

2013-07-08 Thread Grant Ingersoll
Hmm, seems like that old link doesn't work.  Here's a new one: 
https://plus.google.com/hangouts/_/899b63ca1b3864c749886348cdddfcd80d00bb0b?hl=en

-Grant

On Jul 7, 2013, at 5:24 PM, Grant Ingersoll gsing...@apache.org wrote:

 How about tomorrow (Monday) night at 8:30 pm EDT?  
 
 Anyone who wants to join, can browse to 
 https://plus.google.com/hangouts/_/1aa32da8d1f9b1669cf6b5ec8bce123d12aec409?hl=en
   If for some reason that doesn't work, ping me on IRC (gsingers) in the 
 #mahout channel on Freenode.
 
 
 Agenda:
 
 0.8 Release Testing
 
 -Grant
 
 
 On Jun 25, 2013, at 6:17 PM, Suneel Marthi suneel_mar...@yahoo.com wrote:
 
 Is today's Hangout happening?
 
 
 
 On Wed, Jun 12, 2013 at 4:26 AM, Grant Ingersoll gsing...@apache.org
 wrote:
 
 Hi,
 
 One of the things we kicked around at Buzzwords was having a
 weekly/bi-weekly/monthly dev session via Google hangout (Drill does this
 with good success, I believe).  Since we are so spread out, I thought I
 would throw out a Doodle (scheduling tool for those unfamiliar) to see
 what
 times work best for the majority of people interested in such a thing.
   Anyone is free to participate, but this is not a Q and A session, but is
 instead focused on writing code, fixing bugs, triaging JIRA, releasing,
 etc.
 
 If you are interested, please fill out
 http://doodle.com/gatxxkm7f25fq5y8 (note, all times are Eastern Time Zone
 since I did the poll!)  I just
 grabbed a sampling of hours throughout the day.  I also picked 1 week as
 being representative of this being on a repeating schedule.  If none of
 the
 times work for you, but you are still interested, please respond here.  I
 would imagine we would meet for 1-2 hours.
 
 Also, please reply with the frequency at which you would like to meet:
 
 []  Weekly
 []  Bi-weekly (every 2 weeks)
 []  Monthly
 
 My vote is every two weeks.
 
 -Grant
 
 
 
 
 --
 Thanks,
 Pradeep
 
 
 Grant Ingersoll | @gsingers
 http://www.lucidworks.com
 
 
 
 
 


Grant Ingersoll | @gsingers
http://www.lucidworks.com







Re: Mahout vectors/matrices/solvers on spark

2013-07-08 Thread Dmitriy Lyubimov
Does anybody know how good (or bad) our performance is on matrix transpose?
How long would it take to transpose a matrix with 10M non-zeros with Mahout
(if I wanted to set up a fully distributed but single-node MR cluster)?

I am trying to figure out whether the numbers I see with Bagel-based Mahout
matrix transposition are any good.


Re: (Bi-)Weekly/Monthly Dev Sessions

2013-07-08 Thread Andrew Musselman
I'm getting an error when I build after doing svn up:

$ mvn package
[INFO] Scanning for projects...
[ERROR] The build could not read 1 project - [Help 1]
[ERROR]
[ERROR]   The project  (/home/akm/mahout/pom.xml) has 1 error
[ERROR] Non-readable POM /home/akm/mahout/pom.xml: no more data
available - expected end tag </project> to close start tag <project> from
line 2, parser stopped on END_TAG seen ...</reporting>\n</project>\n...
@1030:1

But there's a </project> tag at the end of that.
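
One way to test whether the file really is well-formed is to run it through a
standard XML parser. The sketch below uses the JDK's built-in JAXP parser
rather than the parser Maven itself uses, so it can only confirm or rule out a
well-formedness problem (for example a truncated file or leftover text after
the closing tag). PomCheck and isWellFormed are hypothetical names, and the
POM path is assumed to be passed as the first argument:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

class PomCheck {
    // Returns true if the text parses as a single well-formed XML document.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException | IOException | ParserConfigurationException e) {
            // Any parse failure is treated as "not well-formed" in this sketch.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: args[0] is the path to pom.xml, encoded as UTF-8.
        String xml = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        System.out.println(isWellFormed(xml) ? "well-formed" : "not well-formed");
    }
}
```

If the file passes a check like this, the problem is more likely in how the
working copy was updated than in the POM markup itself.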


On Mon, Jul 8, 2013 at 5:24 PM, Grant Ingersoll gsing...@apache.org wrote:

 Hmm, seems like that old link doesn't work.  Here's a new one:
 https://plus.google.com/hangouts/_/899b63ca1b3864c749886348cdddfcd80d00bb0b?hl=en

 -Grant

 On Jul 7, 2013, at 5:24 PM, Grant Ingersoll gsing...@apache.org wrote:

  How about tomorrow (Monday) night at 8:30 pm EDT?
 
  Anyone who wants to join, can browse to
 https://plus.google.com/hangouts/_/1aa32da8d1f9b1669cf6b5ec8bce123d12aec409?hl=en
  If for some reason that doesn't work, ping me on IRC (gsingers) in the
 #mahout channel on Freenode.
 
 
  Agenda:
 
  0.8 Release Testing
 
  -Grant
 
 
  On Jun 25, 2013, at 6:17 PM, Suneel Marthi suneel_mar...@yahoo.com
 wrote:
 
  Is today's Hangout happening?
 
 
 
  On Wed, Jun 12, 2013 at 4:26 AM, Grant Ingersoll gsing...@apache.org
  wrote:
 
  Hi,
 
  One of the things we kicked around at Buzzwords was having a
  weekly/bi-weekly/monthly dev session via Google hangout (Drill does
 this
  with good success, I believe).  Since we are so spread out, I thought
 I
  would throw out a Doodle (scheduling tool for those unfamiliar) to see
  what
  times work best for the majority of people interested in such a thing.
Anyone is free to participate, but this is not a Q and A session,
 but is
  instead focused on writing code, fixing bugs, triaging JIRA,
 releasing,
  etc.
 
  If you are interested, please fill out
  http://doodle.com/gatxxkm7f25fq5y8 (note, all times are Eastern Time
 Zone
  since I did the poll!)  I just
  grabbed a sampling of hours throughout the day.  I also picked 1 week
 as
  being representative of this being on a repeating schedule.  If none
 of
  the
  times work for you, but you are still interested, please respond
 here.  I
  would imagine we would meet for 1-2 hours.
 
  Also, please reply with the frequency at which you would like to meet:
 
  []  Weekly
  []  Bi-weekly (every 2 weeks)
  []  Monthly
 
  My vote is every two weeks.
 
  -Grant
 
 
 
 
  --
  Thanks,
  Pradeep
 
  
  Grant Ingersoll | @gsingers
  http://www.lucidworks.com
 
 
 
 
 

 
 Grant Ingersoll | @gsingers
 http://www.lucidworks.com








Re: Mahout vectors/matrices/solvers on spark

2013-07-08 Thread Ted Dunning
Transpose of that small a matrix should happen in memory. 

Sent from my iPhone

On Jul 8, 2013, at 17:26, Dmitriy Lyubimov dlie...@gmail.com wrote:

 Does anybody know how good (or bad) our performance is on matrix transpose?
 How long would it take to transpose a matrix with 10M non-zeros with Mahout
 (if I wanted to set up a fully distributed but single-node MR cluster)?
 
 I am trying to figure out whether the numbers I see with Bagel-based Mahout
 matrix transposition are any good.


Re: Mahout vectors/matrices/solvers on spark

2013-07-08 Thread Dmitriy Lyubimov
Yes, but it is just a test, and I am trying to extrapolate the results that I
see to a bigger volume, sort of, to get some taste of the programming model's
performance.

I do get CPU-bound behavior and I hit the Spark cache 100% of the time. So in
theory, since I am not having spills and I am not doing sorts, it should be
fairly fast.

I have two algorithms. One just sends elementwise messages to the vertex
representing the row each element should be in. The other uses the same set of
initial messages but also uses Bagel combiners which, the way I understand
it, combine elements into partial vectors before shipping them off to the
remote vertex partition. The reasoning here is apparently that since elements
are combined, there is less IO. Well, perhaps not so much in this case,
since we are not really doing any sort of information aggregation. On a
single-node Spark setup I of course don't have actual IO, so it should
approach the speed of in-core copy-by-serialization.

What I am seeing is that elementwise messages work almost two times faster
in CPU-bound behavior than the version with combiners. It would seem the
culprit is that VectorWritable serialization and then deserialization of
vectorized fragments is considerably slower than serialization of
elementwise messages containing only primitive types (target row, index,
value), even though the latter involves a significantly larger number of
objects as well as more data.
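
The data flow of the two strategies described above can be sketched in plain
Java. This is only an illustration under stated assumptions: it does not use
Bagel, Spark, or VectorWritable, it models "shipping" as an in-memory merge
with no serialization, and TransposeSketch, Cell, elementwise, and combined
are hypothetical names:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class TransposeSketch {
    // One non-zero cell of the input matrix.
    static final class Cell {
        final int row;
        final int col;
        final double value;

        Cell(int row, int col, double value) {
            this.row = row;
            this.col = col;
            this.value = value;
        }
    }

    // Strategy 1: one message per non-zero, (target row, index, value),
    // applied directly at the destination row.
    static Map<Integer, Map<Integer, Double>> elementwise(List<List<Cell>> partitions) {
        Map<Integer, Map<Integer, Double>> out = new TreeMap<>();
        for (List<Cell> partition : partitions) {
            for (Cell c : partition) {
                // The source column becomes the target row, and vice versa.
                out.computeIfAbsent(c.col, k -> new TreeMap<>()).put(c.row, c.value);
            }
        }
        return out;
    }

    // Strategy 2: each partition first combines its messages into partial
    // row vectors, then hands over one partial vector per target row.
    static Map<Integer, Map<Integer, Double>> combined(List<List<Cell>> partitions) {
        Map<Integer, Map<Integer, Double>> out = new TreeMap<>();
        for (List<Cell> partition : partitions) {
            Map<Integer, Map<Integer, Double>> partial = new TreeMap<>();
            for (Cell c : partition) {
                partial.computeIfAbsent(c.col, k -> new TreeMap<>()).put(c.row, c.value);
            }
            // Merge each partial vector into the destination row.
            for (Map.Entry<Integer, Map<Integer, Double>> e : partial.entrySet()) {
                out.computeIfAbsent(e.getKey(), k -> new TreeMap<>()).putAll(e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // 2 x 3 matrix [[1, 0, 2], [0, 3, 0]] split across two partitions.
        List<List<Cell>> partitions = Arrays.asList(
            Arrays.asList(new Cell(0, 0, 1.0), new Cell(0, 2, 2.0)),
            Arrays.asList(new Cell(1, 1, 3.0)));
        System.out.println(elementwise(partitions));
        System.out.println(combined(partitions));
    }
}
```

Both methods produce the same transposed rows. The combined variant does an
extra pass to build partial vectors, and since a transpose never aggregates
two values into one, that pass does not reduce the number of elements that
have to move.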

Still, I am trying to convince myself that even using combiners should be OK
compared to shuffle-and-sort overhead. But I think in reality it still looks
a bit slower than I expected. Well, I guess I should not be lazy and should
benchmark it against Mahout's MR-based transpose as well as Spark's version
of RDD shuffle-and-sort.

Anyway, map-only tasks on Spark distributed matrices are lightning fast, but
Bagel serialize/deserialize scatter/gather seems to be much slower than just
map-only processing. Perhaps I am doing it wrong somehow.


On Mon, Jul 8, 2013 at 10:22 PM, Ted Dunning ted.dunn...@gmail.com wrote:

 Transpose of that small a matrix should happen in memory.

 Sent from my iPhone

 On Jul 8, 2013, at 17:26, Dmitriy Lyubimov dlie...@gmail.com wrote:

  Does anybody know how good (or bad) our performance is on matrix transpose?
  How long would it take to transpose a matrix with 10M non-zeros with Mahout
  (if I wanted to set up a fully distributed but single-node MR cluster)?
 
  I am trying to figure out whether the numbers I see with Bagel-based Mahout
  matrix transposition are any good.