Fwd: [VOTE] Apache Mahout 0.13.0 Release Candidate

2017-04-12 Thread Andrew Musselman
FYI

-- Forwarded message -
From: Andrew Musselman 
Date: Wed, Apr 12, 2017 at 8:30 PM
Subject: Fwd: [VOTE] Apache Mahout 0.13.0 Release Candidate
To: d...@maven.apache.org 


Hi Maven team, we are having trouble getting a couple of new modules included
in our release build, and our former "Maven person" can't contribute
anymore at their new job.

Anyone up for some troubleshooting? We release from master, our source repo
is at https://github.com/apache/mahout, and our latest artifacts are at
https://repository.apache.org/content/repositories/orgapachemahout-1042/org/apache/mahout/apache-mahout-distribution/0.13.0/.

Thanks for any help you can offer.

Best
Andrew

-- Forwarded message -
From: Andrew Musselman 
Date: Wed, Apr 12, 2017 at 8:10 PM
Subject: Re: [VOTE] Apache Mahout 0.13.0 Release Candidate
To: 


Does anyone know anyone on the Maven project, or who understands the fine
points of making things like this work?

On Wed, Apr 12, 2017 at 8:06 PM Andrew Musselman 
wrote:

> Okay consider this vote cancelled.
>
> On Wed, Apr 12, 2017 at 8:02 PM Andrew Palumbo  wrote:
>
>> The line that I suggested the other day:
>>
>>
>> mvn -Pmahout-release,apache-release,viennacl,hadoop2 release:prepare
>> release:perform
>>
>>
>> doesn't seem to have activated the viennacl profile
>>
>> 
>> From: Andrew Palumbo 
>> Sent: Wednesday, April 12, 2017 10:58:16 PM
>> To: dev@mahout.apache.org
>> Subject: Re: [VOTE] Apache Mahout 0.13.0 Release Candidate
>>
>>
>> Off the top of my head I don't know if it is a regression. My guess is
>> that they were missing before. I think this may have something to do with
>> activating the -Pviennacl profile at release time.
>>
>> 
>> From: Andrew Musselman 
>> Sent: Wednesday, April 12, 2017 10:48:13 PM
>> To: dev@mahout.apache.org; u...@mahout.apache.org
>> Subject: Re: [VOTE] Apache Mahout 0.13.0 Release Candidate
>>
>> Is it a regression or were they missing before, do you know?
>>
>> On Wed, Apr 12, 2017 at 7:42 PM Andrew Palumbo 
>> wrote:
>>
>> >
>> > It looks like we're missing some jars from the binary distro:
>> >
>> > bin  conf  derby.log  docs  examples  flink  h2o  lib  metastore_db
>> > viennacl  viennacl-omp  LICENSE.txt  NOTICE.txt  README.md
>> >
>> > mahout-examples-0.13.0.jar
>> > mahout-examples-0.13.0-job.jar
>> > mahout-hdfs-0.13.0.jar
>> > mahout-integration-0.13.0.jar
>> > mahout-math-0.13.0.jar
>> > mahout-math-scala_2.10-0.13.0.jar
>> > mahout-mr-0.13.0.jar
>> > mahout-mr-0.13.0-job.jar
>> > mahout-spark_2.10-0.13.0-dependency-reduced.jar
>> > mahout-spark_2.10-0.13.0.jar
>> >
>> > 
>> > From: Andrew Musselman 
>> > Sent: Tuesday, April 11, 2017 12:05 PM
>> > To: u...@mahout.apache.org; dev@mahout.apache.org
>> > Subject: Re: [VOTE] Apache Mahout 0.13.0 Release Candidate
>> >
>> > I've checked hashes and sigs, and run a build with passing tests for
>> > vanilla, viennacl, and viennacl-omp profiles.
>> >
>> > The spark shell runs the SparseSparseDrmTimer.mscala example in the
>> > binary build and all three source profiles; I saw the GPU get exercised
>> > when running the viennacl profile from source, and saw all cores on the
>> > CPU get exercised when running the viennacl-omp profile from source.
>> >
>> > So far I'm +1 (binding).
>> >
>> >
>> >
>> > On Tue, Apr 11, 2017 at 8:55 AM, Andrew Musselman <
>> > andrew.mussel...@gmail.com> wrote:
>> >
>> > > This is the vote for release 0.13.0 of Apache Mahout.
>> > >
>> > > The vote will be open for at least 72 hours and will be closed on
>> > > Friday, April 17th, 2017, or once there are at least 3 PMC +1 binding
>> > > votes (whichever occurs earlier). Please download, test, and vote with
>> > >
>> > > [ ] +1, accept RC as the official 0.13.0 release of Apache Mahout
>> > > [ ] +0, I don't care either way,
>> > > [ ] -1, do not accept RC as the official 0.13.0 release of Apache
>> > > Mahout, because...
>> > >
>> > >
>> > > Maven staging repo:
>> > >
>> > > https://repository.apache.org/content/repositories/orgapachemahout-1042/org/apache/mahout/apache-mahout-distribution/0.13.0/
>> > >
>> > > The git tag to be voted upon is mahout-0.13.0
>> > >
>> >
>>
>
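For reference, the "hashes and sigs" check mentioned above can be reproduced
with commands along these lines; the artifact names here are assumptions
(adjust to the actual files in the staging repo), and the release manager's
public key must already be imported:

    # verify the detached GPG signature against the artifact
    gpg --verify apache-mahout-distribution-0.13.0-src.tar.gz.asc \
        apache-mahout-distribution-0.13.0-src.tar.gz

    # recompute the MD5 and compare it against the published checksum
    md5sum apache-mahout-distribution-0.13.0-src.tar.gz
    cat apache-mahout-distribution-0.13.0-src.tar.gz.md5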


Re: [VOTE] Apache Mahout 0.13.0 Release Candidate

2017-04-12 Thread Andrew Palumbo
+1


From: Andrew Musselman 
Sent: Wednesday, April 12, 2017 11:06:29 PM
To: dev@mahout.apache.org
Subject: Re: [VOTE] Apache Mahout 0.13.0 Release Candidate

Okay consider this vote cancelled.



Re: [VOTE] Apache Mahout 0.13.0 Release Candidate

2017-04-12 Thread Andrew Musselman
Does anyone know anyone on the Maven project, or who understands the fine
points of making things like this work?

On Wed, Apr 12, 2017 at 8:06 PM Andrew Musselman 
wrote:

> Okay consider this vote cancelled.


Re: [VOTE] Apache Mahout 0.13.0 Release Candidate

2017-04-12 Thread Andrew Musselman
Okay consider this vote cancelled.

On Wed, Apr 12, 2017 at 8:02 PM Andrew Palumbo  wrote:

> The line that I suggested the other day:
>
>
> mvn -Pmahout-release,apache-release,viennacl,hadoop2 release:prepare
> release:perform
>
>
> doesn't seem to have activated the viennacl profile


Re: [VOTE] Apache Mahout 0.13.0 Release Candidate

2017-04-12 Thread Andrew Palumbo
The line that I suggested the other day:


mvn -Pmahout-release,apache-release,viennacl,hadoop2 release:prepare 
release:perform


doesn't seem to have activated the viennacl profile
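A likely culprit, assuming stock maven-release-plugin behavior: release:prepare
and release:perform fork a separate Maven invocation, and profiles activated
with -P on the outer command line are not inherited by that fork. They can be
passed through explicitly via the plugin's arguments property, for example:

    mvn -Pmahout-release,apache-release,viennacl,hadoop2 \
        -Darguments="-Pmahout-release,apache-release,viennacl,hadoop2" \
        release:prepare release:perform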




Re: [VOTE] Apache Mahout 0.13.0 Release Candidate

2017-04-12 Thread Andrew Palumbo
Off the top of my head I don't know if it is a regression. My guess is that
they were missing before. I think this may have something to do with
activating the -Pviennacl profile at release time.


From: Andrew Musselman 
Sent: Wednesday, April 12, 2017 10:48:13 PM
To: dev@mahout.apache.org; u...@mahout.apache.org
Subject: Re: [VOTE] Apache Mahout 0.13.0 Release Candidate

Is it a regression or were they missing before, do you know?



Re: [VOTE] Apache Mahout 0.13.0 Release Candidate

2017-04-12 Thread Andrew Palumbo
Oops, sent the last email accidentally without finishing...



It looks like we are missing some jars from the binary distro:


andy@micheal:~/sandbox/apache-mahout-distribution-0.13.0$ ls *.jar

mahout-examples-0.13.0.jar
mahout-examples-0.13.0-job.jar
mahout-hdfs-0.13.0.jar
mahout-integration-0.13.0.jar
mahout-math-0.13.0.jar
mahout-math-scala_2.10-0.13.0.jar
mahout-mr-0.13.0.jar
mahout-mr-0.13.0-job.jar
mahout-spark_2.10-0.13.0-dependency-reduced.jar
mahout-spark_2.10-0.13.0.jar


We are missing mahout-native-viennacl_2.10.jar and
mahout-native-viennacl-omp.jar.


I think that we need to try a different build command.


Andy
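One possibility for a different build command is to take the profile list off
the command line entirely: the maven-release-plugin has a releaseProfiles
parameter naming profiles to activate in the forked release build. A sketch of
the POM configuration, untested against the Mahout build:

    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-release-plugin</artifactId>
      <configuration>
        <!-- profiles to activate in the forked release build -->
        <releaseProfiles>mahout-release,apache-release,viennacl,hadoop2</releaseProfiles>
      </configuration>
    </plugin>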




Re: [VOTE] Apache Mahout 0.13.0 Release Candidate

2017-04-12 Thread Andrew Musselman
Is it a regression or were they missing before, do you know?



RE: Trying to write the KMeans Clustering Using "Apache Mahout Samsara"

2017-04-12 Thread Andrew Palumbo
+1 to creating a branch.





 Original message 
From: Dmitriy Lyubimov 
Date: 04/12/2017 11:25 (GMT-08:00)
To: dev@mahout.apache.org
Subject: Re: Trying to write the KMeans Clustering Using "Apache Mahout Samsara"

Can't say I can read this code well formatted that way...

It would seem to me that the code is not using the broadcast variable and is
instead using a closure variable; that's the only thing I can immediately
see by looking in the middle of it.

It would be better if you created a branch on GitHub for that code, which
would allow for easy checkouts and comments.

-d


Re: Trying to write the KMeans Clustering Using "Apache Mahout Samsara"

2017-04-12 Thread KHATWANI PARTH BHARAT
OK, I will do that.

On Wed, Apr 12, 2017 at 11:55 PM, Dmitriy Lyubimov 
wrote:

> Can't say I can read this code well formatted that way...
>
> It would seem to me that the code is not using the broadcast variable and
> is instead using a closure variable; that's the only thing I can
> immediately see by looking in the middle of it.
>
> It would be better if you created a branch on GitHub for that code, which
> would allow for easy checkouts and comments.
>
> -d

Re: Trying to write the KMeans Clustering Using "Apache Mahout Samsara"

2017-04-12 Thread Dmitriy Lyubimov
Can't say I can read this code well formatted that way...

It would seem to me that the code is not using the broadcast variable and is
instead using a closure variable; that's the only thing I can immediately
see by looking in the middle of it.
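
For illustration, a minimal sketch of the broadcast pattern, using names from
the KMeans code quoted in full further down the thread; this assumes Samsara's
drmBroadcast returns a BCast handle that is dereferenced with .value inside
the closure:

    // broadcast the in-core centriod matrix once from the driver
    val bcCentriods = drmBroadcast(centriods)

    val labeled = dataDrmX.mapBlock() {
      case (keys, block) =>
        // dereference the broadcast on the executor instead of capturing
        // the driver-side `centriods` matrix in the closure
        val c: Matrix = bcCentriods.value
        for (row <- 0 until block.nrow) {
          keys(row) = findTheClosestCentriod(block(row, ::), c)
        }
        keys -> block
    }
    // note that mapBlock is lazy and returns a new DRM; the result has to
    // be kept and used, since the original DRM is not modified in place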

It would be better if you created a branch on GitHub for that code, which
would allow for easy checkouts and comments.
-d


Re: Trying to write the KMeans Clustering Using "Apache Mahout Samsara"

2017-04-12 Thread KHATWANI PARTH BHARAT
@Dmitriy Sir

I have completed the KMeans code as per the algorithm you outlined above.

My code is as follows; it works fine till step number 10.

In step 11 I am assigning the new centroid index to the corresponding row
key of each data point in the matrix. I think I am doing something wrong in
step 11; maybe I am using incorrect syntax.

Can you help me find out what I am doing wrong?


//start of main method

def main(args: Array[String]) {

  //1. initialize the spark and mahout context
  val conf = new SparkConf()
    .setAppName("DRMExample")
    .setMaster(args(0))
    .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
    .set("spark.kryo.registrator",
      "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator")
  implicit val sc = new SparkDistributedContext(new SparkContext(conf))

  //2. read the data file and save it in the rdd
  val lines = sc.textFile(args(1))

  //3. convert the data read in as strings into arrays of double
  val test = lines.map(line => line.split('\t').map(_.toDouble))

  //4. add a column having value 1 to each array of double; this creates
  //   something like (1 | D), which will be used while calculating (1 | D)'
  val augumentedArray = test.map(addCentriodColumn _)

  //5. convert the rdd of array of double into an rdd of DenseVector
  val rdd = augumentedArray.map(dvec(_))

  //6. convert the rdd to a DrmRdd
  val rddMatrixLike: DrmRdd[Int] =
    rdd.zipWithIndex.map { case (v, idx) => (idx.toInt, v) }

  //7. convert the DrmRdd to a CheckpointedDrm[Int]
  val matrix = drmWrap(rddMatrixLike)

  //8. separate out the column of ones created in step 4; we will use it later
  val oneVector = matrix(::, 0 until 1)

  //9. final input data in DrmLike[Int] format
  val dataDrmX = matrix(::, 1 until 4)

  //9. sampling to select the initial centriods
  val centriods = drmSampleKRows(dataDrmX, 2, false)
  centriods.size

  //10. broadcasting the initial centriods
  val broadCastMatrix = drmBroadcast(centriods)

  //11. iterating over the data matrix (in DrmLike[Int] format) to calculate
  //    the initial centriods
  dataDrmX.mapBlock() {
    case (keys, block) =>
      for (row <- 0 until block.nrow) {
        var dataPoint = block(row, ::)

        //12. findTheClosestCentriod finds the closest centriod to the data
        //    point specified by "dataPoint"
        val closesetIndex = findTheClosestCentriod(dataPoint, centriods)

        //13. assigning the closest index to the key
        keys(row) = closesetIndex
      }
      keys -> block
  }

  //14. calculating the (1|D)
  val b = (oneVector cbind dataDrmX)

  //15. aggregating transpose (1|D)'
  val bTranspose = (oneVector cbind dataDrmX).t
  // after step 15, bTranspose will have data in the following format:
  /* (n+1)*K where n = dimension of the data point, K = number of clusters
   * zeroth row will contain the count of points assigned to each cluster
   * (assuming 3d data points)
   */

  val nrows = b.nrow.toInt

  //16. slicing the count vectors out
  val pointCountVectors = drmBroadcast(b(0 until 1, ::).collect(0, ::))
  val vectorSums = b(1 until nrows, ::)

  //17. dividing the data points by the count vector
  vectorSums.mapBlock() {
    case (keys, block) =>
      for (row <- 0 until block.nrow) {
        block(row, ::) /= pointCountVectors
      }
      keys -> block
  }

  //18. separating the count vectors
  val newCentriods = vectorSums.t(::, 1 until centriods.size)

  //19. iterate over the above code till the convergence criteria is met
} //end of main method


// method to find the closest centriod to a data point (vec: Vector in the arguments)
def findTheClosestCentriod(vec: Vector, matrix: Matrix): Int = {
  var index = 0
  var closest = Double.PositiveInfinity
  for (row <- 0 until matrix.nrow) {
    val squaredSum = ssr(vec, matrix(row, ::))
    val tempDist = Math.sqrt(squaredSum)
    if (tempDist < closest) {
      closest = tempDist
      index = row
    }
  }
  index
}

// calculating the sum of squared distances between the points (Vectors)
def ssr(a: Vector, b: Vector): Double = {
  (a - b) ^= 2 sum
}

// method used to create (1|D)
def addCentriodColumn(arg: Array[Double]): Array[Double] = {
  val newArr = new Array[Double](arg.length + 1)
  newArr(0) = 1.0
  for (i <- 0 until arg.size) {
    newArr(i + 1) = arg(i)
  }
  newArr
}


Thanks & Regards
Parth Khatwani



On Mon, Apr 3, 2017 at 7:37 PM, KHATWANI PARTH BHARAT <
h2016...@pilani.bits-pilani.ac.in> wrote:

>
> -- Forwarded message --
> From: Dmitriy Lyubimov 
> Date: Fri, Mar 31, 2017 at 11:34 PM
> Subject: Re: Trying to write the KMeans Clustering Using "Apache Mahout
> Samsara"
> To: "dev@mahout.apache.org" 
>
>
> ps1 this assumes row-wise construction of A based on training set of m
> n-dimensional points.
> ps2 since we are doing multiple passes over A it may make sense to make
> sure it is