Re: Marketing

2017-03-24 Thread Ted Dunning
On Fri, Mar 24, 2017 at 8:27 AM, Pat Ferrel  wrote:

> maybe we should drop the name Mahout altogether.


I have been told that there is a cool secondary interpretation of Mahout as
well.

I think that the Hebrew word is pronounced roughly like Mahout.

מַהוּת

The cool thing is that this word means "essence" or possibly "truth". So
regardless of the guy riding the elephant, Mahout still has something to be
said for it.

(I have no Hebrew, btw)
(real speakers may want to comment here)


Re: [VOTE] Apache Mahout 0.13.0 Release Candidate

2017-03-24 Thread Andrew Musselman
If no one objects I'm okay with closing on Sunday.

On Fri, Mar 24, 2017 at 4:53 PM, Andrew Palumbo  wrote:

> Andrew,
>
> Could we push the vote another 72hrs?  I just got off of an all nighter at
> the office and will not be able to test the RC tonight.
>
>
> Pat could you please Continue with your integration testing if you can?
>
>
> Andy
>
> From: Pat Ferrel 
> Sent: Friday, March 24, 2017 11:11:09 AM
> To: dev@mahout.apache.org
> Subject: Re: [VOTE] Apache Mahout 0.13.0 Release Candidate
>
> I can’t +1 because of system integration errors that have to do with
> scoring that could be in Mahout. I doubt it is but don’t have time in the
> allotted vote period to track it down.
>
> My close looking tests of Mahout including the previous driver issues pass.
>
> Not sure if we use this style of vote for releases but I guess I’m +0,
> wanting to see a release and not wanting to block it, just not enough info
> for a +1.
>
>
> On Mar 23, 2017, at 11:26 PM, Trevor Grant 
> wrote:
>
> +1 binding
>
> Verified Signatures and Verification
> mvn clean install
> On source compiled with `mvn clean install`, `mvn clean package -Phadoop2
> -PviennaCL`, and precompiled jars; ran the following in a docker container
> (constructed via maven plugin, details below)
>
> $MAHOUT_HOME/bin/mahout spark-itemsimilarity \
>     --master spark://$HOSTNAME:7077 \
>     --input /data/ratings.csv \
>     --output /tmp/spark_item_sim_output \
>     --itemIDColumn 1 \
>     --rowIDColumn 0 \
>     --sparkExecutorMem 6g
>
> $MAHOUT_HOME/examples/bin/classify-wikipedia.sh -n 2
>
> $MAHOUT_HOME/bin/mahout spark-shell \
>     -i $MAHOUT_HOME/examples/bin/spark-document-classifier.mscala \
>     --master spark://$HOSTNAME:7077
>
>
> 
> -
>
> Maven plugin:
>
> Hadoop Version = 2.4.1
> Spark.Version=1.6.3
> (^^ from parent pom)
>
> io.fabric8
> docker-maven-plugin
> ...
> sequenceiq/hadoop-docker:${hadoop.version}
> 
> /usr/local/hadoop-${hadoop.version}
> /usr/local/spark
> /opt/mahout
> 
> 
> 
> curl -s http://d3kbcqa49mib13.cloudfront.net/spark-${spark.version}-bin-hadoop2.6.tgz | tar -xz -C /usr/local/
> cd /usr/local && ln -s spark-${spark.version}-bin-hadoop2.6 spark
> $SPARK_HOME/sbin/start-all.sh
> 
> echo "Downloading Movie Lens Ratings Data"
> curl -s http://files.grouplens.org/datasets/movielens/ml-latest-small.zip -o /usr/local/ml-latest-small.zip
> unzip /usr/local/ml-latest-small.zip -d /usr/local
> 
>
> Trevor Grant
> Data Scientist
> https://github.com/rawkintrevo
> http://stackexchange.com/users/3002022/rawkintrevo
> http://trevorgrant.org
>
> *"Fortunate is he, who is able to know the causes of things."  -Virgil*
>
>
> On Tue, Mar 21, 2017 at 11:17 AM, Andrew Musselman  wrote:
>
> > This is the vote for release 0.13.0 of Apache Mahout.
> >
> > The vote will be going for at least 72 hours and will be closed on
> Friday,
> > March 26th, 2017 or once there are at least 3 PMC +1 binding votes
> > (whichever
> > occurs earlier).  Please download, test and vote with
> >
> > [ ] +1, accept RC as the official 0.13.0 release of Apache Mahout
> > [ ] +0, I don't care either way,
> > [ ] -1, do not accept RC as the official 0.13.0 release of Apache Mahout,
> > because...
> >
> >
> > Maven staging repo:
> >
> > https://repository.apache.org/content/repositories/
> > orgapachemahout-1038/org/apache/mahout/apache-mahout-distribution/0.13.0
> >
> > The git tag to be voted upon is mahout-0.13.0
> >
>
>


Re: [VOTE] Apache Mahout 0.13.0 Release Candidate

2017-03-24 Thread Andrew Palumbo
Andrew,

Could we push the vote another 72 hours? I just came off an all-nighter at the
office and will not be able to test the RC tonight.


Pat, could you please continue with your integration testing if you can?


Andy

From: Pat Ferrel 
Sent: Friday, March 24, 2017 11:11:09 AM
To: dev@mahout.apache.org
Subject: Re: [VOTE] Apache Mahout 0.13.0 Release Candidate

I can’t +1 because of system integration errors that have to do with scoring 
that could be in Mahout. I doubt it is but don’t have time in the allotted vote 
period to track it down.

My close-up tests of Mahout, including the previous driver issues, pass.

Not sure if we use this style of vote for releases but I guess I’m +0, wanting 
to see a release and not wanting to block it, just not enough info for a +1.


On Mar 23, 2017, at 11:26 PM, Trevor Grant  wrote:

+1 binding

Verified Signatures and Verification
mvn clean install
On source compiled with `mvn clean install`, `mvn clean package -Phadoop2
-PviennaCL`, and precompiled jars; ran the following in a docker container
(constructed via maven plugin, details below)

$MAHOUT_HOME/bin/mahout spark-itemsimilarity \
    --master spark://$HOSTNAME:7077 \
    --input /data/ratings.csv \
    --output /tmp/spark_item_sim_output \
    --itemIDColumn 1 \
    --rowIDColumn 0 \
    --sparkExecutorMem 6g

$MAHOUT_HOME/examples/bin/classify-wikipedia.sh -n 2

$MAHOUT_HOME/bin/mahout spark-shell \
    -i $MAHOUT_HOME/examples/bin/spark-document-classifier.mscala \
    --master spark://$HOSTNAME:7077
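
The "Verified Signatures" step above is not spelled out in the message. A minimal sketch of what such a check could look like, assuming a SHA-256 digest is published alongside the artifact. The helper name `verify_sha256` and the file names in the comments are hypothetical and not taken from the release itself:

```shell
#!/bin/sh
# Hypothetical helper: compare a file's SHA-256 digest with an expected value.
# Usage: verify_sha256 <file> <expected-hex-digest>
verify_sha256() {
    file="$1"
    expected="$2"
    # sha256sum prints "<digest>  <file>"; keep only the digest column.
    actual=$(sha256sum "$file" | awk '{print $1}')
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
        return 0
    fi
    echo "MISMATCH: $file" >&2
    return 1
}

# Signature check with GnuPG (artifact names below are placeholders):
#   gpg --import KEYS
#   gpg --verify artifact.tar.gz.asc artifact.tar.gz
```

The digest comparison only shows that the download is intact; the `gpg --verify` step is what ties the artifact to a release manager's key.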


-

Maven plugin:

Hadoop Version = 2.4.1
Spark.Version=1.6.3
(^^ from parent pom)

io.fabric8
docker-maven-plugin
...
sequenceiq/hadoop-docker:${hadoop.version}

/usr/local/hadoop-${hadoop.version}
/usr/local/spark
/opt/mahout



curl -s http://d3kbcqa49mib13.cloudfront.net/spark-${spark.version}-bin-hadoop2.6.tgz | tar -xz -C /usr/local/
cd /usr/local && ln -s spark-${spark.version}-bin-hadoop2.6 spark
$SPARK_HOME/sbin/start-all.sh

echo "Downloading Movie Lens Ratings Data"
curl -s http://files.grouplens.org/datasets/movielens/ml-latest-small.zip -o /usr/local/ml-latest-small.zip
unzip /usr/local/ml-latest-small.zip -d /usr/local


Trevor Grant
Data Scientist
https://github.com/rawkintrevo
http://stackexchange.com/users/3002022/rawkintrevo
http://trevorgrant.org

*"Fortunate is he, who is able to know the causes of things."  -Virgil*


On Tue, Mar 21, 2017 at 11:17 AM, Andrew Musselman  wrote:

> This is the vote for release 0.13.0 of Apache Mahout.
>
> The vote will be going for at least 72 hours and will be closed on Friday,
> March 26th, 2017 or once there are at least 3 PMC +1 binding votes
> (whichever
> occurs earlier).  Please download, test and vote with
>
> [ ] +1, accept RC as the official 0.13.0 release of Apache Mahout
> [ ] +0, I don't care either way,
> [ ] -1, do not accept RC as the official 0.13.0 release of Apache Mahout,
> because...
>
>
> Maven staging repo:
>
> https://repository.apache.org/content/repositories/
> orgapachemahout-1038/org/apache/mahout/apache-mahout-distribution/0.13.0
>
> The git tag to be voted upon is mahout-0.13.0
>
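
The early-close rule in the vote email (at least 3 PMC +1 binding votes) can be sketched as a small tally. This assumes one vote per line in a hypothetical `<vote> <binding|non-binding>` format; the format and the function name `tally_votes` are illustrative only, and the 72-hour window is not modeled:

```shell
#!/bin/sh
# Hypothetical tally: reads votes on stdin, one per line, e.g. "+1 binding".
# Prints the binding counts and succeeds only if there are at least three
# binding +1 votes, mirroring the threshold stated in the vote email.
tally_votes() {
    awk '
        BEGIN { plus = 0; minus = 0 }
        $1 == "+1" && $2 == "binding" { plus++ }
        $1 == "-1" && $2 == "binding" { minus++ }
        END {
            printf "binding +1: %d, binding -1: %d\n", plus, minus
            exit !(plus >= 3)
        }'
}
```

Usage would be along the lines of `tally_votes < votes.txt && echo "threshold reached"`, where `votes.txt` is a hand-maintained file; a +0 such as the one cast in this thread is simply not counted.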



Re: Marketing

2017-03-24 Thread Muhammed Olgun
Hi folks,

I have been following Mahout for a long time. I used it in some of my
projects.

Honey Badger... yes, I think it is cool and funny. But I don't think it
represents Mahout. Mahout is about learning, math and abstract
things.

What describes Mahout better? For me,

"Doing distributed (math) machine learning, before it was cool"

Best!
Muhammed

On Fri, 24 Mar 2017 at 22:44, dustin vanstee <
dustinvans...@gmail.com> wrote:

Hi, I am a newcomer to the project, and I think the current website
definitely could use a rebuild.  One of the things that I think enables
quick adoption of any new project is quick start tutorials.  In
particular I like the way the Apache Zeppelin site is structured.  There is
a lot of information on the Mahout site, and a lot of good hints, but it
took me a while to get something that worked.
  It might be nice to have something like
  quick start mahout/samsara basic
  quick start mahout/samsara with spark
  quick start mahout/samsara with gpu
  etc etc

Sometimes less is more.   There is too much information being put on the
homepage and it's confusing...  I think the old mapreduce stuff could
probably be pushed very low down.  Also, having Scala docs available would
be great too.  Just my two pennies.

On Fri, Mar 24, 2017 at 1:02 PM, Pat Ferrel  wrote:

> Yeah point taken from @Dmitriy and @Trevor, the ability is great and will
> be needed some day soon (Spark is especially troublesome in some apps).
> However the existing compute engines code seems of dubious value.
>
> Completely agree with flexible solvers extremely important and yes, we
> should flog the hell out of it. This includes BLAS and future improvements
> to that layer as well as GPUs. Super important.
>
> I also like your taxonomy.
>
> On Mar 24, 2017, at 9:43 AM, Trevor Grant 
> wrote:
>
> To date we have referred to the GPU/CPU/CUDA as 'pluggable native-solvers'.
> 'pluggable backends' are the Spark - Flink - H2O - whatever.
>
> With the advent of both, I could see the confusion and we may want to
> rethink the naming as part of this too.
>
> Trevor Grant
> Data Scientist
> https://github.com/rawkintrevo
> http://stackexchange.com/users/3002022/rawkintrevo
> http://trevorgrant.org
>
> *"Fortunate is he, who is able to know the causes of things."  -Virgil*
>
>
> On Fri, Mar 24, 2017 at 11:15 AM, Nikolai Sakharnykh <
> nsakharn...@nvidia.com
> > wrote:
>
> > I guess we might have different interpretation of a backend. So just to
> > avoid any confusion in my world (coming from accelerating applications on
> > GPUs) the backends would be CUDA, OpenCL, OpenMP and JVM. I think it
> > definitely makes sense to advertise GPU support on the front page, along
> > with JVM and/or OpenMP for CPUs.
> >
> > -Original Message-
> > From: Suneel Marthi [mailto:smar...@apache.org]
> > Sent: Friday, March 24, 2017 11:13 AM
> > To: mahout 
> > Cc: u...@mahout.apache.org
> > Subject: Re: Marketing
> >
> > On Fri, Mar 24, 2017 at 12:09 PM, Dmitriy Lyubimov 
> > wrote:
> >
> >> On Fri, Mar 24, 2017 at 8:27 AM, Pat Ferrel 
> > wrote:
> >>
> >>> The multiple backend support is such a waste of time IMO. The DSL
> >>> and GPU support is super important and should be made even more
> >>> distributed. The current (as I understand it) single threaded GPU
> >>> per VM is only the first step in what will make Mahout important for a
> > long time to come.
> >>>
> >>
> >> This seems self contradicting a bit. Multiple backends is the only
> >> thing that remedies it for me. By that i mean both distributed (i/o)
> >> backends and the in-memory.
> >>
> >> Good CPU and GPU plugins will be important, as well as communication
> >> layer alternatives to spark. Spark is not working out well for
> >> interconnected problems, and H20 and Flink, well, I'd just forget
> >> about them. I'd certainly drop H20 for now.
> >
> >
> > FWIW, the H2O backend is more stable than the F%* backend, best to drop
> > both.
> >
> >
> >
> >> But ability to plug in new communication backend primitives seems to
> >> be critical in my experience, as well as variety of cpu/gpu chipset
> >> support. (I do use both in-memory and i/o custom backends that IMO are
> >> a must).
> >>
> >> In that sense, it is super-important that custom backends are easy to
> >> plug (even if you are absolutely legitimately dissatisfied with the
> >> existing ones).
> >>
> >>
> >>> Think of Mahout in 5 years what will be important? H2O? Hadoop
> > Mapreduce?
> >>> Flink? I’ll stake my dollar on no. GPUs yes and up the stakes.
> >>> Streaming online learning (kappa style) yes but not sure Mahout is
> >>> made for this right now.
> >>>
> >>> Or if we are talking about web site revamp +1, I’d be happy to
> >>> upgrade my section and have only held off waiting to see a redesign
> >>> or moving to Jekyll.
> >>>
> >>> As to a new mascot, ok, but 

Re: Marketing

2017-03-24 Thread dustin vanstee
Hi, I am a newcomer to the project, and I think the current website
definitely could use a rebuild.  One of the things that I think enables
quick adoption of any new project is quick start tutorials.  In
particular I like the way the Apache Zeppelin site is structured.  There is
a lot of information on the Mahout site, and a lot of good hints, but it
took me a while to get something that worked.
  It might be nice to have something like
  quick start mahout/samsara basic
  quick start mahout/samsara with spark
  quick start mahout/samsara with gpu
  etc etc

Sometimes less is more.   There is too much information being put on the
homepage and it's confusing...  I think the old mapreduce stuff could
probably be pushed very low down.  Also, having Scala docs available would
be great too.  Just my two pennies.

On Fri, Mar 24, 2017 at 1:02 PM, Pat Ferrel  wrote:

> Yeah point taken from @Dmitriy and @Trevor, the ability is great and will
> be needed some day soon (Spark is especially troublesome in some apps).
> However the existing compute engines code seems of dubious value.
>
> Completely agree with flexible solvers extremely important and yes, we
> should flog the hell out of it. This includes BLAS and future improvements
> to that layer as well as GPUs. Super important.
>
> I also like your taxonomy.
>
> On Mar 24, 2017, at 9:43 AM, Trevor Grant 
> wrote:
>
> To date we have referred to the GPU/CPU/CUDA as 'pluggable native-solvers'.
> 'plugable backends' are the Spark - Flink -H20- whatever.
>
> With the advent of both, I could see the confusion and we may want to
> rethink the naming as part of of this too.
>
> Trevor Grant
> Data Scientist
> https://github.com/rawkintrevo
> http://stackexchange.com/users/3002022/rawkintrevo
> http://trevorgrant.org
>
> *"Fortunate is he, who is able to know the causes of things."  -Virgil*
>
>
> On Fri, Mar 24, 2017 at 11:15 AM, Nikolai Sakharnykh <
> nsakharn...@nvidia.com
> > wrote:
>
> > I guess we might have different interpretation of a backend. So just to
> > avoid any confusion in my world (coming from accelerating applications on
> > GPUs) the backends would be CUDA, OpenCL, OpenMP and JVM. I think it
> > definitely makes sense to advertise GPU support on the front page, along
> > with JVM and/or OpenMP for CPUs.
> >
> > -Original Message-
> > From: Suneel Marthi [mailto:smar...@apache.org]
> > Sent: Friday, March 24, 2017 11:13 AM
> > To: mahout 
> > Cc: u...@mahout.apache.org
> > Subject: Re: Marketing
> >
> > On Fri, Mar 24, 2017 at 12:09 PM, Dmitriy Lyubimov 
> > wrote:
> >
> >> On Fri, Mar 24, 2017 at 8:27 AM, Pat Ferrel 
> > wrote:
> >>
> >>> The multiple backend support is such a waste of time IMO. The DSL
> >>> and GPU support is super important and should be made even more
> >>> distributed. The current (as I understand it) single threaded GPU
> >>> per VM is only the first step in what will make Mahout important for a
> > long time to come.
> >>>
> >>
> >> This seems self contradicting a bit. Multiple backends is the only
> >> thing that remedies it for me. By that i mean both distributed (i/o)
> >> backends and the in-memory.
> >>
> >> Good CPU and GPU plugins will be important, as well as communication
> >> layer alternatives to spark. Spark is not working out well for
> >> interconnected problems, and H20 and Flink, well, I'd just forget
> >> about them. I'd certainly drop H20 for now.
> >
> >
> > FWIW, the H2O backend is more stable than the F%* backend, best to drop
> > both.
> >
> >
> >
> >> But ability to plug in new communication backend primitives seems to
> >> be critical in my experience, as well as variety of cpu/gpu chipset
> >> support. (I do use both in-memory and i/o custom backends that IMO are
> >> a must).
> >>
> >> In that sense, it is super-important that custom backends are easy to
> >> plug (even if you are absolutely legitimately dissatisfied with the
> >> existing ones).
> >>
> >>
> >>> Think of Mahout in 5 years what will be important? H2O? Hadoop
> > Mapreduce?
> >>> Flink? I’ll stake my dollar on no. GPUs yes and up the stakes.
> >>> Streaming online learning (kappa style) yes but not sure Mahout is
> >>> made for this right now.
> >>>
> >>> Or if we are talking about web site revamp +1, I’d be happy to
> >>> upgrade my section and have only held off waiting to see a redesign
> >>> or moving to Jekyll.
> >>>
> >>> As to a new mascot, ok, but the old one fits the name. We tried
> >> sub-naming
> >>> Mahout-Samsara to symbolize the changing nature and rebirth of the
> >> project,
> >>> maybe we should drop the name Mahout altogether. the name Mahout,
> >>> like
> >> the
> >>> blue man, is not relevant to the project anymore and maybe renaming,
> >>> is good for marketing.
> >>>
> >>> On Mar 24, 2017, at 7:37 AM, Nikolai Sakharnykh
> >>> 
> >>> wrote:
> >>>
> >>> Agree that 

Re: Marketing

2017-03-24 Thread Pat Ferrel
Yeah, point taken from @Dmitriy and @Trevor: the ability is great and will be
needed some day soon (Spark is especially troublesome in some apps). However,
the existing compute engines code seems of dubious value.

Completely agree that flexible solvers are extremely important and yes, we should
flog the hell out of it. This includes BLAS and future improvements to that
layer as well as GPUs. Super important.

I also like your taxonomy.

On Mar 24, 2017, at 9:43 AM, Trevor Grant  wrote:

To date we have referred to the GPU/CPU/CUDA as 'pluggable native-solvers'.
'plugable backends' are the Spark - Flink -H20- whatever.

With the advent of both, I could see the confusion and we may want to
rethink the naming as part of of this too.

Trevor Grant
Data Scientist
https://github.com/rawkintrevo
http://stackexchange.com/users/3002022/rawkintrevo
http://trevorgrant.org

*"Fortunate is he, who is able to know the causes of things."  -Virgil*


On Fri, Mar 24, 2017 at 11:15 AM, Nikolai Sakharnykh  wrote:

> I guess we might have different interpretation of a backend. So just to
> avoid any confusion in my world (coming from accelerating applications on
> GPUs) the backends would be CUDA, OpenCL, OpenMP and JVM. I think it
> definitely makes sense to advertise GPU support on the front page, along
> with JVM and/or OpenMP for CPUs.
> 
> -Original Message-
> From: Suneel Marthi [mailto:smar...@apache.org]
> Sent: Friday, March 24, 2017 11:13 AM
> To: mahout 
> Cc: u...@mahout.apache.org
> Subject: Re: Marketing
> 
> On Fri, Mar 24, 2017 at 12:09 PM, Dmitriy Lyubimov 
> wrote:
> 
>> On Fri, Mar 24, 2017 at 8:27 AM, Pat Ferrel 
> wrote:
>> 
>>> The multiple backend support is such a waste of time IMO. The DSL
>>> and GPU support is super important and should be made even more
>>> distributed. The current (as I understand it) single threaded GPU
>>> per VM is only the first step in what will make Mahout important for a
> long time to come.
>>> 
>> 
>> This seems self contradicting a bit. Multiple backends is the only
>> thing that remedies it for me. By that i mean both distributed (i/o)
>> backends and the in-memory.
>> 
>> Good CPU and GPU plugins will be important, as well as communication
>> layer alternatives to spark. Spark is not working out well for
>> interconnected problems, and H20 and Flink, well, I'd just forget
>> about them. I'd certainly drop H20 for now.
> 
> 
> FWIW, the H2O backend is more stable than the F%* backend, best to drop
> both.
> 
> 
> 
>> But ability to plug in new communication backend primitives seems to
>> be critical in my experience, as well as variety of cpu/gpu chipset
>> support. (I do use both in-memory and i/o custom backends that IMO are
>> a must).
>> 
>> In that sense, it is super-important that custom backends are easy to
>> plug (even if you are absolutely legitimately dissatisfied with the
>> existing ones).
>> 
>> 
>>> Think of Mahout in 5 years what will be important? H2O? Hadoop
> Mapreduce?
>>> Flink? I’ll stake my dollar on no. GPUs yes and up the stakes.
>>> Streaming online learning (kappa style) yes but not sure Mahout is
>>> made for this right now.
>>> 
>>> Or if we are talking about web site revamp +1, I’d be happy to
>>> upgrade my section and have only held off waiting to see a redesign
>>> or moving to Jekyll.
>>> 
>>> As to a new mascot, ok, but the old one fits the name. We tried
>> sub-naming
>>> Mahout-Samsara to symbolize the changing nature and rebirth of the
>> project,
>>> maybe we should drop the name Mahout altogether. the name Mahout,
>>> like
>> the
>>> blue man, is not relevant to the project anymore and maybe renaming,
>>> is good for marketing.
>>> 
>>> On Mar 24, 2017, at 7:37 AM, Nikolai Sakharnykh
>>> 
>>> wrote:
>>> 
>>> Agree that the website feels outdated. I would add Samsara code
>>> example
>> on
>>> the front page, list of key algorithms implemented, supported
>>> backends, github & download links, and cut down the news part
>>> especially towards
>> the
>>> end with flat release numbers and dates. Also probably reorganize
>>> the
>> tabs.
>>> 
>>> If we go with honey badger as a mascot do we have any ideas on the
>>> logo itself? Honey badger biting/eating a snake?)
>>> 
>>> -Original Message-
>>> From: Trevor Grant [mailto:trevor.d.gr...@gmail.com]
>>> Sent: Thursday, March 23, 2017 8:53 PM
>>> To: dev@mahout.apache.org
>>> Cc: u...@mahout.apache.org
>>> Subject: Re: Marketing
>>> 
>>> A student once asked his teacher, "Master, what is enlightenment?"
>>> 
>>> The master replied, "When hungry, eat. When tired, sleep."
>>> 
>>> Sounds like the honey badger to me...
>>> 
>>> Trevor Grant
>>> Data Scientist
>>> https://github.com/rawkintrevo
>>> http://stackexchange.com/users/3002022/rawkintrevo
>>> http://trevorgrant.org
>>> 
>>> *"Fortunate is he, who is able to know the causes of 

Re: Marketing

2017-03-24 Thread Trevor Grant
To date we have referred to the GPU/CPU/CUDA as 'pluggable native-solvers'.
'pluggable backends' are the Spark - Flink - H2O - whatever.

With the advent of both, I could see the confusion and we may want to
rethink the naming as part of this too.

Trevor Grant
Data Scientist
https://github.com/rawkintrevo
http://stackexchange.com/users/3002022/rawkintrevo
http://trevorgrant.org

*"Fortunate is he, who is able to know the causes of things."  -Virgil*


On Fri, Mar 24, 2017 at 11:15 AM, Nikolai Sakharnykh  wrote:

> I guess we might have different interpretation of a backend. So just to
> avoid any confusion in my world (coming from accelerating applications on
> GPUs) the backends would be CUDA, OpenCL, OpenMP and JVM. I think it
> definitely makes sense to advertise GPU support on the front page, along
> with JVM and/or OpenMP for CPUs.
>
> -Original Message-
> From: Suneel Marthi [mailto:smar...@apache.org]
> Sent: Friday, March 24, 2017 11:13 AM
> To: mahout 
> Cc: u...@mahout.apache.org
> Subject: Re: Marketing
>
> On Fri, Mar 24, 2017 at 12:09 PM, Dmitriy Lyubimov 
> wrote:
>
> > On Fri, Mar 24, 2017 at 8:27 AM, Pat Ferrel 
> wrote:
> >
> > > The multiple backend support is such a waste of time IMO. The DSL
> > > and GPU support is super important and should be made even more
> > > distributed. The current (as I understand it) single threaded GPU
> > > per VM is only the first step in what will make Mahout important for a
> long time to come.
> > >
> >
> > This seems self contradicting a bit. Multiple backends is the only
> > thing that remedies it for me. By that i mean both distributed (i/o)
> > backends and the in-memory.
> >
> > Good CPU and GPU plugins will be important, as well as communication
> > layer alternatives to spark. Spark is not working out well for
> > interconnected problems, and H20 and Flink, well, I'd just forget
> > about them. I'd certainly drop H20 for now.
>
>
> FWIW, the H2O backend is more stable than the F%* backend, best to drop
> both.
>
>
>
> > But ability to plug in new communication backend primitives seems to
> > be critical in my experience, as well as variety of cpu/gpu chipset
> > support. (I do use both in-memory and i/o custom backends that IMO are
> > a must).
> >
> > In that sense, it is super-important that custom backends are easy to
> > plug (even if you are absolutely legitimately dissatisfied with the
> > existing ones).
> >
> >
> > > Think of Mahout in 5 years what will be important? H2O? Hadoop
> Mapreduce?
> > > Flink? I’ll stake my dollar on no. GPUs yes and up the stakes.
> > > Streaming online learning (kappa style) yes but not sure Mahout is
> > > made for this right now.
> > >
> > > Or if we are talking about web site revamp +1, I’d be happy to
> > > upgrade my section and have only held off waiting to see a redesign
> > > or moving to Jekyll.
> > >
> > > As to a new mascot, ok, but the old one fits the name. We tried
> > sub-naming
> > > Mahout-Samsara to symbolize the changing nature and rebirth of the
> > project,
> > > maybe we should drop the name Mahout altogether. the name Mahout,
> > > like
> > the
> > > blue man, is not relevant to the project anymore and maybe renaming,
> > > is good for marketing.
> > >
> > > On Mar 24, 2017, at 7:37 AM, Nikolai Sakharnykh
> > > 
> > > wrote:
> > >
> > > Agree that the website feels outdated. I would add Samsara code
> > > example
> > on
> > > the front page, list of key algorithms implemented, supported
> > > backends, github & download links, and cut down the news part
> > > especially towards
> > the
> > > end with flat release numbers and dates. Also probably reorganize
> > > the
> > tabs.
> > >
> > > If we go with honey badger as a mascot do we have any ideas on the
> > > logo itself? Honey badger biting/eating a snake?)
> > >
> > > -Original Message-
> > > From: Trevor Grant [mailto:trevor.d.gr...@gmail.com]
> > > Sent: Thursday, March 23, 2017 8:53 PM
> > > To: dev@mahout.apache.org
> > > Cc: u...@mahout.apache.org
> > > Subject: Re: Marketing
> > >
> > > A student once asked his teacher, "Master, what is enlightenment?"
> > >
> > > The master replied, "When hungry, eat. When tired, sleep."
> > >
> > > Sounds like the honey badger to me...
> > >
> > > Trevor Grant
> > > Data Scientist
> > > https://github.com/rawkintrevo
> > > http://stackexchange.com/users/3002022/rawkintrevo
> > > http://trevorgrant.org
> > >
> > > *"Fortunate is he, who is able to know the causes of things."
> > > -Virgil*
> > >
> > >
> > > On Thu, Mar 23, 2017 at 5:43 PM, Pat Ferrel 
> > wrote:
> > >
> > > > The little blue man (the mahout) was reborn (samsara) as a
> > honey-badger?
> > > > He must be close indeed to reaching true enlightenment, or is that
> > > Buddhism?
> > > >
> > > >
> > > > On Mar 23, 2017, at 12:42 PM, Andrew Palumbo 
> > wrote:

Re: Marketing

2017-03-24 Thread Trevor Grant
I don't think the backends we have now off the shelf are particularly
exciting, but the fact you CAN plug different ones back in is the value
prop (and a big one that we need to 'sell' more). The difference is subtle
but since this is the marketing thread also worth bringing up. Basically to
your point in paragraph two.   Flink, H2O, Spark, they come and go- with
Mahout your algorithms keep porting (more on this shortly).

Kappa arch- yes, we need to start thinking at least how that is going to
play into mahout.  I agree with you 100% on this.

It sounds like we're getting quorum at least on website revamp. Awesome.  I
think we should solve the other issues (name, logo, etc.) but at least
start looking for community members that have the time and skill / trying
to recruit people into the project who do.  In my mind, the website
revamp+Jekyll should happen simultaneously.  Let's just build the new site
in Jekyll and then when it's ready we'll switch techs + launch in one shot.
If you have some good info, it might be prudent to just post it now, because I
don't know what the timeline for all this will look like.

Naming- can a project simply change its name? Is that even an option?  If
it is- might be a way to go, but do a transition- e.g. Introduce
-Mahout but slowly drop the mahout part, until finally it's just
. Or do it quick like a bandaid on a major release (e.g. 0.14.0).  Or
do we just call it Apache Samsara, which, going back to the modular
backends is kind of elegantly appropriate in that the backends have
lifecycles, but your "algorithmic soul" is persistently reincarnated.
There is a 'story' there of - look we had the first big data ML package,
and when the back end it was built for began to fade, so did our projects.
We were the first to have to stare down the barrel of the pains of backends
coming and going so the whole point of this project was to prevent that
from happening again.

Re: Nikolai-
100% agree with the structure changes proposed.  Let's find someone who is
willing and able to take point on that project, then start brainstorming
layout but this is a good start. Or even better, let's start a jira ticket
to discuss structure of new site (since again, I think we agree that DOES
need to happen, and we're really just waiting on some technicalities of
how/when/who/what).

I know I brought up that honey badgers eat snakes (python), the one big
danger in this- if we ever do decide to implement python bindings then all
of a sudden things get awkward. (I'm imagining going to a Python meetup
to talk about new Mahout-Samsara- bindings, and someone asks,
"why is he eating a snake", "oh well because at the time we thought python
was trash and we were very arrogant").





Trevor Grant
Data Scientist
https://github.com/rawkintrevo
http://stackexchange.com/users/3002022/rawkintrevo
http://trevorgrant.org

*"Fortunate is he, who is able to know the causes of things."  -Virgil*


On Fri, Mar 24, 2017 at 10:27 AM, Pat Ferrel  wrote:

> The multiple backend support is such a waste of time IMO. The DSL and GPU
> support is super important and should be made even more distributed. The
> current (as I understand it) single threaded GPU per VM is only the first
> step in what will make Mahout important for a long time to come.
>
> Think of Mahout in 5 years what will be important? H2O? Hadoop Mapreduce?
> Flink? I’ll stake my dollar on no. GPUs yes and up the stakes. Streaming
> online learning (kappa style) yes but not sure Mahout is made for this
> right now.
>
> Or if we are talking about web site revamp +1, I’d be happy to upgrade my
> section and have only held off waiting to see a redesign or moving to
> Jekyll.
>
> As to a new mascot, ok, but the old one fits the name. We tried sub-naming
> Mahout-Samsara to symbolize the changing nature and rebirth of the project,
> maybe we should drop the name Mahout altogether. the name Mahout, like the
> blue man, is not relevant to the project anymore and maybe renaming, is
> good for marketing.
>
> On Mar 24, 2017, at 7:37 AM, Nikolai Sakharnykh 
> wrote:
>
> Agree that the website feels outdated. I would add Samsara code example on
> the front page, list of key algorithms implemented, supported backends,
> github & download links, and cut down the news part especially towards the
> end with flat release numbers and dates. Also probably reorganize the tabs.
>
> If we go with honey badger as a mascot do we have any ideas on the logo
> itself? Honey badger biting/eating a snake?)
>
> -Original Message-
> From: Trevor Grant [mailto:trevor.d.gr...@gmail.com]
> Sent: Thursday, March 23, 2017 8:53 PM
> To: dev@mahout.apache.org
> Cc: u...@mahout.apache.org
> Subject: Re: Marketing
>
> A student once asked his teacher, "Master, what is enlightenment?"
>
> The master replied, "When hungry, eat. When tired, sleep."
>
> Sounds like the honey badger to me...
>
> Trevor Grant
> Data Scientist
> 

Re: [jira] [Commented] (MAHOUT-1929) Add Generalized Linear Models

2017-03-24 Thread Saikat Kanjilal
@Trevor

It's been a week and I haven't heard back on this. I'm guessing you are buried
in the 0.13 release at this point and have not had time to check this in
detail. What do you think: should I keep progressing on the design? I want to
make sure I am making timely progress toward a GLM deliverable for 0.14; on the
other hand, I would hate to do throwaway work before a detailed design review.
Let me know your thoughts.


Regards



From: Trevor Grant 
Sent: Thursday, March 16, 2017 7:41 AM
To: dev@mahout.apache.org
Subject: Re: [jira] [Commented] (MAHOUT-1929) Add Generalized Linear Models

Also at Strata- thanks for reaching out a second time- totally missed your
first message.

I'll check this out tomorrow on the flight home. We can set something up
from there.  Thanks for the contribution!!

tg


Trevor Grant
Data Scientist
https://github.com/rawkintrevo
http://stackexchange.com/users/3002022/rawkintrevo
http://trevorgrant.org

*"Fortunate is he, who is able to know the causes of things."  -Virgil*


On Wed, Mar 15, 2017 at 10:13 AM, Saikat Kanjilal 
wrote:

> @Trevor, I know you're busy with 0.13, when you have a moment I would
> really appreciate some input before I forge further ahead or a quick google
> hangout design review of what I've put into place.  Just wanting to make
> sure this gets into the next release.
> Thanks in advance for your time.
>
> Sent from my iPhone
>
> Begin forwarded message:
>
> From: "Saikat Kanjilal (JIRA)" >
> Date: March 13, 2017 at 3:33:41 PM PDT
> To: >
> Subject: [jira] [Commented] (MAHOUT-1929) Add Generalized Linear Models
> Reply-To: >
>
>
>[ https://issues.apache.org/jira/browse/MAHOUT-1929?page=
> com.atlassian.jira.plugin.system.issuetabpanels:comment-
> tabpanel=15923098#comment-15923098 ]
>
> Saikat Kanjilal commented on MAHOUT-1929:
> -
>
> More progress, added a successful unit test using the new GlmModel here:
> https://github.com/skanjila/mahout/blob/mahout-1929/math-
> scala/src/test/scala/org/apache/mahout/math/algorithms/GlmSuiteBase.scala
>
> Refactored the GlmModel code to leverage LinearRegressorModel instead:
> https://github.com/skanjila/mahout/blob/mahout-1929/math-
> scala/src/main/scala/org/apache/mahout/math/algorithms/
> regression/GlmModel.scala to make things a bit easier given the current
> architecture
>
>
> [~rawkintrevo] any chance we could do a quick google hangout to discuss
> design before I get too far ahead?  Would love some feedback to make sure
> this goes smoothly, I know you are busy with 0.13 so let me know what works
>
>
>
> Add Generalized Linear Models
> -
>
>Key: MAHOUT-1929
>URL: https://issues.apache.org/jira/browse/MAHOUT-1929
>Project: Mahout
> Issue Type: Wish
> Components: Algorithms
>   Affects Versions: 0.13.1
>   Reporter: Trevor Grant
>
> Implement generalize Linear Models (GLM)
> https://en.wikipedia.org/wiki/Generalized_linear_model
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.3.15#6346)
>


RE: Marketing

2017-03-24 Thread Nikolai Sakharnykh
I guess we might have different interpretation of a backend. So just to avoid 
any confusion in my world (coming from accelerating applications on GPUs) the 
backends would be CUDA, OpenCL, OpenMP and JVM. I think it definitely makes 
sense to advertise GPU support on the front page, along with JVM and/or OpenMP 
for CPUs. 

-Original Message-
From: Suneel Marthi [mailto:smar...@apache.org] 
Sent: Friday, March 24, 2017 11:13 AM
To: mahout 
Cc: u...@mahout.apache.org
Subject: Re: Marketing

On Fri, Mar 24, 2017 at 12:09 PM, Dmitriy Lyubimov 
wrote:

> On Fri, Mar 24, 2017 at 8:27 AM, Pat Ferrel  wrote:
>
> > The multiple backend support is such a waste of time IMO. The DSL 
> > and GPU support is super important and should be made even more 
> > distributed. The current (as I understand it) single threaded GPU 
> > per VM is only the first step in what will make Mahout important for a long 
> > time to come.
> >
>
> This seems self contradicting a bit. Multiple backends is the only 
> thing that remedies it for me. By that i mean both distributed (i/o) 
> backends and the in-memory.
>
> Good CPU and GPU plugins will be important, as well as communication 
> layer alternatives to spark. Spark is not working out well for 
> interconnected problems, and H2O and Flink, well, I'd just forget 
> about them. I'd certainly drop H2O for now.


FWIW, the H2O backend is more stable than the F%* backend, best to drop both.



> But ability to plug in new communication backend primitives seems to 
> be critical in my experience, as well as variety of cpu/gpu chipset 
> support. (I do use both in-memory and i/o custom backends that IMO are 
> a must).
>
> In that sense, it is super-important that custom backends are easy to 
> plug (even if you are absolutely legitimately dissatisfied with the 
> existing ones).
>
>
> > Think of Mahout in 5 years what will be important? H2O? Hadoop Mapreduce?
> > Flink? I’ll stake my dollar on no. GPUs yes and up the stakes. 
> > Streaming online learning (kappa style) yes but not sure Mahout is 
> > made for this right now.
> >
> > Or if we are talking about web site revamp +1, I’d be happy to 
> > upgrade my section and have only held off waiting to see a redesign 
> > or moving to Jekyll.
> >
> > As to a new mascot, ok, but the old one fits the name. We tried
> sub-naming
> > Mahout-Samsara to symbolize the changing nature and rebirth of the
> project,
> > maybe we should drop the name Mahout altogether. the name Mahout, 
> > like
> the
> > blue man, is not relevant to the project anymore and maybe renaming, 
> > is good for marketing.
> >
> > On Mar 24, 2017, at 7:37 AM, Nikolai Sakharnykh 
> > 
> > wrote:
> >
> > Agree that the website feels outdated. I would add Samsara code 
> > example
> on
> > the front page, list of key algorithms implemented, supported 
> > backends, github & download links, and cut down the news part 
> > especially towards
> the
> > end with flat release numbers and dates. Also probably reorganize 
> > the
> tabs.
> >
> > If we go with honey badger as a mascot do we have any ideas on the 
> > logo itself? Honey badger biting/eating a snake?)
> >
> > -Original Message-
> > From: Trevor Grant [mailto:trevor.d.gr...@gmail.com]
> > Sent: Thursday, March 23, 2017 8:53 PM
> > To: dev@mahout.apache.org
> > Cc: u...@mahout.apache.org
> > Subject: Re: Marketing
> >
> > A student once asked his teacher, "Master, what is enlightenment?"
> >
> > The master replied, "When hungry, eat. When tired, sleep."
> >
> > Sounds like the honey badger to me...
> >
> > Trevor Grant
> > Data Scientist
> > https://github.com/rawkintrevo
> > http://stackexchange.com/users/3002022/rawkintrevo
> > http://trevorgrant.org
> >
> > *"Fortunate is he, who is able to know the causes of things."  
> > -Virgil*
> >
> >
> > On Thu, Mar 23, 2017 at 5:43 PM, Pat Ferrel 
> wrote:
> >
> > > The little blue man (the mahout) was reborn (samsara) as a
> honey-badger?
> > > He must be close indeed to reaching true enlightenment, or is that
> > Buddhism?
> > >
> > >
> > > On Mar 23, 2017, at 12:42 PM, Andrew Palumbo 
> wrote:
> > >
> > > +1 on revamp.
> > >
> > >
> > >
> > > Sent from my Verizon Wireless 4G LTE smartphone
> > >
> > >
> > >  Original message 
> > > From: Trevor Grant 
> > > Date: 03/23/2017 12:36 PM (GMT-08:00)
> > > To: u...@mahout.apache.org, dev@mahout.apache.org
> > > Subject: Marketing
> > >
> > > Hey user and dev,
> > >
> > > With 0.13.0 the Apache Mahout project has added some significant
> updates.
> > >
> > > The website is starting to feel 'dated' I think it could use a reboot.
> > >
> > > The blue person riding the elephant has less significance in 
> > > Mahout-Samsara's modular backends.
> > >
> > > Would like to open the floor to discussion on website reboot (and 
> > > who might be 

Re: Marketing

2017-03-24 Thread Suneel Marthi
On Fri, Mar 24, 2017 at 12:09 PM, Dmitriy Lyubimov 
wrote:

> On Fri, Mar 24, 2017 at 8:27 AM, Pat Ferrel  wrote:
>
> > The multiple backend support is such a waste of time IMO. The DSL and GPU
> > support is super important and should be made even more distributed. The
> > current (as I understand it) single threaded GPU per VM is only the first
> > step in what will make Mahout important for a long time to come.
> >
>
> This seems self contradicting a bit. Multiple backends is the only thing
> that remedies it for me. By that i mean both distributed (i/o) backends and
> the in-memory.
>
> Good CPU and GPU plugins will be important, as well as communication layer
> alternatives to spark. Spark is not working out well for interconnected
> problems, and H2O and Flink, well, I'd just forget about them. I'd
> certainly drop H2O for now.


FWIW, the H2O backend is more stable than the F%* backend, best to drop
both.



> But ability to plug in new communication
> backend primitives seems to be critical in my experience, as well as
> variety of cpu/gpu chipset support. (I do use both in-memory and i/o custom
> backends that IMO are a must).
>
> In that sense, it is super-important that custom backends are easy to plug
> (even if you are absolutely legitimately dissatisfied with the existing
> ones).
>
>
> > Think of Mahout in 5 years what will be important? H2O? Hadoop Mapreduce?
> > Flink? I’ll stake my dollar on no. GPUs yes and up the stakes. Streaming
> > online learning (kappa style) yes but not sure Mahout is made for this
> > right now.
> >
> > Or if we are talking about web site revamp +1, I’d be happy to upgrade my
> > section and have only held off waiting to see a redesign or moving to
> > Jekyll.
> >
> > As to a new mascot, ok, but the old one fits the name. We tried
> sub-naming
> > Mahout-Samsara to symbolize the changing nature and rebirth of the
> project,
> > maybe we should drop the name Mahout altogether. the name Mahout, like
> the
> > blue man, is not relevant to the project anymore and maybe renaming, is
> > good for marketing.
> >
> > On Mar 24, 2017, at 7:37 AM, Nikolai Sakharnykh 
> > wrote:
> >
> > Agree that the website feels outdated. I would add Samsara code example
> on
> > the front page, list of key algorithms implemented, supported backends,
> > github & download links, and cut down the news part especially towards
> the
> > end with flat release numbers and dates. Also probably reorganize the
> tabs.
> >
> > If we go with honey badger as a mascot do we have any ideas on the logo
> > itself? Honey badger biting/eating a snake?)
> >
> > -Original Message-
> > From: Trevor Grant [mailto:trevor.d.gr...@gmail.com]
> > Sent: Thursday, March 23, 2017 8:53 PM
> > To: dev@mahout.apache.org
> > Cc: u...@mahout.apache.org
> > Subject: Re: Marketing
> >
> > A student once asked his teacher, "Master, what is enlightenment?"
> >
> > The master replied, "When hungry, eat. When tired, sleep."
> >
> > Sounds like the honey badger to me...
> >
> > Trevor Grant
> > Data Scientist
> > https://github.com/rawkintrevo
> > http://stackexchange.com/users/3002022/rawkintrevo
> > http://trevorgrant.org
> >
> > *"Fortunate is he, who is able to know the causes of things."  -Virgil*
> >
> >
> > On Thu, Mar 23, 2017 at 5:43 PM, Pat Ferrel 
> wrote:
> >
> > > The little blue man (the mahout) was reborn (samsara) as a
> honey-badger?
> > > He must be close indeed to reaching true enlightenment, or is that
> > Buddhism?
> > >
> > >
> > > On Mar 23, 2017, at 12:42 PM, Andrew Palumbo 
> wrote:
> > >
> > > +1 on revamp.
> > >
> > >
> > >
> > > Sent from my Verizon Wireless 4G LTE smartphone
> > >
> > >
> > >  Original message 
> > > From: Trevor Grant 
> > > Date: 03/23/2017 12:36 PM (GMT-08:00)
> > > To: u...@mahout.apache.org, dev@mahout.apache.org
> > > Subject: Marketing
> > >
> > > Hey user and dev,
> > >
> > > With 0.13.0 the Apache Mahout project has added some significant
> updates.
> > >
> > > The website is starting to feel 'dated' I think it could use a reboot.
> > >
> > > The blue person riding the elephant has less significance in
> > > Mahout-Samsara's modular backends.
> > >
> > > Would like to open the floor to discussion on website reboot (and who
> > > might be willing to take on such a project), as well as new mascot.
> > >
> > > To kick off- in an offline talk there was the idea of A honey badger
> > > (bc honey-badger don't care, just like mahout don't care what back end
> > > or native solvers you are using, and also bc a cobra bites a honey
> > > badger and he takes a little nap then wakes up and finishes eating the
> > > cobra. honey badger eats snakes, and does all the work while the other
> > > animals pick up the scraps.
> > > see this short documentary on the honey badger:
> > > https://www.youtube.com/watch?v=4r7wHMg5Yjg ) 

Re: Marketing

2017-03-24 Thread Dmitriy Lyubimov
On Fri, Mar 24, 2017 at 8:27 AM, Pat Ferrel  wrote:

> The multiple backend support is such a waste of time IMO. The DSL and GPU
> support is super important and should be made even more distributed. The
> current (as I understand it) single threaded GPU per VM is only the first
> step in what will make Mahout important for a long time to come.
>

This seems a bit self-contradicting. Multiple backends is the only thing
that remedies it for me. By that I mean both distributed (i/o) backends and
the in-memory ones.

Good CPU and GPU plugins will be important, as well as communication layer
alternatives to spark. Spark is not working out well for interconnected
problems, and H2O and Flink, well, I'd just forget about them. I'd
certainly drop H2O for now. But ability to plug in new communication
backend primitives seems to be critical in my experience, as well as
variety of cpu/gpu chipset support. (I do use both in-memory and i/o custom
backends that IMO are a must).

In that sense, it is super-important that custom backends are easy to plug
(even if you are absolutely legitimately dissatisfied with the existing
ones).


> Think of Mahout in 5 years what will be important? H2O? Hadoop Mapreduce?
> Flink? I’ll stake my dollar on no. GPUs yes and up the stakes. Streaming
> online learning (kappa style) yes but not sure Mahout is made for this
> right now.
>
> Or if we are talking about web site revamp +1, I’d be happy to upgrade my
> section and have only held off waiting to see a redesign or moving to
> Jekyll.
>
> As to a new mascot, ok, but the old one fits the name. We tried sub-naming
> Mahout-Samsara to symbolize the changing nature and rebirth of the project,
> maybe we should drop the name Mahout altogether. the name Mahout, like the
> blue man, is not relevant to the project anymore and maybe renaming, is
> good for marketing.
>
> On Mar 24, 2017, at 7:37 AM, Nikolai Sakharnykh 
> wrote:
>
> Agree that the website feels outdated. I would add Samsara code example on
> the front page, list of key algorithms implemented, supported backends,
> github & download links, and cut down the news part especially towards the
> end with flat release numbers and dates. Also probably reorganize the tabs.
>
> If we go with honey badger as a mascot do we have any ideas on the logo
> itself? Honey badger biting/eating a snake?)
>
> -Original Message-
> From: Trevor Grant [mailto:trevor.d.gr...@gmail.com]
> Sent: Thursday, March 23, 2017 8:53 PM
> To: dev@mahout.apache.org
> Cc: u...@mahout.apache.org
> Subject: Re: Marketing
>
> A student once asked his teacher, "Master, what is enlightenment?"
>
> The master replied, "When hungry, eat. When tired, sleep."
>
> Sounds like the honey badger to me...
>
> Trevor Grant
> Data Scientist
> https://github.com/rawkintrevo
> http://stackexchange.com/users/3002022/rawkintrevo
> http://trevorgrant.org
>
> *"Fortunate is he, who is able to know the causes of things."  -Virgil*
>
>
> On Thu, Mar 23, 2017 at 5:43 PM, Pat Ferrel  wrote:
>
> > The little blue man (the mahout) was reborn (samsara) as a honey-badger?
> > He must be close indeed to reaching true enlightenment, or is that
> Buddhism?
> >
> >
> > On Mar 23, 2017, at 12:42 PM, Andrew Palumbo  wrote:
> >
> > +1 on revamp.
> >
> >
> >
> > Sent from my Verizon Wireless 4G LTE smartphone
> >
> >
> >  Original message 
> > From: Trevor Grant 
> > Date: 03/23/2017 12:36 PM (GMT-08:00)
> > To: u...@mahout.apache.org, dev@mahout.apache.org
> > Subject: Marketing
> >
> > Hey user and dev,
> >
> > With 0.13.0 the Apache Mahout project has added some significant updates.
> >
> > The website is starting to feel 'dated' I think it could use a reboot.
> >
> > The blue person riding the elephant has less significance in
> > Mahout-Samsara's modular backends.
> >
> > Would like to open the floor to discussion on website reboot (and who
> > might be willing to take on such a project), as well as new mascot.
> >
> > To kick off- in an offline talk there was the idea of A honey badger
> > (bc honey-badger don't care, just like mahout don't care what back end
> > or native solvers you are using, and also bc a cobra bites a honey
> > badger and he takes a little nap then wakes up and finishes eating the
> > cobra. honey badger eats snakes, and does all the work while the other
> > animals pick up the scraps.
> > see this short documentary on the honey badger:
> > https://www.youtube.com/watch?v=4r7wHMg5Yjg ) ^^audio not safe for
> > work
> >
> > Con: it's almost tooo jokey.
> >
> > Other idea: are coy-wolfs.
> >
> > Trevor Grant
> > Data Scientist
> > https://github.com/rawkintrevo
> > http://stackexchange.com/users/3002022/rawkintrevo
> > http://trevorgrant.org
> >
> > *"Fortunate is he, who is able to know the causes of things."
> > -Virgil*
> >
> >
>
> 

Re: Marketing

2017-03-24 Thread Pat Ferrel
The multiple backend support is such a waste of time IMO. The DSL and GPU 
support is super important and should be made even more distributed. The 
current (as I understand it) single threaded GPU per VM is only the first step 
in what will make Mahout important for a long time to come.

Think of Mahout in 5 years what will be important? H2O? Hadoop Mapreduce? 
Flink? I’ll stake my dollar on no. GPUs yes and up the stakes. Streaming online 
learning (kappa style) yes but not sure Mahout is made for this right now. 

Or if we are talking about web site revamp +1, I’d be happy to upgrade my 
section and have only held off waiting to see a redesign or moving to Jekyll.

As to a new mascot, ok, but the old one fits the name. We tried sub-naming 
Mahout-Samsara to symbolize the changing nature and rebirth of the project, 
maybe we should drop the name Mahout altogether. the name Mahout, like the blue 
man, is not relevant to the project anymore and maybe renaming, is good for 
marketing.
 
On Mar 24, 2017, at 7:37 AM, Nikolai Sakharnykh  wrote:

Agree that the website feels outdated. I would add Samsara code example on the 
front page, list of key algorithms implemented, supported backends, github & 
download links, and cut down the news part especially towards the end with flat 
release numbers and dates. Also probably reorganize the tabs.

If we go with honey badger as a mascot do we have any ideas on the logo itself? 
Honey badger biting/eating a snake?) 

-Original Message-
From: Trevor Grant [mailto:trevor.d.gr...@gmail.com] 
Sent: Thursday, March 23, 2017 8:53 PM
To: dev@mahout.apache.org
Cc: u...@mahout.apache.org
Subject: Re: Marketing

A student once asked his teacher, "Master, what is enlightenment?"

The master replied, "When hungry, eat. When tired, sleep."

Sounds like the honey badger to me...

Trevor Grant
Data Scientist
https://github.com/rawkintrevo
http://stackexchange.com/users/3002022/rawkintrevo
http://trevorgrant.org

*"Fortunate is he, who is able to know the causes of things."  -Virgil*


On Thu, Mar 23, 2017 at 5:43 PM, Pat Ferrel  wrote:

> The little blue man (the mahout) was reborn (samsara) as a honey-badger?
> He must be close indeed to reaching true enlightenment, or is that Buddhism?
> 
> 
> On Mar 23, 2017, at 12:42 PM, Andrew Palumbo  wrote:
> 
> +1 on revamp.
> 
> 
> 
> Sent from my Verizon Wireless 4G LTE smartphone
> 
> 
>  Original message 
> From: Trevor Grant 
> Date: 03/23/2017 12:36 PM (GMT-08:00)
> To: u...@mahout.apache.org, dev@mahout.apache.org
> Subject: Marketing
> 
> Hey user and dev,
> 
> With 0.13.0 the Apache Mahout project has added some significant updates.
> 
> The website is starting to feel 'dated' I think it could use a reboot.
> 
> The blue person riding the elephant has less significance in 
> Mahout-Samsara's modular backends.
> 
> Would like to open the floor to discussion on website reboot (and who 
> might be willing to take on such a project), as well as new mascot.
> 
> To kick off- in an offline talk there was the idea of A honey badger 
> (bc honey-badger don't care, just like mahout don't care what back end 
> or native solvers you are using, and also bc a cobra bites a honey 
> badger and he takes a little nap then wakes up and finishes eating the 
> cobra. honey badger eats snakes, and does all the work while the other 
> animals pick up the scraps.
> see this short documentary on the honey badger:
> https://www.youtube.com/watch?v=4r7wHMg5Yjg ) ^^audio not safe for 
> work
> 
> Con: it's almost tooo jokey.
> 
> Other idea: are coy-wolfs.
> 
> Trevor Grant
> Data Scientist
> https://github.com/rawkintrevo
> http://stackexchange.com/users/3002022/rawkintrevo
> http://trevorgrant.org
> 
> *"Fortunate is he, who is able to know the causes of things."  
> -Virgil*
> 
> 

---
This email message is for the sole use of the intended recipient(s) and may 
contain
confidential information.  Any unauthorized review, use, disclosure or 
distribution
is prohibited.  If you are not the intended recipient, please contact the 
sender by
reply email and destroy all copies of the original message.
---



Re: [VOTE] Apache Mahout 0.13.0 Release Candidate

2017-03-24 Thread Pat Ferrel
I can’t +1 because of system integration errors that have to do with scoring 
that could be in Mahout. I doubt it is but don’t have time in the allotted vote 
period to track it down.

My close looking tests of Mahout including the previous driver issues pass.

Not sure if we use this style of vote for releases but I guess I’m +0, wanting 
to see a release and not wanting to block it, just not enough info for a +1.


On Mar 23, 2017, at 11:26 PM, Trevor Grant  wrote:

+1 binding

Verified Signatures and Verification
mvn clean install
On source compiled with `mvn clean install`, `mvn clean package -Phadoop2
-PviennaCL`, and precompiled jars; ran the following in a docker container
(constructed via maven plugin, details below)

$MAHOUT_HOME/bin/mahout spark-itemsimilarity \
   --master spark://$HOSTNAME:7077 \
--input /data/ratings.csv \
--output /tmp/spark_item_sim_output \
--itemIDColumn 1 \
--rowIDColumn 0 \
--sparkExecutorMem 6g

$MAHOUT_HOME/examples/bin/classify-wikipedia.sh -n 2

$MAHOUT_HOME/bin/mahout spark-shell -i
$MAHOUT_HOME/examples/bin/spark-document-classifier.mscala --master
spark://$HOSTNAME:7077


-

Maven plugin:

Hadoop Version = 2.4.1
Spark.Version=1.6.3
(^^ from parent pom)

io.fabric8
docker-maven-plugin
...
sequenceiq/hadoop-docker:${hadoop.version}

/usr/local/hadoop-${hadoop.version}
/usr/local/spark
/opt/mahout



curl -s http://d3kbcqa49mib13.cloudfront.net/spark-${spark.version}-bin-hadoop2.6.tgz \
  | tar -xz -C /usr/local/
cd /usr/local && ln -s spark-${spark.version}-bin-hadoop2.6 spark
$SPARK_HOME/sbin/start-all.sh

echo "Downloading Movie Lens Ratings Data"
curl -s
http://files.grouplens.org/datasets/movielens/ml-latest-small.zip -o
/usr/local/ml-latest-small.zip
unzip /usr/local/ml-latest-small.zip -d /usr/local


Trevor Grant
Data Scientist
https://github.com/rawkintrevo
http://stackexchange.com/users/3002022/rawkintrevo
http://trevorgrant.org

*"Fortunate is he, who is able to know the causes of things."  -Virgil*


On Tue, Mar 21, 2017 at 11:17 AM, Andrew Musselman  wrote:

> This is the vote for release 0.13.0 of Apache Mahout.
> 
> The vote will be going for at least 72 hours and will be closed on Friday,
> March 26th, 2017 or once there are at least 3 PMC +1 binding votes
> (whichever
> occurs earlier).  Please download, test and vote with
> 
> [ ] +1, accept RC as the official 0.13.0 release of Apache Mahout
> [ ] +0, I don't care either way,
> [ ] -1, do not accept RC as the official 0.13.0 release of Apache Mahout,
> because...
> 
> 
> Maven staging repo:
> 
> https://repository.apache.org/content/repositories/
> orgapachemahout-1038/org/apache/mahout/apache-mahout-distribution/0.13.0
> 
> The git tag to be voted upon is mahout-0.13.0
> 



RE: Marketing

2017-03-24 Thread Nikolai Sakharnykh
Agree that the website feels outdated. I would add Samsara code example on the 
front page, list of key algorithms implemented, supported backends, github & 
download links, and cut down the news part especially towards the end with flat 
release numbers and dates. Also probably reorganize the tabs.

If we go with honey badger as a mascot do we have any ideas on the logo itself? 
Honey badger biting/eating a snake?) 

-Original Message-
From: Trevor Grant [mailto:trevor.d.gr...@gmail.com] 
Sent: Thursday, March 23, 2017 8:53 PM
To: dev@mahout.apache.org
Cc: u...@mahout.apache.org
Subject: Re: Marketing

A student once asked his teacher, "Master, what is enlightenment?"

The master replied, "When hungry, eat. When tired, sleep."

Sounds like the honey badger to me...

Trevor Grant
Data Scientist
https://github.com/rawkintrevo
http://stackexchange.com/users/3002022/rawkintrevo
http://trevorgrant.org

*"Fortunate is he, who is able to know the causes of things."  -Virgil*


On Thu, Mar 23, 2017 at 5:43 PM, Pat Ferrel  wrote:

> The little blue man (the mahout) was reborn (samsara) as a honey-badger?
> He must be close indeed to reaching true enlightenment, or is that Buddhism?
>
>
> On Mar 23, 2017, at 12:42 PM, Andrew Palumbo  wrote:
>
> +1 on revamp.
>
>
>
> Sent from my Verizon Wireless 4G LTE smartphone
>
>
>  Original message 
> From: Trevor Grant 
> Date: 03/23/2017 12:36 PM (GMT-08:00)
> To: u...@mahout.apache.org, dev@mahout.apache.org
> Subject: Marketing
>
> Hey user and dev,
>
> With 0.13.0 the Apache Mahout project has added some significant updates.
>
> The website is starting to feel 'dated' I think it could use a reboot.
>
> The blue person riding the elephant has less significance in 
> Mahout-Samsara's modular backends.
>
> Would like to open the floor to discussion on website reboot (and who 
> might be willing to take on such a project), as well as new mascot.
>
> To kick off- in an offline talk there was the idea of A honey badger 
> (bc honey-badger don't care, just like mahout don't care what back end 
> or native solvers you are using, and also bc a cobra bites a honey 
> badger and he takes a little nap then wakes up and finishes eating the 
> cobra. honey badger eats snakes, and does all the work while the other 
> animals pick up the scraps.
> see this short documentary on the honey badger:
> https://www.youtube.com/watch?v=4r7wHMg5Yjg ) ^^audio not safe for 
> work
>
> Con: it's almost tooo jokey.
>
> Other idea: are coy-wolfs.
>
> Trevor Grant
> Data Scientist
> https://github.com/rawkintrevo
> http://stackexchange.com/users/3002022/rawkintrevo
> http://trevorgrant.org
>
> *"Fortunate is he, who is able to know the causes of things."  
> -Virgil*
>
>



Re: [VOTE] Apache Mahout 0.13.0 Release Candidate

2017-03-24 Thread Trevor Grant
+1 binding

Verified Signatures and Verification
mvn clean install
On source compiled with `mvn clean install`, `mvn clean package -Phadoop2
-PviennaCL`, and precompiled jars; ran the following in a docker container
(constructed via maven plugin, details below)

$MAHOUT_HOME/bin/mahout spark-itemsimilarity \
--master spark://$HOSTNAME:7077 \
 --input /data/ratings.csv \
 --output /tmp/spark_item_sim_output \
 --itemIDColumn 1 \
 --rowIDColumn 0 \
 --sparkExecutorMem 6g

$MAHOUT_HOME/examples/bin/classify-wikipedia.sh -n 2

$MAHOUT_HOME/bin/mahout spark-shell -i
$MAHOUT_HOME/examples/bin/spark-document-classifier.mscala --master
spark://$HOSTNAME:7077


-

Maven plugin:

Hadoop Version = 2.4.1
Spark.Version=1.6.3
(^^ from parent pom)

io.fabric8
docker-maven-plugin
...
sequenceiq/hadoop-docker:${hadoop.version}

/usr/local/hadoop-${hadoop.version}
/usr/local/spark
/opt/mahout



curl -s http://d3kbcqa49mib13.cloudfront.net/spark-${spark.version}-bin-hadoop2.6.tgz \
  | tar -xz -C /usr/local/
cd /usr/local && ln -s spark-${spark.version}-bin-hadoop2.6 spark
$SPARK_HOME/sbin/start-all.sh

echo "Downloading Movie Lens Ratings Data"
curl -s
http://files.grouplens.org/datasets/movielens/ml-latest-small.zip -o
/usr/local/ml-latest-small.zip
unzip /usr/local/ml-latest-small.zip -d /usr/local


Trevor Grant
Data Scientist
https://github.com/rawkintrevo
http://stackexchange.com/users/3002022/rawkintrevo
http://trevorgrant.org

*"Fortunate is he, who is able to know the causes of things."  -Virgil*


On Tue, Mar 21, 2017 at 11:17 AM, Andrew Musselman  wrote:

> This is the vote for release 0.13.0 of Apache Mahout.
>
> The vote will be going for at least 72 hours and will be closed on Friday,
> March 26th, 2017 or once there are at least 3 PMC +1 binding votes
> (whichever
> occurs earlier).  Please download, test and vote with
>
> [ ] +1, accept RC as the official 0.13.0 release of Apache Mahout
> [ ] +0, I don't care either way,
> [ ] -1, do not accept RC as the official 0.13.0 release of Apache Mahout,
> because...
>
>
> Maven staging repo:
>
> https://repository.apache.org/content/repositories/
> orgapachemahout-1038/org/apache/mahout/apache-mahout-distribution/0.13.0
>
> The git tag to be voted upon is mahout-0.13.0
>