Howdy all.  Andrew asked me to take a little time and verify the release so
we could get another PMC +1, so I tried to dust off my Mahout skills and
help out... but frankly, the "getting started from a binary distribution"
docs are pretty hard for _me_ to follow.

I start on the main page and navigate to:
https://mahout.apache.org/general/downloads to read what to do with the
14.1 tarball y'all linked above.

Ok, so I'm supposed to set env vars for MAHOUT_HOME and MAHOUT_LOCAL if I'm
running on my laptop, sounds good:

bash-3.2$ mkdir -p mahout/mahout-14.1 && \
    mv Downloads/apache-mahout-distribution-14.1.tar.bz2 mahout/mahout-14.1 && \
    cd mahout/mahout-14.1 && bunzip2 *.bz2 && tar xvf *.tar
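
(For completeness, the env var half of that step, as I read the Downloads
page, would be roughly the following. The path is just where I happened to
unpack things, and the value "true" for MAHOUT_LOCAL is my assumption, so
treat this as a sketch rather than the documented procedure.)

```shell
# Hypothetical location: wherever the 14.1 tarball ended up unpacked.
# Adjust if the archive created its own top-level directory.
export MAHOUT_HOME="$HOME/mahout/mahout-14.1"

# Assumed value for running standalone on a laptop.
export MAHOUT_LOCAL=true

# Sanity check: print what a Mahout launcher script would see.
echo "MAHOUT_HOME=$MAHOUT_HOME MAHOUT_LOCAL=$MAHOUT_LOCAL"
```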


Not sure where to go from here, as the Downloads page tells me
nothing about how to drop into the mahout shell.  So I follow the next top
link to Overview, and read:

"You’ve probably already noticed Mahout has a lot of things going on at
different levels, and it can be hard to know where to start."

Indeed, not sure where to start, hopefully this page will tell me!  Hmm.
The page talks a bit about abstractions, generic application code,
something about the DSL, some DAG stuff and optimizations, engine bindings
and... native solvers?   I thought I was getting started?

Ok, maybe this page is named weird.  How about the Tutorial, that sounds
promising!  So many tutorials to choose from: "Recommenders: CCO with
Last.fm", ok I vaguely imagine "CCO" is probably some kind of co-occurrence
thing, and Last.fm was a music site from before when some members of my
current team were born.  Let's start with something more straightforward,
oooh "Text Classification(shell)", that's better, onward to
https://mahout.apache.org/docs/latest/tutorials/samsara/classify-a-doc-from-the-shell.html
(interestingly, the <title> tag on this page reads "Perceptron and
Winnow", ok... I'll just let that slide on by)

Yay, a Prerequisites section!  It links to
http://mahout.apache.org/users/sparkbindings/play-with-shell.html (why
wasn't this already in my quickstart browsing, did I miss it?).  Right at
the top, I'm warned:

"This tutorial will show you how to play with Mahout’s scala DSL for linear
algebra and its Spark shell. Please keep in mind that this code is still in
a very early experimental stage.
*(Edited for 0.10.2)*"

I'm testing build 14.1.  Seems this may no longer be "in a very early
experimental stage" (hopefully)?

Keeping that in mind, I forge onward, to learn about classifying cereals.
A section on "Installing Mahout and Spark", excellent, I hope this is up to
date! Step 1:

"

   1. Download Apache Spark 1.6.2
   <http://d3kbcqa49mib13.cloudfront.net/spark-1.6.2-bin-hadoop2.6.tgz> and
   unpack the archive file

"

Hmm... Spark 1.6.2, you say?  I'm gonna hope my new and shiny Spark 3.0.1
download doesn't break anything?

"

   2. Change to the directory where you unpacked Spark and type sbt/sbt
   assembly to build it

"

Type "sbt/sbt assembly" to build ... it?  To build Spark?  Both the
prehistoric 1.6.2 link, and the 3.0.1 bits I've got are a *binary*
distribution of spark.  I don't think I need to build it.  If I'm wrong
here, happy to hear what I'm supposed to do, but there's certainly no
build.sbt hanging around any of these binary distros.  I'll imagine I'm ok
with this binary distro and see where things go from here.
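
What "skipping the build" amounts to on my end is something like the sketch
below. The directory name is hypothetical (wherever the Spark download got
unpacked), the helper function is mine, and I'm assuming the Mahout scripts
pick up SPARK_HOME; none of that comes from the tutorial.

```shell
# Returns 0 if the given directory already ships a runnable
# bin/spark-shell, which is what I'd expect of a binary distribution.
looks_like_spark_binary() {
  [ -x "$1/bin/spark-shell" ]
}

# Hypothetical path: wherever the Spark download was unpacked.
SPARK_DIR="${SPARK_DIR:-$HOME/spark-3.0.1-bin-hadoop2.7}"

if looks_like_spark_binary "$SPARK_DIR"; then
  export SPARK_HOME="$SPARK_DIR"
  echo "Using prebuilt Spark at $SPARK_HOME, no sbt build needed"
else
  echo "No bin/spark-shell under $SPARK_DIR" >&2
fi
```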

"

   3. Create a directory for Mahout somewhere on your machine, change to
   there and checkout the master branch of Apache Mahout from GitHub: git
   clone https://github.com/apache/mahout mahout

"

Hmm, I have to build from source?  Scanning down the page, no, apparently no
instructions on what to do with my tarball here.
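
What I'd have expected to find is something like the sketch below, going by
the `bin/mahout spark-shell` invocation Andrew mentions for the source build
in his mail quoted further down. I'm assuming the binary tarball has the
same bin/ layout, which I haven't confirmed, and the default path and helper
function are mine.

```shell
# Hypothetical: MAHOUT_HOME points at the unpacked binary distribution.
MAHOUT_HOME="${MAHOUT_HOME:-$HOME/mahout/mahout-14.1}"

# Path of the launcher, assuming the same bin/ layout as the source build.
mahout_launcher() {
  printf '%s/bin/mahout\n' "$1"
}

launcher="$(mahout_launcher "$MAHOUT_HOME")"
if [ -x "$launcher" ]; then
  # Drop into the Mahout Spark shell.
  "$launcher" spark-shell
else
  echo "No launcher at $launcher; is MAHOUT_HOME right?" >&2
fi
```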

So I suppose I could try to build from source and go from there, but... I'm
not sure how a "so old I'm new again" person can even get started with the
binary distribution, try out a simple text classification in the mahout
shell, and verify that said binary distro is worthy of release.

:\

  -jake



On Fri, Sep 25, 2020 at 3:15 PM Andrew Musselman <andrew.mussel...@gmail.com>
wrote:

> I am +1 (binding) on this release; I have:
> (1) Downloaded both binary and source releases
> (2) Verified checksums and signatures
> (3) Built from source
> (4) Run the spark-shell, and loaded and run the sparse DRM multiplier
> function for both distros:
>
> Note that in the source build the shell script is
> `./distribution/target/apache-mahout-14.1/bin/mahout
> spark-shell`.
>
> scala> :load /home/akm/a/src/test/repository.apache.org/content/repositories/orgapachemahout-1066/org/apache/mahout/mahout/14.1/apache-mahout-14.1/distribution/target/apache-mahout-14.1/examples/bin/SparseSparseDrmTimer.mscala
> Loading /home/akm/a/src/test/repository.apache.org/content/repositories/orgapachemahout-1066/org/apache/mahout/mahout/14.1/apache-mahout-14.1/distribution/target/apache-mahout-14.1/examples/bin/SparseSparseDrmTimer.mscala
> .
> ..
> timeSparseDRMMMul: (m: Int, n: Int, s: Int, para: Int, pctDense: Double, seed: Long)Long
>
> scala> timeSparseDRMMMul(1000,1000,1000,1,.02,1234L)
> res6: Long = 2205
>
> So far that is two binding +1 votes and two non-binding +1 votes, with no
> -1 votes. Still looking for another PMC vote.
>
> On Thu, Sep 24, 2020 at 6:45 PM Trevor Grant <trevor.d.gr...@gmail.com>
> wrote:
>
> > Sorry for the delay on my vote.
> >
> > Sigs checked out- source built without issues and passed all tests.
> > Additional tests as described on RC6 candidate.
> >
> > I'm +1 binding.
> >
> > Thanks again Andrew!!
> >
> > tg
> >
> >
> > On Wed, Sep 16, 2020 at 12:56 PM Trevor Grant <trevor.d.gr...@gmail.com>
> > wrote:
> >
> > > Away from my computer on vacation- if it's still up next week I can
> vote.
> > >
> > > On Wed, Sep 16, 2020, 11:27 AM Christofer Dutz <
> > christofer.d...@c-ware.de>
> > > wrote:
> > >
> > >> +1 (non-binding)
> > >>
> > >> Chris
> > >>
> > >> [OK] Download all staged artifacts under the url specified in the
> > release
> > >> vote email into a directory we’ll now call download-dir.
> > >> [MINOR] Verify the signature is correct: Additional Apache tutorial on
> > >> how to verify downloads can be found here.
> > >> [OK] Check if the signature references an Apache email address.
> > >> [MINOR] Verify the SHA512 hashes:
> > >> [OK] Unzip the archive
> > >> [OK] Verify the existence of LICENSE, NOTICE, README files in the
> > >> extracted source bundle.
> > >> [MINOR] Verify the content of LICENSE, NOTICE, README files in the
> > >> extracted source bundle.
> > >> [MINOR] Run RAT externally to ensure there are no surprises.
> > >> [MINOR] Search for SNAPSHOT references
> > >> [OK] Search for Copyright references, and if they are in headers, make
> > >> sure these files containing them are mentioned in the LICENSE file.
> > >> [OK] Build the project according to the information in the README.md
> > file.
> > >>
> > >> Remarks:
> > >> - The signature is correct, but no secure trust chain could be
> > >> established (Possibly worth attending a Key-Signing-Party as soon as
> > they
> > >> are happening again)
> > >> - There are no SHA512 hashes, I validated against the SHA1 hashes
> > instead
> > >> - Two CERN files weren't listed in the LICENSE
> > (KeyTypeValueTypeProcedure
> > >> , ValueTypeComparator)
> > >> - One CERN file still has a double CERN/Apache Header
> (NegativeBinomial)
> > >> - The community modules all contain SNAPSHOT references as they
> weren't
> > >> enabled in the release
> > >> - The distribution references SNAPSHOT versions of community modules
> > >> (However not in an always-on profile)
> > >>
> > >>
> > >>
> > >> Am 11.09.20, 19:18 schrieb "Andrew Musselman" <
> > >> andrew.mussel...@gmail.com>:
> > >>
> > >>     My bad, RC7 out now!
> > >>
> > >>     Binaries:
> > >>
> > >>
> >
> https://repository.apache.org/content/repositories/orgapachemahout-1066/org/apache/mahout/apache-mahout-distribution/14.1/
> > >>
> > >>     Source:
> > >>
> > >>
> >
> https://repository.apache.org/content/repositories/orgapachemahout-1066/org/apache/mahout/mahout/14.1/
> > >>
> > >>     On Fri, Sep 11, 2020 at 1:04 AM Christofer Dutz <
> > >> christofer.d...@c-ware.de>
> > >>     wrote:
> > >>
> > >>     > Hi,
> > >>     >
> > >>     > Sad to see you didn't merge my changes in "issue/MAHOUT-2117"
> back
> > >> to
> > >>     > master before cutting the next RC :-(
> > >>     >
> > >>     > So I'm not going to vote this time as the result would be the
> same
> > >> as last
> > >>     > time ...
> > >>     >
> > >>     > Chris
> > >>     >
> > >>     >
> > >>     > Am 11.09.20, 03:17 schrieb "Trevor Grant" <
> > trevor.d.gr...@gmail.com
> > >> >:
> > >>     >
> > >>     >     Thank you so much for getting this out Andrew.
> > >>     >
> > >>     >     I verified all checksums/sigs.
> > >>     >
> > >>     >     I successfully built the source including all tests. (I did
> > >> this in the
> > >>     >     public docker container rawkintrevo/mahout-builder-base)
> > >>     >
> > >>     >     I also tested the binaries in the public docker container
> > >>     >     rawkintrevo/mahoutgui , but bashing into the running
> > container,
> > >>     > unpacking
> > >>     >     both the binary archives, and then aiming and running the
> > mahout
> > >>     > example
> > >>     >     notebook in turn against each of the unpacked binaries from
> > >> each of the
> > >>     >     archives.  I did this in place of spark-shell, as I think
> it's
> > >> a more
> > >>     >     elegant solution going forward, but would encourage others
> to
> > >> test
> > >>     > against
> > >>     >     mahout spark-shell.
> > >>     >
> > >>     >     So given all of that, I give an enthusiastic +1
> > >>     >
> > >>     >     On Thu, Sep 10, 2020 at 3:37 PM Andrew Musselman <
> > >> a...@apache.org>
> > >>     > wrote:
> > >>     >
> > >>     >     > Binaries:
> > >>     >     >
> > >>     >     >
> > >>     >
> > >>
> >
> https://repository.apache.org/content/repositories/orgapachemahout-1065/org/apache/mahout/apache-mahout-distribution/14.1/
> > >>     >     >
> > >>     >     > Source:
> > >>     >     >
> > >>     >     >
> > >>     >
> > >>
> >
> https://repository.apache.org/content/repositories/orgapachemahout-1065/org/apache/mahout/mahout/14.1/
> > >>     >     >
> > >>     >     > Please check checksums and signatures, run the shell, do
> > some
> > >>     > computation,
> > >>     >     > run your favorite jobs, and let us know how it looks.
> > >>     >     >
> > >>     >     > Thanks!
> > >>     >     >
> > >>     >     > Best
> > >>     >     > Andrew
> > >>     >     >
> > >>     >
> > >>     >
> > >>
> > >>
> >
>


-- 

  -jake
