In any case, still getting this error in the console when I run this block:

"import org.apache.mahout.math._
import org.apache.mahout.math.scalabindings._
import org.apache.mahout.math.drm._
import org.apache.mahout.math.scalabindings.RLikeOps._
import org.apache.mahout.math.drm.RLikeDrmOps._
import org.apache.mahout.sparkbindings._

implicit val sdc: org.apache.mahout.sparkbindings.SparkDistributedContext = sc2sdc(sc)"

"<console>:21: error: object mahout is not a member of package org.apache
import org.apache.mahout.math._"

On Fri, May 20, 2016 at 2:31 PM, Andrew Musselman <andrew.mussel...@gmail.com> wrote:

> Ah, well I cloned the Till branch per your Nov 3 article..
>
> git clone https://github.com/tillrohrmann/incubator-zeppelin.git
>
> On Fri, May 20, 2016 at 2:28 PM, Trevor Grant <trevor.d.gr...@gmail.com> wrote:
>
>> That's a "new" feature in the 0.6-snapshot... Say within the last month or
>> two, how long has it been since you did a git pull?
>>
>> I'll update soon with a note on that.
>>
>> I can also create a gist with the code.
>> On May 20, 2016 4:24 PM, "Andrew Musselman" <andrew.mussel...@gmail.com> wrote:
>>
>> > At this step of the tutorial I'm stuck because I don't have an "Import
>> > Note" link in my Zeppelin home:
>> >
>> > "I’m going to do you another favor. Go to the Zeppelin home page and click
>> > on ‘Import Note’. When given the option between URL and json, click on URL
>> > and enter the following link:
>> >
>> > https://raw.githubusercontent.com/rawkintrevo/mahout-zeppelin/master/%5BMAHOUT%5D%5BPROVING-GROUNDS%5DLinear%20Regression%20in%20Spark.json
>> > "
>> >
>> > On Fri, May 20, 2016 at 12:35 PM, Trevor Grant <trevor.d.gr...@gmail.com> wrote:
>> >
>> > > FYI:
>> > >
>> > > Looks like Flink shell is fixed :D
>> > >
>> > > https://github.com/apache/flink/pull/1913
>> > >
>> > > (I tested, is working good).
>> > >
>> > > Trevor Grant
>> > > Data Scientist
>> > > https://github.com/rawkintrevo
>> > > http://stackexchange.com/users/3002022/rawkintrevo
>> > > http://trevorgrant.org
>> > >
>> > > *"Fortunate is he, who is able to know the causes of things." -Virgil*
>> > >
>> > > On Fri, May 20, 2016 at 1:46 PM, Suneel Marthi <smar...@apache.org> wrote:
>> > >
>> > > > On Fri, May 20, 2016 at 12:54 PM, Trevor Grant <trevor.d.gr...@gmail.com> wrote:
>> > > >
>> > > > > Dmitriy really nailed it on the head in his reply to the post, which I'll
>> > > > > rebroadcast below. In essence, the whole reason you are (theoretically)
>> > > > > using Mahout is that the data is too big to fit in memory. If it's too big
>> > > > > to fit in memory, then it's probably too big to plot each point (e.g.
>> > > > > trillions of rows, you only have so many pixels). For the example I
>> > > > > randomly sampled a matrix.
>> > > > >
>> > > > > So as Dmitriy says, in Mahout we need to have functions that will
>> > > > > 'preprocess' the data into something plottable.
>> > > > >
>> > > > > For the Zeppelin plotting thing, we need to have a function that will
>> > > > > spit out a tsv-like string of the data we want plotted.
>> > > > >
>> > > > > I agree an honest Mahout interpreter in Zeppelin is probably worth doing.
>> > > > > There are a couple of ways to go about it. I opened up the discussion on
>> > > > > dev@Zeppelin and didn't get any replies. I'm going to take that to mean
>> > > > > we can do it in a way that makes the most sense to Mahout users...
>> > > > >
>> > > > > First steps are to include some methods in Mahout that will do that
>> > > > > preprocessing, and one that will turn something into a tsv string.
>> > > > >
>> > > > > I have some general ideas on possible approaches to making an
>> > > > > honest-mahout interpreter, but I want to play in the code and look at the
>> > > > > Flink-Mahout shell a bit before I try to organize my thoughts and present
>> > > > > them.
>> > > > >
>> > > >
>> > > > FYI Trevor, there's no Flink-Mahout shell today; in large part because the
>> > > > Flink Shell is still busted on their end and we on the Mahout end have not
>> > > > had time to muck with it. What exists today is the Mahout-Spark shell.
>> > > >
>> > > > >
>> > > > > ...(2) not sure what is the point of supporting distributed anything. It
>> > > > > is distributed presumably because it is hard to keep it in memory.
>> > > > > Therefore, plotting anything distributed potentially presents 2 problems:
>> > > > > storage space and overplotting due to number of points. The idea is that
>> > > > > we have to work out algorithms that condense big data information into
>> > > > > small plottable information (like density grids, for example, or
>> > > > > histograms)....
>> > > > >
>> > > >
>> > > > Agreed, something like sampling x% of points from a DRM (like the visuals I
>> > > > had from Palumbo for the talk in Vancouver that demonstrated the concept)
>> > > >
>> > > > >
>> > > > > On Fri, May 20, 2016 at 10:22 AM, Pat Ferrel <p...@occamsmachete.com> wrote:
>> > > > >
>> > > > > > Great job Trevor, we’ll need this detail to smooth out the sharp edges
>> > > > > > and any guidance from you or the Zeppelin community will be a big help.
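
A note on the "condense before plotting" point quoted above (density grids, histograms): the sketch below shows one way that kind of preprocessing could look. It is plain Scala on a local `Seq[Double]`, not a DRM, and `HistogramSketch.histogram` is a hypothetical helper, not an existing Mahout function. The equal-width binning is an assumption made purely for illustration.

```scala
object HistogramSketch {
  // Hypothetical helper (not a Mahout API): counts values into `bins`
  // equal-width buckets and returns (left edge of bucket, count) pairs.
  def histogram(values: Seq[Double], bins: Int): Seq[(Double, Int)] = {
    require(bins > 0, "bins must be positive")
    if (values.isEmpty) Seq.empty
    else {
      val lo = values.min
      val hi = values.max
      // Fall back to width 1.0 when every value is identical.
      val width = if (hi > lo) (hi - lo) / bins else 1.0
      val counts = new Array[Int](bins)
      for (v <- values) {
        // Clamp so the maximum value lands in the last bucket.
        val idx = math.min(((v - lo) / width).toInt, bins - 1)
        counts(idx) += 1
      }
      counts.toSeq.zipWithIndex.map { case (c, i) => (lo + i * width, c) }
    }
  }
}
```

However many values go in, only `bins` points come out, which is the property that matters for plotting something too large to draw point by point.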
>> > > > > >
>> > > > > > On May 20, 2016, at 8:13 AM, Shannon Quinn <squ...@gatech.edu> wrote:
>> > > > > >
>> > > > > > Agreed, thoroughly enjoying the blog post.
>> > > > > >
>> > > > > > On 5/19/16 12:01 AM, Andrew Palumbo wrote:
>> > > > > > > Well done, Trevor! I've not yet had a chance to try this in Zeppelin
>> > > > > > > but I just read the blog which is great!
>> > > > > > >
>> > > > > > > -------- Original message --------
>> > > > > > > From: Trevor Grant <trevor.d.gr...@gmail.com>
>> > > > > > > Date: 05/18/2016 2:44 PM (GMT-05:00)
>> > > > > > > To: dev@mahout.apache.org
>> > > > > > > Subject: Re: Future Mahout - Zeppelin work
>> > > > > > >
>> > > > > > > Ah thank you.
>> > > > > > >
>> > > > > > > Fixing now.
>> > > > > > >
>> > > > > > > On Wed, May 18, 2016 at 1:04 PM, Andrew Palumbo <ap....@outlook.com> wrote:
>> > > > > > >
>> > > > > > >> Hey Trevor- Just refreshed your readme. The jar that I mentioned is
>> > > > > > >> actually:
>> > > > > > >>
>> > > > > > >> /home/username/.m2/repository/org/apache/mahout/mahout-spark_2.10/0.12.1-SNAPSHOT/mahout-spark_2.10-0.12.1-SNAPSHOT-dependency-reduced.jar
>> > > > > > >>
>> > > > > > >> rather than:
>> > > > > > >>
>> > > > > > >> /home/username/.m2/repository/org/apache/mahout/mahout-spark-shell_2.10/0.12.1-SNAPSHOT/mahout-spark_2.10-0.12.1-SNAPSHOT-dependency-reduced.jar
>> > > > > > >>
>> > > > > > >> (In the spark module that is)
>> > > > > > >> ________________________________________
>> > > > > > >> From: Trevor Grant <trevor.d.gr...@gmail.com>
>> > > > > > >> Sent: Wednesday, May 18, 2016 11:02:43 AM
>> > > > > > >> To: dev@mahout.apache.org
>> > > > > > >> Subject: Re: Future Mahout - Zeppelin work
>> > > > > > >>
>> > > > > > >> ah yes- I remember you pointing that out to me too.
>> > > > > > >>
>> > > > > > >> I got sidetracked yesterday for most of the day on an adventure in
>> > > > > > >> getting Zeppelin to work right after I accidentally updated to the new
>> > > > > > >> snapshot (free hint: the secret was to clear my cache *face-palm*)
>> > > > > > >>
>> > > > > > >> I'm going to add that dependency to the readme.md now.
>> > > > > > >>
>> > > > > > >> thanks,
>> > > > > > >> tg
>> > > > > > >>
>> > > > > > >> On Wed, May 18, 2016 at 9:59 AM, Andrew Palumbo <ap....@outlook.com> wrote:
>> > > > > > >>
>> > > > > > >>> Trevor this is very cool- I have not been able to look at it closely
>> > > > > > >>> yet but just a small point: I believe that you'll also need to add the
>> > > > > > >>>
>> > > > > > >>> mahout-spark_2.10-0.12.1-SNAPSHOT-dependency-reduced.jar
>> > > > > > >>>
>> > > > > > >>> For things like the classification stats, confusion matrix, and
>> > > > > > >>> t-digest.
>> > > > > > >>>
>> > > > > > >>> Andy
>> > > > > > >>>
>> > > > > > >>> ________________________________________
>> > > > > > >>> From: Trevor Grant <trevor.d.gr...@gmail.com>
>> > > > > > >>> Sent: Wednesday, May 18, 2016 10:47:21 AM
>> > > > > > >>> To: dev@mahout.apache.org
>> > > > > > >>> Subject: Re: Future Mahout - Zeppelin work
>> > > > > > >>>
>> > > > > > >>> I still need to update my readme/env per Pat's comments below, however
>> > > > > > >>> without further ado, I present two notebooks that integrate Mahout +
>> > > > > > >>> Spark + Zeppelin + ggplot2
>> > > > > > >>>
>> > > > > > >>> https://github.com/rawkintrevo/mahout-zeppelin
>> > > > > > >>>
>> > > > > > >>> Supposing you have a somewhat recent version of Zeppelin 0.6 with sparkr
>> > > > > > >>> support running already, you may import the following raw notes directly
>> > > > > > >>> into Zeppelin:
>> > > > > > >>>
>> > > > > > >>> https://raw.githubusercontent.com/rawkintrevo/mahout-zeppelin/master/%5BMAHOUT%5D%5BPROVING-GROUNDS%5DLinear%20Regression%20in%20Spark.json
>> > > > > > >>> https://raw.githubusercontent.com/rawkintrevo/mahout-zeppelin/master/%5BMAHOUT%5D%5BPROVING-GROUNDS%5DSpark-Mahout%2Bggplot2.json
>> > > > > > >>>
>> > > > > > >>> So my thoughts on next steps, which I'm positing only as a starting
>> > > > > > >>> point for discussion, and are in no particular order of importance:
>> > > > > > >>>
>> > > > > > >>> - Blog on HOWTO for everyman (assumes no familiarity with Mahout, and
>> > > > > > >>> only enough familiarity with Zeppelin to have Zeppelin + SparkR support)
>> > > > > > >>> - Some syntactic sugar somewhere in Mahout to convert a matrix into a
>> > > > > > >>> tsv string. (with some sanity, e.g. a sample of a matrix)
>> > > > > > >>> - Figure out with Zeppelin community what deeper integration feels like -
>> > > > > > >>> e.g. build-profile vs. tutorial
>> > > > > > >>>   - I think the case for making a build-profile is that Zeppelin is
>> > > > > > >>>   first and foremost a data science tool for non-technical users.
>> > > > > > >>>   - If we go that route I'll need some more support finding out what is
>> > > > > > >>>   the absolute minimum 'bare-bones' mahout we can include, e.g. does the
>> > > > > > >>>   user have to have mahout installed? To be discussed.
>> > > > > > >>> - Add matplotlib (python) "support" -> paragraph showing how to do the
>> > > > > > >>> same thing in Python.
>> > > > > > >>>
>> > > > > > >>> The basic deal here is we are:
>> > > > > > >>> 1) Setting up a standard Zeppelin Spark Interpreter to act like a Mahout
>> > > > > > >>> interpreter
>> > > > > > >>>   - This is taken care of by setting some env. variables, adding some
>> > > > > > >>>   dependencies, and importing relevant packages
>> > > > > > >>> 2) do mahout things as you do
>> > > > > > >>> 3) export table to tsv string, which is passed to a resource pool
>> > > > > > >>>   - This could be done to a disk if you didn't have zeppelin
>> > > > > > >>> 4) read the tsv from the resource pool (or disk if you didn't have
>> > > > > > >>> zeppelin) in R (python soon) and create a <plot package of your choice>
>> > > > > > >>>
>> > > > > > >>> To Pat's point- this is a kind of clumsy pipeline, however the Zeppelin
>> > > > > > >>> wrapper at least makes it *feel* less so.
>> > > > > > >>>
>> > > > > > >>> On Tue, May 17, 2016 at 1:17 PM, Pat Ferrel <p...@occamsmachete.com> wrote:
>> > > > > > >>>> Seems like there is plenty to use in ggplot or python but the pipeline
>> > > > > > >>>> is a little convoluted (so maybe no need for Angular integration). To
>> > > > > > >>>> get graphics out of Mahout it would be nice to not require knowledge of
>> > > > > > >>>> R and/or python. Knowing Mahout is already bad enough but I guess the
>> > > > > > >>>> API from the Mahout side for plotting could be Scala syntactic sugar.
>> > > > > > >>>> What and how this all is installed and set up is the next question.
>> > > > > > >>>>
>> > > > > > >>>> BTW this is what I use elsewhere (Mahout as a lib to this code)
>> > > > > > >>>>
>> > > > > > >>>> "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
>> > > > > > >>>> "spark.kryo.registrator": "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator",
>> > > > > > >>>> "spark.kryo.referenceTracking": "false",
>> > > > > > >>>> "spark.kryoserializer.buffer": "300m",
>> > > > > > >>>>
>> > > > > > >>>> afaik you will only see if Kryo is working when you have to serialize a
>> > > > > > >>>> mahout specific data type like vector or drm, something registered with
>> > > > > > >>>> Kryo.
>> > > > > > >>>>
>> > > > > > >>>> On May 16, 2016, at 6:18 PM, Trevor Grant <trevor.d.gr...@gmail.com> wrote:
>> > > > > > >>>>
>> > > > > > >>>> As a quick recap- we're trying to leverage Zeppelin for charting.
>> > > > > > >>>>
>> > > > > > >>>> It seems as though this can be achieved by
>> > > > > > >>>> - Adding properties to the Spark Interpreter
>> > > > > > >>>> - Adding dependency jars to the spark interpreter
>> > > > > > >>>> - importing in a spark paragraph
>> > > > > > >>>>
>> > > > > > >>>> All seems to be working well, but I've fooled myself into thinking
>> > > > > > >>>> things were 'working' before because I wasn't actually integrating.
>> > > > > > >>>> Below I will outline the imports/properties, please look over and tell
>> > > > > > >>>> me if I'm theoretically missing anything.
>> > > > > > >>>>
>> > > > > > >>>> The next phase for me will be
>> > > > > > >>>> 1) Convert a matrix to some sort of serializable object that I can
>> > > > > > >>>> easily unpack from R
>> > > > > > >>>> 2) use Zeppelin's resource buffers to pass the object
>> > > > > > >>>> 3) collect the object in an R paragraph, convert it to a dataframe then
>> > > > > > >>>> map using ggplot
>> > > > > > >>>>
>> > > > > > >>>> Once I have a working prototype I will work to add some syntactic sugar
>> > > > > > >>>> to prepare the matrix from the scala side and pass to zeppelin (using
>> > > > > > >>>> resource pools so the same functionality can be reused in Flink) and an
>> > > > > > >>>> R library containing some functions which will pull the data out of the
>> > > > > > >>>> resource pool and spit out a dataframe.
>> > > > > > >>>>
>> > > > > > >>>> Once it's in a Dataframe in R- go nuts with any plotting package you
>> > > > > > >>>> like. Likewise, it should be possible to do the same thing with
>> > > > > > >>>> matplotlib and python
>> > > > > > >>>> (https://gist.github.com/andershammar/9070e0f6916a0fbda7a5)
>> > > > > > >>>>
>> > > > > > >>>> All of this doesn't necessarily require any changing of the Zeppelin
>> > > > > > >>>> source code, and isn't very intrusive or difficult to set up. I'll make
>> > > > > > >>>> a blog post but it's almost a textbook entry tutorial on using imports
>> > > > > > >>>> in Zeppelin. (e.g. a tutorial would be just as at home on the Zeppelin
>> > > > > > >>>> site as it would on the Mahout site).
>> > > > > > >>>>
>> > > > > > >>>> Now, there has been some talk of using Zeppelin's angularJS. Things get
>> > > > > > >>>> a little more hairy in that case, but we could make an optional build
>> > > > > > >>>> profile that would make zeppelin recognize matrices as tables and expose
>> > > > > > >>>> all of the built in charting features of Zeppelin.
>> > > > > > >>>>
>> > > > > > >>>> If you're not adding a bunch of custom charts to Zeppelin (which would
>> > > > > > >>>> be somewhat tedious), you're going to end up with a lot of examples
>> > > > > > >>>> where you create a table in Mahout/Spark, pass it to AngularJS, then
>> > > > > > >>>> some AngularJS code charts it for you. At that point however, you're
>> > > > > > >>>> doing just as much work, if not more, than it would be to simply pass to
>> > > > > > >>>> R or Python and let ggplot or matplotlib do the work for you.
>> > > > > > >>>>
>> > > > > > >>>> Finally, I haven't run into any errors yet using Kryo (which in part is
>> > > > > > >>>> what makes me fear I'm not doing this right... it was too easy...) If
>> > > > > > >>>> anything seems redundant or missing, please call it out.
>> > > > > > >>>>
>> > > > > > >>>> Add Properties to Spark interp:
>> > > > > > >>>>
>> > > > > > >>>> spark.kryo.registrator org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator
>> > > > > > >>>> spark.serializer org.apache.spark.serializer.KryoSerializer
>> > > > > > >>>>
>> > > > > > >>>> Add artifacts (need to change these to maven not local, also need to
>> > > > > > >>>> add/change one jar per below, however this does run):
>> > > > > > >>>>
>> > > > > > >>>> /home/trevor/.m2/repository/org/apache/mahout/mahout-math/0.12.1-SNAPSHOT/mahout-math-0.12.1-SNAPSHOT.jar
>> > > > > > >>>> /home/trevor/.m2/repository/org/apache/mahout/mahout-math-scala_2.10/0.12.1-SNAPSHOT/mahout-math-scala_2.10-0.12.1-SNAPSHOT.jar
>> > > > > > >>>> /home/trevor/.m2/repository/org/apache/mahout/mahout-spark_2.10/0.12.1-SNAPSHOT/mahout-spark_2.10-0.12.1-SNAPSHOT.jar
>> > > > > > >>>> /home/trevor/.m2/repository/org/apache/mahout/mahout-spark-shell_2.10/0.12.1-SNAPSHOT/mahout-spark-shell_2.10-0.12.1-SNAPSHOT.jar
>> > > > > > >>>>
>> > > > > > >>>> Add following code to first paragraph of notebook:
>> > > > > > >>>> ```
>> > > > > > >>>> %spark
>> > > > > > >>>> import org.apache.mahout.math._
>> > > > > > >>>> import org.apache.mahout.math.scalabindings._
>> > > > > > >>>> import org.apache.mahout.math.drm._
>> > > > > > >>>> import org.apache.mahout.math.scalabindings.RLikeOps._
>> > > > > > >>>> import org.apache.mahout.math.drm.RLikeDrmOps._
>> > > > > > >>>> import org.apache.mahout.sparkbindings._
>> > > > > > >>>>
>> > > > > > >>>> implicit val sdc: org.apache.mahout.sparkbindings.SparkDistributedContext =
>> > > > > > >>>> sc2sdc(sc)
>> > > > > > >>>> ```
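
For step 1 of the "next phase" list quoted above (turning a matrix into something R can unpack), a rough sketch of the sample-then-TSV idea follows. Assumptions: the matrix has already been collected into local rows (`Seq[Array[Double]]`), and `sampleRows` and `toTsv` are hypothetical names, not Mahout or Zeppelin functions.

```scala
import scala.util.Random

object TsvSketch {
  // Hypothetical helper: keeps each row independently with probability
  // `fraction`, so the result is small enough to plot. Seeded for repeatability.
  def sampleRows(rows: Seq[Array[Double]], fraction: Double, seed: Long = 42L): Seq[Array[Double]] = {
    val rng = new Random(seed)
    rows.filter(_ => rng.nextDouble() < fraction)
  }

  // Hypothetical helper: one header line, then one tab-separated line per row.
  def toTsv(header: Seq[String], rows: Seq[Array[Double]]): String =
    (header.mkString("\t") +: rows.map(_.mkString("\t"))).mkString("\n")
}
```

In a Zeppelin paragraph, output that starts with `%table ` followed by tab-separated rows is rendered as a table, so something like `println("%table " + TsvSketch.toTsv(header, sampled))` should be enough to eyeball the sample before handing the same string to R.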
>> > > > > > >>>>
>> > > > > > >>>> On Mon, May 16, 2016 at 6:42 PM, Pat Ferrel <p...@occamsmachete.com> wrote:
>> > > > > > >>>>> Creating an mc used to do some Kryo setup, like registering
>> > > > > > >>>>> serializers or serializer factories IIRC. Also there is the Spark conf
>> > > > > > >>>>> for allocating memory for the Kryo buffer. Look at the code in the mc
>> > > > > > >>>>> creation code in the Spark package helpers. All can be done in
>> > > > > > >>>>> straight Spark and passed in to create the mc when needed. Again from
>> > > > > > >>>>> old weak brain cells but I think that is part of what makes the Mahout
>> > > > > > >>>>> shell different than the Spark shell plus imports, it auto-creates the
>> > > > > > >>>>> mc instead of or along with an sc.
>> > > > > > >>>>>
>> > > > > > >>>>> When I get back to my computer I can check.
>> > > > > > >>>>>
>> > > > > > >>>>> On May 16, 2016, at 3:40 PM, Andrew Palumbo <ap....@outlook.com> wrote:
>> > > > > > >>>>> Trevor,
>> > > > > > >>>>>
>> > > > > > >>>>> Could you post any kryo errors that you may be having?
>> > > > > > >>>>>
>> > > > > > >>>>> ________________________________
>> > > > > > >>>>> From: Andrew Palumbo <ap....@outlook.com>
>> > > > > > >>>>> Sent: Monday, May 16, 2016 6:25:07 PM
>> > > > > > >>>>> To: mahout
>> > > > > > >>>>> Subject: Future Mahout - Zeppelin work
>> > > > > > >>>>>
>> > > > > > >>>>> To Dmitriy's point, I agree ggplot is def the priority. The mahout
>> > > > > > >>>>> plots at this point are really just a POC, but at some point we may
>> > > > > > >>>>> want to integrate some data transformation features into the mahout
>> > > > > > >>>>> plots classes so they're really more future work.
>> > > > > > >>>>>
>> > > > > > >>>>> long story short:
>> > > > > > >>>>>
>> > > > > > >>>>>> OK. I'll read through the examples and try to do something with some
>> > > > > > >>>>>> data, then do a ggplot and/or an angular plot on it (probably ggplot).
>> > > > > > >>>>>> I'll do a quick tutorial. Then I'll reopen discussion on that Zeppelin
>> > > > > > >>>>>> issue about whether we want to go ahead and add another interpreter.
>> > > > > > >>>>>
>> > > > > > >>>>> Sounds Great.
>> > > > > > >>>>>
>> > > > > > >>>>> Thank you.
>> > > > > > >>>>>
>> > > > > > >>>>> ________________________________
>> > > > > > >>>>> From: Trevor Grant <trevor.d.gr...@gmail.com>
>> > > > > > >>>>> Sent: Monday, May 16, 2016 5:49:17 PM
>> > > > > > >>>>> To: Dmitriy Lyubimov
>> > > > > > >>>>> Cc: Andrew Palumbo; Pat Ferrel; Suneel Marthi
>> > > > > > >>>>> Subject: Re: Intro - Future Mahout - Zeppelin work
>> > > > > > >>>>>
>> > > > > > >>>>> I just signed up for dev, should i just reply all and cc dev or start
>> > > > > > >>>>> a new thread?
>> > > > > > >>>>>
>> > > > > > >>>>> On Mon, May 16, 2016 at 4:46 PM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:
>> > > > > > >>>>> fwiw ggplot2 is pretty darn advanced:) i am a bit skeptical smile
>> > > > > > >>>>> would have something that ggplot2 would not, the other way around is
>> > > > > > >>>>> much more expected by me:)
>> > > > > > >>>>>
>> > > > > > >>>>> anyhow if ggplot2 and matplotlib are available in Zeppelin without
>> > > > > > >>>>> major limitations, it sounds like Zeppelin should be an all around
>> > > > > > >>>>> very nice venue then.
>> > > > > > >>>>>
>> > > > > > >>>>> On Mon, May 16, 2016 at 2:42 PM, Andrew Palumbo <ap....@outlook.com> wrote:
>> > > > > > >>>>>
>> > > > > > >>>>> yeah we should probably move this over to dev@
>> > > > > > >>>>>
>> > > > > > >>>>> sorry- answering a question from a couple emails back on the thread.
>> > > > > > >>>>>
>> > > > > > >>>>> If possible, I think it would be great to eventually have both (native
>> > > > > > >>>>> mahout/smile plots and ggplot), since in the future we're going to be
>> > > > > > >>>>> adding more visualization features rather than simple scatter plots
>> > > > > > >>>>> etc that may not be covered by ggplot.
>> > > > > > >>>>>
>> > > > > > >>>>> That's why we were thinking about using angular and the pngs.
>> > > > > > >>>>>
>> > > > > > >>>>> But what you're saying in your last email would be great!
>> > > > > > >>>>>
>> > > > > > >>>>> Thank you!
>> > > > > > >>>>>
>> > > > > > >>>>> ________________________________
>> > > > > > >>>>> From: Trevor Grant <trevor.d.gr...@gmail.com>
>> > > > > > >>>>> Sent: Monday, May 16, 2016 5:33:12 PM
>> > > > > > >>>>> To: Andrew Palumbo
>> > > > > > >>>>> Cc: Pat Ferrel; Suneel Marthi; Dmitriy Lyubimov
>> > > > > > >>>>> Subject: Re: Intro - Future Mahout - Zeppelin work
>> > > > > > >>>>>
>> > > > > > >>>>> I somehow replied to your last email without seeing it...
>> > > > > > >>>>>
>> > > > > > >>>>> OK. I'll read through the examples and try to do something with some
>> > > > > > >>>>> data, then do a ggplot and/or an angular plot on it (probably ggplot).
>> > > > > > >>>>>
>> > > > > > >>>>> I'll do a quick tutorial. Then I'll reopen discussion on that Zeppelin
>> > > > > > >>>>> issue about whether we want to go ahead and add another interpreter.
>> > > > > > >>>>>
>> > > > > > >>>>> On Mon, May 16, 2016 at 4:26 PM, Trevor Grant <trevor.d.gr...@gmail.com> wrote:
>> > > > > > >>>>> sorry for double email but are you thinking visualization should be a
>> > > > > > >>>>> library internal to mahout or should we leverage zeppelin's
>> > > > > > >>>>> visualization capabilities?
>> > > > > > >>>>>
>> > > > > > >>>>> Also, should we move this discussion to dev?
>> > > > > > >>>>>
>> > > > > > >>>>> tg
>> > > > > > >>>>>
>> > > > > > >>>>> On Mon, May 16, 2016 at 4:14 PM, Andrew Palumbo <ap....@outlook.com> wrote:
>> > > > > > >>>>>
>> > > > > > >>>>> Sorry- to be a little more clear, part of what we're trying to do is to
>> > > > > > >>>>> get the new plotting features integrated with Zeppelin. We plan on
>> > > > > > >>>>> adding more advanced plotting.
>> > > > > > >>>>>
>> > > > > > >>>>> ________________________________
>> > > > > > >>>>> From: Andrew Palumbo <ap....@outlook.com>
>> > > > > > >>>>> Sent: Monday, May 16, 2016 5:04:49 PM
>> > > > > > >>>>> To: Pat Ferrel; Trevor Grant
>> > > > > > >>>>> Cc: Suneel Marthi; Dmitriy Lyubimov
>> > > > > > >>>>> Subject: Re: Intro - Future Mahout - Zeppelin work
>> > > > > > >>>>>
>> > > > > > >>>>> Awesome!
>> > > > > > >>>>>
>> > > > > > >>>>> most of the hard work was done by Dmitriy, I've just reworked it a
>> > > > > > >>>>> couple of times to keep up with spark's refactoring.
>> > > > > > >>>>>
>> > > > > > >>>>> I think that you will also need to include:
>> > > > > > >>>>>
>> > > > > > >>>>> mahout-spark_2.10-0.12.1-SNAPSHOT-dependency-reduced.jar
>> > > > > > >>>>>
>> > > > > > >>>>> For the new plotting features that we're working on.
>> > > > > > >>>>>
>> > > > > > >>>>> the plotting is still a work in progress, and the grid and surface
>> > > > > > >>>>> plots are not working properly. The plots are swing based and can
>> > > > > > >>>>> currently be exported as PNGs. There are a few examples on the closed
>> > > > > > >>>>> PR: https://github.com/apache/mahout/pull/230
>> > > > > > >>>>>
>> > > > > > >>>>> There is an example script in examples/bin/spark-shell-plot.mscala
>> > > > > > >>>>> (committed to master):
>> > > > > > >>>>> https://github.com/apache/mahout/blob/master/examples/bin/spark-shell-plot.mscala
>> > > > > > >>>>>
>> > > > > > >>>>> Thanks!
>> > > > > > >>>>>
>> > > > > > >>>>> ________________________________
>> > > > > > >>>>> From: Pat Ferrel <p...@occamsmachete.com>
>> > > > > > >>>>> Sent: Monday, May 16, 2016 4:54:15 PM
>> > > > > > >>>>> To: Trevor Grant
>> > > > > > >>>>> Cc: Andrew Palumbo; Suneel Marthi; Dmitriy Lyubimov
>> > > > > > >>>>> Subject: Re: Intro - Future Mahout - Zeppelin work
>> > > > > > >>>>>
>> > > > > > >>>>> This is only the beginning. Andy has been using Smile as a visualization lib since it is pretty rich in ML support. We are looking at integrating some of that with Zeppelin, then adding code to feed the new visualizations in Mahout. I’m here because I’m fairly familiar with AngularJS, if that’s the way to go. Smile is Swing based but can output PNGs, maybe other image formats, Andy?
>> > > > > > >>>>>
>> > > > > > >>>>> BTW Dmitriy is still very involved but has trouble getting permission to donate code.
>> > > > > > >>>>>
>> > > > > > >>>>> On May 16, 2016, at 1:45 PM, Trevor Grant <trevor.d.gr...@gmail.com> wrote:
>> > > > > > >>>>>
>> > > > > > >>>>> Hey Andrew,
>> > > > > > >>>>>
>> > > > > > >>>>> Thanks, you basically did all of the hard work for me!
>> > > > > > >>>>>
>> > > > > > >>>>> I've got the linear regression example working from:
>> > > > > > >>>>> http://mahout.apache.org/users/sparkbindings/play-with-shell.html
>> > > > > > >>>>>
>> > > > > > >>>>> My Java is sketchy at best, I tend to over-import.
>> > > > > > >>>>> I pulled in the following jars:
>> > > > > > >>>>>
>> > > > > > >>>>> org/apache/mahout/mahout-math/0.12.1-SNAPSHOT/mahout-math-0.12.1-SNAPSHOT.jar
>> > > > > > >>>>> org/apache/mahout/mahout-math-scala_2.10/0.12.1-SNAPSHOT/mahout-math-scala_2.10-0.12.1-SNAPSHOT.jar
>> > > > > > >>>>> org/apache/mahout/mahout-spark_2.10/0.12.1-SNAPSHOT/mahout-spark_2.10-0.12.1-SNAPSHOT.jar
>> > > > > > >>>>> org/apache/mahout/mahout-spark-shell_2.10/0.12.1-SNAPSHOT/mahout-spark-shell_2.10-0.12.1-SNAPSHOT.jar
>> > > > > > >>>>>
>> > > > > > >>>>> I think those are all necessary... should I be pulling in more?
>> > > > > > >>>>>
>> > > > > > >>>>> I hate to say it (but will do so bc this isn't public): this integration is super easy from a user perspective, almost too easy, e.g. why not let the user add it themselves...
>> > > > > > >>>>> Add the appropriate Maven artifacts, restart the interpreter, and run the following in a notebook:
>> > > > > > >>>>> ```
>> > > > > > >>>>> import org.apache.mahout.math._
>> > > > > > >>>>> import org.apache.mahout.math.scalabindings._
>> > > > > > >>>>> import org.apache.mahout.math.drm._
>> > > > > > >>>>> import org.apache.mahout.math.scalabindings.RLikeOps._
>> > > > > > >>>>> import org.apache.mahout.math.drm.RLikeDrmOps._
>> > > > > > >>>>> import org.apache.mahout.sparkbindings._
>> > > > > > >>>>>
>> > > > > > >>>>> implicit val sdc: org.apache.mahout.sparkbindings.SparkDistributedContext = sc2sdc(sc)
>> > > > > > >>>>> ```
>> > > > > > >>>>> Then whatever code you want and you're off to the races...
>> > > > > > >>>>>
>> > > > > > >>>>> That said, adding a build profile like -PsparkMahout and creating an interpreter like %spark.mahout should be fairly straightforward.
>> > > > > > >>>>>
>> > > > > > >>>>> Second question: do you have an example that would be more 'visualization friendly'? I could pass the results to Angular or R just to show off how to do it.
>> > > > > > >>>>>
>> > > > > > >>>>> Which leads back to the question: is this even worth building a full interpreter for, or should we just make a really nice blog post with examples on how to integrate with R...?
>> > > > > > >>>>>
>> > > > > > >>>>> Trevor Grant
>> > > > > > >>>>> Data Scientist
>> > > > > > >>>>> https://github.com/rawkintrevo
>> > > > > > >>>>> http://stackexchange.com/users/3002022/rawkintrevo
>> > > > > > >>>>> http://trevorgrant.org
>> > > > > > >>>>>
>> > > > > > >>>>> "Fortunate is he, who is able to know the causes of things." -Virgil
>> > > > > > >>>>>
>> > > > > > >>>>> On Mon, May 16, 2016 at 2:09 PM, Andrew Palumbo <ap....@outlook.com> wrote:
>> > > > > > >>>>> Hi Trevor, welcome!
>> > > > > > >>>>>
>> > > > > > >>>>> It's great to have you helping out, thanks very much. I've done a good amount of work on our Mahout Spark shell, so let me know if you have any questions about what we did there.
>> > > > > > >>>>>
>> > > > > > >>>>> Thanks a lot!
>> > > > > > >>>>>
>> > > > > > >>>>> Andy
>> > > > > > >>>>>
>> > > > > > >>>>> -------- Original message --------
>> > > > > > >>>>> From: Suneel Marthi <smar...@apache.org>
>> > > > > > >>>>> Date: 05/16/2016 2:44 PM (GMT-05:00)
>> > > > > > >>>>> To: Trevor Grant <trevor.d.gr...@gmail.com>
>> > > > > > >>>>> Cc: Suneel Marthi <smar...@apache.org>, Pat Ferrel <p...@occamsmachete.com>, Andrew Palumbo <ap....@outlook.com>
>> > > > > > >>>>> Subject: Re: Intro - Future Mahout - Zeppelin work
>> > > > > > >>>>>
>> > > > > > >>>>> Oh yes, he's around. I see him online.
>> > > > > > >>>>>
>> > > > > > >>>>> On Mon, May 16, 2016 at 2:42 PM, Trevor Grant <trevor.d.gr...@gmail.com> wrote:
>> > > > > > >>>>> Is Dmitriy Lyubimov still around?
>> > > > > > >>>>>
>> > > > > > >>>>> Looks like he created this issue for Zeppelin a while ago. (The old lost code to which you were referring?)
>> > > > > > >>>>>
>> > > > > > >>>>> https://issues.apache.org/jira/browse/ZEPPELIN-116
>> > > > > > >>>>>
>> > > > > > >>>>> tg
>> > > > > > >>>>>
>> > > > > > >>>>> Trevor Grant
>> > > > > > >>>>> Data Scientist
>> > > > > > >>>>> https://github.com/rawkintrevo
>> > > > > > >>>>> http://stackexchange.com/users/3002022/rawkintrevo
>> > > > > > >>>>> http://trevorgrant.org
>> > > > > > >>>>>
>> > > > > > >>>>> "Fortunate is he, who is able to know the causes of things."
>> > > > > > >>>>> -Virgil
>> > > > > > >>>>>
>> > > > > > >>>>> On Mon, May 16, 2016 at 1:37 PM, Suneel Marthi <smar...@apache.org> wrote:
>> > > > > > >>>>> Welcome to the party TG !!
>> > > > > > >>>>>
>> > > > > > >>>>> On Mon, May 16, 2016 at 2:28 PM, Trevor Grant <trevor.d.gr...@gmail.com> wrote:
>> > > > > > >>>>> Hey all,
>> > > > > > >>>>>
>> > > > > > >>>>> I'm excited for a chance to help out. I'm actually getting ready to download now and start playing around.
>> > > > > > >>>>>
>> > > > > > >>>>> I had talked about this briefly, but given a properly functioning Zeppelin interpreter for Apache Mahout, one could leverage all of the Zeppelin visualizations, anything in AngularJS, or anything in R (through clever use of Zeppelin's Resource Pools).
>> > > > > > >>>>>
>> > > > > > >>>>> I'll work on getting logged in to the Slack channel as well.
>> > > > > > >>>>>
>> > > > > > >>>>> Nice to meet you all, looking forward to helping out!
>> > > > > > >>>>>
>> > > > > > >>>>> tg
>> > > > > > >>>>>
>> > > > > > >>>>> Trevor Grant
>> > > > > > >>>>> Data Scientist
>> > > > > > >>>>> https://github.com/rawkintrevo
>> > > > > > >>>>> http://stackexchange.com/users/3002022/rawkintrevo
>> > > > > > >>>>> http://trevorgrant.org
>> > > > > > >>>>>
>> > > > > > >>>>> "Fortunate is he, who is able to know the causes of things." -Virgil
>> > > > > > >>>>>
>> > > > > > >>>>> On Sun, May 15, 2016 at 12:56 PM, Suneel Marthi <smar...@apache.org> wrote:
>> > > > > > >>>>> FYI...
>> > > > > > >>>>> Trevor was there for my talk, so he has some idea of Mahout Samsara.
>> > > > > > >>>>>
>> > > > > > >>>>> On Sun, May 15, 2016 at 1:51 PM, Pat Ferrel <p...@occamsmachete.com> wrote:
>> > > > > > >>>>> Hey Trevor,
>> > > > > > >>>>>
>> > > > > > >>>>> Good to meet you. As you probably know, Mahout-Samsara is a reincarnation of the project in a new body, which is less a collection of algorithms than a roll-your-own math/algorithm tool. The major benefit is that during experimentation and later in production the code is by nature scalable on Spark and Flink. Most of the Mahout DSL is R-like and supports tensor math, but we are now looking at streaming online algo support too.
>> > > > > > >>>>>
>> > > > > > >>>>> In any case, you probably know we have a Mahout version of the Spark Shell, which has been integrated with an old version of Zeppelin (code is lost). Recently Andy has experimented with some very nice visualizations of ML data (not just analytics data). We as a project are interested in Zeppelin integration of our shell and graphics. From what I understand, the graphics extension mechanism of Zeppelin is based on AngularJS, which I have some experience with.
>> > > > > > >>>>>
>> > > > > > >>>>> So, we’d like to start the conversation about how to proceed.
>> > > > > > >>>>> We would love some help but will move ahead in any case.
>> > > > > > >>>>>
>> > > > > > >>>>> Pat
>> > > > > > >>>>>
>> > > > > > >>>>> On May 15, 2016, at 9:52 AM, Suneel Marthi <smar...@apache.org> wrote:
>> > > > > > >>>>>
>> > > > > > >>>>> Hi Trevor,
>> > > > > > >>>>>
>> > > > > > >>>>> Nice meeting u last week in Vancouver. Per our conversation, I wanted to introduce u to Andrew Palumbo (Mahout Chair) and Pat Ferrel (Mahout PMC). As I mentioned in my talk, we are actively looking at Zeppelin integration with Mahout (primarily for Spark) and would appreciate your help (as also with all things DL and ML).
>> > > > > > >>>>>
>> > > > > > >>>>> We definitely can use all your help as we r revamping the Mahout project and shedding its legacy MapReduce image.
>> > > > > > >>>>>
>> > > > > > >>>>> I sent u an invite to the Mahout Slack channel, mahout.apache.org - that's where we all hang out and don't have to worry about avoiding naughty words.
>> > > > > > >>>>>
>> > > > > > >>>>> Looking forward to working with you
>> > > > > > >>>>>
>> > > > > > >>>>> Suneel
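
For reference, here is a rough sketch of the kind of "whatever code you want" that could follow the import block quoted earlier in this thread. It is modeled on the linear regression walkthrough from the play-with-shell page linked above, not taken from anyone's notebook. The data values and the names `data`, `drmData`, `drmX`, `y`, `XtX`, `Xty` and `beta` are made up for illustration, and it assumes the Mahout jars are on the interpreter classpath and that the imports and the implicit `sdc` have already been evaluated in the same session.

```scala
// Assumes the Mahout imports and `implicit val sdc = sc2sdc(sc)` from the
// quoted notebook paragraph have already run in this interpreter session.

// Toy in-core matrix (made-up values): columns 0-1 are features,
// column 2 is the target, constructed as 2 * x1 + 3 * x2.
val data = dense(
  (1.0, 2.0, 8.0),
  (2.0, 1.0, 7.0),
  (3.0, 4.0, 18.0),
  (4.0, 3.0, 17.0))

// Turn it into a distributed row matrix (DRM); this picks up the implicit context.
val drmData = drmParallelize(data, numPartitions = 2)

// Split features (distributed) from the target (collected in-core as a vector).
val drmX = drmData(::, 0 until 2)
val y = drmData.collect(::, 2)

// Normal equations: compute X'X and X'y distributed, then collect the small results.
val XtX = (drmX.t %*% drmX).collect
val Xty = (drmX.t %*% y).collect(::, 0)

// Solve (X'X) beta = X'y in-core for the regression coefficients.
val beta = solve(XtX, Xty)
```

If the imports fail with "object mahout is not a member of package org.apache", as in the error at the top of this thread, none of this will compile either, so that is worth sorting out first.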