Re: [ANNOUNCE] Andrew Musselman, New Mahout PMC Chair

2018-07-19 Thread Isabel Drost-Fromm




On 19/07/18 10:24, Sebastian Schelter wrote:

> Congrats!


+1

Looking forward to hearing Andrew's voice on one of the upcoming board 
calls - please do feel invited to join as a new PMC chair.



Isabel


Re: New logo

2017-05-06 Thread Isabel Drost-Fromm
The green logo was the very first design iteration before, IIRC, Robin came up 
with the yellow one. There should be about five T-shirts worldwide with the old 
logo, printed in 2009.


On May 1, 2017 at 20:41:43 CEST, Trevor Grant wrote:
>Thanks Scott,
>
>You are correct- in fact we're going even further now, that you can do
>native optimization regardless of the architecture with native-solvers.
>
>Do you or anyone more familiar with the history of the website know
>anything about the origins/uses of this:
>https://mahout.apache.org/images/Mahout-logo-245x300.png
>It seems to be a green mahout logo.
>
>Also Scott, or anyone lurking who may be able to help: as part of the
>website reboot I've included a "history" page and would really
>appreciate some help capturing that from first-person sources if
>possible. I've put in some headers but those are only directional:
>
>https://github.com/rawkintrevo/mahout/blob/website/website/front/community/history.md
>
>
>
>Trevor Grant
>Data Scientist
>https://github.com/rawkintrevo
>http://stackexchange.com/users/3002022/rawkintrevo
>http://trevorgrant.org
>
>*"Fortunate is he, who is able to know the causes of things."  -Virgil*
>
>
>On Mon, May 1, 2017 at 11:18 AM, scott cote 
>wrote:
>
>> Trevor et al:
>>
>> Some ideas to spur you on (and related points):
>>
>> Mahout is no longer a grab bag of algorithms and routines, but a math
>> language, right?  You don’t care about the under-the-cover
>> implementation. Today it’s Spark with alternative implementations in
>> Flink, etc.
>>
>> Don’t know if that is still the long-term goal - haven’t kept up - but
>> it seems like you are insulating yourself from the underlying
>> technology.
>>
>> Math is a universal language.  Right?
>>
>> Tower of Babel is coming to mind ….
>>
>> SCott
>>
>> > On Apr 27, 2017, at 10:27 PM, Trevor Grant
>
>> wrote:
>> >
>> > It also bugs me when I can't suggest any alternatives, yet don't like
>> > the ones in front of me...
>> >
>> > I became aware of a symbol a week or so ago, and it keeps coming back
>> > to me.
>> >
>> > The Enso.
>> > https://en.wikipedia.org/wiki/Ens%C5%8D
>> >
>> > Things I like about it:
>> > (all from Wikipedia, since the only thing I knew about this symbol
>> > prior is that someone I met had a tattoo of it).
>> > It represents (among a few other things) enlightenment.
>> > ^^ This resonated with the 'alternate definition of mahout' from
>> > Hebrew, which may be something akin to essence or truth.
>> >
>> > It is a circle- which plays to the Samsara theme.
>> >
>> > It is very expressive, a simple one or two brush stroke circle which
>> > symbolizes several large concepts and things about the creator,
>> > expressive like our DSL (I feel gross comparing such a symbol to a
>> > Scala DSL, but I'm spitballing here, please forgive me - I am not so
>> > expressive).
>> >
>> > "Once the *ensō* is drawn, one does not change it. It evidences the
>> > character of its creator and the context of its creation in a brief,
>> > contiguous period of time." Which reminds me of the DRMs.
>> >
>> > In closed form it represents something akin to Plato's perfection,
>> > which a little more wiki surfing tells me is the idea that no one can
>> > create a perfect circle because a circle is a collection of infinite
>> > points and how could you ever be sure that you have arranged each one
>> > properly, yet such things must exist, or what blueprint would a
>> > creator of circles be striving for.  This, by the by, reminds me of
>> > stochastic approaches to solving problems, and really statistics /
>> > "machine learning" in general, in that we can't find perfect
>> > solutions, yet we believe solutions exist and serve as our blueprint.
>> >
>> > Finally, I like that it is simple.
>> >
>> > Things I don't like about it:
>> > Lucent Technologies used it back in the 90s, however they used a very
>> > specific red one, and this isn't a deal breaker for me.
>> >
>> > Other thoughts:
>> > Based on the tattoo I saw, one could make an Enso using the old Mahout
>> > color palette if one were to dab their brush in the appropriate
>> > colors. This could also be represented in any single color. (Not sure
>> > what that does to our TM, is it ok if we just keep slapping TMs on the
>> > side of it? If that is the case is there any reason we must have a
>> > single Enso?)
>> >
>> > So there is something to throw in the pot that is a little more grown
>> > up than my runner-up favorites (honey badger, blueman riding bomb
>> > waving cowboy hat, blueman riding lightning bolt into a squirrel
>> > covered in water, etc).
>> >
>> > Again, I only know what wiki has told me, so please speak up if anyone
>> > is more familiar with this symbol (like was it used as a logo by some
>> > horrible dictator who carried out terrible atrocities?) or just has
>> > general comments.
>> > tg
>> >
>> >
>> >
>> > Trevor Grant
>> > Data Scientist
>> > https://github.com/rawkintrevo
>> > http://stackexchange.com/user

RE: Welcome our GSoC Student Aditya Sarma

2017-05-05 Thread Isabel Drost-Fromm
Hi Aditya,

Welcome. Great to have you here. 

Isabel

-- 
This message was sent from my Android phone with K-9 Mail.

Re: Marketing

2017-03-29 Thread Isabel Drost-Fromm
One more thing: what was really helpful in spreading the word in the early days 
was collecting real user stories: who achieved what with Mahout. Could be 
helpful for the new multi backend version as well. Imagine quotes like "we've 
successfully used Mahout on $insertBackendHere to solve 
$insertSuperDuperCoolUsecaseHere in no time" says $name, CTO of $hotNewStartup 
in an article about the project.

Warning: this is tedious work. It involves monitoring Twitter, having a Google 
alert for the name and talking to any number of people over long periods of 
time to nudge them to go public with their potentially confidential story.


On March 30, 2017 at 01:03:31 CEST, Isabel Drost-Fromm wrote:
>That is an awesome second interpretation.
>
>Having voted on the original name I'm 100% biased so take my opinion
>with a huge grain of salt: on the one hand I think name changes are
>overrated (anyone remember Ethereal?), on the other hand IMHO Mahout
>is a fairly strong brand representing machine learning at scale.
>
>Maybe a combination of any of a new logo, design, documentation,
>release that drops the zero in "0.x.y", a press release for that
>release that Sally can help you with, a new front page that publishes
>the new focus of development, maybe a few snippets on that shift in
>focus that editors can use, dropping deprecated code would already go a
>long way... Just some random ideas.
>
>Isabel
>
>
>On March 25, 2017 at 03:21:50 CET, Ted Dunning wrote:
>>On Fri, Mar 24, 2017 at 8:27 AM, Pat Ferrel 
>>wrote:
>>
>>> maybe we should drop the name Mahout altogether.
>>
>>
>>I have been told that there is a cool secondary interpretation of
>>Mahout as well.
>>
>>I think that the Hebrew word is pronounced roughly like Mahout.
>>
>>מַהוּת
>>
>>The cool thing is that this word means "essence" or possibly "truth".
>>So regardless of the guy riding the elephant, Mahout still has
>>something to be said for it.
>>
>>(I have no Hebrew, btw)
>>(real speakers may want to comment here)
>


Re: Marketing

2017-03-29 Thread Isabel Drost-Fromm
That is an awesome second interpretation.

Having voted on the original name I'm 100% biased so take my opinion with a 
huge grain of salt: on the one hand I think name changes are overrated (anyone 
remember Ethereal?), on the other hand IMHO Mahout is a fairly strong brand 
representing machine learning at scale.

Maybe a combination of any of a new logo, design, documentation, release that 
drops the zero in "0.x.y", a press release for that release that Sally can help 
you with, a new front page that publishes the new focus of development, maybe a 
few snippets on that shift in focus that editors can use, dropping deprecated 
code would already go a long way... Just some random ideas.

Isabel


On March 25, 2017 at 03:21:50 CET, Ted Dunning wrote:
>On Fri, Mar 24, 2017 at 8:27 AM, Pat Ferrel 
>wrote:
>
>> maybe we should drop the name Mahout altogether.
>
>
>I have been told that there is a cool secondary interpretation of
>Mahout as well.
>
>I think that the Hebrew word is pronounced roughly like Mahout.
>
>מַהוּת
>
>The cool thing is that this word means "essence" or possibly "truth".
>So regardless of the guy riding the elephant, Mahout still has
>something to be said for it.
>
>(I have no Hebrew, btw)
>(real speakers may want to comment here)


Re: Mahout ML vs Spark MLlib vs Mahout-Spark integration

2017-01-31 Thread Isabel Drost-Fromm

Hi,

On Fri, Sep 16, 2016 at 11:36:03PM -0700, Andrew Musselman wrote:
> and we're thinking about just how many pre-built algorithms we
> should include in the library versus working on performance behind the
> scenes.

To pick this question up: I've been watching Mahout from a distance for quite
some time. From what limited background I have of Samsara, I really like its
approach of being able to run on more than one execution engine.

To give some advice to downstream users in the field - what would be your advice
for people tasked with concrete use cases (stuff like fraud detection, anomaly
detection, learning search ranking functions, building a recommender system)? Is
that something that can still be done with Mahout? What would it take to get
from raw data to finished system? Is there something we can do to help users get
that accomplished? Is there even interest from users in such a use case based
perspective? If so, would there be interest among the Mahout committers to help
users publicly create docs/examples/modules to support these use cases?


Isabel



Re: Welcome Trevor Grant as a new Mahout Committer

2016-05-24 Thread Isabel Drost-Fromm
On Mon, May 23, 2016 at 08:39:01PM -0400, Andrew Palumbo wrote:
> In recognition of Trevor Grant's contributions to the Mahout project
> notably his Zeppelin Integration work, the PMC has invited and is pleased
> to announce that he has accepted our invitation to join the Mahout project
> as a committer.

Welcome Trevor - great to have you.


Isabel



Berlin Buzzwords 2014: CfP is open

2014-01-23 Thread Isabel Drost-Fromm
I'm super happy to announce that the call for submissions for Berlin
Buzzwords 2014 is open. For those who don't know the conference - in
my "absolutely objective opinion" the event is the most exciting
conference on storing, processing and searching large amounts of
digital data for engineers.

The 5th edition of Berlin Buzzwords will take place on May 25-28,
2014 at Kulturbrauerei Berlin.

Berlin Buzzwords is looking for speakers who submit talks on the
following topics:

* Information Retrieval / Search, e.g. Lucene, Solr, Katta, ElasticSearch or
comparable solutions

* NoSQL and SQL, e.g. CouchDB, MongoDB, Jackrabbit, HBase and others

* Large Data Processing, e.g. Hadoop itself, MapReduce, Cascading, Pig,
Spark and friends

Closely related topics not explicitly listed above are welcome as well.

The Call for Submissions will be open until February 9! Be part of
Berlin Buzzwords and submit your session idea. Please register here:
.

Looking forward to lots of interesting proposals - and looking forward to
meeting all of you in Berlin later this year (did I mention that Berlin
rocks in summer?)


Isabel

PS: As always, any help with spreading the word is highly welcome.

PS2: One final hint - even though speakers of course get a complimentary
conference pass, make sure to still check out our ticket page, in
particular if you'd like to bring your children to the conference - we
do provide child day care on a donation basis but need your registration
for capacity planning: http://berlinbuzzwords.de/tickets



Re: Mahout fpg

2013-11-29 Thread Isabel Drost-Fromm
On Fri, 22 Nov 2013 17:55:13 +0800
Jason Lee  wrote:

> I noticed lots of algorithms implementations has deprecated in Mahout
> 0.8 and removed in 0.9,  but no reasons or comments been marked. Can
> i ask why?

As Suneel mentioned earlier: Before removing these algorithms we asked
on the user list for input on what users really needed.

If you need anything that was marked deprecated you are welcome to step
up and provide patches and improvements to revive implementations that
are currently in danger of being deleted soon.


> Btw, Mahout API is a little lack javadoc comments, every contributors
> of Mahout should has the responsibility to add more javadoc comments
> to the java file they created.

Not an excuse but maybe a step forward: If you find classes and
packages lacking documentation that you know well (or are in the
process of getting to know well) we'd be grateful if you could provide
the missing documentation as a patch to the code base*. 


Isabel

* Also in my experience documentation patches tend to be easier to get
  approval for from your employer than donating whole new
  implementations that you have developed internally...


Re: java.lang.NoClassDefFoundError: com/google/common/base/Preconditions

2013-11-29 Thread Isabel Drost-Fromm
On Thu, 28 Nov 2013 13:24:26 +0530
Tharindu Rusira  wrote:

> Yes that's the exact issue Suneel, it was a careless mistake while
> adding projects to Eclipse that I missed those .jars.

When changing Mahout code, make sure to either run

mvn eclipse:eclipse

before importing the project into your workspace or enable Maven support
in Eclipse.

When integrating Mahout into your project it's best to use Maven, Ivy,
Gradle or some other build system that supports resolving transitive
dependencies automatically to avoid these issues.
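
If you are unsure whether a transitive dependency actually made it onto
your classpath, a small check like the following can help narrow things
down. This is only a sketch: the class ClasspathCheck and its method
isOnClasspath are made-up names for illustration and not part of Mahout
or Guava. It simply asks the class loader for the class named in the
error message.

```java
// Hypothetical helper, not part of Mahout: reports whether a class can be
// located on the current classpath.
class ClasspathCheck {

    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its
            // static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Missing class, or a class whose own dependencies are missing.
            return false;
        }
    }

    public static void main(String[] args) {
        // The class from the NoClassDefFoundError in this thread (lives in Guava).
        String guava = "com.google.common.base.Preconditions";
        System.out.println(guava + " on classpath: " + isOnClasspath(guava));
    }
}
```

Run it with the same classpath your project uses; if it reports false, the
Guava jar is missing from that classpath.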


Isabel


Re: Could OpenNLP use Mahout for classification?

2013-04-10 Thread Isabel Drost-Fromm

Hi Jörn,


On Tuesday, April 09, 2013 10:12:47 PM Jörn Kottmann wrote:
> Logistic Regression (is that similar to our maxent ?)
> Online Passive Aggressive
> HMM

> The datasets we are training OpenNLP are usually rather small and can
> easily be processed with a single CPU, does Mahout support training on
> small scale datasets as well?

In particular the Logistic Regression and HMM stuff should be well suited 
even for smaller data sets. You can find the JavaDoc for each here:



and here:



Both have versions that can run standalone on a single box - they may come 
with Hadoop as a dependency, mainly for serializing vectors and matrices to 
disk but not for computation distribution.
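
To make the "small data on a single box" point a bit more concrete, here is
a minimal sketch of online logistic regression trained with stochastic
gradient descent in plain Java. Please note that this is not Mahout's
implementation or API: the class TinyLogisticRegression, its methods, the
toy data and the learning rate are all assumptions made up for this
illustration.

```java
// Illustrative only: plain-Java online logistic regression trained with SGD.
// NOT Mahout's implementation or API; all names here are hypothetical.
class TinyLogisticRegression {
    private final double[] weights;      // last slot holds the bias term
    private final double learningRate;

    TinyLogisticRegression(int numFeatures, double learningRate) {
        this.weights = new double[numFeatures + 1];
        this.learningRate = learningRate;
    }

    // Probability that x belongs to class 1.
    double predict(double[] x) {
        double z = weights[weights.length - 1];
        for (int i = 0; i < x.length; i++) {
            z += weights[i] * x[i];
        }
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // One SGD step on a single example; label must be 0 or 1.
    void train(int label, double[] x) {
        double error = label - predict(x);
        for (int i = 0; i < x.length; i++) {
            weights[i] += learningRate * error * x[i];
        }
        weights[weights.length - 1] += learningRate * error;
    }

    public static void main(String[] args) {
        // Toy data set that easily fits in memory: class 1 if x is "large".
        double[][] xs = {{0.0}, {1.0}, {2.0}, {3.0}};
        int[] ys = {0, 0, 1, 1};

        TinyLogisticRegression model = new TinyLogisticRegression(1, 0.5);
        for (int epoch = 0; epoch < 2000; epoch++) {
            for (int i = 0; i < xs.length; i++) {
                model.train(ys[i], xs[i]);
            }
        }
        System.out.println("p(class 1 | x=0) = " + model.predict(new double[] {0.0}));
        System.out.println("p(class 1 | x=3) = " + model.predict(new double[] {3.0}));
    }
}
```

Because each example is processed one at a time, nothing here needs a
cluster; the whole training loop runs in a single JVM.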


Isabel