Re: LDA for multi label classification was: Mahout Book

2009-10-16 Thread zhao zhendong
I have seen the implementation of L-LDA using Java, Stanford Topic Modeling Toolbox http://nlp.stanford.edu/software/tmt/ Does anyone know whether they provide the source code or not? Thanks, Maxim On Fri, Oct 16, 2009 at 12:39 PM, David Hall d...@cs.berkeley.edu wrote: Sorry, this slipped out

Re: LDA for multi label classification was: Mahout Book

2009-10-16 Thread David Hall
On Fri, Oct 16, 2009 at 4:08 AM, zhao zhendong zhaozhend...@gmail.com wrote: I have seen the implementation of L-LDA using Java, Stanford Topic Modeling Toolbox http://nlp.stanford.edu/software/tmt/ Does anyone know whether they provide the source code or not? I'm pretty sure it's Scala, no?

Re: LDA for multi label classification was: Mahout Book

2009-10-15 Thread David Hall
Sorry, this slipped out of my inbox and I just found it! On Thu, Oct 8, 2009 at 12:05 PM, Robin Anil robin.a...@gmail.com wrote: Posting to the dev list. Great paper, thanks! Looks like L-LDA could be used to create some interesting examples. Thanks! The paper shows L-LDA could be used to

LDA for multi label classification was: Mahout Book

2009-10-08 Thread Robin Anil
Posting to the dev list. Great paper, thanks! Looks like L-LDA could be used to create some interesting examples. The paper shows L-LDA could be used to create a word-tag model for accurate tag prediction given a document of words. I will finish reading it and report how much work is needed to

Re: Mahout book

2009-09-23 Thread Isabel Drost
On Tue, 22 Sep 2009 14:43:03 -0400 zaki rahaman zaki.raha...@gmail.com wrote: Sounds good, I'd love to take a look at an outline. I too would love to see a cookbook style manual which focuses more on the details of implementation, how to optimize systems, best practices, etc. and fills in

Mahout book

2009-09-22 Thread Sean Owen
As I mentioned to some of you, there's a proposal to begin work on a book on Mahout. It sounds early, but the publisher assures me it's about the right time to begin, if we want a book out at roughly the time '1.0' rolls out in a year or so. I've heard support for the idea, and think it's a good

Re: Mahout book

2009-09-22 Thread zaki rahaman
the only one who feels this way, but any Mahout book should have some basic introductory background material -- some discussion about machine learning (classification, clustering), high level overviews of algorithms, and maybe some case studies/examples (why use Mahout vs. other tools?). And of course

Re: Mahout book

2009-09-22 Thread Sean Owen
used to the ins and outs of using Mahout (I've made some hacks to source in my own environment and have done some testing, but nothing in production yet) but I'd love to help out on a book, maybe with some of the background material. Maybe I'm the only one who feels this way, but any Mahout book

Re: Mahout book

2009-09-22 Thread Isabel Drost
On Tuesday 22 September 2009 18:17:29 Sean Owen wrote: - Who else might be interested in being a co-author and putting in significant work? - Would anyone care to read the proposal before I send it in? - Would anyone help me, in the short term, draft an outline of the content of the

Re: Mahout book

2009-09-22 Thread Ted Dunning
I would amend that (again) to clustering, classification and recommendations at scale. With Hadoop where necessary. On Tue, Sep 22, 2009 at 9:48 AM, Sean Owen sro...@gmail.com wrote: I sense some consensus that Mahout v1.0 is primarily clustering, classification and recommendations at scale

Re: Mahout book

2009-09-22 Thread Grant Ingersoll
On Sep 22, 2009, at 12:59 PM, Ted Dunning wrote: I would amend that (again) to clustering, classification and recommendations at scale. With Hadoop where necessary. +1 On Tue, Sep 22, 2009 at 9:48 AM, Sean Owen sro...@gmail.com wrote: I sense some consensus that Mahout v1.0 is

Re: Mahout book

2009-09-22 Thread Sean Owen
The difference being, not emphasizing Hadoop? I understand that. I also recall we'd agreed that we were not realistically considering any other distributed processing framework in the near future, which I took to mean before v1.0? On Tue, Sep 22, 2009 at 11:59 AM, Ted Dunning

Re: Mahout book

2009-09-22 Thread Ted Dunning
The difference being that we focus on scalable. This might involve Hadoop for some, all or none of the steps. My definition of scalable is "handles data as big as nearly anybody produces". That may or may not require Hadoop to do. Many on-line learning systems are so fast that a single machine

Re: Mahout book

2009-09-22 Thread Tanton Gibbs
I hope I'm one of the targeted audience members for the book. I've used Hadoop, done clustering (not with Mahout), have read about collaborative filtering, and plan on using Mahout in a business intelligence setting in 1-2 years. However, I've never used Mahout itself. What I would like to see

Re: Mahout book

2009-09-22 Thread Robin Anil
I could help out with the internals of CBayes/Bayes, FPGrowth (if it becomes ready by then), and writeups or how-tos on improving efficiency on different datasets: how to understand your data and how to enable or disable various parameters of CBayes/Bayes to fit non-text data. Sparse database vs. dense

Re: Mahout book

2009-09-22 Thread Robin Anil
+1 for cookbook style. That's what I meant when I said tuning CBayes/Bayes (there are around 4-5 knobs which you can modify to fit your data perfectly). On Tue, Sep 22, 2009 at 11:15 PM, Tanton Gibbs tanton.gi...@gmail.com wrote: I hope I'm one of the targeted audience members for the book.

Re: Mahout book

2009-09-22 Thread Sean Owen
There is certainly no reason to make 'using Hadoop, and nothing else' a long-term goal. I think there are many reasons to focus on Hadoop in the short term. And I think this book is about the short term, Mahout v1.0. That is I don't disagree -- there's every reason to state the long-term goal of

Re: Mahout book

2009-09-22 Thread Sean Owen
That is indeed how I am positioning it in this draft book proposal -- it's for a 'Mahout in Action' book from Manning. They want to understand why this wouldn't be just another Collective Intelligence in Action (which I do think is quite a good book, at least, I learned a good deal about Lucene

Re: Mahout book

2009-09-22 Thread zaki rahaman
Sean, Sounds good, I'd love to take a look at an outline. I too would love to see a cookbook style manual which focuses more on the details of implementation, how to optimize systems, best practices, etc. and fills in with some of the theory material where appropriate/needed. It wouldn't hurt to

Re: Mahout book

2009-09-22 Thread Ted Dunning
I think that there is a real need for a more general Learning at Scale book, but I don't think that any of us here are really qualified to write it. On Tue, Sep 22, 2009 at 11:00 AM, Sean Owen sro...@gmail.com wrote: At least, I can't write that theoretical book, and at the moment, if there is

Re: Mahout book

2009-09-22 Thread Lukáš Vlček
Hello Sean, as a Mahout fan I can help with charts, diagrams or schema pictures if needed. Let's make this book look really good. Is it true that Manning is forcing authors to use MS Word? Still it should be possible to use PS, EPS or maybe PDF for vector graphics, correct? Anyway, I would love