Re: Is any more detailed documentation about the sgd logistic regression example.

2011-08-05 Thread Xiaobo Gu
Hi Stanley, Can you help with this: You might encode the feature to vector and serialize them to the file system by MapReduce to reduce cost on data parsing. And I have started a new thread on http://mail-archives.apache.org/mod_mbox/mahout-dev/201108.mbox/%3cCACOCgckzcAm4V8y3CQhnBWtUy9jVgAbK

2011-08-05 Thread Ted Dunning
For small problems, you can even retain the training data in memory for maximum speed.
On Fri, Aug 5, 2011 at 9:59 PM, Xiaobo Gu wrote:
> Hi Stanley,
> Can you help with this:
>
> You might encode the feature to vector and serialize them to the file system by MapReduce to reduce cost on
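The advice in this message (encode the features once, serialize the vectors, and keep them in memory for small problems) might look roughly like the sketch below. It is a minimal illustration in plain Java, not Mahout's own vector serialization: the class name EncodedVectorCache, the methods toBytes and fromBytes, and the byte layout (count, dimension, raw doubles) are all assumptions made for this example. The byte array stands in for a file; the decoded list is what you would keep in memory and pass over repeatedly during training.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Mahout: caches already-encoded feature vectors
// so that CSV parsing and hashing happen once instead of on every training pass.
public class EncodedVectorCache {

    /** Serializes equal-length vectors as: count, dimension, then the raw doubles. */
    public static byte[] toBytes(List<double[]> vectors, int numFeatures) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(vectors.size());
            out.writeInt(numFeatures);
            for (double[] vector : vectors) {
                if (vector.length != numFeatures) {
                    throw new IllegalArgumentException("vector length must equal numFeatures");
                }
                for (double x : vector) {
                    out.writeDouble(x);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Reads the vectors back into memory; no text parsing is involved. */
    public static List<double[]> fromBytes(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            int count = in.readInt();
            int numFeatures = in.readInt();
            List<double[]> vectors = new ArrayList<>(count);
            for (int i = 0; i < count; i++) {
                double[] vector = new double[numFeatures];
                for (int j = 0; j < numFeatures; j++) {
                    vector[j] = in.readDouble();
                }
                vectors.add(vector);
            }
            return vectors;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<double[]> encoded = new ArrayList<>();
        encoded.add(new double[] {1.0, 0.0, 1.0});
        encoded.add(new double[] {0.0, 1.0, 0.0});
        byte[] bytes = toBytes(encoded, 3);           // write once (a file in practice)
        List<double[]> inMemory = fromBytes(bytes);   // load once, reuse for every pass
        System.out.println("vectors loaded: " + inMemory.size());
    }
}
```

A dense double[] keeps the sketch short; with hashed features most entries are zero, so a sparse representation would normally be a better fit.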

2011-04-12 Thread Ted Dunning
Can you be more specific about what you have and what you want? The book Mahout in Action provides quite a lot of details with sample code for a server farm. The TrainNewsGroups example provides code that you can copy. Do you have these resources? Do you want more? Did you want more theory? O

2011-04-12 Thread Ted Dunning
This lecture might help some: http://www.meetup.com/LA-HUG/pages/Video_from_March_16th_LA-HUG_Ted_Dunning_Mahout
On Tue, Apr 12, 2011 at 10:02 AM, Ted Dunning wrote:
> Can you be more specific about what you have and what you want?
>
> The book Mahout in Action provides quite a lot of details wi

2011-04-12 Thread Xiaobo Gu
On Wed, Apr 13, 2011 at 1:03 AM, Ted Dunning wrote:
> This lecture might help some:
> http://www.meetup.com/LA-HUG/pages/Video_from_March_16th_LA-HUG_Ted_Dunning_Mahout
Thanks, but I can't access the URL.
> On Tue, Apr 12, 2011 at 10:02 AM, Ted Dunning wrote:
>>
>> Can you be more specific abou

2011-04-12 Thread Ted Dunning
Pity. Don't think I can help. Talk to your internet provider.
On Tue, Apr 12, 2011 at 7:28 PM, Xiaobo Gu wrote:
> On Wed, Apr 13, 2011 at 1:03 AM, Ted Dunning wrote:
> > This lecture might help some:
> > http://www.meetup.com/LA-HUG/pages/Video_from_March_16th_LA-HUG_Ted_Dunning_Mahout
>

2011-04-12 Thread Eric Charles
Hi Ted, Video and PDF are accessible from here. Very instructive. Tks! You were talking about the 'Mahout in Action' book. I suppose you were referring to the ebook version. Hard copies are not yet available as far as I can read on http://www.manning.com/owen/. Any idea of the shipping date? Tks, -

2011-04-12 Thread pcollins
Eric, the book is still being written, but you can buy the interim PDF version from the site. It seems quite complete (save for a few typos here and there). The publisher will email you with updates as the document chapters are being finalized. You also have the option of having the dead-tree editi

2011-04-12 Thread Chris Schilling
Yeah, it works for me. Nice lecture!
On Apr 12, 2011, at 9:03 PM, Ted Dunning wrote:
> Pity. Don't think I can help. Talk to your internet provider.
>
> On Tue, Apr 12, 2011 at 7:28 PM, Xiaobo Gu wrote:
>
>> On Wed, Apr 13, 2011 at 1:03 AM, Ted Dunning wrote:
>>> This lecture might help

2011-04-12 Thread Ted Dunning
Yes. That's the one. The hard copy should be out before long. The final passes by the production editors are happening now.
On Tue, Apr 12, 2011 at 9:19 PM, Eric Charles wrote:
> You were talking about 'Mahout in Action' book.
> I suppose you were referring about the EBook version.
> Hard copy

2011-04-12 Thread Ted Dunning
The book is all there. All that is happening now are tiny edits and final production formatting.
On Tue, Apr 12, 2011 at 9:22 PM, wrote:
> Eric the book is still being written, but you can buy the interim PDF
> version from the site. It seems quite complete (save for a few typos here
> and ther

2011-04-13 Thread Eric Charles
Can't wait for that :) Just bought PDF.
Tks,
- Eric
On 13/04/2011 06:57, Ted Dunning wrote:
Yes. That's the one. The hard copy should be out before long. The final passes by the production editors are happening now.
On Tue, Apr 12, 2011 at 9:19 PM, Eric Charles wrote:
You were talking about

2011-04-13 Thread Lance Norskog
Woohoo!
On Wed, Apr 13, 2011 at 7:15 AM, Eric Charles wrote:
> Can't wait for that :)
> Just bought PDF.
> Tks,
> - Eric
>
> On 13/04/2011 06:57, Ted Dunning wrote:
>>
>> Yes. That's the one.
>>
>> The hard copy should be out before long. The final passes by the production
>> editors are hap

2011-04-19 Thread XiaoboGu
Re: Is any more detailed documentation about the sgd logistic regression example.
Can you be more specific about what you have and what you want? The book Mahout in Action provides quite a lot of details with sample code for a server farm. The TrainNewsGroups example provides code that you can copy

2011-04-19 Thread Stanley Xu
> working horse of credit scoring in industry, I think it will make Mahout
> friends of more analysts if LR support is smooth.
>
> Regards,
>
> Xiaobo Gu
>
> From: Ted Dunning [mailto:ted.dunn...@gmail.com]
> Sent: Wednesday, April 13, 2011 1:02 AM
> To: user@mahout.apach

2011-04-21 Thread Ted Dunning
>> f credit scoring in industry, I think it will make Mahout friends of more
>> analysts if LR support is smooth.
>>
>> Regards,
>>
>> Xiaobo Gu
>>
>> From: Ted Dunning [mailto:ted.dunn...@gmail.com]
>> Sent: Wednesday, April 13, 2011 1:02 AM
>

2011-04-26 Thread Xiaobo Gu
>>> on is the working horse of credit scoring in industry, I think it will make
>>> Mahout friends of more analysts if LR support is smooth.
>>>
>>> Regards,
>>>
>>> Xiaobo Gu
>>>
>>> From: Ted Dunning [mailto:ted.dunn...@gma

2011-05-02 Thread Xiaobo Gu
>>> 3. How can interpret the results.
>>>
>>> Because Logistic Regression is the working horse of credit scoring in
>>> industry, I think it will make Mahout friends of more analysts if LR support
>>> is smooth.
>>>
>>> Regards,
>>>

2011-05-05 Thread Ted Dunning
On Thu, May 5, 2011 at 7:48 AM, Xiaobo Gu wrote:
> On Thu, May 5, 2011 at 10:40 PM, Stanley Xu wrote:
> > 1. You could use the command line to add shape as category features, it will
> > hash categoryname=value as the feature and set the value as 1.0, it is the
> > standard way to convert a
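The hashing step described in point 1 of the quoted message (hash categoryname=value to a feature and set its value to 1.0) might be sketched as follows. This is a plain-Java illustration of the idea only, not Mahout's encoder: the class name HashedCategoryEncoder, the methods slot and encode, the use of String.hashCode(), and the vector size of 16 are assumptions made for this example.

```java
// Hypothetical illustration of hashed categorical features; not Mahout code.
public class HashedCategoryEncoder {

    /** Maps the string "name=value" to a slot in [0, numFeatures). */
    public static int slot(String name, String value, int numFeatures) {
        String key = name + "=" + value;
        // String.hashCode() may be negative; floorMod keeps the slot in range.
        return Math.floorMod(key.hashCode(), numFeatures);
    }

    /** Encodes {name, value} pairs as a dense vector with 1.0 in each hashed slot. */
    public static double[] encode(String[][] fields, int numFeatures) {
        double[] vector = new double[numFeatures];
        for (String[] field : fields) {
            // Two different features can hash to the same slot; with a large
            // enough vector such collisions are rare and usually tolerated.
            vector[slot(field[0], field[1], numFeatures)] = 1.0;
        }
        return vector;
    }

    public static void main(String[] args) {
        String[][] row = {{"shape", "round"}, {"color", "red"}};
        double[] vector = encode(row, 16);
        for (int i = 0; i < vector.length; i++) {
            if (vector[i] != 0.0) {
                System.out.println("slot " + i + " = " + vector[i]);
            }
        }
    }
}
```

Because the slot depends only on the string name=value, no dictionary of category levels has to be built or stored before training starts.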

2011-05-06 Thread Xiaobo Gu
On Thu, May 5, 2011 at 11:21 PM, Ted Dunning wrote:
> On Thu, May 5, 2011 at 7:48 AM, Xiaobo Gu wrote:
>
>> On Thu, May 5, 2011 at 10:40 PM, Stanley Xu wrote:
>> > 1. You could use the command line to add shape as category features, it will
>> > hash categoryname=value as the feature and set

2011-05-07 Thread Ted Dunning
Huh? What program are you talking about?
On Fri, May 6, 2011 at 9:36 PM, Xiaobo Gu wrote:
> >> > 2. In production mode, don't use csv, you will find most of the time spent
> >> > are on parse the csv data and hash them to features. You might encode the
> >> > feature to vector and serial

2011-05-07 Thread Xiaobo Gu
trainlogistic and runlogistic
2011/5/7, Ted Dunning:
> Huh?
>
> What program are you talking about?
>
> On Fri, May 6, 2011 at 9:36 PM, Xiaobo Gu wrote:
>
>> >> > 2. In production mode, don't use csv, you will find most of the time spent
>> >> > are on parse the csv data and hash them to f

2011-05-07 Thread Ted Dunning
You can't do that directly. You can use the HTTP address of the file in HDFS.
Note also that trainlogistic and runlogistic are intended pretty much only for simple demonstration purposes. For large-scale data, you should use AdaptiveLogisticRegression.
2011/5/7 Xiaobo Gu
> trainlogistic and ru

2011-05-10 Thread XiaoboGu
> -Original Message-
> From: Ted Dunning [mailto:ted.dunn...@gmail.com]
> Sent: Sunday, May 08, 2011 4:23 AM
> To: user@mahout.apache.org
> Subject: Re: Is any more detailed documentation about the sgd logistic regression example.
>
> You can't do that d

2011-05-10 Thread Ted Dunning
Go for it. Produce a JIRA and a patch.
On Tue, May 10, 2011 at 8:19 AM, XiaoboGu wrote:
> Can you add a --algorithm option to the trainlogistic and runlogistic
> program, and other options need by specific algorithms, such as using L1 or
> L2 prior, then TL and RL will be production ready tool

2011-05-10 Thread XiaoboGu
> -Original Message-
> From: Ted Dunning [mailto:ted.dunn...@gmail.com]
> Sent: Thursday, May 05, 2011 11:22 PM
> To: user@mahout.apache.org
> Subject: Re: Is any more detailed documentation about the sgd logistic regression example.
>
> On Thu, May 5, 20

2011-05-10 Thread Ted Dunning
In the meantime, look at building your own command line tool for AdaptiveLogisticRegression.
On Tue, May 10, 2011 at 8:25 AM, Ted Dunning wrote:
> Go for it.
>
> Produce a JIRA and a patch.
>
> On Tue, May 10, 2011 at 8:19 AM, XiaoboGu wrote:
>
>> Can you add a --algorithm option to the train

2011-05-10 Thread Ted Dunning
Great idea. Why don't you implement something like what you need? Others will be happy to contribute improvements.
On Tue, May 10, 2011 at 8:26 AM, XiaoboGu wrote:
> > There isn't a good command line for this, largely because it is difficult to
> > describe how to convert each CSV field. Th

2011-05-17 Thread Xiaobo Gu
I have written a command-line program prototype for RunAdaptiveLogistic.
1. How can I make it invocable from mahout?
2. Can you help fine-tune how the AdaptiveLogisticRegression is created and configured so that the settings make sense?
On Tue, May 10, 2011 at 11:30 PM, Ted Dunning wrote:
> Great idea. Why d

2011-05-17 Thread Xiaobo Gu
On Sun, May 8, 2011 at 4:22 AM, Ted Dunning wrote:
> You can't do that directly.
>
> You can use the http address of the file in HDFS.
What's the HTTP URL for an example file named /data/gpwext/data.csv? Is it HTTP://namenode:8020/data/gpwext/data.csv?
>
> Note also that trainlogistic and runlogistic a

2011-05-17 Thread Ted Dunning
Please file a bug report at http://issues.apache.org/jira/browse/MAHOUT
Attach a diff file with the extension .patch. Create the diff at the mahout top directory.
On Tue, May 17, 2011 at 8:49 PM, Xiaobo Gu wrote:
> I have write a command line program proto type for RunAdaptiveLogistic,
> 1. Ho

2011-05-19 Thread Xiaobo Gu
Hi Ted,
Are interval, averagingWindow, thread count, and prior function the only four tunable options of AdaptiveLogisticRegression?
Regards,
Xiaobo Gu
On Tue, May 10, 2011 at 11:26 PM, Ted Dunning wrote:
> In the meantime, look at building your own command line tool for
> AdaptiveLogisticReg

2011-05-20 Thread Ted Dunning
There are a few others as well. From the code, there are these:
public void setInterval(int interval)
public void setInterval(int minInterval, int maxInterval)
public void setPoolSize(int poolSize)
public void setThreadCount(int threadCount)
public void setAucEvaluator(OnlineAuc auc)
priva