For small problems, you can even retain the training data in memory for
maximum speed.
On Fri, Aug 5, 2011 at 9:59 PM, Xiaobo Gu wrote:
> Hi Stanley,
> Can you help with this:
>
> You might encode the
> features as vectors and serialize them to the file system with MapReduce to
> reduce cost on
Hi Stanley,
Can you help with this:
You might encode the
features as vectors and serialize them to the file system with MapReduce to
reduce the cost of data parsing.
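As a rough illustration of the "encode once, serialize, reuse" idea, here is a minimal single-process sketch in plain Java. It is not Mahout code and does not use MapReduce: the class name `EncodedVectorFile`, the binary layout (a count, then a length-prefixed list of doubles per vector), and the temp-file round trip are all assumptions made for the example. In a real job the vectors would come from your own feature encoder and would more likely be written to HDFS.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of Mahout): stores already-encoded vectors in a binary file. */
final class EncodedVectorFile {

    /** Writes the vector count, then each vector as its length followed by its values. */
    static void write(File file, List<double[]> vectors) {
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(new FileOutputStream(file)))) {
            out.writeInt(vectors.size());
            for (double[] v : vectors) {
                out.writeInt(v.length);
                for (double x : v) {
                    out.writeDouble(x);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads vectors back in the layout produced by write(); no CSV parsing is involved. */
    static List<double[]> read(File file) {
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(new FileInputStream(file)))) {
            int count = in.readInt();
            List<double[]> vectors = new ArrayList<>(count);
            for (int i = 0; i < count; i++) {
                double[] v = new double[in.readInt()];
                for (int j = 0; j < v.length; j++) {
                    v[j] = in.readDouble();
                }
                vectors.add(v);
            }
            return vectors;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Convenience for the demo: writes to a temporary file and reads it back. */
    static List<double[]> roundTrip(List<double[]> vectors) {
        try {
            File file = File.createTempFile("encoded-vectors", ".bin");
            file.deleteOnExit();
            write(file, vectors);
            return read(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for vectors produced by a feature encoder.
        List<double[]> encoded = Arrays.asList(
                new double[] {1.0, 0.0, 2.5},
                new double[] {0.0, 1.0, 0.0});
        for (double[] v : roundTrip(encoded)) {
            System.out.println(Arrays.toString(v));
        }
    }
}
```

The point of the design is that the expensive step (parsing text and hashing fields) runs once, and every later training pass only reads fixed-width numbers.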
And I have started a new thread on
http://mail-archives.apache.org/mod_mbox/mahout-dev/201108.mbox/%3cCACOCgckzcAm4V8y3CQhnBWtUy9jVgAbK
There are a few others as well.
From the code, there are these:
public void setInterval(int interval)
public void setInterval(int minInterval, int maxInterval)
public void setPoolSize(int poolSize)
public void setThreadCount(int threadCount)
public void setAucEvaluator(OnlineAuc auc)
priva
Hi Ted,
Are interval, averagingWindow, thread count, and prior function the
only four tunable options of AdaptiveLogisticRegression?
Regards,
Xiaobo Gu
On Tue, May 10, 2011 at 11:26 PM, Ted Dunning wrote:
> In the meantime, look at building your own command line tool for
> AdaptiveLogisticReg
Please file a bug report at http://issues.apache.org/jira/browse/MAHOUT
Attach a diff file with the extension .patch. Create the diff from the Mahout
top-level directory.
On Tue, May 17, 2011 at 8:49 PM, Xiaobo Gu wrote:
> I have written a command-line prototype program for RunAdaptiveLogistic.
> 1. Ho
On Sun, May 8, 2011 at 4:22 AM, Ted Dunning wrote:
> You can't do that directly.
>
> You can use the http address of the file in HDFS.
What's the HTTP URL for an example file named /data/gpwext/data.csv?
Is it HTTP://namenode:8020/data/gpwext/data.csv?
>
> Note also that trainlogistic and runlogistic a
I have written a command-line prototype program for RunAdaptiveLogistic.
1. How can I make it invocable from mahout?
2. Can you help fine-tune how the AdaptiveLogisticRegression is created
and configured so that its settings make sense?
On Tue, May 10, 2011 at 11:30 PM, Ted Dunning wrote:
> Great idea. Why d
Great idea. Why don't you implement something like what you need? Others
will be happy to contribute improvements.
On Tue, May 10, 2011 at 8:26 AM, XiaoboGu wrote:
> > There isn't a good command line for this, largely because it is difficult to
> > describe how to convert each CSV field. Th
In the meantime, look at building your own command line tool for
AdaptiveLogisticRegression.
On Tue, May 10, 2011 at 8:25 AM, Ted Dunning wrote:
> Go for it.
>
> Produce a JIRA and a patch.
>
>
> On Tue, May 10, 2011 at 8:19 AM, XiaoboGu wrote:
>
>> Can you add a --algorithm option to the train
> -Original Message-
> From: Ted Dunning [mailto:ted.dunn...@gmail.com]
> Sent: Thursday, May 05, 2011 11:22 PM
> To: user@mahout.apache.org
> Subject: Re: Is any more detailed documentation aout the sgd logistic
> regression example.
>
> On Thu, May 5, 20
Go for it.
Produce a JIRA and a patch.
On Tue, May 10, 2011 at 8:19 AM, XiaoboGu wrote:
> Can you add an --algorithm option to the trainlogistic and runlogistic
> programs, and other options needed by specific algorithms, such as using an L1 or
> L2 prior? Then TL and RL will be production-ready tools
> -Original Message-
> From: Ted Dunning [mailto:ted.dunn...@gmail.com]
> Sent: Sunday, May 08, 2011 4:23 AM
> To: user@mahout.apache.org
> Subject: Re: Is any more detailed documentation aout the sgd logistic
> regression example.
>
> You can't do that d
You can't do that directly.
You can use the http address of the file in HDFS.
Note also that trainlogistic and runlogistic are intended pretty much only
for simple demonstration purposes.
For large scale data, you should use AdaptiveLogisticRegression
2011/5/7 Xiaobo Gu
> trainlogistic and ru
trainlogistic and runlogistic
2011/5/7, Ted Dunning :
> Huh?
>
> What program are you talking about?
>
> On Fri, May 6, 2011 at 9:36 PM, Xiaobo Gu wrote:
>
>> >> > 2. In production mode, don't use CSV; you will find that most of the time
>> >> > is spent parsing the CSV data and hashing it to f
Huh?
What program are you talking about?
On Fri, May 6, 2011 at 9:36 PM, Xiaobo Gu wrote:
> >> > 2. In production mode, don't use CSV; you will find that most of the time
> >> > is spent parsing the CSV data and hashing it into features. You might encode the
> >> > feature to vector and serial
On Thu, May 5, 2011 at 11:21 PM, Ted Dunning wrote:
> On Thu, May 5, 2011 at 7:48 AM, Xiaobo Gu wrote:
>
>> On Thu, May 5, 2011 at 10:40 PM, Stanley Xu wrote:
>> > 1. You could use the command line to add shape as a categorical feature. It will
>> > hash categoryname=value as the feature and set
On Thu, May 5, 2011 at 7:48 AM, Xiaobo Gu wrote:
> On Thu, May 5, 2011 at 10:40 PM, Stanley Xu wrote:
> > 1. You could use the command line to add shape as a categorical feature. It will
> > hash categoryname=value as the feature and set the value to 1.0, which is the
> > standard way to convert a
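To make the "hash categoryname=value and set the value to 1.0" step concrete, here is a small self-contained Java sketch of feature hashing. It is an illustration only, not Mahout's encoder: the class `HashedCategoryEncoder`, its methods, and the choice of a 32-bit FNV-1a hash are assumptions for the example. It also adds 1.0 to the slot rather than overwriting it, so hash collisions accumulate; that is one possible design choice, not something the thread specifies.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not Mahout's API): hashes "name=value" into a fixed-size vector. */
final class HashedCategoryEncoder {
    private final int numFeatures;

    HashedCategoryEncoder(int numFeatures) {
        if (numFeatures <= 0) {
            throw new IllegalArgumentException("numFeatures must be positive");
        }
        this.numFeatures = numFeatures;
    }

    /** Maps the string "name=value" to a slot in [0, numFeatures) using 32-bit FNV-1a. */
    int indexFor(String name, String value) {
        byte[] bytes = (name + "=" + value).getBytes(StandardCharsets.UTF_8);
        int hash = 0x811c9dc5;              // FNV-1a offset basis
        for (byte b : bytes) {
            hash ^= (b & 0xff);
            hash *= 0x01000193;             // FNV prime; int overflow wraps as intended
        }
        return Math.floorMod(hash, numFeatures);
    }

    /** Adds 1.0 at the hashed slot; colliding features share a slot and accumulate. */
    void addToVector(String name, String value, double[] vector) {
        if (vector.length != numFeatures) {
            throw new IllegalArgumentException("vector length must equal numFeatures");
        }
        vector[indexFor(name, value)] += 1.0;
    }

    public static void main(String[] args) {
        HashedCategoryEncoder encoder = new HashedCategoryEncoder(16);
        double[] vector = new double[16];
        encoder.addToVector("shape", "round", vector);
        System.out.println("shape=round hashed to slot " + encoder.indexFor("shape", "round"));
    }
}
```

Because the slot depends only on the string, the same category value always lands in the same position, and no dictionary of category names has to be built or stored.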
t users of
>> >>> Mahout are data analysts who can't write Java code, so a command line is more
>> >>> convenient. Some specific questions are:
>> >>> 1. What format should we apply when preparing data for logistic
>>
>>> 3. How can I interpret the results?
>>>
>>> Because Logistic Regression is the workhorse of credit scoring in
>>> industry, I think it will make Mahout friends of more analysts if LR support
>>> is smooth.
>>>
>>> Regards,
>>>
on is the workhorse of credit scoring in
>>> industry, I think it will make Mahout friends of more analysts if LR support
>>> is smooth.
>>>
>>> Regards,
>>>
>>> Xiaobo Gu
>>>
>>> From: Ted Dunning [mailto:ted.dunn...@gma
f credit scoring in
>> industry, I think it will make Mahout friends of more analysts if LR support
>> is smooth.
>>
>> Regards,
>>
>> Xiaobo Gu
>>
>> From: Ted Dunning [mailto:ted.dunn...@gmail.com]
>> Sent: Wednesday, April 13, 2011 1:02 AM
>
workhorse of credit scoring in
> industry, I think it will make Mahout friends of more analysts if LR support
> is smooth.
>
> Regards,
>
> Xiaobo Gu
>
> From: Ted Dunning [mailto:ted.dunn...@gmail.com]
> Sent: Wednesday, April 13, 2011 1:02 AM
> To: user@mahout.apach
e: Is any more detailed documentation aout the sgd logistic
regression example.
Can you be more specific about what you have and what you want?
The book Mahout in Action provides quite a lot of details with sample code for
a server farm.
The TrainNewsGroups example provides code that you can copy
Woohoo!
On Wed, Apr 13, 2011 at 7:15 AM, Eric Charles
wrote:
> Can't wait for that :)
> Just bought PDF.
> Tks,
> - Eric
>
> On 13/04/2011 06:57, Ted Dunning wrote:
>>
>> Yes. That's the one.
>>
>> The hard copy should be out before long. The final passes by the
>> production
>> editors are hap
Can't wait for that :)
Just bought PDF.
Tks,
- Eric
On 13/04/2011 06:57, Ted Dunning wrote:
Yes. That's the one.
The hard copy should be out before long. The final passes by the production
editors are happening now.
On Tue, Apr 12, 2011 at 9:19 PM, Eric Charles wrote:
You were talking about
The book is all there. All that is happening now are tiny edits and final
production formatting.
On Tue, Apr 12, 2011 at 9:22 PM, wrote:
> Eric the book is still being written, but you can buy the interim PDF
> version from the site. It seems quite complete (save for a few typos here
> and ther
Yes. That's the one.
The hard copy should be out before long. The final passes by the production
editors are happening now.
On Tue, Apr 12, 2011 at 9:19 PM, Eric Charles wrote:
> You were talking about the 'Mahout in Action' book.
> I suppose you were referring to the e-book version.
> Hard copy
Yeah, it works for me. Nice lecture!
On Apr 12, 2011, at 9:03 PM, Ted Dunning wrote:
> Pity. Don't think I can help. Talk to your internet provider.
>
> On Tue, Apr 12, 2011 at 7:28 PM, Xiaobo Gu wrote:
>
>> On Wed, Apr 13, 2011 at 1:03 AM, Ted Dunning
>> wrote:
>>> This lecture might help
Eric the book is still being written, but you can buy the interim PDF
version from the site. It seems quite complete (save for a few typos here
and there).
The publisher will email you with updates as the document chapters are being
finalized. You also have the option of having the dead-tree editi
Hi Ted,
Video and PDF are accessible from here.
Very instructive. Tks!
You were talking about the 'Mahout in Action' book.
I suppose you were referring to the e-book version.
Hard copies are not yet available, as far as I can tell from
http://www.manning.com/owen/.
Any idea of the shipping date?
Tks,
-
Pity. Don't think I can help. Talk to your internet provider.
On Tue, Apr 12, 2011 at 7:28 PM, Xiaobo Gu wrote:
> On Wed, Apr 13, 2011 at 1:03 AM, Ted Dunning
> wrote:
> > This lecture might help
> > some:
> http://www.meetup.com/LA-HUG/pages/Video_from_March_16th_LA-HUG_Ted_Dunning_Mahout
>
On Wed, Apr 13, 2011 at 1:03 AM, Ted Dunning wrote:
> This lecture might help
> some: http://www.meetup.com/LA-HUG/pages/Video_from_March_16th_LA-HUG_Ted_Dunning_Mahout
Thanks, but I can't access the URL.
> On Tue, Apr 12, 2011 at 10:02 AM, Ted Dunning wrote:
>>
>> Can you be more specific abou
This lecture might help some:
http://www.meetup.com/LA-HUG/pages/Video_from_March_16th_LA-HUG_Ted_Dunning_Mahout
On Tue, Apr 12, 2011 at 10:02 AM, Ted Dunning wrote:
> Can you be more specific about what you have and what you want?
>
> The book Mahout in Action provides quite a lot of details wi
Can you be more specific about what you have and what you want?
The book Mahout in Action provides quite a lot of details with sample code
for a server farm.
The TrainNewsGroups example provides code that you can copy.
Do you have these resources? Do you want more? Did you want more theory?
O
Hi,
Documents about sgd logistic regression itself are welcome too.
Regards,
Xiaobo Gu