Re: SGD classifier demo app

2014-02-04 Thread Sebastian Schelter
Would be great to add this as an example to Mahout's codebase. On 02/04/2014 10:27 AM, Ted Dunning wrote: Frank, I just munched on your code and sent a pull request. In doing this, I made a bunch of changes. Hope you liked them. These include massive simplification of the reading and

Re: SGD classifier demo app

2014-02-04 Thread Ted Dunning
Yes. On Tue, Feb 4, 2014 at 1:31 AM, Sebastian Schelter s...@apache.org wrote: Would be great to add this as an example to Mahout's codebase. On 02/04/2014 10:27 AM, Ted Dunning wrote: Frank, I just munched on your code and sent a pull request. In doing this, I made a bunch of

Re: SGD classifier demo app

2014-02-04 Thread Frank Scholten
Thanks Ted! Would indeed be a nice example to add. On Tue, Feb 4, 2014 at 10:40 AM, Ted Dunning ted.dunn...@gmail.com wrote: Yes. On Tue, Feb 4, 2014 at 1:31 AM, Sebastian Schelter s...@apache.org wrote: Would be great to add this as an example to Mahout's codebase. On 02/04/2014

SGD classifier demo app

2014-02-03 Thread Frank Scholten
Hi all, I am exploring Mahout's SGD classifier and would like some feedback because I think I didn't configure things properly. I created an example app that trains an SGD classifier on the 'bank marketing' dataset from UCI: http://archive.ics.uci.edu/ml/datasets/Bank+Marketing My app is at:

Re: SGD classifier demo app

2014-02-03 Thread Johannes Schulte
Hi Frank, you are using the feature vector encoders, which hash a combination of feature name and feature value to 2 (default) locations in the vector. The vector size you configured is 11, and this is imo very small compared to the possible combinations of values you have for your data (education, marital,
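
To illustrate the point about vector size, here is a minimal Python sketch of hashed feature encoding with 2 probes per (name, value) pair. It is not Mahout's implementation: the MD5-based hash, the function names, and the sample feature values are all assumptions made for illustration. It only shows why a vector of size 11 forces distinct categorical values to share slots.

```python
import hashlib


def probe_locations(name, value, size, probes=2):
    """Return the vector indices one (name, value) pair is hashed to.

    Assumption: MD5 stands in for whatever hash the real encoder uses.
    """
    locations = []
    for probe in range(probes):
        key = f"{name}:{value}:{probe}".encode("utf-8")
        digest = hashlib.md5(key).digest()
        locations.append(int.from_bytes(digest[:8], "big") % size)
    return locations


def encode(features, size, probes=2):
    """Encode a dict of categorical features into a dense hashed vector."""
    vector = [0.0] * size
    for name, value in features.items():
        for index in probe_locations(name, value, size, probes):
            vector[index] += 1.0
    return vector


def count_shared_slots(pairs, size, probes=2):
    """Count slots that receive more than one distinct (name, value) pair."""
    owners = {}
    for name, value in pairs:
        for index in set(probe_locations(name, value, size, probes)):
            owners.setdefault(index, set()).add((name, value))
    return sum(1 for pair_set in owners.values() if len(pair_set) > 1)


# Illustrative (name, value) pairs; the values are hypothetical examples.
PAIRS = [
    ("education", "primary"), ("education", "secondary"),
    ("education", "tertiary"), ("education", "unknown"),
    ("marital", "married"), ("marital", "single"), ("marital", "divorced"),
    ("job", "admin"), ("job", "technician"), ("job", "services"),
    ("job", "management"), ("job", "retired"),
]

if __name__ == "__main__":
    for size in (11, 1000):
        print(size, "slots ->", count_shared_slots(PAIRS, size), "shared")
```

With 12 distinct pairs and only 11 slots, at least one slot must be shared by two different pairs, so the classifier cannot tell those values apart through that slot. Increasing the vector size makes such sharing less likely, at the cost of a larger model.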