Re: What to Implement/Improve/Document?

2011-11-18 Thread urun dogan
Ok, today I will start reading the SGD code in the repo and think about how to implement ASGD nicely. Your tricks are really useful. On 17 Nov 2011 18:16, "Ted Dunning" wrote:

Re: What to Implement/Improve/Document?

2011-11-17 Thread Ted Dunning
The key tricks are:
- do the updates of the averaged model in a sparse fashion. This will require doubling the space kept by the model
- determine when to switch to averaging

In addition we should bring in at the same time
- more flexibility on loss function (to allow the code to implement SVM
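A minimal sketch of the averaging idea described in this message, written in Python for brevity rather than as Mahout code. Everything named here (`asgd_train`, `avg_start`, `TOY_DATA`, the hinge loss, the constant step size) is an assumption made for illustration and is not taken from the thread or from the Mahout code base. The sketch keeps a second weight vector for the average (the doubled space) and starts averaging only after `avg_start` updates (the switch point). It updates the average densely, so it does not show the sparse-update trick itself.

```python
import random

# Hypothetical toy data: (sparse feature dict, label in {-1, +1}).
TOY_DATA = [
    ({0: 1.0}, 1),
    ({0: 2.0, 1: 0.5}, 1),
    ({1: 1.0}, -1),
    ({0: 0.5, 1: 2.0}, -1),
]

def dot(w, x):
    """Inner product of a dense weight list and a sparse feature dict."""
    return sum(w[i] * v for i, v in x.items())

def predict(w, x):
    return 1 if dot(w, x) >= 0 else -1

def asgd_train(data, dim, epochs=50, eta=0.1, lam=1e-3, avg_start=20, seed=0):
    """SGD on the L2-regularized hinge loss, plus an averaged copy of the weights.

    Returns (w, w_avg): the last SGD iterate and the average of all
    iterates produced after the first `avg_start` updates.
    """
    rng = random.Random(seed)
    data = list(data)
    w = [0.0] * dim      # current SGD iterate
    w_avg = [0.0] * dim  # averaged iterate: the second copy of the model
    t = 0                # total number of updates so far
    n_avg = 0            # number of iterates folded into the average
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            margin = y * dot(w, x)
            # Regularization shrink (dense here for clarity).
            for i in range(dim):
                w[i] *= 1.0 - eta * lam
            # Hinge-loss subgradient step, only on the active features.
            if margin < 1.0:
                for i, v in x.items():
                    w[i] += eta * y * v
            t += 1
            if t <= avg_start:
                # Before the switch the "average" simply tracks w.
                w_avg = list(w)
            else:
                # Running mean of the iterates seen since the switch.
                n_avg += 1
                for i in range(dim):
                    w_avg[i] += (w[i] - w_avg[i]) / n_avg
    return w, w_avg
```

Calling `asgd_train(TOY_DATA, dim=2)` returns both vectors; predictions would normally use `w_avg`. The dense loop over `w_avg` touches every coordinate on every step, which is the cost that a sparse formulation of the averaged update is meant to avoid.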

Re: What to Implement/Improve/Document?

2011-11-17 Thread urun dogan
Hi Ted; I have started reading the paper and I think I will finish it today. It is quite a nice approach, and thanks for the supervision. Cheers Ürün On Wed, Nov 16, 2011 at 8:14 PM, Ted Dunning wrote: > On Wed, Nov 16, 2011 at 9:50 AM, urun dogan wrote: > > > > > I have written the previous email before

Re: What to Implement/Improve/Document?

2011-11-16 Thread Ted Dunning
On Wed, Nov 16, 2011 at 9:50 AM, urun dogan wrote: > > I have written the previous email before reading Josh's email. Are there > any objections if I conclude that: implementation of SGD/ASGD based methods > has priority and therefore I will start implementing these methods soon? > I think that t

Re: What to Implement/Improve/Document?

2011-11-16 Thread Ted Dunning
Regarding linear classifiers, I think that the cluster/classifier unification and introduction of ASGD are the only items of substantial impact. On Wed, Nov 16, 2011 at 9:39 AM, Josh Patterson wrote: > Could you then make a list of JIRAs that you think are more > interesting in the near term, po

Re: What to Implement/Improve/Document?

2011-11-16 Thread urun dogan
kernels, parallelize the algorithm and we can >> have >> > an online SVM method for large/web scale datasets. >> > >> >> Now this begins to sound right. >> >> Honestly I am so much into SVM and kernel machines and I fear that I am >> > making bi

Re: What to Implement/Improve/Document?

2011-11-16 Thread urun dogan
am > > making big fuss out of small problems. > > > My key question is whether you have problems that need solving. Or do you > have an itch to do an implementation for the sake of having the > implementation? > > Either one is a reasonable motive, but the first is prefer

Re: What to Implement/Improve/Document?

2011-11-16 Thread Josh Patterson
I'd have to admit my interest in SVMs is more of the "abstract curiosity" nature; In the case of needed focus in the near term, similar to how Grant tagged: https://issues.apache.org/jira/secure/IssueNavigator.jspa?reset=true&jqlQuery=labels+%3D+MAHOUT_INTRO_CONTRIBUTE Could you then make a list

Re: What to Implement/Improve/Document?

2011-11-16 Thread Ted Dunning
On Wed, Nov 16, 2011 at 12:09 AM, urun dogan wrote: > Hi All; > > As I mentioned, I would find it really interesting to implement SGD and Pegasos. We > can add Pegasos to the SGD modules. Based on Leon Bottou's results, I would recommend a simple SGD implementation of SVM rather than Pegasos. http://leo

Re: What to Implement/Improve/Document?

2011-11-16 Thread urun dogan
Hi All; As I mentioned, I would find it really interesting to implement SGD and Pegasos. We can add Pegasos to the SGD modules. However, I think there are two issues we need to clarify: 1) In general, SGD-like ideas are used for online learning (of course they can be converted to batch learning) and Pegasos

Re: What to Implement/Improve/Document?

2011-11-15 Thread Raphael Cendrillon
Hi Urun and Josh, I'd also be interested in helping out in whatever way I can. One question, I've noticed that MAHOUT-334 was not ultimately adopted. Do we know the reason for this? Would it be best to finish out the patch in 232, or instead add the functionality into the existing SGD modules

Re: What to Implement/Improve/Document?

2011-11-15 Thread Josh Patterson
Urun, I've been looking at MAHOUT-232 and reading Nello Cristianini's book on SVMs. It sounds like you've done considerably more work than I in this arena. I'd be interested in collaborating with you on finishing out this patch, if you are interested in that type of arrangement (there is plenty of wor

Re: What to Implement/Improve/Document?

2011-11-15 Thread urun dogan
Dear Josh and Ted; Both ideas are very attractive. Honestly I want to do both of them. I am completely aware that this is quite some work to do. As I mentioned before, I am a Postdoc now and I am trying to develop new techniques by using ASGD. During my PhD I developed an efficient solver for multicl

Re: What to Implement/Improve/Document?

2011-11-15 Thread Ted Dunning
ASGD is also an opportunity lying on the table. http://leon.bottou.org/projects/sgd It would be lovely to have the current SGD system upgraded to use ASGD and allow multiple loss functions to allow SVM training as well as the current logistic regression. I would be happy to supervise, but can't
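To make the "multiple loss functions" idea concrete, here is a small sketch in Python (chosen for brevity; it is not Mahout code). The same SGD loop trains either a logistic-regression model or a linear SVM, depending only on which loss derivative is passed in. The names `logistic_dloss`, `hinge_dloss`, `sgd_train`, and `TOY_DATA`, the constant step size, and the L2 regularizer are all assumptions made for this illustration.

```python
import math
import random

# Hypothetical toy data: (sparse feature dict, label in {-1, +1}).
TOY_DATA = [
    ({0: 1.0}, 1),
    ({0: 2.0, 1: 0.5}, 1),
    ({1: 1.0}, -1),
    ({0: 0.5, 1: 2.0}, -1),
]

def logistic_dloss(margin):
    """Derivative of log(1 + exp(-m)) with respect to the margin m."""
    if margin > 0:
        e = math.exp(-margin)  # e <= 1, so no overflow
        return -e / (1.0 + e)
    return -1.0 / (1.0 + math.exp(margin))

def hinge_dloss(margin):
    """Subgradient of max(0, 1 - m) with respect to the margin m."""
    return -1.0 if margin < 1.0 else 0.0

def dot(w, x):
    """Inner product of a dense weight list and a sparse feature dict."""
    return sum(w[i] * v for i, v in x.items())

def predict(w, x):
    return 1 if dot(w, x) >= 0 else -1

def sgd_train(data, dloss, dim, epochs=50, eta=0.1, lam=1e-3, seed=0):
    """Plain SGD on an L2-regularized linear model with a pluggable loss.

    `dloss` maps the margin y * <w, x> to the derivative of the loss,
    so swapping it changes the model being trained and nothing else.
    """
    rng = random.Random(seed)
    data = list(data)
    w = [0.0] * dim
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            g = dloss(y * dot(w, x))
            # Regularization shrink (dense here for clarity).
            for i in range(dim):
                w[i] *= 1.0 - eta * lam
            # Loss gradient step touches only the active features.
            for i, v in x.items():
                w[i] -= eta * g * y * v
    return w
```

For example, `sgd_train(TOY_DATA, hinge_dloss, dim=2)` trains a linear SVM and `sgd_train(TOY_DATA, logistic_dloss, dim=2)` trains logistic regression on the same data. Passing the derivative as a function keeps the update loop independent of the loss, which is one simple way to get the flexibility discussed here.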

Re: What to Implement/Improve/Document?

2011-11-15 Thread Josh Patterson
Urun, Sounds like you have quite a bit of SVM experience. There is always: https://issues.apache.org/jira/browse/MAHOUT-232 to take a look at which involves getting SVMs going in Mahout. I've looked at it a bit while working on some smaller patches, I'd be interested in discussing it with you giv

Re: What to Implement/Improve/Document?

2011-11-14 Thread Grant Ingersoll
https://cwiki.apache.org/confluence/display/MAHOUT/How+To+Contribute has some tips, ideas, etc. It's usually best to start with a few small patches to get your feet wet w/ the development process. On Nov 14, 2011, at 7:44 PM, Raphael Cendrillon wrote: > Hi Urun, > > I'm in a very similar s

Re: What to Implement/Improve/Document?

2011-11-14 Thread Raphael Cendrillon
Hi Urun, I'm in a very similar situation. I have a background (PhD) in optimization and signal processing and some experience with principal component analysis. I'm fairly comfortable with Java. I'm also very interested in Mahout, and large scale problems. If we can find a suitable area I wo

What to Implement/Improve/Document?

2011-11-14 Thread urun dogan
Hi All; I want to congratulate all of the contributors to the project. I find the idea of this project very nice and I want to contribute to it. I am a postdoctoral researcher involved in developing machine learning algorithms. During my PhD I have developed several mult