Re: Introduction

2009-04-05 Thread Isabel Drost
On Thursday 02 April 2009, Daniel Nee wrote: > I've been following Hadoop and the Mahout project for a while now and > I thought I should introduce myself. I'm Daniel Nee, I am a master's > student at University College London studying Computational Statistics > and Machine Learning. Welcome Dani

Re: Introduction

2009-04-02 Thread Ted Dunning
Having you guys work together is entirely in keeping and compatible with both the open source ideas and google summer of code ideas. So, Daniel, don't imagine that this idea is "taken". Your suggestions and code (parallel or sequential) are highly valued. 2009/4/2 Yifan Wang > > I am Yifan. Gl

re: Introduction

2009-04-02 Thread Yifan Wang
- From: Daniel Nee [mailto:nee.dan...@googlemail.com] Sent: April 2, 2009 16:53 To: mahout-dev@lucene.apache.org Subject: Introduction Hi all, I've been following Hadoop and the Mahout project for a while now and I thought I should introduce myself. I'm Daniel Nee, I am a master's studen

Introduction

2009-04-02 Thread Daniel Nee
Hi all, I've been following Hadoop and the Mahout project for a while now and I thought I should introduce myself. I'm Daniel Nee, I am a master's student at University College London studying Computational Statistics and Machine Learning. Before that I did my undergraduate in Computer Science at

Re: Introduction for student interested in GSoC

2009-03-31 Thread David Hall
Here's a followup proposal (submitted to GSOC's site. I will add it to the wiki, but I'm having trouble accessing it right now) Thanks! -- David Title/Summary: Distributed Latent Dirichlet Allocation Student: David Hall Student e-mail: d...@cs.stanford.edu Student Major: Symbolic Systems/ Co

Re: Introduction for student interested in GSoC

2009-03-25 Thread David Hall
On Wed, Mar 25, 2009 at 6:41 PM, Ted Dunning wrote: > Groovy closures are just objects as well, but they can't easily be > serialized because they can capture references to other objects which are > unlikely to exist on the far machine. Same problem in Scala... But I just punt and assume people b

Re: Introduction for student interested in GSoC

2009-03-25 Thread Ted Dunning
One very nice thing that Cascading allows in the logical flow is that it allows groups and joins to be expressed which it then translates and schedules reasonably well into MR programs in which the appropriate functions are all collected as you suggest. On Wed, Mar 25, 2009 at 1:23 PM, David Hall

Re: Introduction for student interested in GSoC

2009-03-25 Thread Ted Dunning
Groovy closures are just objects as well, but they can't easily be serialized because they can capture references to other objects which are unlikely to exist on the far machine. Can you say more about the compiler plugin? Or provide a pointer? Also, in your example here, how would you deal with

Re: Introduction for student interested in GSoC

2009-03-25 Thread David Hall
On Wed, Mar 25, 2009 at 10:56 AM, Ted Dunning wrote: > David, > > You are right that this is veering a little bit away from Mahout's central > focus.  We will have to beg a bit of forgiveness on that. I'm not picky, certainly... :-) > > I have a question for you and some hints about useful direc

Re: Introduction for student interested in GSoC

2009-03-25 Thread Ted Dunning
David, You are right that this is veering a little bit away from Mahout's central focus. We will have to beg a bit of forgiveness on that. I have a question for you and some hints about useful directions. First, is it possible for Scala to move the byte code or other representation of a closure

Re: Introduction for student interested in GSoC

2009-03-25 Thread Ted Dunning
Get her to do it for mahout! Tell her that an open source dissertation is a great way to get noticed. On Tue, Mar 24, 2009 at 5:04 PM, David Hall wrote: > Actually, my officemate's dissertation project is very closely related > to this, except using parsing as a "base". That is to say, I probab

Re: Introduction for student interested in GSoC

2009-03-25 Thread Grant Ingersoll
On Mar 25, 2009, at 12:34 AM, David Hall wrote: On Tue, Mar 24, 2009 at 4:15 PM, Ted Dunning wrote: This sounds fantastic. I think that your scala code is interesting, but your thoughts on LDA are much more so. I tried doing a similar simplification of map-reduce program writing using

Re: Introduction for student interested in GSoC

2009-03-24 Thread David Hall
On Tue, Mar 24, 2009 at 4:34 PM, David Hall wrote: > On Tue, Mar 24, 2009 at 4:15 PM, Ted Dunning wrote: >> It would also be interesting to see how you might attack semi-supervised >> multi-task learning using a well-founded Bayesian approach.  For a >> non-Bayesian example with impressive result

Re: Introduction for student interested in GSoC

2009-03-24 Thread David Hall
On Tue, Mar 24, 2009 at 4:15 PM, Ted Dunning wrote: > This sounds fantastic. > > I think that your scala code is interesting, but your thoughts on LDA are > much more so.  I tried doing a similar simplification of map-reduce program > writing using groovy and found that in spite of even smaller pr

Re: Introduction for student interested in GSoC

2009-03-24 Thread Ted Dunning
This sounds fantastic. I think that your scala code is interesting, but your thoughts on LDA are much more so. I tried doing a similar simplification of map-reduce program writing using groovy and found that in spite of even smaller programs than you quote for word-count, that the benefits in pra

Re: Introduction for student interested in GSoC

2009-03-24 Thread Vincent Hu
Hi David, I heard your talk on your POS research a while ago, and it is nice to meet you here. Good luck! regards, Vincent On Tue, Mar 24, 2009 at 12:26 AM, David Hall wrote: > Hi, > > I'll begin by saying I'm not 100% sure how this works; I am new to the > Mahout mailing list, though I ha

Introduction for student interested in GSoC

2009-03-24 Thread David Hall
Hi, I'll begin by saying I'm not 100% sure how this works; I am new to the Mahout mailing list, though I have been subscribed to the Hadoop list. I have been following the Mahout project with interest for some time, however. I'm David Hall, a graduating master's student in the Natural Language Pr

Re: introduction

2009-03-17 Thread Ted Dunning
Double plus concur with Grant's enthusiasm. Docs are almost more valuable than code! On Tue, Mar 17, 2009 at 11:35 AM, Grant Ingersoll wrote: > > On Mar 17, 2009, at 2:12 PM, Jessy Cowan-sharp wrote: > > hi everyone, >> >> i've been lurking on the list for a few weeks. i'm a california-based gr

Re: introduction

2009-03-17 Thread Grant Ingersoll
On Mar 17, 2009, at 2:12 PM, Jessy Cowan-sharp wrote: hi everyone, i've been lurking on the list for a few weeks. i'm a california-based grad student in NLP/digital forensics and will be using various data mining and machine learning techniques in my masters thesis. i wrote a blog post a

introduction

2009-03-17 Thread Jessy Cowan-sharp
hi everyone, i've been lurking on the list for a few weeks. i'm a california-based grad student in NLP/digital forensics and will be using various data mining and machine learning techniques in my masters thesis. i wrote a blog post a while back on 'disposable science' (and my frustration with it)