On Monday 31 March 2008, Jeff Eastman wrote:
> I think we can refer to external datasets in our documentation and load
> them on demand when we run against them. That way we do not have to store
> them either.
So I guess we should just come up with a list of datasets that are
interesting to us.
On Sunday 30 March 2008, Rodrigo Tripodi wrote:
> I've chosen to implement one clustering and one classification algorithm, a
> priori the EM and SVM algorithms.
There is a patch still in JIRA (MAHOUT-4) that contains a simple EM prototype.
It is still non-parallel and could be polished. But maybe
I have added a PDF version for those who do not have OpenOffice:
http://www.isabel-drost.de/mahout_fast_feather.pdf
This evening, I will add the missing content of the "Problem setting" slide
and refactor the "Who we are" slide with your pictures and the missing names.
Isabel
--
Most people want eit
[
https://issues.apache.org/jira/browse/MAHOUT-22?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ted Dunning updated MAHOUT-22:
--
Attachment: MAHOUT-22.patch
Here are the trivial changes.
> Several matrix exceptions are checked excep
[
https://issues.apache.org/jira/browse/MAHOUT-21?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ted Dunning updated MAHOUT-21:
--
Attachment: MAHOUT-21.patch
This is a draft of a continuous variable EP optimizer based on "Recorded Ste
Several matrix exceptions are checked exceptions, but should be unchecked
-
Key: MAHOUT-22
URL: https://issues.apache.org/jira/browse/MAHOUT-22
Project: Mahout
Issue Typ
Need reference implementation of Evolutionary Programming
-
Key: MAHOUT-21
URL: https://issues.apache.org/jira/browse/MAHOUT-21
Project: Mahout
Issue Type: New Feature
Repor
I think we can refer to external datasets in our documentation and load them
on demand when we run against them. That way we do not have to store them
either.
Jeff
Jeff Eastman, Ph.D.
Windward Solutions Inc.
+1.415.298.0023
http://windwardsolutions.com
http://jeffeastman.blogspot.com
> -O
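Jeff's suggestion above, referring to external datasets and loading them on demand rather than storing them, could look roughly like the sketch below. It is only an illustration: the function name `fetch_dataset`, the cache layout, and the use of a hashed URL as the file name are assumptions made here, not anything proposed on the list.

```python
import hashlib
import urllib.request
from pathlib import Path


def fetch_dataset(url, cache_dir):
    """Return a local path for `url`, downloading it only if it is not cached yet.

    Hypothetical helper: the name and the cache layout are assumptions.
    """
    cache = Path(cache_dir)
    cache.mkdir(parents=True, exist_ok=True)

    # Derive a stable file name from the URL so repeated runs hit the cache.
    target = cache / hashlib.sha256(url.encode("utf-8")).hexdigest()

    if not target.exists():
        # Download to a temporary name first so an interrupted transfer
        # never leaves a half-written file under the final name.
        tmp = target.with_suffix(".part")
        urllib.request.urlretrieve(url, str(tmp))
        tmp.replace(target)

    return target
```

A test run would call `fetch_dataset(dataset_url, "build/datasets")` and read from the returned path; documentation then only needs to list the dataset URLs.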
Both good points.
On 3/30/08 3:38 PM, "Paul Elschot" <[EMAIL PROTECTED]> wrote:
> Op Sunday 30 March 2008 20:51:40 schreef Ted Dunning:
>> I am sure that the entire Mahout community will be happy to help.
>>
>> You may find, however, that naïve Bayes is trivially parallel (and
>> not very diff
Thanks.
If I can't finish the whole project over the summer, which I'll definitely
try to do, I'll finish it after GSoC.
On Mon, Mar 31, 2008 at 4:20 AM, Isabel Drost <[EMAIL PROTECTED]>
wrote:
> On Sunday 30 March 2008, Ted Dunning wrote:
> > This is an excellent proposal. It might be a little b
ok. thank you.
2008/3/30, Grant Ingersoll <[EMAIL PROTECTED]>:
>
> Sounds reasonable. Make sure you include information on timelines,
> bio, etc. There are many emails in the archive discussing various
> aspects of GSOC.
>
> Good luck,
>
> Grant
>
>
> On Mar 30, 2008, at 5:20 PM, Rodrigo Tripodi
Op Sunday 30 March 2008 20:51:40 schreef Ted Dunning:
> I am sure that the entire Mahout community will be happy to help.
>
> You may find, however, that naïve Bayes is trivially parallel (and
> not very difficult even without parallelism). That means you may
> want to have something additional to
Sounds reasonable. Make sure you include information on timelines,
bio, etc. There are many emails in the archive discussing various
aspects of GSOC.
Good luck,
Grant
On Mar 30, 2008, at 5:20 PM, Rodrigo Tripodi wrote:
Hello everybody,
I know it's a little bit late, but I'm really excit
See here for a picture of me: http://www.veoh.com/users/ted
On 3/30/08 1:29 PM, "Isabel Drost" <[EMAIL PROTECTED]> wrote:
>
> Hello,
>
> my proposal for presenting our project at the Fast Feather session at
> ApacheCon EU was accepted.
>
> I am currently about to prepare the slides for my t
Hello everybody,
I know it's a little bit late, but I'm really excited to submit a Google
Summer of Code proposal for the Mahout project. I've read there is already a
k-means implementation, so I've decided to implement another algorithm. I've
chosen to implement one clustering and one classificati
Hello,
my proposal for presenting our project at the Fast Feather session at
ApacheCon EU was accepted.
I am currently preparing the slides for my talk. I would like to
include one slide on the project members who were crazy enough to start all
this half a year ago. It would be nice if I
On Sunday 30 March 2008, Jeff Eastman wrote:
> I'm working with my colleagues at CollabNet who have expressed interest in
> providing us some EC2 time for this sort of testing.
Sounds great to me.
> They are working on EC2 deployment of Hadoop using their CUBiT machine
> allocation environment a
On Sunday 30 March 2008, Ted Dunning wrote:
> This is an excellent proposal. It might be a little bit ambitious for a
> summer, but it is nicely separated so that partial success will stand
> alone.
+1
--
They are called computers simply because computation is the only significant
job that has
[
https://issues.apache.org/jira/browse/MAHOUT-20?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12583481#action_12583481
]
Isabel Drost commented on MAHOUT-20:
I have already done some migration for the distance
[
https://issues.apache.org/jira/browse/MAHOUT-20?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Isabel Drost reassigned MAHOUT-20:
--
Assignee: Isabel Drost (was: Jeff Eastman)
> Migrate Canopy and KMeans Implementations to Vecto
Migrate Canopy and KMeans Implementations to Vectors
Key: MAHOUT-20
URL: https://issues.apache.org/jira/browse/MAHOUT-20
Project: Mahout
Issue Type: Task
Components: Clustering
I am sure that the entire Mahout community will be happy to help.
You may find, however, that naïve Bayes is trivially parallel (and not very
difficult even without parallelism). That means you may want to have
something additional to work on in the back of your mind.
On 3/30/08 6:50 AM, "Vit
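Ted's remark above that naïve Bayes is trivially parallel can be illustrated with a short sketch. Training reduces to counting labels and (label, word) pairs, and counts from separate shards of the data combine by plain addition, which is the shape of a map/reduce job. The code below is a single-process illustration under that assumption; the function names and the toy documents are made up here and are not Mahout or Hadoop code.

```python
from collections import Counter
from functools import reduce


def count_shard(shard):
    """Map step: count labels and (label, word) pairs in one shard of documents."""
    word_counts = Counter()
    label_counts = Counter()
    for label, words in shard:
        label_counts[label] += 1
        for word in words:
            word_counts[(label, word)] += 1
    return word_counts, label_counts


def merge(a, b):
    """Reduce step: counts from two shards combine by plain addition."""
    return a[0] + b[0], a[1] + b[1]


def train(shards):
    """Count every shard independently, then sum the partial counts."""
    return reduce(merge, map(count_shard, shards), (Counter(), Counter()))


# Toy data (hypothetical): each document is a (label, words) pair.
docs = [
    ("spam", ["buy", "now"]),
    ("ham", ["meeting", "now"]),
    ("spam", ["buy", "cheap"]),
    ("ham", ["lunch"]),
]

single = train([docs])                            # one worker sees everything
sharded = train([docs[:1], docs[1:3], docs[3:]])  # three workers, any split
assert single == sharded
```

Because the merged counts do not depend on how the documents are split, each shard can be counted on a different machine, and the class priors and word likelihoods are computed from the summed counts afterwards.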
This is an excellent proposal. It might be a little bit ambitious for a
summer, but it is nicely separated so that partial success will stand alone.
I would be happy to help mentor on this, as I expect would most of the
Mahout community.
On 3/30/08 4:41 AM, "Yun Jiang" <[EMAIL PROTECTED]> wro
[
https://issues.apache.org/jira/browse/MAHOUT-15?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jeff Eastman updated MAHOUT-15:
---
Attachment: MAHOUT-15e.patch
This patch has improved javadoc comments and removes some debugging code.
I'm working with my colleagues at CollabNet who have expressed interest in
providing us some EC2 time for this sort of testing. They are working on EC2
deployment of Hadoop using their CUBiT machine allocation environment and
the quid pro quo would be that we help them exercise this tool. We have n
Hi Natallia,
Have a look at https://issues.apache.org/jira/browse/MAHOUT-9. I am
hoping to have something to put up after ApacheCon Europe, at which
point testing and help would be appreciated, so I am not sure whether it
will make sense as a GSoC project. Perhaps you would be interested
i
Hi,
My name is Natallia Vitalisova and I've applied for Google SoC 2008 to
implement the Naïve Bayes algorithm on Hadoop.
Whether or not I am accepted for SoC, I want to spend my time investigating
this topic, which I consider to be very interesting. But I will certainly need a
mentor
Doesn't sound like you need a mentor :-) I'd just start by picking
something that interests you and is useful to you, work on it,
and submit a patch. Consider the community to be the mentor. Just
feel free to ask questions and put up patches. Patches don't have to
be perfect, they
Hi,
Here is my proposal. Hope you can give me some advice. Thanks a lot!
*Overview*
Among those ten machine learning algorithms mentioned by Cheng-Tao Chu et
al. [1], I'm really interested in Logistic Regression (LR). I would like to
implement an LR program on Hadoop which can classify both binary and m
On Saturday 29 March 2008, Ted Dunning wrote:
> SVM is not the only solution to these problems. For many search engine
> applications, it isn't even likely to be the best. Regularized logistic
> regression is a strong candidate as are random forests and boosted trees.
There have been several int
On Saturday 29 March 2008, Samee Zahur wrote:
> Being an undergrad student interested in the field of data-intensive machine
> learning techniques and applications, I am interested in implementing these
> algorithms as a way of getting exposure to this field.
Great. Nice to have you here.
>
On Sunday 30 March 2008, you wrote:
> This is my application, give me feedback, please.
Sorry, I have a slow network connection at the moment and made the mistake of
starting to answer mails before everything had arrived. I saw your extended
application only after replying to your initial mail :(
Isabel
On Saturday 29 March 2008, Grant Ingersoll wrote:
> Finally, I would certainly like to encourage those who don't get
> selected to stick around and contribute.
+1 from me. In addition to what Grant already said, it is a great experience
to see your code end up in an Apache project.
Isabel
--
> I think it would be great if you could add a little
> more information to your
> application - if you have not already done so in the
> GSoC web form. Some
> ideas of useful information, that can help us judge
> your application:
>
> - your background
> - your reason for applying to do this
On Saturday 29 March 2008, Ted Dunning wrote:
> The basic outline is to set up a Jira request for enhancement that
> describes what you want to do, write or find a sequential version for
> reference and then start on the actual coding.
If you want to get funding for your project from Google, you m
On Saturday 29 March 2008, Marko Novakovic wrote:
> I am applying to implement the SVM algorithm on the Hadoop platform.
> I hope that I will be accepted by Google and Apache;
> I am serious about doing this job well.
I think it would be great if you could add a little more information to your
application -