[jira] Updated: (MAHOUT-6) Need a matrix implementation

2008-03-07 Thread Jeff Eastman (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-6?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Eastman updated MAHOUT-6: -- Attachment: MAHOUT-6k.diff Moved all Vector and Matrix artifacts into a new org.apache.mahout.matrix pac

[jira] Commented: (MAHOUT-6) Need a matrix implementation

2008-03-07 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12576495#action_12576495 ] Ted Dunning commented on MAHOUT-6: -- I agree that they should be together. I don't know whe

Re: [jira] Commented: (MAHOUT-6) Need a matrix implementation

2008-03-07 Thread Ted Dunning
Sounds good. I still like the idea of allowing updates, though, for people with less discipline. On 3/7/08 12:59 PM, "Jason Rennie (JIRA)" <[EMAIL PROTECTED]> wrote: > One way I get around the sorted constraint while constructing a sparse vector > is a SparseVectorBuilder class.
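The builder idea Jason describes above can be sketched roughly as follows. This is a minimal illustration in Python, not the class from the thread or anything in the Mahout patch: the names `SparseVectorBuilder`, `set`, `build`, and `dot` are hypothetical, and it assumes a sparse vector is stored as parallel index/value sequences sorted by index. Entries may be added (or overwritten, which also covers Ted's point about allowing updates) in any order; sorting happens once, at build time.

```python
class SparseVectorBuilder:
    """Hypothetical builder: accepts (index, value) pairs in any order."""

    def __init__(self):
        # index -> value; a later set() on the same index overwrites the earlier one
        self._entries = {}

    def set(self, index, value):
        if index < 0:
            raise ValueError("index must be non-negative")
        self._entries[index] = value
        return self  # allow chained calls

    def build(self):
        """Return (indices, values) tuples sorted by index, dropping zero entries."""
        items = sorted((i, v) for i, v in self._entries.items() if v != 0.0)
        indices = tuple(i for i, _ in items)
        values = tuple(v for _, v in items)
        return indices, values


def dot(a, b):
    """Dot product of two built vectors using a merge over the sorted indices."""
    (ai, av), (bi, bv) = a, b
    i = j = 0
    total = 0.0
    while i < len(ai) and j < len(bi):
        if ai[i] == bi[j]:
            total += av[i] * bv[j]
            i += 1
            j += 1
        elif ai[i] < bi[j]:
            i += 1
        else:
            j += 1
    return total
```

The sorted invariant is what makes the linear-time merge in `dot` possible, which is presumably why the constraint is worth keeping on the finished vector even if construction is unordered.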

RE: [jira] Commented: (MAHOUT-6) Need a matrix implementation

2008-03-07 Thread Jeff Eastman
Vector has both dot() and cross() products. Are you looking at the latest .diff? I can easily move the vector package stuff back into matrix. It used to be there and I moved it into its own package just to "organize" it better. You are correct, however, that having them in the same package would a

[jira] Commented: (MAHOUT-6) Need a matrix implementation

2008-03-07 Thread Jason Rennie (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12576402#action_12576402 ] Jason Rennie commented on MAHOUT-6: --- Hmm... actually, the HashMap implementation doubles ni

[jira] Commented: (MAHOUT-6) Need a matrix implementation

2008-03-07 Thread Jason Rennie (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12576381#action_12576381 ] Jason Rennie commented on MAHOUT-6: --- Btw, noticed the matrix stuff is currently under utils

[jira] Commented: (MAHOUT-6) Need a matrix implementation

2008-03-07 Thread Jason Rennie (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12576371#action_12576371 ] Jason Rennie commented on MAHOUT-6: --- Re: Jeff Sounds good. I think I might actually have

Re: [jira] Commented: (MAHOUT-4) Simple prototype for Expectation Maximization (EM)

2008-03-07 Thread Ted Dunning
That should be quite possible. K-means is an approximate implementation of an EM algorithm. On 3/7/08 10:51 AM, "Isabel Drost (JIRA)" <[EMAIL PROTECTED]> wrote: > As far as I know EM from clustering tasks it should be possible to port the > algorithm to a Hadoop setting in a similar way as th
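Ted's remark that K-means approximates EM can be made concrete with a small sketch. The code below is an illustration only, assuming one-dimensional points and a hypothetical function name `kmeans_1d`; it is not the MAHOUT-4 patch or Mahout's K-means. Each iteration has an E-like step that makes a hard assignment of every point to its nearest centroid (full EM would compute soft responsibilities instead) and an M-like step that re-estimates each centroid as the mean of its assigned points.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Hard-assignment EM on 1-D data; returns the final list of centroids."""
    centroids = list(centroids)
    for _ in range(iterations):
        # E-like step: assign each point to the nearest centroid (hard assignment)
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda k: (p - centroids[k]) ** 2)
            clusters[nearest].append(p)
        # M-like step: each centroid becomes the mean of its cluster;
        # an empty cluster keeps its previous centroid
        centroids = [sum(c) / len(c) if c else centroids[k]
                     for k, c in enumerate(clusters)]
    return centroids
```

In a Hadoop setting, the assignment step maps naturally onto a mapper and the mean computation onto a reducer, which is one reading of why a similar port for EM seemed feasible to the participants.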

[jira] Commented: (MAHOUT-4) Simple prototype for Expectation Maximization (EM)

2008-03-07 Thread Isabel Drost (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-4?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12576324#action_12576324 ] Isabel Drost commented on MAHOUT-4: --- Hello Ankur, I checked out the patch yesterday. I was

Re: Google Summer of Code

2008-03-07 Thread Isabel Drost
On Friday 07 March 2008, Grant Ingersoll wrote: > Sounds good. I should also note that all mentoring (barring > personal conversation) should take place on the dev list. That is, > decisions, discussions on what to do should be done on the list so > that we all benefit from the understandi

RE: [jira] Assigned: (MAHOUT-12) Point formatting and parsing improved (StringBuilder, no need for trailing comma).

2008-03-07 Thread Jeff Eastman
Ted noted an easy fix to my Excel use case that I wasn't aware of, so my point is agreeably moot. I concur that we ought to have additional Writable representations to make intra-Hadoop transfers more streamlined. This is certainly *not* too late to pursue. I would encourage you to propose a reco

[jira] Commented: (MAHOUT-6) Need a matrix implementation

2008-03-07 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12576283#action_12576283 ] Ted Dunning commented on MAHOUT-6: -- Hashmaps in Java are surprisingly fast (very nearly 1 a

[jira] Issue Comment Edited: (MAHOUT-6) Need a matrix implementation

2008-03-07 Thread Jeff Eastman (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12576037#action_12576037 ] jeastman edited comment on MAHOUT-6 at 3/7/08 8:20 AM: --- Boy, I am s

Re: [jira] Commented: (MAHOUT-14) Need an SVM implementation

2008-03-07 Thread Grant Ingersoll
Sounds good, I was mistakenly under the impression that this was something you were going to donate... These other implementations are also good for feeding the same input and comparing results, I would imagine, although I don't think one for one results comparison is necessary. On Mar 7

Re: [jira] Commented: (MAHOUT-14) Need an SVM implementation

2008-03-07 Thread Paul Elschot
Op Friday 07 March 2008 13:14:55 schreef Grant Ingersoll: > On Mar 7, 2008, at 3:29 AM, Paul Elschot (JIRA) wrote: > >[ > > https://issues.apache.org/jira/browse/MAHOUT-14?page=com.atlassian. > >jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId= > >12576077 #action_12576077 ]

Re: [jira] Commented: (MAHOUT-4) Simple prototype for Expectation Maximization (EM)

2008-03-07 Thread Grant Ingersoll
Hi Ankur, I haven't had a chance to look at this yet, but I will most likely after the 12 since I am swamped at work right now. I do want to get it in Mahout, so if some other committer has the time before then, that would be good too. -Grant On Mar 7, 2008, at 7:43 AM, Ankur (JIRA) wr

[jira] Commented: (MAHOUT-4) Simple prototype for Expectation Maximization (EM)

2008-03-07 Thread Ankur (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-4?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12576166#action_12576166 ] Ankur commented on MAHOUT-4: Hi Grant, Were you able to get the patch working? A

Re: Google Summer of Code

2008-03-07 Thread Grant Ingersoll
On Mar 7, 2008, at 3:08 AM, Isabel Drost wrote: On Thursday 06 March 2008, Grant Ingersoll wrote: I think we can split the duties a bit, too. I think the Apache FAQ also said that - in accordance with the usual Apache way of doing things - it would be ok if the GSoC students would receive he

Re: [jira] Commented: (MAHOUT-14) Need an SVM implementation

2008-03-07 Thread Grant Ingersoll
On Mar 7, 2008, at 3:29 AM, Paul Elschot (JIRA) wrote: [ https://issues.apache.org/jira/browse/MAHOUT-14?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12576077 #action_12576077 ] Paul Elschot commented on MAHOUT-14: ---

Re: Class Loader Problem

2008-03-07 Thread Dawid Weiss
Ok, I'll look into it through the weekend -- there is a JIRA task for this (and the JAR hint issues), so keep an eye on it. D. Jeff Eastman wrote: Hi Dawid, I figured somebody who really understands class loaders would be able to improve on my initial implementation. I don't have a small

Re: Google Summer of Code

2008-03-07 Thread Dawid Weiss
What about encouraging your students to submit their work at Mahout? Just a naive thought of mine. Those students I'm in charge of have their area of interest defined already -- too late to change it. Good idea for the future, I have been thinking about it, actually. D.

Re: [jira] Assigned: (MAHOUT-11) Static fields used throughout clustering code (Canopy, K-Means).

2008-03-07 Thread Dawid Weiss
I do see a few advantages of using static variables, actually -- I just wasn't sure if it's contractual for Hadoop jobs to run in isolation from other jobs. This is a refactoring rather than functionality improvement, so I'll leave the issue open for some time; once I get a spare minute I'll l

[jira] Commented: (MAHOUT-14) Need an SVM implementation

2008-03-07 Thread Paul Elschot (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-14?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12576077#action_12576077 ] Paul Elschot commented on MAHOUT-14: I've mentioned svmlin before: http://people.cs.uchi

Re: [jira] Assigned: (MAHOUT-12) Point formatting and parsing improved (StringBuilder, no need for trailing comma).

2008-03-07 Thread Dawid Weiss
The Excel scenario doesn't really convince me much, Jeff. For one thing, I don't have Excel, but this is a minor issue, for another -- I don't think anyone will actually import stuff that's supposed to be very large (that's why we do it in Hadoop, don't we) into a spreadsheet. In fact I did

Re: Google Summer of Code

2008-03-07 Thread Isabel Drost
On Thursday 06 March 2008, Matthew Riley wrote: > I would basically be interested in doing anything that fits in well with > the overall goals of the Mahout project. Whether that is implementing well > known algorithms within the Hadoop framework or working on some novel idea > is up to the mentors

[jira] Created: (MAHOUT-14) Need an SVM implementation

2008-03-07 Thread Paul Elschot (JIRA)
Need an SVM implementation -- Key: MAHOUT-14 URL: https://issues.apache.org/jira/browse/MAHOUT-14 Project: Mahout Issue Type: Wish Components: Classification Reporter: Paul Elschot Pr

Re: Google Summer of Code

2008-03-07 Thread Isabel Drost
On Thursday 06 March 2008, Grant Ingersoll wrote: > I think we can split the duties a bit, too. I think the Apache FAQ also said that - in accordance with the usual Apache way of doing things - it would be ok if the GSoC students would receive help from all community members. So the actual time spe