[jira] [Commented] (MAHOUT-705) MongoDB DataModel support

2011-05-31 Thread Mike Khristo (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13041996#comment-13041996 ] Mike Khristo commented on MAHOUT-705: - Thanks for the patch. The patch works great, e

[jira] [Updated] (MAHOUT-717) LDAPrintTopics only prints first topic when outputting to stdout

2011-05-31 Thread Mat Kelcey (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mat Kelcey updated MAHOUT-717: -- Summary: LDAPrintTopics only prints first topic when outputting to stdout (was: LADPrintTopics only pr

[jira] [Updated] (MAHOUT-717) LADPrintTopics only prints first topic when outputting to stdout

2011-05-31 Thread Mat Kelcey (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mat Kelcey updated MAHOUT-717: -- Attachment: (was: mahout-717.patch) > LADPrintTopics only prints first topic when outputting to std

[jira] [Updated] (MAHOUT-717) LADPrintTopics only prints first topic when outputting to stdout

2011-05-31 Thread Mat Kelcey (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mat Kelcey updated MAHOUT-717: -- Attachment: mahout-717.patch > LADPrintTopics only prints first topic when outputting to stdout > -

[jira] [Updated] (MAHOUT-717) LADPrintTopics only prints first topic when outputting to stdout

2011-05-31 Thread Mat Kelcey (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mat Kelcey updated MAHOUT-717: -- Attachment: mahout-717.patch > LADPrintTopics only prints first topic when outputting to stdout > -

[jira] [Created] (MAHOUT-717) LADPrintTopics only prints first topic when outputting to stdout

2011-05-31 Thread Mat Kelcey (JIRA)
LADPrintTopics only prints first topic when outputting to stdout Key: MAHOUT-717 URL: https://issues.apache.org/jira/browse/MAHOUT-717 Project: Mahout Issue Type: Bug Affec

Re: AdaBoost

2011-05-31 Thread Hector Yee
Wojciech, I've opened a ticket you can watch https://issues.apache.org/jira/browse/MAHOUT-716 I should have the in-core code ready in ~3 days. The gradient portion is easily parallelizable if you want to implement it as MapReduce. On Tue, May 24, 2011 at 1:57 PM, Wojciech Indyk wrote: > Hi! >

[jira] [Created] (MAHOUT-716) Implement Boosting

2011-05-31 Thread Hector Yee (JIRA)
Implement Boosting -- Key: MAHOUT-716 URL: https://issues.apache.org/jira/browse/MAHOUT-716 Project: Mahout Issue Type: New Feature Components: Classification Affects Versions: 0.6 Reporter: Hector Y

RE: [VOTE] Release Mahout 0.5, take 3

2011-05-31 Thread Jeff Eastman
I've installed the take 3 .gz bits and they built without issue. I've run the synthetic control examples on both my clusters and they also ran without issue. The clustering display examples run in local mode and SpectralK still has the same file path problem I reported earlier. The 20 newsgroups

Re: [GSoC] HMM formats

2011-05-31 Thread Sergey Bartunov
I wrote ObservedSequenceWritable for myself as a starting point and placed it in my fork on GitHub https://github.com/sbos/mahout/tree/input Dhruv, feel free to criticize and/or modify the code. I will keep all shared parts in the "input" branch. It would be much better to be compatible wit

[GSoC] HMM formats

2011-05-31 Thread Sergey Bartunov
Hi all. I'd like to discuss several things about Dhruv's and my projects, which are related to parallel HMM functionality in Mahout. Since we're working on different parts of the same thing, there are some shared questions. With this mail I just want to initiate communication among us and ke

[jira] [Commented] (MAHOUT-715) Use Kryo for serializing vectors

2011-05-31 Thread Dhruv Kumar (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13041753#comment-13041753 ] Dhruv Kumar commented on MAHOUT-715: Thanks Gustavo, I'll have a look at this over the

Re: [VOTE] Release Mahout 0.5, take 3

2011-05-31 Thread Dmitriy Lyubimov
For what it's worth, I can add +1, but on the same basis I did before take 3. I won't be able to verify any of these new problems any time soon. On Tue, May 31, 2011 at 11:22 AM, Jeff Eastman wrote: > Will try to get you one today :) > > -Original Message- > From: Benson Margulies [mai

RE: [VOTE] Release Mahout 0.5, take 3

2011-05-31 Thread Jeff Eastman
Will try to get you one today :) -Original Message- From: Benson Margulies [mailto:bimargul...@gmail.com] Sent: Tuesday, May 31, 2011 10:59 AM To: dev@mahout.apache.org Subject: Re: [VOTE] Release Mahout 0.5, take 3 Legally, we need 3 +1's on this thread on this RC. On Tue, May 31, 201

Re: [VOTE] Release Mahout 0.5, take 3

2011-05-31 Thread Benson Margulies
Legally, we need 3 +1's on this thread on this RC. On Tue, May 31, 2011 at 1:15 PM, Jeff Eastman wrote: > I did some cluster testing of take 2 and only found one potential issue > surfaced by an example that is broken. Haven't waded through the 100+ > postings since I got back last night, but

[jira] [Updated] (MAHOUT-626) T1 and T2 Values in Canopy (& MeanShift)

2011-05-31 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated MAHOUT-626: - Affects Version/s: 0.4 Fix Version/s: 0.6 JIRA housekeeping: assigning for 0.6 since I bet it's e

[jira] [Commented] (MAHOUT-715) Use Kryo for serializing vectors

2011-05-31 Thread Gustavo Salazar Torres (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13041701#comment-13041701 ] Gustavo Salazar Torres commented on MAHOUT-715: --- Yes, I think a good practic

[jira] [Commented] (MAHOUT-715) Use Kryo for serializing vectors

2011-05-31 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13041697#comment-13041697 ] Ted Dunning commented on MAHOUT-715: Github is wonderful for this as well. Do make su

[jira] [Commented] (MAHOUT-715) Use Kryo for serializing vectors

2011-05-31 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13041698#comment-13041698 ] Sean Owen commented on MAHOUT-715: -- Not a prob, this might have been the best place after

[jira] [Resolved] (MAHOUT-676) Random samplers in a modular library

2011-05-31 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved MAHOUT-676. -- Resolution: Won't Fix Assignee: Sean Owen On this issue, I had the impression you were just posti

RE: [VOTE] Release Mahout 0.5, take 3

2011-05-31 Thread Jeff Eastman
I did some cluster testing of take 2 and only found one potential issue surfaced by an example that is broken. Haven't waded through the 100+ postings since I got back last night, but it looked good to me then. -Original Message- From: Ted Dunning [mailto:ted.dunn...@gmail.com] Sent: Tu

[jira] [Updated] (MAHOUT-663) Rationalize hadoop job creation with respect to setJarByClass

2011-05-31 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated MAHOUT-663: - Affects Version/s: 0.5 Fix Version/s: 0.6 I agree that this can and should be changed. I think t

[jira] [Commented] (MAHOUT-715) Use Kryo for serializing vectors

2011-05-31 Thread Gustavo Salazar Torres (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13041680#comment-13041680 ] Gustavo Salazar Torres commented on MAHOUT-715: --- Thanks! Sorry for the mess,

Re: [VOTE] Release Mahout 0.5, take 3

2011-05-31 Thread Ted Dunning
I am rushing around clearing the decks for my trip to Europe, but I very much hope to take a gander today. Over the weekend, I did use the trunk distribution with good results. I didn't get the map-reduce stuff going since I was focussed on SGD work. There is a possible niggling issue with SGD,

[jira] [Updated] (MAHOUT-715) Use Kryo for serializing vectors

2011-05-31 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated MAHOUT-715: - Affects Version/s: (was: 0.6) 0.5 > Use Kryo for serializing vectors > ---

[jira] [Updated] (MAHOUT-715) Use Kryo for serializing vectors

2011-05-31 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated MAHOUT-715: - Resolution: Won't Fix Status: Resolved (was: Patch Available) OK that's cool. It'll live in JIRA

Re: [VOTE] Release Mahout 0.5, take 3

2011-05-31 Thread Sean Owen
Thanks Grant -- any other thoughts? I'm generally assuming those +1s carry over, but would be ideal to hear the votes explicitly from the PMC. *Do* have a look through the artifacts! If nothing turns up, would be great to finish this up tonight! I have a backlog of commits ready. On Tue, May 31, 2

Re: [jira] [Updated] (MAHOUT-696) Command line program for AdaptiveLogiscticRegression

2011-05-31 Thread Ted Dunning
Not just now. Hopefully tonight. On Tue, May 31, 2011 at 5:07 AM, XiaoboGu wrote: > Hi Ted, >I'll use the raw name for the next revision, can you compile and run > the patch now? > > Regards, > > Xiaobo Gu > > > -Original Message- > > From: Ted Dunning [mailto:ted.dunn...@gmail.

Re: PageRank

2011-05-31 Thread Benson Margulies
Well, we can go send that link to legal-disc...@apache.org and see what they say. On Tue, May 31, 2011 at 10:57 AM, Sebastian Schelter wrote: > I'm a little confused now :) What do you mean by "real knowledge"? I found a > link to the patent when reading the wikipedia page about PageRank, howeve

[jira] [Commented] (MAHOUT-715) Use Kryo for serializing vectors

2011-05-31 Thread Gustavo Salazar Torres (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13041622#comment-13041622 ] Gustavo Salazar Torres commented on MAHOUT-715: --- Hi Ted I haven't tested it

Re: [VOTE] Release Mahout 0.5, take 3

2011-05-31 Thread Grant Ingersoll
Looking now. On May 31, 2011, at 4:31 AM, Sean Owen wrote: > Going once, going twice... going to complete the release later this evening. > I'm guessing anyone who cares to look and check has done so already. > > On Sat, May 28, 2011 at 4:02 PM, Sean Owen wrote: > >> https://repository.apache.

Re: PageRank

2011-05-31 Thread Sebastian Schelter
I'm a little confused now :) What do you mean by "real knowledge"? I found a link to the patent when reading the wikipedia page about PageRank, however as I'm not a lawyer, I don't consider myself capable of judging the situation. --sebastian On 31.05.2011 16:41, Benson Margulies wrote: Let

Re: PageRank

2011-05-31 Thread Benson Margulies
Let me clarify this: No one is obligated to do patent research. If you have real knowledge of a patent infringement in a proposed contribution, you must disclose it. If you have no real knowledge, you need not go looking for it. On Tue, May 31, 2011 at 10:18 AM, Dhruv Kumar wrote: > Jimmy Lin's

Re: PageRank

2011-05-31 Thread Dhruv Kumar
Jimmy Lin's Cloud 9 Map Reduce library also includes an implementation of Page Rank as an example: http://www.umiacs.umd.edu/~jimmylin/cloud9/docs/ On Tue, May 31, 2011 at 9:10 AM, Sebastian Schelter wrote: > Thank you very much for the advice, I'll try to contact Stanford then. > > --sebastian

Re: PageRank

2011-05-31 Thread Sebastian Schelter
Thank you very much for the advice, I'll try to contact Stanford then. --sebastian On 31.05.2011 14:53, Benson Margulies wrote: As a contributor, you could submit a patch noting the existence of the patent, *If you know that a patent reads on it*. As a PMC member, I'm advised by legal-discuss@

Re: PageRank

2011-05-31 Thread Benson Margulies
As a contributor, you could submit a patch noting the existence of the patent, *If you know that a patent reads on it*. As a PMC member, I'm advised by legal-discuss@ to reject the patch, unless you can get a grant from Stanford. However, 'having heard' is not the same thing as *knowing*. A vague

Re: PageRank

2011-05-31 Thread Benson Margulies
Read the Apache Individual Contributor License Agreement, section 5: 5. You represent that each of Your Contributions is Your original creation (see section 7 for submissions on behalf of others). You represent that Your Contribution submissions include complete details of any third-par

PageRank

2011-05-31 Thread Sebastian Schelter
Hello everyone, I have a question regarding legal issues. I have one of my students implement PageRank in MapReduce, would it be possible to contribute this to Mahout or are there any legal issues prohibiting this? I read that Stanford is holding a patent on PageRank but I've seen a PageRan

RE: [jira] [Updated] (MAHOUT-696) Command line program for AdaptiveLogiscticRegression

2011-05-31 Thread XiaoboGu
Hi Ted, I'll use the raw name for the next revision, can you compile and run the patch now? Regards, Xiaobo Gu > -Original Message- > From: Ted Dunning [mailto:ted.dunn...@gmail.com] > Sent: Tuesday, May 31, 2011 12:33 PM > To: dev@mahout.apache.org > Subject: Re: [jira] [Update

Re: [VOTE] Release Mahout 0.5, take 3

2011-05-31 Thread Sean Owen
Going once, going twice... going to complete the release later this evening. I'm guessing anyone who cares to look and check has done so already. On Sat, May 28, 2011 at 4:02 PM, Sean Owen wrote: > https://repository.apache.org/content/repositories/orgapachemahout-014 > > This has the fix for Je