Introducing Gizzard, a framework for creating distributed datastores

2010-04-06 Thread Robin Anil
It's Apache-licensed and looks like a great option for storing and querying large graphs. It may be useful as a model store for classifiers. http://engineering.twitter.com/2010/04/introducing-gizzard-framework-for.html http://github.com/twitter/gizzard Robin

Re: VOTE: release mahout-collections-codegen 1.0

2010-04-06 Thread Robin Anil
Is there a patch that pulls in this dependency to build Mahout? That's a good test for it. Robin On Wed, Apr 7, 2010 at 10:45 AM, Ted Dunning wrote: > I confirm that the components exist and appear in good order. > > Is there a way for me to test this component? Is there any testing needed > bey

Re: VOTE: release mahout-collections-codegen 1.0

2010-04-06 Thread Ted Dunning
I confirm that the components exist and appear in good order. Is there a way for me to test this component? Is there any testing needed beyond checking existence? On Tue, Apr 6, 2010 at 7:13 PM, Benson Margulies wrote: > On Tue, Apr 6, 2010 at 9:40 PM, Ted Dunning wrote: > > Is that possible h

[jira] Commented: (MAHOUT-364) [GSOC] Proposal to implement Neural Network with backpropagation learning on Hadoop

2010-04-06 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12854349#action_12854349 ] Ted Dunning commented on MAHOUT-364: This is a very nicely written proposal. One tech

[jira] Commented: (MAHOUT-358) the pref value field of output of org.apache.mahout.cf.taste.hadoop.item.RecommenderJob has negative

2010-04-06 Thread Hui Wen Han (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12854317#action_12854317 ] Hui Wen Han commented on MAHOUT-358: http://java.sun.com/javase/6/docs/api/java/math/Bi

[jira] Commented: (MAHOUT-364) [GSOC] Proposal to implement Neural Network with backpropagation learning on Hadoop

2010-04-06 Thread Jake Mannix (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12854304#action_12854304 ] Jake Mannix commented on MAHOUT-364: I've got to say, this is a fantastically well writ

[GSoC 2010] Requesting feedback on my proposal for implementing Neural Network with backpropagation learning

2010-04-06 Thread Zaid Md Abdul Wahab Sheikh
Hi all, I just submitted a GSoC proposal for implementing Neural Network with backpropagation on Hadoop. Jira issue: http://issues.apache.org/jira/browse/MAHOUT-364 I would appreciate your feedback and comments on the proposal and on the working or implementation plan. -

Re: [GSOC] 2010 Timelines

2010-04-06 Thread Robin Anil
2 days to go till the close of student submissions. A request to mentors to provide feedback to all the queries on the list so that students can go and work on tuning their proposal Robin On Sat, Apr 3, 2010 at 10:50 PM, Grant Ingersoll wrote: > > http://socghop.appspot.com/document/show/gsoc_pr

Re: A request for prospective GSOC students

2010-04-06 Thread Zaid Md Abdul Wahab Sheikh
I just submitted a proposal to implement Neural Network with backpropagation learning Jira issue: http://issues.apache.org/jira/browse/MAHOUT-364 On Sat, Apr 3, 2010 at 9:07 PM, Robin Anil wrote: > I am having a tough time separating Mahout proposals from rest of Apache on > gsoc website. So I wo

Re: VOTE: release mahout-collections-codegen 1.0

2010-04-06 Thread Benson Margulies
On Tue, Apr 6, 2010 at 9:40 PM, Ted Dunning wrote: > Is that possible here instead: > https://repository.apache.org/content/repositories/staging/org/apache/mahout/? No, that's not right. That path has our last (0.3) release in it. However, I had forgotten to close it. https://repository.apache.o

Re: VOTE: release mahout-collections-codegen 1.0

2010-04-06 Thread Ted Dunning
Is that possible here instead: https://repository.apache.org/content/repositories/staging/org/apache/mahout/? On Tue, Apr 6, 2010 at 6:08 PM, Benson Margulies wrote: > In order to decouple the mahout-collections library from the rest of > Mahout, to allow more frequent releases and other good thi

[jira] Updated: (MAHOUT-364) [GSOC] Proposal to implement Neural Network with backpropagation learning on Hadoop

2010-04-06 Thread Zaid Md. Abdul Wahab Sheikh (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zaid Md. Abdul Wahab Sheikh updated MAHOUT-364: --- Comment: was deleted (was: formatting :() > [GSOC] Proposal to imple

[jira] Updated: (MAHOUT-364) [GSOC] Proposal to implement Neural Network with backpropagation learning on Hadoop

2010-04-06 Thread Zaid Md. Abdul Wahab Sheikh (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zaid Md. Abdul Wahab Sheikh updated MAHOUT-364: --- Description: Proposal Title: Implement Multi-Layer Perceptrons with b

GSOC [mentor idea]: Clustering visualization with GraphViz

2010-04-06 Thread Robin Anil
Here is a good project wish-list item. If anyone wishes to take it forward, I would be willing to help mentor. http://www.graphviz.org/ Check out one of the graphs, which I believe is a good way to represent clusters. Creating this graph is as easy as writing cluster output to the graphviz format http:/
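As a rough illustration of the idea above, the following Java sketch writes clusters out in Graphviz's DOT language. The class name ClusterDot, the toDot method, and the map-of-lists cluster representation are hypothetical and are not Mahout APIs. The sketch also assumes that cluster and point names contain no double quotes.

```java
import java.util.List;
import java.util.Map;

public class ClusterDot {

  // Renders each cluster as a DOT subgraph. Subgraphs whose names start with
  // "cluster" are drawn as boxed groups by the dot layout engine.
  // Assumption: names contain no double quotes, so no escaping is done here.
  static String toDot(Map<String, List<String>> clusters) {
    StringBuilder sb = new StringBuilder("graph clusters {\n");
    int index = 0;
    for (Map.Entry<String, List<String>> entry : clusters.entrySet()) {
      sb.append("  subgraph cluster_").append(index++).append(" {\n");
      sb.append("    label=\"").append(entry.getKey()).append("\";\n");
      for (String point : entry.getValue()) {
        sb.append("    \"").append(point).append("\";\n");
      }
      sb.append("  }\n");
    }
    return sb.append("}\n").toString();
  }
}
```

The returned string can be saved to a .dot file and rendered with the dot command-line tool.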

[jira] Created: (MAHOUT-364) [GSOC] Proposal to implement Neural Network with backpropagation learning on Hadoop

2010-04-06 Thread Zaid Md. Abdul Wahab Sheikh (JIRA)
[GSOC] Proposal to implement Neural Network with backpropagation learning on Hadoop --- Key: MAHOUT-364 URL: https://issues.apache.org/jira/browse/MAHOUT-364 Project: Mah

VOTE: release mahout-collections-codegen 1.0

2010-04-06 Thread Benson Margulies
In order to decouple the mahout-collections library from the rest of Mahout, to allow more frequent releases and other good things, we propose to release the code generator for the collections library as a separate Maven artifact. (Followed, in short order, by the collections library proper.) This

Re: Proposal: make collections releases independent of the rest of Mahout

2010-04-06 Thread Robin Anil
Great proposal. Hopefully this will push Mahout core to have faster releases Robin On Wed, Apr 7, 2010 at 3:29 AM, Grant Ingersoll wrote: > +1. Release early, release often. > > -Grant > > On Apr 6, 2010, at 5:12 PM, Benson Margulies wrote: > > > Indeed. Off I go. > > > > On Tue, Apr 6, 2010

Re: Proposal: make collections releases independent of the rest of Mahout

2010-04-06 Thread Grant Ingersoll
+1. Release early, release often. -Grant On Apr 6, 2010, at 5:12 PM, Benson Margulies wrote: > Indeed. Off I go. > > On Tue, Apr 6, 2010 at 4:23 PM, Ted Dunning wrote: >> Very cool. Very exciting. >> >> Benson, that sounds like consensus to me. >> >> On Tue, Apr 6, 2010 at 1:02 PM, Jake M

Re: Proposal: make collections releases independent of the rest of Mahout

2010-04-06 Thread Benson Margulies
Indeed. Off I go. On Tue, Apr 6, 2010 at 4:23 PM, Ted Dunning wrote: > Very cool.  Very exciting. > > Benson, that sounds like consensus to me. > > On Tue, Apr 6, 2010 at 1:02 PM, Jake Mannix wrote: > >> ... I'm in favor, I guess, of: >> >> 1: remove collections-codegen and collections from the

Re: Proposal: make collections releases independent of the rest of Mahout

2010-04-06 Thread Ted Dunning
Very cool. Very exciting. Benson, that sounds like consensus to me. On Tue, Apr 6, 2010 at 1:02 PM, Jake Mannix wrote: > ... I'm in favor, I guess, of: > > 1: remove collections-codegen and collections from the top-level pom's > module list. > 2: change their parents to point to the apache par

Re: Proposal: make collections releases independent of the rest of Mahout

2010-04-06 Thread Jake Mannix
I guess I'm fine with whatever, making fast releases of collections is in fact pretty cool, it will give us practice with making releases in mahout in general. And if we can do this for mahout-math as well, some of us who care about, for example, eventually adding unit tests for all of the old Col

Re: Proposal: make collections releases independent of the rest of Mahout

2010-04-06 Thread Benson Margulies
Where are we on the consensus process? Jake, have Ted and I satisfied you? Does this call for a VOTE to be sure that we're on the same page? On Tue, Apr 6, 2010 at 3:33 PM, Benson Margulies wrote: > On Tue, Apr 6, 2010 at 3:10 PM, Ted Dunning wrote: >> I should have said "there should SOON be

Re: Proposal: make collections releases independent of the rest of Mahout

2010-04-06 Thread Benson Margulies
On Tue, Apr 6, 2010 at 3:10 PM, Ted Dunning wrote: > I should have said "there should SOON be a vanishingly small number of > collections releases".  Clearly that isn't so just yet. > > On Tue, Apr 6, 2010 at 12:09 PM, Ted Dunning wrote: > >> if only because there should be a vanishingly small nu

Re: Proposal: make collections releases independent of the rest of Mahout

2010-04-06 Thread Benson Margulies
We gain the ability to release collections more frequently. *because* it is less mature, it needs that. On Tue, Apr 6, 2010 at 2:48 PM, Jake Mannix wrote: > I agree in principal, but having a whole different set of versionings seems > kinda... messy?  If m-collections goes 1.0, and then 1.1, and

Re: Proposal: make collections releases independent of the rest of Mahout

2010-04-06 Thread Ted Dunning
I should have said "there should SOON be a vanishingly small number of collections releases". Clearly that isn't so just yet. On Tue, Apr 6, 2010 at 12:09 PM, Ted Dunning wrote: > if only because there should be a vanishingly small number of collections > releases

Re: Proposal: make collections releases independent of the rest of Mahout

2010-04-06 Thread Ted Dunning
The Lucene/Solr community have decided to loosely couple release schedules and explicitly decided to not lock version numbers. One of their arguments was that it would confuse users, which doesn't apply for us. The other argument was that either side should be free to have a release that was comp

Re: Proposal: make collections releases independent of the rest of Mahout

2010-04-06 Thread Jake Mannix
I agree in principle, but having a whole different set of versionings seems kinda... messy? If m-collections goes 1.0, and then 1.1, and then m-math goes 1.0, and core goes to 0.5, we have a whole pile of different version numbers to keep track of. Didn't Lucene and Solr just intentionally do the

Re: Proposal: make collections releases independent of the rest of Mahout

2010-04-06 Thread Ted Dunning
For what it is worth, I actually prefer this approach to the multi-pom approach in many cases. If it really is a separate thing, it might as well have a separate release schedule and artifact. If it isn't a separate thing, then you might as well use a single pom. This heuristic doesn't always wo

Re: Proposal: make collections releases independent of the rest of Mahout

2010-04-06 Thread Benson Margulies
Substance: 1: remove collections-codegen and collections from the top-level pom's module list. 2: change their parents to point to the apache parent. 3: tweak their poms so that the release plugin works right with them. 4: release them 5: change rest of mahout to consume release. On Tue, Apr 6,

Re: Proposal: make collections releases independent of the rest of Mahout

2010-04-06 Thread Sean Owen
This still lives in Mahout, just with a different version number? What's the substance of the change in the short term? I think I missed that step. On Tue, Apr 6, 2010 at 6:41 PM, Benson Margulies wrote: > Hearing no other remarks, I will proceed to disconnect and make the > version 1.0-SNAPSHOT,

Re: Proposal: make collections releases independent of the rest of Mahout

2010-04-06 Thread Benson Margulies
Hearing no other remarks, I will proceed to disconnect and make the version 1.0-SNAPSHOT, and call a release vote RSN. On Sun, Apr 4, 2010 at 7:58 PM, Benson Margulies wrote: > Last question: What's the first version going to be? I propose '1.0'. > 0.4 would get mighty confusing. I really don't

[jira] Commented: (MAHOUT-358) the pref value field of output of org.apache.mahout.cf.taste.hadoop.item.RecommenderJob has negative

2010-04-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12854077#action_12854077 ] Sean Owen commented on MAHOUT-358: -- You mean that you do not see those negative values? Tr

[jira] Commented: (MAHOUT-358) the pref value field of output of org.apache.mahout.cf.taste.hadoop.item.RecommenderJob has negative

2010-04-06 Thread Hui Wen Han (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12854073#action_12854073 ] Hui Wen Han commented on MAHOUT-358: If I use Text as the output format, everything is OK.

[jira] Commented: (MAHOUT-345) [GSOC] integrate Mahout with Drupal/PHP

2010-04-06 Thread Y.W.D.D.Dissanayake (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12854008#action_12854008 ] Y.W.D.D.Dissanayake commented on MAHOUT-345: i'm computer science student. i li

[jira] Commented: (MAHOUT-358) the pref value field of output of org.apache.mahout.cf.taste.hadoop.item.RecommenderJob has negative

2010-04-06 Thread Hui Wen Han (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853983#action_12853983 ] Hui Wen Han commented on MAHOUT-358: I will debug and tell you the result. Thanks :)

[jira] Commented: (MAHOUT-358) the pref value field of output of org.apache.mahout.cf.taste.hadoop.item.RecommenderJob has negative

2010-04-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853982#action_12853982 ] Sean Owen commented on MAHOUT-358: -- That is truly strange then. The user vectors have nonn

[jira] Commented: (MAHOUT-358) the pref value field of output of org.apache.mahout.cf.taste.hadoop.item.RecommenderJob has negative

2010-04-06 Thread Hui Wen Han (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853977#action_12853977 ] Hui Wen Han commented on MAHOUT-358: I have no negative ratings. > the pref value fi

[jira] Commented: (MAHOUT-356) ClassNotFoundException: org.apache.mahout.math.function.IntDoubleProcedure

2010-04-06 Thread Kris Jack (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853930#action_12853930 ] Kris Jack commented on MAHOUT-356: -- No, I haven't set a CLASSPATH var (not intentionally a

[jira] Commented: (MAHOUT-358) the pref value field of output of org.apache.mahout.cf.taste.hadoop.item.RecommenderJob has negative

2010-04-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853929#action_12853929 ] Sean Owen commented on MAHOUT-358: -- Maybe it also clarifies to say: those values are *not*

[jira] Commented: (MAHOUT-358) the pref value field of output of org.apache.mahout.cf.taste.hadoop.item.RecommenderJob has negative

2010-04-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853927#action_12853927 ] Sean Owen commented on MAHOUT-358: -- Yes, the very small zero values are now being output.

[jira] Commented: (MAHOUT-358) the pref value field of output of org.apache.mahout.cf.taste.hadoop.item.RecommenderJob has negative

2010-04-06 Thread Hui Wen Han (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853919#action_12853919 ] Hui Wen Han commented on MAHOUT-358: I used the latest code to test again; the final outp

[jira] Updated: (MAHOUT-358) the pref value field of output of org.apache.mahout.cf.taste.hadoop.item.RecommenderJob has negative

2010-04-06 Thread Hui Wen Han (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hui Wen Han updated MAHOUT-358: --- Attachment: screenshot-2.jpg > the pref value field of output of > org.apache.mahout.cf.taste.hadoo

Re: Build failed in Hudson: Mahout Trunk #584

2010-04-06 Thread Sebastian Schelter
Hi Sean, I think I saw another potential problem, lines 233 to 237 should be changed from

    if (tmpDir.exists()) {
      recursiveDelete(tmpDir);
    } else {
      tmpDir.mkdirs();
    }

to

    if (tmpDir.exists()) {
      recursiveDelete(tmpDir);
    }
    tmpDir.mkdirs();
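The point of the change is that the original version leaves no directory behind when one already existed: it deletes the old directory and never recreates it. A self-contained sketch of the corrected logic follows. The class name TempDirReset and the method resetDir are hypothetical, and recursiveDelete here is a stand-in written for illustration, not the helper from the actual test.

```java
import java.io.File;
import java.io.IOException;

public class TempDirReset {

  // Illustrative stand-in for the test's recursiveDelete helper:
  // removes children first, then the file or directory itself.
  static void recursiveDelete(File file) throws IOException {
    File[] children = file.listFiles(); // null when file is not a directory
    if (children != null) {
      for (File child : children) {
        recursiveDelete(child);
      }
    }
    if (file.exists() && !file.delete()) {
      throw new IOException("Could not delete " + file);
    }
  }

  // Leaves an empty directory at tmpDir whether or not one existed before.
  static void resetDir(File tmpDir) throws IOException {
    if (tmpDir.exists()) {
      recursiveDelete(tmpDir);
    }
    // Always recreate: with the else-branch version, an existing directory
    // would be deleted and then be missing for the rest of the test.
    if (!tmpDir.mkdirs()) {
      throw new IOException("Could not create " + tmpDir);
    }
  }
}
```

Calling resetDir twice in a row on the same path leaves an empty directory both times, which is the behaviour a test setup method needs.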

Re: Build failed in Hudson: Mahout Trunk #584

2010-04-06 Thread Benson Margulies
I mean new File(SomeDir, SomeFile) whenever I need to compose, and let it worry over delimiters. On Tue, Apr 6, 2010 at 7:19 AM, Sean Owen wrote: > You mean File.createTempFile()? Yes, though here the test wants to > create a temp directory. Is there a good way to do that? > > On Tue, Apr 6, 201

Re: Build failed in Hudson: Mahout Trunk #584

2010-04-06 Thread Sean Owen
You mean File.createTempFile()? Yes, though here the test wants to create a temp directory. Is there a good way to do that? On Tue, Apr 6, 2010 at 12:17 PM, Benson Margulies wrote: > The File class is my usual solution here.

Re: Build failed in Hudson: Mahout Trunk #584

2010-04-06 Thread Benson Margulies
The File class is my usual solution here. On Tue, Apr 6, 2010 at 7:10 AM, Sean Owen wrote: > That must be it. I had removed the '/' earlier since on OS X the temp > dir path ends with '/', and at the time I believed it was the cause of > some other failures (which I'm guessing I was wrong about).

Re: Build failed in Hudson: Mahout Trunk #584

2010-04-06 Thread Sean Owen
That must be it. I had removed the '/' earlier since on OS X the temp dir path ends with '/', and at the time I believed it was the cause of some other failures (which I'm guessing I was wrong about). I can easily make the logic account for both cases. Sean On Tue, Apr 6, 2010 at 11:24 AM, Sebast

Re: Build failed in Hudson: Mahout Trunk #584

2010-04-06 Thread Sebastian Schelter
Hi Sean, I can only guess why the test fails: Line 225 in ItemSimilarityTest is missing a / when constructing the path to the temporary directory: String tmpDirPath = System.getProperty("java.io.tmpdir") + ItemSimilarityTest.class.getCanonicalName(); which makes it /tmporg.
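A small sketch of the difference between concatenating path strings and letting java.io.File compose them, which is the approach suggested elsewhere in this thread. The class TmpPathDemo, its method names, and the example arguments are hypothetical. The point is only that the File(parent, child) constructor inserts the separator when it is missing and does not double it when the parent already ends with one.

```java
import java.io.File;

public class TmpPathDemo {

  // Plain concatenation: correct only if base already ends with a separator.
  // With base "/tmp" and name "org.Example" this yields "/tmporg.Example".
  static String concatenated(String base, String name) {
    return base + name;
  }

  // File(parent, child) handles the separator on either kind of platform,
  // whether or not base has a trailing separator.
  static String composed(String base, String name) {
    return new File(base, name).getPath();
  }
}
```

For the test in question, this would amount to something like new File(System.getProperty("java.io.tmpdir"), ItemSimilarityTest.class.getCanonicalName()), though whether that matches the eventual fix is not shown in this listing.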

Re: Build failed in Hudson: Mahout Trunk #584

2010-04-06 Thread Sean Owen
I can't reproduce this at all and don't see how to get details out of Hudson. Does anyone know where it sticks test output? or can anyone repro this? On Tue, Apr 6, 2010 at 10:58 AM, Sean Owen wrote: > I see all tests pass in a full clean / test. :( I will look at > Hudson's output to see why it

Re: Build failed in Hudson: Mahout Trunk #584

2010-04-06 Thread Sean Owen
I see all tests pass in a full clean / test. :( I will look at Hudson's output to see why it thinks it failed. On Tue, Apr 6, 2010 at 10:48 AM, Robin Anil wrote: > I have tasted this before, That was when I didn't do a clean install before > checking in. >

Re: Build failed in Hudson: Mahout Trunk #584

2010-04-06 Thread Robin Anil
I have tasted this before, That was when I didn't do a clean install before checking in. On Tue, Apr 6, 2010 at 3:13 PM, Sean Owen wrote: > Weak, surely my changes that did it but I don't know why I didn't see > this in a local build / test. > > On Tue, Apr 6, 2010 at 10:41 AM, Apache Hudson Ser

Re: Build failed in Hudson: Mahout Trunk #584

2010-04-06 Thread Robin Anil
Running org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityTest Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.039 sec <<< FAILURE! On Tue, Apr 6, 2010 at 3:13 PM, Sean Owen wrote: > Weak, surely my changes that did it but I don't know why I didn't see > this in

Re: Build failed in Hudson: Mahout Trunk #584

2010-04-06 Thread Sean Owen
Weak, surely my changes that did it but I don't know why I didn't see this in a local build / test. On Tue, Apr 6, 2010 at 10:41 AM, Apache Hudson Server wrote: > See > > Changes: > > [srowen] MAHOUT-362 last refactorings for

Re: Problems installing Mahout

2010-04-06 Thread Sean Owen
I'm ready to patch this issue, but I went the other way -- fixed the output to use Locale.ENGLISH. Either way works, what's preferred to you guys? Is it making the output deterministic, or locale-friendly? I opted for fixing it to Locale.ENGLISH because I like not depending on the platform, and be
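The snippet does not show which output was affected. Assuming it is number formatting, the following sketch shows why pinning the locale makes output deterministic. The class LocaleFormatDemo and its methods are hypothetical names for illustration; String.format with an explicit Locale is standard Java.

```java
import java.util.Locale;

public class LocaleFormatDemo {

  // Fixed locale: the decimal separator is always '.', regardless of the
  // platform's default locale.
  static String fixed(double value) {
    return String.format(Locale.ENGLISH, "%.2f", value);
  }

  // Caller-supplied locale: the separator follows that locale's conventions,
  // so the same value can format differently on different machines.
  static String localized(double value, Locale locale) {
    return String.format(locale, "%.2f", value);
  }
}
```

For example, fixed(1.5) gives "1.50" everywhere, while localized(1.5, Locale.GERMANY) gives "1,50", so a test comparing against a hard-coded string only passes reliably with the fixed variant.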