Re: consensus statement?

2014-05-15 Thread Cliff Click
We recognize the value of a non-Java-centric API for doing math/algo work, similar in spirit to what R has done. So... 0xdata is:
- looking at how the h2o API/programming model fits with the existing Mahout Java API
- doing some initial exploratory porting to the existing Java API
- watching w

Re: Exploring moving Mahout to git as main repo

2014-05-15 Thread Andrew Musselman
+1 On Fri, May 9, 2014 at 4:17 PM, Pat Ferrel wrote: > Yes! I mentioned this a while back. Pull requests are just patches under > the covers, just so easy to create. > > But pull requests would just be for contributors, right? Committers should > be able to push to the master directly, right? >

Re: Exploring moving Mahout to git as main repo

2014-05-15 Thread Pat Ferrel
Yes! I mentioned this a while back. Pull requests are just patches under the covers, just so easy to create. But pull requests would just be for contributors, right? Committers should be able to push to the master directly, right? On May 6, 2014, at 6:39 PM, Dmitriy Lyubimov wrote: Hi, We a

[jira] [Commented] (MAHOUT-1549) Extracting tfidf-vectors by key

2014-05-15 Thread Richard Scharrer (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13997495#comment-13997495 ] Richard Scharrer commented on MAHOUT-1549: Hi Andy, drahcos is actually my acc

[jira] [Updated] (MAHOUT-1446) Create an intro for matrix factorization

2014-05-15 Thread jian wang (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jian wang updated MAHOUT-1446: Attachment: matrix-factorization.patch Please help review if the comment on the factorizers are correc

[jira] [Commented] (MAHOUT-1490) Data frame R-like bindings

2014-05-15 Thread Dmitriy Lyubimov (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13997143#comment-13997143 ] Dmitriy Lyubimov commented on MAHOUT-1490: so h2o guys seem to use unsafe to sl

[jira] [Updated] (MAHOUT-1527) Fix wikipedia classifier example

2014-05-15 Thread Andrew Palumbo (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Palumbo updated MAHOUT-1527: Affects Version/s: 0.7 0.8 0.9 S

Re: [jira] [Created] (MAHOUT-1549) Extracting tfidf-vectors by key

2014-05-15 Thread Sebastian Schelter
I'm not sure I understand your question correctly. If you know the keys, you could put them into a file, write a Map-only Job that loads the keys from the file and filters the data to only retain the key-value pairs where the key is contained in your list. Does that make sense? --sebastian
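
A minimal sketch of the filtering step Sebastian describes, written as plain Python rather than an actual Hadoop job. The function names `load_keys` and `filter_by_key`, the one-key-per-line file layout, and the string key type are assumptions for illustration only; in a real Map-only job the same membership test would presumably sit inside the mapper, with the key file loaded once during setup.

```python
from typing import Iterable, Iterator, Set, Tuple


def load_keys(path: str) -> Set[str]:
    """Read one key per line from a text file, skipping blank lines.

    The file layout is an assumption; the thread only says the keys
    are "put into a file".
    """
    with open(path, encoding="utf-8") as handle:
        return {line.strip() for line in handle if line.strip()}


def filter_by_key(
    records: Iterable[Tuple[str, object]], wanted: Set[str]
) -> Iterator[Tuple[str, object]]:
    """Map-side filter: emit only the pairs whose key is in the wanted set.

    No reduce step is needed because each record is kept or dropped
    independently of every other record.
    """
    for key, value in records:
        if key in wanted:
            yield key, value
```

Holding the keys in a set keeps each lookup constant-time on average, which matters when the filter runs once per input record.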

[jira] [Commented] (MAHOUT-1549) Extracting tfidf-vectors by key

2014-05-15 Thread Andy Schlaikjer (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996646#comment-13996646 ] Andy Schlaikjer commented on MAHOUT-1549: Hi Richard, Were you the one who recen

[jira] [Commented] (MAHOUT-1550) Naive Bayes training fails with Hadoop 2

2014-05-15 Thread Gokhan Capan (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996351#comment-13996351 ] Gokhan Capan commented on MAHOUT-1550: Paul, Did you try building Mahout using hadoop

[jira] [Resolved] (MAHOUT-1550) Naive Bayes training fails with Hadoop 2

2014-05-15 Thread Suneel Marthi (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suneel Marthi resolved MAHOUT-1550. Resolution: Not a Problem. The issue is due to not having built with Hadoop 2 profiles. Pleas

[jira] [Updated] (MAHOUT-1550) Naive Bayes training fails with Hadoop 2

2014-05-15 Thread Paul Marret (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Marret updated MAHOUT-1550: Attachment: stacktrace.txt mahout-snapshot.patch Patch + stacktrace > Naive Bayes

Re: consensus statement?

2014-05-15 Thread Pat Ferrel
This doesn’t seem to be a vision statement. I was +1 to a simple consensus statement. The vision is up to you. We have an interactive shell that scales to huge datasets without resorting to massive subsampling. One that allows you to deal with the exact data your black box algos work on. Ever

[jira] [Commented] (MAHOUT-1485) Clean up Recommender Overview page

2014-05-15 Thread Yash Sharma (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13993029#comment-13993029 ] Yash Sharma commented on MAHOUT-1485: Have updated the google document at the same l

[jira] [Commented] (MAHOUT-1541) Create CLI Driver for Spark Cooccurrence Analysis

2014-05-15 Thread Pat Ferrel (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13994045#comment-13994045 ] Pat Ferrel commented on MAHOUT-1541: The basic import, do the cooccurrence, export is

[jira] [Commented] (MAHOUT-1527) Fix wikipedia classifier example

2014-05-15 Thread Andrew Palumbo (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13994074#comment-13994074 ] Andrew Palumbo commented on MAHOUT-1527: The issue re: crashing testnb should pro