RE: [jira] [Created] (MAHOUT-1568) Build an I/O model that can replace sequence files for import/export

2014-06-01 Thread Andrew Palumbo
Thanks. Yeah, it's weird going from Java to Scala. Everything makes sense until all of a sudden it doesn't. I appreciate the pointers! > Subject: Re: [jira] [Created] (MAHOUT-1568) Build an I/O model that can > replace sequence files for import/export > From: pat.fer...@gmail.com > Date: Su

Re: [jira] [Created] (MAHOUT-1568) Build an I/O model that can replace sequence files for import/export

2014-06-01 Thread Pat Ferrel
Well, it does run, so I need to clean that stuff up anyway. The use of traits is very powerful but is nothing like Python or Ruby mixins. Took me a lot of head scratching to get it straight, and these are about as simple as you can get. The key thing to look at is the reader and writer methods an

RE: [jira] [Created] (MAHOUT-1568) Build an I/O model that can replace sequence files for import/export

2014-06-01 Thread Andrew Palumbo
Cool. I was just going through it to get familiar with the DSL (and really Scala in general at this point) and the read/write traits that you were talking about... Just looking at the code, really; I don't have any need to build it right now. Wanted to make sure I wasn't totally off... Thanks

Re: [jira] [Created] (MAHOUT-1568) Build an I/O model that can replace sequence files for import/export

2014-06-01 Thread Pat Ferrel
Sorry, wasn’t expecting someone to build it. Don’t know if the packaging is right yet, and it’s about a month behind trunk. You pull the repo at the same level as the major pieces like math-scala (that is, into MAHOUT_HOME) and apply the MAHOUT-1464 patch, but all you need is org.apache.mahout.cf.Cooccurre

RE: [jira] [Created] (MAHOUT-1568) Build an I/O model that can replace sequence files for import/export

2014-06-01 Thread Andrew Palumbo
Hi Pat, does Harness compile against the Mahout trunk + MAHOUT-1464.patch (cooccurrence)? I have a patched-up branch of the Mahout trunk with basically a gutted MAHOUT-1464.patch, just something that defines org.apache.mahout.cf.CooccurrenceAnalysis and compiles (so I wouldn't be able to run H

Re: mlib versus spark

2014-06-01 Thread Dmitriy Lyubimov
I would add that what we were doing (well, at least what I was doing) was aimed at building an ML environment rather than simply a collection of algorithms. In practice I always wanted to customize something in off-the-shelf algorithms. E.g. for things like ALS and RLFM there're a thousand custom

Re: [jira] [Commented] (MAHOUT-1566) Regular ALS factorizer with convergence test.

2014-06-01 Thread Dmitriy Lyubimov
Honestly, I don't see a big deal in keeping it around. MLlib does and nobody really cared. (They have ALS.train() for regular ALS with regularization and ALS.trainImplicit() for the implicit one.) Our primary woes with too many algorithms were associated with support, but with 2 lines it is clearly no

Re: Problems with mapBlock()

2014-06-01 Thread Dmitriy Lyubimov
Imports have changed with the abstraction migration. Check the updated docs. On May 31, 2014 11:21 PM, "Sebastian Schelter" wrote: > I've updated the codebase to work on the cooccurrence analysis algo, but I > always run into this error now: > > error: value mapBlock is not a member of org.apache.mah

[jira] [Updated] (MAHOUT-1568) Build an I/O model that can replace sequence files for import/export

2014-06-01 Thread Pat Ferrel (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pat Ferrel updated MAHOUT-1568: --- Description: Implement mechanisms to read and write data from/to flexible stores. These will suppor

[jira] [Created] (MAHOUT-1569) Create CLI driver that supports Spark jobs

2014-06-01 Thread Pat Ferrel (JIRA)
Pat Ferrel created MAHOUT-1569: -- Summary: Create CLI driver that supports Spark jobs Key: MAHOUT-1569 URL: https://issues.apache.org/jira/browse/MAHOUT-1569 Project: Mahout Issue Type: New Featu

[jira] [Created] (MAHOUT-1568) Build an I/O model that can replace sequence files for import/export

2014-06-01 Thread Pat Ferrel (JIRA)
Pat Ferrel created MAHOUT-1568: -- Summary: Build an I/O model that can replace sequence files for import/export Key: MAHOUT-1568 URL: https://issues.apache.org/jira/browse/MAHOUT-1568 Project: Mahout

[jira] [Commented] (MAHOUT-1567) Add online sparse dictionary learning (dimensionality reduction)

2014-06-01 Thread Maciej Kula (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14015014#comment-14015014 ] Maciej Kula commented on MAHOUT-1567: - Thanks, I'll have a look. My implementation i

[jira] [Commented] (MAHOUT-1567) Add online sparse dictionary learning (dimensionality reduction)

2014-06-01 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14015006#comment-14015006 ] Ted Dunning commented on MAHOUT-1567: - Here are three possible pages that might help:

[jira] [Comment Edited] (MAHOUT-1529) Finalize abstraction of distributed logical plans from backend operations

2014-06-01 Thread Gokhan Capan (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14014985#comment-14014985 ] Gokhan Capan edited comment on MAHOUT-1529 at 6/1/14 3:03 PM: -

[jira] [Comment Edited] (MAHOUT-1529) Finalize abstraction of distributed logical plans from backend operations

2014-06-01 Thread Gokhan Capan (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14014985#comment-14014985 ] Gokhan Capan edited comment on MAHOUT-1529 at 6/1/14 2:55 PM: -

[jira] [Commented] (MAHOUT-1529) Finalize abstraction of distributed logical plans from backend operations

2014-06-01 Thread Gokhan Capan (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14014985#comment-14014985 ] Gokhan Capan commented on MAHOUT-1529: -- [~dlyubimov], I imagine in the near future w

[jira] [Commented] (MAHOUT-1567) Add online sparse dictionary learning (dimensionality reduction)

2014-06-01 Thread Maciej Kula (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14014948#comment-14014948 ] Maciej Kula commented on MAHOUT-1567: - Is this what you have in mind? https://issues