Re: [jira] [Commented] (MAHOUT-1464) Cooccurrence Analysis on Spark

Dmitriy Lyubimov Mon, 14 Apr 2014 14:19:27 -0700

PS: Like I said, the "Client" feature only appeared in 0.9. Nobody missed it
before that, and it was never a prerequisite for running anything.



On Mon, Apr 14, 2014 at 2:14 PM, Dmitriy Lyubimov (JIRA) <[email protected]> wrote:

>
>     [
> https://issues.apache.org/jira/browse/MAHOUT-1464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13968849#comment-13968849]
>
> Dmitriy Lyubimov commented on MAHOUT-1464:
> ------------------------------------------
>
>
>
> IDEA is the driver, but the output is written by Spark workers. That is not
> the same environment and, in most cases, not the same machine, just as with
> MR reducers. The exception is a "local" master URL, which I assume was not
> the case here.
>
>
> This is strange. I can, was able to, and will be able to. Why wouldn't it be
> able to, unless there are network or security issues? There's nothing
> fundamentally different between reading/writing HDFS from a worker process
> and from any other process.
>
>
>
> No. The Spark Client is about shipping the driver and having it run somewhere
> else. It is as if somebody were running the mahout CLI command on one of the
> worker nodes. That is it. It knows nothing about HDFS, or even what the
> driver program is going to do. One might use the Client code to print out
> "Hello, World" and exit on one of the worker nodes; the Client wouldn't
> know or care. Using a worker to run driver programs is all it does.
>
>
>
>
> > Cooccurrence Analysis on Spark
> > ------------------------------
> >
> >                 Key: MAHOUT-1464
> >                 URL: https://issues.apache.org/jira/browse/MAHOUT-1464
> >             Project: Mahout
> >          Issue Type: Improvement
> >          Components: Collaborative Filtering
> >         Environment: hadoop, spark
> >            Reporter: Pat Ferrel
> >            Assignee: Sebastian Schelter
> >             Fix For: 1.0
> >
> >         Attachments: MAHOUT-1464.patch, MAHOUT-1464.patch,
> > MAHOUT-1464.patch, MAHOUT-1464.patch, MAHOUT-1464.patch, MAHOUT-1464.patch,
> > run-spark-xrsj.sh
> >
> >
> > Create a version of Cooccurrence Analysis (RowSimilarityJob with LLR)
> > that runs on Spark. This should be compatible with Mahout Spark DRM DSL so
> > a DRM can be used as input.
> > Ideally this would extend to cover MAHOUT-1422. This cross-cooccurrence
> > has several applications including cross-action recommendations.
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.2#6252)
>
