Re: Filter operation to return two RDDs at once.

2015-06-03 Thread Jeff Zhang
As far as I know, spark don't support multiple outputs On Wed, Jun 3, 2015 at 2:15 PM, ayan guha guha.a...@gmail.com wrote: Why do you need to do that if filter and content of the resulting rdd are exactly same? You may as well declare them as 1 RDD. On 3 Jun 2015 15:28, ÐΞ€ρ@Ҝ (๏̯͡๏)

Re: Filter operation to return two RDDs at once.

2015-06-03 Thread Jeff Zhang
I check the RDD#randSplit, it is much more like multiple one-to-one transformation rather than a one-to-multiple transformation. I write one sample code as following, it would generate 3 stages. Although we can use cache here to make it better, If spark can support multiple outputs, only 2 stages

Re: Filter operation to return two RDDs at once.

2015-06-03 Thread Sean Owen
In the sense here, Spark actually does have operations that make multiple RDDs like randomSplit. However there is not an equivalent of the partition operation which gives the elements that matched and did not match at once. On Wed, Jun 3, 2015, 8:32 AM Jeff Zhang zjf...@gmail.com wrote: As far

Re: Filter operation to return two RDDs at once.

2015-06-03 Thread ayan guha
Why do you need to do that if filter and content of the resulting rdd are exactly same? You may as well declare them as 1 RDD. On 3 Jun 2015 15:28, ÐΞ€ρ@Ҝ (๏̯͡๏) deepuj...@gmail.com wrote: I want to do this val qtSessionsWithQt = rawQtSession.filter(_._2.qualifiedTreatmentId !=

Filter operation to return two RDDs at once.

2015-06-02 Thread ๏̯͡๏
I want to do this val qtSessionsWithQt = rawQtSession.filter(_._2.qualifiedTreatmentId != NULL_VALUE) val guidUidMapSessions = rawQtSession.filter(_._2.qualifiedTreatmentId == NULL_VALUE) This will run two different stages can this be done in one stage ? val (qtSessionsWithQt,