As far as I know, Spark doesn't support multiple outputs.
On Wed, Jun 3, 2015 at 2:15 PM, ayan guha guha.a...@gmail.com wrote:
Why do you need to do that if filter and content of the resulting rdd are
exactly same? You may as well declare them as 1 RDD.
On 3 Jun 2015 15:28, ÐΞ€ρ@Ҝ (๏̯͡๏) deepuj...@gmail.com wrote:
I checked RDD#randomSplit; it is much more like multiple one-to-one
transformations than a one-to-multiple transformation.
I wrote some sample code as follows, and it generates 3 stages. Although
we can use cache here to make it better, if Spark supported multiple
outputs it would need only 2 stages.
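The cost of the extra stage can be illustrated outside Spark with a small plain-Scala sketch (a hypothetical stand-in, not Spark API): each independent filter re-runs the upstream computation, whereas materializing ("caching") the intermediate result first means the upstream work runs once.

```scala
object FilterPassCount {
  def main(args: Array[String]): Unit = {
    var passes = 0
    // Stand-in for an expensive upstream stage: each call recomputes the data.
    def expensiveSource(): Seq[Int] = { passes += 1; 1 to 10 }

    // Two separate filters, like two jobs over an uncached RDD:
    // the upstream work runs once per filter.
    val evens = expensiveSource().filter(_ % 2 == 0)
    val odds  = expensiveSource().filter(_ % 2 != 0)
    println(passes) // 2

    // Materializing the source first (analogous to cache()) means
    // a single upstream pass serves both filters.
    passes = 0
    val cached = expensiveSource()
    val evens2 = cached.filter(_ % 2 == 0)
    val odds2  = cached.filter(_ % 2 != 0)
    println(passes) // 1
  }
}
```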
In that sense, Spark actually does have operations that make multiple
RDDs, like randomSplit. However, there is no equivalent of the partition
operation, which yields the elements that matched and did not match at once.
On Wed, Jun 3, 2015, 8:32 AM Jeff Zhang zjf...@gmail.com wrote:
On 3 Jun 2015 15:28, ÐΞ€ρ@Ҝ (๏̯͡๏) deepuj...@gmail.com wrote:
I want to do this:

val qtSessionsWithQt = rawQtSession.filter(_._2.qualifiedTreatmentId != NULL_VALUE)
val guidUidMapSessions = rawQtSession.filter(_._2.qualifiedTreatmentId == NULL_VALUE)

This will run two different stages. Can this be done in one stage?
val (qtSessionsWithQt,