I wonder whether that may be a bug in the Python API. Please file a JIRA for it, along with sample code that reproduces it and the output you get.
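
For reference, here is a minimal pure-Python sketch of the behaviour one would expect, based on the Scala definition quoted further down in this thread. It is a toy model, not Spark code: ToyStream and union_all are made-up names standing in for DStream and StreamingContext.union, and batches are plain lists. It only illustrates that chaining the two-stream union should give the same result as a single n-ary union.

```python
# Toy model only: ToyStream and union_all are hypothetical names, not Spark APIs.
from typing import List


class ToyStream:
    """Stand-in for a DStream: a list of batches, each batch a list of records."""

    def __init__(self, batches: List[list]):
        self.batches = batches

    def union(self, other: "ToyStream") -> "ToyStream":
        # Mirrors the quoted Scala definition: the two-stream union is
        # just the n-ary union applied to [self, other].
        return union_all([self, other])


def union_all(streams: List[ToyStream]) -> ToyStream:
    """Stand-in for StreamingContext.union: merge the streams batch by batch."""
    n_batches = max((len(s.batches) for s in streams), default=0)
    merged = []
    for i in range(n_batches):
        batch = []
        for s in streams:
            if i < len(s.batches):
                batch.extend(s.batches[i])
        merged.append(batch)
    return ToyStream(merged)


a = ToyStream([[1, 2], [3]])
b = ToyStream([[10], [20, 30]])
c = ToyStream([[100], []])

chained = a.union(b).union(c)   # chained two-stream unions
n_ary = union_all([a, b, c])    # one n-ary union

print(chained.batches)
print(n_ary.batches)
```

If chained DStream.union calls in PySpark yield empty RDDs while StreamingContext.union over the same streams does not, that departure from the expected equivalence is what the JIRA should demonstrate.
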

On Tue, May 12, 2015 at 10:00 AM, Vadim Bichutskiy <vadim.bichuts...@gmail.com> wrote:

> @TD I kept getting an empty RDD (i.e. rdd.take(1) was False).
>
> On Tue, May 12, 2015 at 12:57 PM, Tathagata Das <tathagata.das1...@gmail.com> wrote:
>
>> @Vadim What happened when you tried unioning using DStream.union in
>> Python?
>>
>> TD
>>
>> On Tue, May 12, 2015 at 9:53 AM, Evo Eftimov <evo.efti...@isecc.com>
>> wrote:
>>
>>> I can confirm it does work in Java
>>>
>>> *From:* Vadim Bichutskiy [mailto:vadim.bichuts...@gmail.com]
>>> *Sent:* Tuesday, May 12, 2015 5:53 PM
>>> *To:* Evo Eftimov
>>> *Cc:* Saisai Shao; user@spark.apache.org
>>> *Subject:* Re: DStream Union vs. StreamingContext Union
>>>
>>> Thanks Evo. I tried chaining DStream unions like what you have and it
>>> didn't work for me. But passing multiple arguments to
>>> StreamingContext.union worked fine. Any idea why?
>>> I am using Python, BTW.
>>>
>>> On Tue, May 12, 2015 at 12:45 PM, Evo Eftimov <evo.efti...@isecc.com>
>>> wrote:
>>>
>>> You can also union multiple DstreamRDDs in this way:
>>> DstreamRDD1.union(DstreamRDD2).union(DstreamRDD3) etc etc
>>>
>>> Ps: the API is not “redundant”; it offers several ways of achieving
>>> the same thing as a convenience, depending on the situation
>>>
>>> *From:* Vadim Bichutskiy [mailto:vadim.bichuts...@gmail.com]
>>> *Sent:* Tuesday, May 12, 2015 5:37 PM
>>> *To:* Saisai Shao
>>> *Cc:* user@spark.apache.org
>>> *Subject:* Re: DStream Union vs. StreamingContext Union
>>>
>>> Thanks Saisai. That makes sense. Just seems redundant to have both.
>>>
>>> On Mon, May 11, 2015 at 10:36 PM, Saisai Shao <sai.sai.s...@gmail.com>
>>> wrote:
>>>
>>> DStream.union can only union two DStreams, one of which is itself, while
>>> StreamingContext.union can union an array of DStreams. Internally,
>>> DStream.union is a special case of StreamingContext.union:
>>>
>>> def union(that: DStream[T]): DStream[T] =
>>>   new UnionDStream[T](Array(this, that))
>>>
>>> So there's no difference. If you want to union more than two DStreams,
>>> just use the one in StreamingContext; otherwise, both APIs are fine.
>>>
>>> 2015-05-12 6:49 GMT+08:00 Vadim Bichutskiy <vadim.bichuts...@gmail.com>:
>>>
>>> Can someone explain to me the difference between DStream union and
>>> StreamingContext union?
>>>
>>> When do you use one vs the other?
>>>
>>> Thanks,
>>>
>>> Vadim