I wonder whether that may be a bug in the Python API. Please file a JIRA
with sample code that reproduces it and the output you get.

On Tue, May 12, 2015 at 10:00 AM, Vadim Bichutskiy <
vadim.bichuts...@gmail.com> wrote:

> @TD I kept getting an empty RDD (i.e. rdd.take(1) returned an empty list).
>
> On Tue, May 12, 2015 at 12:57 PM, Tathagata Das <
> tathagata.das1...@gmail.com> wrote:
>
>> @Vadim What happened when you tried the union with DStream.union in
>> Python?
>>
>> TD
>>
>> On Tue, May 12, 2015 at 9:53 AM, Evo Eftimov <evo.efti...@isecc.com>
>> wrote:
>>
>>> I can confirm it does work in Java
>>>
>>>
>>>
>>> *From:* Vadim Bichutskiy [mailto:vadim.bichuts...@gmail.com]
>>> *Sent:* Tuesday, May 12, 2015 5:53 PM
>>> *To:* Evo Eftimov
>>> *Cc:* Saisai Shao; user@spark.apache.org
>>>
>>> *Subject:* Re: DStream Union vs. StreamingContext Union
>>>
>>>
>>>
>>> Thanks Evo. I tried chaining DStream unions the way you showed and it
>>> didn't work for me, but passing multiple arguments to
>>> StreamingContext.union worked fine. Any idea why? I am using Python, BTW.
>>>
>>>
>>>
>>>
>>> On Tue, May 12, 2015 at 12:45 PM, Evo Eftimov <evo.efti...@isecc.com>
>>> wrote:
>>>
>>> You can also union multiple DStreams by chaining calls, e.g.
>>> DstreamRDD1.union(DstreamRDD2).union(DstreamRDD3), and so on.
>>>
>>>
>>>
>>> PS: the API is not “redundant”; it offers several ways of achieving
>>> the same thing as a convenience, depending on the situation.
>>>
>>>
>>>
>>> *From:* Vadim Bichutskiy [mailto:vadim.bichuts...@gmail.com]
>>> *Sent:* Tuesday, May 12, 2015 5:37 PM
>>> *To:* Saisai Shao
>>> *Cc:* user@spark.apache.org
>>> *Subject:* Re: DStream Union vs. StreamingContext Union
>>>
>>>
>>>
>>> Thanks Saisai. That makes sense. Just seems redundant to have both.
>>>
>>>
>>>
>>>
>>> On Mon, May 11, 2015 at 10:36 PM, Saisai Shao <sai.sai.s...@gmail.com>
>>> wrote:
>>>
>>> DStream.union can only union two DStreams, one of which is itself, while
>>> StreamingContext.union can union an array of DStreams. Internally,
>>> DStream.union is a special case of StreamingContext.union:
>>>
>>>
>>>
>>> def union(that: DStream[T]): DStream[T] =
>>>   new UnionDStream[T](Array(this, that))
>>>
>>>
>>>
>>> So there's no difference. If you want to union more than two DStreams,
>>> just use the one in StreamingContext; otherwise, either API is fine.
>>>
>>>
>>>
>>>
>>>
>>> 2015-05-12 6:49 GMT+08:00 Vadim Bichutskiy <vadim.bichuts...@gmail.com>:
>>>
>>> Can someone explain to me the difference between DStream union and
>>> StreamingContext union?
>>>
>>> When do you use one vs the other?
>>>
>>>
>>>
>>> Thanks,
>>>
>>> Vadim
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>
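
For reference, here is a minimal sketch of the two call shapes discussed in the thread above: chaining pairwise DStream.union calls versus one call to StreamingContext.union with several streams. The helper names union_chained and union_via_context are made up for illustration. FakeStream and FakeContext are hypothetical stand-ins, not Spark classes; they model a single batch as a Python list so the sketch runs without a cluster. It only shows that both shapes are expected to yield the same data and says nothing about the behaviour Vadim reported.

```python
from functools import reduce


def union_chained(streams):
    """Pairwise form: s1.union(s2).union(s3)..."""
    return reduce(lambda acc, s: acc.union(s), streams)


def union_via_context(ssc, streams):
    """N-ary form: ssc.union(s1, s2, s3, ...)."""
    return ssc.union(*streams)


# Hypothetical stand-ins (NOT Spark classes). Each models one batch of a
# stream as a plain list so the two helpers can be compared locally.
class FakeStream:
    def __init__(self, items):
        self.items = list(items)

    def union(self, other):
        # Mirrors the shape of DStream.union(other): exactly two inputs.
        return FakeStream(self.items + other.items)


class FakeContext:
    def union(self, *streams):
        # Mirrors the shape of StreamingContext.union(*dstreams).
        return FakeStream([x for s in streams for x in s.items])


streams = [FakeStream([1, 2]), FakeStream([3]), FakeStream([4, 5])]
chained = union_chained(streams)
via_context = union_via_context(FakeContext(), streams)
print(chained.items == via_context.items)
```

With real PySpark objects, the same two helpers could be handed a StreamingContext and a list of DStreams; comparing their output per batch would make a compact reproduction case for the JIRA.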
