Re: DStream Union vs. StreamingContext Union
@TD How do I file a JIRA?

On Tue, May 12, 2015 at 2:06 PM, Tathagata Das tathagata.das1...@gmail.com wrote:
I wonder if that may be a bug in the Python API. Please file it as a JIRA along with sample code to reproduce it and the sample output you get.

On Tue, May 12, 2015 at 10:00 AM, Vadim Bichutskiy vadim.bichuts...@gmail.com wrote:
@TD I kept getting an empty RDD (i.e. rdd.take(1) was False).

On Tue, May 12, 2015 at 12:57 PM, Tathagata Das tathagata.das1...@gmail.com wrote:
@Vadim What happened when you tried unioning using DStream.union in Python? TD

On Tue, May 12, 2015 at 9:53 AM, Evo Eftimov evo.efti...@isecc.com wrote:
I can confirm it does work in Java.

From: Vadim Bichutskiy [mailto:vadim.bichuts...@gmail.com]
Sent: Tuesday, May 12, 2015 5:53 PM
To: Evo Eftimov
Cc: Saisai Shao; user@spark.apache.org
Subject: Re: DStream Union vs. StreamingContext Union
Thanks Evo. I tried chaining DStream unions like what you have and it didn't work for me. But passing multiple arguments to StreamingContext.union worked fine. Any idea why? I am using Python, BTW.

On Tue, May 12, 2015 at 12:45 PM, Evo Eftimov evo.efti...@isecc.com wrote:
You can also union multiple DstreamRDDs in this way:
    DstreamRDD1.union(DstreamRDD2).union(DstreamRDD3)
etc.
PS: the API is not “redundant”; it offers several ways of achieving the same thing as a convenience, depending on the situation.

From: Vadim Bichutskiy [mailto:vadim.bichuts...@gmail.com]
Sent: Tuesday, May 12, 2015 5:37 PM
To: Saisai Shao
Cc: user@spark.apache.org
Subject: Re: DStream Union vs. StreamingContext Union
Thanks Saisai. That makes sense. Just seems redundant to have both.

On Mon, May 11, 2015 at 10:36 PM, Saisai Shao sai.sai.s...@gmail.com wrote:
DStream.union can only union two DStreams, one of which is itself, while StreamingContext.union can union an array of DStreams. Internally, DStream.union is a special case of StreamingContext.union:
    def union(that: DStream[T]): DStream[T] = new UnionDStream[T](Array(this, that))
So there's no difference. If you want to union more than two DStreams, just use the one in StreamingContext; otherwise, both APIs are fine.

2015-05-12 6:49 GMT+08:00 Vadim Bichutskiy vadim.bichuts...@gmail.com:
Can someone explain to me the difference between DStream union and StreamingContext union? When do you use one vs the other?
Thanks,
Vadim
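Saisai's point, that the two-stream DStream.union is just the n-ary StreamingContext.union called with two parents, can be sketched with a toy model. This is not Spark code: `ToyStream` and `union_all` are hypothetical stand-ins invented for illustration, and a "stream" here is simply a fixed list of batches. Under those assumptions, chaining the binary union gives the same batches as one n-ary union, which is the behaviour the thread expects from the real API.

```python
from typing import Sequence


class ToyStream:
    """Hypothetical stand-in for a DStream: a fixed list of batches."""

    def __init__(self, batches: Sequence[Sequence[object]]):
        self.batches = [list(b) for b in batches]

    def union(self, that: "ToyStream") -> "ToyStream":
        # Mirrors the quoted Scala one-liner: the binary form delegates
        # to the n-ary form with exactly two parents (this, that).
        return union_all([self, that])


def union_all(streams: Sequence[ToyStream]) -> ToyStream:
    """Hypothetical stand-in for StreamingContext.union.

    Batch i of the result is the concatenation of batch i of every parent.
    """
    if not streams:
        raise ValueError("need at least one stream")
    n = len(streams[0].batches)
    if any(len(s.batches) != n for s in streams):
        raise ValueError("all toy streams must have the same number of batches")
    merged = [[rec for s in streams for rec in s.batches[i]] for i in range(n)]
    return ToyStream(merged)


a = ToyStream([[1, 2], [3]])
b = ToyStream([[10], []])
c = ToyStream([[], [20, 30]])

chained = a.union(b).union(c)   # binary union applied twice
n_ary = union_all([a, b, c])    # one n-ary union

print(chained.batches == n_ary.batches)  # True
```

In this model the only difference between the two forms is how many intermediate union objects get created, which matches the "no difference" conclusion above for the Scala API.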
Re: DStream Union vs. StreamingContext Union
@Vadim What happened when you tried unioning using DStream.union in python? TD
Re: DStream Union vs. StreamingContext Union
@TD I kept getting an empty RDD (i.e. rdd.take(1) was False).
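The emptiness check mentioned in this message relies on `take(1)` returning a list, which is falsy in Python when it has no elements. A minimal sketch of that check follows; `FakeRDD` and `batch_is_empty` are hypothetical names used only so the example runs without Spark, and the only assumption about a real RDD is that it has a `take(n)` method returning a list.

```python
class FakeRDD:
    """Hypothetical stand-in exposing only take(n), returning a list."""

    def __init__(self, records):
        self._records = list(records)

    def take(self, n):
        # Return at most the first n records as a list.
        return self._records[:n]


def batch_is_empty(rdd) -> bool:
    # take(1) yields [] for an empty batch; an empty list is falsy.
    return not rdd.take(1)


print(batch_is_empty(FakeRDD([])))   # True
print(batch_is_empty(FakeRDD([0])))  # False: [0] is a non-empty list
```

Note that the check tests the list, not its first element, so a batch whose first record is itself falsy (such as 0) still counts as non-empty.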
RE: DStream Union vs. StreamingContext Union
I can confirm it does work in Java
Re: DStream Union vs. StreamingContext Union
Thanks Saisai. That makes sense. Just seems redundant to have both.
Re: DStream Union vs. StreamingContext Union
Thanks Evo. I tried chaining DStream unions like what you have and it didn't work for me. But passing multiple arguments to StreamingContext.union worked fine. Any idea why? I am using Python, BTW.
Re: DStream Union vs. StreamingContext Union
I wonder if that may be a bug in the Python API. Please file it as a JIRA along with sample code to reproduce it and the sample output you get.