Re: DStream Union vs. StreamingContext Union

2015-05-14 Thread Vadim Bichutskiy
@TD How do I file a JIRA?

On Tue, May 12, 2015 at 2:06 PM, Tathagata Das tathagata.das1...@gmail.com
wrote:

 I wonder whether that may be a bug in the Python API. Please file it as a
 JIRA along with sample code to reproduce it and the sample output you get.

 On Tue, May 12, 2015 at 10:00 AM, Vadim Bichutskiy 
 vadim.bichuts...@gmail.com wrote:

 @TD I kept getting an empty RDD (i.e. rdd.take(1) was False).

 On Tue, May 12, 2015 at 12:57 PM, Tathagata Das 
 tathagata.das1...@gmail.com wrote:

 @Vadim What happened when you tried unioning using DStream.union in
 python?

 TD

 On Tue, May 12, 2015 at 9:53 AM, Evo Eftimov evo.efti...@isecc.com
 wrote:

 I can confirm it does work in Java



 *From:* Vadim Bichutskiy [mailto:vadim.bichuts...@gmail.com]
 *Sent:* Tuesday, May 12, 2015 5:53 PM
 *To:* Evo Eftimov
 *Cc:* Saisai Shao; user@spark.apache.org

 *Subject:* Re: DStream Union vs. StreamingContext Union



 Thanks Evo. I tried chaining DStream unions like what you have and it
 didn't work for me. But passing multiple arguments to
 StreamingContext.union worked fine. Any idea why? I am using Python, BTW.




 On Tue, May 12, 2015 at 12:45 PM, Evo Eftimov evo.efti...@isecc.com
 wrote:

 You can also union multiple DStreams in this way:
 DstreamRDD1.union(DstreamRDD2).union(DstreamRDD3), etc.

 PS: the API is not “redundant”; it offers several ways of achieving
 the same thing, as a convenience depending on the situation.
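
For a whole collection of streams, the chaining described above can be written as a fold. This is only a minimal Scala sketch, assuming Spark Streaming is on the classpath; the object name `ChainedUnion` and the helper `unionAll` are hypothetical and not part of Spark.

```scala
import org.apache.spark.streaming.dstream.DStream

object ChainedUnion {
  // Hypothetical helper (not part of Spark): unions a non-empty sequence of
  // DStreams by chaining the two-stream DStream.union, which is equivalent to
  // writing s1.union(s2).union(s3) and so on by hand.
  def unionAll[T](streams: Seq[DStream[T]]): DStream[T] = {
    require(streams.nonEmpty, "need at least one DStream")
    streams.reduce((left, right) => left.union(right))
  }
}
```

Each chained call produces a new unioned stream that wraps the previous result, so for many streams a single call to StreamingContext.union is probably the tidier choice.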



 *From:* Vadim Bichutskiy [mailto:vadim.bichuts...@gmail.com]
 *Sent:* Tuesday, May 12, 2015 5:37 PM
 *To:* Saisai Shao
 *Cc:* user@spark.apache.org
 *Subject:* Re: DStream Union vs. StreamingContext Union



 Thanks Saisai. That makes sense. Just seems redundant to have both.




 On Mon, May 11, 2015 at 10:36 PM, Saisai Shao sai.sai.s...@gmail.com
 wrote:

 DStream.union can only union two DStreams, one of which is the stream it
 is called on, while StreamingContext.union can union an array of DStreams.
 Internally, DStream.union is a special case of StreamingContext.union:

 def union(that: DStream[T]): DStream[T] =
   new UnionDStream[T](Array(this, that))

 So there is no real difference: if you want to union more than two
 DStreams, just use the one in StreamingContext; otherwise, both APIs are
 fine.
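
To make the comparison concrete, here is a small Scala sketch that uses both calls side by side. It assumes Spark Streaming is on the classpath and runs with a local master; the object name `UnionExample`, the helper `queued`, and the sample values are illustrative only and do not come from this thread.

```scala
import scala.collection.mutable
import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.dstream.DStream

object UnionExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("UnionExample")
    val ssc = new StreamingContext(conf, Seconds(1))

    // Illustrative test input: each stream emits one small RDD in its first batch.
    def queued(values: Seq[Int]): DStream[Int] =
      ssc.queueStream(mutable.Queue[RDD[Int]](ssc.sparkContext.parallelize(values)))

    val s1 = queued(Seq(1, 2))
    val s2 = queued(Seq(3, 4))
    val s3 = queued(Seq(5, 6))

    // DStream.union: exactly two streams, the receiver and its argument.
    val pairwise = s1.union(s2)

    // StreamingContext.union: any number of streams in a single call.
    val all = ssc.union(Seq(s1, s2, s3))

    pairwise.print()
    all.print()

    ssc.start()
    ssc.awaitTerminationOrTimeout(5000) // let a few batches run, then shut down
    ssc.stop()
  }
}
```

The queue-backed input is only there to keep the sketch self-contained; with real receivers the two union calls are used the same way.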





 2015-05-12 6:49 GMT+08:00 Vadim Bichutskiy vadim.bichuts...@gmail.com
 :

 Can someone explain to me the difference between DStream union and
 StreamingContext union?

 When do you use one vs the other?



 Thanks,

 Vadim














Re: DStream Union vs. StreamingContext Union

2015-05-12 Thread Tathagata Das
@Vadim What happened when you tried unioning using DStream.union in python?

TD










Re: DStream Union vs. StreamingContext Union

2015-05-12 Thread Vadim Bichutskiy
@TD I kept getting an empty RDD (i.e. rdd.take(1) was False).











RE: DStream Union vs. StreamingContext Union

2015-05-12 Thread Evo Eftimov
I can confirm it does work in Java 

 


 

 

 



Re: DStream Union vs. StreamingContext Union

2015-05-12 Thread Vadim Bichutskiy
Thanks Saisai. That makes sense. Just seems redundant to have both.





Re: DStream Union vs. StreamingContext Union

2015-05-12 Thread Vadim Bichutskiy
Thanks Evo. I tried chaining DStream unions like what you have and it
didn't work for me. But passing multiple arguments to StreamingContext.union
worked fine. Any idea why? I am using Python, BTW.







Re: DStream Union vs. StreamingContext Union

2015-05-12 Thread Tathagata Das
I wonder whether that may be a bug in the Python API. Please file it as a
JIRA along with sample code to reproduce it and the sample output you get.
