Re: The Future Of DStream

2016-07-27 Thread Chang Chen
Things like Kafka and user-defined sources are not supported yet only
because Structured Streaming is still in the alpha stage.

Things like sort are not supported because they are difficult to implement,
and I don't think DStream can support them either.

What I want to know is the difference in the API (or abstraction). For
example, it is quite easy to reuse the same code for processing batch data
because of the unbounded table abstraction (which comes from Google's
Dataflow paper); that is why the internal engine is based on the logical
plan, the Spark plan, and RDDs. In contrast, DStream can't do the same thing
easily.
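
To illustrate the unbounded-table idea, here is a toy sketch in plain
Python. It is not Spark code: word_counts, run_batch, and run_streaming are
made-up names, and re-scanning the whole table on every micro-batch is a
simplification (a real engine would presumably update the result
incrementally).

```python
from collections import Counter
from typing import Iterable, Iterator, List


def word_counts(table: Iterable[str]) -> Counter:
    """The query: count words over every row of a table of text lines."""
    counts: Counter = Counter()
    for line in table:
        counts.update(line.split())
    return counts


def run_batch(lines: List[str]) -> Counter:
    """Batch mode: the table is bounded, so run the query once."""
    return word_counts(lines)


def run_streaming(micro_batches: Iterable[List[str]]) -> Iterator[Counter]:
    """Streaming mode: treat the stream as a table that keeps growing.

    After each micro-batch arrives, yield the result of the same query
    over all rows seen so far. (Toy assumption: the whole table is kept
    in memory and re-scanned each time.)
    """
    table: List[str] = []
    for batch in micro_batches:
        table.extend(batch)
        yield word_counts(table)


if __name__ == "__main__":
    for result in run_streaming([["a b"], ["b c"]]):
        print(dict(result))
```

The same word_counts function serves both modes; once the stream has
delivered every row, the streaming result matches the batch result.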

Actually, Dataset supports map, flatMap, and reduce, so in theory I can do
any user-defined work. That is why I am asking what kind of low-level
control DStream offers that Structured Streaming does not.

Thanks
Chang





On Wed, Jul 27, 2016 at 6:03 PM, Ofir Manor  wrote:

> For the 2.0 release, look for "Unsupported Operations" here:
>
> http://spark.apache.org/docs/latest/structured-streaming-programming-guide.html
> Also, there are bigger gaps - like no Kafka support, no way to plug in
> user-defined sources or sinks, etc.


Re: The Future Of DStream

2016-07-27 Thread Ofir Manor
For the 2.0 release, look for "Unsupported Operations" here:

http://spark.apache.org/docs/latest/structured-streaming-programming-guide.html
Also, there are bigger gaps - like no Kafka support, no way to plug in
user-defined sources or sinks, etc.

Ofir Manor

Co-Founder & CTO | Equalum

Mobile: +972-54-7801286 | Email: ofir.ma...@equalum.io

On Wed, Jul 27, 2016 at 11:24 AM, Chang Chen  wrote:

>
> I don't understand what kind of low-level control DStream offers that
> Structured Streaming does not.
>
> Thanks
> Chang


Re: The Future Of DStream

2016-07-27 Thread Chang Chen
I don't understand what kind of low-level control DStream offers that
Structured Streaming does not.

Thanks
Chang

On Wednesday, July 27, 2016, Matei Zaharia  wrote:

> Yup, they will definitely coexist. Structured Streaming is currently alpha
> and will probably be complete in the next few releases, but Spark Streaming
> will continue to exist, because it gives the user more low-level control.
> It's similar to DataFrames vs RDDs (RDDs are the lower-level API for when
> you want control, while DataFrames do more optimizations automatically by
> restricting the computation model).
>
> Matei


Re: The Future Of DStream

2016-07-27 Thread Matei Zaharia
Yup, they will definitely coexist. Structured Streaming is currently alpha and 
will probably be complete in the next few releases, but Spark Streaming will 
continue to exist, because it gives the user more low-level control. It's 
similar to DataFrames vs RDDs (RDDs are the lower-level API for when you want 
control, while DataFrames do more optimizations automatically by restricting 
the computation model).
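
As a rough illustration of that tradeoff, here is a toy model in plain
Python. It is not Spark or Catalyst code: Select, Where, optimize, execute,
and run_opaque are hypothetical names. The point is only that steps
described as data can be inspected and reordered by an engine, while opaque
functions have to be run exactly as written.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple, Union

Row = Dict[str, int]


@dataclass(frozen=True)
class Select:
    """Keep only the named columns."""
    columns: Tuple[str, ...]


@dataclass(frozen=True)
class Where:
    """Keep only rows whose column value is at least `minimum`."""
    column: str
    minimum: int


Step = Union[Select, Where]


def optimize(plan: List[Step]) -> List[Step]:
    """Move each Where in front of a preceding Select that keeps its column.

    Filtering first means fewer rows reach the projection. The rewrite is
    possible only because the steps are plain data the engine can inspect.
    """
    plan = list(plan)
    changed = True
    while changed:
        changed = False
        for i in range(len(plan) - 1):
            first, second = plan[i], plan[i + 1]
            if (isinstance(first, Select) and isinstance(second, Where)
                    and second.column in first.columns):
                plan[i], plan[i + 1] = second, first
                changed = True
    return plan


def execute(rows: List[Row], plan: List[Step]) -> List[Row]:
    """Run a declarative plan step by step."""
    for step in plan:
        if isinstance(step, Select):
            rows = [{c: r[c] for c in step.columns} for r in rows]
        else:
            rows = [r for r in rows if r[step.column] >= step.minimum]
    return rows


def run_opaque(rows: List[Row],
               steps: List[Callable[[List[Row]], List[Row]]]) -> List[Row]:
    """Low-level style: arbitrary functions, run in the order given."""
    for step in steps:
        rows = step(rows)
    return rows
```

With the opaque style the user decides the order and can do anything inside
each function; with the restricted style the engine is free to reorder
steps as long as the result is unchanged.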

Matei

> On Jul 27, 2016, at 12:03 AM, Ofir Manor  wrote:
> 
> Structured Streaming in 2.0 is declared as alpha - plenty of bits still 
> missing:
>
> http://spark.apache.org/docs/latest/structured-streaming-programming-guide.html
> I assume that it will be declared stable / GA in a future 2.x release, and
> then it will co-exist with DStream for quite a while before someone
> suggests starting a deprecation process that will eventually lead to its
> removal...
> As users, I guess we will need to apply judgement about when to switch to
> Structured Streaming - each of us has a different risk/value tradeoff,
> based on our specific situation...
> 
> Ofir Manor
> 
> Co-Founder & CTO | Equalum
>
> Mobile: +972-54-7801286 | Email: ofir.ma...@equalum.io



Re: The Future Of DStream

2016-07-27 Thread Ofir Manor
Structured Streaming in 2.0 is declared as alpha - plenty of bits still
missing:

http://spark.apache.org/docs/latest/structured-streaming-programming-guide.html
I assume that it will be declared stable / GA in a future 2.x release, and
then it will co-exist with DStream for quite a while before someone
suggests starting a deprecation process that will eventually lead to its
removal...
As users, I guess we will need to apply judgement about when to switch to
Structured Streaming - each of us has a different risk/value tradeoff,
based on our specific situation...

Ofir Manor

Co-Founder & CTO | Equalum

Mobile: +972-54-7801286 | Email: ofir.ma...@equalum.io

On Wed, Jul 27, 2016 at 8:02 AM, Chang Chen  wrote:

> Hi guys
>
> Structured Streaming is coming with Spark 2.0, but I noticed that DStream
> is still here.
>
> What is the future of DStream? Will it be deprecated and eventually
> removed, or will it coexist with Structured Streaming forever?
>
> Thanks
> Chang
>
>