Hi

Thanks for starting this discussion about adding Spark Streaming support.
1. Please try to reuse the current code (Structured Streaming) rather than
adding separate logic for Spark Streaming.
2. I suggest using Structured Streaming by default; please consider how to
provide a configuration option for enabling/switching to Spark Streaming.
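For point 2, one possibility is a carbon property as the switch. A rough
sketch, assuming a hypothetical property name "carbon.streaming.integration"
(not an existing CarbonData option):

import org.apache.carbondata.core.util.CarbonProperties

// "carbon.streaming.integration" is a hypothetical name used only to
// illustrate the switch; it does not exist in CarbonData today.
val engine = CarbonProperties.getInstance()
  .getProperty("carbon.streaming.integration", "structured-streaming")

engine match {
  case "spark-streaming" =>
    // wire up the DStream-based integration proposed below
  case _ =>
    // default: reuse the existing Structured Streaming path
}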

Regards
Liang


xm_zzc wrote
> Hi dev:
>   Currently CarbonData 1.3 (to be released soon) only supports integration
> with Spark Structured Streaming, which requires Kafka version >= 0.10. I
> think there are still many users integrating Spark Streaming with Kafka
> 0.8 (at least our cluster does), but the cost of upgrading Kafka is too
> high. So should CarbonData integrate with Spark Streaming too?
>   
>   I think there are two ways to integrate with Spark Streaming, as
> follows:
>   1). CarbonData batch data loading + auto compaction
>   Use CarbonSession.createDataFrame to convert the RDD to a DataFrame
> inside InputDStream.foreachRDD, and then save the data into a CarbonData
> table that has auto compaction enabled. This way also allows creating
> pre-aggregate tables on the main table (a streaming table does not support
> pre-aggregate tables on it).
>   
>   I can test this approach in our QA environment and add an example to
> CarbonData.
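A minimal sketch of approach 1), assuming Kafka 0.8 via
spark-streaming-kafka-0-8, a CSV value payload, and an existing two-column
table "carbon_sink" with auto compaction enabled (broker, topic, table and
column names are all illustrative):

import kafka.serializer.StringDecoder
import org.apache.spark.sql.{SaveMode, SparkSession}
import org.apache.spark.sql.CarbonSession._
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

// a CarbonSession is a SparkSession, so it can back a StreamingContext
val spark = SparkSession.builder()
  .master("local[2]")
  .appName("CarbonSparkStreaming")
  .getOrCreateCarbonSession("/tmp/carbon.store")
val ssc = new StreamingContext(spark.sparkContext, Seconds(10))

// Kafka 0.8 direct stream
val stream = KafkaUtils.createDirectStream[String, String, StringDecoder,
  StringDecoder](ssc, Map("metadata.broker.list" -> "broker1:9092"),
  Set("events"))

stream.foreachRDD { rdd =>
  if (!rdd.isEmpty()) {
    // convert each mini-batch to a DataFrame and load it as a batch segment
    val df = spark.createDataFrame(rdd.map { case (_, value) =>
      val cols = value.split(",")
      (cols(0), cols(1))
    }).toDF("id", "name")
    df.write
      .format("carbondata")
      .option("tableName", "carbon_sink")
      .mode(SaveMode.Append)
      .save()
  }
}

ssc.start()
ssc.awaitTermination()

Each foreachRDD load creates a new segment, so auto compaction (or a
periodic ALTER TABLE carbon_sink COMPACT 'MINOR') is needed to keep the
segment count under control.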
>   
>   2). The same approach as the Structured Streaming integration
>   In this approach, Structured Streaming appends every mini-batch into a
> stream segment (row format); when the size of the stream segment exceeds
> 'carbon.streaming.segment.max.size', the stream segment is automatically
> converted to a batch segment (column format) at the beginning of each
> batch and a new stream segment is created for appending data.
>   However, I have no idea how to integrate this with Spark Streaming yet,
> *any suggestions on this*?
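For reference, the existing Structured Streaming path in 1.3 looks roughly
like this (a sketch along the lines of the streaming guide, assuming a Kafka
0.10 source, a CarbonSession named spark, and a streaming table
"carbon_stream"; the selected columns must match the table schema):

import org.apache.spark.sql.streaming.ProcessingTime

val source = spark.readStream
  .format("kafka") // the Kafka source requires Kafka >= 0.10
  .option("kafka.bootstrap.servers", "broker1:9092")
  .option("subscribe", "events")
  .load()

val query = source.selectExpr("CAST(value AS STRING) AS value").writeStream
  .format("carbondata")
  .trigger(ProcessingTime("5 seconds"))
  .option("checkpointLocation", "/tmp/carbon.stream.checkpoint")
  .option("dbName", "default")
  .option("tableName", "carbon_stream")
  .start()

query.awaitTermination()

Whatever we decide for Spark Streaming, the stream-segment append and
handoff logic behind this sink is what should be shared rather than
duplicated, per point 1 above.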