Re: Should CarbonData need to integrate with Spark Streaming too?

2018-01-16 Thread xm_zzc
Liang Chen wrote: > Hi, thanks for starting this discussion on adding Spark Streaming support. > 1. Please try to reuse the current code (structured streaming) rather than adding separate logic for Spark Streaming. [reply] The original idea is to reuse the current code(structured strea

Re: Should CarbonData need to integrate with Spark Streaming too?

2018-01-16 Thread xm_zzc
Hi Jacky: >> 1). CarbonData batch data loading + auto compaction: use CarbonSession.createDataFrame to convert the RDD to a DataFrame in InputDStream.foreachRDD, and then save the data into a CarbonData table that supports auto compaction. In this way, it can support creating pre-aggreg

Re: Should CarbonData need to integrate with Spark Streaming too?

2018-01-16 Thread Liang Chen
Hi, thanks for starting this discussion on adding Spark Streaming support. 1. Please try to reuse the current code (structured streaming) rather than adding separate logic for Spark Streaming. 2. I suggest using structured streaming by default; please consider how to make configuratio

"Select" query failed when executing "COMPACT" and "CLEAN".

2018-01-16 Thread yixu2001
dev, Spark 2.1 + CarbonData 1.1.1: "Select" query failed when executing "COMPACT" and "CLEAN". After my cluster had been running for a period of time, many fragments had been generated in it. I then ran the following two commands: "ALTER table e_carbon.offer COMPACT 'MAJOR';" "CLEAN FILES FOR TABLE e_c

Re: Should CarbonData need to integrate with Spark Streaming too?

2018-01-16 Thread Jacky Li
> On January 17, 2018, at 1:38 AM, xm_zzc <441586...@qq.com> wrote: > > Hi dev: > Currently CarbonData 1.3 (to be released soon) only supports integration with Spark Structured Streaming, which requires Kafka version >= 0.10. I think there are still many users integrating Spark Streaming with

Should CarbonData need to integrate with Spark Streaming too?

2018-01-16 Thread xm_zzc
Hi dev: Currently CarbonData 1.3 (to be released soon) only supports integration with Spark Structured Streaming, which requires Kafka version >= 0.10. I think there are still many users integrating Spark Streaming with Kafka 0.8, at least our cluster is, but the cost of upgrading kafk