Help, carbondata issues on spark

2018-02-02 Thread ilegend
Hi guys,
We're testing CarbonData for our project. Under certain conditions its 
performance is better than Parquet's, but we have run into some problems. Do 
you have any solutions for these issues?
Environment: HDFS 2.6, Spark 2.1, CarbonData 1.3
1. No multi-level partitions. We need three levels of partitioning, such as 
year, day, hour.
2. Spark needs to import the CarbonData jar, and we do not want to modify our 
existing SQL.
3. Low stability: inserts fail frequently.

Looking forward to your reply.

Re: Help, carbondata issues on spark

2018-02-03 Thread Liang Chen
Hi

1. No multi-level partitions. We need three levels of partitioning, such as 
year, day, hour.

Reply: Do year, day, and hour belong to one column (field) or to three 
columns? Can you describe your exact scenario? We can help you design 
partition and sort columns to address your specific queries.
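
As a starting point for that discussion, here is a minimal Python sketch that 
only assembles a CREATE TABLE statement for the three-column case. It assumes 
that CarbonData's standard (Hive-style) PARTITIONED BY syntax and the 
SORT_COLUMNS table property behave as described in the CarbonData docs; 
whether multi-column partitioning works well on 1.3 should be verified on your 
own cluster. The table name, the column names, and the build_create_table 
helper are all hypothetical.

```python
def build_create_table(table, columns, partition_columns, sort_columns):
    """Assemble a CREATE TABLE statement as a string (nothing is executed)."""
    # Regular columns and partition columns are declared separately,
    # following Hive-style DDL.
    cols = ", ".join(f"{name} {dtype}" for name, dtype in columns)
    parts = ", ".join(f"{name} {dtype}" for name, dtype in partition_columns)
    sort = ",".join(sort_columns)
    return (
        f"CREATE TABLE IF NOT EXISTS {table} ({cols}) "
        f"PARTITIONED BY ({parts}) "
        f"STORED BY 'carbondata' "
        f"TBLPROPERTIES ('SORT_COLUMNS'='{sort}')"
    )


ddl = build_create_table(
    table="events",  # hypothetical table name
    columns=[("user_id", "STRING"), ("metric", "DOUBLE")],  # hypothetical columns
    partition_columns=[("year", "INT"), ("day", "INT"), ("hour", "INT")],
    sort_columns=["user_id"],  # assumption: queries filter mostly on user_id
)
print(ddl)
```

The resulting string could then be run through the sql() method of a Spark 
session that has CarbonData enabled. If year, day, and hour all come from a 
single timestamp field, they would have to be derived as separate columns when 
the data is loaded.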

2. Spark needs to import the CarbonData jar, and we do not want to modify our 
existing SQL.

Reply: There is no need to modify any SQL. You can query CarbonData with any 
SQL that SparkSQL supports.

3. Low stability: inserts fail frequently.
Reply: What are the exact errors?

Regards
Liang

Re: Help, carbondata issues on spark

2018-02-03 Thread Jacky Li


> On Feb 2, 2018, at 11:30 AM, ilegend <511618...@qq.com> wrote:
> 
> Hi guys 
> We're testing carbondata for our project. The performance of the carbondata 
> is better than parquet under the special rules, but there are some problems. 
> Do you have any solutions for our issues. 
> Hdfs 2.6, spark 2.1, carbondata 1.3
> 1.no multiple levels partitions , we need three levels partitions, like 
> year,day,hour

If you are looking for OLAP on timeseries data, you can try the timeseries 
feature in 1.3. See the timeseries section in 
https://github.com/apache/carbondata/blob/master/docs/data-management-on-carbondata.md#pre-aggregate-tables
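
To make the idea concrete, the plain-Python sketch below shows what rolling 
events up at year, day, and hour granularity means. It only illustrates the 
concept and does not use any CarbonData API; the rollup helper and the sample 
events are made up for this example.

```python
from collections import defaultdict
from datetime import datetime


def rollup(events, granularity):
    """Sum values per time bucket; granularity is 'year', 'day', or 'hour'."""
    # Each granularity truncates the timestamp to a coarser bucket label.
    formats = {"year": "%Y", "day": "%Y-%m-%d", "hour": "%Y-%m-%d %H"}
    fmt = formats[granularity]
    totals = defaultdict(int)
    for ts, value in events:
        totals[ts.strftime(fmt)] += value
    return dict(totals)


# Hypothetical (timestamp, value) events.
events = [
    (datetime(2018, 2, 2, 11, 5), 3),
    (datetime(2018, 2, 2, 11, 40), 4),
    (datetime(2018, 2, 2, 12, 10), 5),
    (datetime(2018, 2, 3, 9, 0), 1),
]

hourly = rollup(events, "hour")
daily = rollup(events, "day")
yearly = rollup(events, "year")
print(hourly, daily, yearly)
```

A query at day or year level can then read the small pre-computed totals 
instead of scanning every raw event, which is the general motivation for 
pre-aggregating by time granularity.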
 


> 2.spark needs import carbondata jar, we wouldn't modify the existing sql 
> algorithm 

I think if you are using CarbonSession, you get all of Carbon's built-in SQL 
optimizations. You do not need to modify your Spark jar.

> 3.low stability, insert failure frequently 

Is it a memory issue?
