Re: [DISCUSSION] Support new feature: Partition Table

2017-04-01 Thread a
…ition info to schema. > 2. During data loading, re-partition the input data, start a task to process a partition, and write partition information to the footer and index file. > 3. During data query, prune the B+Tree by partition if the filter contains the partition column, or prune d…
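The three steps above hint at a user-facing design. The sketch below is only illustrative: this thread predates the final implementation, so the `PARTITIONED BY` clause, the table and column names, and the pruning behavior shown here are assumptions, not the committed CarbonData syntax.

```sql
-- Hypothetical sketch of the proposed partition support (not final syntax).
-- Step 1: partition info is declared as part of the schema.
CREATE TABLE sales (
  order_id BIGINT,
  amount   DOUBLE
)
PARTITIONED BY (dt STRING)
STORED BY 'carbondata';

-- Step 2: during loading, input data would be re-partitioned so that each
-- task processes one partition, recording partition info in footer/index files.
LOAD DATA INPATH '/data/sales.csv' INTO TABLE sales;

-- Step 3: a filter on the partition column would let the engine prune whole
-- partitions before consulting the per-blocklet B+Tree index.
SELECT SUM(amount) FROM sales WHERE dt = '2017-01-01';
```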

Re:Re:Re: Load data into carbondata executors distributed unevenly

2017-03-30 Thread a
Yes, it is. Babu, thanks for your help! Best regards! At 2017-03-30 17:32:07, "babu lal jangir" wrote: >Hi >Please refer to the JIRA issue below; I guess your issue is the same: >CARBONDATA-830 (Data loading scheduling has some issue) > >Thanks >Babu >On Mar 30, 2017 12:26…

Re:Re: Load data into carbondata executors distributed unevenly

2017-03-29 Thread a
add attachments At 2017-03-30 10:38:08, "Ravindra Pesala" wrote: >Hi, > >It seems the attachments are missing. Can you attach them again? > >Regards, >Ravindra. > >On 30 March 2017 at 08:02, a wrote: > >> Hello! >> >> *Test result:* …

Load data into carbondata executors distributed unevenly

2017-03-29 Thread a
Hello! Test result: When I loaded CSV data into a carbondata table 3 times, the executors were distributed unevenly. My goal is one task per node, but some nodes got 2 tasks and some got none. See load data 1.png, data 2.png, and data 3.png. The carbondata data.PNG shows the data structu…

Re:Re: Re:Re:Re:Re:Re:Re: insert into carbon table failed

2017-03-28 Thread a
Thank you very much! I have divided the 2 billion rows into 4 pieces and loaded them into the table. The three parameters carbon.graph.rowset.size, carbon.sort.size, and carbon.number.of.cores.while.loading may also have an effect. Best regards! At 2017-03-27 13:53:58, "Liang Chen" wrote: >Hi > >1.Use your…
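The three loading parameters named above would normally be tuned in carbon.properties. A sketch of such a fragment is below; the values are placeholders to experiment with, not recommendations from this thread.

```properties
# Illustrative carbon.properties fragment; example values only.
# Rows accumulated per rowset during the load step:
carbon.graph.rowset.size=100000
# Record count per in-memory sort batch before spilling to disk:
carbon.sort.size=500000
# Threads used per executor while loading:
carbon.number.of.cores.while.loading=6
```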

Re:Re:Re:Re:Re:Re: insert into carbon table failed

2017-03-26 Thread a
…017-01-01' and sty='' and fo like '%%' group by tv -- distinct over multiple columns: select count(distinct user_id), count(distinct mid), count(distinct case when sty='' then mid end) from carbon_table where dt='2017-01-01' and sty='' -- sorted query…

Re:Re:Re:Re:Re: insert into carbon table failed

2017-03-26 Thread a
) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) At 2017-03-27 00:42:28, "a" wrote: Container log: error executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL 15: SIGTERM. spark log: 17/03/26 23:40:30 ERROR YarnScheduler: Lost executor 2 on hd25: Contai…

Re:Re:Re:Re: insert into carbon table failed

2017-03-26 Thread a
spark.yarn.executor.memoryOverhead. The test sql At 2017-03-26 23:34:36, "a" wrote: > > >I have set the parameters as follows: >1. fs.hdfs.impl.disable.cache=true >2. dfs.socket.timeout=180 (Exception: Caused by: java.io.IOException: >Filesystem closed) >3. dfs.datanode.sock…
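Settings like these can be passed at submit time rather than edited into cluster config files. The fragment below is only a sketch of how that might look; the sizes are placeholders to tune, not values confirmed in this thread (note that `spark.hadoop.*` is the standard prefix for forwarding Hadoop properties through Spark).

```sh
# Illustrative spark-submit flags for the settings discussed above.
spark-submit \
  --conf spark.yarn.executor.memoryOverhead=2048 \
  --conf spark.hadoop.fs.hdfs.impl.disable.cache=true \
  --conf spark.hadoop.dfs.socket.timeout=180000 \
  ...
```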

Re:Re:Re: insert into carbon table failed

2017-03-26 Thread a
…rt 20 records into carbondata? Should I set executor-memory big enough? Or should I generate the CSV file from the Hive table first, then load the CSV file into the carbon table? Can anybody give me some help? Regards fish At 2017-03-26 00:34:18, "a" wrote: >Thank you…

Re:Re: insert into carbon table failed

2017-03-25 Thread a
…ng, play_pv Int, spt_cnt Int, prg_spt_cnt Int) row format delimited fields terminated by '|' STORED BY 'carbondata' TBLPROPERTIES ('DICTIONARY_EXCLUDE'='pip,sh,mid,fo,user_id', 'DICTIONARY_INCLUDE'='dt,pt,lst,plat,sty,is_…

Re:Re: insert into carbon table failed

2017-03-25 Thread a
…'DICTIONARY_INCLUDE'='dt,pt,lst,plat,sty,is_pay,is_vip,is_mpack,scene,status,nw,isc,area,spttag,province,isp,city,tv,hwm','NO_INVERTED_INDEX'='lst,plat,hwm,pip,sh,mid','BUCKETNUMBER'='10','BUCKETCOLUMNS'='fo')")…

Re:Re: insert into carbon table failed

2017-03-25 Thread a
…'|' STORED BY 'carbondata' TBLPROPERTIES ('DICTIONARY_EXCLUDE'='pip,sh,mid,fo,user_id','DICTIONARY_INCLUDE'='dt,pt,lst,plat,sty,is_pay,is_vip,is_mpack,scene,status,nw,isc,area,spttag,province,isp,city,tv,hwm','NO_INVERTED_INDEX'=…
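For readability, the DDL spread across the truncated snippets above can be pieced together roughly as follows. This is a hypothetical reconstruction: the column list is abbreviated with guessed types, and only the TBLPROPERTIES actually visible in the previews are reproduced.

```sql
-- Illustrative reconstruction of the table from the snippets above;
-- column list abbreviated and hypothetical.
CREATE TABLE carbon_table (
  dt STRING, user_id STRING, mid STRING, fo STRING,
  play_pv INT, spt_cnt INT, prg_spt_cnt INT
)
STORED BY 'carbondata'
TBLPROPERTIES (
  'DICTIONARY_EXCLUDE'='pip,sh,mid,fo,user_id',
  'DICTIONARY_INCLUDE'='dt,pt,lst,plat,sty,is_pay,is_vip,is_mpack,scene,status,nw,isc,area,spttag,province,isp,city,tv,hwm',
  'NO_INVERTED_INDEX'='lst,plat,hwm,pip,sh,mid',
  'BUCKETNUMBER'='10',
  'BUCKETCOLUMNS'='fo'
);
```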