Hi 

Please enable the vector reader; it should help the limit queries.

import org.apache.carbondata.core.util.CarbonProperties
import org.apache.carbondata.core.constants.CarbonCommonConstants
CarbonProperties.getInstance()
  .addProperty(CarbonCommonConstants.ENABLE_VECTOR_READER, "true")

Regards
Liang


a wrote
> TEST SQL :
> High-cardinality random query
> select * From carbon_table where dt='2017-01-01' and user_id='XXXX' limit
> 100;
> 
> 
> High-cardinality random query with LIKE
> select * From carbon_table where dt='2017-01-01' and fo like '%YYYY%'
> limit 100;
> 
> 
> Low-cardinality random query
> select * From carbon_table where dt='2017-01-01' and plat='android' and
> tv='8400' limit 100
> 
> 
> 1-dimension query
> select province,sum(play_pv) play_pv ,sum(spt_cnt) spt_cnt
> from carbon_table where dt='2017-01-01' and sty='AAAA'
> group by province
> 
> 
> 2-dimension query
> select province,city,sum(play_pv) play_pv ,sum(spt_cnt) spt_cnt
> from carbon_table where dt='2017-01-01' and sty='AAAA'
> group by province,city
> 
> 
> 3-dimension query
> select province,city,isp,sum(play_pv) play_pv ,sum(spt_cnt) spt_cnt
> from carbon_table where dt='2017-01-01' and sty='AAAA'
> group by province,city,isp
> 
> 
> Multi-dimension query
> select sty,isc,status,nw,tv,area,province,city,isp,sum(play_pv)
> play_pv_sum ,sum(spt_cnt) spt_cnt_sum
> from carbon_table where dt='2017-01-01' and sty='AAAA'
> group by sty,isc,status,nw,tv,area,province,city,isp
> 
> 
> DISTINCT on a single column
> select tv, count(distinct user_id) 
> from carbon_table where dt='2017-01-01' and sty='AAAA' and fo like
> '%YYYY%' group by tv
> 
> 
> DISTINCT on multiple columns
> select count(distinct user_id) ,count(distinct mid),count(distinct case
> when sty='AAAA' then mid end)
> from carbon_table where dt='2017-01-01' and sty='AAAA'
> 
> 
> Sort query
> select user_id,sum(play_pv) play_pv_sum
> from carbon_table
> group by user_id
> order by play_pv_sum desc limit 100
> 
> 
> Simple join query
> select b.fo_level1,b.fo_level2,sum(a.play_pv) play_pv_sum From
> carbon_table a
> left join dim_carbon_table b
> on a.fo=b.fo and a.dt = b.dt where a.dt = '2017-01-01' group by
> b.fo_level1,b.fo_level2
> 
> At 2017-03-27 04:10:04, "a" <wwyxg@> wrote:
>>I downloaded the newest source code (master) and compiled it, generating the jar
carbondata_2.11-1.1.0-incubating-SNAPSHOT-shade-hadoop2.7.2.jar.
>>Then I used spark2.1 to test again. The error logs are as follows:
>>
>>
>> Container log :
>>17/03/27 02:27:21 ERROR newflow.DataLoadExecutor: Executor task launch
worker-9 Data Loading failed for table carbon_table
>>java.lang.NullPointerException
>>        at
>> org.apache.carbondata.processing.newflow.DataLoadProcessBuilder.createConfiguration(DataLoadProcessBuilder.java:158)
>>        at
>> org.apache.carbondata.processing.newflow.DataLoadProcessBuilder.build(DataLoadProcessBuilder.java:60)
>>        at
>> org.apache.carbondata.processing.newflow.DataLoadExecutor.execute(DataLoadExecutor.java:43)
>>        at org.apache.carbondata.spark.rdd.NewDataFrameLoaderRDD$$anon$2.<init>(NewCarbonDataLoadRDD.scala:365)
>>        at
>> org.apache.carbondata.spark.rdd.NewDataFrameLoaderRDD.compute(NewCarbonDataLoadRDD.scala:322)
>>        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>>        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>>        at
>> org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>>        at org.apache.spark.scheduler.Task.run(Task.scala:89)
>>        at
>> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
>>        at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>        at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>        at java.lang.Thread.run(Thread.java:745)
>>17/03/27 02:27:21 INFO rdd.NewDataFrameLoaderRDD: DataLoad failure
>>org.apache.carbondata.processing.newflow.exception.CarbonDataLoadingException:
Data Loading failed for table carbon_table
>>        at
>> org.apache.carbondata.processing.newflow.DataLoadExecutor.execute(DataLoadExecutor.java:54)
>>        at org.apache.carbondata.spark.rdd.NewDataFrameLoaderRDD$$anon$2.<init>(NewCarbonDataLoadRDD.scala:365)
>>        at
>> org.apache.carbondata.spark.rdd.NewDataFrameLoaderRDD.compute(NewCarbonDataLoadRDD.scala:322)
>>        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>>        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>>        at
>> org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>>        at org.apache.spark.scheduler.Task.run(Task.scala:89)
>>        at
>> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
>>        at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>        at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>        at java.lang.Thread.run(Thread.java:745)
>>Caused by: java.lang.NullPointerException
>>        at
>> org.apache.carbondata.processing.newflow.DataLoadProcessBuilder.createConfiguration(DataLoadProcessBuilder.java:158)
>>        at
>> org.apache.carbondata.processing.newflow.DataLoadProcessBuilder.build(DataLoadProcessBuilder.java:60)
>>        at
>> org.apache.carbondata.processing.newflow.DataLoadExecutor.execute(DataLoadExecutor.java:43)
>>        ... 10 more
>>17/03/27 02:27:21 ERROR rdd.NewDataFrameLoaderRDD: Executor task launch
worker-9 
>>org.apache.carbondata.processing.newflow.exception.CarbonDataLoadingException:
Data Loading failed for table carbon_table
>>        at
>> org.apache.carbondata.processing.newflow.DataLoadExecutor.execute(DataLoadExecutor.java:54)
>>        at org.apache.carbondata.spark.rdd.NewDataFrameLoaderRDD$$anon$2.<init>(NewCarbonDataLoadRDD.scala:365)
>>        at
>> org.apache.carbondata.spark.rdd.NewDataFrameLoaderRDD.compute(NewCarbonDataLoadRDD.scala:322)
>>        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>>        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>>        at
>> org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>>        at org.apache.spark.scheduler.Task.run(Task.scala:89)
>>        at
>> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
>>        at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>        at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>        at java.lang.Thread.run(Thread.java:745)
>>Caused by: java.lang.NullPointerException
>>        at
>> org.apache.carbondata.processing.newflow.DataLoadProcessBuilder.createConfiguration(DataLoadProcessBuilder.java:158)
>>        at
>> org.apache.carbondata.processing.newflow.DataLoadProcessBuilder.build(DataLoadProcessBuilder.java:60)
>>        at
>> org.apache.carbondata.processing.newflow.DataLoadExecutor.execute(DataLoadExecutor.java:43)
>>        ... 10 more
>>17/03/27 02:27:21 ERROR executor.Executor: Exception in task 0.3 in stage
2.0 (TID 538)
>>org.apache.carbondata.processing.newflow.exception.CarbonDataLoadingException:
Data Loading failed for table carbon_table
>>        at
>> org.apache.carbondata.processing.newflow.DataLoadExecutor.execute(DataLoadExecutor.java:54)
>>        at org.apache.carbondata.spark.rdd.NewDataFrameLoaderRDD$$anon$2.<init>(NewCarbonDataLoadRDD.scala:365)
>>        at
>> org.apache.carbondata.spark.rdd.NewDataFrameLoaderRDD.compute(NewCarbonDataLoadRDD.scala:322)
>>        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>>        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>>        at
>> org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>>        at org.apache.spark.scheduler.Task.run(Task.scala:89)
>>        at
>> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
>>        at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>        at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>        at java.lang.Thread.run(Thread.java:745)
>>Caused by: java.lang.NullPointerException
>>        at
>> org.apache.carbondata.processing.newflow.DataLoadProcessBuilder.createConfiguration(DataLoadProcessBuilder.java:158)
>>        at
>> org.apache.carbondata.processing.newflow.DataLoadProcessBuilder.build(DataLoadProcessBuilder.java:60)
>>        at
>> org.apache.carbondata.processing.newflow.DataLoadExecutor.execute(DataLoadExecutor.java:43)
>>        ... 10 more
>>
>>
>>
>>Spark log:
>>
>>ERROR 27-03 02:27:21,407 - Task 0 in stage 2.0 failed 4 times; aborting
job
>>ERROR 27-03 02:27:21,419 - main load data frame failed
>>org.apache.spark.SparkException: Job aborted due to stage failure: Task 0
in stage 2.0 failed 4 times, most recent failure: Lost task 0.3 in stage 2.0
(TID 538, hd25):
org.apache.carbondata.processing.newflow.exception.CarbonDataLoadingException:
Data Loading failed for table carbon_table
>>        at
>> org.apache.carbondata.processing.newflow.DataLoadExecutor.execute(DataLoadExecutor.java:54)
>>        at org.apache.carbondata.spark.rdd.NewDataFrameLoaderRDD$$anon$2.<init>(NewCarbonDataLoadRDD.scala:365)
>>        at
>> org.apache.carbondata.spark.rdd.NewDataFrameLoaderRDD.compute(NewCarbonDataLoadRDD.scala:322)
>>        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>>        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>>        at
>> org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>>        at org.apache.spark.scheduler.Task.run(Task.scala:89)
>>        at
>> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
>>        at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>        at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>        at java.lang.Thread.run(Thread.java:745)
>>Caused by: java.lang.NullPointerException
>>        at
>> org.apache.carbondata.processing.newflow.DataLoadProcessBuilder.createConfiguration(DataLoadProcessBuilder.java:158)
>>        at
>> org.apache.carbondata.processing.newflow.DataLoadProcessBuilder.build(DataLoadProcessBuilder.java:60)
>>        at
>> org.apache.carbondata.processing.newflow.DataLoadExecutor.execute(DataLoadExecutor.java:43)
>>        ... 10 more
>>
>>
>>Driver stacktrace:
>>        at
>> org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431)
>>        at
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419)
>>        at
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418)
>>        at
>> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>>        at
>> scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>>        at
>> org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1418)
>>        at
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
>>        at
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
>>        at scala.Option.foreach(Option.scala:236)
>>        at
>> org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:799)
>>        at
>> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1640)
>>        at
>> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599)
>>        at
>> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588)
>>        at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
>>        at
>> org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620)
>>        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832)
>>        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1845)
>>        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1858)
>>        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929)
>>        at
>> org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:927)
>>        at
>> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
>>        at
>> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
>>        at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
>>        at org.apache.spark.rdd.RDD.collect(RDD.scala:926)
>>        at
>> org.apache.carbondata.spark.rdd.CarbonDataRDDFactory$.loadDataFrame$1(CarbonDataRDDFactory.scala:665)
>>        at
>> org.apache.carbondata.spark.rdd.CarbonDataRDDFactory$.loadCarbonData(CarbonDataRDDFactory.scala:794)
>>        at
>> org.apache.spark.sql.execution.command.LoadTable.run(carbonTableSchema.scala:579)
>>        at
>> org.apache.spark.sql.execution.command.LoadTableByInsert.run(carbonTableSchema.scala:297)
>>        at
>> org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:58)
>>        at
>> org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:56)
>>        at
>> org.apache.spark.sql.execution.ExecutedCommand.doExecute(commands.scala:70)
>>        at
>> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132)
>>        at
>> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130)
>>        at
>> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
>>        at
>> org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
>>        at
>> org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:55)
>>        at
>> org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:55)
>>        at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:145)
>>        at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:130)
>>        at org.apache.spark.sql.CarbonContext.sql(CarbonContext.scala:139)
>>        at $line23.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:31)
>>        at $line23.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:36)
>>        at $line23.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:38)
>>        at $line23.$read$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:40)
>>        at $line23.$read$$iwC$$iwC$$iwC$$iwC.<init>(<console>:42)
>>        at $line23.$read$$iwC$$iwC$$iwC.<init>(<console>:44)
>>        at $line23.$read$$iwC$$iwC.<init>(<console>:46)
>>        at $line23.$read$$iwC.<init>(<console>:48)
>>        at $line23.$read.<init>(<console>:50)
>>        at $line23.$read$.<init>(<console>:54)
>>        at $line23.$read$.<clinit>(<console>)
>>        at $line23.$eval$.<init>(<console>:7)
>>        at $line23.$eval$.<clinit>(<console>)
>>        at $line23.$eval.$print(<console>)
>>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>        at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>        at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>        at java.lang.reflect.Method.invoke(Method.java:606)
>>        at
>> org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
>>        at
>> org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346)
>>        at
>> org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
>>        at
>> org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
>>        at
>> org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
>>        at
>> org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
>>        at
>> org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
>>        at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
>>        at
>> org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
>>        at
>> org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
>>        at
>> org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670)
>>        at
>> org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997)
>>        at
>> org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
>>        at
>> org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
>>        at
>> scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
>>        at
>> org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
>>        at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
>>        at org.apache.spark.repl.Main$.main(Main.scala:31)
>>        at org.apache.spark.repl.Main.main(Main.scala)
>>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>        at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>        at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>        at java.lang.reflect.Method.invoke(Method.java:606)
>>        at
>> org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
>>        at
>> org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
>>        at
>> org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
>>        at
>> org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
>>        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>Caused by:
org.apache.carbondata.processing.newflow.exception.CarbonDataLoadingException:
Data Loading failed for table carbon_table
>>        at
>> org.apache.carbondata.processing.newflow.DataLoadExecutor.execute(DataLoadExecutor.java:54)
>>        at org.apache.carbondata.spark.rdd.NewDataFrameLoaderRDD$$anon$2.<init>(NewCarbonDataLoadRDD.scala:365)
>>        at
>> org.apache.carbondata.spark.rdd.NewDataFrameLoaderRDD.compute(NewCarbonDataLoadRDD.scala:322)
>>        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>>        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>>        at
>> org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>>        at org.apache.spark.scheduler.Task.run(Task.scala:89)
>>        at
>> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
>>        at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>        at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>        at java.lang.Thread.run(Thread.java:745)
>>Caused by: java.lang.NullPointerException
>>        at
>> org.apache.carbondata.processing.newflow.DataLoadProcessBuilder.createConfiguration(DataLoadProcessBuilder.java:158)
>>        at
>> org.apache.carbondata.processing.newflow.DataLoadProcessBuilder.build(DataLoadProcessBuilder.java:60)
>>        at
>> org.apache.carbondata.processing.newflow.DataLoadExecutor.execute(DataLoadExecutor.java:43)
>>        ... 10 more
>>ERROR 27-03 02:27:21,422 - main 
>>org.apache.spark.SparkException: Job aborted due to stage failure: Task 0
in stage 2.0 failed 4 times, most recent failure: Lost task 0.3 in stage 2.0
(TID 538, hd25):
org.apache.carbondata.processing.newflow.exception.CarbonDataLoadingException:
Data Loading failed for table carbon_table
>>        at
>> org.apache.carbondata.processing.newflow.DataLoadExecutor.execute(DataLoadExecutor.java:54)
>>        at org.apache.carbondata.spark.rdd.NewDataFrameLoaderRDD$$anon$2.<init>(NewCarbonDataLoadRDD.scala:365)
>>        at
>> org.apache.carbondata.spark.rdd.NewDataFrameLoaderRDD.compute(NewCarbonDataLoadRDD.scala:322)
>>        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>>        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>>        at
>> org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>>        at org.apache.spark.scheduler.Task.run(Task.scala:89)
>>        at
>> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
>>        at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>        at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>        at java.lang.Thread.run(Thread.java:745)
>>Caused by: java.lang.NullPointerException
>>        at
>> org.apache.carbondata.processing.newflow.DataLoadProcessBuilder.createConfiguration(DataLoadProcessBuilder.java:158)
>>        at
>> org.apache.carbondata.processing.newflow.DataLoadProcessBuilder.build(DataLoadProcessBuilder.java:60)
>>        at
>> org.apache.carbondata.processing.newflow.DataLoadExecutor.execute(DataLoadExecutor.java:43)
>>        ... 10 more
>>
>>
>>Driver stacktrace:
>>        at
>> org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431)
>>        at
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419)
>>        at
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418)
>>        at
>> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>>        at
>> scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>>        at
>> org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1418)
>>        at
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
>>        at
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
>>        at scala.Option.foreach(Option.scala:236)
>>        at
>> org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:799)
>>        at
>> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1640)
>>        at
>> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599)
>>        at
>> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588)
>>        at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
>>        at
>> org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620)
>>        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832)
>>        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1845)
>>        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1858)
>>        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929)
>>        at
>> org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:927)
>>        at
>> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
>>        at
>> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
>>        at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
>>        at org.apache.spark.rdd.RDD.collect(RDD.scala:926)
>>        at
>> org.apache.carbondata.spark.rdd.CarbonDataRDDFactory$.loadDataFrame$1(CarbonDataRDDFactory.scala:665)
>>        at
>> org.apache.carbondata.spark.rdd.CarbonDataRDDFactory$.loadCarbonData(CarbonDataRDDFactory.scala:794)
>>        at
>> org.apache.spark.sql.execution.command.LoadTable.run(carbonTableSchema.scala:579)
>>        at
>> org.apache.spark.sql.execution.command.LoadTableByInsert.run(carbonTableSchema.scala:297)
>>        at
>> org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:58)
>>        at
>> org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:56)
>>        at
>> org.apache.spark.sql.execution.ExecutedCommand.doExecute(commands.scala:70)
>>        at
>> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132)
>>        at
>> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130)
>>        at
>> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
>>        at
>> org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
>>        at
>> org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:55)
>>        at
>> org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:55)
>>        at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:145)
>>        at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:130)
>>        at org.apache.spark.sql.CarbonContext.sql(CarbonContext.scala:139)
>>        at $line23.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:31)
>>        at $line23.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:36)
>>        at $line23.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:38)
>>        at $line23.$read$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:40)
>>        at $line23.$read$$iwC$$iwC$$iwC$$iwC.<init>(<console>:42)
>>        at $line23.$read$$iwC$$iwC$$iwC.<init>(<console>:44)
>>        at $line23.$read$$iwC$$iwC.<init>(<console>:46)
>>        at $line23.$read$$iwC.<init>(<console>:48)
>>        at $line23.$read.<init>(<console>:50)
>>        at $line23.$read$.<init>(<console>:54)
>>        at $line23.$read$.<clinit>(<console>)
>>        at $line23.$eval$.<init>(<console>:7)
>>        at $line23.$eval$.<clinit>(<console>)
>>        at $line23.$eval.$print(<console>)
>>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>        at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>        at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>        at java.lang.reflect.Method.invoke(Method.java:606)
>>        at
>> org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
>>        at
>> org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346)
>>        at
>> org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
>>        at
>> org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
>>        at
>> org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
>>        at
>> org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
>>        at
>> org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
>>        at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
>>        at
>> org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
>>        at
>> org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
>>        at
>> org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670)
>>        at
>> org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997)
>>        at
>> org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
>>        at
>> org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
>>        at
>> scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
>>        at
>> org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
>>        at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
>>        at org.apache.spark.repl.Main$.main(Main.scala:31)
>>        at org.apache.spark.repl.Main.main(Main.scala)
>>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>        at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>        at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>        at java.lang.reflect.Method.invoke(Method.java:606)
>>        at
>> org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
>>        at
>> org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
>>        at
>> org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
>>        at
>> org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
>>        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>Caused by:
org.apache.carbondata.processing.newflow.exception.CarbonDataLoadingException:
Data Loading failed for table carbon_table
>>        at
>> org.apache.carbondata.processing.newflow.DataLoadExecutor.execute(DataLoadExecutor.java:54)
>>        at org.apache.carbondata.spark.rdd.NewDataFrameLoaderRDD$$anon$2.<init>(NewCarbonDataLoadRDD.scala:365)
>>        at
>> org.apache.carbondata.spark.rdd.NewDataFrameLoaderRDD.compute(NewCarbonDataLoadRDD.scala:322)
>>        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>>        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>>        at
>> org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>>        at org.apache.spark.scheduler.Task.run(Task.scala:89)
>>        at
>> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
>>        at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>        at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>        at java.lang.Thread.run(Thread.java:745)
>>Caused by: java.lang.NullPointerException
>>        at
>> org.apache.carbondata.processing.newflow.DataLoadProcessBuilder.createConfiguration(DataLoadProcessBuilder.java:158)
>>        at
>> org.apache.carbondata.processing.newflow.DataLoadProcessBuilder.build(DataLoadProcessBuilder.java:60)
>>        at
>> org.apache.carbondata.processing.newflow.DataLoadExecutor.execute(DataLoadExecutor.java:43)
>>        ... 10 more
>>AUDIT 27-03 02:27:21,453 - [hd21][storm][Thread-1]Data load is failed for
default.carbon_table
>>ERROR 27-03 02:27:21,453 - main 
>>java.lang.Exception: DataLoad failure: Data Loading failed for table
carbon_table
>>        at
>> org.apache.carbondata.spark.rdd.CarbonDataRDDFactory$.loadCarbonData(CarbonDataRDDFactory.scala:937)
>>        at
>> org.apache.spark.sql.execution.command.LoadTable.run(carbonTableSchema.scala:579)
>>        at
>> org.apache.spark.sql.execution.command.LoadTableByInsert.run(carbonTableSchema.scala:297)
>>        at
>> org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:58)
>>        at
>> org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:56)
>>        at
>> org.apache.spark.sql.execution.ExecutedCommand.doExecute(commands.scala:70)
>>        at
>> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132)
>>        at
>> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130)
>>        at
>> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
>>        at
>> org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
>>        at
>> org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:55)
>>        at
>> org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:55)
>>        at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:145)
>>        at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:130)
>>        at org.apache.spark.sql.CarbonContext.sql(CarbonContext.scala:139)
>>        at $line23.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:31)
>>        at $line23.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:36)
>>        at $line23.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:38)
>>        at $line23.$read$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:40)
>>        at $line23.$read$$iwC$$iwC$$iwC$$iwC.<init>(<console>:42)
>>        at $line23.$read$$iwC$$iwC$$iwC.<init>(<console>:44)
>>        at $line23.$read$$iwC$$iwC.<init>(<console>:46)
>>        at $line23.$read$$iwC.<init>(<console>:48)
>>        at $line23.$read.<init>(<console>:50)
>>        at $line23.$read$.<init>(<console>:54)
>>        at $line23.$read$.<clinit>(<console>)
>>        at $line23.$eval$.<init>(<console>:7)
>>        at $line23.$eval$.<clinit>(<console>)
>>        at $line23.$eval.$print(<console>)
>>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>        at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>        at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>        at java.lang.reflect.Method.invoke(Method.java:606)
>>        at
>> org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
>>        at
>> org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346)
>>        at
>> org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
>>        at
>> org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
>>        at
>> org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
>>        at
>> org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
>>        at
>> org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
>>        at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
>>        at
>> org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
>>        at
>> org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
>>        at
>> org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670)
>>        at
>> org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997)
>>        at
>> org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
>>        at
>> org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
>>        at
>> scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
>>        at
>> org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
>>        at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
>>        at org.apache.spark.repl.Main$.main(Main.scala:31)
>>        at org.apache.spark.repl.Main.main(Main.scala)
>>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>        at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>        at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>        at java.lang.reflect.Method.invoke(Method.java:606)
>>        at
>> org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
>>        at
>> org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
>>        at
>> org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
>>        at
>> org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
>>        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>AUDIT 27-03 02:27:21,454 - [hd21][storm][Thread-1]Dataload failure for
default.carbon_table. Please check the logs
>>java.lang.Exception: DataLoad failure: Data Loading failed for table
carbon_table
>>        at
>> org.apache.carbondata.spark.rdd.CarbonDataRDDFactory$.loadCarbonData(CarbonDataRDDFactory.scala:937)
>>        at
>> org.apache.spark.sql.execution.command.LoadTable.run(carbonTableSchema.scala:579)
>>        at
>> org.apache.spark.sql.execution.command.LoadTableByInsert.run(carbonTableSchema.scala:297)
>>        at
>> org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:58)
>>        at
>> org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:56)
>>        at
>> org.apache.spark.sql.execution.ExecutedCommand.doExecute(commands.scala:70)
>>        at
>> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132)
>>        at
>> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130)
>>        at
>> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
>>        at
>> org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
>>        at
>> org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:55)
>>        at
>> org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:55)
>>        at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:145)
>>        at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:130)
>>        at org.apache.spark.sql.CarbonContext.sql(CarbonContext.scala:139)
>>        at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:31)
>>        at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:36)
>>        at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:38)
>>        at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:40)
>>        at $iwC$$iwC$$iwC$$iwC.<init>(<console>:42)
>>        at $iwC$$iwC$$iwC.<init>(<console>:44)
>>        at $iwC$$iwC.<init>(<console>:46)
>>        at $iwC.<init>(<console>:48)
>>        at <init>(<console>:50)
>>        at .<init>(<console>:54)
>>        at .<clinit>(<console>)
>>        at .<init>(<console>:7)
>>        at .<clinit>(<console>)
>>        at $print(<console>)
>>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>        at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>        at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>        at java.lang.reflect.Method.invoke(Method.java:606)
>>        at
>> org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
>>        at
>> org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346)
>>        at
>> org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
>>        at
>> org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
>>        at
>> org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
>>        at
>> org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
>>        at
>> org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
>>        at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
>>        at
>> org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
>>        at
>> org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
>>        at
>> org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670)
>>        at
>> org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997)
>>        at
>> org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
>>        at
>> org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
>>        at
>> scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
>>        at
>> org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
>>        at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
>>        at org.apache.spark.repl.Main$.main(Main.scala:31)
>>        at org.apache.spark.repl.Main.main(Main.scala)
>>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>        at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>        at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>        at java.lang.reflect.Method.invoke(Method.java:606)
>>        at
>> org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
>>        at
>> org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
>>        at
>> org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
>>        at
>> org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
>>        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>
>>At 2017-03-27 00:42:28, "a" <wwyxg@> wrote:
>>
>> 
>>
>> Container log : error executor.CoarseGrainedExecutorBackend: RECEIVED
>> SIGNAL 15: SIGTERM.
>> spark log: 17/03/26 23:40:30 ERROR YarnScheduler: Lost executor 2 on
>> hd25: Container killed by YARN for exceeding memory limits. 49.0 GB of 49
>> GB physical memory used. Consider boosting
>> spark.yarn.executor.memoryOverhead.
>>The test sql
>>
>>
>>
>>
>>
>>
>>
>>At 2017-03-26 23:34:36, "a" <wwyxg@> wrote:
>>>
>>>
>>>I have set the parameters as follows:
>>>1. fs.hdfs.impl.disable.cache=true
>>>2. dfs.socket.timeout=1800000  (for the exception "Caused by: java.io.IOException:
Filesystem closed")
>>>3. dfs.datanode.socket.write.timeout=3600000
>>>4. set the carbondata property enable.unsafe.sort=true
>>>5. remove the BUCKETCOLUMNS property from the create table sql
>>>6. set the spark job parameter executor-memory=48G (from 20G to 48G)
>>>
>>>
>>>But it still failed; the error is
"executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL 15: SIGTERM."
>>>
>>>
>>>Then I tried to insert 40000 0000 (400 million) records into the carbondata table, and it
succeeded.
>>>
>>>
>>>How can I insert 20 0000 0000 (2 billion) records into carbondata?
>>>Should I set executor-memory big enough? Or should I generate the csv
file from the hive table first, then load the csv file into the carbon table?
>>>Can anybody give me some help?
>>>
>>>
>>>Regards
>>>fish
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>At 2017-03-26 00:34:18, "a" <wwyxg@> wrote:
>>>>Thank you Ravindra!
>>>>Version:
>>>>My carbondata version is 1.0, spark version is 1.6.3, hadoop version is
2.7.1, hive version is 1.1.0.
>>>>One of the container logs:
>>>>17/03/25 22:07:09 ERROR executor.CoarseGrainedExecutorBackend: RECEIVED
SIGNAL 15: SIGTERM
>>>>17/03/25 22:07:09 INFO storage.DiskBlockManager: Shutdown hook called
>>>>17/03/25 22:07:09 INFO util.ShutdownHookManager: Shutdown hook called
>>>>17/03/25 22:07:09 INFO util.ShutdownHookManager: Deleting directory
/data1/hadoop/hd_space/tmp/nm-local-dir/usercache/storm/appcache/application_1490340325187_0042/spark-84b305f9-af7b-4f58-a809-700345a84109
>>>>17/03/25 22:07:10 ERROR impl.ParallelReadMergeSorterImpl:
pool-23-thread-2 
>>>>java.io.IOException: Error reading file:
hdfs://xxxx_table_tmp/dt=2017-01-01/pt=ios/000006_0
>>>>        at
>>>> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:1046)
>>>>        at
>>>> org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger$OriginalReaderPair.next(OrcRawRecordMerger.java:263)
>>>>        at
>>>> org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.next(OrcRawRecordMerger.java:547)
>>>>        at
>>>> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$1.next(OrcInputFormat.java:1234)
>>>>        at
>>>> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$1.next(OrcInputFormat.java:1218)
>>>>        at
>>>> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$NullKeyRecordReader.next(OrcInputFormat.java:1150)
>>>>        at
>>>> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$NullKeyRecordReader.next(OrcInputFormat.java:1136)
>>>>        at
>>>> org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:249)
>>>>        at
>>>> org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:211)
>>>>        at
>>>> org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
>>>>        at
>>>> org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
>>>>        at
>>>> scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
>>>>        at
>>>> scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
>>>>        at
>>>> scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
>>>>        at
>>>> org.apache.carbondata.spark.rdd.NewRddIterator.hasNext(NewCarbonDataLoadRDD.scala:412)
>>>>        at
>>>> org.apache.carbondata.processing.newflow.steps.InputProcessorStepImpl$InputProcessorIterator.internalHasNext(InputProcessorStepImpl.java:163)
>>>>        at
>>>> org.apache.carbondata.processing.newflow.steps.InputProcessorStepImpl$InputProcessorIterator.getBatch(InputProcessorStepImpl.java:221)
>>>>        at
>>>> org.apache.carbondata.processing.newflow.steps.InputProcessorStepImpl$InputProcessorIterator.next(InputProcessorStepImpl.java:183)
>>>>        at
>>>> org.apache.carbondata.processing.newflow.steps.InputProcessorStepImpl$InputProcessorIterator.next(InputProcessorStepImpl.java:117)
>>>>        at
>>>> org.apache.carbondata.processing.newflow.steps.DataConverterProcessorStepImpl$1.next(DataConverterProcessorStepImpl.java:80)
>>>>        at
>>>> org.apache.carbondata.processing.newflow.steps.DataConverterProcessorStepImpl$1.next(DataConverterProcessorStepImpl.java:73)
>>>>        at
>>>> org.apache.carbondata.processing.newflow.sort.impl.ParallelReadMergeSorterImpl$SortIteratorThread.call(ParallelReadMergeSorterImpl.java:196)
>>>>        at
>>>> org.apache.carbondata.processing.newflow.sort.impl.ParallelReadMergeSorterImpl$SortIteratorThread.call(ParallelReadMergeSorterImpl.java:177)
>>>>        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>>        at
>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>        at
>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>        at java.lang.Thread.run(Thread.java:745)
>>>>Caused by: java.io.IOException: Filesystem closed
>>>>        at
>>>> org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:808)
>>>>        at
>>>> org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:868)
>>>>        at
>>>> org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:934)
>>>>        at java.io.DataInputStream.readFully(DataInputStream.java:195)
>>>>        at
>>>> org.apache.hadoop.hive.ql.io.orc.MetadataReader.readStripeFooter(MetadataReader.java:112)
>>>>        at
>>>> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.readStripeFooter(RecordReaderImpl.java:228)
>>>>        at
>>>> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.beginReadStripe(RecordReaderImpl.java:805)
>>>>        at
>>>> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.readStripe(RecordReaderImpl.java:776)
>>>>        at
>>>> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.advanceStripe(RecordReaderImpl.java:986)
>>>>        at
>>>> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.advanceToNextRow(RecordReaderImpl.java:1019)
>>>>        at
>>>> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:1042)
>>>>        ... 26 more
>>>>I will try to set enable.unsafe.sort=true, remove the BUCKETCOLUMNS
property, and try again.
>>>>
>>>>
>>>>At 2017-03-25 20:55:03, "Ravindra Pesala" <ravi.pesala@> wrote:
>>>>>Hi,
>>>>>
>>>>>Carbondata launches one job per node to sort the data at node level and
>>>>>avoid shuffling. Internally it uses threads for parallel loading. Please
>>>>>use the carbon.number.of.cores.while.loading property in the carbon.properties file
>>>>>and set the number of cores it should use per machine while loading.
>>>>>Carbondata sorts the data at each node level to maintain the Btree for
>>>>>each node per segment. It improves the query performance by filtering
>>>>>faster if we have a Btree at node level instead of at each block level.
>>>>>
>>>>>1. Which version of Carbondata are you using?
>>>>>2. There are memory issues in the Carbondata-1.0 version that are fixed in the
>>>>>current master.
>>>>>3. You can also improve the performance by enabling enable.unsafe.sort=true in the
>>>>>carbon.properties file. But it is not supported if bucketing of columns is
>>>>>enabled. We are planning to support unsafe sort load for bucketing as well in the
>>>>>next version.
>>>>>
>>>>>Please send the executor log to know about the error you are facing.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>Regards,
>>>>>Ravindra
>>>>>
>>>>>On 25 March 2017 at 16:18, wwyxg@ <wwyxg@> wrote:
>>>>>
>>>>>> Hello!
>>>>>>
>>>>>> *0. The failure*
>>>>>> When I insert into the carbon table, I encounter a failure. The failure is as
>>>>>> follows:
>>>>>> Job aborted due to stage failure: Task 0 in stage 2.0 failed 4 times,
>>>>>> most
>>>>>> recent failure: Lost task 0.3 in stage 2.0 (TID 1007, hd26):
>>>>>> ExecutorLostFailure (executor 1 exited caused by one of the running
>>>>>> tasks)
>>>>>> Reason: Slave lost
>>>>>>
>>>>>> Job aborted due to stage failure: Task 0 in stage 2.0 failed 4 times,
>>>>>> most recent failure: Lost task 0.3 in stage 2.0 (TID 1007, hd26):
>>>>>> ExecutorLostFailure (executor 1 exited caused by one of the running
>>>>>> tasks) Reason: Slave lost
>>>>>> Driver stacktrace:
>>>>>>
>>>>>> the stage:
>>>>>>
>>>>>> *Step:*
>>>>>> *1. Start spark-shell*
>>>>>> ./bin/spark-shell \
>>>>>> --master yarn-client \
>>>>>> --num-executors 5 \  (I tried setting this parameter in the range 10 to
>>>>>> 20, but the second job still has only 5 tasks)
>>>>>> --executor-cores 5 \
>>>>>> --executor-memory 20G \
>>>>>> --driver-memory 8G \
>>>>>> --queue root.default \
>>>>>> --jars /xxx.jar
>>>>>>
>>>>>> //spark-default.conf spark.default.parallelism=320
>>>>>>
>>>>>> import org.apache.spark.sql.CarbonContext
>>>>>> val cc = new CarbonContext(sc, "hdfs://xxxx/carbonData/CarbonStore")
>>>>>>
>>>>>> *2. Create table*
>>>>>> cc.sql("CREATE TABLE IF NOT EXISTS xxxx_table (dt String,pt
>>>>>> String,lst
>>>>>> String,plat String,sty String,is_pay String,is_vip String,is_mpack
>>>>>> String,scene String,status String,nw String,isc String,area
>>>>>> String,spttag
>>>>>> String,province String,isp String,city String,tv String,hwm
>>>>>> String,pip
>>>>>> String,fo String,sh String,mid String,user_id String,play_pv
>>>>>> Int,spt_cnt
>>>>>> Int,prg_spt_cnt Int) row format delimited fields terminated by '|'
>>>>>> STORED
>>>>>> BY 'carbondata' TBLPROPERTIES ('DICTIONARY_EXCLUDE'='pip,sh,
>>>>>> mid,fo,user_id','DICTIONARY_INCLUDE'='dt,pt,lst,plat,sty,
>>>>>> is_pay,is_vip,is_mpack,scene,status,nw,isc,area,spttag,
>>>>>> province,isp,city,tv,hwm','NO_INVERTED_INDEX'='lst,plat,hwm,
>>>>>> pip,sh,mid','BUCKETNUMBER'='10','BUCKETCOLUMNS'='fo')")
>>>>>>
>>>>>> //note: the "fo" column is set as BUCKETCOLUMNS in order to join another table
>>>>>> //the column's distinct values are as follows:
>>>>>>
>>>>>>
>>>>>> *3. Insert into table* (xxxx_table_tmp is a hive external orc table with
>>>>>> 20 0000 0000 (2 billion) records)
>>>>>> cc.sql("insert into xxxx_table select dt,pt,lst,plat,sty,is_pay,is_
>>>>>> vip,is_mpack,scene,status,nw,isc,area,spttag,province,isp,
>>>>>> city,tv,hwm,pip,fo,sh,mid,user_id ,play_pv,spt_cnt,prg_spt_cnt from
>>>>>> xxxx_table_tmp where dt='2017-01-01'")
>>>>>>
>>>>>> *4. Spark split the sql into two jobs; the first finished successfully, but
>>>>>> the second failed:*
>>>>>>
>>>>>>
>>>>>> *5. The second job stage:*
>>>>>>
>>>>>>
>>>>>>
>>>>>> *Question:*
>>>>>> 1. Why does the second job have only five tasks, while the first job has 994
>>>>>> tasks? (note: my hadoop cluster has 5 datanodes)
>>>>>>       I guess this caused the failure.
>>>>>> 2. In the sources I found DataLoadPartitionCoalescer.class; does it mean
>>>>>> that "one datanode has only one partition, so there is only one task on each
>>>>>> datanode"?
>>>>>> 3. In the ExampleUtils class, "carbon.table.split.partition.enable" is set
>>>>>> as follows, but I cannot find "carbon.table.split.partition.enable" in
>>>>>> other parts of the project.
>>>>>>      I set "carbon.table.split.partition.enable" to true, but the second
>>>>>> job still has only five tasks. How should this property be used?
>>>>>>      ExampleUtils :
>>>>>>     // whether use table split partition
>>>>>>     // true -> use table split partition, support multiple partition
>>>>>> loading
>>>>>>     // false -> use node split partition, support data load by host
>>>>>> partition
>>>>>>    
>>>>>> CarbonProperties.getInstance().addProperty("carbon.table.split.partition.enable",
>>>>>> "false")
>>>>>> 4. Inserting into the carbon table takes 3 hours but eventually fails. How can
>>>>>> I speed it up?
>>>>>> 5. In the spark-shell, I tried setting the num-executors parameter in the range 10 to
>>>>>> 20, but the second job still has only 5 tasks.
>>>>>>      Is the other parameter, executor-memory = 20G, enough?
>>>>>>
>>>>>> I need your help! Thank you very much!
>>>>>>
>>>>>> wwyxg@
>>>>>>
>>>>>> ------------------------------
>>>>>> wwyxg@
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>-- 
>>>>>Thanks & Regards,
>>>>>Ravi




