Hi all, I have raised a JIRA issue: https://issues.apache.org/jira/browse/CARBONDATA-585
On executing the following query:

    LOAD DATA inpath 'hdfs://localhost:54310/csv/test.csv' INTO table employee
    options('DELIMITER'=',', 'FILEHEADER'='id, firstname');

The table schema is as follows:

    +------------+------------+----------+
    | col_name   | data_type  | comment  |
    +------------+------------+----------+
    | id         | bigint     |          |
    | firstname  | string     |          |
    +------------+------------+----------+

The load succeeds at times, but we also often end up with the following error: "Dictionary file is locked for updation".
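For anyone trying to reproduce this, here is a rough sketch of how such a load can be driven from the spark-shell with CarbonContext. The store location, the CREATE TABLE statement, and the use of the shell itself are assumptions for illustration only; the LOAD DATA statement is the exact one shown above.

    // Sketch only: the Carbon store location below is an assumed placeholder
    import org.apache.spark.sql.CarbonContext

    val cc = new CarbonContext(sc, "hdfs://localhost:54310/carbon/store")

    // Assumed table definition matching the schema shown above
    cc.sql("CREATE TABLE IF NOT EXISTS employee (id BIGINT, firstname STRING) STORED BY 'carbondata'")

    // The load that intermittently fails with "Dictionary file is locked for updation"
    cc.sql("LOAD DATA INPATH 'hdfs://localhost:54310/csv/test.csv' INTO TABLE employee " +
      "OPTIONS('DELIMITER'=',', 'FILEHEADER'='id, firstname')")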
The logs and driver stack trace from a failed load are below:

    AUDIT 02-01 18:17:07,009 - [knoldus][pallavi][Thread-110]Dataload failure for default.employee. Please check the logs
    INFO  02-01 18:17:07,020 - pool-30-thread-1 Successfully deleted the lock file /tmp/default/employee/meta.lock
    INFO  02-01 18:17:07,022 - Table MetaData Unlocked Successfully after data load
    ERROR 02-01 18:17:07,022 - Error executing query, currentState RUNNING,
    org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 2.0 failed 4 times, most recent failure: Lost task 0.3 in stage 2.0 (TID 5, 192.168.2.188): java.lang.RuntimeException: Dictionary file firstname is locked for updation. Please try after some time
        at scala.sys.package$.error(package.scala:27)
        at org.apache.carbondata.spark.rdd.CarbonGlobalDictionaryGenerateRDD$$anon$1.<init>(CarbonGlobalDictionaryRDD.scala:364)
        at org.apache.carbondata.spark.rdd.CarbonGlobalDictionaryGenerateRDD.compute(CarbonGlobalDictionaryRDD.scala:302)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
        at org.apache.spark.scheduler.Task.run(Task.scala:89)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

    Driver stacktrace:
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1418)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
        at scala.Option.foreach(Option.scala:236)
        at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:799)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1640)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588)
        at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
        at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1845)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1858)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929)
        at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:927)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
        at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
        at org.apache.spark.rdd.RDD.collect(RDD.scala:926)
        at org.apache.carbondata.spark.util.GlobalDictionaryUtil$.generateGlobalDictionary(GlobalDictionaryUtil.scala:769)

On Tue, Dec 27, 2016 at 8:10 PM, Liang Chen <chenliang6...@gmail.com> wrote:

> Hi
>
> Updated, thanks for pointing out the issue.
>
> Regards
> Liang
>
> 李寅威 wrote
> > Thanks QiangCai, the problem is solved.
> >
> > So maybe it's better to correct the document at
> > https://cwiki.apache.org/confluence/display/CARBONDATA/Cluster+deployment+guide
> > and change the value of spark.executor.extraJavaOptions
> >
> > from
> > -Dcarbon.properties.filepath=carbon.properties
> >
> > to
> > -Dcarbon.properties.filepath=<YOUR_SPARK_HOME_PATH>/conf/carbon.properties
> >
> > ------------------ Original ------------------
> > From: "QiangCai" <qiangcai@...>
> > Date: Tue, Dec 27, 2016 05:40 PM
> > To: "dev" <dev@.apache...>
> > Subject: Re: Re: Dictionary file is locked for updation
> >
> > Please correct the path of the carbon.properties file:
> >
> > spark.executor.extraJavaOptions
> > -Dcarbon.properties.filepath=carbon.properties

--
Regards | Pallavi Singh
Software Consultant
Knoldus Software LLP
pallavi.si...@knoldus.in
+91-9911235949
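PS: For anyone else hitting this, the correction discussed in the quoted thread amounts to pointing the executors at the absolute path of carbon.properties. As a sketch, the entries in conf/spark-defaults.conf would look like the lines below; the /opt/spark-1.6.2 path is only a placeholder for your own SPARK_HOME, and the driver-side line is an assumption in case your setup also reads carbon.properties on the driver.

    spark.executor.extraJavaOptions  -Dcarbon.properties.filepath=/opt/spark-1.6.2/conf/carbon.properties
    spark.driver.extraJavaOptions    -Dcarbon.properties.filepath=/opt/spark-1.6.2/conf/carbon.properties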