dev

Thanks

yixu2001

From: Liang Chen
Date: 2018-03-23 16:36
To: dev
Subject: Re: Re: Getting [Problem in loading segment blocks] error after doing multi update operations

Hi

Already arranged to fix this issue; will raise the pull request asap. Thanks for your feedback.

Regards
Liang

yixu2001 wrote
> dev
> This issue has caused great trouble for our production. I would appreciate
> it if you could let me know whether there is any plan to fix it.
>
>
> yixu2001
>
> From: BabuLal
> Date: 2018-03-23 00:20
> To: dev
> Subject: Re: Getting [Problem in loading segment blocks] error after doing multi update operations
>
> Hi all,
> I am able to reproduce the same exception in my cluster (the trace is listed below).
> ------
> scala> carbon.sql("select count(*) from public.c_compact4").show
> 2018-03-22 20:40:33,105 | WARN | main | main spark.sql.sources.options.keys expected, but read nothing | org.apache.carbondata.common.logging.impl.StandardLogService.logWarnMessage(StandardLogService.java:168)
> org.apache.spark.sql.catalyst.errors.package$TreeNodeException: execute, tree:
> Exchange SinglePartition
> +- *HashAggregate(keys=[], functions=[partial_count(1)], output=[count#1443L])
>    +- *BatchedScan CarbonDatasourceHadoopRelation [ Database name :public, Table name :c_compact4, Schema :Some(StructType(StructField(id,StringType,true), StructField(qqnum,StringType,true), StructField(nick,StringType,true), StructField(age,StringType,true), StructField(gender,StringType,true), StructField(auth,StringType,true), StructField(qunnum,StringType,true), StructField(mvcc,StringType,true))) ] public.c_compact4[]
>   at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:56)
>   at org.apache.spark.sql.execution.exchange.ShuffleExchange.doExecute(ShuffleExchange.scala:112)
>   at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
>   at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
>   at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135)
>   at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>   at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132)
>   at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113)
>   at org.apache.spark.sql.execution.InputAdapter.inputRDDs(WholeStageCodegenExec.scala:235)
>   at org.apache.spark.sql.execution.aggregate.HashAggregateExec.inputRDDs(HashAggregateExec.scala:141)
>   at org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:372)
>   at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
>   at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
>   at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135)
>   at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>   at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132)
>   at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113)
>   at org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:225)
>   at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:308)
>   at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:113)
>   at org.apache.spark.sql.Dataset$$anonfun$org$apache$spark$sql$Dataset$$execute$1$1.apply(Dataset.scala:2386)
>   at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
>   at org.apache.spark.sql.Dataset.withNewExecutionId(Dataset.scala:2788)
>   at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$execute$1(Dataset.scala:2385)
>   at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collect(Dataset.scala:2392)
>   at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:2128)
>   at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:2127)
>   at org.apache.spark.sql.Dataset.withTypedCallback(Dataset.scala:2818)
>   at org.apache.spark.sql.Dataset.head(Dataset.scala:2127)
>   at org.apache.spark.sql.Dataset.take(Dataset.scala:2342)
>   at org.apache.spark.sql.Dataset.showString(Dataset.scala:248)
>   at org.apache.spark.sql.Dataset.show(Dataset.scala:638)
>   at org.apache.spark.sql.Dataset.show(Dataset.scala:597)
>   at org.apache.spark.sql.Dataset.show(Dataset.scala:606)
>   ... 48 elided
> Caused by: java.io.IOException: Problem in loading segment blocks.
>   at org.apache.carbondata.core.indexstore.BlockletDataMapIndexStore.getAll(BlockletDataMapIndexStore.java:153)
>   at org.apache.carbondata.core.indexstore.blockletindex.BlockletDataMapFactory.getDataMaps(BlockletDataMapFactory.java:76)
>   at org.apache.carbondata.core.datamap.TableDataMap.prune(TableDataMap.java:72)
>   at org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getDataBlocksOfSegment(CarbonTableInputFormat.java:739)
>   at org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getSplits(CarbonTableInputFormat.java:666)
>   at org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getSplits(CarbonTableInputFormat.java:426)
>   at org.apache.carbondata.spark.rdd.CarbonScanRDD.getPartitions(CarbonScanRDD.scala:107)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
>   at scala.Option.getOrElse(Option.scala:121)
>   at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
>   at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
>   at scala.Option.getOrElse(Option.scala:121)
>   at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
>   at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
>   at scala.Option.getOrElse(Option.scala:121)
>   at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
>   at org.apache.spark.ShuffleDependency.<init>(Dependency.scala:91)
>   at org.apache.spark.sql.execution.exchange.ShuffleExchange$.prepareShuffleDependency(ShuffleExchange.scala:273)
>   at org.apache.spark.sql.execution.exchange.ShuffleExchange.prepareShuffleDependency(ShuffleExchange.scala:84)
>   at org.apache.spark.sql.execution.exchange.ShuffleExchange$$anonfun$doExecute$1.apply(ShuffleExchange.scala:121)
>   at org.apache.spark.sql.execution.exchange.ShuffleExchange$$anonfun$doExecute$1.apply(ShuffleExchange.scala:112)
>   at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:52)
>   ... 81 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
>   at org.apache.carbondata.core.datastore.filesystem.AbstractDFSCarbonFile.getLocations(AbstractDFSCarbonFile.java:509)
>   at org.apache.carbondata.core.indexstore.BlockletDataMapIndexStore.getAll(BlockletDataMapIndexStore.java:142)
>
> ---------------- Store location ----------------
> linux-49:/opt/babu # hadoop fs -ls /user/hive/warehouse/carbon.store/public/c_compact4/Fact/Part0/Segment_0/*.deletedelta
> -rw-rw-r--+  3 hdfs hive 177216 2018-03-22 18:20 /user/hive/warehouse/carbon.store/public/c_compact4/Fact/Part0/Segment_0/part-0-0_batchno0-0-1521723019528.deletedelta
> -rw-r--r--   3 hdfs hive      0 2018-03-22 19:35 /user/hive/warehouse/carbon.store/public/c_compact4/Fact/Part0/Segment_0/part-0-0_batchno0-0-1521723886214.deletedelta
> -rw-rw-r--+  3 hdfs hive  87989 2018-03-22 18:20 /user/hive/warehouse/carbon.store/public/c_compact4/Fact/Part0/Segment_0/part-0-1_batchno0-0-1521723019528.deletedelta
> -rw-r--r--   3 hdfs hive      0 2018-03-22 19:35 /user/hive/warehouse/carbon.store/public/c_compact4/Fact/Part0/Segment_0/part-0-1_batchno0-0-1521723886214.deletedelta
> -rw-rw-r--+  3 hdfs hive  87989 2018-03-22 18:20 /user/hive/warehouse/carbon.store/public/c_compact4/Fact/Part0/Segment_0/part-0-2_batchno0-0-1521723019528.deletedelta
> -rw-r--r--   3 hdfs hive      0 2018-03-22 19:35 /user/hive/warehouse/carbon.store/public/c_compact4/Fact/Part0/Segment_0/part-0-2_batchno0-0-1521723886214.deletedelta
>
> -----------------------------------------------------------
>
> Issue reproduction technique:
> The write of the delete delta content failed, but the delete delta file itself was created successfully. The failure happened during horizontal compaction (setSpaceQuota was applied in HDFS so that the file could be created but the write to it would fail).
>
> *The below points need to be handled to fix this issue.*
>
> 1. When horizontal compaction fails, the 0-byte delete delta file should be deleted; currently it is not. This is the cleanup part of a horizontal compaction failure.
> 2. A delete delta of 0 bytes should not be considered while reading (we can discuss this solution further); currently the tablestatus file carries the entry for the delete delta timestamp. A standalone sketch of such a check is appended at the end of this thread.
> 3. If a delete is in progress and the file has been created (the NameNode has an entry for the file) but the data write is still in progress (not yet flushed), and a select query is triggered at the same time, the query will fail, so this scenario also needs to be handled.
>
> @dev: Please let me know if any other detail is needed.
>
> Thanks
> Babu
>
>
>
> --
> Sent from:
> http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/

--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
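The following is a minimal, standalone sketch of the zero-byte delete delta handling suggested in points 1 and 2 above. It is not the actual CarbonData patch: the class and method names are hypothetical and only the plain Hadoop FileSystem API is used, so it merely illustrates the idea of skipping (and optionally cleaning up) 0-byte *.deletedelta files before the blocklet data map is loaded.

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DeleteDeltaSanitizer {

  /**
   * Returns the delete delta files under the given segment directory that
   * actually contain data. Zero-byte files are skipped for reads and,
   * when removeEmpty is true, deleted (the cleanup suggested in point 1).
   */
  public static List<Path> readableDeleteDeltas(FileSystem fs, Path segmentDir,
      boolean removeEmpty) throws IOException {
    List<Path> readable = new ArrayList<>();
    FileStatus[] deltas = fs.globStatus(new Path(segmentDir, "*.deletedelta"));
    if (deltas == null) {
      return readable;
    }
    for (FileStatus status : deltas) {
      if (status.getLen() == 0) {
        // 0-byte delta: the file was created but its content was never
        // written (e.g. the write failed because of a space quota).
        if (removeEmpty) {
          fs.delete(status.getPath(), false);
        }
      } else {
        readable.add(status.getPath());
      }
    }
    return readable;
  }

  public static void main(String[] args) throws IOException {
    FileSystem fs = FileSystem.get(new Configuration());
    Path segment = new Path(
        "/user/hive/warehouse/carbon.store/public/c_compact4/Fact/Part0/Segment_0");
    // Dry run: report which delete delta files would actually be read.
    for (Path delta : readableDeleteDeltas(fs, segment, false)) {
      System.out.println("readable delete delta: " + delta);
    }
  }
}

With a check of this shape, the read path would only see delete delta files that actually contain data, while the cleanup after a failed horizontal compaction could reuse the same scan with removeEmpty set to true.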