Re: Throw NullPointerException occasionally when query from stream table

2018-11-15 Thread manishgupta88
Hi xm_zzc

From the logs and the code flow, the handoff code clears the cache on the
executor side, while the exception is thrown on the driver side during the
query. You are getting the exception because you are running in local mode,
where the executor and the driver share the same JVM. We will look into how
to avoid the exception in the local-mode scenario; meanwhile you can
continue your testing in cluster mode.
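
To illustrate why local mode matters: with a local master, the driver and
the executors run inside a single JVM, so JVM-wide singleton state is shared
between the query path and the handoff path. A minimal sketch (the session
setup is illustrative; SegmentPropertiesAndSchemaHolder.getInstance() is the
singleton accessor as I read it from the code):

// With a local master, driver and executors share one JVM, so the
// executor-side handoff task and the driver-side query pruning can
// both see the same JVM-wide singleton state.
val spark = org.apache.spark.sql.SparkSession.builder()
  .appName("local-mode-demo")
  .master("local[4]")  // driver + 4 executor threads, one JVM
  .getOrCreate()

// Both code paths reach the same holder instance:
//   SegmentPropertiesAndSchemaHolder.getInstance()
// In cluster mode the handoff runs in an executor JVM, so its
// invalidate() call cannot remove entries the driver is reading.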

Regards
Manish Gupta





Re: Throw NullPointerException occasionally when query from stream table

2018-11-06 Thread xm_zzc
Hi David:
  Please see the call stack:





Re: Throw NullPointerException occasionally when query from stream table

2018-11-06 Thread David CaiQiang
Where do we call SegmentPropertiesAndSchemaHolder.invalidate in the handoff
thread?



-
Best Regards
David Cai


Re: Throw NullPointerException occasionally when query from stream table

2018-11-06 Thread xm_zzc
Hi:
  The root cause is this: when a select SQL statement executes, BlockDataMap
calls 'SegmentPropertiesAndSchemaHolder.addSegmentProperties' to add segment
info one by one. Meanwhile, if some segments are updated, for example when a
stream segment is handed off, the handoff thread calls
'SegmentPropertiesAndSchemaHolder.invalidate' to delete segment info one by
one. If segmentIdAndSegmentPropertiesIndexWrapper.segmentIdSet.isEmpty() is
true, it removes the segmentPropertiesIndex entry, but the select thread is
still using that segmentPropertiesIndex to add/get segment info, and then
the NPE occurs.
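
To make the interleaving concrete, here is a stripped-down, self-contained
sketch. The ConcurrentHashMap is only a stand-in for
indexToSegmentPropertiesWrapperMapping, and the method names mirror the real
ones purely for readability; whether a given run actually throws depends on
thread timing. The point is the unguarded get-after-remove:

import java.util.concurrent.ConcurrentHashMap

object HandoffRaceSketch {
  // Stand-in for indexToSegmentPropertiesWrapperMapping: index -> schema.
  private val indexToWrapper = new ConcurrentHashMap[Int, Array[String]]()

  // Query side: BlockDataMap registers segment info under an index.
  def addSegmentProperties(index: Int, schema: Array[String]): Unit =
    indexToWrapper.put(index, schema)

  // Handoff side: once the last segment id under an index is gone,
  // invalidate() drops the whole entry.
  def invalidate(index: Int): Unit =
    indexToWrapper.remove(index)

  // Query side again: the entry is dereferenced without a null check,
  // which is where the NPE surfaces.
  def getFileFooterEntrySchema(index: Int): Int =
    indexToWrapper.get(index).length // NPE if invalidate() won the race

  def main(args: Array[String]): Unit = {
    addSegmentProperties(37, Array("c1", "c2"))
    val query = new Thread(new Runnable {
      override def run(): Unit = {
        Thread.sleep(10) // let the handoff thread win the race
        println(getFileFooterEntrySchema(37))
      }
    })
    val handoff = new Thread(new Runnable {
      override def run(): Unit = invalidate(37)
    })
    query.start(); handoff.start()
    query.join(); handoff.join()
  }
}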





Re: Throw NullPointerException occasionally when query from stream table

2018-10-31 Thread xm_zzc
Hi:
  I added some logs to trace this problem and found that when
BlockDataMap.getFileFooterEntrySchema is called, the key
'segmentPropertiesIndex' stored in the BlockDataMap instance had already
been removed by another thread from
SegmentPropertiesAndSchemaHolder.indexToSegmentPropertiesWrapperMapping:

2018-10-31 14:49:24,967 datastore.block.SegmentPropertiesAndSchemaHolder.addSegmentProperties(SegmentPropertiesAndSchemaHolder.java:115) - Thread-39 -put 37 into indexToSegmentPropertiesWrapperMapping 0
2018-10-31 14:49:25,472 datastore.block.SegmentPropertiesAndSchemaHolder.invalidate(SegmentPropertiesAndSchemaHolder.java:243) - Executor task launch worker for task 926 -remove 37 out of indexToSegmentPropertiesWrapperMapping 31
2018-10-31 14:49:25,486 indexstore.blockletindex.BlockDataMap.getFileFooterEntrySchema(BlockDataMap.java:1002) - Thread-39 -get 37 null

2018-10-31 14:56:45,057 datastore.block.SegmentPropertiesAndSchemaHolder.addSegmentProperties(SegmentPropertiesAndSchemaHolder.java:115) - Thread-39 -put 98 into indexToSegmentPropertiesWrapperMapping 0
2018-10-31 14:56:45,477 datastore.block.SegmentPropertiesAndSchemaHolder.invalidate(SegmentPropertiesAndSchemaHolder.java:243) - Executor task launch worker for task 2653 -remove 98 out of indexToSegmentPropertiesWrapperMapping 67
2018-10-31 14:56:46,290 indexstore.blockletindex.BlockDataMap.getFileFooterEntrySchema(BlockDataMap.java:1002) - Thread-39 -get 98 null
2018-10-31 14:56:51,392 indexstore.blockletindex.BlockDataMap.getFileFooterEntrySchema(BlockDataMap.java:1002) - Thread-39 -get 98 null






Throw NullPointerException occasionally when query from stream table

2018-10-29 Thread xm_zzc
Hi:
  I ran a structured streaming app in local[4] mode (Spark 2.3.2 +
CarbonData master branch) to insert data, and then started a thread to
execute a select SQL statement; a 'NullPointerException' occurred
occasionally.
  *I found that the smaller the value of CarbonCommonConstants.HANDOFF_SIZE
is, the more easily the error occurs.*
  Please see my test code: CarbonStream1_5.scala
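
  Since the attachment may not render in the archive, here is a minimal
sketch of the setup described above. The store path, checkpoint path, table
name, rate source and the 1 MB handoff size are illustrative; the
getOrCreateCarbonSession call, the 'STORED AS carbondata' syntax and the
'carbondata' stream sink options follow the CarbonData streaming guide as I
remember it:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.CarbonSession._
import org.apache.spark.sql.streaming.Trigger
import org.apache.carbondata.core.constants.CarbonCommonConstants
import org.apache.carbondata.core.util.CarbonProperties

object StreamNpeRepro {
  def main(args: Array[String]): Unit = {
    // A small handoff size makes the stream-to-columnar handoff (and so
    // the race) fire much more often.
    CarbonProperties.getInstance()
      .addProperty(CarbonCommonConstants.HANDOFF_SIZE, "1048576") // 1 MB

    val spark = SparkSession.builder()
      .master("local[4]") // driver and executors share one JVM
      .appName("stream-npe-repro")
      .getOrCreateCarbonSession("/tmp/carbon.store")

    spark.sql(
      "CREATE TABLE IF NOT EXISTS stream_table (id INT, name STRING) " +
        "STORED AS carbondata TBLPROPERTIES ('streaming'='true')")

    // Continuous ingestion into the stream table.
    val input = spark.readStream.format("rate").load()
      .selectExpr("CAST(value AS INT) AS id", "CAST(value AS STRING) AS name")
    val ingest = input.writeStream
      .format("carbondata")
      .trigger(Trigger.ProcessingTime("5 seconds"))
      .option("checkpointLocation", "/tmp/stream_table.ckpt")
      .option("dbName", "default")
      .option("tableName", "stream_table")
      .start()

    // Concurrent reader thread; the NPE shows up here during pruning.
    new Thread(new Runnable {
      override def run(): Unit = while (true) {
        spark.sql("SELECT count(*) FROM stream_table").show()
        Thread.sleep(1000)
      }
    }).start()

    ingest.awaitTermination()
  }
}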

  
  
  The NullPointerException is:
  Exception in thread "Thread-42" java.lang.NullPointerException
at org.apache.carbondata.core.indexstore.blockletindex.BlockDataMap.getFileFooterEntrySchema(BlockDataMap.java:1001)
at org.apache.carbondata.core.indexstore.blockletindex.BlockDataMap.prune(BlockDataMap.java:656)
at org.apache.carbondata.core.indexstore.blockletindex.BlockDataMap.prune(BlockDataMap.java:743)
at org.apache.carbondata.core.indexstore.blockletindex.BlockletDataMapFactory.getAllBlocklets(BlockletDataMapFactory.java:391)
at org.apache.carbondata.core.datamap.TableDataMap.prune(TableDataMap.java:132)
at org.apache.carbondata.hadoop.api.CarbonInputFormat.getPrunedBlocklets(CarbonInputFormat.java:491)
at org.apache.carbondata.hadoop.api.CarbonInputFormat.getDataBlocksOfSegment(CarbonInputFormat.java:412)
at org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getSplits(CarbonTableInputFormat.java:528)
at org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getSplits(CarbonTableInputFormat.java:219)
at org.apache.carbondata.spark.rdd.CarbonScanRDD.internalGetPartitions(CarbonScanRDD.scala:127)
at org.apache.carbondata.spark.rdd.CarbonRDD.getPartitions(CarbonRDD.scala:67)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:46)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:46)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:46)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:46)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
at org.apache.spark.ShuffleDependency.<init>(Dependency.scala:91)
at org.apache.spark.sql.execution.exchange.ShuffleExchangeExec$.prepareShuffleDependency(ShuffleExchangeExec.scala:321)
at org.apache.spark.sql.execution.TakeOrderedAndProjectExec.doExecute(limit.scala:154)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
at org.apache.spark.sql.execution.InputAdapter.inputRDDs(WholeStageCodegenExec.scala:371)
at org.apache.spark.sql.execution.ProjectExec.inputRDDs(basicPhysicalOperators.scala:41)
at org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:605)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
at org.apache.spark.rdd.RDDOperationScope$.with