[ https://issues.apache.org/jira/browse/ASTERIXDB-1534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402816#comment-15402816 ]

Ian Maxon commented on ASTERIXDB-1534:
--------------------------------------

It seems like the exception isn't totally deterministic. For me, this is the stack since I have been debugging:

Caused by: java.lang.NullPointerException
    at org.apache.hyracks.storage.am.common.frames.TreeIndexNSMFrame.getTupleCount(TreeIndexNSMFrame.java:287)
    at org.apache.hyracks.storage.am.btree.impls.BTreeRangeSearchCursor.hasNext(BTreeRangeSearchCursor.java:141)
    at org.apache.hyracks.storage.am.lsm.common.impls.LSMIndexSearchCursor.pushIntoPriorityQueue(LSMIndexSearchCursor.java:175)
    at org.apache.hyracks.storage.am.lsm.common.impls.LSMIndexSearchCursor.initPriorityQueue(LSMIndexSearchCursor.java:74)
    at org.apache.hyracks.storage.am.lsm.btree.impls.LSMBTreeRangeSearchCursor.open(LSMBTreeRangeSearchCursor.java:228)
    at org.apache.hyracks.storage.am.lsm.btree.impls.LSMBTreeSearchCursor.open(LSMBTreeSearchCursor.java:81)
    at org.apache.hyracks.storage.am.lsm.btree.impls.LSMBTree.search(LSMBTree.java:412)
    at org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.search(LSMHarness.java:393)
    at org.apache.hyracks.storage.am.lsm.common.impls.LSMTreeIndexAccessor.search(LSMTreeIndexAccessor.java:100)
    at org.apache.hyracks.storage.am.common.dataflow.IndexSearchOperatorNodePushable.nextFrame(IndexSearchOperatorNodePushable.java:183)
    ... 10 more

> NPE when restart the server
> ---------------------------
>
> Key: ASTERIXDB-1534
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-1534
> Project: Apache AsterixDB
> Issue Type: Bug
> Components: Storage
> Environment: master
> commit a89fae64ac21fb8eefde79f79d2dbe1a0e54c364
> Date: Wed Jul 6 07:58:55 2016 -0700
> Reporter: Jianfeng Jia
> Assignee: Michael Blow
> Attachments: asterix-configuration.xml, ingest.sh
>
>
> When I stop and start the cluster by managix, I hit the following error:
> {code}
> ERROR: /rhome/jianfeng/managix/home/asterix/cloudberry/.nfs00000000021805340000118e (No such file or directory)
> j
> {code}
> And no nc and cc got started.
> After a while, I ran the managix start again, the cluster restart successfully.
> But one of the dataset can't answer any queries. The simplest select query
> {code}
> for $t in dataset twitter.ds_tweet limit 5 return $t
> {code}
> will give me the following error:
> {code}
> Caused by: org.apache.hyracks.api.exceptions.HyracksDataException: org.apache.hyracks.api.exceptions.HyracksDataException: java.util.concurrent.ExecutionException: org.apache.hyracks.api.exceptions.HyracksDataException: java.lang.NullPointerException
>     at org.apache.hyracks.control.common.utils.ExceptionUtils.setNodeIds(ExceptionUtils.java:45)
>     at org.apache.hyracks.control.nc.Task.run(Task.java:319)
>     ... 3 more
> Caused by: org.apache.hyracks.api.exceptions.HyracksDataException: java.util.concurrent.ExecutionException: org.apache.hyracks.api.exceptions.HyracksDataException: java.lang.NullPointerException
>     at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.runInParallel(SuperActivityOperatorNodePushable.java:218)
>     at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.initialize(SuperActivityOperatorNodePushable.java:83)
>     at org.apache.hyracks.control.nc.Task.run(Task.java:263)
>     ... 3 more
> Caused by: java.util.concurrent.ExecutionException: org.apache.hyracks.api.exceptions.HyracksDataException: java.lang.NullPointerException
>     at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>     at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>     at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.runInParallel(SuperActivityOperatorNodePushable.java:212)
>     ... 5 more
> Caused by: org.apache.hyracks.api.exceptions.HyracksDataException: java.lang.NullPointerException
>     at org.apache.hyracks.storage.am.common.dataflow.IndexSearchOperatorNodePushable.nextFrame(IndexSearchOperatorNodePushable.java:187)
>     at org.apache.hyracks.dataflow.common.comm.io.AbstractFrameAppender.write(AbstractFrameAppender.java:93)
>     at org.apache.hyracks.algebricks.runtime.operators.base.AbstractOneInputOneOutputOneFramePushRuntime.flushAndReset(AbstractOneInputOneOutputOneFramePushRuntime.java:63)
>     at org.apache.hyracks.algebricks.runtime.operators.base.AbstractOneInputOneOutputOneFramePushRuntime.flushIfNotFailed(AbstractOneInputOneOutputOneFramePushRuntime.java:69)
>     at org.apache.hyracks.algebricks.runtime.operators.base.AbstractOneInputOneOutputOneFramePushRuntime.close(AbstractOneInputOneOutputOneFramePushRuntime.java:55)
>     at org.apache.hyracks.algebricks.runtime.operators.std.AssignRuntimeFactory$1.close(AssignRuntimeFactory.java:122)
>     at org.apache.hyracks.algebricks.runtime.operators.std.EmptyTupleSourceRuntimeFactory$1.close(EmptyTupleSourceRuntimeFactory.java:60)
>     at org.apache.hyracks.algebricks.runtime.operators.meta.AlgebricksMetaOperatorDescriptor$1.initialize(AlgebricksMetaOperatorDescriptor.java:116)
>     at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.lambda$initialize$0(SuperActivityOperatorNodePushable.java:83)
>     at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$$Lambda$1/350086994.runAction(Unknown Source)
>     at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$1.call(SuperActivityOperatorNodePushable.java:205)
>     at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$1.call(SuperActivityOperatorNodePushable.java:202)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     ... 3 more
> Caused by: java.lang.NullPointerException
>     at org.apache.hyracks.storage.am.common.frames.TreeIndexNSMFrame.getTupleCount(TreeIndexNSMFrame.java:287)
>     at org.apache.hyracks.storage.am.btree.impls.BTreeRangeSearchCursor.hasNext(BTreeRangeSearchCursor.java:141)
>     at org.apache.hyracks.storage.am.lsm.invertedindex.ondisk.PartitionedOnDiskInvertedIndex.openInvertedListPartitionCursors(PartitionedOnDiskInvertedIndex.java:98)
>     at org.apache.hyracks.storage.am.lsm.invertedindex.search.PartitionedTOccurrenceSearcher.search(PartitionedTOccurrenceSearcher.java:116)
>     at org.apache.hyracks.storage.am.lsm.invertedindex.ondisk.OnDiskInvertedIndex$OnDiskInvertedIndexAccessor.search(OnDiskInvertedIndex.java:519)
>     at org.apache.hyracks.storage.am.lsm.invertedindex.impls.LSMInvertedIndexSearchCursor.hasNext(LSMInvertedIndexSearchCursor.java:143)
>     at org.apache.hyracks.storage.am.common.dataflow.IndexSearchOperatorNodePushable.writeSearchResults(IndexSearchOperatorNodePushable.java:149)
>     at org.apache.hyracks.storage.am.common.dataflow.IndexSearchOperatorNodePushable.nextFrame(IndexSearchOperatorNodePushable.java:184)
>     ... 15 more
>
> {code}
> One hint is that this dataset was connecting with a feed before the restart.
> The other dataset that didn't have feed connection seems working fine.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)