[ https://issues.apache.org/jira/browse/ASTERIXDB-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17944638#comment-17944638 ]
ASF subversion and git services commented on ASTERIXDB-3586:
------------------------------------------------------------
Commit b88d3fc569ebb507ab5584b22b8e622619ad463d in asterixdb's branch
refs/heads/master from Ritik Raj
[ https://gitbox.apache.org/repos/asf?p=asterixdb.git;h=b88d3fc569 ]
[ASTERIXDB-3586][STO] Sync tupleIndex while skipping tuples
- user model changes: no
- storage format changes: no
- interface changes: yes
Details:
When `compiler.column.filter` is enabled, the assembler’s `valueIndex`
(currently named `tupleIndex`) may become out of sync with the
filter’s `tupleIndex`. This misalignment between the filter and the
assembler can lead to incorrect query results. This change keeps
`tupleIndex` synchronized while skipping tuples to maintain
correctness.
Ext-ref: MB-66000
Change-Id: I260612851c4dabfb9e74f2902f421720d9f88657
Reviewed-on: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19555
Integration-Tests: Jenkins <[email protected]>
Tested-by: Jenkins <[email protected]>
Reviewed-by: Peeyush Gupta <[email protected]>
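
For illustration only, here is a minimal Java sketch of how a skip count can
drift when antimatter tuples are counted as values, and how keeping the two
indexes in sync avoids it. It is not the AsterixDB code: the class
TupleIndexSyncSketch, its nested Assembler, and the scan method are all
hypothetical names, and the page layout (one assembler value per
non-antimatter tuple) is an assumption taken from the issue description.

```java
import java.util.ArrayList;
import java.util.List;

class TupleIndexSyncSketch {
    /** Hypothetical assembler: holds one value per non-antimatter tuple. */
    static final class Assembler {
        private final String[] values;
        private int valueIndex = 0;

        Assembler(String[] values) {
            this.values = values;
        }

        void skip(int count) {
            valueIndex += count;
        }

        String next() {
            if (valueIndex >= values.length) {
                throw new IllegalStateException(
                        "no more values: valueIndex=" + valueIndex + ", valueCount=" + values.length);
            }
            return values[valueIndex++];
        }
    }

    /**
     * Walks every tuple on the page and assembles those that pass the filter.
     * With syncOnSkip == false, antimatter tuples are wrongly added to the
     * number of values to skip, so the assembler can be pushed past its end.
     */
    static List<String> scan(boolean[] antimatter, boolean[] matches, String[] values, boolean syncOnSkip) {
        Assembler assembler = new Assembler(values);
        List<String> out = new ArrayList<>();
        int pendingSkips = 0;
        for (int tupleIndex = 0; tupleIndex < antimatter.length; tupleIndex++) {
            if (antimatter[tupleIndex]) {
                // Antimatter tuples have no value in the assembler.
                if (!syncOnSkip) {
                    pendingSkips++; // unsynchronized: counted as a value anyway
                }
                continue;
            }
            if (!matches[tupleIndex]) {
                pendingSkips++; // filtered out: one real value to skip
                continue;
            }
            assembler.skip(pendingSkips);
            pendingSkips = 0;
            out.add(assembler.next());
        }
        return out;
    }

    public static void main(String[] args) {
        boolean[] antimatter = {false, true, false, true, false};
        boolean[] matches = {false, false, false, false, true};
        String[] values = {"a", "c", "e"}; // values for tuples 0, 2 and 4

        System.out.println(scan(antimatter, matches, values, true)); // [e]
        try {
            scan(antimatter, matches, values, false);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // no more values: valueIndex=4, valueCount=3
        }
    }
}
```

In the synchronized run the assembler skips two values and returns the third;
in the unsynchronized run it is asked to skip four values out of three, which
mirrors the "no more values" failure reported in the issue below.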
> Internal error while query evaluation
> -------------------------------------
>
> Key: ASTERIXDB-3586
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-3586
> Project: Apache AsterixDB
> Issue Type: Bug
> Components: STO - Storage
> Affects Versions: 0.9.9
> Reporter: Ritik Raj
> Assignee: Ritik Raj
> Priority: Critical
> Labels: triaged
> Fix For: 0.9.10
>
>
> When running the following query with the column filter enabled:
> {code:java}
> SET `compiler.column.filter` "true";
> SELECT `field1`, `field2`, `field3`
> FROM acollection
> WHERE (
> `field1` = $field1 .... AND
> `updated_at` BETWEEN "2024-06-19" AND "2024-11-30"
> ) LIMIT $limit;
> {code}
> an internal error is thrown when updated_at is between "2024-06-19" and
> "2024-11-30", but the query works fine for "2024-05-19" and "2024-11-30".
> On investigation, the error appears to come from how the column filter
> evaluates the tupleIndex.
> Three components each maintain their own tupleIndex during column filter
> evaluation:
> 1. *Cursor*, which iterates over the stored tuples of an in-memory or
> disk component
> 2. *FilterIterator*, which maintains its own tupleIndex to decide whether
> the current tuple satisfies the filter condition
> 3. {*}Column Assembler{*}, which maintains *its* tupleIndex to track the
> number of tuples assembled.
> If antimatter (deleted) entries are present, then
> {{Column Assembler’s tupleIndex <= Cursor’s tupleIndex}}
> {{Column Assembler’s tupleIndex <= Filter’s tupleIndex}}
> as there are fewer tuples for the assembler to assemble.
> Based on these findings, the *three* tupleIndexes can diverge, which caused
> the Column Assembler’s tupleIndex to be skipped to a point beyond the last
> tuple that can be assembled, giving the following error:
>
> {code:java}
> Caused by: org.apache.hyracks.storage.am.lsm.btree.column.error.ColumnarValueException:
> {"PrimitiveValueAssembler":{"isDelegate":false,"assemblerReader":{"typeTag":"string","columnIndex":53,"valueIndex":902,"valueCount":902,"allMissing":false,"level":0,"maxLevel":1,"nullBitMask":2,"numberOfEncounteredMissing":268,"numberOfEncounteredNull":0,"numberOfDecodersRequired":1,"maxLevelsEncountered":"{1}","isPrimaryKeyColumn":false}},"ColumnAssembler":{"tupleIndex":903,"numberOfTuples":980,"numberOfSkips":896},"AssemblerState":{"inGroup":false},"QueryColumnWithMetaTupleReference":{"isAntiMatter":false,"previousIndex":757,"primaryKeyReaders":[{"typeTag":"string","columnIndex":0,"valueIndex":980,"valueCount":980,"allMissing":false,"level":1,"maxLevel":1,"nullBitMask":2,"numberOfEncounteredMissing":78,"numberOfEncounteredNull":0,"numberOfDecodersRequired":1,"maxLevelsEncountered":"{1}","isPrimaryKeyColumn":true}]}}
>     at org.apache.asterix.column.assembler.PrimitiveValueAssembler.createException(PrimitiveValueAssembler.java:67) ~[asterix-column-1.1.0-1238.jar:1.1.0-1238]
>     at org.apache.asterix.column.assembler.PrimitiveValueAssembler.next(PrimitiveValueAssembler.java:50) ~[asterix-column-1.1.0-1238.jar:1.1.0-1238]
>     at org.apache.asterix.column.operation.query.ColumnAssembler.nextValue(ColumnAssembler.java:86) ~[asterix-column-1.1.0-1238.jar:1.1.0-1238]
>     at org.apache.asterix.column.tuple.QueryColumnWithMetaTupleReference.getFilteredAssembledValue(QueryColumnWithMetaTupleReference.java:195) ~[asterix-column-1.1.0-1238.jar:1.1.0-1238]
>     at org.apache.asterix.column.tuple.QueryColumnWithMetaTupleReference.getAssembledValue(QueryColumnWithMetaTupleReference.java:154) ~[asterix-column-1.1.0-1238.jar:1.1.0-1238]
>     at org.apache.asterix.column.operation.query.QueryColumnWithMetaTupleProjector.getAssembledValue(QueryColumnWithMetaTupleProjector.java:76) ~[asterix-column-1.1.0-1238.jar:1.1.0-1238]
>     at org.apache.asterix.column.operation.query.QueryColumnTupleProjector.project(QueryColumnTupleProjector.java:93) ~[asterix-column-1.1.0-1238.jar:1.1.0-1238]
>     at org.apache.hyracks.storage.am.common.dataflow.IndexSearchOperatorNodePushable.writeTupleToOutput(IndexSearchOperatorNodePushable.java:401) ~[hyracks-storage-am-common-1.1.0-1238.jar:1.1.0-1238]
>     at org.apache.hyracks.storage.am.common.dataflow.IndexSearchOperatorNodePushable.writeSearchResults(IndexSearchOperatorNodePushable.java:274) ~[hyracks-storage-am-common-1.1.0-1238.jar:1.1.0-1238]
>     at org.apache.hyracks.storage.am.common.dataflow.IndexSearchOperatorNodePushable.searchAllPartitions(IndexSearchOperatorNodePushable.java:470) ~[hyracks-storage-am-common-1.1.0-1238.jar:1.1.0-1238]
>     at org.apache.hyracks.storage.am.common.dataflow.IndexSearchOperatorNodePushable.nextFrame(IndexSearchOperatorNodePushable.java:316) ~[hyracks-storage-am-common-1.1.0-1238.jar:1.1.0-1238]
>     at org.apache.hyracks.dataflow.common.comm.io.AbstractFrameAppender.write(AbstractFrameAppender.java:94) ~[hyracks-dataflow-common-1.1.0-1238.jar:1.1.0-1238]
>     at org.apache.hyracks.algebricks.runtime.operators.std.EmptyTupleSourceRuntimeFactory$1.open(EmptyTupleSourceRuntimeFactory.java:55) ~[algebricks-runtime-1.1.0-1238.jar:1.1.0-1238]
>     at org.apache.hyracks.algebricks.runtime.operators.meta.AlgebricksMetaOperatorDescriptor$SourcePushRuntime.initialize(AlgebricksMetaOperatorDescriptor.java:175) ~[algebricks-runtime-1.1.0-1238.jar:1.1.0-1238]
>     at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.lambda$runInParallel$0(SuperActivityOperatorNodePushable.java:245) ~[hyracks-api-1.1.0-1238.jar:1.1.0-1238]
>     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317) ~[?:?]
>     ... 3 more
>
> {code}
>
> {code:java}
> {
>   "PrimitiveValueAssembler":
>   {
>     "isDelegate": false,
>     "assemblerReader":
>     {
>       "typeTag": "string",
>       "columnIndex": 53,
>       "valueIndex": 902,
>       "valueCount": 902, ------- (1)
>       "allMissing": false,
>       "level": 0,
>       "maxLevel": 1,
>       "nullBitMask": 2,
>       "numberOfEncounteredMissing": 268,
>       "numberOfEncounteredNull": 0,
>       "numberOfDecodersRequired": 1,
>       "maxLevelsEncountered": "{1}",
>       "isPrimaryKeyColumn": false
>     }
>   },
>   "ColumnAssembler":
>   {
>     "tupleIndex": 903, ------ (4)
>     "numberOfTuples": 980,
>     "numberOfSkips": 896
>   },
>   "AssemblerState":
>   {
>     "inGroup": false
>   },
>   "QueryColumnWithMetaTupleReference":
>   {
>     "isAntiMatter": false,
>     "previousIndex": 757,
>     "primaryKeyReaders":
>     [
>       {
>         "typeTag": "string",
>         "columnIndex": 0,
>         "valueIndex": 980,
>         "valueCount": 980, ----- (2)
>         "allMissing": false,
>         "level": 1,
>         "maxLevel": 1,
>         "nullBitMask": 2,
>         "numberOfEncounteredMissing": 78, -------- (3)
>         "numberOfEncounteredNull": 0,
>         "numberOfDecodersRequired": 1,
>         "maxLevelsEncountered": "{1}",
>         "isPrimaryKeyColumn": true
>       }
>     ]
>   }
> }{code}
> From the logs:
> * *Total Tuples:* 980
> * *Deleted Tuples (antimatters):* 78
> * *Valid Tuples to Assemble:* 902 (980 - 78)
> * *Column Assembler Tuple Index:* 903 (exceeds valid range)
> Since {*}tupleIndex (903) > available tuples (902){*}, the assembler has
> no more values to read and fails.
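
The arithmetic from the logs above can be written out as a small Java check.
This is only a worked example of the numbers quoted in the issue: the class
AssemblerRangeCheck and its two methods are hypothetical names and do not
exist in AsterixDB.

```java
class AssemblerRangeCheck {
    /** Tuples the assembler can produce: all tuples minus antimatter (deleted) ones. */
    static int assemblableTuples(int totalTuples, int antimatterTuples) {
        return totalTuples - antimatterTuples;
    }

    /** True when the assembler's tupleIndex points past the last assemblable tuple. */
    static boolean isPastEnd(int tupleIndex, int assemblableTuples) {
        return tupleIndex > assemblableTuples;
    }

    public static void main(String[] args) {
        int available = assemblableTuples(980, 78);    // values taken from the logged state
        System.out.println(available);                 // 902
        System.out.println(isPastEnd(903, available)); // true
    }
}
```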