Paul Rogers created DRILL-7428:
----------------------------------

Summary: Drill incorrectly allows a repeated map field to be projected to top level
Key: DRILL-7428
URL: https://issues.apache.org/jira/browse/DRILL-7428
Project: Apache Drill
Issue Type: Bug
Reporter: Paul Rogers

Consider the following query from the [Mongo DB tests|https://github.com/apache/drill/blob/master/contrib/storage-mongo/src/test/java/org/apache/drill/exec/store/mongo/MongoTestConstants.java#L80]:

{noformat}
select t.name as name, t.topping.type as type from mongo.%s.`%s` t where t.sales >= 150
{noformat}

The query is used in [{{TestMongoQueries.testUnShardedDBInShardedClusterWithProjectionAndFilter()}}|https://github.com/apache/drill/blob/master/contrib/storage-mongo/src/test/java/org/apache/drill/exec/store/mongo/TestMongoQueries.java#L89].

It turns out that {{topping}} is a repeated map, and the query projects a member of that map ({{t.topping.type}}) to the top level. The query returns five rows, but the repeated map holds 24 values across those rows.

The Project operator allows the projection. The resulting output batch has 5 values in most vectors, but the column projected from {{topping}}, now at the top level and no longer inside the map, has 24 values.

As a result, the first five values, all formerly associated with the first record, are now associated with the first five top-level records, while the values formerly associated with the remaining four records (records 1-4) are lost. Thus, this is a data corruption bug.
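
The misalignment described above can be reproduced with a small model. The following Python sketch is not Drill code. It assumes a repeated map is stored as one flattened child vector plus per-row offsets, and it uses made-up data (the per-row counts, the row names, and the helper names {{project_buggy}} and {{project_per_row}} are all hypothetical), chosen only so that five rows hold 24 values in total.

```python
# Hypothetical data: 5 rows whose repeated map `topping` holds 24 entries in total.
counts = [5, 5, 5, 5, 4]
names = [f"row{i}" for i in range(len(counts))]

# Assumed columnar layout of a repeated map: a flattened child vector for
# `type`, plus offsets marking where each row's entries begin and end.
type_values = [f"r{row}-t{i}" for row, n in enumerate(counts) for i in range(n)]
offsets = [0]
for n in counts:
    offsets.append(offsets[-1] + n)


def project_buggy(names, type_values):
    """Model of the reported behavior: the 24-entry child vector is promoted
    to the top level unchanged, but the batch still has only 5 rows, so a
    reader pairs row i with child entry i."""
    row_count = len(names)
    return [(names[i], type_values[i]) for i in range(row_count)]


def project_per_row(names, type_values, offsets):
    """One consistent alternative (an illustration, not Drill's fix):
    use the offsets to keep each row's values together as a list."""
    return [
        (names[i], type_values[offsets[i]:offsets[i + 1]])
        for i in range(len(names))
    ]


buggy = project_buggy(names, type_values)
per_row = project_per_row(names, type_values, offsets)

for name, value in buggy:
    print(name, value)
```

In this model every value returned by {{project_buggy}} comes from the first row, and the 19 entries past index 4 are never read, which matches the corruption described in this report. The per-row variant shows only that the offsets carry enough information to keep values aligned with their rows. Whether Drill should reject the projection or return an array is not decided here.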