czxrrr commented on a change in pull request #701:
URL: https://github.com/apache/orc/pull/701#discussion_r635317164
##########
File path: tools/test/TestMatch.cc
##########
@@ -1085,6 +1085,27 @@ TEST(TestMatch, selectColumns) {
<< "\"887336a7\"}}]}";
EXPECT_EQ(expectedMapWithColumnId.str(), line);
+ // Map column #12 again, to test map key is automatically included
+ // two subtypes with column id:
+ // map<string(20),struct(21)<int1(22):int,string1(23):string>
+ cols.clear();
+ cols.push_back(22);
+ cols.push_back(23);
+ rowReaderOpts.includeTypes(cols);
+ rowReader = reader->createRowReader(rowReaderOpts);
+ c = rowReader->getSelectedColumns();
+ for (unsigned int i=1; i < c.size(); i++) {
+ if (i>=19 && i<=23)
+ EXPECT_TRUE(c[i]);
+ else
+ EXPECT_TRUE(!c[i]);
+ }
+ batch = rowReader->createRowBatch(1);
+ std::ostringstream expectedMapSchema;
+ expectedMapSchema << "Struct vector <0 of 1; Map vector <Byte vector <0 of 1>, "
+ << "Struct vector <0 of 1; Long vector <0 of 1>; Byte vector <0 of 1>; > with 0 of 1>; >";
+ EXPECT_EQ(expectedMapSchema.str(), batch->toString());
Review comment:
The bug is not limited to MapVectorBatch::toString(); it also exists in
MapVectorBatch::getMemoryUse().
Once the key or the value is missing, even rootBatch->getMemoryUse() will fail.
In my opinion, if people really want to query keys or values independently,
a struct containing an array of keys and an array of values is more suitable;
they do not need a map.
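
To illustrate the failure mode, here is a minimal sketch. It does not use the
real ORC classes: Batch, LongBatch and MapBatch are hypothetical stand-ins, and
the "key not selected" placeholder text is my own assumption, not existing ORC
output. It only shows that both toString() and getMemoryUse() have to guard
against a missing child once keys or values can be left out of the selection.

```cpp
#include <cstdint>
#include <memory>
#include <sstream>
#include <string>

// Hypothetical stand-in for a column batch (NOT orc::ColumnVectorBatch).
struct Batch {
  virtual ~Batch() = default;
  virtual std::string toString() const = 0;
  virtual uint64_t getMemoryUse() const = 0;
};

// Hypothetical leaf batch holding 64-bit integers.
struct LongBatch : Batch {
  uint64_t capacity;
  explicit LongBatch(uint64_t cap) : capacity(cap) {}
  std::string toString() const override {
    std::ostringstream out;
    out << "Long vector <0 of " << capacity << ">";
    return out.str();
  }
  uint64_t getMemoryUse() const override {
    return capacity * sizeof(int64_t);
  }
};

// Hypothetical map batch whose children may be null when not selected.
struct MapBatch : Batch {
  uint64_t capacity;
  std::unique_ptr<Batch> keys;      // null if the key column is not selected
  std::unique_ptr<Batch> elements;  // null if the value column is not selected
  explicit MapBatch(uint64_t cap) : capacity(cap) {}

  std::string toString() const override {
    std::ostringstream out;
    // Guard each child; dereferencing a null child is the bug described above.
    out << "Map vector <"
        << (keys ? keys->toString() : std::string("key not selected")) << ", "
        << (elements ? elements->toString() : std::string("value not selected"))
        << " with 0 of " << capacity << ">";
    return out.str();
  }

  uint64_t getMemoryUse() const override {
    // Offsets array has capacity + 1 entries; missing children contribute 0.
    uint64_t total = (capacity + 1) * sizeof(int64_t);
    if (keys) total += keys->getMemoryUse();
    if (elements) total += elements->getMemoryUse();
    return total;
  }
};
```

With this shape, a parent batch that sums its children's memory use no longer
fails when only one side of the map is selected. Whether the real fix should
guard against null children or always include both key and value is a design
decision for the PR, not something this sketch settles.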