Vitalii Diravka created DRILL-7176:
--------------------------------------

             Summary: Invalid result for MAP structure in MongoDB collection
                 Key: DRILL-7176
                 URL: https://issues.apache.org/jira/browse/DRILL-7176
             Project: Apache Drill
          Issue Type: Bug
          Components: Storage - MongoDB
    Affects Versions: 1.15.0
            Reporter: Vitalii Diravka
             Fix For: Future


The following query works fine for JSON file:
{code}
apache drill (mongo.local)> select * from dfs.`/tmp/test.json` o;
+-------------------------------------+--------+
|                 _id                 |  name  |
+-------------------------------------+--------+
| {"$oid":"5cb0e161f0849231dfe16d99"} | thiago |
| {"$oid":"5cb0e161f0849231dfe16c01"} | manuel |
+-------------------------------------+--------+
2 rows selected (0.129 seconds)
apache drill (mongo.local)> select * from dfs.`/tmp/test.json` o where 
o.`_id`.`$oid` = '5cb0e161f0849231dfe16d99';
+-------------------------------------+--------+
|                 _id                 |  name  |
+-------------------------------------+--------+
| {"$oid":"5cb0e161f0849231dfe16d99"} | thiago |
+-------------------------------------+--------+
1 row selected (0.168 seconds)
{code}
But the same query on the same MongoDB collection returns wrong result:
{code}
apache drill (mongo.local)> use mongo.local;
+------+-----------------------------------------+
|  ok  |                 summary                 |
+------+-----------------------------------------+
| true | Default schema changed to [mongo.local] |
+------+-----------------------------------------+
1 row selected (0.107 seconds)
apache drill (mongo.local)> alter session set store.mongo.bson.record.reader = 
false;
+------+-----------------------------------------+
|  ok  |                 summary                 |
+------+-----------------------------------------+
| true | store.mongo.bson.record.reader updated. |
+------+-----------------------------------------+
1 row selected (0.066 seconds)
apache drill (mongo.local)> select * from json_test;
+-------------------------------------+--------+
|                 _id                 |  name  |
+-------------------------------------+--------+
| {"$oid":"5cb0e161f0849231dfe16d99"} | thiago |
| {"$oid":"5cb0e161f0849231dfe16c01"} | manuel |
+-------------------------------------+--------+
2 rows selected (0.115 seconds)
apache drill (mongo.local)> select * from json_test o where o.`_id`.`$oid` = 
'5cb0e161f0849231dfe16d99';
+----+
| ** |
+----+
+----+
No rows selected (0.121 seconds)
{code}
The logical plan for the last query:
{noformat}
00-00    Screen : rowType = RecordType(DYNAMIC_STAR **): rowcount = 2.0, 
cumulative cost = {6.2 rows, 8.2 cpu, 140.0 io, 0.0 network, 0.0 memory}, id = 
10157
00-01      Project(**=[$0]) : rowType = RecordType(DYNAMIC_STAR **): rowcount = 
2.0, cumulative cost = {6.0 rows, 8.0 cpu, 140.0 io, 0.0 network, 0.0 memory}, 
id = 10156
00-02        Project(**=[$0]) : rowType = RecordType(DYNAMIC_STAR **): rowcount 
= 2.0, cumulative cost = {4.0 rows, 6.0 cpu, 140.0 io, 0.0 network, 0.0 
memory}, id = 10155
00-03          Scan(table=[[mongo, local, json_test]], 
groupscan=[MongoGroupScan [MongoScanSpec=MongoScanSpec [dbName=local, 
collectionName=json_test, 
filters=Document{{_id=Document{{$eq=5cb0e161f0849231dfe16d99}}}}], 
columns=[`**`, `_id`]]]) : rowType = RecordType(DYNAMIC_STAR **, ANY _id): 
rowcount = 2.0, cumulative cost = {2.0 rows, 4.0 cpu, 140.0 io, 0.0 network, 
0.0 memory}, id = 10154
{noformat}

The same query, but with specified filed in project works fine:
{code}
apache drill (mongo.local)> select `_id` from json_test o where o.`_id`.`$oid` 
= '5cb0e161f0849231dfe16d99';
+-------------------------------------+
|                 _id                 |
+-------------------------------------+
| {"$oid":"5cb0e161f0849231dfe16d99"} |
+-------------------------------------+
1 row selected (0.133 seconds)
{code}
It looks like a planning issue, however the similar query on other field works 
fine with a similar logical plan:
{code}
apache drill (mongo.local)> select * from json_test o  where o.`name` = 
'thiago';
+-------------------------------------+--------+
|                 _id                 |  name  |
+-------------------------------------+--------+
| {"$oid":"5cb0e161f0849231dfe16d99"} | thiago |
+-------------------------------------+--------+
1 row selected (0.18 seconds)
{code}

{noformat}
00-00    Screen : rowType = RecordType(DYNAMIC_STAR **): rowcount = 2.0, 
cumulative cost = {6.2 rows, 8.2 cpu, 140.0 io, 0.0 network, 0.0 memory}, id = 
9981
00-01      Project(**=[$0]) : rowType = RecordType(DYNAMIC_STAR **): rowcount = 
2.0, cumulative cost = {6.0 rows, 8.0 cpu, 140.0 io, 0.0 network, 0.0 memory}, 
id = 9980
00-02        Project(**=[$0]) : rowType = RecordType(DYNAMIC_STAR **): rowcount 
= 2.0, cumulative cost = {4.0 rows, 6.0 cpu, 140.0 io, 0.0 network, 0.0 
memory}, id = 9979
00-03          Scan(table=[[mongo, local, json_test]], 
groupscan=[MongoGroupScan [MongoScanSpec=MongoScanSpec [dbName=local, 
collectionName=json_test, filters=Document{{name=Document{{$eq=thiago}}}}], 
columns=[`**`, `name`]]]) : rowType = RecordType(DYNAMIC_STAR **, ANY name): 
rowcount = 2.0, cumulative cost = {2.0 rows, 4.0 cpu, 140.0 io, 0.0 network, 
0.0 memory}, id = 9978
{noformat}

Datasource JSON file in attachments.
Upload it to MongoDB:
{code}
vitalii@vitalii-UX331UN:~$ mongoimport --host localhost --db local --collection 
vitalii_test --file /tmp/test.json
{code}
{code}
vitalii@vitalii-UX331UN:~$ mongo
...
rs0:PRIMARY> db.vitalii_test.find()
{ "_id" : ObjectId("5cb0e161f0849231dfe16d99"), "name" : "thiago" }
{ "_id" : ObjectId("5cb0e161f0849231dfe16c01"), "name" : "manuel" }
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to