Vitalii Diravka created DRILL-7176:
--------------------------------------
Summary: Invalid result for MAP structure in MongoDB collection
Key: DRILL-7176
URL: https://issues.apache.org/jira/browse/DRILL-7176
Project: Apache Drill
Issue Type: Bug
Components: Storage - MongoDB
Affects Versions: 1.15.0
Reporter: Vitalii Diravka
Fix For: Future
The following query works fine for JSON file:
{code}
apache drill (mongo.local)> select * from dfs.`/tmp/test.json` o;
+-------------------------------------+--------+
| _id | name |
+-------------------------------------+--------+
| {"$oid":"5cb0e161f0849231dfe16d99"} | thiago |
| {"$oid":"5cb0e161f0849231dfe16c01"} | manuel |
+-------------------------------------+--------+
2 rows selected (0.129 seconds)
apache drill (mongo.local)> select * from dfs.`/tmp/test.json` o where
o.`_id`.`$oid` = '5cb0e161f0849231dfe16d99';
+-------------------------------------+--------+
| _id | name |
+-------------------------------------+--------+
| {"$oid":"5cb0e161f0849231dfe16d99"} | thiago |
+-------------------------------------+--------+
1 row selected (0.168 seconds)
{code}
But the same query on the same MongoDB collection returns wrong result:
{code}
apache drill (mongo.local)> use mongo.local;
+------+-----------------------------------------+
| ok | summary |
+------+-----------------------------------------+
| true | Default schema changed to [mongo.local] |
+------+-----------------------------------------+
1 row selected (0.107 seconds)
apache drill (mongo.local)> alter session set store.mongo.bson.record.reader =
false;
+------+-----------------------------------------+
| ok | summary |
+------+-----------------------------------------+
| true | store.mongo.bson.record.reader updated. |
+------+-----------------------------------------+
1 row selected (0.066 seconds)
apache drill (mongo.local)> select * from json_test;
+-------------------------------------+--------+
| _id | name |
+-------------------------------------+--------+
| {"$oid":"5cb0e161f0849231dfe16d99"} | thiago |
| {"$oid":"5cb0e161f0849231dfe16c01"} | manuel |
+-------------------------------------+--------+
2 rows selected (0.115 seconds)
apache drill (mongo.local)> select * from json_test o where o.`_id`.`$oid` =
'5cb0e161f0849231dfe16d99';
+----+
| ** |
+----+
+----+
No rows selected (0.121 seconds)
{code}
The logical plan for the last query:
{noformat}
00-00 Screen : rowType = RecordType(DYNAMIC_STAR **): rowcount = 2.0,
cumulative cost = {6.2 rows, 8.2 cpu, 140.0 io, 0.0 network, 0.0 memory}, id =
10157
00-01 Project(**=[$0]) : rowType = RecordType(DYNAMIC_STAR **): rowcount =
2.0, cumulative cost = {6.0 rows, 8.0 cpu, 140.0 io, 0.0 network, 0.0 memory},
id = 10156
00-02 Project(**=[$0]) : rowType = RecordType(DYNAMIC_STAR **): rowcount
= 2.0, cumulative cost = {4.0 rows, 6.0 cpu, 140.0 io, 0.0 network, 0.0
memory}, id = 10155
00-03 Scan(table=[[mongo, local, json_test]],
groupscan=[MongoGroupScan [MongoScanSpec=MongoScanSpec [dbName=local,
collectionName=json_test,
filters=Document{{_id=Document{{$eq=5cb0e161f0849231dfe16d99}}}}],
columns=[`**`, `_id`]]]) : rowType = RecordType(DYNAMIC_STAR **, ANY _id):
rowcount = 2.0, cumulative cost = {2.0 rows, 4.0 cpu, 140.0 io, 0.0 network,
0.0 memory}, id = 10154
{noformat}
The same query, but with specified filed in project works fine:
{code}
apache drill (mongo.local)> select `_id` from json_test o where o.`_id`.`$oid`
= '5cb0e161f0849231dfe16d99';
+-------------------------------------+
| _id |
+-------------------------------------+
| {"$oid":"5cb0e161f0849231dfe16d99"} |
+-------------------------------------+
1 row selected (0.133 seconds)
{code}
It looks like a planning issue, however the similar query on other field works
fine with a similar logical plan:
{code}
apache drill (mongo.local)> select * from json_test o where o.`name` =
'thiago';
+-------------------------------------+--------+
| _id | name |
+-------------------------------------+--------+
| {"$oid":"5cb0e161f0849231dfe16d99"} | thiago |
+-------------------------------------+--------+
1 row selected (0.18 seconds)
{code}
{noformat}
00-00 Screen : rowType = RecordType(DYNAMIC_STAR **): rowcount = 2.0,
cumulative cost = {6.2 rows, 8.2 cpu, 140.0 io, 0.0 network, 0.0 memory}, id =
9981
00-01 Project(**=[$0]) : rowType = RecordType(DYNAMIC_STAR **): rowcount =
2.0, cumulative cost = {6.0 rows, 8.0 cpu, 140.0 io, 0.0 network, 0.0 memory},
id = 9980
00-02 Project(**=[$0]) : rowType = RecordType(DYNAMIC_STAR **): rowcount
= 2.0, cumulative cost = {4.0 rows, 6.0 cpu, 140.0 io, 0.0 network, 0.0
memory}, id = 9979
00-03 Scan(table=[[mongo, local, json_test]],
groupscan=[MongoGroupScan [MongoScanSpec=MongoScanSpec [dbName=local,
collectionName=json_test, filters=Document{{name=Document{{$eq=thiago}}}}],
columns=[`**`, `name`]]]) : rowType = RecordType(DYNAMIC_STAR **, ANY name):
rowcount = 2.0, cumulative cost = {2.0 rows, 4.0 cpu, 140.0 io, 0.0 network,
0.0 memory}, id = 9978
{noformat}
Datasource JSON file in attachments.
Upload it to MongoDB:
{code}
vitalii@vitalii-UX331UN:~$ mongoimport --host localhost --db local --collection
vitalii_test --file /tmp/test.json
{code}
{code}
vitalii@vitalii-UX331UN:~$ mongo
...
rs0:PRIMARY> db.vitalii_test.find()
{ "_id" : ObjectId("5cb0e161f0849231dfe16d99"), "name" : "thiago" }
{ "_id" : ObjectId("5cb0e161f0849231dfe16c01"), "name" : "manuel" }
{code}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)