Setup:
1. MongoDB with this sample data inside:
db.test.findOne();
{
"_id" : ObjectId("5aa8487d470dd39a635a12f5"),
"name" : "orange",
"context" : {
"time" : ISODate("2018-03-13T21:52:54.940Z"),
"user" : "jack"
}
}
2. Connect with Drill 1.12 and run the query:
select t.context.`time` as `time`, t.context from mongo.test.test t;
The result is:
+------------+---------+
| time | context |
+------------+---------+
| 2018-03-13 | {"time":{"dayOfYear":72,"year":2018,"dayOfMonth":13,"dayOfWeek":2, ... },"user":"jack"} |
+------------+---------+
Note how the same field “time” is formatted differently depending on whether
it is shown as a top-level column or as a nested JSON field.
Has anyone seen this behavior?
Looking at the source code, it appears that the JSON output is produced from
JsonStringHashMap.java:79 (org.apache.drill.exec:vector:1.12.0). This class
uses its own ObjectMapper instance to serialize the map.
Would it be better to add some mixins for the various complex types that may
appear inside this JsonStringHashMap? For example, a DateTime object could be
serialized using its logical representation, rather than through the getters
of the DateTime class used by Drill.
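To make "logical representation" concrete, here is a minimal plain-Java sketch of the output I have in mind. It is only an illustration under assumptions: it does not use Drill's JsonStringHashMap or Jackson, the class LogicalJson and its methods are hypothetical names, and java.time.Instant stands in for the DateTime class that Drill uses.

```java
import java.time.Instant;
import java.time.temporal.TemporalAccessor;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper for illustration only; not part of Drill or Jackson.
class LogicalJson {

    // Serialize a value to JSON, writing date/time values as a single
    // ISO-8601 string instead of expanding them field by field.
    static String toJson(Object value) {
        if (value == null) {
            return "null";
        }
        if (value instanceof Map) {
            StringJoiner fields = new StringJoiner(",", "{", "}");
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                fields.add(quote(String.valueOf(e.getKey())) + ":" + toJson(e.getValue()));
            }
            return fields.toString();
        }
        if (value instanceof Number || value instanceof Boolean) {
            return value.toString();
        }
        if (value instanceof TemporalAccessor) {
            // Logical representation: Instant.toString() is ISO-8601.
            return quote(value.toString());
        }
        if (value instanceof CharSequence) {
            return quote(value.toString());
        }
        throw new IllegalArgumentException("Unsupported type: " + value.getClass());
    }

    // Minimal escaping (backslash and double quote only); enough for this sketch.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        // Same shape as the "context" sub-document in the sample data.
        Map<String, Object> context = new LinkedHashMap<>();
        context.put("time", Instant.parse("2018-03-13T21:52:54.940Z"));
        context.put("user", "jack");
        System.out.println(toJson(context));
        // prints {"time":"2018-03-13T21:52:54.940Z","user":"jack"}
    }
}
```

With a mixin registered on the ObjectMapper, the nested "time" field could presumably come out in a similar single-value form, which would make it easier to compare with the top-level column.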
Thanks.
-- Jiang