We are working on a custom Drill storage plugin to retrieve data from a
proprietary JSON-based store. The Drill version being used is 1.4.0.
Assume the data is JSON that looks like this:
{
...
"topping": {
"id": "5001, 5002, ...",
...
}
...
}
When we submit this query:
select t.topping.id from meld.project1.event.table1 t;
The plugin receives a SchemaPath object for "topping.id". The plugin then
creates an output vector with the provided SchemaPath as the field name, e.g.
// Assuming we always use this type.
final MajorType type = Types.optional(MinorType.VARCHAR);
// schemaPath is the SchemaPath given to the plugin.
final MaterializedField field = MaterializedField.create(schemaPath, type);
final Class<? extends ValueVector> clazz = (Class<? extends ValueVector>)
    TypeHelper.getValueVectorClass(type.getMinorType(), type.getMode());
ValueVector vector = output.addField(field, clazz);
The data is then added to the vector and returned. However, Drill does not
seem to match the returned field to the column it expects, so we get
something like this:
0: jdbc:drill:zk=local> select t.topping.id from meld.project1.event.table1 t;
+---------+
| EXPR$0  |
+---------+
| null    |
| null    |
| null    |
| null    |
| null    |
| null    |
+---------+
We were expecting to get this instead:
0: jdbc:drill:zk=local> select t.`topping.id` from meld.project1.event.table1 t;
+-------------------------------------------+
| topping.id                                |
+-------------------------------------------+
| 5001, 5002, 5003, 5004                    |
| 5001, 5002, 5005, 5007, 5006, 5003, 5004  |
| 5001, 5002, 5003, 5004                    |
| 5001, 5002, 5005, 5007, 5006, 5003, 5004  |
| 5001, 5002, 5005, 5003, 5004              |
| 5001, 5002, 5005, 5003, 5004              |
+-------------------------------------------+
In the second query, the SchemaPath is a nested path. Our plugin accepts this
specification and retrieves the results shown above. So what are we doing
wrong in the first query? How do we correctly return values for a nested JSON
field?
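To make the difference between the two queries concrete, here is a minimal
sketch in plain Java of how we understand the two paths. It uses no Drill
classes; PathDemo, resolve and sampleRecord are hypothetical names for
illustration only. It also rests on an assumption of ours that may be wrong:
that the unquoted form yields two segments ("topping", then "id") while the
backquoted form yields a single segment named "topping.id".

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PathDemo {

    // Walk a nested map one segment at a time.
    // Returns null as soon as a step cannot be resolved.
    static Object resolve(Map<String, Object> record, List<String> segments) {
        Object current = record;
        for (String segment : segments) {
            if (!(current instanceof Map)) {
                return null;
            }
            current = ((Map<?, ?>) current).get(segment);
        }
        return current;
    }

    // A stand-in for one record from the JSON shown earlier (hypothetical values).
    static Map<String, Object> sampleRecord() {
        Map<String, Object> topping = new LinkedHashMap<>();
        topping.put("id", "5001, 5002, 5003, 5004");
        Map<String, Object> record = new LinkedHashMap<>();
        record.put("topping", topping);
        return record;
    }

    public static void main(String[] args) {
        Map<String, Object> record = sampleRecord();
        // Assumed shape for t.topping.id: two segments, "topping" then "id".
        System.out.println(resolve(record, Arrays.asList("topping", "id")));
        // Assumed shape for t.`topping.id`: one segment whose name contains a dot.
        System.out.println(resolve(record, Arrays.asList("topping.id")));
    }
}
```

In this sketch the two-segment lookup finds the value and the single-segment
lookup finds nothing, so a field stored under the flat name "topping.id" and a
field nested as topping -> id are not interchangeable. Whether that is the
mismatch behind our EXPR$0 nulls is exactly what we are unsure about.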
Thanks.
-- Jiang