paul-rogers commented on issue #1870: DRILL-7359: Add support for DICT type in RowSet Framework URL: https://github.com/apache/drill/pull/1870#issuecomment-552769421 The schema builder issue has been nagging at me. I observed above that the current way of handling nested types is getting too complex, with all the special cases and "resumeFoo" methods. The current design was originally designed to build a `BatchSchema` with `MaterializedField`s,. Those classes require a top-down construction. (That is also why the final `build()` method returns a `BatchSchema` and why we must call `buildSchema()` to get a `TupleMetadata`.) It turns out that the `SchemaMetadata` classes *do not* have the same constraints. They can actually be built bottom-up. It never occurred to me to clean up the old-style of building when we repurposed `SchemaBuilder` to build metadata objects. So, since we can change the structure, we could borrow a technique from, I think, certain JSON builders. Rather than having the schema builder create the nested type builder, ask the user to do so explicitly: ``` TupleMetadata schema = new SchemaBuilder() .addMap(new MapBuilder("myMap") .add("name", MinorType.VARCHAR) .addMap(new MapBuilder("nested") .add("id", MinorType.INT) .build()) .build()) .add("last", MinorType.INT) .build(); ``` Adopting this model as additional benefits. It is currently a bit awkward to create a bare `ColumnMetadata`. We could create helper methods to create each kind of column schema. Something like: ``` foo = SchemaBuilder.column("foo", MinorType.INT, DataMode.REQUIRED); bar = SchemaBuilder.dict("bar", MinorType.VARCHAR) .value(SchemaBuilder.map(...)) .build(); ``` Lots of things we can play with. Given all of this, my advice for the `DICT` type is to figure out something that works with the structure we already have. Then, as a separate PR, we can clean up the `SchemaBuilder` to make it easier to use for the many complex types we now support. That way, you don't have to solve the schema builder problem in this PR.
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services
