paul-rogers edited a comment on issue #1829: DRILL-7096: Develop vector for 
canonical Map<K,V>
URL: https://github.com/apache/drill/pull/1829#issuecomment-515716535
 
 
   here seems to be little documentation of the design or implementation: I 
find myself having to reverse engineer a design from the code. Would it be 
helpful for other reviewers if we included a bit of explanation?
   
   In particular, what is the structure of the new vector? The true map vector 
seems to have a user-defined key and value type. Does this mean keys can be 
integers or maps? Do the proposed map functions handle non-Varchar keys? Do 
they handle repeated or map keys? If not, should the implementation restrict 
key types?
   
   How do we handle varying value types? (I have a map with, say, {name: 
"Fred", balance: 123}. Would the user specify a Union? Have we fixed the many 
existing problems with the Union type? Or, is the true map meant to be like a 
Java map: Map<K,V>, and not a Python or JSON map with string keys and values of 
any type?
   
   Given the title of this PR, "Develop vector for canonical Map<K,V>", the 
intent seems to be to create a Java-like map. In this case, the name "true" 
map, means "true Java map", not "true JSON map." This may seem like an 
unimportant distinction, but I do point you to [Drill's JSON data 
model](https://drill.apache.org/docs/json-data-model/).
   
   I suppose the the "V" type could be Union to implement a JSON map. Would the 
various functions that work with this map handle union?
   
   Given that this whole exercise simply creates two correlated lists, "key" 
and "value", I do wonder at the need for all the new complexity. Can't we 
simply add functions that work with an existing repeated map where the user 
tells us which column is the key and which is the value? Or, infer the key and 
value if the user uses the names "key" and "value"?
   
   A bit of documentation (a README file, a package-info file or just a Javadoc 
explanation in the vector class) would help answer these questions.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

Reply via email to