[ https://issues.apache.org/jira/browse/HIVE-12080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14965258#comment-14965258 ]
Sergio Peña commented on HIVE-12080: ------------------------------------ The patch looks good. It is exactly what I had in mind. I just have some questions and comments. 1. {{private TypeInfo hiveTypeInfo;}} This variable is a member of the class, and the value is assigned on {{init}}, and read on {{prepareForRead}}. This means that the class must be instantiated in order to call both init and prepareForRead. I don't see where this class {{DataWritableReadSupport}} is instantiated besides this code below on {{ParquetRecordReaderWrapper}}: {code} final ReadContext readContext = new DataWritableReadSupport().init(new InitContext(jobConf, null, fileMetaData.getSchema())); {code} Question: How is the {{hiveTypeInfo}} value able to have the same value on prepareForRead? I don't remember how this works. 2. I think we need to do add more test cases. We should test maps, arrays and primitive columns as well. You added the struct on {{parquet_schema_type_evolution.q}}. Let's add more to be sure we test every change. Another test case I have in mind is to have 2 parquet files, one with {{int}} and another with {{bigint}}. This way we could test the mix of both integers. Also, should we rename the .q file? something like, parquet_schema_column_type_promotion.q? I don't know, I am just brainstorming. 3. {{ETypeConverter.java}} is exactly what I had in mind. Good job for that. I am just having this developer idea if we can do something like [~rdblue] mentioned where we can use an interface, such as NumericWritable where it has getInt(), getLong(), etc. implemented. I don't know how yet, but if you can think on something that can make this etypeconverter class look clean would be great. If not, don't worry. This patch does the work we are expecting. 4. Are you going to support other types, like short? strings? I would like to get rid of ParquetShortInspector, ParquetStringInspector and ParquetByteInspector. Overall, the patch looks pretty good. Next time, could you create a Review Board so I can leave the comments directly? This would be much better. Thanks. > Support auto type widening for Parquet table > -------------------------------------------- > > Key: HIVE-12080 > URL: https://issues.apache.org/jira/browse/HIVE-12080 > Project: Hive > Issue Type: New Feature > Components: File Formats > Reporter: Mohammad Kamrul Islam > Assignee: Mohammad Kamrul Islam > Attachments: HIVE-12080.1.patch > > > Currently Hive+Parquet doesn't support it. It should include at least basic > type promotions short->int->bigint, float->double etc, that are already > supported for other file formats. > There were similar effort (Hive-6784) but was not committed. This JIRA is to > address the same in different way with little (no) performance impact. > -- This message was sent by Atlassian JIRA (v6.3.4#6332)