Hi,

Hive 0.9.0 + Elephant-Bird 3.0.7

I ran into a problem using Elephant-Bird with Hive. I think I know what causes it, but I don't know which side the bug belongs to. Let me explain the problem.

If we define a Google protobuf file with a field name like 'dateString' (the field name contains an uppercase 'S'), then when I query the table like this:

select dateString from table .............

I will get the following exception trace:

Caused by: java.lang.RuntimeException: cannot find field datestring from [org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@49aacd5f .....................
    at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:321)
    at org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector.getStructFieldRef(UnionStructObjectInspector.java:96)
    at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:57)
    at org.apache.hadoop.hive.ql.exec.Operator.initEvaluators(Operator.java:878)
    at org.apache.hadoop.hive.ql.exec.Operator.initEvaluatorsAndReturnStruct(Operator.java:904)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:60)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:357)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:433)
    at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:389)
    at org.apache.hadoop.hive.ql.exec.FilterOperator.initializeOp(FilterOperator.java:73)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:357)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:433)
    at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:389)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:133)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:357)
    at org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(MapOperator.java:444)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:357)
    at org.apache.hadoop.hive.ql.exec.ExecMapper.configure(ExecMapper.java:98)

Here is the code of the method that throws this error:

    public static StructField getStandardStructFieldRef(String fieldName,
        List<? extends StructField> fields) {
      // The requested name is lowercased, but the stored names are not.
      fieldName = fieldName.toLowerCase();
      for (int i = 0; i < fields.size(); i++) {
        if (fields.get(i).getFieldName().equals(fieldName)) {
          return fields.get(i);
        }
      }
      // For backward compatibility: fieldNames can also be integer Strings.
      try {
        int i = Integer.parseInt(fieldName);
        if (i >= 0 && i < fields.size()) {
          return fields.get(i);
        }
      } catch (NumberFormatException e) {
        // ignore
      }
      throw new RuntimeException("cannot find field " + fieldName + " from "
          + fields);
      // return null;
    }

I understand that the problem happens because at this point fieldName is "datestring" (all lowercase characters), while the list of fields holds the name of that field as "dateString". The equals() comparison therefore never matches, and that is why the RuntimeException is thrown.

But I don't know which side this bug belongs to, and I would like to know more about the contract of the Hive implementation here.

From this link:
https://cwiki.apache.org/Hive/user-faq.html#UserFAQ-AreHiveQLidentifiers%2528e.g.tablenames%252Ccolumnnames%252Cetc%2529casesensitive%253F
I know that in Hive, table names and column names are case insensitive. So even though I used "select dateString" in my query, the fieldName is changed to "datestring" in the code. The StructField of the ObjectInspector from Elephant-Bird, however, returns the EXACT field name defined in the proto file, "dateString" in this case.

Of course, I can change my proto file to use only lowercase field names to work around this, but my questions are:

1) If I implement my own ObjectInspector, should I pay attention to the field names? Do they need to be lowercase?

2) I would consider this a bug in Hive, right? If this line:

    fieldName = fieldName.toLowerCase();

lowercases the requested name, then the comparison should lowercase the other side as well, by changing

    if (fields.get(i).getFieldName().equals(fieldName))

to

    if (fields.get(i).getFieldName().toLowerCase().equals(fieldName))

right?

Thanks
Yong
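
P.S. To make the mismatch concrete, here is a minimal standalone sketch of the two comparison strategies. It is only an illustration under simplifying assumptions: the class name FieldLookupSketch and both method names are made up for this example, plain strings stand in for Hive's StructField objects, the integer-string fallback is left out, and -1 is returned where the real method throws a RuntimeException.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the lookup in ObjectInspectorUtils; not Hive code.
final class FieldLookupSketch {

    // Mirrors the quoted 0.9.0 logic: only the requested name is lowercased,
    // so a declared name containing uppercase letters can never match.
    static int findAsInHive090(String fieldName, List<String> declaredNames) {
        String wanted = fieldName.toLowerCase();
        for (int i = 0; i < declaredNames.size(); i++) {
            if (declaredNames.get(i).equals(wanted)) {
                return i;
            }
        }
        return -1; // the real method throws RuntimeException here
    }

    // Variant along the lines of question 2: ignore case on both sides.
    static int findIgnoringCase(String fieldName, List<String> declaredNames) {
        for (int i = 0; i < declaredNames.size(); i++) {
            if (declaredNames.get(i).equalsIgnoreCase(fieldName)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Field names as an ObjectInspector might report them from a proto file.
        List<String> names = Arrays.asList("id", "dateString");
        System.out.println(findAsInHive090("dateString", names));
        System.out.println(findIgnoringCase("dateString", names));
    }
}
```

With the first method the mixed-case field is not found, while the second one finds it. The other possible fix, related to question 1, would be for the ObjectInspector itself to report lowercase field names.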