[ 
https://issues.apache.org/jira/browse/HIVE-21240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16767045#comment-16767045
 ] 

Zoltan Haindrich commented on HIVE-21240:
-----------------------------------------

Could you please explain why you are making the tests pass by "adapting" them 
to your changes? IIRC HCat uses Lists etc., but the rest of the world uses 
standard arrays:
{code}
 
@@ -109,9 +109,10 @@ public void testStructNullField() throws Exception {
       udf.initialize(arguments);
 
       Object res = udf.evaluate(evalArgs("{\"a\":null}"));
-      assertTrue(res instanceof Object[]);
-      Object o[] = (Object[]) res;
-      assertEquals(null, o[0]);
+      assertTrue(res instanceof List<?>);
+
+      List<?> o = (List<?>) res;
+      assertEquals(null, o.get(0));
     }
   }
{code}
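
To illustrate the concern: any caller written against the old Object[] result 
breaks once the struct comes back as a List. This is a minimal sketch, not Hive 
code; StructResultDemo and both of its methods are hypothetical names made up 
for this example.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical demo class; none of these names exist in Hive. */
final class StructResultDemo {

  /** A caller that, like the original test, expects a struct as Object[]. */
  static Object firstField(Object struct) {
    // Throws ClassCastException if the struct is actually a List.
    Object[] fields = (Object[]) struct;
    return fields[0];
  }

  /** Returns true if a caller written against Object[] can read this value. */
  static boolean worksForArrayCallers(Object struct) {
    try {
      firstField(struct);
      return true;
    } catch (ClassCastException e) {
      return false;
    }
  }
}
```

With an Object[] holding a single null, worksForArrayCallers returns true; with 
a List holding a single null, the cast fails and it returns false.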

I'm not sure about the whole rewrite thing:

{quote}
Use Jackson Tree parser instead of manually parsing
{quote}
Actually, the "manual parsing" avoided building the whole document as a tree 
upfront; it worked with a one-token lookahead.

{quote}
Current JSON parser accepts, but does not apply, custom timestamp formats in 
most cases
{quote}
Can't this be fixed without a rewrite?

{quote}
Added cache for column-name to column-index searches, currently O\(n\) for each 
row processed, for each column in the row
{quote}
I think this could also be fixed in the existing implementation; IIRC there was 
a FIXME about this issue.
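
For example, a name-to-index map built once at initialization could be added to 
the existing code without a rewrite. This is a rough sketch only: 
ColumnIndexCache is a hypothetical class, not part of Hive, and the 
lower-casing of names is an assumption about how column names are matched.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper, not part of Hive: resolves column names to positions. */
final class ColumnIndexCache {
  private final Map<String, Integer> indexByName = new HashMap<>();

  /** Builds the map once, e.g. when the SerDe is initialized. */
  ColumnIndexCache(List<String> columnNames) {
    for (int i = 0; i < columnNames.size(); i++) {
      // Keep the first occurrence, as a linear scan from the front would.
      // Lower-casing is an assumption about case-insensitive matching.
      indexByName.putIfAbsent(columnNames.get(i).toLowerCase(Locale.ROOT), i);
    }
  }

  /** Expected O(1) lookup per field; returns -1 for unknown columns. */
  int indexOf(String fieldName) {
    Integer idx = indexByName.get(fieldName.toLowerCase(Locale.ROOT));
    return idx == null ? -1 : idx;
  }
}
```

The per-row cost then becomes one hash lookup per field instead of a scan over 
all column names.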

Could you please stop reformatting lines that are not modified at all, 
especially comments/apidocs?
{code}
-  private static StringBuilder appendWithQuotes(StringBuilder sb, String value) {
-    return sb == null ? null : sb.append(SerDeUtils.QUOTE).append(value).append(SerDeUtils.QUOTE);
+  private static StringBuilder appendWithQuotes(StringBuilder sb,
+      String value) {
+    return sb == null ? null
+        : sb.append(SerDeUtils.QUOTE).append(value).append(SerDeUtils.QUOTE);

-  // TODO : code section copied over from SerDeUtils because of non-standard json production there
-  // should use quotes for all field names. We should fix this there, and then remove this copy.
+  // TODO : code section copied over from SerDeUtils because of non-standard
+  // json production there
+  // should use quotes for all field names. We should fix this there, and then
+  // remove this copy.
{code}
Please use the Hive formatter 
(https://cwiki.apache.org/confluence/display/Hive/HowToContribute).

> JSON SerDe Deserialize Re-Write
> -------------------------------
>
>                 Key: HIVE-21240
>                 URL: https://issues.apache.org/jira/browse/HIVE-21240
>             Project: Hive
>          Issue Type: Improvement
>          Components: Serializers/Deserializers
>    Affects Versions: 4.0.0, 3.1.1
>            Reporter: BELUGA BEHR
>            Assignee: BELUGA BEHR
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>         Attachments: HIVE-21240.1.patch, HIVE-21240.1.patch, 
> HIVE-21240.2.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The JSON SerDe has a few issues, I will link them to this JIRA.
> * Use Jackson Tree parser instead of manually parsing
> * Added support for base-64 encoded data (the expected format when using JSON)
> * Added support to skip blank lines (returns all columns as null values)
> * Current JSON parser accepts, but does not apply, custom timestamp formats 
> in most cases
> * Added some unit tests
> * Added cache for column-name to column-index searches, currently O\(n\) for 
> each row processed, for each column in the row



