[GitHub] [hudi] yihua commented on a diff in pull request #8840: [HUDI-5352] Fix `LocalDate` serialization in colstats

2023-06-23 Thread via GitHub


yihua commented on code in PR #8840:
URL: https://github.com/apache/hudi/pull/8840#discussion_r1240098101


##
hudi-common/src/test/java/org/apache/hudi/common/util/TestJsonUtils.java:
##
@@ -1,60 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *   http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing,
- * software distributed under the License is distributed on an
- * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
- * KIND, either express or implied.  See the License for the
- * specific language governing permissions and limitations
- * under the License.
- */
-
-package org.apache.hudi.common.util;
-
-import org.junit.jupiter.api.Test;
-
-import java.time.Instant;
-import java.util.Arrays;
-import java.util.List;
-import java.util.stream.Collectors;
-
-import static org.junit.jupiter.api.Assertions.assertEquals;
-
-public class TestJsonUtils {

Review Comment:
   This PR only fixes the jsr310 dependency and does not change the
   serialization, so these tests would fail. You can put up a separate PR to
   fix the serialization of data types, but is that required? It would change
   the col stats persisted in the metadata table (MDT).
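   To illustrate the concern about persisted col stats, here is a minimal
   sketch using only `java.time`. It assumes two hypothetical encodings of a
   `LocalDate` (an ISO-8601 string and a `[year, month, day]` array); these are
   illustrative and not taken from Hudi's actual metadata format. The class and
   method names are made up for this example. The point is only that a reader
   written against one encoding cannot parse values written in the other.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical helper, not part of Hudi: contrasts two textual encodings of a date.
class LocalDateEncodings {

  // ISO-8601 text form, e.g. "2023-06-23"
  static String asIsoString(LocalDate date) {
    return date.format(DateTimeFormatter.ISO_LOCAL_DATE);
  }

  // Numeric field-array form, e.g. "[2023,6,23]" (assumed alternative encoding)
  static String asFieldArray(LocalDate date) {
    return "[" + date.getYear() + "," + date.getMonthValue() + "," + date.getDayOfMonth() + "]";
  }

  // A reader that only understands the ISO-8601 text form
  static boolean readableAsIso(String encoded) {
    try {
      LocalDate.parse(encoded, DateTimeFormatter.ISO_LOCAL_DATE);
      return true;
    } catch (DateTimeParseException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    LocalDate date = LocalDate.of(2023, 6, 23);
    System.out.println(asIsoString(date));                  // 2023-06-23
    System.out.println(asFieldArray(date));                 // [2023,6,23]
    System.out.println(readableAsIso(asIsoString(date)));   // true
    System.out.println(readableAsIso(asFieldArray(date)));  // false
  }
}
```

   If the encoding changed between releases, values already written under the
   old encoding would need to remain readable, which is why such a change
   deserves its own PR and discussion.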



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] yihua commented on a diff in pull request #8840: [HUDI-5352] Fix `LocalDate` serialization in colstats

2023-06-08 Thread via GitHub


yihua commented on code in PR #8840:
URL: https://github.com/apache/hudi/pull/8840#discussion_r1223597615


##
hudi-common/src/main/java/org/apache/hudi/common/util/JsonUtils.java:
##
@@ -20,41 +20,74 @@
 package org.apache.hudi.common.util;
 
 import org.apache.hudi.exception.HoodieIOException;
+import org.apache.hudi.util.Lazy;
 
 import com.fasterxml.jackson.annotation.JsonAutoDetect;
 import com.fasterxml.jackson.annotation.PropertyAccessor;
 import com.fasterxml.jackson.core.JsonProcessingException;
 import com.fasterxml.jackson.databind.DeserializationFeature;
 import com.fasterxml.jackson.databind.ObjectMapper;
+import com.fasterxml.jackson.databind.SerializationFeature;
+import com.fasterxml.jackson.databind.util.StdDateFormat;
+import com.fasterxml.jackson.datatype.jsr310.JavaTimeModule;
 
 /**
  * Utils for JSON serialization and deserialization.
  */
 public class JsonUtils {
 
-  private static final ObjectMapper MAPPER = new ObjectMapper();
-
-  static {
-    MAPPER.disable(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES);
-    // We need to exclude custom getters, setters and creators which can use member fields
-    // to derive new fields, so that they are not included in the serialization
-    MAPPER.setVisibility(PropertyAccessor.FIELD, JsonAutoDetect.Visibility.ANY);
-    MAPPER.setVisibility(PropertyAccessor.GETTER, JsonAutoDetect.Visibility.NONE);
-    MAPPER.setVisibility(PropertyAccessor.IS_GETTER, JsonAutoDetect.Visibility.NONE);
-    MAPPER.setVisibility(PropertyAccessor.SETTER, JsonAutoDetect.Visibility.NONE);
-    MAPPER.setVisibility(PropertyAccessor.CREATOR, JsonAutoDetect.Visibility.NONE);
-  }
+  private static final Lazy<ObjectMapper> MAPPER = Lazy.lazily(JsonUtils::instantiateObjectMapper);
 
   public static ObjectMapper getObjectMapper() {
-    return MAPPER;
+    return MAPPER.get();
   }
 
   public static String toString(Object value) {
     try {
-      return MAPPER.writeValueAsString(value);
+      return MAPPER.get().writeValueAsString(value);
     } catch (JsonProcessingException e) {
       throw new HoodieIOException(
           "Fail to convert the class: " + value.getClass().getName() + " to Json String", e);
     }
   }
+
+  private static ObjectMapper instantiateObjectMapper() {
+    ObjectMapper mapper = new ObjectMapper();
+
+    registerModules(mapper);
+
+    // We're writing out dates as their string representations instead of (int) timestamps
+    mapper.disable(SerializationFeature.WRITE_DATES_AS_TIMESTAMPS);
+    // NOTE: This is necessary to make sure that w/ Jackson >= 2.11 colon is not infixed
+    //       into the timezone value ("+00:00" as opposed to "+0000" before 2.11)
+    //       While Jackson is able to parse both of these formats, we keep it as false
+    //       to make sure metadata produced by Hudi stays consistent across Jackson versions
+    configureColonInTimezone(mapper);

Review Comment:
   I think we serialize the column stats to the metadata record payload,
   correct?
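   For reference, the two timezone offset formats that the NOTE in the diff
   refers to can be sketched with plain `java.time`, without Jackson. The body
   of `configureColonInTimezone` is not shown in this hunk; in Jackson the
   relevant switch is `StdDateFormat.withColonInTimeZone(boolean)`, and the
   sketch below only mimics the resulting text. The class name, method names,
   and date pattern are assumptions made for illustration.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Hudi or Jackson: shows the two offset styles.
class TimezoneOffsetFormats {

  // Pattern letters "xx" print the offset without a colon, e.g. "+0000"
  private static final DateTimeFormatter NO_COLON =
      DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ss.SSSxx");

  // Pattern letters "xxx" print the offset with a colon, e.g. "+00:00"
  private static final DateTimeFormatter WITH_COLON =
      DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ss.SSSxxx");

  static String withoutColon(OffsetDateTime time) {
    return NO_COLON.format(time);
  }

  static String withColon(OffsetDateTime time) {
    return WITH_COLON.format(time);
  }

  public static void main(String[] args) {
    OffsetDateTime time = OffsetDateTime.of(2023, 6, 8, 12, 0, 0, 0, ZoneOffset.UTC);
    System.out.println(withoutColon(time)); // 2023-06-08T12:00:00.000+0000
    System.out.println(withColon(time));    // 2023-06-08T12:00:00.000+00:00
  }
}
```

   The two strings denote the same instant but are not byte-identical, so
   pinning one style keeps serialized metadata stable when the Jackson version
   on the classpath changes.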


