[GitHub] [hudi] clownxc commented on a diff in pull request #8659: [HUDI-6155] Fix cleaner based on hours for earliest commit to retain

2023-05-14 Thread via GitHub


clownxc commented on code in PR #8659:
URL: https://github.com/apache/hudi/pull/8659#discussion_r1193244670


##
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieInstantTimeGenerator.java:
##
@@ -144,4 +144,10 @@ public static boolean isValidInstantTime(String instantTime) {
       return false;
     }
   }
+
+  private static ZoneId getZoneId() {
+    return commitTimeZone.equals(HoodieTimelineTimeZone.LOCAL)
+        ? ZoneId.systemDefault()

Review Comment:
   > See the discussions we take in: #8631
   
   There does not seem to be a good way to get `HoodieTimelineTimeZone` through
   `HoodieTableMetaClient` inside `HoodieInstantTimeGenerator`. I currently get
   `HoodieTimelineTimeZone` by instantiating a `HoodieTableConfig`. Can you give
   me some advice?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] clownxc commented on a diff in pull request #8659: [HUDI-6155] Fix cleaner based on hours for earliest commit to retain

2023-05-11 Thread via GitHub


clownxc commented on code in PR #8659:
URL: https://github.com/apache/hudi/pull/8659#discussion_r1191784979


##
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieInstantTimeGenerator.java:
##
@@ -144,4 +144,10 @@ public static boolean isValidInstantTime(String instantTime) {
       return false;
     }
   }
+
+  private static ZoneId getZoneId() {
+    return commitTimeZone.equals(HoodieTimelineTimeZone.LOCAL)
+        ? ZoneId.systemDefault()

Review Comment:
   > metaClient
   
   I see. I will try to modify the code as you suggest.






[GitHub] [hudi] clownxc commented on a diff in pull request #8659: [HUDI-6155] Fix cleaner based on hours for earliest commit to retain

2023-05-11 Thread via GitHub


clownxc commented on code in PR #8659:
URL: https://github.com/apache/hudi/pull/8659#discussion_r1191785146


##
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieInstantTimeGenerator.java:
##
@@ -144,4 +144,10 @@ public static boolean isValidInstantTime(String instantTime) {
       return false;
     }
   }
+
+  private static ZoneId getZoneId() {
+    return commitTimeZone.equals(HoodieTimelineTimeZone.LOCAL)
+        ? ZoneId.systemDefault()

Review Comment:
   > See the discussions we take in: #8631
   
   I see. I will try to modify the code as you suggest.






[GitHub] [hudi] clownxc commented on a diff in pull request #8659: [HUDI-6155] Fix cleaner based on hours for earliest commit to retain

2023-05-10 Thread via GitHub


clownxc commented on code in PR #8659:
URL: https://github.com/apache/hudi/pull/8659#discussion_r1190155373


##
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieInstantTimeGenerator.java:
##
@@ -144,4 +144,10 @@ public static boolean isValidInstantTime(String instantTime) {
       return false;
     }
   }
+
+  private static ZoneId getZoneId() {
+    return commitTimeZone.equals(HoodieTimelineTimeZone.LOCAL)
+        ? ZoneId.systemDefault()

Review Comment:
   > If possible, fetch the timezone whout metaClient.tableConfig, the 
`HoodieTimelineTimeZone` can not assure the initialization of zoneId.
   
   
   The class `HoodieInstantTimeGenerator` sets an initial value
   (`HoodieTimelineTimeZone.LOCAL`) for the property `commitTimeZone`:
   ```java
   private static HoodieTimelineTimeZone commitTimeZone = HoodieTimelineTimeZone.LOCAL;
   ```
   `HoodieTableConfig#create` then updates the `commitTimeZone` value:
   ```java
   if (hoodieConfig.contains(TIMELINE_TIMEZONE)) {
     HoodieInstantTimeGenerator.setCommitTimeZone(
         HoodieTimelineTimeZone.valueOf(hoodieConfig.getString(TIMELINE_TIMEZONE)));
   }
   ```
   
   ```java
   public static void setCommitTimeZone(HoodieTimelineTimeZone commitTimeZone) {
     HoodieInstantTimeGenerator.commitTimeZone = commitTimeZone;
   }
   ```
   So I think deriving the `ZoneId` from `HoodieTimelineTimeZone` should be
   correct, and I don't really understand what `the HoodieTimelineTimeZone can not
   assure the initialization of zoneId` means.
   I am not sure my reasoning is right, and I look forward to your reply.
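
The pattern described above (a static field that defaults to LOCAL, a setter that overrides it, and a `getZoneId()` that maps the setting to a `ZoneId`) can be sketched in isolation as follows. This is a minimal illustration, not the PR's code: `CommitZoneSketch` and its nested `TimelineTimeZone` enum are hypothetical stand-ins for `HoodieInstantTimeGenerator` and `HoodieTimelineTimeZone`, and the UTC branch is an assumption, because the diff hunk is cut off after the `ZoneId.systemDefault()` branch.

```java
import java.time.ZoneId;

// Hypothetical stand-in for HoodieInstantTimeGenerator; not Hudi code.
final class CommitZoneSketch {

  // Hypothetical stand-in for HoodieTimelineTimeZone so the sketch compiles alone.
  enum TimelineTimeZone { LOCAL, UTC }

  // Defaults to LOCAL, so getZoneId() is usable even if the setter is never called.
  private static volatile TimelineTimeZone commitTimeZone = TimelineTimeZone.LOCAL;

  private CommitZoneSketch() {
  }

  // Would be called once the table config has been read.
  static void setCommitTimeZone(TimelineTimeZone timeZone) {
    commitTimeZone = timeZone;
  }

  // LOCAL maps to the JVM default zone; the UTC mapping is assumed here.
  static ZoneId getZoneId() {
    return commitTimeZone == TimelineTimeZone.LOCAL
        ? ZoneId.systemDefault()
        : ZoneId.of("UTC");
  }

  public static void main(String[] args) {
    System.out.println("before setter: " + getZoneId());
    setCommitTimeZone(TimelineTimeZone.UTC);
    System.out.println("after setter:  " + getZoneId());
  }
}
```

Because the field is static, whichever table config is loaded last decides the zone for the whole JVM, which may be part of what the reviewer's concern about initialization refers to.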






[GitHub] [hudi] clownxc commented on a diff in pull request #8659: [HUDI-6155] Fix cleaner based on hours for earliest commit to retain

2023-05-10 Thread via GitHub


clownxc commented on code in PR #8659:
URL: https://github.com/apache/hudi/pull/8659#discussion_r1190046653


##
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieInstantTimeGenerator.java:
##
@@ -144,4 +144,10 @@ public static boolean isValidInstantTime(String instantTime) {
       return false;
     }
   }
+
+  private static ZoneId getZoneId() {
+    return commitTimeZone.equals(HoodieTimelineTimeZone.LOCAL)
+        ? ZoneId.systemDefault()

Review Comment:
   > If possible, fetch the timezone whout metaClient.tableConfig, the 
`HoodieTimelineTimeZone` can not assure the initialization of zoneId.
   
   I will try to modify the code as you suggest.






[GitHub] [hudi] clownxc commented on a diff in pull request #8659: [HUDI-6155] Fix cleaner based on hours for earliest commit to retain

2023-05-09 Thread via GitHub


clownxc commented on code in PR #8659:
URL: https://github.com/apache/hudi/pull/8659#discussion_r1189257444


##
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieInstantTimeGenerator.java:
##
@@ -94,7 +96,9 @@ public static Date parseDateFromInstantTime(String timestamp) throws ParseExcept
       }
 
       LocalDateTime dt = LocalDateTime.parse(timestampInMillis, MILLIS_INSTANT_TIME_FORMATTER);
-      return Date.from(dt.atZone(ZoneId.systemDefault()).toInstant());
+      Instant instant = dt.atZone(getZoneId()).toInstant();
+      TimeZone.setDefault(TimeZone.getTimeZone(getZoneId()));
+      return Date.from(instant);

Review Comment:
   > It is risky to set up timezone per JVM process: `TimeZone.setDefault(`, 
this could impact all the threads in the JVM.
   
   One of the tests failed. I will find the cause and update the code.
   ```
   [ERROR] Failures: 
   [ERROR]   TestHoodieDeltaStreamer.testCleanerDeleteReplacedDataWithArchive:1120 expected: <1> but was: <0>
   [ERROR]   TestHoodieDeltaStreamer.testCleanerDeleteReplacedDataWithArchive:1090 expected:  but was: 
   [ERROR]   TestHoodieDeltaStreamer.testCleanerDeleteReplacedDataWithArchive:1120 expected: <1> but was: <0>
   [ERROR]   TestHoodieDeltaStreamer.testCleanerDeleteReplacedDataWithArchive:1090 expected:  but was: 
   [INFO] 
   [ERROR] Tests run: 354, Failures: 4, Errors: 0, Skipped: 7
   ```






[GitHub] [hudi] clownxc commented on a diff in pull request #8659: [HUDI-6155] Fix cleaner based on hours for earliest commit to retain

2023-05-09 Thread via GitHub


clownxc commented on code in PR #8659:
URL: https://github.com/apache/hudi/pull/8659#discussion_r1188563707


##
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieInstantTimeGenerator.java:
##
@@ -94,7 +96,9 @@ public static Date parseDateFromInstantTime(String timestamp) throws ParseExcept
       }
 
       LocalDateTime dt = LocalDateTime.parse(timestampInMillis, MILLIS_INSTANT_TIME_FORMATTER);
-      return Date.from(dt.atZone(ZoneId.systemDefault()).toInstant());
+      Instant instant = dt.atZone(getZoneId()).toInstant();
+      TimeZone.setDefault(TimeZone.getTimeZone(getZoneId()));
+      return Date.from(instant);

Review Comment:
   > It is risky to set up timezone per JVM process: `TimeZone.setDefault(`, 
this could impact all the threads in the JVM.
   
   done
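
For reference, a minimal sketch of zone-aware parsing that never calls `TimeZone.setDefault(...)`: the zone is passed explicitly to `atZone`, so no other thread in the JVM is affected. The class name `ExplicitZoneParseSketch` and the formatter definition are assumptions for illustration; the real `MILLIS_INSTANT_TIME_FORMATTER` may be built differently.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical helper; not Hudi code.
final class ExplicitZoneParseSketch {

  // Assumed millisecond-granularity layout: yyyyMMddHHmmss followed by 3 millis digits.
  private static final DateTimeFormatter FORMATTER = new DateTimeFormatterBuilder()
      .appendPattern("yyyyMMddHHmmss")
      .appendValue(ChronoField.MILLI_OF_SECOND, 3)
      .toFormatter();

  private ExplicitZoneParseSketch() {
  }

  // Interprets the wall-clock instant time in the given zone; no global state is touched.
  static Date parse(String instantTime, ZoneId zone) {
    LocalDateTime dt = LocalDateTime.parse(instantTime, FORMATTER);
    return Date.from(dt.atZone(zone).toInstant());
  }

  public static void main(String[] args) {
    TimeZone before = TimeZone.getDefault();
    Date utc = parse("20230509120000000", ZoneOffset.UTC);
    Date plus8 = parse("20230509120000000", ZoneOffset.ofHours(8));
    // The same wall-clock string maps to different epoch values per zone.
    System.out.println(utc.getTime() - plus8.getTime());
    // The JVM default zone is the same object state as before the calls.
    System.out.println(before.equals(TimeZone.getDefault()));
  }
}
```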
   






[GitHub] [hudi] clownxc commented on a diff in pull request #8659: [HUDI-6155] Fix cleaner based on hours for earliest commit to retain

2023-05-08 Thread via GitHub


clownxc commented on code in PR #8659:
URL: https://github.com/apache/hudi/pull/8659#discussion_r1188006365


##
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieInstantTimeGenerator.java:
##
@@ -129,7 +129,7 @@ public static String getInstantForDateString(String dateString) {
   }
 
   private static TemporalAccessor convertDateToTemporalAccessor(Date d) {
-    return d.toInstant().atZone(ZoneId.systemDefault()).toLocalDateTime();
+    return d.toInstant().atZone(getZoneId()).toLocalDateTime();
   }
 

Review Comment:
   > Can we supplement some UTs for `parseDateFromInstantTime` and 
`convertDateToTemporalAccessor` ?
   
   `TestHoodieActiveTimeline.java` also already has many UTs related to date
   parsing, such as:
   - `testInvalidInstantDateParsing` 
   - `testMillisGranularityInstantDateParsing` 
   etc.






[GitHub] [hudi] clownxc commented on a diff in pull request #8659: [HUDI-6155] Fix cleaner based on hours for earliest commit to retain

2023-05-08 Thread via GitHub


clownxc commented on code in PR #8659:
URL: https://github.com/apache/hudi/pull/8659#discussion_r1188004776


##
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieInstantTimeGenerator.java:
##
@@ -129,7 +129,7 @@ public static String getInstantForDateString(String dateString) {
   }
 
   private static TemporalAccessor convertDateToTemporalAccessor(Date d) {
-    return d.toInstant().atZone(ZoneId.systemDefault()).toLocalDateTime();
+    return d.toInstant().atZone(getZoneId()).toLocalDateTime();
   }
 

Review Comment:
   > convertDateToTemporalAccessor
   
   I added two UTs: `testFormatDateWithCommitTimeZone` and
   `testInstantDateParsingWithCommitTimeZone`.
   `testInstantDateParsingWithCommitTimeZone` tests the correctness of
   `HoodieInstantTimeGenerator#convertDateToTemporalAccessor()`.






[GitHub] [hudi] clownxc commented on a diff in pull request #8659: [HUDI-6155] Fix cleaner based on hours for earliest commit to retain

2023-05-08 Thread via GitHub


clownxc commented on code in PR #8659:
URL: https://github.com/apache/hudi/pull/8659#discussion_r1188004776


##
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieInstantTimeGenerator.java:
##
@@ -129,7 +129,7 @@ public static String getInstantForDateString(String dateString) {
   }
 
   private static TemporalAccessor convertDateToTemporalAccessor(Date d) {
-    return d.toInstant().atZone(ZoneId.systemDefault()).toLocalDateTime();
+    return d.toInstant().atZone(getZoneId()).toLocalDateTime();
   }
 

Review Comment:
   > convertDateToTemporalAccessor
   
   I added two UTs: `testFormatDateWithCommitTimeZone` and
   `testInstantDateParsingWithCommitTimeZone`.
   `testInstantDateParsingWithCommitTimeZone` tests the correctness of
   `HoodieInstantTimeGenerator#convertDateToTemporalAccessor()` via
   `formatDate()`.
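
As a rough illustration of what such a check can look like, the sketch below formats a fixed `Date` in two zones and compares the resulting strings. It is not the test code added in the PR: `FormatDateSketch`, its `formatDate(Date, ZoneId)` signature, and the formatter layout are assumptions made so the example runs on its own.

```java
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;
import java.util.Date;

// Hypothetical helper mirroring the Date -> LocalDateTime -> String path; not Hudi code.
final class FormatDateSketch {

  // Assumed millisecond-granularity layout: yyyyMMddHHmmss followed by 3 millis digits.
  private static final DateTimeFormatter FORMATTER = new DateTimeFormatterBuilder()
      .appendPattern("yyyyMMddHHmmss")
      .appendValue(ChronoField.MILLI_OF_SECOND, 3)
      .toFormatter();

  private FormatDateSketch() {
  }

  // Converts the Date to wall-clock time in the given zone, then formats it.
  static String formatDate(Date d, ZoneId zone) {
    return FORMATTER.format(d.toInstant().atZone(zone).toLocalDateTime());
  }

  public static void main(String[] args) {
    Date epoch = new Date(0L);
    // Same instant, different wall-clock strings depending on the zone.
    System.out.println(formatDate(epoch, ZoneOffset.UTC));
    System.out.println(formatDate(epoch, ZoneOffset.ofHours(8)));
  }
}
```

Using a fixed epoch value and fixed offsets keeps the expected strings independent of the machine's default zone, which is the property a commit-time-zone test needs.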






[GitHub] [hudi] clownxc commented on a diff in pull request #8659: [HUDI-6155] Fix cleaner based on hours for earliest commit to retain

2023-05-08 Thread via GitHub


clownxc commented on code in PR #8659:
URL: https://github.com/apache/hudi/pull/8659#discussion_r1187391365


##
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieInstantTimeGenerator.java:
##
@@ -129,7 +129,7 @@ public static String getInstantForDateString(String dateString) {
   }
 
   private static TemporalAccessor convertDateToTemporalAccessor(Date d) {
-    return d.toInstant().atZone(ZoneId.systemDefault()).toLocalDateTime();
+    return d.toInstant().atZone(getZoneId()).toLocalDateTime();
   }
 

Review Comment:
   > Can we supplement some UTs for `parseDateFromInstantTime` and 
`convertDateToTemporalAccessor` ?
   
   I would be happy to do it.


