[GitHub] [hudi] clownxc commented on a diff in pull request #8659: [HUDI-6155] Fix cleaner based on hours for earliest commit to retain
clownxc commented on code in PR #8659:
URL: https://github.com/apache/hudi/pull/8659#discussion_r1193244670

## hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieInstantTimeGenerator.java: ##
@@ -144,4 +144,10 @@ public static boolean isValidInstantTime(String instantTime) {
       return false;
     }
   }
+
+  private static ZoneId getZoneId() {
+    return commitTimeZone.equals(HoodieTimelineTimeZone.LOCAL)
+        ? ZoneId.systemDefault()

Review Comment:
> See the discussions we take in: #8631

There seems to be no good way to get `HoodieTimelineTimeZone` through `HoodieTableMetaClient` in `HoodieInstantTimeGenerator`. I currently get `HoodieTimelineTimeZone` by instantiating a `HoodieTableConfig`. Can you give me some advice?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] clownxc commented on a diff in pull request #8659: [HUDI-6155] Fix cleaner based on hours for earliest commit to retain
clownxc commented on code in PR #8659:
URL: https://github.com/apache/hudi/pull/8659#discussion_r1191784979

## hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieInstantTimeGenerator.java: ##
@@ -144,4 +144,10 @@ public static boolean isValidInstantTime(String instantTime) {
       return false;
     }
   }
+
+  private static ZoneId getZoneId() {
+    return commitTimeZone.equals(HoodieTimelineTimeZone.LOCAL)
+        ? ZoneId.systemDefault()

Review Comment:
> metaClient

I see. I will try to modify the code as you say.
[GitHub] [hudi] clownxc commented on a diff in pull request #8659: [HUDI-6155] Fix cleaner based on hours for earliest commit to retain
clownxc commented on code in PR #8659:
URL: https://github.com/apache/hudi/pull/8659#discussion_r1191785146

## hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieInstantTimeGenerator.java: ##
@@ -144,4 +144,10 @@ public static boolean isValidInstantTime(String instantTime) {
       return false;
     }
   }
+
+  private static ZoneId getZoneId() {
+    return commitTimeZone.equals(HoodieTimelineTimeZone.LOCAL)
+        ? ZoneId.systemDefault()

Review Comment:
> See the discussions we take in: #8631

I see. I will try to modify the code as you say.
[GitHub] [hudi] clownxc commented on a diff in pull request #8659: [HUDI-6155] Fix cleaner based on hours for earliest commit to retain
clownxc commented on code in PR #8659:
URL: https://github.com/apache/hudi/pull/8659#discussion_r1190155373

## hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieInstantTimeGenerator.java: ##
@@ -144,4 +144,10 @@ public static boolean isValidInstantTime(String instantTime) {
       return false;
     }
   }
+
+  private static ZoneId getZoneId() {
+    return commitTimeZone.equals(HoodieTimelineTimeZone.LOCAL)
+        ? ZoneId.systemDefault()

Review Comment:
> If possible, fetch the timezone whout metaClient.tableConfig, the `HoodieTimelineTimeZone` can not assure the initialization of zoneId.

The class `HoodieInstantTimeGenerator` sets an initial value (`HoodieTimelineTimeZone.LOCAL`) for the property `commitTimeZone`:

```java
private static HoodieTimelineTimeZone commitTimeZone = HoodieTimelineTimeZone.LOCAL;
```

The `commitTimeZone` value is then updated in `HoodieTableConfig#create`:

```java
if (hoodieConfig.contains(TIMELINE_TIMEZONE)) {
  HoodieInstantTimeGenerator.setCommitTimeZone(HoodieTimelineTimeZone.valueOf(hoodieConfig.getString(TIMELINE_TIMEZONE)));
}
```

```java
public static void setCommitTimeZone(HoodieTimelineTimeZone commitTimeZone) {
  HoodieInstantTimeGenerator.commitTimeZone = commitTimeZone;
}
```

So I think deriving the `ZoneId` from `HoodieTimelineTimeZone` should be correct, and I don't really understand what `the HoodieTimelineTimeZone can not assure the initialization of zoneId` means. I am not sure whether my idea is correct; looking forward to your reply.
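One possible reading of the reviewer's concern, sketched in isolation: a static field that defaults to `LOCAL` resolves to the system zone until something calls the setter, so a caller that runs before the table config is loaded would not see the table's configured zone. The names below (`CommitZoneHolder`, `TimelineTimeZone`) are hypothetical stand-ins, not Hudi classes, and the UTC branch is an assumption because the diff hunk is cut off before the second arm of the ternary.

```java
import java.time.ZoneId;
import java.time.ZoneOffset;

// Hypothetical stand-in for a class that keeps the commit time zone in a static field.
public class CommitZoneHolder {

  // Hypothetical enum for illustration only.
  enum TimelineTimeZone { LOCAL, UTC }

  // Defaults to LOCAL; nothing forces a caller to set it before the first read.
  private static volatile TimelineTimeZone commitTimeZone = TimelineTimeZone.LOCAL;

  static void setCommitTimeZone(TimelineTimeZone tz) {
    commitTimeZone = tz;
  }

  // Assumption: any non-LOCAL value maps to UTC.
  static ZoneId getZoneId() {
    return commitTimeZone == TimelineTimeZone.LOCAL
        ? ZoneId.systemDefault()
        : ZoneOffset.UTC;
  }
}
```

Until `setCommitTimeZone` runs, `getZoneId()` returns the system default, whatever the table was configured with. That ordering dependency is what passing the zone in explicitly would avoid.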
[GitHub] [hudi] clownxc commented on a diff in pull request #8659: [HUDI-6155] Fix cleaner based on hours for earliest commit to retain
clownxc commented on code in PR #8659:
URL: https://github.com/apache/hudi/pull/8659#discussion_r1190046653

## hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieInstantTimeGenerator.java: ##
@@ -144,4 +144,10 @@ public static boolean isValidInstantTime(String instantTime) {
       return false;
     }
   }
+
+  private static ZoneId getZoneId() {
+    return commitTimeZone.equals(HoodieTimelineTimeZone.LOCAL)
+        ? ZoneId.systemDefault()

Review Comment:
> If possible, fetch the timezone whout metaClient.tableConfig, the `HoodieTimelineTimeZone` can not assure the initialization of zoneId.

I will try to modify the code as you say.
[GitHub] [hudi] clownxc commented on a diff in pull request #8659: [HUDI-6155] Fix cleaner based on hours for earliest commit to retain
clownxc commented on code in PR #8659:
URL: https://github.com/apache/hudi/pull/8659#discussion_r1189257444

## hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieInstantTimeGenerator.java: ##
@@ -94,7 +96,9 @@ public static Date parseDateFromInstantTime(String timestamp) throws ParseExcept
       }

       LocalDateTime dt = LocalDateTime.parse(timestampInMillis, MILLIS_INSTANT_TIME_FORMATTER);
-      return Date.from(dt.atZone(ZoneId.systemDefault()).toInstant());
+      Instant instant = dt.atZone(getZoneId()).toInstant();
+      TimeZone.setDefault(TimeZone.getTimeZone(getZoneId()));
+      return Date.from(instant);

Review Comment:
> It is risky to set up timezone per JVM process: `TimeZone.setDefault(`, this could impact all the threads in the JVM.

One of the tests failed. I will try to find the reason and change the code.

```
[ERROR] Failures:
[ERROR] TestHoodieDeltaStreamer.testCleanerDeleteReplacedDataWithArchive:1120 expected: <1> but was: <0>
[ERROR] TestHoodieDeltaStreamer.testCleanerDeleteReplacedDataWithArchive:1090 expected: but was:
[ERROR] TestHoodieDeltaStreamer.testCleanerDeleteReplacedDataWithArchive:1120 expected: <1> but was: <0>
[ERROR] TestHoodieDeltaStreamer.testCleanerDeleteReplacedDataWithArchive:1090 expected: but was:
[INFO]
[ERROR] Tests run: 354, Failures: 4, Errors: 0, Skipped: 7
```
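The reviewer's point can be illustrated without Hudi: the zone is only needed to turn the parsed `LocalDateTime` into an `Instant`, so it can be passed to `atZone` directly and the JVM-wide default left alone. This is a minimal sketch, assuming a simplified seconds-granularity pattern and a hypothetical class name `ExplicitZoneParsing`; it is not the PR's final code.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;
import java.util.Date;

// Hypothetical helper: parses a timestamp in a given zone without mutating global state.
public class ExplicitZoneParsing {

  // Assumption: simplified seconds-granularity layout, for illustration only.
  private static final DateTimeFormatter FORMATTER =
      DateTimeFormatter.ofPattern("yyyyMMddHHmmss");

  // The zone is an explicit argument; TimeZone.setDefault is never called.
  static Date parse(String timestamp, ZoneId zone) {
    LocalDateTime dt = LocalDateTime.parse(timestamp, FORMATTER);
    return Date.from(dt.atZone(zone).toInstant());
  }
}
```

The returned `Date` carries only an epoch offset, so the same string parsed in two zones yields two different instants while other threads in the JVM keep their default zone.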
[GitHub] [hudi] clownxc commented on a diff in pull request #8659: [HUDI-6155] Fix cleaner based on hours for earliest commit to retain
clownxc commented on code in PR #8659:
URL: https://github.com/apache/hudi/pull/8659#discussion_r1188563707

## hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieInstantTimeGenerator.java: ##
@@ -94,7 +96,9 @@ public static Date parseDateFromInstantTime(String timestamp) throws ParseExcept
       }

       LocalDateTime dt = LocalDateTime.parse(timestampInMillis, MILLIS_INSTANT_TIME_FORMATTER);
-      return Date.from(dt.atZone(ZoneId.systemDefault()).toInstant());
+      Instant instant = dt.atZone(getZoneId()).toInstant();
+      TimeZone.setDefault(TimeZone.getTimeZone(getZoneId()));
+      return Date.from(instant);

Review Comment:
> It is risky to set up timezone per JVM process: `TimeZone.setDefault(`, this could impact all the threads in the JVM.

Done.
[GitHub] [hudi] clownxc commented on a diff in pull request #8659: [HUDI-6155] Fix cleaner based on hours for earliest commit to retain
clownxc commented on code in PR #8659:
URL: https://github.com/apache/hudi/pull/8659#discussion_r1188006365

## hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieInstantTimeGenerator.java: ##
@@ -129,7 +129,7 @@ public static String getInstantForDateString(String dateString) {
   }

   private static TemporalAccessor convertDateToTemporalAccessor(Date d) {
-    return d.toInstant().atZone(ZoneId.systemDefault()).toLocalDateTime();
+    return d.toInstant().atZone(getZoneId()).toLocalDateTime();
   }

Review Comment:
> Can we supplement some UTs for `parseDateFromInstantTime` and `convertDateToTemporalAccessor` ?

In addition, `TestHoodieActiveTimeline.java` already contains many UTs related to date parsing, such as:

- `testInvalidInstantDateParsing`
- `testMillisGranularityInstantDateParsing`

etc.
[GitHub] [hudi] clownxc commented on a diff in pull request #8659: [HUDI-6155] Fix cleaner based on hours for earliest commit to retain
clownxc commented on code in PR #8659:
URL: https://github.com/apache/hudi/pull/8659#discussion_r1188004776

## hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieInstantTimeGenerator.java: ##
@@ -129,7 +129,7 @@ public static String getInstantForDateString(String dateString) {
   }

   private static TemporalAccessor convertDateToTemporalAccessor(Date d) {
-    return d.toInstant().atZone(ZoneId.systemDefault()).toLocalDateTime();
+    return d.toInstant().atZone(getZoneId()).toLocalDateTime();
   }

Review Comment:
> convertDateToTemporalAccessor

I added two UTs: `testFormatDateWithCommitTimeZone` and `testInstantDateParsingWithCommitTimeZone`. `testInstantDateParsingWithCommitTimeZone` tests the correctness of `HoodieInstantTimeGenerator#convertDateToTemporalAccessor()`.
[GitHub] [hudi] clownxc commented on a diff in pull request #8659: [HUDI-6155] Fix cleaner based on hours for earliest commit to retain
clownxc commented on code in PR #8659:
URL: https://github.com/apache/hudi/pull/8659#discussion_r1188004776

## hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieInstantTimeGenerator.java: ##
@@ -129,7 +129,7 @@ public static String getInstantForDateString(String dateString) {
   }

   private static TemporalAccessor convertDateToTemporalAccessor(Date d) {
-    return d.toInstant().atZone(ZoneId.systemDefault()).toLocalDateTime();
+    return d.toInstant().atZone(getZoneId()).toLocalDateTime();
   }

Review Comment:
> convertDateToTemporalAccessor

I added two UTs: `testFormatDateWithCommitTimeZone` and `testInstantDateParsingWithCommitTimeZone`. `testInstantDateParsingWithCommitTimeZone` tests the correctness of `HoodieInstantTimeGenerator#convertDateToTemporalAccessor()` via `formatDate()`.
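A round-trip check of the kind such tests perform could look like the sketch below. It is a stand-in under stated assumptions rather than the tests added in the PR: `ZoneRoundTrip`, `format`, and `parse` are hypothetical helpers, and the pattern is a simplified seconds-granularity layout.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Date;

// Hypothetical helpers: format a Date in a given zone and parse it back.
public class ZoneRoundTrip {

  // Assumption: simplified seconds-granularity layout, for illustration only.
  private static final DateTimeFormatter FORMATTER =
      DateTimeFormatter.ofPattern("yyyyMMddHHmmss");

  // Date -> wall-clock string in the given zone.
  static String format(Date d, ZoneId zone) {
    return d.toInstant().atZone(zone).toLocalDateTime().format(FORMATTER);
  }

  // Wall-clock string in the given zone -> Date.
  static Date parse(String timestamp, ZoneId zone) {
    return Date.from(LocalDateTime.parse(timestamp, FORMATTER).atZone(zone).toInstant());
  }

  public static void main(String[] args) {
    Date d = Date.from(Instant.parse("2023-05-10T12:00:00Z"));
    ZoneId shanghai = ZoneId.of("Asia/Shanghai");
    System.out.println(format(d, ZoneOffset.UTC)); // prints 20230510120000
    System.out.println(format(d, shanghai));       // prints 20230510200000
    // Formatting and parsing with the same zone should return the original instant.
    System.out.println(parse(format(d, shanghai), shanghai).equals(d)); // prints true
  }
}
```

Using the same zone on both sides is what makes the round trip lossless. Formatting in one zone and parsing in another shifts the result by the offset difference.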
[GitHub] [hudi] clownxc commented on a diff in pull request #8659: [HUDI-6155] Fix cleaner based on hours for earliest commit to retain
clownxc commented on code in PR #8659:
URL: https://github.com/apache/hudi/pull/8659#discussion_r1187391365

## hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieInstantTimeGenerator.java: ##
@@ -129,7 +129,7 @@ public static String getInstantForDateString(String dateString) {
   }

   private static TemporalAccessor convertDateToTemporalAccessor(Date d) {
-    return d.toInstant().atZone(ZoneId.systemDefault()).toLocalDateTime();
+    return d.toInstant().atZone(getZoneId()).toLocalDateTime();
   }

Review Comment:
> Can we supplement some UTs for `parseDateFromInstantTime` and `convertDateToTemporalAccessor` ?

I would be happy to do it.