Abacn commented on code in PR #36059:
URL: https://github.com/apache/beam/pull/36059#discussion_r2334279401
##########
sdks/java/io/clickhouse/src/main/java/org/apache/beam/sdk/io/clickhouse/TableSchema.java:
##########
@@ -313,6 +314,10 @@ public static ColumnType parse(String str) {
* @return value of ClickHouse expression
*/
public static Object parseDefaultExpression(ColumnType columnType, String value) {
+ if (value == null || value.isEmpty()) {
+ return null;
Review Comment:
Returning null for a null value is fine, but returning null when value.isEmpty() looks like a breaking change.
Previously, an empty value was returned as an empty string in case STRING.
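
A minimal sketch of the narrower guard, to illustrate the concern. `DefaultGuardSketch`, `TypeName`, and `parseDefault` are hypothetical stand-ins and not the actual `TableSchema` code; only the null check is the point here.

```java
// Hypothetical, simplified stand-in for parseDefaultExpression.
// Only the guard at the top is being illustrated.
final class DefaultGuardSketch {
  // Simplified type tag; the real code switches on ColumnType.
  enum TypeName {
    STRING,
    INT64
  }

  static Object parseDefault(TypeName type, String value) {
    // Guard on null only, so an empty string is still passed through below.
    if (value == null) {
      return null;
    }
    switch (type) {
      case STRING:
        return value; // "" stays "", which keeps the earlier STRING behavior
      case INT64:
        return Long.valueOf(value);
      default:
        throw new UnsupportedOperationException("Unsupported type: " + type);
    }
  }
}
```

With this shape, `parseDefault(TypeName.STRING, "")` still yields an empty string, while a null value yields null.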
##########
sdks/java/io/clickhouse/src/main/java/org/apache/beam/sdk/io/clickhouse/TableSchema.java:
##########
@@ -335,8 +335,73 @@ public static Object parseDefaultExpression(ColumnType columnType, String value)
return Long.valueOf(value);
case UINT64:
return Long.valueOf(value);
+ case ENUM8:
+ case ENUM16:
+ case FIXEDSTRING:
+ case STRING:
+ return value;
case BOOL:
return Boolean.valueOf(value);
+ case FLOAT32:
+ return Float.valueOf(value);
+ case FLOAT64:
+ return Double.valueOf(value);
+ case DATE:
+ case DATETIME:
+ // ClickHouse DateTime/Date format: 'YYYY-MM-DD HH:MM:SS' or 'YYYY-MM-DD'
+ // Convert to Joda DateTime (used by Beam Schema.FieldType.DATETIME)
+ try {
+ String formattedValue = value.contains(" ") ? value : value + " 00:00:00";
+ return new org.joda.time.DateTime(
+ java.time.LocalDateTime.parse(
+ formattedValue,
+ java.time.format.DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss"))
+ .atZone(java.time.ZoneId.of("UTC"))
+ .toInstant()
+ .toEpochMilli());
Review Comment:
Should we add the UTC time zone back after converting the java.time.LocalDateTime to a
Joda DateTime here?
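
A sketch of what I mean, kept to `java.time` so it runs without Joda on the classpath. `UtcMillisSketch` and `toEpochMillisUtc` are hypothetical names. The Joda step appears only as a comment: the single-argument `DateTime(long)` constructor uses the JVM default zone, while `DateTime(long, DateTimeZone)` pins it.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper; mirrors the parsing in the diff but stops at epoch millis.
final class UtcMillisSketch {
  private static final DateTimeFormatter FORMAT =
      DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

  static long toEpochMillisUtc(String value) {
    // A date-only value gets a midnight time part, as in the PR.
    String formatted = value.contains(" ") ? value : value + " 00:00:00";
    // Interpret the wall-clock value as UTC and convert to epoch millis.
    return LocalDateTime.parse(formatted, FORMAT).toInstant(ZoneOffset.UTC).toEpochMilli();
  }

  // With Joda available, the zone could then be pinned explicitly:
  //   new org.joda.time.DateTime(millis, org.joda.time.DateTimeZone.UTC)
  // whereas new org.joda.time.DateTime(millis) carries the JVM default zone.
}
```

The instant is the same either way. The difference is the zone attached to the resulting Joda object, which matters for equality checks and for printing.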
##########
sdks/java/io/clickhouse/src/main/java/org/apache/beam/sdk/io/clickhouse/TableSchema.java:
##########
@@ -335,8 +335,73 @@ public static Object parseDefaultExpression(ColumnType columnType, String value)
return Long.valueOf(value);
case UINT64:
return Long.valueOf(value);
+ case ENUM8:
+ case ENUM16:
+ case FIXEDSTRING:
+ case STRING:
+ return value;
case BOOL:
return Boolean.valueOf(value);
+ case FLOAT32:
+ return Float.valueOf(value);
+ case FLOAT64:
+ return Double.valueOf(value);
+ case DATE:
+ case DATETIME:
+ // ClickHouse DateTime/Date format: 'YYYY-MM-DD HH:MM:SS' or 'YYYY-MM-DD'
+ // Convert to Joda DateTime (used by Beam Schema.FieldType.DATETIME)
+ try {
+ String formattedValue = value.contains(" ") ? value : value + " 00:00:00";
+ return new org.joda.time.DateTime(
+ java.time.LocalDateTime.parse(
+ formattedValue,
+ java.time.format.DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss"))
+ .atZone(java.time.ZoneId.of("UTC"))
+ .toInstant()
+ .toEpochMilli());
+ } catch (java.time.format.DateTimeParseException e) {
+ throw new IllegalArgumentException("Invalid DateTime/Date format: " + value, e);
+ }
+ case ARRAY:
+ // ClickHouse Array format: '[1,2,3]' or '["a","b"]'
Review Comment:
I wonder whether we can use a third-party library to parse the array, or whether the
ClickHouse library already provides a tool for this. Hand-written parsing sounds fragile.
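
To illustrate the fragility: a plain `split(",")` on `['a,b','c']` yields three pieces instead of two. Below is a rough sketch of the minimum a hand-rolled splitter would need to track (quotes, backslash escapes, nesting). `ArrayLiteralSketch` and `splitElements` are hypothetical names, and this is not a complete ClickHouse literal parser.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: splits the top-level elements of an array literal.
// Elements are returned as raw text (quotes retained), not typed values.
final class ArrayLiteralSketch {
  static List<String> splitElements(String literal) {
    String s = literal.trim();
    if (s.length() < 2 || s.charAt(0) != '[' || s.charAt(s.length() - 1) != ']') {
      throw new IllegalArgumentException("Not an array literal: " + literal);
    }
    List<String> out = new ArrayList<>();
    StringBuilder cur = new StringBuilder();
    char quote = 0; // 0 means "not inside a quoted string"
    int depth = 0; // nesting level of inner arrays
    for (int i = 1; i < s.length() - 1; i++) {
      char c = s.charAt(i);
      if (quote != 0) {
        cur.append(c);
        if (c == '\\' && i + 1 < s.length() - 1) {
          cur.append(s.charAt(++i)); // keep the escaped character verbatim
        } else if (c == quote) {
          quote = 0; // closing quote
        }
      } else if (c == '\'' || c == '"') {
        quote = c;
        cur.append(c);
      } else if (c == '[') {
        depth++;
        cur.append(c);
      } else if (c == ']') {
        depth--;
        cur.append(c);
      } else if (c == ',' && depth == 0) {
        out.add(cur.toString().trim()); // top-level separator
        cur.setLength(0);
      } else {
        cur.append(c);
      }
    }
    if (quote != 0 || depth != 0) {
      throw new IllegalArgumentException("Unbalanced array literal: " + literal);
    }
    String last = cur.toString().trim();
    if (!last.isEmpty() || !out.isEmpty()) {
      out.add(last);
    }
    return out;
  }
}
```

Even this much leaves unquoting, type conversion, and NULL elements unhandled, which is why an existing parser would be preferable if one is available.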