Re: [I] [SUPPORT] Hudi SQL Based Transformer Fails when trying to provide SQL File as input [hudi]

2024-05-21 Thread via GitHub
ad1happy2go commented on issue #11258: URL: https://github.com/apache/hudi/issues/11258#issuecomment-2123907590 @soumilshah1995 Your transformer class should be --transformer-class org.apache.hudi.utilities.transform.SqlFileBasedTransformer -- This is an automated message from the Apache
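A minimal sketch of the advice above, assuming the job is launched through the Hudi streamer utility with `--hoodie-conf` overrides. The helper name `transformer_args` and the SQL file path are hypothetical. The property key `hoodie.streamer.transformer.sql.file` is an assumption based on recent Hudi releases (older releases may expect the `hoodie.deltastreamer.` prefix instead), so check it against the version in use.

```python
# Hypothetical helper (not part of Hudi): builds only the transformer-related
# command-line arguments for a streamer launch.
TRANSFORMER_CLASS = "org.apache.hudi.utilities.transform.SqlFileBasedTransformer"

# Assumed property key; verify against the Hudi release you are running.
SQL_FILE_KEY = "hoodie.streamer.transformer.sql.file"


def transformer_args(sql_file):
    """Return the arguments that select the file-based SQL transformer."""
    return [
        "--transformer-class", TRANSFORMER_CLASS,
        "--hoodie-conf", f"{SQL_FILE_KEY}={sql_file}",
    ]


# Placeholder path, for illustration only.
args = transformer_args("/path/to/transform.sql")
print(" ".join(args))
```

These arguments would be appended to the rest of the launch command, which is not shown in the thread.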

[jira] [Updated] (HUDI-7781) Filter wrong partitions when using hoodie.datasource.write.partitions.to.delete

2024-05-21 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-7781: - Reviewers: Danny Chen > Filter wrong partitions when using > hoodie.datasource.write.partitions.to.delete

[jira] [Updated] (HUDI-7781) Filter wrong partitions when using hoodie.datasource.write.partitions.to.delete

2024-05-21 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-7781: - Sprint: Sprint 2023-04-26 > Filter wrong partitions when using > hoodie.datasource.write.partitions.to.de

Re: [I] Intermittent stall of S3 PUT request for about 17 minutes [hudi]

2024-05-21 Thread via GitHub
hamadjaved commented on issue #11203: URL: https://github.com/apache/hudi/issues/11203#issuecomment-2123799251 I ran into something very similar - it typically happened when the size of the file being written to a partition approached ~ 100mb or so. I'll be curious if there are network sett

Re: [PR] [HUDI-7781] Filter wrong partitions when using hoodie.datasource.write.partitions.to.delete [hudi]

2024-05-21 Thread via GitHub
Zouxxyy closed pull request #11260: [HUDI-7781] Filter wrong partitions when using hoodie.datasource.write.partitions.to.delete URL: https://github.com/apache/hudi/pull/11260

Re: [PR] [HUDI-7781] Filter wrong partitions when using hoodie.datasource.write.partitions.to.delete [hudi]

2024-05-21 Thread via GitHub
Zouxxyy commented on PR #11260: URL: https://github.com/apache/hudi/pull/11260#issuecomment-2123793925 @danny0405 When `hive_style_partitioning` is false, such as 2016/03/15, it is difficult to automatically add * to identify them. Besides I think that for this type of configuration that

Re: [PR] [HUDI-7772] HoodieTimelineArchiver##getCommitInstantsToArchive need skip limiting archiving of instants [hudi]

2024-05-21 Thread via GitHub
xuzifu666 closed pull request #11245: [HUDI-7772] HoodieTimelineArchiver##getCommitInstantsToArchive need skip limiting archiving of instants URL: https://github.com/apache/hudi/pull/11245

Re: [PR] [HUDI-7772] HoodieTimelineArchiver##getCommitInstantsToArchive need skip limiting archiving of instants [hudi]

2024-05-21 Thread via GitHub
xuzifu666 commented on PR #11245: URL: https://github.com/apache/hudi/pull/11245#issuecomment-2123724285 close it first

Re: [PR] [HUDI-7776] Simplify HoodieStorage instance fetching [hudi]

2024-05-21 Thread via GitHub
codope commented on code in PR #11259: URL: https://github.com/apache/hudi/pull/11259#discussion_r1609145729 ## hudi-common/src/main/java/org/apache/hudi/storage/HoodieStorageUtils.java: ## @@ -33,6 +35,13 @@ public static HoodieStorage getStorage(String basePath, StorageConfig

Re: [PR] [HUDI-7776] Simplify HoodieStorage instance fetching [hudi]

2024-05-21 Thread via GitHub
codope commented on code in PR #11259: URL: https://github.com/apache/hudi/pull/11259#discussion_r1609145623 ## hudi-common/src/main/java/org/apache/hudi/common/config/HoodieStorageConfig.java: ## @@ -243,7 +243,12 @@ public class HoodieStorageConfig extends HoodieConfig {

Re: [PR] [HUDI-7776] Simplify HoodieStorage instance fetching [hudi]

2024-05-21 Thread via GitHub
codope commented on code in PR #11259: URL: https://github.com/apache/hudi/pull/11259#discussion_r1609145152 ## hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java: ## @@ -384,13 +384,12 @@ public Boolean isMetadataTable() { public HoodieStorag

Re: [PR] [HUDI-7778] Fixing global index for duplicate updates [hudi]

2024-05-21 Thread via GitHub
danny0405 commented on code in PR #11256: URL: https://github.com/apache/hudi/pull/11256#discussion_r1609131887 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/HoodieIndexUtils.java: ## @@ -288,12 +288,9 @@ public static HoodieData> mergeForPartitionUpdat

Re: [PR] [HUDI-7622] Optimize HoodieTableSource's sanity check [hudi]

2024-05-21 Thread via GitHub
danny0405 commented on code in PR #11031: URL: https://github.com/apache/hudi/pull/11031#discussion_r1609127301 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableFactory.java: ## @@ -102,9 +102,9 @@ public DynamicTableSink createDynamicTableSink(

[jira] [Updated] (HUDI-7622) Add sanity check for HoodieTableSource

2024-05-21 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-7622: - Status: In Progress (was: Open) > Add sanity check for HoodieTableSource > --

[jira] [Updated] (HUDI-7622) Add sanity check for HoodieTableSource

2024-05-21 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-7622: - Reviewers: Danny Chen > Add sanity check for HoodieTableSource > -- >

[jira] [Updated] (HUDI-7622) Add sanity check for HoodieTableSource

2024-05-21 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-7622: - Status: Patch Available (was: In Progress) > Add sanity check for HoodieTableSource > ---

[jira] [Updated] (HUDI-7622) Add sanity check for HoodieTableSource

2024-05-21 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-7622: - Sprint: Sprint 2023-04-26 > Add sanity check for HoodieTableSource > -

Re: [PR] [HUDI-7774] Add Avro Logical type support for Merciful Java convertor [hudi]

2024-05-21 Thread via GitHub
Davis-Zhang-Onehouse commented on code in PR #11265: URL: https://github.com/apache/hudi/pull/11265#discussion_r1608950828 ## hudi-common/src/main/java/org/apache/hudi/avro/AvroLogicalTypeEnum.java: ## @@ -0,0 +1,42 @@ +/* + * Licensed to the Apache Software Foundation (ASF) und

Re: [PR] [HUDI-7774] Add Avro Logical type support for Merciful Java convertor [hudi]

2024-05-21 Thread via GitHub
yihua commented on code in PR #11265: URL: https://github.com/apache/hudi/pull/11265#discussion_r1608917941 ## hudi-common/src/main/java/org/apache/hudi/avro/AvroLogicalTypeEnum.java: ## @@ -0,0 +1,42 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or m

Re: [PR] [HUDI-7774] Add Avro Logical type support for Merciful Java convertor [hudi]

2024-05-21 Thread via GitHub
Davis-Zhang-Onehouse commented on code in PR #11265: URL: https://github.com/apache/hudi/pull/11265#discussion_r1608799784 ## hudi-common/src/main/java/org/apache/hudi/avro/MercifulJsonConverter.java: ## @@ -187,196 +178,774 @@ private static Object convertJsonToAvroField(Object

Re: [PR] [HUDI-4205] Fix NullPointerException in HFile reader creation [hudi]

2024-05-21 Thread via GitHub
cmanning-arcadia commented on PR #5841: URL: https://github.com/apache/hudi/pull/5841#issuecomment-2123341314 This was merged and then overwritten shortly thereafter. We are currently experiencing this issue with trying to load the metadata as a Hudi table directly in order to efficiently l

Re: [PR] [HUDI-7774] Add Avro Logical type support for Merciful Java convertor [hudi]

2024-05-21 Thread via GitHub
hudi-bot commented on PR #11265: URL: https://github.com/apache/hudi/pull/11265#issuecomment-2123301071 ## CI report: * c284b2fce8c40e3c5b7448a6cf2f47f3b5ae50c4 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24

Re: [PR] [HUDI-7774] Add Avro Logical type support for Merciful Java convertor [hudi]

2024-05-21 Thread via GitHub
Davis-Zhang-Onehouse commented on code in PR #11265: URL: https://github.com/apache/hudi/pull/11265#discussion_r1608801275 ## hudi-common/src/test/java/org/apache/hudi/avro/TestMercifulJsonConverter.java: ## @@ -55,6 +70,649 @@ public void basicConversion() throws IOException {

Re: [PR] [HUDI-7774] Add Avro Logical type support for Merciful Java convertor [hudi]

2024-05-21 Thread via GitHub
Davis-Zhang-Onehouse commented on code in PR #11265: URL: https://github.com/apache/hudi/pull/11265#discussion_r1608797800 ## hudi-common/src/main/java/org/apache/hudi/avro/AvroLogicalTypeEnum.java: ## @@ -0,0 +1,42 @@ +/* + * Licensed to the Apache Software Foundation (ASF) und

Re: [PR] [HUDI-7774] Add Avro Logical type support for Merciful Java convertor [hudi]

2024-05-21 Thread via GitHub
yihua commented on code in PR #11265: URL: https://github.com/apache/hudi/pull/11265#discussion_r1608752328 ## hudi-common/src/main/java/org/apache/hudi/avro/MercifulJsonConverter.java: ## @@ -187,196 +178,774 @@ private static Object convertJsonToAvroField(Object value, String

Re: [PR] [HUDI-7774] Add Avro Logical type support for Merciful Java convertor [hudi]

2024-05-21 Thread via GitHub
yihua commented on code in PR #11265: URL: https://github.com/apache/hudi/pull/11265#discussion_r1608735139 ## hudi-common/src/main/java/org/apache/hudi/avro/AvroLogicalTypeEnum.java: ## @@ -0,0 +1,42 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or m

Re: [PR] [HUDI-7776] Simplify HoodieStorage instance fetching [hudi]

2024-05-21 Thread via GitHub
yihua commented on code in PR #11259: URL: https://github.com/apache/hudi/pull/11259#discussion_r1608732601 ## hudi-common/src/main/java/org/apache/hudi/metadata/AbstractHoodieTableMetadata.java: ## @@ -35,16 +36,18 @@ public abstract class AbstractHoodieTableMetadata implement

Re: [PR] [HUDI-7776] Simplify HoodieStorage instance fetching [hudi]

2024-05-21 Thread via GitHub
yihua commented on code in PR #11259: URL: https://github.com/apache/hudi/pull/11259#discussion_r1608730905 ## hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java: ## @@ -384,13 +384,12 @@ public Boolean isMetadataTable() { public HoodieStorage

Re: [PR] [HUDI-7776] Simplify HoodieStorage instance fetching [hudi]

2024-05-21 Thread via GitHub
yihua commented on code in PR #11259: URL: https://github.com/apache/hudi/pull/11259#discussion_r1608728190 ## hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieFileReaderFactory.java: ## @@ -40,9 +39,9 @@ */ public class HoodieFileReaderFactory { - protected fina

Re: [PR] [HUDI-7776] Simplify HoodieStorage instance fetching [hudi]

2024-05-21 Thread via GitHub
yihua commented on code in PR #11259: URL: https://github.com/apache/hudi/pull/11259#discussion_r1608726591 ## hudi-common/src/main/java/org/apache/hudi/storage/HoodieStorageUtils.java: ## @@ -33,6 +35,13 @@ public static HoodieStorage getStorage(String basePath, StorageConfigu

Re: [PR] [HUDI-7776] Simplify HoodieStorage instance fetching [hudi]

2024-05-21 Thread via GitHub
yihua commented on code in PR #11259: URL: https://github.com/apache/hudi/pull/11259#discussion_r1608718789 ## hudi-common/src/main/java/org/apache/hudi/common/config/HoodieStorageConfig.java: ## @@ -243,7 +243,12 @@ public class HoodieStorageConfig extends HoodieConfig {

Re: [PR] [HUDI-7776] Simplify HoodieStorage instance fetching [hudi]

2024-05-21 Thread via GitHub
yihua commented on code in PR #11259: URL: https://github.com/apache/hudi/pull/11259#discussion_r1608715066 ## hudi-common/src/main/java/org/apache/hudi/common/config/HoodieStorageConfig.java: ## @@ -243,7 +243,12 @@ public class HoodieStorageConfig extends HoodieConfig {

Re: [PR] [HUDI-7713] Enforce ordering of fields during schema reconciliation [hudi]

2024-05-21 Thread via GitHub
the-other-tim-brown commented on code in PR #11154: URL: https://github.com/apache/hudi/pull/11154#discussion_r1608649456 ## hudi-common/src/main/java/org/apache/hudi/internal/schema/InternalSchemaBuilder.java: ## @@ -67,6 +68,10 @@ public Map buildNameToId(Type type) { ret

Re: [PR] [HUDI-7713] Enforce ordering of fields during schema reconciliation [hudi]

2024-05-21 Thread via GitHub
the-other-tim-brown commented on code in PR #11154: URL: https://github.com/apache/hudi/pull/11154#discussion_r1608646457 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieSchemaUtils.scala: ## @@ -93,14 +93,14 @@ object HoodieSchemaUtils {

Re: [PR] [HUDI-7713] Enforce ordering of fields during schema reconciliation [hudi]

2024-05-21 Thread via GitHub
the-other-tim-brown commented on code in PR #11154: URL: https://github.com/apache/hudi/pull/11154#discussion_r1608646068 ## hudi-common/src/main/java/org/apache/hudi/internal/schema/utils/AvroSchemaEvolutionUtils.java: ## @@ -145,8 +146,9 @@ public static Schema reconcileSchema

Re: [PR] [HUDI-7713] Enforce ordering of fields during schema reconciliation [hudi]

2024-05-21 Thread via GitHub
codope commented on code in PR #11154: URL: https://github.com/apache/hudi/pull/11154#discussion_r1608630384 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieSchemaUtils.scala: ## @@ -93,14 +93,14 @@ object HoodieSchemaUtils { // in the ta

Re: [I] [SUPPORT] HiveSyncTool failure - Unable to create a `_ro` table when writing data [hudi]

2024-05-21 Thread via GitHub
shubhamn21 commented on issue #11254: URL: https://github.com/apache/hudi/issues/11254#issuecomment-2122959293 Here it is: ``` options = { "hoodie.datasource.write.keygenerator.class": "org.apache.hudi.keygen.ComplexKeyGenerator", "hoodie.datasource.write.operation": "ins

Re: [I] [SUPPORT] HiveSyncTool failure - Unable to create a `_ro` table when writing data [hudi]

2024-05-21 Thread via GitHub
shubhamn21 commented on issue #11254: URL: https://github.com/apache/hudi/issues/11254#issuecomment-2122949340 Here it is: ``` options = { "hoodie.datasource.write.keygenerator.class": "org.apache.hudi.keygen.ComplexKeyGenerator", "hoodie.datasource.write.operation": "ins

Re: [PR] [HUDI-7007] Add bloom_filters index support on read side [hudi]

2024-05-21 Thread via GitHub
KnightChess commented on code in PR #11043: URL: https://github.com/apache/hudi/pull/11043#discussion_r1608419895 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/BloomFiltersIndexSupport.scala: ## @@ -0,0 +1,87 @@ +/* + * Licensed to the Apache Software

Re: [PR] [HUDI-7580] Fix order of fields when records inserted out of order [hudi]

2024-05-21 Thread via GitHub
KnightChess commented on code in PR #11019: URL: https://github.com/apache/hudi/pull/11019#discussion_r1608532112 ## hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/dml/TestInsertTable.scala: ## @@ -294,6 +294,37 @@ class TestInsertTable extends HoodieS

Re: [PR] [HUDI-7007] Add bloom_filters index support on read side [hudi]

2024-05-21 Thread via GitHub
KnightChess commented on code in PR #11043: URL: https://github.com/apache/hudi/pull/11043#discussion_r1608369256 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/BloomFiltersIndexSupport.scala: ## @@ -0,0 +1,87 @@ +/* + * Licensed to the Apache Software

Re: [PR] [HUDI-7146] Implement secondary index write path [hudi]

2024-05-21 Thread via GitHub
hudi-bot commented on PR #11146: URL: https://github.com/apache/hudi/pull/11146#issuecomment-2122424376 ## CI report: * 4c02510ba3b23cb7d7feffbb2c6850c038d6103a Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=24

[jira] [Commented] (HUDI-7779) Guarding archival to not archive unintended commits

2024-05-21 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17848149#comment-17848149 ] Sagar Sumit commented on HUDI-7779: --- {quote}Corner case to consider. Lets add onto above

Re: [I] [SUPPORT] HiveSyncTool failure - Unable to create a `_ro` table when writing data [hudi]

2024-05-21 Thread via GitHub
ad1happy2go commented on issue #11254: URL: https://github.com/apache/hudi/issues/11254#issuecomment-2122168532 @shubhamn21 Please Provide Writer configurations.

Re: [PR] [HUDI-7778] Fixing global index for duplicate updates [hudi]

2024-05-21 Thread via GitHub
KnightChess commented on code in PR #11256: URL: https://github.com/apache/hudi/pull/11256#discussion_r1606265774 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/HoodieIndexUtils.java: ## @@ -288,12 +288,9 @@ public static HoodieData> mergeForPartitionUpd

Re: [PR] [HUDI-7778] Fixing global index for duplicate updates [hudi]

2024-05-21 Thread via GitHub
codope commented on code in PR #11256: URL: https://github.com/apache/hudi/pull/11256#discussion_r1607940757 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/HoodieIndexUtils.java: ## @@ -288,12 +288,9 @@ public static HoodieData> mergeForPartitionUpdatesI

Re: [PR] [HUDI-7622] Optimize HoodieTableSource's sanity check [hudi]

2024-05-21 Thread via GitHub
hudi-bot commented on PR #11031: URL: https://github.com/apache/hudi/pull/11031#issuecomment-2122048600 ## CI report: * e159472757b2475611e99dc4afd8fe2def6967f4 UNKNOWN * c4a9e9a0debe32518a84877c79c4831740b95caa Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4

Re: [PR] [HUDI-7622] Optimize HoodieTableSource's sanity check [hudi]

2024-05-21 Thread via GitHub
hudi-bot commented on PR #11031: URL: https://github.com/apache/hudi/pull/11031#issuecomment-2122028975 ## CI report: * e159472757b2475611e99dc4afd8fe2def6967f4 UNKNOWN * c4a9e9a0debe32518a84877c79c4831740b95caa Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4

Re: [PR] [HUDI-7395] Fix computation for metrics in HoodieMetadataMetrics [hudi]

2024-05-21 Thread via GitHub
hudi-bot commented on PR #10641: URL: https://github.com/apache/hudi/pull/10641#issuecomment-2121936991 ## CI report: * b086907bcddcb493e9d7f9a711b74e0fc7d1ea48 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23