[jira] [Created] (HUDI-6451) Randomly obtain a path in HoodieMemoryConfig#getDefaultSpillableMapBasePath
Shilun Fan created HUDI-6451: Summary: Randomly obtain a path in HoodieMemoryConfig#getDefaultSpillableMapBasePath Key: HUDI-6451 URL: https://issues.apache.org/jira/browse/HUDI-6451 Project: Apache Hudi Issue Type: Improvement Components: cli Reporter: Shilun Fan The HoodieMemoryConfig#getDefaultSpillableMapBasePath method retrieves a path from YARN's LOCAL_DIRS environment variable. However, the order of LOCAL_DIRS concatenation in YARN is typically fixed, resulting in the DefaultSpillableMapBasePath being a fixed value. Considering the disk load perspective, we should randomize the selection of a path from the array. -- This message was sent by Atlassian Jira (v8.20.10#820010)
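The random selection proposed above could look roughly like the sketch below. It is not Hudi's actual implementation: the class `SpillablePathPicker`, its method names, and the fallback to `java.io.tmpdir` are hypothetical, and the sketch assumes LOCAL_DIRS holds a comma-separated list of directories.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper for illustration only; not part of Hudi. */
final class SpillablePathPicker {
  private SpillablePathPicker() {}

  /**
   * Picks one directory at random from a comma-separated list
   * (assumed format of LOCAL_DIRS). Returns the fallback when the
   * list is null or contains no usable entry.
   */
  static String pick(String localDirs, String fallback, Random random) {
    if (localDirs == null) {
      return fallback;
    }
    List<String> dirs = new ArrayList<>();
    for (String dir : localDirs.split(",")) {
      String trimmed = dir.trim();
      if (!trimmed.isEmpty()) {
        dirs.add(trimmed);
      }
    }
    if (dirs.isEmpty()) {
      return fallback;
    }
    // A uniform choice spreads spill files across the configured disks.
    return dirs.get(random.nextInt(dirs.size()));
  }

  /** Convenience wrapper that reads the environment variable directly. */
  static String pickFromEnv() {
    return pick(System.getenv("LOCAL_DIRS"),
        System.getProperty("java.io.tmpdir"),
        ThreadLocalRandom.current());
  }
}
```

Passing the `Random` in as a parameter keeps the choice reproducible in tests, while `pickFromEnv()` uses `ThreadLocalRandom` for normal use.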
[jira] [Updated] (HUDI-6451) Randomly obtain a path in HoodieMemoryConfig#getDefaultSpillableMapBasePath
[ https://issues.apache.org/jira/browse/HUDI-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated HUDI-6451: - Status: In Progress (was: Open) > Randomly obtain a path in HoodieMemoryConfig#getDefaultSpillableMapBasePath > --- > > Key: HUDI-6451 > URL: https://issues.apache.org/jira/browse/HUDI-6451 > Project: Apache Hudi > Issue Type: Improvement > Components: cli >Reporter: Shilun Fan >Priority: Major > > The HoodieMemoryConfig#getDefaultSpillableMapBasePath method retrieves a path > from YARN's LOCAL_DIRS environment variable. However, the order of LOCAL_DIRS > concatenation in YARN is typically fixed, resulting in the > DefaultSpillableMapBasePath being a fixed value. Considering the disk load > perspective, we should randomize the selection of a path from the array. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HUDI-6451) Randomly obtain a path in HoodieMemoryConfig#getDefaultSpillableMapBasePath
[ https://issues.apache.org/jira/browse/HUDI-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan reassigned HUDI-6451: Assignee: Shilun Fan > Randomly obtain a path in HoodieMemoryConfig#getDefaultSpillableMapBasePath > --- > > Key: HUDI-6451 > URL: https://issues.apache.org/jira/browse/HUDI-6451 > Project: Apache Hudi > Issue Type: Improvement > Components: cli >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > > The HoodieMemoryConfig#getDefaultSpillableMapBasePath method retrieves a path > from YARN's LOCAL_DIRS environment variable. However, the order of LOCAL_DIRS > concatenation in YARN is typically fixed, resulting in the > DefaultSpillableMapBasePath being a fixed value. Considering the disk load > perspective, we should randomize the selection of a path from the array. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HUDI-6086) Improve HiveSchemaUtil#generateCreateDDL With StringBuilder.
[ https://issues.apache.org/jira/browse/HUDI-6086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated HUDI-6086: - Summary: Improve HiveSchemaUtil#generateCreateDDL With StringBuilder. (was: Improve HiveSchemaUtil#generateCreateDDL With ST) > Improve HiveSchemaUtil#generateCreateDDL With StringBuilder. > > > Key: HUDI-6086 > URL: https://issues.apache.org/jira/browse/HUDI-6086 > Project: Apache Hudi > Issue Type: Improvement > Components: hive >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > > The code of HiveSchemaUtil#generateCreateDDL uses a lot of append, which > makes the code very difficult to read. Usually, in this case, we should use > antlr's ST to generate SQL. This jira will use ST to improve this part of the > code -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HUDI-6086) Improve HiveSchemaUtil#generateCreateDDL With ST
Shilun Fan created HUDI-6086: Summary: Improve HiveSchemaUtil#generateCreateDDL With ST Key: HUDI-6086 URL: https://issues.apache.org/jira/browse/HUDI-6086 Project: Apache Hudi Issue Type: Improvement Components: hive Reporter: Shilun Fan Assignee: Shilun Fan The code of HiveSchemaUtil#generateCreateDDL uses a lot of append, which makes the code very difficult to read. Usually, in this case, we should use antlr's ST to generate SQL. This jira will use ST to improve this part of the code -- This message was sent by Atlassian Jira (v8.20.10#820010)
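To illustrate the readability concern, here is a minimal sketch of DDL generation with a single chained StringBuilder and a small helper for column lists. It is not the code of HiveSchemaUtil#generateCreateDDL: the class `CreateDdlSketch`, its methods, and the simplified DDL shape are assumptions made for the example.

```java
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical sketch; the real Hudi DDL has more clauses than shown here. */
final class CreateDdlSketch {
  private CreateDdlSketch() {}

  /** Renders "`name` type" pairs joined by commas, in map iteration order. */
  static String columns(Map<String, String> cols) {
    return cols.entrySet().stream()
        .map(e -> "`" + e.getKey() + "` " + e.getValue())
        .collect(Collectors.joining(", "));
  }

  /** Builds a simplified CREATE EXTERNAL TABLE statement. */
  static String createTable(String db, String table,
                            Map<String, String> cols,
                            Map<String, String> partitionCols,
                            String location) {
    StringBuilder sb = new StringBuilder();
    sb.append("CREATE EXTERNAL TABLE IF NOT EXISTS ")
        .append('`').append(db).append("`.`").append(table).append('`')
        .append(" (").append(columns(cols)).append(')');
    if (!partitionCols.isEmpty()) {
      sb.append(" PARTITIONED BY (").append(columns(partitionCols)).append(')');
    }
    sb.append(" LOCATION '").append(location).append('\'');
    return sb.toString();
  }
}
```

Moving the repeated column formatting into `columns(...)` removes most of the scattered append calls, which is the kind of cleanup the issue asks for.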
[jira] [Updated] (HUDI-6086) Improve HiveSchemaUtil#generateCreateDDL With ST
[ https://issues.apache.org/jira/browse/HUDI-6086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated HUDI-6086: - Status: In Progress (was: Open) > Improve HiveSchemaUtil#generateCreateDDL With ST > > > Key: HUDI-6086 > URL: https://issues.apache.org/jira/browse/HUDI-6086 > Project: Apache Hudi > Issue Type: Improvement > Components: hive >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > > The code of HiveSchemaUtil#generateCreateDDL uses a lot of append, which > makes the code very difficult to read. Usually, in this case, we should use > antlr's ST to generate SQL. This jira will use ST to improve this part of the > code -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HUDI-6079) Improve the code of HMSDDLExecutor, HiveQueryDDLExecutor
[ https://issues.apache.org/jira/browse/HUDI-6079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated HUDI-6079: - Status: In Progress (was: Open) > Improve the code of HMSDDLExecutor, HiveQueryDDLExecutor > > > Key: HUDI-6079 > URL: https://issues.apache.org/jira/browse/HUDI-6079 > Project: Apache Hudi > Issue Type: Improvement > Components: hive >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > > 1. Modify the log format > 2. Remove redundant code > 3. Increase code readability -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HUDI-6079) Improve the code of HMSDDLExecutor, HiveQueryDDLExecutor
[ https://issues.apache.org/jira/browse/HUDI-6079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan reassigned HUDI-6079: Assignee: Shilun Fan > Improve the code of HMSDDLExecutor, HiveQueryDDLExecutor > > > Key: HUDI-6079 > URL: https://issues.apache.org/jira/browse/HUDI-6079 > Project: Apache Hudi > Issue Type: Improvement > Components: hive >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > > 1. Modify the log format > 2. Remove redundant code > 3. Increase code readability -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HUDI-6079) Improve the code of HMSDDLExecutor, HiveQueryDDLExecutor
Shilun Fan created HUDI-6079: Summary: Improve the code of HMSDDLExecutor, HiveQueryDDLExecutor Key: HUDI-6079 URL: https://issues.apache.org/jira/browse/HUDI-6079 Project: Apache Hudi Issue Type: Improvement Components: hive Reporter: Shilun Fan 1. Modify the log format 2. Remove redundant code 3. Increase code readability -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HUDI-6064) Improve JDBCExecutor#getTableSchema Use ColName
Shilun Fan created HUDI-6064: Summary: Improve JDBCExecutor#getTableSchema Use ColName Key: HUDI-6064 URL: https://issues.apache.org/jira/browse/HUDI-6064 Project: Apache Hudi Issue Type: Improvement Components: hive Reporter: Shilun Fan Assignee: Shilun Fan JDBCExecutor#getTableSchema reads columns by index (ColIndex), which makes the code hard to read. Use the column name (ColName) instead.
[jira] [Updated] (HUDI-6064) Improve JDBCExecutor#getTableSchema Use ColName
[ https://issues.apache.org/jira/browse/HUDI-6064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated HUDI-6064: - Status: In Progress (was: Open) > Improve JDBCExecutor#getTableSchema Use ColName > --- > > Key: HUDI-6064 > URL: https://issues.apache.org/jira/browse/HUDI-6064 > Project: Apache Hudi > Issue Type: Improvement > Components: hive > Reporter: Shilun Fan > Assignee: Shilun Fan > Priority: Major > > JDBCExecutor#getTableSchema reads columns by index (ColIndex), which makes the code hard to read. Use the column name (ColName) instead.
[jira] [Assigned] (HUDI-6063) Modify logging errors In JDBCExecutor
[ https://issues.apache.org/jira/browse/HUDI-6063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan reassigned HUDI-6063: Assignee: Shilun Fan > Modify logging errors In JDBCExecutor > - > > Key: HUDI-6063 > URL: https://issues.apache.org/jira/browse/HUDI-6063 > Project: Apache Hudi > Issue Type: Bug > Components: hive >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > > There is a logging error in JDBCExecutor. During the process of drop > partitions, the log prints add partitions. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HUDI-6063) Modify logging errors In JDBCExecutor
[ https://issues.apache.org/jira/browse/HUDI-6063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated HUDI-6063: - Status: In Progress (was: Open) > Modify logging errors In JDBCExecutor > - > > Key: HUDI-6063 > URL: https://issues.apache.org/jira/browse/HUDI-6063 > Project: Apache Hudi > Issue Type: Bug > Components: hive >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > > There is a logging error in JDBCExecutor. During the process of drop > partitions, the log prints add partitions. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HUDI-6063) Modify logging errors In JDBCExecutor
Shilun Fan created HUDI-6063: Summary: Modify logging errors In JDBCExecutor Key: HUDI-6063 URL: https://issues.apache.org/jira/browse/HUDI-6063 Project: Apache Hudi Issue Type: Bug Components: hive Reporter: Shilun Fan There is a logging error in JDBCExecutor. During the process of drop partitions, the log prints add partitions. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HUDI-5958) Improve ResolvedSchema Instead of TableSchema
Shilun Fan created HUDI-5958: Summary: Improve ResolvedSchema Instead of TableSchema Key: HUDI-5958 URL: https://issues.apache.org/jira/browse/HUDI-5958 Project: Apache Hudi Issue Type: Improvement Components: flink Reporter: Shilun Fan While reading the code, I found a use of TableSchema in the flink-example project. TableSchema has been deprecated; we can use ResolvedSchema instead.
[jira] [Assigned] (HUDI-5958) Improve ResolvedSchema Instead of TableSchema
[ https://issues.apache.org/jira/browse/HUDI-5958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan reassigned HUDI-5958: Assignee: Shilun Fan > Improve ResolvedSchema Instead of TableSchema > - > > Key: HUDI-5958 > URL: https://issues.apache.org/jira/browse/HUDI-5958 > Project: Apache Hudi > Issue Type: Improvement > Components: flink > Reporter: Shilun Fan > Assignee: Shilun Fan > Priority: Major > > While reading the code, I found a use of TableSchema in the flink-example project. TableSchema has been deprecated; we can use ResolvedSchema instead.
[jira] [Updated] (HUDI-5958) Improve ResolvedSchema Instead of TableSchema
[ https://issues.apache.org/jira/browse/HUDI-5958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated HUDI-5958: - Status: In Progress (was: Open) > Improve ResolvedSchema Instead of TableSchema > - > > Key: HUDI-5958 > URL: https://issues.apache.org/jira/browse/HUDI-5958 > Project: Apache Hudi > Issue Type: Improvement > Components: flink > Reporter: Shilun Fan > Assignee: Shilun Fan > Priority: Major > > While reading the code, I found a use of TableSchema in the flink-example project. TableSchema has been deprecated; we can use ResolvedSchema instead.
[jira] [Created] (HUDI-5876) Remove usage of deprecated TableConfig.
Shilun Fan created HUDI-5876: Summary: Remove usage of deprecated TableConfig. Key: HUDI-5876 URL: https://issues.apache.org/jira/browse/HUDI-5876 Project: Apache Hudi Issue Type: Improvement Reporter: Shilun Fan Assignee: Shilun Fan This is a small change. I found that SortOperatorGen initializes TableConfig using a deprecated constructor; use the recommended method instead. From TableConfig: /** Please use {@link TableConfig#getDefault()} instead. */ @Deprecated public TableConfig() {}
[jira] [Updated] (HUDI-5869) Fix Some Typos in Hudi-Common
[ https://issues.apache.org/jira/browse/HUDI-5869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated HUDI-5869: - Status: In Progress (was: Open) > Fix Some Typos in Hudi-Common > - > > Key: HUDI-5869 > URL: https://issues.apache.org/jira/browse/HUDI-5869 > Project: Apache Hudi > Issue Type: Improvement >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > > When reading the code, I found some typo issues and fixed them -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HUDI-5869) Fix Some Typos in Hudi-Common
Shilun Fan created HUDI-5869: Summary: Fix Some Typos in Hudi-Common Key: HUDI-5869 URL: https://issues.apache.org/jira/browse/HUDI-5869 Project: Apache Hudi Issue Type: Improvement Reporter: Shilun Fan Assignee: Shilun Fan When reading the code, I found some typo issues and fixed them -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HUDI-5389) Remove Hudi Cli Duplicates Code
[ https://issues.apache.org/jira/browse/HUDI-5389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan reassigned HUDI-5389: Assignee: Shilun Fan > Remove Hudi Cli Duplicates Code > --- > > Key: HUDI-5389 > URL: https://issues.apache.org/jira/browse/HUDI-5389 > Project: Apache Hudi > Issue Type: Improvement > Reporter: Shilun Fan > Assignee: Shilun Fan > Priority: Major > Labels: pull-request-available
>
> While reading the code, I found some duplicate code that I think can be removed directly.
> ||cli||hudi-spark||
> |org.apache.hudi.cli.DedupeSparkJob|org.apache.spark.sql.hudi.DedupeSparkJob|
> |org.apache.hudi.cli.DeDupeType|org.apache.spark.sql.hudi.DeDupeType|
> |org.apache.hudi.cli.SparkHelpers|org.apache.spark.sql.hudi.SparkHelpers|
> The classes in the left column can be replaced directly by those in the right column, because their contents are exactly the same.
[jira] [Assigned] (HUDI-5398) Fix Typo in hudi-integ-test#README.md
[ https://issues.apache.org/jira/browse/HUDI-5398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan reassigned HUDI-5398: Assignee: Shilun Fan > Fix Typo in hudi-integ-test#README.md > - > > Key: HUDI-5398 > URL: https://issues.apache.org/jira/browse/HUDI-5398 > Project: Apache Hudi > Issue Type: Improvement > Reporter: Shilun Fan > Assignee: Shilun Fan > Priority: Minor > Labels: pull-request-available > Fix For: 0.13.0 > > > While reading the README.md of hudi-integ-test, I found some typos. This issue fixes them.
[jira] [Assigned] (HUDI-5283) Replace deprecated method Schema.parse with Schema.Parser
[ https://issues.apache.org/jira/browse/HUDI-5283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan reassigned HUDI-5283: Assignee: Shilun Fan > Replace deprecated method Schema.parse with Schema.Parser > - > > Key: HUDI-5283 > URL: https://issues.apache.org/jira/browse/HUDI-5283 > Project: Apache Hudi > Issue Type: Improvement > Components: cli > Reporter: Shilun Fan > Assignee: Shilun Fan > Priority: Major > Labels: pull-request-available > Fix For: 0.13.0 > > > While reading the code, I found that HoodieBootstrapSchemaProvider#getBootstrapSchema uses the deprecated method Schema.parse, which can be replaced by new Schema.Parser().parse(). > I also searched the whole module and found that this is the only place that uses the deprecated method.
[jira] [Assigned] (HUDI-5035) Remove deprecated API usage in SparkPreCommitValidator#validate
[ https://issues.apache.org/jira/browse/HUDI-5035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan reassigned HUDI-5035: Assignee: Shilun Fan > Remove deprecated API usage in SparkPreCommitValidator#validate > --- > > Key: HUDI-5035 > URL: https://issues.apache.org/jira/browse/HUDI-5035 > Project: Apache Hudi > Issue Type: Improvement > Components: cli > Reporter: Shilun Fan > Assignee: Shilun Fan > Priority: Major > Labels: pull-request-available > Fix For: 0.13.0 > > Attachments: image-2022-10-15-07-23-43-689.png > > > I found that the code uses a deprecated API. Modify the code to use the recommended API. > > !image-2022-10-15-07-23-43-689.png!
[jira] [Assigned] (HUDI-5124) Fix HoodieInternalRowFileWriter#canWrite error return tag
[ https://issues.apache.org/jira/browse/HUDI-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan reassigned HUDI-5124: Assignee: Shilun Fan > Fix HoodieInternalRowFileWriter#canWrite error return tag > - > > Key: HUDI-5124 > URL: https://issues.apache.org/jira/browse/HUDI-5124 > Project: Apache Hudi > Issue Type: Bug > Components: cli >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > Fix For: 0.12.2 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HUDI-5154) Improve hudi-spark-client Lambada writing
[ https://issues.apache.org/jira/browse/HUDI-5154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan reassigned HUDI-5154: Assignee: Shilun Fan > Improve hudi-spark-client Lambada writing > - > > Key: HUDI-5154 > URL: https://issues.apache.org/jira/browse/HUDI-5154 > Project: Apache Hudi > Issue Type: Improvement > Components: cli > Reporter: Shilun Fan > Assignee: Shilun Fan > Priority: Major > Labels: pull-request-available > Fix For: 0.13.0 > > > While reading the code, I found that the hudi-spark-client module can write its lambda expressions more concisely, which makes the code cleaner.
[jira] [Assigned] (HUDI-5072) Extract transform duplicate code
[ https://issues.apache.org/jira/browse/HUDI-5072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan reassigned HUDI-5072: Assignee: Shilun Fan > Extract transform duplicate code > > > Key: HUDI-5072 > URL: https://issues.apache.org/jira/browse/HUDI-5072 > Project: Apache Hudi > Issue Type: Improvement > Components: cli >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > Fix For: 0.13.0 > > > When reading the code, I found that the transform methods of > MultipleSparkJobExecutionStrategy and SingleSparkJobExecutionStrategy have > redundant code. I think we can extract them to make the code cleaner. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HUDI-5027) Replace hardcoded hbase config keys with HbaseConstants
[ https://issues.apache.org/jira/browse/HUDI-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan reassigned HUDI-5027: Assignee: Shilun Fan > Replace hardcoded hbase config keys with HbaseConstants > > > Key: HUDI-5027 > URL: https://issues.apache.org/jira/browse/HUDI-5027 > Project: Apache Hudi > Issue Type: Improvement > Components: code-quality > Reporter: Shilun Fan > Assignee: Shilun Fan > Priority: Minor > Labels: pull-request-available > Fix For: 0.12.2 > > > While reading the code, I found that SparkHoodieHBaseIndex uses many hardcoded config keys. It would be better to replace them with the constants that HBase provides.
[jira] [Assigned] (HUDI-4997) use jackson-v2 replace jackson-v1 import
[ https://issues.apache.org/jira/browse/HUDI-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan reassigned HUDI-4997: Assignee: Shilun Fan > use jackson-v2 replace jackson-v1 import > > > Key: HUDI-4997 > URL: https://issues.apache.org/jira/browse/HUDI-4997 > Project: Apache Hudi > Issue Type: Improvement > Components: cli > Reporter: Shilun Fan > Assignee: Shilun Fan > Priority: Major > Labels: pull-request-available > Fix For: 0.12.2 > > > HoodieWriteCommitCallbackUtil uses ObjectMapper but imports it from jackson-v1. jackson-v1 has security risks, so replace the import with jackson-v2.
[jira] [Assigned] (HUDI-5002) Remove deprecated API usage in SparkHoodieHBaseIndex#generateStatement
[ https://issues.apache.org/jira/browse/HUDI-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan reassigned HUDI-5002: Assignee: Shilun Fan > Remove deprecated API usage in SparkHoodieHBaseIndex#generateStatement > --- > > Key: HUDI-5002 > URL: https://issues.apache.org/jira/browse/HUDI-5002 > Project: Apache Hudi > Issue Type: Improvement > Components: index > Reporter: Shilun Fan > Assignee: Shilun Fan > Priority: Minor > Labels: pull-request-available > Fix For: 0.12.2 > > Attachments: image-2022-10-10-21-31-59-535.png
>
> While reading the code, I found that SparkHoodieHBaseIndex#generateStatement uses HBase's deprecated method setMaxVersions. I replaced it with the new method.
>
> {code:java}
> private Get generateStatement(String key) throws IOException {
>   return new Get(Bytes.toBytes(getHBaseKey(key))).setMaxVersions(1)
>       .addColumn(SYSTEM_COLUMN_FAMILY, COMMIT_TS_COLUMN)
>       .addColumn(SYSTEM_COLUMN_FAMILY, FILE_NAME_COLUMN)
>       .addColumn(SYSTEM_COLUMN_FAMILY, PARTITION_PATH_COLUMN);
> }
> {code}
[jira] [Resolved] (HUDI-5033) Fix Broken Link In MultipleSparkJobExecutionStrategy
[ https://issues.apache.org/jira/browse/HUDI-5033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan resolved HUDI-5033. -- > Fix Broken Link In MultipleSparkJobExecutionStrategy > > > Key: HUDI-5033 > URL: https://issues.apache.org/jira/browse/HUDI-5033 > Project: Apache Hudi > Issue Type: Improvement > Components: cli >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > Attachments: image-2022-10-15-07-09-08-084.png > > > When I read the code, I found that there is a link that cannot be linked to > the code. I will fix it. I have completed the inspection of the entire module > (hudi-spark-client), only this is the problem > !image-2022-10-15-07-09-08-084.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HUDI-5033) Fix Broken Link In MultipleSparkJobExecutionStrategy
[ https://issues.apache.org/jira/browse/HUDI-5033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan reassigned HUDI-5033: Assignee: Shilun Fan > Fix Broken Link In MultipleSparkJobExecutionStrategy > > > Key: HUDI-5033 > URL: https://issues.apache.org/jira/browse/HUDI-5033 > Project: Apache Hudi > Issue Type: Improvement > Components: cli >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > Attachments: image-2022-10-15-07-09-08-084.png > > > When I read the code, I found that there is a link that cannot be linked to > the code. I will fix it. I have completed the inspection of the entire module > (hudi-spark-client), only this is the problem > !image-2022-10-15-07-09-08-084.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HUDI-5664) Improve SqlQueryPreCommitValidator#queries Parallelism
[ https://issues.apache.org/jira/browse/HUDI-5664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan reassigned HUDI-5664: Assignee: Shilun Fan > Improve SqlQueryPreCommitValidator#queries Parallelism > -- > > Key: HUDI-5664 > URL: https://issues.apache.org/jira/browse/HUDI-5664 > Project: Apache Hudi > Issue Type: Improvement > Components: cli >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > Fix For: 0.13.1, 0.14.0 > > > I found that SqlQueryPreCommitValidator#validateRecordsBeforeAndAfter has a > todo > // TODO run this in a thread pool to improve parallelism > I think we can improve it using List's parallelStream -- This message was sent by Atlassian Jira (v8.20.10#820010)
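The parallelStream idea from the issue can be sketched as follows. This is an illustration only, not the code of SqlQueryPreCommitValidator: the class `ParallelValidationSketch`, the method `failedQueries`, and the `Predicate` standing in for the per-query check are hypothetical.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

/** Hypothetical sketch of validating several queries in parallel. */
final class ParallelValidationSketch {
  private ParallelValidationSketch() {}

  /**
   * Runs the check for every query on the common fork-join pool and
   * returns the queries that failed, in their original order.
   */
  static List<String> failedQueries(List<String> queries, Predicate<String> passes) {
    return queries.parallelStream()
        .filter(query -> !passes.test(query))
        .collect(Collectors.toList());
  }
}
```

Because parallel streams share the JVM-wide common ForkJoinPool, the per-query check must be thread-safe, and long-running checks can delay other parallel stream work in the same process.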
[jira] [Updated] (HUDI-5845) Remove usage of deprecated getTableAvroSchemaWithoutMetadataFields
[ https://issues.apache.org/jira/browse/HUDI-5845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated HUDI-5845: - Status: In Progress (was: Open) > Remove usage of deprecated getTableAvroSchemaWithoutMetadataFields > -- > > Key: HUDI-5845 > URL: https://issues.apache.org/jira/browse/HUDI-5845 > Project: Apache Hudi > Issue Type: Improvement >Reporter: Shilun Fan >Priority: Major > > Remove usage of deprecated getTableAvroSchemaWithoutMetadataFields -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HUDI-5845) Remove usage of deprecated getTableAvroSchemaWithoutMetadataFields
Shilun Fan created HUDI-5845: Summary: Remove usage of deprecated getTableAvroSchemaWithoutMetadataFields Key: HUDI-5845 URL: https://issues.apache.org/jira/browse/HUDI-5845 Project: Apache Hudi Issue Type: Improvement Reporter: Shilun Fan Remove usage of deprecated getTableAvroSchemaWithoutMetadataFields -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HUDI-5664) Improve SqlQueryPreCommitValidator#queries Parallelism
Shilun Fan created HUDI-5664: Summary: Improve SqlQueryPreCommitValidator#queries Parallelism Key: HUDI-5664 URL: https://issues.apache.org/jira/browse/HUDI-5664 Project: Apache Hudi Issue Type: Improvement Components: cli Reporter: Shilun Fan I found that SqlQueryPreCommitValidator#validateRecordsBeforeAndAfter has a todo // TODO run this in a thread pool to improve parallelism I think we can improve it using List's parallelStream -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (HUDI-5398) Fix Typo in hudi-integ-test#README.md
[ https://issues.apache.org/jira/browse/HUDI-5398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17648306#comment-17648306 ] Shilun Fan commented on HUDI-5398: -- Could someone grant me contributor permission so that I can assign Jira issues? Thank you very much! > Fix Typo in hudi-integ-test#README.md > - > > Key: HUDI-5398 > URL: https://issues.apache.org/jira/browse/HUDI-5398 > Project: Apache Hudi > Issue Type: Improvement > Reporter: Shilun Fan > Priority: Minor > Labels: pull-request-available > Fix For: 0.13.0 > > > While reading the README.md of hudi-integ-test, I found some typos. This issue fixes them.
[jira] [Updated] (HUDI-5398) Fix Typo in hudi-integ-test#README.md
[ https://issues.apache.org/jira/browse/HUDI-5398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated HUDI-5398: - Status: In Progress (was: Open) > Fix Typo in hudi-integ-test#README.md > - > > Key: HUDI-5398 > URL: https://issues.apache.org/jira/browse/HUDI-5398 > Project: Apache Hudi > Issue Type: Improvement > Reporter: Shilun Fan > Priority: Minor > Fix For: 0.13.0 > > > While reading the README.md of hudi-integ-test, I found some typos. This issue fixes them.
[jira] [Created] (HUDI-5398) Fix Typo in hudi-integ-test#README.md
Shilun Fan created HUDI-5398: Summary: Fix Typo in hudi-integ-test#README.md Key: HUDI-5398 URL: https://issues.apache.org/jira/browse/HUDI-5398 Project: Apache Hudi Issue Type: Improvement Reporter: Shilun Fan Fix For: 0.13.0 While reading the README.md of hudi-integ-test, I found some typos. This issue fixes them.
[jira] [Created] (HUDI-5389) Remove Hudi Cli Duplicates Code
Shilun Fan created HUDI-5389: Summary: Remove Hudi Cli Duplicates Code Key: HUDI-5389 URL: https://issues.apache.org/jira/browse/HUDI-5389 Project: Apache Hudi Issue Type: Improvement Reporter: Shilun Fan

While reading the code, I found some duplicate code that I think can be removed directly.
||cli||hudi-spark||
|org.apache.hudi.cli.DedupeSparkJob|org.apache.spark.sql.hudi.DedupeSparkJob|
|org.apache.hudi.cli.DeDupeType|org.apache.spark.sql.hudi.DeDupeType|
|org.apache.hudi.cli.SparkHelpers|org.apache.spark.sql.hudi.SparkHelpers|
The classes in the left column can be replaced directly by those in the right column, because their contents are exactly the same.
[jira] [Updated] (HUDI-5389) Remove Hudi Cli Duplicates Code
[ https://issues.apache.org/jira/browse/HUDI-5389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated HUDI-5389: - Status: In Progress (was: Open) > Remove Hudi Cli Duplicates Code > --- > > Key: HUDI-5389 > URL: https://issues.apache.org/jira/browse/HUDI-5389 > Project: Apache Hudi > Issue Type: Improvement > Reporter: Shilun Fan > Priority: Major
>
> While reading the code, I found some duplicate code that I think can be removed directly.
> ||cli||hudi-spark||
> |org.apache.hudi.cli.DedupeSparkJob|org.apache.spark.sql.hudi.DedupeSparkJob|
> |org.apache.hudi.cli.DeDupeType|org.apache.spark.sql.hudi.DeDupeType|
> |org.apache.hudi.cli.SparkHelpers|org.apache.spark.sql.hudi.SparkHelpers|
> The classes in the left column can be replaced directly by those in the right column, because their contents are exactly the same.
[jira] [Created] (HUDI-5283) Replace deprecated method Schema.parse With Schema.Parser
Shilun Fan created HUDI-5283: Summary: Replace deprecated method Schema.parse With Schema.Parser Key: HUDI-5283 URL: https://issues.apache.org/jira/browse/HUDI-5283 Project: Apache Hudi Issue Type: Improvement Components: cli Reporter: Shilun Fan While reading the code, I found that HoodieBootstrapSchemaProvider#getBootstrapSchema uses the deprecated method Schema.parse, which can be replaced by new Schema.Parser().parse(). I also searched the whole module and found that this is the only place that uses the deprecated method.
[jira] [Updated] (HUDI-5283) Replace deprecated method Schema.parse With Schema.Parser
[ https://issues.apache.org/jira/browse/HUDI-5283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated HUDI-5283: - Status: In Progress (was: Open) > Replace deprecated method Schema.parse With Schema.Parser > - > > Key: HUDI-5283 > URL: https://issues.apache.org/jira/browse/HUDI-5283 > Project: Apache Hudi > Issue Type: Improvement > Components: cli > Reporter: Shilun Fan > Priority: Major > > While reading the code, I found that HoodieBootstrapSchemaProvider#getBootstrapSchema uses the deprecated method Schema.parse, which can be replaced by new Schema.Parser().parse(). > I also searched the whole module and found that this is the only place that uses the deprecated method.