(hudi) branch asf-site updated: Update docker_demo.md (#10522)
This is an automated email from the ASF dual-hosted git repository.

vinoyang pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git

The following commit(s) were added to refs/heads/asf-site by this push:
     new 053c05794e2  Update docker_demo.md (#10522)
053c05794e2 is described below

commit 053c05794e29ad6af179360507f3e27e6a6b81ac
Author: Dan Roscigno
AuthorDate: Sat Jan 20 02:06:40 2024 -0500

    Update docker_demo.md (#10522)

    * Update docker_demo.md

    Based on my experience trying the demo and issue #10262 I am suggesting
    that instead of using the master branch for the demo the tab 0.14.1 be
    used. Additionally, the `mvn` command should specify specific versions:
    `-Dscala-2.11 -Dspark2.4`
---
 website/versioned_docs/version-0.14.1/docker_demo.md | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/website/versioned_docs/version-0.14.1/docker_demo.md b/website/versioned_docs/version-0.14.1/docker_demo.md
index 0564bce20a7..ae8b232dad1 100644
--- a/website/versioned_docs/version-0.14.1/docker_demo.md
+++ b/website/versioned_docs/version-0.14.1/docker_demo.md
@@ -49,9 +49,10 @@ The first step is to build Hudi. **Note** This step builds Hudi on default suppo
 NOTE: Make sure you've cloned the [Hudi repository](https://github.com/apache/hudi) first.

-```java
+```bash
 cd
-mvn clean package -Pintegration-tests -DskipTests
+git checkout release-0.14.1
+mvn clean package -Pintegration-tests -DskipTests -Dscala-2.11 -Dspark2.4
 ```

 ### Bringing up Demo Cluster
@@ -134,9 +135,10 @@ $ docker ps

 :::note Please note the following for Mac AArch64 users

-    The demo must be built and run using the master branch. We currently plan to include support starting with the
-0.13.0 release.
+    The demo must be built and run using the release-0.14.1 tag.

     Presto and Trino are not currently supported in the demo.
+
+    You will see warnings that there is no history server for your architecture. You can ignore these.
+    You wil see the warning "Unable to load native-hadoop library for your platform... using builtin-java classes where applicable." You can ignore this.
 :::
@@ -339,7 +341,7 @@ After executing the above command, you will notice

 1. A hive table named `stock_ticks_cow` created which supports Snapshot and Incremental queries on Copy On Write table.
 2. Two new tables `stock_ticks_mor_rt` and `stock_ticks_mor_ro` created for the Merge On Read table. The former
-supports Snapshot and Incremental queries (providing near-real time data) while the later supports ReadOptimized queries.
+supports Snapshot and Incremental queries (providing near-real time data) while the later supports ReadOptimized queries. `http://namenode:50070/explorer.html#/user/hive/warehouse/stock_ticks_mor`

 ### Step 4 (a): Run Hive Queries
[hudi] branch master updated (61fc3c03a6 -> 59f652a19c)
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from 61fc3c03a6 [HUDI-4447] fix SQL metasync when perform delete table operation (#6180) add 59f652a19c [HUDI-4424] Add new compactoin trigger stratgy: NUM_COMMITS_AFTER_REQ… (#6144) No new revisions were added by this update. Summary of changes: .../action/compact/CompactionTriggerStrategy.java | 2 + .../compact/ScheduleCompactionActionExecutor.java | 23 +++ .../table/action/compact/TestInlineCompaction.java | 74 ++ .../apache/hudi/common/util/CompactionUtils.java | 30 + 4 files changed, 129 insertions(+)
[hudi] branch master updated (1ea1e659c2 -> e5faf2cc84)
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from 1ea1e659c2 [HUDI-4474] Infer metasync configs (#6217) add e5faf2cc84 [HUDI-4210] Create custom hbase index to solve data skew issue on hbase regions (#5797) No new revisions were added by this update. Summary of changes: .../apache/hudi/config/HoodieHBaseIndexConfig.java | 4 .../org/apache/hudi/config/HoodieWriteConfig.java | 4 .../hbase/RebalancedSparkHoodieHBaseIndex.java}| 26 +++--- .../hudi/index/hbase/SparkHoodieHBaseIndex.java| 10 ++--- 4 files changed, 23 insertions(+), 21 deletions(-) copy hudi-client/{hudi-client-common/src/main/java/org/apache/hudi/table/storage/HoodieDefaultLayout.java => hudi-spark-client/src/main/java/org/apache/hudi/index/hbase/RebalancedSparkHoodieHBaseIndex.java} (60%)
[hudi] branch master updated: [HUDI-4065] Add FileBasedLockProvider (#6071)
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 382d19e85b [HUDI-4065] Add FileBasedLockProvider (#6071) 382d19e85b is described below commit 382d19e85b06d0fc7f4ad37bb4e4eae3f5e76b78 Author: 冯健 AuthorDate: Tue Jul 19 07:52:47 2022 +0800 [HUDI-4065] Add FileBasedLockProvider (#6071) --- .../lock/FileSystemBasedLockProvider.java | 152 + .../org/apache/hudi/config/HoodieLockConfig.java | 9 +- .../hudi/client/TestFileBasedLockProvider.java | 135 ++ .../hudi/client/TestHoodieClientMultiWriter.java | 87 +++- .../hudi/common/config/LockConfiguration.java | 2 + 5 files changed, 349 insertions(+), 36 deletions(-) diff --git a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/transaction/lock/FileSystemBasedLockProvider.java b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/transaction/lock/FileSystemBasedLockProvider.java new file mode 100644 index 00..96a42e8409 --- /dev/null +++ b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/transaction/lock/FileSystemBasedLockProvider.java @@ -0,0 +1,152 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. 
See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.hudi.client.transaction.lock; + +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.fs.FileSystem; +import org.apache.hadoop.fs.Path; +import org.apache.hudi.common.config.LockConfiguration; +import org.apache.hudi.common.fs.FSUtils; +import org.apache.hudi.common.lock.LockProvider; +import org.apache.hudi.common.lock.LockState; +import org.apache.hudi.common.table.HoodieTableMetaClient; +import org.apache.hudi.common.util.StringUtils; +import org.apache.hudi.common.util.ValidationUtils; +import org.apache.hudi.config.HoodieWriteConfig; +import org.apache.hudi.exception.HoodieIOException; +import org.apache.hudi.exception.HoodieLockException; +import org.apache.log4j.LogManager; +import org.apache.log4j.Logger; + +import java.io.IOException; +import java.io.Serializable; +import java.util.concurrent.TimeUnit; + +import static org.apache.hudi.common.config.LockConfiguration.FILESYSTEM_LOCK_EXPIRE_PROP_KEY; +import static org.apache.hudi.common.config.LockConfiguration.FILESYSTEM_LOCK_PATH_PROP_KEY; + +/** + * A FileSystem based lock. This {@link LockProvider} implementation allows to lock table operations + * using DFS. Users might need to manually clean the Locker's path if writeClient crash and never run again. 
+ * NOTE: This only works for DFS with atomic create/delete operation + */ +public class FileSystemBasedLockProvider implements LockProvider, Serializable { + + private static final Logger LOG = LogManager.getLogger(FileSystemBasedLockProvider.class); + + private static final String LOCK_FILE_NAME = "lock"; + + private final int lockTimeoutMinutes; + private transient FileSystem fs; + private transient Path lockFile; + protected LockConfiguration lockConfiguration; + + public FileSystemBasedLockProvider(final LockConfiguration lockConfiguration, final Configuration configuration) { +checkRequiredProps(lockConfiguration); +this.lockConfiguration = lockConfiguration; +String lockDirectory = lockConfiguration.getConfig().getString(FILESYSTEM_LOCK_PATH_PROP_KEY, null); +if (StringUtils.isNullOrEmpty(lockDirectory)) { + lockDirectory = lockConfiguration.getConfig().getString(HoodieWriteConfig.BASE_PATH.key()) ++ Path.SEPARATOR + HoodieTableMetaClient.METAFOLDER_NAME; +} +this.lockTimeoutMinutes = lockConfiguration.getConfig().getInteger(FILESYSTEM_LOCK_EXPIRE_PROP_KEY); +this.lockFile = new Path(lockDirectory + Path.SEPARATOR + LOCK_FILE_NAME); +this.fs = FSUtils.getFs(this.lockFile.toString(), configuration); + } + + @Override + public void close() { +synchronized (LOCK_FILE_NAME) { + try { +fs.delete(this.lockFile, true); + } catch (IOException e) { +throw new HoodieLock
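The note in the class above is the key constraint: the provider only works on file systems with atomic create/delete. A minimal standalone sketch of that same pattern — illustrative names, using `java.nio.file` on a local directory rather than Hudi's Hadoop `FileSystem` API — shows how an atomic create of a well-known `lock` file grants the lock to exactly one caller:

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

/**
 * Illustrative sketch (not Hudi's API) of the file-based lock pattern:
 * atomically creating a well-known "lock" file acquires the lock;
 * deleting it releases the lock.
 */
public class FileLockSketch {
  private final Path lockFile;

  public FileLockSketch(Path lockDirectory) {
    this.lockFile = lockDirectory.resolve("lock");
  }

  /** Try to acquire: the atomic create succeeds for exactly one caller. */
  public boolean tryLock() {
    try {
      Files.createFile(lockFile); // atomic: throws if the file already exists
      return true;
    } catch (FileAlreadyExistsException held) {
      return false; // another writer holds the lock
    } catch (IOException e) {
      throw new RuntimeException("Unable to check lock state", e);
    }
  }

  /** Release by deleting the lock file. */
  public void unlock() {
    try {
      Files.deleteIfExists(lockFile);
    } catch (IOException e) {
      throw new RuntimeException("Unable to release lock", e);
    }
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("lock-demo");
    FileLockSketch lock = new FileLockSketch(dir);
    System.out.println(lock.tryLock()); // true: first acquire succeeds
    System.out.println(lock.tryLock()); // false: lock is held
    lock.unlock();
    System.out.println(lock.tryLock()); // true: re-acquire after release
  }
}
```

This also makes the caveat in the javadoc concrete: if the process crashes while holding the lock, the `lock` file is left behind and must be cleaned up manually before another writer can acquire it.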
[jira] [Updated] (HUDI-4409) Improve LockManager wait logic when catch exception
     [ https://issues.apache.org/jira/browse/HUDI-4409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vinoyang updated HUDI-4409:
---------------------------
    Summary: Improve LockManager wait logic when catch exception  (was: LockManager improve wait time logic)

> Improve LockManager wait logic when catch exception
> ---------------------------------------------------
>
>                 Key: HUDI-4409
>                 URL: https://issues.apache.org/jira/browse/HUDI-4409
>             Project: Apache Hudi
>          Issue Type: Improvement
>            Reporter: liujinhui
>            Priority: Major
>              Labels: pull-request-available
>
> {code:java}
> //public void lock() {
>   if (writeConfig.getWriteConcurrencyMode().supportsOptimisticConcurrencyControl()) {
>     LockProvider lockProvider = getLockProvider();
>     int retryCount = 0;
>     boolean acquired = false;
>     while (retryCount <= maxRetries) {
>       try {
>         acquired = lockProvider.tryLock(writeConfig.getLockAcquireWaitTimeoutInMs(), TimeUnit.MILLISECONDS);
>         if (acquired) {
>           break;
>         }
>         LOG.info("Retrying to acquire lock...");
>         Thread.sleep(maxWaitTimeInMs);
>       } catch (HoodieLockException | InterruptedException e) {
>         if (retryCount >= maxRetries) {
>           throw new HoodieLockException("Unable to acquire lock, lock object ", e);
>         }
>       } finally {
>         retryCount++;
>       }
>     }
>     if (!acquired) {
>       throw new HoodieLockException("Unable to acquire lock, lock object " + lockProvider.getLock());
>     }
>   }
> }
> {code}
> We should put sleep in catch

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Assigned] (HUDI-4409) Improve LockManager wait logic when catch exception
[ https://issues.apache.org/jira/browse/HUDI-4409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang reassigned HUDI-4409: -- Assignee: liujinhui > Improve LockManager wait logic when catch exception > --- > > Key: HUDI-4409 > URL: https://issues.apache.org/jira/browse/HUDI-4409 > Project: Apache Hudi > Issue Type: Improvement >Reporter: liujinhui >Assignee: liujinhui >Priority: Major > Labels: pull-request-available > > {code:java} > //public void lock() { > if > (writeConfig.getWriteConcurrencyMode().supportsOptimisticConcurrencyControl()) > { > LockProvider lockProvider = getLockProvider(); > int retryCount = 0; > boolean acquired = false; > while (retryCount <= maxRetries) { > try { > acquired = > lockProvider.tryLock(writeConfig.getLockAcquireWaitTimeoutInMs(), > TimeUnit.MILLISECONDS); > if (acquired) { > break; > } > LOG.info("Retrying to acquire lock..."); > Thread.sleep(maxWaitTimeInMs); > } catch (HoodieLockException | InterruptedException e) { > if (retryCount >= maxRetries) { > throw new HoodieLockException("Unable to acquire lock, lock object > ", e); > } > } finally { > retryCount++; > } > } > if (!acquired) { > throw new HoodieLockException("Unable to acquire lock, lock object " + > lockProvider.getLock()); > } > } > } {code} > We should put sleep in catch -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (HUDI-4409) Improve LockManager wait logic when catch exception
[ https://issues.apache.org/jira/browse/HUDI-4409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang closed HUDI-4409. -- Resolution: Done > Improve LockManager wait logic when catch exception > --- > > Key: HUDI-4409 > URL: https://issues.apache.org/jira/browse/HUDI-4409 > Project: Apache Hudi > Issue Type: Improvement >Reporter: liujinhui >Assignee: liujinhui >Priority: Major > Labels: pull-request-available > Fix For: 0.12.0 > > > {code:java} > //public void lock() { > if > (writeConfig.getWriteConcurrencyMode().supportsOptimisticConcurrencyControl()) > { > LockProvider lockProvider = getLockProvider(); > int retryCount = 0; > boolean acquired = false; > while (retryCount <= maxRetries) { > try { > acquired = > lockProvider.tryLock(writeConfig.getLockAcquireWaitTimeoutInMs(), > TimeUnit.MILLISECONDS); > if (acquired) { > break; > } > LOG.info("Retrying to acquire lock..."); > Thread.sleep(maxWaitTimeInMs); > } catch (HoodieLockException | InterruptedException e) { > if (retryCount >= maxRetries) { > throw new HoodieLockException("Unable to acquire lock, lock object > ", e); > } > } finally { > retryCount++; > } > } > if (!acquired) { > throw new HoodieLockException("Unable to acquire lock, lock object " + > lockProvider.getLock()); > } > } > } {code} > We should put sleep in catch -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HUDI-4409) Improve LockManager wait logic when catch exception
[ https://issues.apache.org/jira/browse/HUDI-4409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang updated HUDI-4409: --- Fix Version/s: 0.12.0 > Improve LockManager wait logic when catch exception > --- > > Key: HUDI-4409 > URL: https://issues.apache.org/jira/browse/HUDI-4409 > Project: Apache Hudi > Issue Type: Improvement >Reporter: liujinhui >Assignee: liujinhui >Priority: Major > Labels: pull-request-available > Fix For: 0.12.0 > > > {code:java} > //public void lock() { > if > (writeConfig.getWriteConcurrencyMode().supportsOptimisticConcurrencyControl()) > { > LockProvider lockProvider = getLockProvider(); > int retryCount = 0; > boolean acquired = false; > while (retryCount <= maxRetries) { > try { > acquired = > lockProvider.tryLock(writeConfig.getLockAcquireWaitTimeoutInMs(), > TimeUnit.MILLISECONDS); > if (acquired) { > break; > } > LOG.info("Retrying to acquire lock..."); > Thread.sleep(maxWaitTimeInMs); > } catch (HoodieLockException | InterruptedException e) { > if (retryCount >= maxRetries) { > throw new HoodieLockException("Unable to acquire lock, lock object > ", e); > } > } finally { > retryCount++; > } > } > if (!acquired) { > throw new HoodieLockException("Unable to acquire lock, lock object " + > lockProvider.getLock()); > } > } > } {code} > We should put sleep in catch -- This message was sent by Atlassian Jira (v8.20.10#820010)
[hudi] branch master updated: [HUDI-4409] Improve LockManager wait logic when catch exception (#6122)
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 1959b843b7 [HUDI-4409] Improve LockManager wait logic when catch exception (#6122) 1959b843b7 is described below commit 1959b843b706babed8c16ee31c6fc266871d709f Author: liujinhui <965147...@qq.com> AuthorDate: Mon Jul 18 22:45:52 2022 +0800 [HUDI-4409] Improve LockManager wait logic when catch exception (#6122) --- .../java/org/apache/hudi/client/transaction/lock/LockManager.java| 5 + 1 file changed, 5 insertions(+) diff --git a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/transaction/lock/LockManager.java b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/transaction/lock/LockManager.java index ca15c4fdc2..6ebae44fd4 100644 --- a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/transaction/lock/LockManager.java +++ b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/transaction/lock/LockManager.java @@ -74,6 +74,11 @@ public class LockManager implements Serializable, AutoCloseable { if (retryCount >= maxRetries) { throw new HoodieLockException("Unable to acquire lock, lock object ", e); } + try { +Thread.sleep(maxWaitTimeInMs); + } catch (InterruptedException ex) { +// ignore InterruptedException here + } } finally { retryCount++; }
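The five-line patch above moves the back-off into the exception path: before HUDI-4409, a `tryLock` that threw skipped the `Thread.sleep` and burned through the remaining retries immediately. As a standalone, hedged sketch — illustrative names, not Hudi's actual `LockManager` API — the corrected retry loop looks roughly like:

```java
/**
 * Standalone sketch of the retry loop after HUDI-4409: the thread now also
 * backs off inside the catch block, instead of only after an attempt that
 * returned false. Names are illustrative, not Hudi's API.
 */
public class RetryLockSketch {

  /** A lock attempt that may throw or return false before succeeding. */
  public interface LockAttempt {
    boolean tryOnce() throws InterruptedException;
  }

  /** Returns true once tryOnce succeeds, retrying up to maxRetries with a back-off. */
  public static boolean acquireWithRetry(LockAttempt attempt, int maxRetries, long waitMs) {
    int retryCount = 0;
    boolean acquired = false;
    while (retryCount <= maxRetries) {
      try {
        acquired = attempt.tryOnce();
        if (acquired) {
          break;
        }
        Thread.sleep(waitMs); // back off after an unsuccessful (non-throwing) attempt
      } catch (RuntimeException | InterruptedException e) {
        if (retryCount >= maxRetries) {
          throw new IllegalStateException("Unable to acquire lock", e);
        }
        try {
          Thread.sleep(waitMs); // HUDI-4409: also back off when the attempt threw
        } catch (InterruptedException ignored) {
          // ignore InterruptedException here, matching the patch
        }
      } finally {
        retryCount++;
      }
    }
    return acquired;
  }

  public static void main(String[] args) {
    // Simulate a provider that throws twice, then succeeds on the third attempt.
    int[] calls = {0};
    boolean ok = acquireWithRetry(() -> {
      if (calls[0]++ < 2) {
        throw new RuntimeException("lock contended");
      }
      return true;
    }, 5, 1L);
    System.out.println(ok); // true: acquired after two failed attempts
  }
}
```

Without the sleep in the catch block, a provider that fails fast (e.g. a refused connection) would exhaust all retries in milliseconds; with it, each failed attempt waits the configured interval.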
[hudi] branch master updated (0ff34b6974 -> 7689e62cd9)
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from 0ff34b6974 [HUDI-4214] improve repeat init write schema in ExpressionPayload (#5820) add 7689e62cd9 [HUDI-4265] Deprecate useless targetTableName parameter in HoodieMultiTableDeltaStreamer (#5883) No new revisions were added by this update. Summary of changes: .../deltastreamer/HoodieMultiTableDeltaStreamer.java| 17 - 1 file changed, 12 insertions(+), 5 deletions(-)
[hudi] branch master updated: [HUDI-4218] [HUDI-4218] Expose the real exception information when an exception occurs in the tableExists method (#5827)
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new c291b05699 [HUDI-4218] [HUDI-4218] Expose the real exception information when an exception occurs in the tableExists method (#5827) c291b05699 is described below commit c291b056996f7c5c2c25ad75f5ac57dd64028327 Author: 董可伦 AuthorDate: Wed Jun 15 18:10:35 2022 +0800 [HUDI-4218] [HUDI-4218] Expose the real exception information when an exception occurs in the tableExists method (#5827) --- hudi-utilities/src/main/java/org/apache/hudi/utilities/UtilHelpers.java | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hudi-utilities/src/main/java/org/apache/hudi/utilities/UtilHelpers.java b/hudi-utilities/src/main/java/org/apache/hudi/utilities/UtilHelpers.java index f389695f7b..175ba3d66f 100644 --- a/hudi-utilities/src/main/java/org/apache/hudi/utilities/UtilHelpers.java +++ b/hudi-utilities/src/main/java/org/apache/hudi/utilities/UtilHelpers.java @@ -403,7 +403,7 @@ public class UtilHelpers { statement.setQueryTimeout(Integer.parseInt(options.get(JDBCOptions.JDBC_QUERY_TIMEOUT(; statement.executeQuery(); } catch (SQLException e) { - return false; + throw new HoodieException(e); } return true; }
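The one-line change above replaces `return false` with a rethrow: previously any `SQLException` in the existence probe was reported as "table does not exist", hiding real failures such as a bad JDBC URL or refused connection. A hedged, self-contained sketch of the pattern (illustrative names, not the actual `UtilHelpers` signature):

```java
import java.sql.SQLException;

/**
 * Sketch of the HUDI-4218 change: a thrown SQLException from the existence
 * probe is wrapped and rethrown so callers can distinguish "table missing"
 * from "probe failed". Class and interface names are illustrative.
 */
public class TableProbeSketch {

  /** A probe such as executing "SELECT 1 FROM ..." with a query timeout. */
  public interface Probe {
    void run() throws SQLException;
  }

  /** True if the probe succeeds; a SQLException now surfaces as a wrapped exception. */
  public static boolean tableExists(Probe probe) {
    try {
      probe.run();
    } catch (SQLException e) {
      // Before the fix this was "return false", silently masking the root cause.
      throw new RuntimeException("Failed probing table existence", e);
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println(tableExists(() -> { })); // true: probe succeeded
    try {
      tableExists(() -> { throw new SQLException("connection refused"); });
    } catch (RuntimeException e) {
      System.out.println(e.getCause().getMessage()); // root cause is preserved
    }
  }
}
```

The design trade-off: callers that treated "exception" as "missing table" must now catch the wrapped exception, but they gain the real diagnostic instead of a silent false.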
[hudi] branch asf-site updated: [MINOR] Fix incorrect full-width comma usage in the doc DDL demo (#5721)
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new 9761fde718 [MINOR] Fix incorrect full-width comma usage in the doc DDL demo (#5721) 9761fde718 is described below commit 9761fde718642238705833ea1b4b0cc5930634f1 Author: 木木夕120 AuthorDate: Tue May 31 19:58:59 2022 +0800 [MINOR] Fix incorrect full-width comma usage in the doc DDL demo (#5721) --- website/docs/table_management.md | 2 +- website/versioned_docs/version-0.10.0/table_management.md | 2 +- website/versioned_docs/version-0.10.1/table_management.md | 2 +- website/versioned_docs/version-0.11.0/table_management.md | 2 +- website/versioned_docs/version-0.9.0/quick-start-guide.md | 2 +- 5 files changed, 5 insertions(+), 5 deletions(-) diff --git a/website/docs/table_management.md b/website/docs/table_management.md index 92cb6092aa..762b7c5916 100644 --- a/website/docs/table_management.md +++ b/website/docs/table_management.md @@ -82,7 +82,7 @@ Here is an example of creating a COW partitioned table. create table if not exists hudi_table_p0 ( id bigint, name string, -dt string, +dt string, hh string ) using hudi options ( diff --git a/website/versioned_docs/version-0.10.0/table_management.md b/website/versioned_docs/version-0.10.0/table_management.md index 76c02edc6d..ad7af11c55 100644 --- a/website/versioned_docs/version-0.10.0/table_management.md +++ b/website/versioned_docs/version-0.10.0/table_management.md @@ -82,7 +82,7 @@ Here is an example of creating a COW partitioned table. 
create table if not exists hudi_table_p0 ( id bigint, name string, -dt string, +dt string, hh string ) using hudi options ( diff --git a/website/versioned_docs/version-0.10.1/table_management.md b/website/versioned_docs/version-0.10.1/table_management.md index 76c02edc6d..ad7af11c55 100644 --- a/website/versioned_docs/version-0.10.1/table_management.md +++ b/website/versioned_docs/version-0.10.1/table_management.md @@ -82,7 +82,7 @@ Here is an example of creating a COW partitioned table. create table if not exists hudi_table_p0 ( id bigint, name string, -dt string, +dt string, hh string ) using hudi options ( diff --git a/website/versioned_docs/version-0.11.0/table_management.md b/website/versioned_docs/version-0.11.0/table_management.md index 92cb6092aa..762b7c5916 100644 --- a/website/versioned_docs/version-0.11.0/table_management.md +++ b/website/versioned_docs/version-0.11.0/table_management.md @@ -82,7 +82,7 @@ Here is an example of creating a COW partitioned table. create table if not exists hudi_table_p0 ( id bigint, name string, -dt string, +dt string, hh string ) using hudi options ( diff --git a/website/versioned_docs/version-0.9.0/quick-start-guide.md b/website/versioned_docs/version-0.9.0/quick-start-guide.md index 7196332003..6a7e1cc7c8 100644 --- a/website/versioned_docs/version-0.9.0/quick-start-guide.md +++ b/website/versioned_docs/version-0.9.0/quick-start-guide.md @@ -251,7 +251,7 @@ Here is an example of creating an external COW partitioned table. create table if not exists hudi_table_p0 ( id bigint, name string, -dt string, +dt string, hh string ) using hudi location '/tmp/hudi/hudi_table_p0'
[hudi] branch master updated: [MINOR] Minor fixes to exception log and removing unwanted metrics flush in integ test (#5646)
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 7d02b1fd3c [MINOR] Minor fixes to exception log and removing unwanted metrics flush in integ test (#5646) 7d02b1fd3c is described below commit 7d02b1fd3c74abfbd118f69a10a8c106cc900a3e Author: Sivabalan Narayanan AuthorDate: Fri May 20 19:27:35 2022 -0400 [MINOR] Minor fixes to exception log and removing unwanted metrics flush in integ test (#5646) --- .../org/apache/hudi/integ/testsuite/dag/scheduler/DagScheduler.java| 3 --- .../main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java | 2 +- 2 files changed, 1 insertion(+), 4 deletions(-) diff --git a/hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/dag/scheduler/DagScheduler.java b/hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/dag/scheduler/DagScheduler.java index 0183f52c2a..ab80df0d6a 100644 --- a/hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/dag/scheduler/DagScheduler.java +++ b/hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/dag/scheduler/DagScheduler.java @@ -117,9 +117,6 @@ public class DagScheduler { if (curRound < workflowDag.getRounds()) { new DelayNode(workflowDag.getIntermittentDelayMins()).execute(executionContext, curRound); } - - // After each level, report and flush the metrics - Metrics.flush(); } while (curRound++ < workflowDag.getRounds()); log.info("Finished workloads"); } diff --git a/hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java b/hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java index 8f44b8b7d0..a1a804b9ed 100644 --- a/hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java +++ b/hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java @@ -846,7 +846,7 @@ public 
class DeltaSync implements Serializable { } return newWriteSchema; } catch (Exception e) { - throw new HoodieException("Failed to fetch schema from table."); + throw new HoodieException("Failed to fetch schema from table ", e); } }
[hudi] branch master updated: [HUDI-3849] AvroDeserializer supports AVRO_REBASE_MODE_IN_READ configuration (#5287)
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 9625d16937 [HUDI-3849] AvroDeserializer supports AVRO_REBASE_MODE_IN_READ configuration (#5287) 9625d16937 is described below commit 9625d16937954a54420384b41f964e48cba8cc2f Author: cxzl25 AuthorDate: Sat May 7 15:39:14 2022 +0800 [HUDI-3849] AvroDeserializer supports AVRO_REBASE_MODE_IN_READ configuration (#5287) --- .../org/apache/spark/sql/avro/HoodieSpark3_2AvroDeserializer.scala | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/hudi-spark-datasource/hudi-spark3/src/main/scala/org/apache/spark/sql/avro/HoodieSpark3_2AvroDeserializer.scala b/hudi-spark-datasource/hudi-spark3/src/main/scala/org/apache/spark/sql/avro/HoodieSpark3_2AvroDeserializer.scala index 0275e2f635..d839c73032 100644 --- a/hudi-spark-datasource/hudi-spark3/src/main/scala/org/apache/spark/sql/avro/HoodieSpark3_2AvroDeserializer.scala +++ b/hudi-spark-datasource/hudi-spark3/src/main/scala/org/apache/spark/sql/avro/HoodieSpark3_2AvroDeserializer.scala @@ -18,13 +18,14 @@ package org.apache.spark.sql.avro import org.apache.avro.Schema -import org.apache.hudi.HoodieSparkUtils +import org.apache.spark.sql.internal.SQLConf import org.apache.spark.sql.types.DataType class HoodieSpark3_2AvroDeserializer(rootAvroType: Schema, rootCatalystType: DataType) extends HoodieAvroDeserializer { - private val avroDeserializer = new AvroDeserializer(rootAvroType, rootCatalystType, "EXCEPTION") + private val avroDeserializer = new AvroDeserializer(rootAvroType, rootCatalystType, +SQLConf.get.getConf(SQLConf.AVRO_REBASE_MODE_IN_READ)) def deserialize(data: Any): Option[Any] = avroDeserializer.deserialize(data) }
[jira] [Closed] (HUDI-184) Integrate Hudi with Apache Flink
[ https://issues.apache.org/jira/browse/HUDI-184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang closed HUDI-184. - Resolution: Implemented This feature has been tracked via https://issues.apache.org/jira/browse/HUDI-1521 > Integrate Hudi with Apache Flink > > > Key: HUDI-184 > URL: https://issues.apache.org/jira/browse/HUDI-184 > Project: Apache Hudi > Issue Type: New Feature > Components: writer-core > Reporter: vinoyang >Assignee: vinoyang >Priority: Major > > Apache Flink is a popular streaming processing engine. > Integrating Hudi with Flink is a valuable work. > The discussion mailing thread is here: > [https://lists.apache.org/api/source.lua/1533de2d4cd4243fa9e8f8bf057ffd02f2ac0bec7c7539d8f72166ea@%3Cdev.hudi.apache.org%3E] -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Reopened] (HUDI-184) Integrate Hudi with Apache Flink
[ https://issues.apache.org/jira/browse/HUDI-184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang reopened HUDI-184: --- > Integrate Hudi with Apache Flink > > > Key: HUDI-184 > URL: https://issues.apache.org/jira/browse/HUDI-184 > Project: Apache Hudi > Issue Type: New Feature > Components: writer-core > Reporter: vinoyang >Assignee: vinoyang >Priority: Major > > Apache Flink is a popular streaming processing engine. > Integrating Hudi with Flink is a valuable work. > The discussion mailing thread is here: > [https://lists.apache.org/api/source.lua/1533de2d4cd4243fa9e8f8bf057ffd02f2ac0bec7c7539d8f72166ea@%3Cdev.hudi.apache.org%3E] -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Closed] (HUDI-609) Implement a Flink specific HoodieIndex
[ https://issues.apache.org/jira/browse/HUDI-609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang closed HUDI-609. - Resolution: Won't Do > Implement a Flink specific HoodieIndex > -- > > Key: HUDI-609 > URL: https://issues.apache.org/jira/browse/HUDI-609 > Project: Apache Hudi > Issue Type: Sub-task > Reporter: vinoyang > Assignee: vinoyang >Priority: Major > > Indexing is a key step in hudi's write flow. {{HoodieIndex}} is the super > abstract class of all the implement of the index. Currently, {{HoodieIndex}} > couples with Spark in the design. However, HUDI-538 is doing the restructure > for hudi-client so that hudi can be decoupled with Spark. After that, we > would get an engine-irrelevant implementation of {{HoodieIndex}}. And > extending that class, we could implement a Flink specific index. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Closed] (HUDI-608) Implement a flink datastream execution context
[ https://issues.apache.org/jira/browse/HUDI-608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang closed HUDI-608. - Resolution: Won't Do > Implement a flink datastream execution context > -- > > Key: HUDI-608 > URL: https://issues.apache.org/jira/browse/HUDI-608 > Project: Apache Hudi > Issue Type: Sub-task > Reporter: vinoyang > Assignee: vinoyang >Priority: Major > > Currently {{HoodieWriteClient}} does something like > `hoodieRecordRDD.map().sort()` internally.. if we want to support Flink > DataStream as the object, then we need to somehow define an abstraction like > {{HoodieExecutionContext}} which will have a common set of map(T) -> T, > filter(), repartition() methods. There will be subclass like > {{HoodieFlinkDataStreamExecutionContext}} which will implement it > in Flink specific ways and hand back the transformed T object. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Closed] (HUDI-184) Integrate Hudi with Apache Flink
[ https://issues.apache.org/jira/browse/HUDI-184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang closed HUDI-184. - Resolution: Won't Do > Integrate Hudi with Apache Flink > > > Key: HUDI-184 > URL: https://issues.apache.org/jira/browse/HUDI-184 > Project: Apache Hudi > Issue Type: New Feature > Components: writer-core > Reporter: vinoyang >Assignee: vinoyang >Priority: Major > > Apache Flink is a popular streaming processing engine. > Integrating Hudi with Flink is a valuable work. > The discussion mailing thread is here: > [https://lists.apache.org/api/source.lua/1533de2d4cd4243fa9e8f8bf057ffd02f2ac0bec7c7539d8f72166ea@%3Cdev.hudi.apache.org%3E] -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HUDI-2418) Support HiveSchemaProvider
[ https://issues.apache.org/jira/browse/HUDI-2418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang updated HUDI-2418: --- Summary: Support HiveSchemaProvider (was: add HiveSchemaProvider ) > Support HiveSchemaProvider > --- > > Key: HUDI-2418 > URL: https://issues.apache.org/jira/browse/HUDI-2418 > Project: Apache Hudi > Issue Type: Improvement > Components: DeltaStreamer >Reporter: Jian Feng >Assignee: Jian Feng >Priority: Major > Labels: pull-request-available > Fix For: 0.11.0 > > > when using DeltaStreamer to migrate exist Hive table, it better to have a > HiveSchemaProvider instead of avro schema file. > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[hudi] branch asf-site updated: [MINOR] Fix RocketMQ logo in landing page (#4061)
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new c57cc91 [MINOR] Fix RocketMQ logo in landing page (#4061) c57cc91 is described below commit c57cc91bb7d5c49461713cffcc2bf461799a694a Author: leesf <490081...@qq.com> AuthorDate: Mon Nov 22 10:16:10 2021 +0800 [MINOR] Fix RocketMQ logo in landing page (#4061) --- website/static/assets/images/hudi-lake.png | Bin 150248 -> 152033 bytes 1 file changed, 0 insertions(+), 0 deletions(-) diff --git a/website/static/assets/images/hudi-lake.png b/website/static/assets/images/hudi-lake.png index 103c040..4e6f9cf 100644 Binary files a/website/static/assets/images/hudi-lake.png and b/website/static/assets/images/hudi-lake.png differ
[hudi] branch master updated (aec5d11 -> 4d884bd)
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from aec5d11 Check --source-avro-schema-path parameter (#3987) add 4d884bd [MINOR] Fix typo,'Hooide' corrected to 'Hoodie' (#4007) No new revisions were added by this update. Summary of changes: .../src/main/scala/org/apache/spark/sql/hudi/HoodieOptionConfig.scala | 2 +- .../org/apache/spark/sql/hudi/command/CreateHoodieTableCommand.scala | 2 +- .../test/scala/org/apache/spark/sql/hudi/TestHoodieOptionConfig.scala | 4 ++-- 3 files changed, 4 insertions(+), 4 deletions(-)
[jira] [Created] (HUDI-2699) Remove duplicated zookeeper with tests classifier exists in bundles
vinoyang created HUDI-2699: -- Summary: Remove duplicated zookeeper with tests classifier exists in bundles Key: HUDI-2699 URL: https://issues.apache.org/jira/browse/HUDI-2699 Project: Apache Hudi Issue Type: Sub-task Reporter: vinoyang Assignee: vinoyang -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (HUDI-2643) Remove duplicated hbase-common with tests classifier exists in bundles
[ https://issues.apache.org/jira/browse/HUDI-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang closed HUDI-2643. -- Resolution: Done 13b637ddc3ab9fba51e303cfa0343a496e476d26 > Remove duplicated hbase-common with tests classifier exists in bundles > -- > > Key: HUDI-2643 > URL: https://issues.apache.org/jira/browse/HUDI-2643 > Project: Apache Hudi > Issue Type: Sub-task > Reporter: vinoyang > Assignee: vinoyang >Priority: Major > Labels: pull-request-available > Fix For: 0.10.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HUDI-2643) Remove duplicated hbase-common with tests classifier exists in bundles
[ https://issues.apache.org/jira/browse/HUDI-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang updated HUDI-2643: --- Fix Version/s: 0.10.0 > Remove duplicated hbase-common with tests classifier exists in bundles -- This message was sent by Atlassian Jira (v8.3.4#803005)
[hudi] branch master updated: [HUDI-2643] Remove duplicated hbase-common with tests classifier exists in bundles (#3886)
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 13b637d [HUDI-2643] Remove duplicated hbase-common with tests classifier exists in bundles (#3886) 13b637d is described below commit 13b637ddc3ab9fba51e303cfa0343a496e476d26 Author: vinoyang AuthorDate: Mon Nov 1 20:11:00 2021 +0800 [HUDI-2643] Remove duplicated hbase-common with tests classifier exists in bundles (#3886) --- dependencies/hudi-flink-bundle_2.11.txt | 1 - dependencies/hudi-flink-bundle_2.12.txt | 7 +++--- dependencies/hudi-hadoop-mr-bundle.txt | 8 +-- dependencies/hudi-integ-test-bundle.txt | 35 - dependencies/hudi-spark-bundle_2.11.txt | 1 - dependencies/hudi-spark-bundle_2.12.txt | 1 - dependencies/hudi-spark3-bundle_2.12.txt| 1 - dependencies/hudi-utilities-bundle_2.11.txt | 1 - dependencies/hudi-utilities-bundle_2.12.txt | 7 +++--- packaging/hudi-flink-bundle/pom.xml | 4 packaging/hudi-hadoop-mr-bundle/pom.xml | 10 + packaging/hudi-spark-bundle/pom.xml | 4 packaging/hudi-utilities-bundle/pom.xml | 4 13 files changed, 43 insertions(+), 41 deletions(-) diff --git a/dependencies/hudi-flink-bundle_2.11.txt b/dependencies/hudi-flink-bundle_2.11.txt index 9252d0a..7ece1e8 100644 --- a/dependencies/hudi-flink-bundle_2.11.txt +++ b/dependencies/hudi-flink-bundle_2.11.txt @@ -133,7 +133,6 @@ hamcrest-core/org.hamcrest/1.3//hamcrest-core-1.3.jar hbase-annotations/org.apache.hbase/1.2.3//hbase-annotations-1.2.3.jar hbase-client/org.apache.hbase/1.2.3//hbase-client-1.2.3.jar hbase-common/org.apache.hbase/1.2.3//hbase-common-1.2.3.jar -hbase-common/org.apache.hbase/1.2.3/tests/hbase-common-1.2.3-tests.jar hbase-hadoop-compat/org.apache.hbase/1.2.3//hbase-hadoop-compat-1.2.3.jar hbase-hadoop2-compat/org.apache.hbase/1.2.3//hbase-hadoop2-compat-1.2.3.jar 
hbase-prefix-tree/org.apache.hbase/1.2.3//hbase-prefix-tree-1.2.3.jar diff --git a/dependencies/hudi-flink-bundle_2.12.txt b/dependencies/hudi-flink-bundle_2.12.txt index 84eacdc..d7566b5 100644 --- a/dependencies/hudi-flink-bundle_2.12.txt +++ b/dependencies/hudi-flink-bundle_2.12.txt @@ -134,7 +134,6 @@ hamcrest-core/org.hamcrest/1.3//hamcrest-core-1.3.jar hbase-annotations/org.apache.hbase/1.2.3//hbase-annotations-1.2.3.jar hbase-client/org.apache.hbase/1.2.3//hbase-client-1.2.3.jar hbase-common/org.apache.hbase/1.2.3//hbase-common-1.2.3.jar -hbase-common/org.apache.hbase/1.2.3/tests/hbase-common-1.2.3-tests.jar hbase-hadoop-compat/org.apache.hbase/1.2.3//hbase-hadoop-compat-1.2.3.jar hbase-hadoop2-compat/org.apache.hbase/1.2.3//hbase-hadoop2-compat-1.2.3.jar hbase-prefix-tree/org.apache.hbase/1.2.3//hbase-prefix-tree-1.2.3.jar @@ -163,10 +162,10 @@ htrace-core/org.apache.htrace/3.1.0-incubating//htrace-core-3.1.0-incubating.jar httpclient/org.apache.httpcomponents/4.4.1//httpclient-4.4.1.jar httpcore/org.apache.httpcomponents/4.4.1//httpcore-4.4.1.jar ivy/org.apache.ivy/2.4.0//ivy-2.4.0.jar -jackson-annotations/com.fasterxml.jackson.core/2.6.7//jackson-annotations-2.6.7.jar +jackson-annotations/com.fasterxml.jackson.core/2.10.0//jackson-annotations-2.10.0.jar jackson-core-asl/org.codehaus.jackson/1.9.13//jackson-core-asl-1.9.13.jar -jackson-core/com.fasterxml.jackson.core/2.6.7//jackson-core-2.6.7.jar -jackson-databind/com.fasterxml.jackson.core/2.6.7.3//jackson-databind-2.6.7.3.jar +jackson-core/com.fasterxml.jackson.core/2.10.0//jackson-core-2.10.0.jar +jackson-databind/com.fasterxml.jackson.core/2.10.0//jackson-databind-2.10.0.jar jackson-jaxrs/org.codehaus.jackson/1.9.13//jackson-jaxrs-1.9.13.jar jackson-mapper-asl/org.codehaus.jackson/1.9.13//jackson-mapper-asl-1.9.13.jar jackson-xc/org.codehaus.jackson/1.9.13//jackson-xc-1.9.13.jar diff --git a/dependencies/hudi-hadoop-mr-bundle.txt b/dependencies/hudi-hadoop-mr-bundle.txt index a9c4afe..bcc2659 100644 
--- a/dependencies/hudi-hadoop-mr-bundle.txt +++ b/dependencies/hudi-hadoop-mr-bundle.txt @@ -70,7 +70,6 @@ hamcrest-core/org.hamcrest/1.3//hamcrest-core-1.3.jar hbase-annotations/org.apache.hbase/1.2.3//hbase-annotations-1.2.3.jar hbase-client/org.apache.hbase/1.2.3//hbase-client-1.2.3.jar hbase-common/org.apache.hbase/1.2.3//hbase-common-1.2.3.jar -hbase-common/org.apache.hbase/1.2.3/tests/hbase-common-1.2.3-tests.jar hbase-hadoop-compat/org.apache.hbase/1.2.3//hbase-hadoop-compat-1.2.3.jar hbase-hadoop2-compat/org.apache.hbase/1.2.3//hbase-hadoop2-compat-1.2.3.jar hbase-prefix-tree/org.apache.hbase/1.2.3//hbase-prefix-tree-1.2.3.jar @@ -85,7 +84,9 @@ jackson-annotations/com.fasterxml.jackson.core/2.6.7//jackson-annotations-2.6.7. jackson-core-asl/org.codehaus.jackson/1.9.13//jackson-core-asl-1.9.13.jar jackson-core/com.fasterxml.jackson.core/2.6.7//jackson-core-2.6.7.jar
[jira] [Created] (HUDI-2643) Remove duplicated hbase-common with tests classifier exists in bundles
vinoyang created HUDI-2643: -- Summary: Remove duplicated hbase-common with tests classifier exists in bundles Key: HUDI-2643 URL: https://issues.apache.org/jira/browse/HUDI-2643 Project: Apache Hudi Issue Type: Sub-task Reporter: vinoyang Assignee: vinoyang -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (HUDI-2614) Remove duplicated hadoop-hdfs with tests classifier exists in bundles
[ https://issues.apache.org/jira/browse/HUDI-2614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang closed HUDI-2614. -- Resolution: Done b1c4acf0aeb0f3d650c8e704828b1c2b0d2b5b40 > Remove duplicated hadoop-hdfs with tests classifier exists in bundles > - > > Key: HUDI-2614 > URL: https://issues.apache.org/jira/browse/HUDI-2614 > Project: Apache Hudi > Issue Type: Sub-task > Reporter: vinoyang > Assignee: vinoyang >Priority: Major > Labels: pull-request-available > Fix For: 0.10.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HUDI-2614) Remove duplicated hadoop-hdfs with tests classifier exists in bundles
[ https://issues.apache.org/jira/browse/HUDI-2614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang updated HUDI-2614: --- Fix Version/s: 0.10.0 > Remove duplicated hadoop-hdfs with tests classifier exists in bundles -- This message was sent by Atlassian Jira (v8.3.4#803005)
[hudi] branch master updated (e3fc746 -> b1c4acf)
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from e3fc746 [HUDI-2625] Revert "[HUDI-2005] Avoiding direct fs calls in HoodieLogFileReader (#3757)" (#3863) add b1c4acf [HUDI-2614] Remove duplicated hadoop-hdfs with tests classifier exists in bundles (#3864) No new revisions were added by this update. Summary of changes: dependencies/hudi-flink-bundle_2.11.txt | 1 - dependencies/hudi-flink-bundle_2.12.txt | 13 +-- dependencies/hudi-hive-sync-bundle.txt | 5 dependencies/hudi-integ-test-bundle.txt | 36 + dependencies/hudi-kafka-connect-bundle.txt | 1 - dependencies/hudi-spark-bundle_2.11.txt | 1 - dependencies/hudi-spark-bundle_2.12.txt | 4 +--- dependencies/hudi-spark3-bundle_2.12.txt| 4 +--- dependencies/hudi-utilities-bundle_2.11.txt | 1 - dependencies/hudi-utilities-bundle_2.12.txt | 10 hudi-client/hudi-client-common/pom.xml | 1 + hudi-client/hudi-java-client/pom.xml| 22 ++ hudi-integ-test/pom.xml | 1 + hudi-spark-datasource/hudi-spark/pom.xml| 22 ++ hudi-sync/hudi-hive-sync/pom.xml| 1 + packaging/hudi-integ-test-bundle/pom.xml| 1 + pom.xml | 1 + 17 files changed, 87 insertions(+), 38 deletions(-)
[jira] [Created] (HUDI-2614) Remove duplicated hadoop-hdfs with tests classifier exists in bundles
vinoyang created HUDI-2614: -- Summary: Remove duplicated hadoop-hdfs with tests classifier exists in bundles Key: HUDI-2614 URL: https://issues.apache.org/jira/browse/HUDI-2614 Project: Apache Hudi Issue Type: Sub-task Reporter: vinoyang Assignee: vinoyang -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HUDI-2600) Remove duplicated hadoop-common with tests classifier exists in bundles
[ https://issues.apache.org/jira/browse/HUDI-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang updated HUDI-2600: --- Fix Version/s: 0.10.0 > Remove duplicated hadoop-common with tests classifier exists in bundles > --- > > Key: HUDI-2600 > URL: https://issues.apache.org/jira/browse/HUDI-2600 > Project: Apache Hudi > Issue Type: Sub-task > Components: Release Administrative > Reporter: vinoyang >Assignee: vinoyang >Priority: Major > Labels: pull-request-available > Fix For: 0.10.0 > > > We found many duplicated dependencies in the generated dependency list, > `hadoop-common` is one of them: > {code:java} > hadoop-common/org.apache.hadoop/2.7.3//hadoop-common-2.7.3.jar > hadoop-common/org.apache.hadoop/2.7.3/tests/hadoop-common-2.7.3-tests.jar > {code} > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
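The duplicate pattern called out in HUDI-2600 (the same artifact/group/version appearing twice, once with and once without a `tests` classifier) can be detected mechanically from a bundle dependency list. A minimal sketch in plain Java; the `artifact/group/version/classifier/jar` line layout is taken from the dependency lists above, while the class and method names are hypothetical:

```java
import java.util.*;
import java.util.stream.*;

public class DuplicateDeps {
    // Returns coordinates (artifact/group/version) listed more than once
    // in a dependency list, ignoring the classifier field.
    static List<String> duplicates(List<String> lines) {
        return lines.stream()
            .map(l -> String.join("/", Arrays.copyOf(l.split("/"), 3)))
            .collect(Collectors.groupingBy(c -> c, Collectors.counting()))
            .entrySet().stream()
            .filter(e -> e.getValue() > 1)
            .map(Map.Entry::getKey)
            .sorted()
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> deps = List.of(
            "hadoop-common/org.apache.hadoop/2.7.3//hadoop-common-2.7.3.jar",
            "hadoop-common/org.apache.hadoop/2.7.3/tests/hadoop-common-2.7.3-tests.jar",
            "hadoop-auth/org.apache.hadoop/2.7.3//hadoop-auth-2.7.3.jar");
        // Flags hadoop-common, since both classifier variants share the
        // same artifact/group/version coordinate.
        System.out.println(duplicates(deps)); // [hadoop-common/org.apache.hadoop/2.7.3]
    }
}
```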
[jira] [Closed] (HUDI-2600) Remove duplicated hadoop-common with tests classifier exists in bundles
[ https://issues.apache.org/jira/browse/HUDI-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang closed HUDI-2600. -- Resolution: Done 220bf6a7e6f5cdf0efbbbee9df6852a8b2288570 > Remove duplicated hadoop-common with tests classifier exists in bundles -- This message was sent by Atlassian Jira (v8.3.4#803005)
[hudi] branch master updated: [HUDI-2600] Remove duplicated hadoop-common with tests classifier exists in bundles (#3847)
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 220bf6a [HUDI-2600] Remove duplicated hadoop-common with tests classifier exists in bundles (#3847) 220bf6a is described below commit 220bf6a7e6f5cdf0efbbbee9df6852a8b2288570 Author: vinoyang AuthorDate: Mon Oct 25 13:45:28 2021 +0800 [HUDI-2600] Remove duplicated hadoop-common with tests classifier exists in bundles (#3847) --- dependencies/hudi-flink-bundle_2.11.txt | 6 +++--- dependencies/hudi-hive-sync-bundle.txt | 7 +-- dependencies/hudi-kafka-connect-bundle.txt | 3 +-- dependencies/hudi-spark-bundle_2.11.txt | 3 +-- dependencies/hudi-timeline-server-bundle.txt | 1 - dependencies/hudi-utilities-bundle_2.11.txt | 3 +-- hudi-client/hudi-client-common/pom.xml | 1 + hudi-sync/hudi-hive-sync/pom.xml | 1 + hudi-timeline-service/pom.xml| 1 + 9 files changed, 10 insertions(+), 16 deletions(-) diff --git a/dependencies/hudi-flink-bundle_2.11.txt b/dependencies/hudi-flink-bundle_2.11.txt index b97995c..4414594 100644 --- a/dependencies/hudi-flink-bundle_2.11.txt +++ b/dependencies/hudi-flink-bundle_2.11.txt @@ -64,7 +64,7 @@ commons-lang/commons-lang/2.6//commons-lang-2.6.jar commons-lang3/org.apache.commons/3.1//commons-lang3-3.1.jar commons-logging/commons-logging/1.2//commons-logging-1.2.jar commons-math/org.apache.commons/2.2//commons-math-2.2.jar -commons-math3/org.apache.commons/3.1.1//commons-math3-3.1.1.jar +commons-math3/org.apache.commons/3.5//commons-math3-3.5.jar commons-net/commons-net/3.1//commons-net-3.1.jar commons-pool/commons-pool/1.6//commons-pool-1.6.jar config/com.typesafe/1.3.3//config-1.3.3.jar @@ -107,6 +107,7 @@ force-shading/org.apache.flink/1.13.1//force-shading-1.13.1.jar grizzled-slf4j_2.11/org.clapper/1.3.2//grizzled-slf4j_2.11-1.3.2.jar groovy-all/org.codehaus.groovy/2.4.4//groovy-all-2.4.4.jar 
gson/com.google.code.gson/2.3.1//gson-2.3.1.jar +guava/com.google.guava/12.0.1//guava-12.0.1.jar guice-assistedinject/com.google.inject.extensions/3.0//guice-assistedinject-3.0.jar guice-servlet/com.google.inject.extensions/3.0//guice-servlet-3.0.jar guice/com.google.inject/3.0//guice-3.0.jar @@ -114,7 +115,6 @@ hadoop-annotations/org.apache.hadoop/2.7.3//hadoop-annotations-2.7.3.jar hadoop-auth/org.apache.hadoop/2.7.3//hadoop-auth-2.7.3.jar hadoop-client/org.apache.hadoop/2.7.3//hadoop-client-2.7.3.jar hadoop-common/org.apache.hadoop/2.7.3//hadoop-common-2.7.3.jar -hadoop-common/org.apache.hadoop/2.7.3/tests/hadoop-common-2.7.3-tests.jar hadoop-hdfs/org.apache.hadoop/2.7.3//hadoop-hdfs-2.7.3.jar hadoop-hdfs/org.apache.hadoop/2.7.3/tests/hadoop-hdfs-2.7.3-tests.jar hadoop-mapreduce-client-app/org.apache.hadoop/2.7.3//hadoop-mapreduce-client-app-2.7.3.jar @@ -132,7 +132,7 @@ hadoop-yarn-server-resourcemanager/org.apache.hadoop/2.7.2//hadoop-yarn-server-r hadoop-yarn-server-web-proxy/org.apache.hadoop/2.7.2//hadoop-yarn-server-web-proxy-2.7.2.jar hamcrest-core/org.hamcrest/1.3//hamcrest-core-1.3.jar hbase-annotations/org.apache.hbase/1.2.3//hbase-annotations-1.2.3.jar -hbase-client/org.apache.hbase/1.1.1//hbase-client-1.1.1.jar +hbase-client/org.apache.hbase/1.2.3//hbase-client-1.2.3.jar hbase-common/org.apache.hbase/1.2.3//hbase-common-1.2.3.jar hbase-common/org.apache.hbase/1.2.3/tests/hbase-common-1.2.3-tests.jar hbase-hadoop-compat/org.apache.hbase/1.2.3//hbase-hadoop-compat-1.2.3.jar diff --git a/dependencies/hudi-hive-sync-bundle.txt b/dependencies/hudi-hive-sync-bundle.txt index aefcfbb..f80ee31 100644 --- a/dependencies/hudi-hive-sync-bundle.txt +++ b/dependencies/hudi-hive-sync-bundle.txt @@ -56,7 +56,6 @@ hadoop-annotations/org.apache.hadoop/2.7.3//hadoop-annotations-2.7.3.jar hadoop-auth/org.apache.hadoop/2.7.3//hadoop-auth-2.7.3.jar hadoop-client/org.apache.hadoop/2.7.3//hadoop-client-2.7.3.jar 
hadoop-common/org.apache.hadoop/2.7.3//hadoop-common-2.7.3.jar -hadoop-common/org.apache.hadoop/2.7.3/tests/hadoop-common-2.7.3-tests.jar hadoop-hdfs/org.apache.hadoop/2.7.3//hadoop-hdfs-2.7.3.jar hadoop-hdfs/org.apache.hadoop/2.7.3/tests/hadoop-hdfs-2.7.3-tests.jar hadoop-mapreduce-client-app/org.apache.hadoop/2.7.3//hadoop-mapreduce-client-app-2.7.3.jar @@ -87,9 +86,7 @@ jackson-annotations/com.fasterxml.jackson.core/2.6.7//jackson-annotations-2.6.7. jackson-core-asl/org.codehaus.jackson/1.9.13//jackson-core-asl-1.9.13.jar jackson-core/com.fasterxml.jackson.core/2.6.7//jackson-core-2.6.7.jar jackson-databind/com.fasterxml.jackson.core/2.6.7.3//jackson-databind-2.6.7.3.jar -jackson-jaxrs/org.codehaus.jackson/1.9.13//jackson-jaxrs-1.9.13.jar jackson-mapper-asl/org.codehaus.jackson/1.9.13//jackson-mapper-asl-1.9.13.jar -jackson-xc/org.codehaus.jackson/1.9.13//jackson-xc-1.9.13.jar jamon-runtime/org.jamon/2.4.1//jamon
[hudi] branch master updated: [MINOR] Show source table operator details on the flink web when reading hudi table (#3842)
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 91845e2 [MINOR] Show source table operator details on the flink web when reading hudi table (#3842) 91845e2 is described below commit 91845e241da242cede95f705b0637331ce9222ff Author: mincwang <33626973+mincw...@users.noreply.github.com> AuthorDate: Sun Oct 24 23:18:01 2021 +0800 [MINOR] Show source table operator details on the flink web when reading hudi table (#3842) --- .../java/org/apache/hudi/table/HoodieTableSource.java | 19 +-- 1 file changed, 17 insertions(+), 2 deletions(-) diff --git a/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableSource.java b/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableSource.java index 4e193fa..f0dbffd 100644 --- a/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableSource.java +++ b/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableSource.java @@ -180,7 +180,7 @@ public class HoodieTableSource implements conf, FilePathUtils.toFlinkPath(path), maxCompactionMemoryInBytes, getRequiredPartitionPaths()); InputFormat inputFormat = getInputFormat(true); OneInputStreamOperatorFactory factory = StreamReadOperator.factory((MergeOnReadInputFormat) inputFormat); - SingleOutputStreamOperator source = execEnv.addSource(monitoringFunction, "split_monitor") + SingleOutputStreamOperator source = execEnv.addSource(monitoringFunction, getSourceOperatorName("split_monitor")) .setParallelism(1) .transform("split_reader", typeInfo, factory) .setParallelism(conf.getInteger(FlinkOptions.READ_TASKS)); @@ -188,7 +188,7 @@ public class HoodieTableSource implements } else { InputFormatSourceFunction func = new InputFormatSourceFunction<>(getInputFormat(), typeInfo); DataStreamSource source = execEnv.addSource(func, asSummaryString(), typeInfo); - return 
source.name("bounded_source").setParallelism(conf.getInteger(FlinkOptions.READ_TASKS)); + return source.name(getSourceOperatorName("bounded_source")).setParallelism(conf.getInteger(FlinkOptions.READ_TASKS)); } } }; @@ -266,6 +266,21 @@ public class HoodieTableSource implements return requiredPartitions; } + private String getSourceOperatorName(String operatorName) { +String[] schemaFieldNames = this.schema.getColumnNames().toArray(new String[0]); +List fields = Arrays.stream(this.requiredPos) +.mapToObj(i -> schemaFieldNames[i]) +.collect(Collectors.toList()); +StringBuilder sb = new StringBuilder(); +sb.append(operatorName) +.append("(") + .append("table=").append(Collections.singletonList(conf.getString(FlinkOptions.TABLE_NAME))) +.append(", ") +.append("fields=").append(fields) +.append(")"); +return sb.toString(); + } + @Nullable private Set getRequiredPartitionPaths() { if (this.requiredPartitions == null) {
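The `getSourceOperatorName` helper added in this commit bakes the table name and the projected fields into the operator name shown on the Flink web UI. A stand-alone sketch of the naming scheme it produces; the class name and the sample table/field values here are hypothetical:

```java
import java.util.*;

public class OperatorNameDemo {
    // Minimal re-creation of the naming scheme from the commit above:
    // operatorName(table=[...], fields=[...]).
    static String sourceOperatorName(String op, String table, List<String> fields) {
        return op
            + "(table=" + Collections.singletonList(table)
            + ", fields=" + fields
            + ")";
    }

    public static void main(String[] args) {
        // e.g. the "split_monitor" source reading columns id and ts of table t1
        System.out.println(sourceOperatorName("split_monitor", "t1", List.of("id", "ts")));
        // split_monitor(table=[t1], fields=[id, ts])
    }
}
```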
[jira] [Closed] (HUDI-2592) NumberFormatException: Zero length BigInteger when write.precombine.field is decimal type
[ https://issues.apache.org/jira/browse/HUDI-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang closed HUDI-2592.
Resolution: Fixed

> NumberFormatException: Zero length BigInteger when write.precombine.field is decimal type
> -----------------------------------------------------------------------------------------
>
> Key: HUDI-2592
> URL: https://issues.apache.org/jira/browse/HUDI-2592
> Project: Apache Hudi
> Issue Type: Bug
> Components: Common Core
> Reporter: Matrix42
> Assignee: Matrix42
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.10.0, 0.11.0
>
> When write.precombine.field is a decimal type, the written decimal will be an empty byte array, and reading it back throws NumberFormatException: Zero length BigInteger, like below:
> {code:java}
> 2021-10-20 17:14:03
> java.lang.NumberFormatException: Zero length BigInteger
> at java.math.BigInteger.<init>(BigInteger.java:302)
> at org.apache.flink.table.data.DecimalData.fromUnscaledBytes(DecimalData.java:223)
> at org.apache.flink.connectors.hudi.util.AvroToRowDataConverters.lambda$createDecimalConverter$4dc14f00$1(AvroToRowDataConverters.java:158)
> at org.apache.flink.connectors.hudi.util.AvroToRowDataConverters.lambda$createNullableConverter$4568343a$1(AvroToRowDataConverters.java:94)
> at org.apache.flink.connectors.hudi.util.AvroToRowDataConverters.lambda$createRowConverter$68595fbd$1(AvroToRowDataConverters.java:75)
> at org.apache.flink.connectors.hudi.table.format.mor.MergeOnReadInputFormat$1.hasNext(MergeOnReadInputFormat.java:300)
> at org.apache.flink.connectors.hudi.table.format.mor.MergeOnReadInputFormat$LogFileOnlyIterator.reachedEnd(MergeOnReadInputFormat.java:362)
> at org.apache.flink.connectors.hudi.table.format.mor.MergeOnReadInputFormat.reachedEnd(MergeOnReadInputFormat.java:202)
> at org.apache.flink.streaming.api.functions.source.InputFormatSourceFunction.run(InputFormatSourceFunction.java:90)
> at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:100)
> at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:63)
> at org.apache.flink.streaming.runtime.tasks.SourceStreamTask$LegacySourceFunctionThread.run(SourceStreamTask.java:213)
> {code}
> Analysis:
> HoodieAvroUtils.getNestedFieldVal is invoked to extract the precombine field, which in turn invokes convertValueForAvroLogicalTypes. When the field is a decimal type, the ByteBuffer is consumed, so we should rewind it:
> {code:java}
> private static Object convertValueForAvroLogicalTypes(Schema fieldSchema, Object fieldValue) {
>   if (fieldSchema.getLogicalType() == LogicalTypes.date()) {
>     return LocalDate.ofEpochDay(Long.parseLong(fieldValue.toString()));
>   } else if (fieldSchema.getLogicalType() instanceof LogicalTypes.Decimal) {
>     Decimal dc = (Decimal) fieldSchema.getLogicalType();
>     DecimalConversion decimalConversion = new DecimalConversion();
>     if (fieldSchema.getType() == Schema.Type.FIXED) {
>       return decimalConversion.fromFixed((GenericFixed) fieldValue, fieldSchema,
>           LogicalTypes.decimal(dc.getPrecision(), dc.getScale()));
>     } else if (fieldSchema.getType() == Schema.Type.BYTES) {
>       // this method will consume the byteBuffer
>       return decimalConversion.fromBytes((ByteBuffer) fieldValue, fieldSchema,
>           LogicalTypes.decimal(dc.getPrecision(), dc.getScale()));
>     }
>   }
>   return fieldValue;
> }
> {code}
-- This message was sent by Atlassian Jira (v8.3.4#803005)
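The analysis in HUDI-2592 boils down to java.nio.ByteBuffer position semantics: once a converter reads all remaining bytes, the position sits at the limit, so the next reader sees zero bytes and a zero-length BigInteger. A minimal stand-alone sketch; the `consume` helper is a hypothetical stand-in for a converter such as Avro's DecimalConversion.fromBytes:

```java
import java.nio.ByteBuffer;

public class ByteBufferRewindDemo {
    // Hypothetical stand-in for a converter that reads every remaining
    // byte, leaving the buffer's position at its limit.
    static byte[] consume(ByteBuffer buf) {
        byte[] out = new byte[buf.remaining()];
        buf.get(out); // advances position to the limit
        return out;
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.wrap(new byte[] {1, 2, 3});

        byte[] first = consume(buf);   // reads 3 bytes
        byte[] second = consume(buf);  // buffer exhausted: reads 0 bytes --
                                       // the "Zero length BigInteger" symptom
        System.out.println(first.length + " " + second.length); // 3 0

        // The fix suggested in the analysis: rewind (or work on a
        // duplicate) so later readers see the full unscaled bytes again.
        buf.rewind();
        System.out.println(consume(buf.duplicate()).length + " " + buf.remaining()); // 3 3
    }
}
```

Reading from `buf.duplicate()` keeps the original buffer's position untouched, which is the safer variant when the same ByteBuffer may be handed to several converters.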
[jira] [Reopened] (HUDI-2592) NumberFormatException: Zero length BigInteger when write.precombine.field is decimal type
[ https://issues.apache.org/jira/browse/HUDI-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang reopened HUDI-2592: > NumberFormatException: Zero length BigInteger when write.precombine.field is > decimal type -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HUDI-2592) NumberFormatException: Zero length BigInteger when write.precombine.field is decimal type
[ https://issues.apache.org/jira/browse/HUDI-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17432925#comment-17432925 ] vinoyang commented on HUDI-2592: [~Matrix42] I have given you Jira contributor permission. Thanks for your contribution! > NumberFormatException: Zero length BigInteger when write.precombine.field is > decimal type -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HUDI-2592) NumberFormatException: Zero length BigInteger when write.precombine.field is decimal type
[ https://issues.apache.org/jira/browse/HUDI-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang reassigned HUDI-2592: -- Assignee: Matrix42 > NumberFormatException: Zero length BigInteger when write.precombine.field is > decimal type > - > > Key: HUDI-2592 > URL: https://issues.apache.org/jira/browse/HUDI-2592 > Project: Apache Hudi > Issue Type: Bug > Components: Common Core >Reporter: Matrix42 >Assignee: Matrix42 >Priority: Major > Labels: pull-request-available > Fix For: 0.10.0, 0.11.0 > > > when write.precombine.field is decimal type,write decimal will be an empty > byte array, when read will throw NumberFormatException: Zero length > BigInteger like below: > {code:java} > 2021-10-20 17:14:03 > java.lang.NumberFormatException: Zero length BigInteger > at java.math.BigInteger.(BigInteger.java:302) > at > org.apache.flink.table.data.DecimalData.fromUnscaledBytes(DecimalData.java:223) > at > org.apache.flink.connectors.hudi.util.AvroToRowDataConverters.lambda$createDecimalConverter$4dc14f00$1(AvroToRowDataConverters.java:158) > at > org.apache.flink.connectors.hudi.util.AvroToRowDataConverters.lambda$createNullableConverter$4568343a$1(AvroToRowDataConverters.java:94) > at > org.apache.flink.connectors.hudi.util.AvroToRowDataConverters.lambda$createRowConverter$68595fbd$1(AvroToRowDataConverters.java:75) > at > org.apache.flink.connectors.hudi.table.format.mor.MergeOnReadInputFormat$1.hasNext(MergeOnReadInputFormat.java:300) > at > org.apache.flink.connectors.hudi.table.format.mor.MergeOnReadInputFormat$LogFileOnlyIterator.reachedEnd(MergeOnReadInputFormat.java:362) > at > org.apache.flink.connectors.hudi.table.format.mor.MergeOnReadInputFormat.reachedEnd(MergeOnReadInputFormat.java:202) > at > org.apache.flink.streaming.api.functions.source.InputFormatSourceFunction.run(InputFormatSourceFunction.java:90) > at > org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:100) > at > 
org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:63) > at > org.apache.flink.streaming.runtime.tasks.SourceStreamTask$LegacySourceFunctionThread.run(SourceStreamTask.java:213) > {code} > Analysis: > > HoodieAvroUtils.getNestedFieldVal is invoked to extract the precombine field; it then calls convertValueForAvroLogicalTypes. When the field is a decimal > type, the ByteBuffer is consumed, so it should be rewound. > {code:java} > private static Object convertValueForAvroLogicalTypes(Schema fieldSchema, > Object fieldValue) { > if (fieldSchema.getLogicalType() == LogicalTypes.date()) { > return LocalDate.ofEpochDay(Long.parseLong(fieldValue.toString())); > } else if (fieldSchema.getLogicalType() instanceof LogicalTypes.Decimal) { > Decimal dc = (Decimal) fieldSchema.getLogicalType(); > DecimalConversion decimalConversion = new DecimalConversion(); > if (fieldSchema.getType() == Schema.Type.FIXED) { > return decimalConversion.fromFixed((GenericFixed) fieldValue, > fieldSchema, > LogicalTypes.decimal(dc.getPrecision(), dc.getScale())); > } else if (fieldSchema.getType() == Schema.Type.BYTES) { > > // this method will consume the ByteBuffer > return decimalConversion.fromBytes((ByteBuffer) fieldValue, fieldSchema, > LogicalTypes.decimal(dc.getPrecision(), dc.getScale())); > } > } > return fieldValue; > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
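The consumption behavior described in the analysis above can be illustrated in isolation. The class and `readAll` helper below are hypothetical stand-ins (not Hudi or Avro code), but the buffer semantics are standard java.nio: reading all remaining bytes, as DecimalConversion.fromBytes does, advances the position to the limit, so any later reader of the same ByteBuffer sees an empty payload unless the buffer is rewound.

```java
import java.nio.ByteBuffer;

public class ByteBufferRewindDemo {
    // Hypothetical stand-in for DecimalConversion.fromBytes: drains all remaining bytes.
    static byte[] readAll(ByteBuffer value) {
        byte[] bytes = new byte[value.remaining()];
        value.get(bytes); // advances position to limit, "consuming" the buffer
        return bytes;
    }

    public static void main(String[] args) {
        // Unscaled bytes of a small decimal, as Avro holds them for a BYTES decimal
        ByteBuffer decimalBytes = ByteBuffer.wrap(new byte[] {0x01, 0x23});

        // First read (e.g. extracting the precombine field) consumes the buffer...
        System.out.println(readAll(decimalBytes).length); // 2

        // ...so a second read (e.g. serializing the record) sees zero bytes — the
        // empty array that later surfaces as "Zero length BigInteger" on the read path.
        System.out.println(readAll(decimalBytes).length); // 0

        // Rewinding after the first read restores the payload for later readers.
        decimalBytes.rewind();
        System.out.println(readAll(decimalBytes).length); // 2
    }
}
```

This is why the fix rewinds the buffer in HoodieAvroUtils rather than changing the conversion itself: the conversion is free to consume its input, as long as the shared record field is left readable afterwards.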
[jira] [Updated] (HUDI-2592) NumberFormatException: Zero length BigInteger when write.precombine.field is decimal type
[ https://issues.apache.org/jira/browse/HUDI-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang updated HUDI-2592: --- Status: Closed (was: Patch Available) > NumberFormatException: Zero length BigInteger when write.precombine.field is > decimal type > - > > Key: HUDI-2592 > URL: https://issues.apache.org/jira/browse/HUDI-2592 > Project: Apache Hudi > Issue Type: Bug > Components: Common Core > Reporter: Matrix42 > Priority: Major > Labels: pull-request-available > Fix For: 0.10.0, 0.11.0 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[hudi] branch master updated (84ca981 -> 499af7c)
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from 84ca981 [HUDI-2553] Metadata table compaction trigger max delta commits (#3794) add 499af7c [HUDI-2592] Fix write empty array when write.precombine.field is decimal type (#3837) No new revisions were added by this update. Summary of changes: .../java/org/apache/hudi/avro/HoodieAvroUtils.java | 11 +++--- .../org/apache/hudi/avro/TestHoodieAvroUtils.java | 40 +- 2 files changed, 39 insertions(+), 12 deletions(-)
[jira] [Created] (HUDI-2600) Remove duplicated hadoop-common with tests classifier exists in bundles
vinoyang created HUDI-2600: -- Summary: Remove duplicated hadoop-common with tests classifier exists in bundles Key: HUDI-2600 URL: https://issues.apache.org/jira/browse/HUDI-2600 Project: Apache Hudi Issue Type: Sub-task Components: Release Administrative Reporter: vinoyang We found many duplicated dependencies in the generated dependency list; `hadoop-common` is one of them: {code:java} hadoop-common/org.apache.hadoop/2.7.3//hadoop-common-2.7.3.jar hadoop-common/org.apache.hadoop/2.7.3/tests/hadoop-common-2.7.3-tests.jar {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
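Duplicates like the hadoop-common pair above can be spotted mechanically. The sketch below is not Hudi's actual dependency tooling (scripts/dependency.sh), just a minimal illustration assuming the `name/group/version[/classifier]/file` line format shown in the issue: each entry is keyed on its first three segments, so the same artifact with and without a classifier collapses to one key and shows up as a duplicate.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class BundleDupCheck {
    // Key each "name/group/version/classifier/file" line by name/group/version,
    // then report keys that occur more than once (e.g. plain jar + tests classifier).
    static List<String> duplicates(List<String> deps) {
        Map<String, Long> counts = deps.stream()
                .map(d -> String.join("/", Arrays.copyOfRange(d.split("/"), 0, 3)))
                .collect(Collectors.groupingBy(k -> k, Collectors.counting()));
        return counts.entrySet().stream()
                .filter(e -> e.getValue() > 1)
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> deps = Arrays.asList(
                "hadoop-common/org.apache.hadoop/2.7.3//hadoop-common-2.7.3.jar",
                "hadoop-common/org.apache.hadoop/2.7.3/tests/hadoop-common-2.7.3-tests.jar",
                "avro/org.apache.avro/1.8.2//avro-1.8.2.jar");
        System.out.println(duplicates(deps)); // [hadoop-common/org.apache.hadoop/2.7.3]
    }
}
```

Keying on coordinates rather than file names is the point: the two hadoop-common entries have different file names but identical group/artifact/version, which is exactly the redundancy the issue wants removed from the bundles.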
[jira] [Assigned] (HUDI-2600) Remove duplicated hadoop-common with tests classifier exists in bundles
[ https://issues.apache.org/jira/browse/HUDI-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang reassigned HUDI-2600: -- Assignee: vinoyang > Remove duplicated hadoop-common with tests classifier exists in bundles > --- > > Key: HUDI-2600 > URL: https://issues.apache.org/jira/browse/HUDI-2600 > Project: Apache Hudi > Issue Type: Sub-task > Components: Release Administrative > Reporter: vinoyang > Assignee: vinoyang > Priority: Major -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (HUDI-2507) Generate more dependency list file for other bundles
[ https://issues.apache.org/jira/browse/HUDI-2507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang closed HUDI-2507. -- Resolution: Done b480294e792b6344d37560587f8f6e170e210d14 > Generate more dependency list file for other bundles > > > Key: HUDI-2507 > URL: https://issues.apache.org/jira/browse/HUDI-2507 > Project: Apache Hudi > Issue Type: Sub-task > Components: Usability > Reporter: vinoyang >Assignee: vinoyang >Priority: Major > Labels: pull-request-available > Fix For: 0.10.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HUDI-2507) Generate more dependency list file for other bundles
[ https://issues.apache.org/jira/browse/HUDI-2507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang updated HUDI-2507: --- Fix Version/s: 0.10.0 > Generate more dependency list file for other bundles > > > Key: HUDI-2507 > URL: https://issues.apache.org/jira/browse/HUDI-2507 > Project: Apache Hudi > Issue Type: Sub-task > Components: Usability > Reporter: vinoyang >Assignee: vinoyang >Priority: Major > Labels: pull-request-available > Fix For: 0.10.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[hudi] branch master updated (aa3c4ec -> b480294)
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from aa3c4ec [HUDI-2583] Refactor TestWriteCopyOnWrite test cases (#3832) add b480294 [HUDI-2507] Generate more dependency list file for other bundles (#3773) No new revisions were added by this update. Summary of changes: .../hudi-flink-bundle_2.11.txt | 0 .../hudi-flink-bundle_2.12.txt | 9 +- .../hudi-hadoop-mr-bundle.txt | 1 - .../hudi-hive-sync-bundle.txt | 14 +- .../hudi-integ-test-bundle.txt | 64 ++--- .../hudi-kafka-connect-bundle.txt | 143 +++ .../hudi-presto-bundle.txt | 0 .../hudi-spark-bundle_2.11.txt | 0 .../hudi-spark-bundle_2.12.txt | 4 +- .../hudi-spark3-bundle_2.12.txt| 10 +- .../hudi-timeline-server-bundle.txt| 0 .../hudi-utilities-bundle_2.11.txt | 0 .../hudi-utilities-bundle_2.12.txt | 6 +- scripts/dependency.sh | 155 +++-- 14 files changed, 202 insertions(+), 204 deletions(-) copy dev/dependencyList_hudi-flink-bundle_2.11.txt => dependencies/hudi-flink-bundle_2.11.txt (100%) rename dev/dependencyList_hudi-flink-bundle_2.11.txt => dependencies/hudi-flink-bundle_2.12.txt (97%) copy dev/dependencyList_hudi-presto-bundle.txt => dependencies/hudi-hadoop-mr-bundle.txt (99%) copy dev/dependencyList_hudi-presto-bundle.txt => dependencies/hudi-hive-sync-bundle.txt (91%) copy dev/dependencyList_hudi-utilities-bundle_2.11.txt => dependencies/hudi-integ-test-bundle.txt (87%) copy dev/dependencyList_hudi-utilities-bundle_2.11.txt => dependencies/hudi-kafka-connect-bundle.txt (70%) rename dev/dependencyList_hudi-presto-bundle.txt => dependencies/hudi-presto-bundle.txt (100%) copy dev/dependencyList_hudi-spark-bundle_2.11.txt => dependencies/hudi-spark-bundle_2.11.txt (100%) copy dev/dependencyList_hudi-spark-bundle_2.11.txt => dependencies/hudi-spark-bundle_2.12.txt (99%) rename dev/dependencyList_hudi-spark-bundle_2.11.txt => dependencies/hudi-spark3-bundle_2.12.txt (97%) rename 
dev/dependencyList_hudi-timeline-server-bundle.txt => dependencies/hudi-timeline-server-bundle.txt (100%) copy dev/dependencyList_hudi-utilities-bundle_2.11.txt => dependencies/hudi-utilities-bundle_2.11.txt (100%) rename dev/dependencyList_hudi-utilities-bundle_2.11.txt => dependencies/hudi-utilities-bundle_2.12.txt (98%)
[hudi] branch master updated: [MINOR] Fix typo, 'intance' corrected to 'instance' (#3788)
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 46f0496 [MINOR] Fix typo,'intance' corrected to 'instance' (#3788) 46f0496 is described below commit 46f0496a0838431cd8886ca882a902d801c4dfb8 Author: 董可伦 AuthorDate: Tue Oct 19 23:16:48 2021 +0800 [MINOR] Fix typo,'intance' corrected to 'instance' (#3788) --- .../java/org/apache/hudi/table/action/clean/CleanActionExecutor.java| 2 +- .../org/apache/hudi/table/action/restore/BaseRestoreActionExecutor.java | 2 +- .../apache/hudi/table/action/rollback/BaseRollbackActionExecutor.java | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/clean/CleanActionExecutor.java b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/clean/CleanActionExecutor.java index abe88b9..1b229ca 100644 --- a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/clean/CleanActionExecutor.java +++ b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/clean/CleanActionExecutor.java @@ -211,7 +211,7 @@ public class CleanActionExecutor extends /** * Update metadata table if available. Any update to metadata table happens within data table lock. - * @param cleanMetadata intance of {@link HoodieCleanMetadata} to be applied to metadata. + * @param cleanMetadata instance of {@link HoodieCleanMetadata} to be applied to metadata. 
*/ private void writeMetadata(HoodieCleanMetadata cleanMetadata) { try { diff --git a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/restore/BaseRestoreActionExecutor.java b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/restore/BaseRestoreActionExecutor.java index 8b0085c..ac8f994 100644 --- a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/restore/BaseRestoreActionExecutor.java +++ b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/restore/BaseRestoreActionExecutor.java @@ -105,7 +105,7 @@ public abstract class BaseRestoreActionExecutor
[jira] [Created] (HUDI-2508) Build GA for the dependency diff check workflow
vinoyang created HUDI-2508: -- Summary: Build GA for the dependency diff check workflow Key: HUDI-2508 URL: https://issues.apache.org/jira/browse/HUDI-2508 Project: Apache Hudi Issue Type: Sub-task Components: Usability Reporter: vinoyang -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HUDI-2508) Build GA for the dependency diff check workflow
[ https://issues.apache.org/jira/browse/HUDI-2508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang reassigned HUDI-2508: -- Assignee: vinoyang > Build GA for the dependency diff check workflow > -- > > Key: HUDI-2508 > URL: https://issues.apache.org/jira/browse/HUDI-2508 > Project: Apache Hudi > Issue Type: Sub-task > Components: Usability > Reporter: vinoyang > Assignee: vinoyang > Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HUDI-2507) Generate more dependency list file for other bundles
[ https://issues.apache.org/jira/browse/HUDI-2507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang reassigned HUDI-2507: -- Assignee: vinoyang > Generate more dependency list file for other bundles > > > Key: HUDI-2507 > URL: https://issues.apache.org/jira/browse/HUDI-2507 > Project: Apache Hudi > Issue Type: Sub-task > Components: Usability > Reporter: vinoyang >Assignee: vinoyang >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HUDI-2507) Generate more dependency list file for other bundles
vinoyang created HUDI-2507: -- Summary: Generate more dependency list file for other bundles Key: HUDI-2507 URL: https://issues.apache.org/jira/browse/HUDI-2507 Project: Apache Hudi Issue Type: Sub-task Components: Usability Reporter: vinoyang -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HUDI-2506) Hudi dependency governance
[ https://issues.apache.org/jira/browse/HUDI-2506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang reassigned HUDI-2506: -- Assignee: vinoyang > Hudi dependency governance > -- > > Key: HUDI-2506 > URL: https://issues.apache.org/jira/browse/HUDI-2506 > Project: Apache Hudi > Issue Type: Task > Components: Usability > Reporter: vinoyang >Assignee: vinoyang >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (HUDI-2440) Add dependency change diff script for dependency governance
[ https://issues.apache.org/jira/browse/HUDI-2440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang closed HUDI-2440. -- Resolution: Done > Add dependency change diff script for dependency governace > -- > > Key: HUDI-2440 > URL: https://issues.apache.org/jira/browse/HUDI-2440 > Project: Apache Hudi > Issue Type: Sub-task > Components: Usability, Utilities > Reporter: vinoyang >Assignee: vinoyang >Priority: Major > Labels: pull-request-available > Fix For: 0.10.0 > > > Currently, hudi's dependency management is chaotic, e.g. for > `hudi-spark-bundle_2.11`, the dependency list is here: > {code:java} > HikariCP/2.5.1//HikariCP-2.5.1.jar > ST4/4.0.4//ST4-4.0.4.jar > aircompressor/0.15//aircompressor-0.15.jar > annotations/17.0.0//annotations-17.0.0.jar > ant-launcher/1.9.1//ant-launcher-1.9.1.jar > ant/1.6.5//ant-1.6.5.jar > ant/1.9.1//ant-1.9.1.jar > antlr-runtime/3.5.2//antlr-runtime-3.5.2.jar > aopalliance/1.0//aopalliance-1.0.jar > apache-curator/2.7.1//apache-curator-2.7.1.pom > apacheds-i18n/2.0.0-M15//apacheds-i18n-2.0.0-M15.jar > apacheds-kerberos-codec/2.0.0-M15//apacheds-kerberos-codec-2.0.0-M15.jar > api-asn1-api/1.0.0-M20//api-asn1-api-1.0.0-M20.jar > api-util/1.0.0-M20//api-util-1.0.0-M20.jar > asm/3.1//asm-3.1.jar > avatica-metrics/1.8.0//avatica-metrics-1.8.0.jar > avatica/1.8.0//avatica-1.8.0.jar > avro/1.8.2//avro-1.8.2.jar > bonecp/0.8.0.RELEASE//bonecp-0.8.0.RELEASE.jar > calcite-core/1.10.0//calcite-core-1.10.0.jar > calcite-druid/1.10.0//calcite-druid-1.10.0.jar > calcite-linq4j/1.10.0//calcite-linq4j-1.10.0.jar > commons-beanutils-core/1.8.0//commons-beanutils-core-1.8.0.jar > commons-beanutils/1.7.0//commons-beanutils-1.7.0.jar > commons-cli/1.2//commons-cli-1.2.jar > commons-codec/1.4//commons-codec-1.4.jar > commons-collections/3.2.2//commons-collections-3.2.2.jar > commons-compiler/2.7.6//commons-compiler-2.7.6.jar > commons-compress/1.9//commons-compress-1.9.jar > 
commons-configuration/1.6//commons-configuration-1.6.jar > commons-daemon/1.0.13//commons-daemon-1.0.13.jar > commons-dbcp/1.4//commons-dbcp-1.4.jar > commons-digester/1.8//commons-digester-1.8.jar > commons-el/1.0//commons-el-1.0.jar > commons-httpclient/3.1//commons-httpclient-3.1.jar > commons-io/2.4//commons-io-2.4.jar > commons-lang/2.6//commons-lang-2.6.jar > commons-lang3/3.1//commons-lang3-3.1.jar > commons-logging/1.2//commons-logging-1.2.jar > commons-math/2.2//commons-math-2.2.jar > commons-math3/3.1.1//commons-math3-3.1.1.jar > commons-net/3.1//commons-net-3.1.jar > commons-pool/1.5.4//commons-pool-1.5.4.jar > curator-client/2.7.1//curator-client-2.7.1.jar > curator-framework/2.7.1//curator-framework-2.7.1.jar > curator-recipes/2.7.1//curator-recipes-2.7.1.jar > datanucleus-api-jdo/4.2.4//datanucleus-api-jdo-4.2.4.jar > datanucleus-core/4.1.17//datanucleus-core-4.1.17.jar > datanucleus-rdbms/4.1.19//datanucleus-rdbms-4.1.19.jar > derby/10.10.2.0//derby-10.10.2.0.jar > disruptor/3.3.0//disruptor-3.3.0.jar > dropwizard-metrics-hadoop-metrics2-reporter/0.1.2//dropwizard-metrics-hadoop-metrics2-reporter-0.1.2.jar > eigenbase-properties/1.1.5//eigenbase-properties-1.1.5.jar > fastutil/7.0.13//fastutil-7.0.13.jar > findbugs-annotations/1.3.9-1//findbugs-annotations-1.3.9-1.jar > fluent-hc/4.4.1//fluent-hc-4.4.1.jar > groovy-all/2.4.4//groovy-all-2.4.4.jar > gson/2.3.1//gson-2.3.1.jar > guava/14.0.1//guava-14.0.1.jar > guice-assistedinject/3.0//guice-assistedinject-3.0.jar > guice-servlet/3.0//guice-servlet-3.0.jar > guice/3.0//guice-3.0.jar > hadoop-annotations/2.7.3//hadoop-annotations-2.7.3.jar > hadoop-auth/2.7.3//hadoop-auth-2.7.3.jar > hadoop-client/2.7.3//hadoop-client-2.7.3.jar > hadoop-common/2.7.3//hadoop-common-2.7.3.jar > hadoop-common/2.7.3/tests/hadoop-common-2.7.3-tests.jar > hadoop-hdfs/2.7.3//hadoop-hdfs-2.7.3.jar > hadoop-hdfs/2.7.3/tests/hadoop-hdfs-2.7.3-tests.jar > hadoop-mapreduce-client-app/2.7.3//hadoop-mapreduce-client-app-2.7.3.jar > 
hadoop-mapreduce-client-common/2.7.3//hadoop-mapreduce-client-common-2.7.3.jar > hadoop-mapreduce-client-core/2.7.3//hadoop-mapreduce-client-core-2.7.3.jar > hadoop-mapreduce-client-jobclient/2.7.3//hadoop-mapreduce-client-jobclient-2.7.3.jar > hadoop-mapreduce-client-shuffle/2.7.3//hadoop-mapreduce-client-shuffle-2.7.3.jar > hadoop-yarn-api/2.7.3//hadoop-yarn-api-2.7.3.jar > hadoop-yarn-client/2.7.3//hadoop-yarn-client-2.7.3.jar > hadoop-yarn-common/2.7.3//hadoop-yarn-common-2.7.3.jar > hadoop-yarn-registry/2.7.1//hadoop-yarn-registry
[jira] [Created] (HUDI-2506) Hudi dependency governance
vinoyang created HUDI-2506: -- Summary: Hudi dependency governance Key: HUDI-2506 URL: https://issues.apache.org/jira/browse/HUDI-2506 Project: Apache Hudi Issue Type: Task Components: Usability Reporter: vinoyang -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Reopened] (HUDI-2440) Add dependency change diff script for dependency governance
[ https://issues.apache.org/jira/browse/HUDI-2440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang reopened HUDI-2440: > Add dependency change diff script for dependency governance > -- > > Key: HUDI-2440 > URL: https://issues.apache.org/jira/browse/HUDI-2440 > Project: Apache Hudi > Issue Type: Improvement > Components: Usability, Utilities > Reporter: vinoyang > Assignee: vinoyang > Priority: Major > Labels: pull-request-available > Fix For: 0.10.0
[jira] [Updated] (HUDI-2440) Add dependency change diff script for dependency governance
[ https://issues.apache.org/jira/browse/HUDI-2440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang updated HUDI-2440: --- Parent: HUDI-2506 Issue Type: Sub-task (was: Improvement) > Add dependency change diff script for dependency governance > -- > > Key: HUDI-2440 > URL: https://issues.apache.org/jira/browse/HUDI-2440 > Project: Apache Hudi > Issue Type: Sub-task > Components: Usability, Utilities > Reporter: vinoyang > Assignee: vinoyang > Priority: Major > Labels: pull-request-available > Fix For: 0.10.0
[jira] [Closed] (HUDI-2440) Add dependency change diff script for dependency governance
[ https://issues.apache.org/jira/browse/HUDI-2440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang closed HUDI-2440. -- Resolution: Implemented 47ed91799943271f219419cf209793a98b3f09b5 > Add dependency change diff script for dependency governance > -- > > Key: HUDI-2440 > URL: https://issues.apache.org/jira/browse/HUDI-2440 > Project: Apache Hudi > Issue Type: Improvement > Components: Usability, Utilities > Reporter: vinoyang > Assignee: vinoyang > Priority: Major > Labels: pull-request-available > Fix For: 0.10.0
[jira] [Updated] (HUDI-2440) Add dependency change diff script for dependency governace
[ https://issues.apache.org/jira/browse/HUDI-2440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang updated HUDI-2440: --- Fix Version/s: 0.10.0 > Add dependency change diff script for dependency governace > -- > > Key: HUDI-2440 > URL: https://issues.apache.org/jira/browse/HUDI-2440 > Project: Apache Hudi > Issue Type: Improvement > Components: Usability, Utilities > Reporter: vinoyang >Assignee: vinoyang >Priority: Major > Labels: pull-request-available > Fix For: 0.10.0 > > > Currently, hudi's dependency management is chaotic, e.g. for > `hudi-spark-bundle_2.11`, the dependency list is here: > {code:java} > HikariCP/2.5.1//HikariCP-2.5.1.jar > ST4/4.0.4//ST4-4.0.4.jar > aircompressor/0.15//aircompressor-0.15.jar > annotations/17.0.0//annotations-17.0.0.jar > ant-launcher/1.9.1//ant-launcher-1.9.1.jar > ant/1.6.5//ant-1.6.5.jar > ant/1.9.1//ant-1.9.1.jar > antlr-runtime/3.5.2//antlr-runtime-3.5.2.jar > aopalliance/1.0//aopalliance-1.0.jar > apache-curator/2.7.1//apache-curator-2.7.1.pom > apacheds-i18n/2.0.0-M15//apacheds-i18n-2.0.0-M15.jar > apacheds-kerberos-codec/2.0.0-M15//apacheds-kerberos-codec-2.0.0-M15.jar > api-asn1-api/1.0.0-M20//api-asn1-api-1.0.0-M20.jar > api-util/1.0.0-M20//api-util-1.0.0-M20.jar > asm/3.1//asm-3.1.jar > avatica-metrics/1.8.0//avatica-metrics-1.8.0.jar > avatica/1.8.0//avatica-1.8.0.jar > avro/1.8.2//avro-1.8.2.jar > bonecp/0.8.0.RELEASE//bonecp-0.8.0.RELEASE.jar > calcite-core/1.10.0//calcite-core-1.10.0.jar > calcite-druid/1.10.0//calcite-druid-1.10.0.jar > calcite-linq4j/1.10.0//calcite-linq4j-1.10.0.jar > commons-beanutils-core/1.8.0//commons-beanutils-core-1.8.0.jar > commons-beanutils/1.7.0//commons-beanutils-1.7.0.jar > commons-cli/1.2//commons-cli-1.2.jar > commons-codec/1.4//commons-codec-1.4.jar > commons-collections/3.2.2//commons-collections-3.2.2.jar > commons-compiler/2.7.6//commons-compiler-2.7.6.jar > commons-compress/1.9//commons-compress-1.9.jar > 
commons-configuration/1.6//commons-configuration-1.6.jar > commons-daemon/1.0.13//commons-daemon-1.0.13.jar > commons-dbcp/1.4//commons-dbcp-1.4.jar > commons-digester/1.8//commons-digester-1.8.jar > commons-el/1.0//commons-el-1.0.jar > commons-httpclient/3.1//commons-httpclient-3.1.jar > commons-io/2.4//commons-io-2.4.jar > commons-lang/2.6//commons-lang-2.6.jar > commons-lang3/3.1//commons-lang3-3.1.jar > commons-logging/1.2//commons-logging-1.2.jar > commons-math/2.2//commons-math-2.2.jar > commons-math3/3.1.1//commons-math3-3.1.1.jar > commons-net/3.1//commons-net-3.1.jar > commons-pool/1.5.4//commons-pool-1.5.4.jar > curator-client/2.7.1//curator-client-2.7.1.jar > curator-framework/2.7.1//curator-framework-2.7.1.jar > curator-recipes/2.7.1//curator-recipes-2.7.1.jar > datanucleus-api-jdo/4.2.4//datanucleus-api-jdo-4.2.4.jar > datanucleus-core/4.1.17//datanucleus-core-4.1.17.jar > datanucleus-rdbms/4.1.19//datanucleus-rdbms-4.1.19.jar > derby/10.10.2.0//derby-10.10.2.0.jar > disruptor/3.3.0//disruptor-3.3.0.jar > dropwizard-metrics-hadoop-metrics2-reporter/0.1.2//dropwizard-metrics-hadoop-metrics2-reporter-0.1.2.jar > eigenbase-properties/1.1.5//eigenbase-properties-1.1.5.jar > fastutil/7.0.13//fastutil-7.0.13.jar > findbugs-annotations/1.3.9-1//findbugs-annotations-1.3.9-1.jar > fluent-hc/4.4.1//fluent-hc-4.4.1.jar > groovy-all/2.4.4//groovy-all-2.4.4.jar > gson/2.3.1//gson-2.3.1.jar > guava/14.0.1//guava-14.0.1.jar > guice-assistedinject/3.0//guice-assistedinject-3.0.jar > guice-servlet/3.0//guice-servlet-3.0.jar > guice/3.0//guice-3.0.jar > hadoop-annotations/2.7.3//hadoop-annotations-2.7.3.jar > hadoop-auth/2.7.3//hadoop-auth-2.7.3.jar > hadoop-client/2.7.3//hadoop-client-2.7.3.jar > hadoop-common/2.7.3//hadoop-common-2.7.3.jar > hadoop-common/2.7.3/tests/hadoop-common-2.7.3-tests.jar > hadoop-hdfs/2.7.3//hadoop-hdfs-2.7.3.jar > hadoop-hdfs/2.7.3/tests/hadoop-hdfs-2.7.3-tests.jar > hadoop-mapreduce-client-app/2.7.3//hadoop-mapreduce-client-app-2.7.3.jar > 
hadoop-mapreduce-client-common/2.7.3//hadoop-mapreduce-client-common-2.7.3.jar > hadoop-mapreduce-client-core/2.7.3//hadoop-mapreduce-client-core-2.7.3.jar > hadoop-mapreduce-client-jobclient/2.7.3//hadoop-mapreduce-client-jobclient-2.7.3.jar > hadoop-mapreduce-client-shuffle/2.7.3//hadoop-mapreduce-client-shuffle-2.7.3.jar > hadoop-yarn-api/2.7.3//hadoop-yarn-api-2.7.3.jar > hadoop-yarn-client/2.7.3//hadoop-yarn-client-2.7.3.jar > hadoop-yarn-common/2.7.3//hadoop-yarn-common-2.7.3.jar > hadoop-yarn-registry/2.7.1//hadoop-yarn
[hudi] branch master updated: [HUDI-2440] Add dependency change diff script for dependency governace (#3674)
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 47ed917 [HUDI-2440] Add dependency change diff script for dependency governace (#3674) 47ed917 is described below commit 47ed91799943271f219419cf209793a98b3f09b5 Author: vinoyang AuthorDate: Thu Sep 30 16:56:11 2021 +0800 [HUDI-2440] Add dependency change diff script for dependency governace (#3674) --- dev/dependencyList_hudi-flink-bundle_2.11.txt | 296 +++ dev/dependencyList_hudi-presto-bundle.txt | 132 + dev/dependencyList_hudi-spark-bundle_2.11.txt | 262 + dev/dependencyList_hudi-timeline-server-bundle.txt | 144 + dev/dependencyList_hudi-utilities-bundle_2.11.txt | 324 + scripts/dependency.sh | 127 6 files changed, 1285 insertions(+) diff --git a/dev/dependencyList_hudi-flink-bundle_2.11.txt b/dev/dependencyList_hudi-flink-bundle_2.11.txt new file mode 100644 index 000..b97995c --- /dev/null +++ b/dev/dependencyList_hudi-flink-bundle_2.11.txt @@ -0,0 +1,296 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +#http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# + +HikariCP/com.zaxxer/2.5.1//HikariCP-2.5.1.jar +ST4/org.antlr/4.0.4//ST4-4.0.4.jar +aircompressor/io.airlift/0.15//aircompressor-0.15.jar +akka-actor_2.11/com.typesafe.akka/2.5.21//akka-actor_2.11-2.5.21.jar +akka-protobuf_2.11/com.typesafe.akka/2.5.21//akka-protobuf_2.11-2.5.21.jar +akka-slf4j_2.11/com.typesafe.akka/2.5.21//akka-slf4j_2.11-2.5.21.jar +akka-stream_2.11/com.typesafe.akka/2.5.21//akka-stream_2.11-2.5.21.jar +annotations/org.jetbrains/17.0.0//annotations-17.0.0.jar +ant-launcher/org.apache.ant/1.9.1//ant-launcher-1.9.1.jar +ant/ant/1.6.5//ant-1.6.5.jar +ant/org.apache.ant/1.9.1//ant-1.9.1.jar +antlr-runtime/org.antlr/3.5.2//antlr-runtime-3.5.2.jar +aopalliance/aopalliance/1.0//aopalliance-1.0.jar +apache-curator/org.apache.curator/2.7.1//apache-curator-2.7.1.pom +apacheds-i18n/org.apache.directory.server/2.0.0-M15//apacheds-i18n-2.0.0-M15.jar +apacheds-kerberos-codec/org.apache.directory.server/2.0.0-M15//apacheds-kerberos-codec-2.0.0-M15.jar +api-asn1-api/org.apache.directory.api/1.0.0-M20//api-asn1-api-1.0.0-M20.jar +api-util/org.apache.directory.api/1.0.0-M20//api-util-1.0.0-M20.jar +asm/asm/3.1//asm-3.1.jar +audience-annotations/org.apache.yetus/0.11.0//audience-annotations-0.11.0.jar +avatica-metrics/org.apache.calcite.avatica/1.8.0//avatica-metrics-1.8.0.jar +avatica/org.apache.calcite.avatica/1.8.0//avatica-1.8.0.jar +avro/org.apache.avro/1.10.0//avro-1.10.0.jar +bijection-avro_2.11/com.twitter/0.9.7//bijection-avro_2.11-0.9.7.jar +bijection-core_2.11/com.twitter/0.9.7//bijection-core_2.11-0.9.7.jar +bonecp/com.jolbox/0.8.0.RELEASE//bonecp-0.8.0.RELEASE.jar +calcite-core/org.apache.calcite/1.10.0//calcite-core-1.10.0.jar +calcite-druid/org.apache.calcite/1.10.0//calcite-druid-1.10.0.jar +calcite-linq4j/org.apache.calcite/1.10.0//calcite-linq4j-1.10.0.jar +chill-java/com.twitter/0.7.6//chill-java-0.7.6.jar +chill_2.11/com.twitter/0.7.6//chill_2.11-0.7.6.jar +commons-beanutils-core/commons-beanutils/1.8.0//commons-beanutils-core-1.8.0.jar 
+commons-beanutils/commons-beanutils/1.7.0//commons-beanutils-1.7.0.jar +commons-cli/commons-cli/1.2//commons-cli-1.2.jar +commons-codec/commons-codec/1.4//commons-codec-1.4.jar +commons-collections/commons-collections/3.2.2//commons-collections-3.2.2.jar +commons-compiler/org.codehaus.janino/2.7.6//commons-compiler-2.7.6.jar +commons-compress/org.apache.commons/1.20//commons-compress-1.20.jar +commons-configuration/commons-configuration/1.6//commons-configuration-1.6.jar +commons-daemon/commons-daemon/1.0.13//commons-daemon-1.0.13.jar +commons-dbcp/commons-dbcp/1.4//commons-dbcp-1.4.jar +commons-digester/commons-digester/1.8//commons-digester-1.8.jar +commons-el/commons-el/1.0//commons-el-1.0.jar +commons-httpclient/commons-httpclient/3.0.1//commons-httpclient-3.0.1.jar +commons-io/commons-io/2.4//commons-io-2.4.jar +commons-lang/commons-lang/2.6//commons-lang-2.6.jar +commons-lang3/or
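The commit above checks in per-bundle dependency lists (`dev/dependencyList_*.txt`) plus a `scripts/dependency.sh` that flags changes. The core diff step can be sketched as follows, assuming sorted one-artifact-per-line files; this is an illustrative sketch, not the actual `scripts/dependency.sh`:

```shell
#!/usr/bin/env sh
# Sketch: report artifacts present in only one of two sorted dependency lists.
old=$(mktemp); new=$(mktemp)
printf 'avro/1.8.2//avro-1.8.2.jar\ncommons-io/2.4//commons-io-2.4.jar\n' > "$old"
printf 'avro/1.10.0//avro-1.10.0.jar\ncommons-io/2.4//commons-io-2.4.jar\n' > "$new"
added=$(comm -13 "$old" "$new")     # lines only in the new list
removed=$(comm -23 "$old" "$new")   # lines only in the old list
echo "added: $added"
echo "removed: $removed"
```

In the real workflow the "new" list would be regenerated from the build (e.g. via `mvn dependency:list`) and compared against the checked-in file, failing CI when they diverge.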
[hudi] branch master updated (dd1bd62 -> 2f07e12)
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from dd1bd62 [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource (#3413) add 2f07e12 [MINOR] Fix typo Hooodie corrected to Hoodie & reuqired corrected to required (#3730) No new revisions were added by this update. Summary of changes: .../src/main/java/org/apache/hudi/internal/DefaultSource.java | 2 +- .../src/main/java/org/apache/hudi/spark3/internal/DefaultSource.java| 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
[jira] [Closed] (HUDI-2487) An empty message in Kafka causes a task exception
[ https://issues.apache.org/jira/browse/HUDI-2487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang closed HUDI-2487. -- Fix Version/s: (was: 0.9.0) 0.10.0 Resolution: Implemented 9067657a5ff313990c819065ad12d71fa8bb0f06 > An empty message in Kafka causes a task exception > - > > Key: HUDI-2487 > URL: https://issues.apache.org/jira/browse/HUDI-2487 > Project: Apache Hudi > Issue Type: Improvement > Components: DeltaStreamer >Reporter: qianchutao >Assignee: qianchutao >Priority: Major > Labels: easyfix, newbie, pull-request-available > Fix For: 0.10.0 > > Original Estimate: 24h > Remaining Estimate: 24h > > h1. Question: > When I use deltaStreamer to update hive tables in upsert mode from json > data in Kafka to HUDi, if the value of the message body in Kafka is null, the > task throws an exception. > h2. Exception description: > Lost task 0.1 in stage 2.0 (TID 24, > node-group-1UtpO.1f562475-6982-4b16-a50d-d19b0ebff950.com, executor 6): > org.apache.hudi.exception.HoodieException: The value of tmSmp can not be null > at > org.apache.hudi.avro.HoodieAvroUtils.getNestedFieldVal(HoodieAvroUtils.java:463) > at > org.apache.hudi.utilities.deltastreamer.DeltaSync.lambda$readFromSource$d62e16$1(DeltaSync.java:389) > at > org.apache.spark.api.java.JavaPairRDD$$anonfun$toScalaFunction$1.apply(JavaPairRDD.scala:1040) > at scala.collection.Iterator$$anon$11.next(Iterator.scala:410) > at scala.collection.Iterator$$anon$11.next(Iterator.scala:410) > at > org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:196) > at > org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:62) > at > org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:58) > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:102) > at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55) > at org.apache.spark.scheduler.Task.run(Task.scala:123) > at > 
org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:413) > at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1551) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:419) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > h1. The task Settings: > > {code:java} > hoodie.datasource.write.precombine.field=tmSmp > hoodie.datasource.write.recordkey.field=subOrderId,activityId,ticketId > hoodie.datasource.hive_sync.partition_fields=db,dt > hoodie.datasource.write.partitionpath.field=db:SIMPLE,dt:SIMPLE > hoodie.datasource.write.keygenerator.class=org.apache.hudi.keygen.CustomKeyGenerator > hoodie.datasource.hive_sync.enable=true > hoodie.datasource.meta.sync.enable=true > hoodie.datasource.hive_sync.partition_extractor_class=org.apache.hudi.hive.MultiPartKeysValueExtractor > hoodie.datasource.hive_sync.support_timestamp=true > hoodie.datasource.hive_sync.auto_create_database=true > hoodie.meta.sync.client.tool.class=org.apache.hudi.hive.HiveSyncTool > hoodie.datasource.hive_sync.base_file_format=PARQUET > {code} > > > h1. Spark-submit Script parameter Settings: > > {code:java} > --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer > --source-ordering-field tmSmp \ > --table-type MERGE_ON_READ \ > --target-table ${TABLE_NAME} \ > --source-class org.apache.hudi.utilities.sources.JsonKafkaSource \ > --schemaprovider-class > org.apache.hudi.utilities.schema.FilebasedSchemaProvider \ > --enable-sync \ > --op UPSERT \ > --continuous \ > {code} > > > So I think some optimizations can be made to prevent task throwing, > such as filtering messages with a null value in Kafka. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[hudi] branch master updated: [HUDI-2487] Fix JsonKafkaSource cannot filter empty messages from kafka (#3715)
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 9067657 [HUDI-2487] Fix JsonKafkaSource cannot filter empty messages from kafka (#3715) 9067657 is described below commit 9067657a5ff313990c819065ad12d71fa8bb0f06 Author: qianchutao <72595723+qianchu...@users.noreply.github.com> AuthorDate: Tue Sep 28 13:47:15 2021 +0800 [HUDI-2487] Fix JsonKafkaSource cannot filter empty messages from kafka (#3715) --- .../hudi/utilities/sources/JsonKafkaSource.java| 6 +- .../utilities/sources/TestJsonKafkaSource.java | 22 ++ 2 files changed, 27 insertions(+), 1 deletion(-) diff --git a/hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/JsonKafkaSource.java b/hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/JsonKafkaSource.java index cf9e905..39340d0 100644 --- a/hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/JsonKafkaSource.java +++ b/hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/JsonKafkaSource.java @@ -69,7 +69,11 @@ public class JsonKafkaSource extends JsonSource { private JavaRDD toRDD(OffsetRange[] offsetRanges) { return KafkaUtils.createRDD(sparkContext, offsetGen.getKafkaParams(), offsetRanges, -LocationStrategies.PreferConsistent()).map(x -> (String) x.value()); +LocationStrategies.PreferConsistent()).filter(x -> { + String msgValue = (String) x.value(); + //Filter null messages from Kafka to prevent Exceptions + return msgValue != null; +}).map(x -> (String) x.value()); } @Override diff --git a/hudi-utilities/src/test/java/org/apache/hudi/utilities/sources/TestJsonKafkaSource.java b/hudi-utilities/src/test/java/org/apache/hudi/utilities/sources/TestJsonKafkaSource.java index da11035..2ed4c42 100644 --- a/hudi-utilities/src/test/java/org/apache/hudi/utilities/sources/TestJsonKafkaSource.java +++ 
b/hudi-utilities/src/test/java/org/apache/hudi/utilities/sources/TestJsonKafkaSource.java @@ -151,6 +151,28 @@ public class TestJsonKafkaSource extends UtilitiesTestBase { assertEquals(Option.empty(), fetch4AsRows.getBatch()); } + // test whether empty messages can be filtered + @Test + public void testJsonKafkaSourceFilterNullMsg() { +// topic setup. +testUtils.createTopic(TEST_TOPIC_NAME, 2); +HoodieTestDataGenerator dataGenerator = new HoodieTestDataGenerator(); +TypedProperties props = createPropsForJsonSource(null, "earliest"); + +Source jsonSource = new JsonKafkaSource(props, jsc, sparkSession, schemaProvider, metrics); +SourceFormatAdapter kafkaSource = new SourceFormatAdapter(jsonSource); + +// 1. Extract without any checkpoint => get all the data, respecting sourceLimit +assertEquals(Option.empty(), kafkaSource.fetchNewDataInAvroFormat(Option.empty(), Long.MAX_VALUE).getBatch()); +// Send 1000 non-null messages to Kafka +testUtils.sendMessages(TEST_TOPIC_NAME, Helpers.jsonifyRecords(dataGenerator.generateInserts("000", 1000))); +// Send 100 null messages to Kafka +testUtils.sendMessages(TEST_TOPIC_NAME,new String[100]); +InputBatch> fetch1 = kafkaSource.fetchNewDataInAvroFormat(Option.empty(), Long.MAX_VALUE); +// Verify that messages with null values are filtered +assertEquals(1000, fetch1.getBatch().get().count()); + } + // test case with kafka offset reset strategy @Test public void testJsonKafkaSourceResetStrategy() {
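The fix above drops null Kafka payloads before they reach record parsing, where a null precombine field would otherwise throw. The same idea, stripped of the Spark/Kafka machinery, is a plain null filter over a batch of messages (class and method names here are illustrative, not Hudi APIs):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

public class NullMessageFilter {
    // Drop null message values so downstream JSON parsing never sees them,
    // mirroring the .filter(...) added before .map(...) in JsonKafkaSource.
    public static List<String> filterNulls(List<String> messages) {
        return messages.stream()
                .filter(Objects::nonNull)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> batch = Arrays.asList("{\"id\":1}", null, "{\"id\":2}");
        System.out.println(filterNulls(batch).size()); // prints 2
    }
}
```

Filtering before the map (rather than catching the exception later) keeps the source contract simple: every emitted record is a non-null JSON string.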
[jira] [Closed] (HUDI-2447) Extract common parts from 'if' & Fix typo
[ https://issues.apache.org/jira/browse/HUDI-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang closed HUDI-2447. -- Resolution: Done > Extract common parts from 'if' & Fix typo > - > > Key: HUDI-2447 > URL: https://issues.apache.org/jira/browse/HUDI-2447 > Project: Apache Hudi > Issue Type: Improvement > Components: Hive Integration >Reporter: 董可伦 >Assignee: 董可伦 >Priority: Minor > Labels: pull-request-available > Fix For: 0.10.0 > > > Extract common parts from 'if' & Fix typo -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HUDI-2447) Extract common parts from 'if' & Fix typo
[ https://issues.apache.org/jira/browse/HUDI-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang updated HUDI-2447: --- Priority: Minor (was: Major) > Extract common parts from 'if' & Fix typo > - > > Key: HUDI-2447 > URL: https://issues.apache.org/jira/browse/HUDI-2447 > Project: Apache Hudi > Issue Type: Improvement > Components: Hive Integration >Reporter: 董可伦 >Assignee: 董可伦 >Priority: Minor > Labels: pull-request-available > Fix For: 0.10.0 > > > Extract common parts from 'if' & Fix typo -- This message was sent by Atlassian Jira (v8.3.4#803005)
[hudi] branch master updated (61d0096 -> 3a150ee)
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from 61d0096 [HUDI-2434] Make periodSeconds of GraphiteReporter configurable (#3667) add 3a150ee [HUDI-2447] Extract common business logic & Fix typo (#3683) No new revisions were added by this update. Summary of changes: .../main/java/org/apache/hudi/dla/DLASyncTool.java | 28 ++ 1 file changed, 12 insertions(+), 16 deletions(-)
[jira] [Closed] (HUDI-2434) Add GraphiteReporter reporter periodSeconds config
[ https://issues.apache.org/jira/browse/HUDI-2434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang closed HUDI-2434. -- Resolution: Done > Add GraphiteReporter reporter periodSeconds config > -- > > Key: HUDI-2434 > URL: https://issues.apache.org/jira/browse/HUDI-2434 > Project: Apache Hudi > Issue Type: Improvement >Reporter: liujinhui >Assignee: liujinhui >Priority: Major > Labels: pull-request-available > Fix For: 0.10.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[hudi] branch master updated: [HUDI-2434] Make periodSeconds of GraphiteReporter configurable (#3667)
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 61d0096 [HUDI-2434] Make periodSeconds of GraphiteReporter configurable (#3667) 61d0096 is described below commit 61d009608899bc70c1372d5cb00a2f35e188c30c Author: liujinhui <965147...@qq.com> AuthorDate: Fri Sep 17 19:39:55 2021 +0800 [HUDI-2434] Make periodSeconds of GraphiteReporter configurable (#3667) --- .../org/apache/hudi/config/HoodieWriteConfig.java | 4 ++ .../metrics/HoodieMetricsGraphiteConfig.java | 11 .../hudi/metrics/MetricsGraphiteReporter.java | 4 +- .../hudi/metrics/TestHoodieGraphiteMetrics.java| 60 ++ 4 files changed, 78 insertions(+), 1 deletion(-) diff --git a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java index c871253..7f0ec10 100644 --- a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java +++ b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java @@ -1475,6 +1475,10 @@ public class HoodieWriteConfig extends HoodieConfig { return getString(HoodieMetricsGraphiteConfig.GRAPHITE_METRIC_PREFIX_VALUE); } + public int getGraphiteReportPeriodSeconds() { +return getInt(HoodieMetricsGraphiteConfig.GRAPHITE_REPORT_PERIOD_IN_SECONDS); + } + public String getJmxHost() { return getString(HoodieMetricsJmxConfig.JMX_HOST_NAME); } diff --git a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/metrics/HoodieMetricsGraphiteConfig.java b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/metrics/HoodieMetricsGraphiteConfig.java index 12987a7..25c4c6a 100644 --- a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/metrics/HoodieMetricsGraphiteConfig.java +++ 
b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/metrics/HoodieMetricsGraphiteConfig.java @@ -61,6 +61,12 @@ public class HoodieMetricsGraphiteConfig extends HoodieConfig { .sinceVersion("0.5.1") .withDocumentation("Standard prefix applied to all metrics. This helps to add datacenter, environment information for e.g"); + public static final ConfigProperty GRAPHITE_REPORT_PERIOD_IN_SECONDS = ConfigProperty + .key(GRAPHITE_PREFIX + ".report.period.seconds") + .defaultValue(30) + .sinceVersion("0.10.0") + .withDocumentation("Graphite reporting period in seconds. Default to 30."); + /** * @deprecated Use {@link #GRAPHITE_SERVER_HOST_NAME} and its methods instead */ @@ -126,6 +132,11 @@ public class HoodieMetricsGraphiteConfig extends HoodieConfig { return this; } +public HoodieMetricsGraphiteConfig.Builder periodSeconds(String periodSeconds) { + hoodieMetricsGraphiteConfig.setValue(GRAPHITE_REPORT_PERIOD_IN_SECONDS, periodSeconds); + return this; +} + public HoodieMetricsGraphiteConfig build() { hoodieMetricsGraphiteConfig.setDefaults(HoodieMetricsGraphiteConfig.class.getName()); return hoodieMetricsGraphiteConfig; diff --git a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metrics/MetricsGraphiteReporter.java b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metrics/MetricsGraphiteReporter.java index 9855ac0..c6dff8f 100644 --- a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metrics/MetricsGraphiteReporter.java +++ b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metrics/MetricsGraphiteReporter.java @@ -42,6 +42,7 @@ public class MetricsGraphiteReporter extends MetricsReporter { private final HoodieWriteConfig config; private String serverHost; private int serverPort; + private final int periodSeconds; public MetricsGraphiteReporter(HoodieWriteConfig config, MetricRegistry registry) { this.registry = registry; @@ -56,12 +57,13 @@ public class MetricsGraphiteReporter extends MetricsReporter { 
} this.graphiteReporter = createGraphiteReport(); +this.periodSeconds = config.getGraphiteReportPeriodSeconds(); } @Override public void start() { if (graphiteReporter != null) { - graphiteReporter.start(30, TimeUnit.SECONDS); + graphiteReporter.start(periodSeconds, TimeUnit.SECONDS); } else { LOG.error("Cannot start as the graphiteReporter is null."); } diff --git a/hudi-client/hudi-client-common/src/test/java/org/apache/hudi/metrics/TestHoodieGraphiteMetrics.java b/hudi-client/hudi-client-common/src/test/java/org/apache/hudi/metrics/TestHoodieGraphiteMetrics.java new file mode 100644 index 000..6ff7ee8 --- /dev/null +
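The pattern in this commit — replacing a hard-coded `graphiteReporter.start(30, TimeUnit.SECONDS)` with a config property that defaults to 30 — can be sketched in isolation like this; the key name matches the commit, but the lookup helper is illustrative, not the Hudi `ConfigProperty` API:

```java
import java.util.Properties;

public class ReporterConfigSketch {
    // Property key and default from the commit; resolution logic is a sketch.
    static final String PERIOD_KEY = "hoodie.metrics.graphite.report.period.seconds";
    static final int PERIOD_DEFAULT = 30;

    // Return the configured period, falling back to the default when unset.
    static int reportPeriodSeconds(Properties props) {
        return Integer.parseInt(
                props.getProperty(PERIOD_KEY, String.valueOf(PERIOD_DEFAULT)));
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        System.out.println(reportPeriodSeconds(props)); // prints 30
        props.setProperty(PERIOD_KEY, "10");
        System.out.println(reportPeriodSeconds(props)); // prints 10
    }
}
```

The reporter then passes this value to `start(periodSeconds, TimeUnit.SECONDS)` instead of the literal 30, so deployments can trade metric freshness against Graphite load.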
[jira] [Created] (HUDI-2440) Add dependency change diff script for dependency governace
vinoyang created HUDI-2440: -- Summary: Add dependency change diff script for dependency governace Key: HUDI-2440 URL: https://issues.apache.org/jira/browse/HUDI-2440 Project: Apache Hudi Issue Type: Improvement Components: Utilities Reporter: vinoyang Assignee: vinoyang Currently, hudi's dependency management is chaotic, e.g. for `hudi-spark-bundle_2.11`, the dependency list is here: {code:java} HikariCP/2.5.1//HikariCP-2.5.1.jar ST4/4.0.4//ST4-4.0.4.jar aircompressor/0.15//aircompressor-0.15.jar annotations/17.0.0//annotations-17.0.0.jar ant-launcher/1.9.1//ant-launcher-1.9.1.jar ant/1.6.5//ant-1.6.5.jar ant/1.9.1//ant-1.9.1.jar antlr-runtime/3.5.2//antlr-runtime-3.5.2.jar aopalliance/1.0//aopalliance-1.0.jar apache-curator/2.7.1//apache-curator-2.7.1.pom apacheds-i18n/2.0.0-M15//apacheds-i18n-2.0.0-M15.jar apacheds-kerberos-codec/2.0.0-M15//apacheds-kerberos-codec-2.0.0-M15.jar api-asn1-api/1.0.0-M20//api-asn1-api-1.0.0-M20.jar api-util/1.0.0-M20//api-util-1.0.0-M20.jar asm/3.1//asm-3.1.jar avatica-metrics/1.8.0//avatica-metrics-1.8.0.jar avatica/1.8.0//avatica-1.8.0.jar avro/1.8.2//avro-1.8.2.jar bonecp/0.8.0.RELEASE//bonecp-0.8.0.RELEASE.jar calcite-core/1.10.0//calcite-core-1.10.0.jar calcite-druid/1.10.0//calcite-druid-1.10.0.jar calcite-linq4j/1.10.0//calcite-linq4j-1.10.0.jar commons-beanutils-core/1.8.0//commons-beanutils-core-1.8.0.jar commons-beanutils/1.7.0//commons-beanutils-1.7.0.jar commons-cli/1.2//commons-cli-1.2.jar commons-codec/1.4//commons-codec-1.4.jar commons-collections/3.2.2//commons-collections-3.2.2.jar commons-compiler/2.7.6//commons-compiler-2.7.6.jar commons-compress/1.9//commons-compress-1.9.jar commons-configuration/1.6//commons-configuration-1.6.jar commons-daemon/1.0.13//commons-daemon-1.0.13.jar commons-dbcp/1.4//commons-dbcp-1.4.jar commons-digester/1.8//commons-digester-1.8.jar commons-el/1.0//commons-el-1.0.jar commons-httpclient/3.1//commons-httpclient-3.1.jar commons-io/2.4//commons-io-2.4.jar 
commons-lang/2.6//commons-lang-2.6.jar commons-lang3/3.1//commons-lang3-3.1.jar commons-logging/1.2//commons-logging-1.2.jar commons-math/2.2//commons-math-2.2.jar commons-math3/3.1.1//commons-math3-3.1.1.jar commons-net/3.1//commons-net-3.1.jar commons-pool/1.5.4//commons-pool-1.5.4.jar curator-client/2.7.1//curator-client-2.7.1.jar curator-framework/2.7.1//curator-framework-2.7.1.jar curator-recipes/2.7.1//curator-recipes-2.7.1.jar datanucleus-api-jdo/4.2.4//datanucleus-api-jdo-4.2.4.jar datanucleus-core/4.1.17//datanucleus-core-4.1.17.jar datanucleus-rdbms/4.1.19//datanucleus-rdbms-4.1.19.jar derby/10.10.2.0//derby-10.10.2.0.jar disruptor/3.3.0//disruptor-3.3.0.jar dropwizard-metrics-hadoop-metrics2-reporter/0.1.2//dropwizard-metrics-hadoop-metrics2-reporter-0.1.2.jar eigenbase-properties/1.1.5//eigenbase-properties-1.1.5.jar fastutil/7.0.13//fastutil-7.0.13.jar findbugs-annotations/1.3.9-1//findbugs-annotations-1.3.9-1.jar fluent-hc/4.4.1//fluent-hc-4.4.1.jar groovy-all/2.4.4//groovy-all-2.4.4.jar gson/2.3.1//gson-2.3.1.jar guava/14.0.1//guava-14.0.1.jar guice-assistedinject/3.0//guice-assistedinject-3.0.jar guice-servlet/3.0//guice-servlet-3.0.jar guice/3.0//guice-3.0.jar hadoop-annotations/2.7.3//hadoop-annotations-2.7.3.jar hadoop-auth/2.7.3//hadoop-auth-2.7.3.jar hadoop-client/2.7.3//hadoop-client-2.7.3.jar hadoop-common/2.7.3//hadoop-common-2.7.3.jar hadoop-common/2.7.3/tests/hadoop-common-2.7.3-tests.jar hadoop-hdfs/2.7.3//hadoop-hdfs-2.7.3.jar hadoop-hdfs/2.7.3/tests/hadoop-hdfs-2.7.3-tests.jar hadoop-mapreduce-client-app/2.7.3//hadoop-mapreduce-client-app-2.7.3.jar hadoop-mapreduce-client-common/2.7.3//hadoop-mapreduce-client-common-2.7.3.jar hadoop-mapreduce-client-core/2.7.3//hadoop-mapreduce-client-core-2.7.3.jar hadoop-mapreduce-client-jobclient/2.7.3//hadoop-mapreduce-client-jobclient-2.7.3.jar hadoop-mapreduce-client-shuffle/2.7.3//hadoop-mapreduce-client-shuffle-2.7.3.jar hadoop-yarn-api/2.7.3//hadoop-yarn-api-2.7.3.jar 
hadoop-yarn-client/2.7.3//hadoop-yarn-client-2.7.3.jar hadoop-yarn-common/2.7.3//hadoop-yarn-common-2.7.3.jar hadoop-yarn-registry/2.7.1//hadoop-yarn-registry-2.7.1.jar hadoop-yarn-server-applicationhistoryservice/2.7.2//hadoop-yarn-server-applicationhistoryservice-2.7.2.jar hadoop-yarn-server-common/2.7.2//hadoop-yarn-server-common-2.7.2.jar hadoop-yarn-server-resourcemanager/2.7.2//hadoop-yarn-server-resourcemanager-2.7.2.jar hadoop-yarn-server-web-proxy/2.7.2//hadoop-yarn-server-web-proxy-2.7.2.jar hamcrest-core/1.3//hamcrest-core-1.3.jar hbase-annotations/1.2.3//hbase-annotations-1.2.3.jar hbase-client/1.2.3//hbase-client-1.2.3.jar hbase-common/1.2.3//hbase-common-1.2.3.jar hbase-common/1.2.3/tests/hbase-common-1.2.3-tests.jar hbase-hadoop-compat/1.2.3//hbase-hadoop-compat-1.2.3.jar hbase-hadoop2-compat/1.2.3//hbase-hadoop2-compat-1.2.3.jar hbase-prefix-tree/1.2.3//hbase-prefix-tree-1.2.3.jar hbase
[jira] [Closed] (HUDI-2423) Separate some config logic from HoodieMetricsConfig into HoodieMetricsGraphiteConfig HoodieMetricsJmxConfig
[ https://issues.apache.org/jira/browse/HUDI-2423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang closed HUDI-2423. -- Resolution: Done > Separate some config logic from HoodieMetricsConfig into > HoodieMetricsGraphiteConfig HoodieMetricsJmxConfig > --- > > Key: HUDI-2423 > URL: https://issues.apache.org/jira/browse/HUDI-2423 > Project: Apache Hudi > Issue Type: Improvement >Reporter: liujinhui >Assignee: liujinhui >Priority: Major > Labels: pull-request-available > Fix For: 0.10.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HUDI-2423) Separate some config logic from HoodieMetricsConfig into HoodieMetricsGraphiteConfig HoodieMetricsJmxConfig
[ https://issues.apache.org/jira/browse/HUDI-2423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang updated HUDI-2423: --- Fix Version/s: 0.10.0 > Separate some config logic from HoodieMetricsConfig into > HoodieMetricsGraphiteConfig HoodieMetricsJmxConfig > --- > > Key: HUDI-2423 > URL: https://issues.apache.org/jira/browse/HUDI-2423 > Project: Apache Hudi > Issue Type: Improvement >Reporter: liujinhui >Assignee: liujinhui >Priority: Major > Labels: pull-request-available > Fix For: 0.10.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HUDI-2423) Separate some config logic from HoodieMetricsConfig into HoodieMetricsGraphiteConfig HoodieMetricsJmxConfig
[ https://issues.apache.org/jira/browse/HUDI-2423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang updated HUDI-2423: --- Summary: Separate some config logic from HoodieMetricsConfig into HoodieMetricsGraphiteConfig HoodieMetricsJmxConfig (was: Breakdown HoodieMetricsConfig into HoodieMetricsGraphiteConfig、HoodieMetricsJmxConfig...) > Separate some config logic from HoodieMetricsConfig into > HoodieMetricsGraphiteConfig HoodieMetricsJmxConfig > --- > > Key: HUDI-2423 > URL: https://issues.apache.org/jira/browse/HUDI-2423 > Project: Apache Hudi > Issue Type: Improvement >Reporter: liujinhui >Assignee: liujinhui >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[hudi] branch master updated: [HUDI-2423] Separate some config logic from HoodieMetricsConfig into HoodieMetricsGraphiteConfig HoodieMetricsJmxConfig (#3652)
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 2791fb9 [HUDI-2423] Separate some config logic from HoodieMetricsConfig into HoodieMetricsGraphiteConfig HoodieMetricsJmxConfig (#3652) 2791fb9 is described below commit 2791fb9a964b39ef9aaec83eafd080013186b2eb Author: liujinhui <965147...@qq.com> AuthorDate: Thu Sep 16 15:08:10 2021 +0800 [HUDI-2423] Separate some config logic from HoodieMetricsConfig into HoodieMetricsGraphiteConfig HoodieMetricsJmxConfig (#3652) --- .../org/apache/hudi/config/HoodieWriteConfig.java | 29 - .../config/{ => metrics}/HoodieMetricsConfig.java | 112 + .../{ => metrics}/HoodieMetricsDatadogConfig.java | 4 +- .../metrics/HoodieMetricsGraphiteConfig.java | 134 + .../config/metrics/HoodieMetricsJmxConfig.java | 118 ++ .../HoodieMetricsPrometheusConfig.java | 45 ++- .../metadata/HoodieBackedTableMetadataWriter.java | 22 ++-- .../datadog/TestHoodieMetricsDatadogConfig.java| 2 +- .../functional/TestHoodieBackedMetadata.java | 7 +- 9 files changed, 344 insertions(+), 129 deletions(-) diff --git a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java index 4df7d0d..c871253 100644 --- a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java +++ b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java @@ -40,6 +40,11 @@ import org.apache.hudi.common.table.timeline.versioning.TimelineLayoutVersion; import org.apache.hudi.common.table.view.FileSystemViewStorageConfig; import org.apache.hudi.common.util.ReflectionUtils; import org.apache.hudi.common.util.ValidationUtils; +import org.apache.hudi.config.metrics.HoodieMetricsConfig; +import 
org.apache.hudi.config.metrics.HoodieMetricsDatadogConfig; +import org.apache.hudi.config.metrics.HoodieMetricsGraphiteConfig; +import org.apache.hudi.config.metrics.HoodieMetricsJmxConfig; +import org.apache.hudi.config.metrics.HoodieMetricsPrometheusConfig; import org.apache.hudi.execution.bulkinsert.BulkInsertSortMode; import org.apache.hudi.index.HoodieIndex; import org.apache.hudi.keygen.SimpleAvroKeyGenerator; @@ -1459,23 +1464,23 @@ public class HoodieWriteConfig extends HoodieConfig { } public String getGraphiteServerHost() { -return getString(HoodieMetricsConfig.GRAPHITE_SERVER_HOST_NAME); +return getString(HoodieMetricsGraphiteConfig.GRAPHITE_SERVER_HOST_NAME); } public int getGraphiteServerPort() { -return getInt(HoodieMetricsConfig.GRAPHITE_SERVER_PORT_NUM); +return getInt(HoodieMetricsGraphiteConfig.GRAPHITE_SERVER_PORT_NUM); } public String getGraphiteMetricPrefix() { -return getString(HoodieMetricsConfig.GRAPHITE_METRIC_PREFIX_VALUE); +return getString(HoodieMetricsGraphiteConfig.GRAPHITE_METRIC_PREFIX_VALUE); } public String getJmxHost() { -return getString(HoodieMetricsConfig.JMX_HOST_NAME); +return getString(HoodieMetricsJmxConfig.JMX_HOST_NAME); } public String getJmxPort() { -return getString(HoodieMetricsConfig.JMX_PORT_NUM); +return getString(HoodieMetricsJmxConfig.JMX_PORT_NUM); } public int getDatadogReportPeriodSeconds() { @@ -1777,6 +1782,8 @@ public class HoodieWriteConfig extends HoodieConfig { private boolean isMetadataConfigSet = false; private boolean isLockConfigSet = false; private boolean isPreCommitValidationConfigSet = false; +private boolean isMetricsJmxConfigSet = false; +private boolean isMetricsGraphiteConfigSet = false; public Builder withEngineType(EngineType engineType) { this.engineType = engineType; @@ -1931,6 +1938,18 @@ public class HoodieWriteConfig extends HoodieConfig { return this; } +public Builder withMetricsJmxConfig(HoodieMetricsJmxConfig metricsJmxConfig) { + 
writeConfig.getProps().putAll(metricsJmxConfig.getProps()); + isMetricsJmxConfigSet = true; + return this; +} + +public Builder withMetricsGraphiteConfig(HoodieMetricsGraphiteConfig mericsGraphiteConfig) { + writeConfig.getProps().putAll(mericsGraphiteConfig.getProps()); + isMetricsGraphiteConfigSet = true; + return this; +} + public Builder withPreCommitValidatorConfig(HoodiePreCommitValidatorConfig validatorConfig) { writeConfig.getProps().putAll(validatorConfig.getProps()); isPreCommitValidationConfigSet = true; diff --git a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieMetricsConfig.java b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/co
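The new `withMetricsJmxConfig` / `withMetricsGraphiteConfig` hooks in this commit simply merge each metrics config's properties into the write config being built. A minimal, stdlib-only sketch of that merge pattern — method names mirror the commit, but the Hudi config types are replaced with plain `java.util.Properties` for illustration:

```java
import java.util.Properties;

public class MetricsConfigBuilderSketch {
    // Minimal stand-in for the new Builder hooks: each metrics config carries
    // its own Properties, and withMetrics*Config merges them into the write
    // config under construction.
    private final Properties props = new Properties();

    MetricsConfigBuilderSketch withMetricsJmxConfig(Properties jmxProps) {
        props.putAll(jmxProps);
        return this;
    }

    MetricsConfigBuilderSketch withMetricsGraphiteConfig(Properties graphiteProps) {
        props.putAll(graphiteProps);
        return this;
    }

    Properties build() {
        return props;
    }

    public static void main(String[] args) {
        Properties jmx = new Properties();
        jmx.setProperty("hoodie.metrics.jmx.host", "localhost");
        Properties graphite = new Properties();
        graphite.setProperty("hoodie.metrics.graphite.host", "graphite01");

        Properties merged = new MetricsConfigBuilderSketch()
            .withMetricsJmxConfig(jmx)
            .withMetricsGraphiteConfig(graphite)
            .build();
        System.out.println(merged.getProperty("hoodie.metrics.jmx.host"));      // localhost
        System.out.println(merged.getProperty("hoodie.metrics.graphite.host")); // graphite01
    }
}
```

The chained-builder shape is why the real methods return `this` and track an `isMetrics*ConfigSet` flag: the final `build()` can then fill in defaults only for the config groups that were never supplied.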
[hudi] branch master updated (76554aa -> 86a7351)
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from 76554aa [MINOR] Add document for DataSourceReadOptions (#3653) add 86a7351 [MINOR] Delete Redundant code (#3661) No new revisions were added by this update. Summary of changes: .../src/main/java/org/apache/hudi/common/table/HoodieTableConfig.java | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
[hudi] branch asf-site updated: [MINOR][DOCS] Fixed the broken link on the how to contribute page (#3663)
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new cb436e2 [MINOR][DOCS] Fixed the broken link on the how to contribute page (#3663) cb436e2 is described below commit cb436e2d0af007bda2ba9df651f3a58b358695e6 Author: Vinoth Govindarajan AuthorDate: Tue Sep 14 23:44:21 2021 -0700 [MINOR][DOCS] Fixed the broken link on the how to contribute page (#3663) --- website/contribute/how-to-contribute.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/website/contribute/how-to-contribute.md b/website/contribute/how-to-contribute.md index 1d2ff4d..137e5b2 100644 --- a/website/contribute/how-to-contribute.md +++ b/website/contribute/how-to-contribute.md @@ -33,7 +33,7 @@ Committers are chosen by a majority vote of the Apache Hudi [PMC](https://www.ap ## Code Contributions Useful resources for contributing can be found under the "Quick Links" left menu. -Specifically, please refer to the detailed [contribution guide](/contribute/how-to-contribute). +Specifically, please refer to the detailed [contribution guide](/contribute/developer-setup). ## Accounts
[hudi] branch master updated (627f20f -> 76554aa)
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from 627f20f [HUDI-2430] Make decimal compatible with hudi for flink writer (#3658) add 76554aa [MINOR] Add document for DataSourceReadOptions (#3653) No new revisions were added by this update. Summary of changes: .../src/main/scala/org/apache/hudi/DataSourceOptions.scala| 11 +++ 1 file changed, 7 insertions(+), 4 deletions(-)
[jira] [Updated] (HUDI-2410) Fix getDefaultBootstrapIndexClass logical error
[ https://issues.apache.org/jira/browse/HUDI-2410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang updated HUDI-2410: --- Description:
{code:java}
public static String getDefaultBootstrapIndexClass(Properties props) {
  String defaultClass = BOOTSTRAP_INDEX_CLASS_NAME.defaultValue();
  if ("false".equalsIgnoreCase(props.getProperty(BOOTSTRAP_INDEX_ENABLE.key()))) {
    defaultClass = NO_OP_BOOTSTRAP_INDEX_CLASS;
  }
  return defaultClass;
}
{code}
When hoodie.bootstrap.index.enable is not passed, the original logic will follow HFileBootstrapIndex; this should not be judged here.
> Fix getDefaultBootstrapIndexClass logical error > --- > > Key: HUDI-2410 > URL: https://issues.apache.org/jira/browse/HUDI-2410 > Project: Apache Hudi > Issue Type: Bug > Components: bootstrap > Reporter: liujinhui > Assignee: liujinhui > Priority: Major > Labels: pull-request-available > Fix For: 0.10.0 > > > {code:java} > public static String getDefaultBootstrapIndexClass(Properties props) { > String defaultClass = BOOTSTRAP_INDEX_CLASS_NAME.defaultValue(); > if ("false".equalsIgnoreCase(props.getProperty(BOOTSTRAP_INDEX_ENABLE.key()))) { defaultClass = NO_OP_BOOTSTRAP_INDEX_CLASS; } > return defaultClass; > } > {code} > When hoodie.bootstrap.index.enable is not passed, the original logic will follow HFileBootstrapIndex; this should not be judged here -- This message was sent by Atlassian Jira (v8.3.4#803005)
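The fallback behavior HUDI-2410 describes can be run standalone. The sketch below uses plain `java.util.Properties` and placeholder class-name strings (the real constants live in Hudi's table config and are not reproduced here):

```java
import java.util.Properties;

public class BootstrapIndexDefaults {
    // Placeholder class-name strings, not Hudi's actual constant values.
    static final String HFILE_INDEX = "HFileBootstrapIndex";
    static final String NO_OP_INDEX = "NoOpBootstrapIndex";

    // The behavior the issue describes: only an explicit "false" for
    // hoodie.bootstrap.index.enable selects the no-op index; an unset
    // property silently falls through to the HFile-backed index.
    static String getDefaultBootstrapIndexClass(Properties props) {
        String defaultClass = HFILE_INDEX;
        if ("false".equalsIgnoreCase(props.getProperty("hoodie.bootstrap.index.enable"))) {
            defaultClass = NO_OP_INDEX;
        }
        return defaultClass;
    }

    public static void main(String[] args) {
        // Unset flag: falls back to the HFile index, which is the surprise
        // the issue calls out.
        System.out.println(getDefaultBootstrapIndexClass(new Properties())); // HFileBootstrapIndex

        Properties disabled = new Properties();
        disabled.setProperty("hoodie.bootstrap.index.enable", "false");
        System.out.println(getDefaultBootstrapIndexClass(disabled)); // NoOpBootstrapIndex
    }
}
```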
[jira] [Closed] (HUDI-2410) Fix getDefaultBootstrapIndexClass logical error
[ https://issues.apache.org/jira/browse/HUDI-2410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang closed HUDI-2410. -- Resolution: Fixed 9f3c4a2a7f565f7bcc32189a202a3d400ece23f1 > Fix getDefaultBootstrapIndexClass logical error > --- > > Key: HUDI-2410 > URL: https://issues.apache.org/jira/browse/HUDI-2410 > Project: Apache Hudi > Issue Type: Bug > Components: bootstrap > Reporter: liujinhui > Assignee: liujinhui > Priority: Major > Labels: pull-request-available > Fix For: 0.10.0 > > > public static String getDefaultBootstrapIndexClass(Properties props) { > String defaultClass = BOOTSTRAP_INDEX_CLASS_NAME.defaultValue(); > if ("false".equalsIgnoreCase(props.getProperty(BOOTSTRAP_INDEX_ENABLE.key()))) { defaultClass = NO_OP_BOOTSTRAP_INDEX_CLASS; } > return defaultClass; > } > When hoodie.bootstrap.index.enable is not passed, the original logic will follow HFileBootstrapIndex; this should not be judged here -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HUDI-2410) Fix getDefaultBootstrapIndexClass logical error
[ https://issues.apache.org/jira/browse/HUDI-2410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang reassigned HUDI-2410: -- Assignee: liujinhui > Fix getDefaultBootstrapIndexClass logical error > --- > > Key: HUDI-2410 > URL: https://issues.apache.org/jira/browse/HUDI-2410 > Project: Apache Hudi > Issue Type: Bug > Components: bootstrap > Reporter: liujinhui > Assignee: liujinhui > Priority: Major > Labels: pull-request-available > Fix For: 0.10.0 > > > public static String getDefaultBootstrapIndexClass(Properties props) { > String defaultClass = BOOTSTRAP_INDEX_CLASS_NAME.defaultValue(); > if ("false".equalsIgnoreCase(props.getProperty(BOOTSTRAP_INDEX_ENABLE.key()))) { defaultClass = NO_OP_BOOTSTRAP_INDEX_CLASS; } > return defaultClass; > } > When hoodie.bootstrap.index.enable is not passed, the original logic will follow HFileBootstrapIndex; this should not be judged here -- This message was sent by Atlassian Jira (v8.3.4#803005)
[hudi] branch master updated (c79017c -> 9f3c4a2)
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from c79017c [HUDI-2397] Add `--enable-sync` parameter (#3608) add 9f3c4a2 [HUDI-2410] Fix getDefaultBootstrapIndexClass logical error (#3633) No new revisions were added by this update. Summary of changes: .../src/test/java/org/apache/hudi/table/TestCleaner.java | 2 +- .../org/apache/hudi/common/table/HoodieTableConfig.java | 9 + .../apache/hudi/common/table/HoodieTableMetaClient.java | 15 +++ .../common/table/view/TestHoodieTableFileSystemView.java | 7 ++- .../org/apache/hudi/common/testutils/HoodieTestUtils.java | 3 ++- .../java/org/apache/hudi/functional/TestBootstrap.java| 4 ++-- 6 files changed, 31 insertions(+), 9 deletions(-)
[jira] [Closed] (HUDI-2411) Remove unnecessary method overriden and note
[ https://issues.apache.org/jira/browse/HUDI-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang closed HUDI-2411. -- Resolution: Done 44b9bc145e0d101bcc688f11c6a30ebcbb7a4a7d > Remove unnecessary method overriden and note > - > > Key: HUDI-2411 > URL: https://issues.apache.org/jira/browse/HUDI-2411 > Project: Apache Hudi > Issue Type: Task >Reporter: Xianghu Wang >Assignee: Xianghu Wang >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[hudi] branch master updated (512ca42 -> 44b9bc1)
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from 512ca42 [MINOR] Correct the comment for the parallelism of tasks in FlinkOptions (#3634) add 44b9bc1 [HUDI-2411] Remove unnecessary method overriden and note (#3636) No new revisions were added by this update. Summary of changes: .../hudi/index/bloom/HoodieBaseBloomIndexCheckFunction.java | 13 + 1 file changed, 1 insertion(+), 12 deletions(-)
[hudi] branch master updated: [MINOR] Correct the comment for the parallelism of tasks in FlinkOptions (#3634)
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 512ca42 [MINOR] Correct the comment for the parallelism of tasks in FlinkOptions (#3634) 512ca42 is described below commit 512ca42d14a29e5d8da02198345024f2f83999d9 Author: SteNicholas AuthorDate: Fri Sep 10 13:42:11 2021 +0800 [MINOR] Correct the comment for the parallelism of tasks in FlinkOptions (#3634) --- .../src/main/java/org/apache/hudi/configuration/FlinkOptions.java | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/hudi-flink/src/main/java/org/apache/hudi/configuration/FlinkOptions.java b/hudi-flink/src/main/java/org/apache/hudi/configuration/FlinkOptions.java index 6e0ff52..64b308d 100644 --- a/hudi-flink/src/main/java/org/apache/hudi/configuration/FlinkOptions.java +++ b/hudi-flink/src/main/java/org/apache/hudi/configuration/FlinkOptions.java @@ -325,19 +325,19 @@ public class FlinkOptions extends HoodieConfig { .key("write.index_bootstrap.tasks") .intType() .noDefaultValue() - .withDescription("Parallelism of tasks that do index bootstrap, default is 4"); + .withDescription("Parallelism of tasks that do index bootstrap, default is the parallelism of the environment"); public static final ConfigOption BUCKET_ASSIGN_TASKS = ConfigOptions .key("write.bucket_assign.tasks") .intType() .noDefaultValue() - .withDescription("Parallelism of tasks that do bucket assign, default is 4"); + .withDescription("Parallelism of tasks that do bucket assign, default is the parallelism of the environment"); public static final ConfigOption WRITE_TASKS = ConfigOptions .key("write.tasks") .intType() - .defaultValue(4) - .withDescription("Parallelism of tasks that do actual write, default is 4"); + .noDefaultValue() + .withDescription("Parallelism of tasks that do actual write, default is the parallelism of the 
environment"); public static final ConfigOption WRITE_TASK_MAX_SIZE = ConfigOptions .key("write.task.max.size")
[hudi] branch master updated: [MINOR] Remove unused variables (#3631)
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 4abcb4f [MINOR] Remove unused variables (#3631) 4abcb4f is described below commit 4abcb4f6591448ef1d9bbc9aa237758ae75ecba7 Author: Wei AuthorDate: Thu Sep 9 23:21:16 2021 +0800 [MINOR] Remove unused variables (#3631) --- .../org/apache/hudi/hive/replication/HiveSyncGlobalCommitConfig.java| 2 -- 1 file changed, 2 deletions(-) diff --git a/hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/replication/HiveSyncGlobalCommitConfig.java b/hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/replication/HiveSyncGlobalCommitConfig.java index bce84e9..c3dd2af 100644 --- a/hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/replication/HiveSyncGlobalCommitConfig.java +++ b/hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/replication/HiveSyncGlobalCommitConfig.java @@ -46,10 +46,8 @@ public class HiveSyncGlobalCommitConfig extends GlobalHiveSyncConfig { public static String LOCAL_HIVE_SITE_URI = "hivesyncglobal.local_hive_site_uri"; public static String REMOTE_HIVE_SITE_URI = "hivesyncglobal.remote_hive_site_uri"; - public static String CONFIG_FILE_URI = "hivesyncglobal.config_file_uri"; public static String REMOTE_BASE_PATH = "hivesyncglobal.remote_base_path"; public static String LOCAL_BASE_PATH = "hivesyncglobal.local_base_path"; - public static String RETRY_ATTEMPTS = "hivesyncglobal.retry_attempts"; public static String REMOTE_HIVE_SERVER_JDBC_URLS = "hivesyncglobal.remote_hs2_jdbc_urls"; public static String LOCAL_HIVE_SERVER_JDBC_URLS = "hivesyncglobal.local_hs2_jdbc_urls";
[jira] [Updated] (HUDI-2384) Allow log file size more than 2GB
[ https://issues.apache.org/jira/browse/HUDI-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang updated HUDI-2384: --- Fix Version/s: 0.10.0 > Allow log file size more than 2GB > - > > Key: HUDI-2384 > URL: https://issues.apache.org/jira/browse/HUDI-2384 > Project: Apache Hudi > Issue Type: Improvement > Components: Common Core >Reporter: XiaoyuGeng >Assignee: XiaoyuGeng >Priority: Minor > Labels: pull-request-available > Fix For: 0.10.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (HUDI-2384) Allow log file size more than 2GB
[ https://issues.apache.org/jira/browse/HUDI-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang closed HUDI-2384. -- Resolution: Done 21fd6edfe7721c674b40877fbbdbac71b36bf782 > Allow log file size more than 2GB > - > > Key: HUDI-2384 > URL: https://issues.apache.org/jira/browse/HUDI-2384 > Project: Apache Hudi > Issue Type: Improvement > Components: Common Core >Reporter: XiaoyuGeng >Assignee: XiaoyuGeng >Priority: Minor > Labels: pull-request-available > Fix For: 0.10.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[hudi] branch master updated (38c9b85 -> 21fd6ed)
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from 38c9b85 [HUDI-2280] Use GitHub Actions to build different scala spark versions (#3556) add 21fd6ed [HUDI-2384] Change log file size config to long (#3577) No new revisions were added by this update. Summary of changes: .../src/main/java/org/apache/hudi/config/HoodieStorageConfig.java | 2 +- .../src/main/java/org/apache/hudi/config/HoodieWriteConfig.java | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-)
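The HUDI-2384 change amounts to moving the log-file size config from int to long, since sizes above 2 GB overflow a 32-bit int. A tiny demonstration of why:

```java
public class LogFileSizeLimit {
    public static void main(String[] args) {
        // A 3 GB log file size does not fit in a 32-bit int, which is why
        // the config value type had to become long.
        long threeGb = 3L * 1024 * 1024 * 1024;
        System.out.println(threeGb);                     // 3221225472
        System.out.println(threeGb > Integer.MAX_VALUE); // true
        System.out.println((int) threeGb);               // -1073741824 (overflow)
    }
}
```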
[jira] [Closed] (HUDI-2320) Add support ByteArrayDeserializer in AvroKafkaSource
[ https://issues.apache.org/jira/browse/HUDI-2320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang closed HUDI-2320. -- Resolution: Done bf5a52e51bbeaa089995335a0a4c55884792e505 > Add support ByteArrayDeserializer in AvroKafkaSource > > > Key: HUDI-2320 > URL: https://issues.apache.org/jira/browse/HUDI-2320 > Project: Apache Hudi > Issue Type: Improvement > Components: DeltaStreamer >Reporter: 董可伦 >Assignee: 董可伦 >Priority: Major > Labels: pull-request-available > Fix For: 0.10.0 > > > When the 'value.serializer' of Kafka Avro Producer is > 'org.apache.kafka.common.serialization.ByteArraySerializer',Use the following > configuration > {code:java} > --source-class org.apache.hudi.utilities.sources.AvroKafkaSource \ > --schemaprovider-class > org.apache.hudi.utilities.schema.JdbcbasedSchemaProvider \ > --hoodie-conf > "hoodie.deltastreamer.source.kafka.value.deserializer.class=org.apache.kafka.common.serialization.ByteArrayDeserializer" > {code} > For now,It will throw an exception:: > {code:java} > java.lang.ClassCastException: [B cannot be cast to > org.apache.avro.generic.GenericRecord{code} > After support ByteArrayDeserializer,Use the configuration above,It works > properly.And there is no need to provide 'schema.registry.url',For example, > we can use the JdbcbasedSchemaProvider to get the sourceSchema -- This message was sent by Atlassian Jira (v8.3.4#803005)
[hudi] branch master updated: [HUDI-2320] Add support ByteArrayDeserializer in AvroKafkaSource (#3502)
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new bf5a52e [HUDI-2320] Add support ByteArrayDeserializer in AvroKafkaSource (#3502) bf5a52e is described below commit bf5a52e51bbeaa089995335a0a4c55884792e505 Author: 董可伦 AuthorDate: Mon Aug 30 10:01:15 2021 +0800 [HUDI-2320] Add support ByteArrayDeserializer in AvroKafkaSource (#3502) --- hudi-utilities/pom.xml | 2 +- .../hudi/utilities/sources/AvroKafkaSource.java| 22 ++ 2 files changed, 19 insertions(+), 5 deletions(-) diff --git a/hudi-utilities/pom.xml b/hudi-utilities/pom.xml index 4dcc966..089b780 100644 --- a/hudi-utilities/pom.xml +++ b/hudi-utilities/pom.xml @@ -254,7 +254,7 @@ com.twitter bijection-avro_${scala.binary.version} - 0.9.3 + 0.9.7 diff --git a/hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/AvroKafkaSource.java b/hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/AvroKafkaSource.java index 500c412..ff8ea5a 100644 --- a/hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/AvroKafkaSource.java +++ b/hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/AvroKafkaSource.java @@ -26,11 +26,13 @@ import org.apache.hudi.exception.HoodieIOException; import org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamerMetrics; import org.apache.hudi.utilities.deser.KafkaAvroSchemaDeserializer; import org.apache.hudi.utilities.schema.SchemaProvider; +import org.apache.hudi.utilities.sources.helpers.AvroConvertor; import org.apache.hudi.utilities.sources.helpers.KafkaOffsetGen; import org.apache.hudi.utilities.sources.helpers.KafkaOffsetGen.CheckpointUtils; import org.apache.avro.generic.GenericRecord; import org.apache.kafka.common.serialization.StringDeserializer; +import org.apache.kafka.common.serialization.ByteArrayDeserializer; import org.apache.log4j.LogManager; import 
org.apache.log4j.Logger; import org.apache.spark.api.java.JavaRDD; @@ -55,13 +57,15 @@ public class AvroKafkaSource extends AvroSource { private final KafkaOffsetGen offsetGen; private final HoodieDeltaStreamerMetrics metrics; + private final SchemaProvider schemaProvider; + private final String deserializerClassName; public AvroKafkaSource(TypedProperties props, JavaSparkContext sparkContext, SparkSession sparkSession, SchemaProvider schemaProvider, HoodieDeltaStreamerMetrics metrics) { super(props, sparkContext, sparkSession, schemaProvider); props.put(NATIVE_KAFKA_KEY_DESERIALIZER_PROP, StringDeserializer.class); -String deserializerClassName = props.getString(DataSourceWriteOptions.KAFKA_AVRO_VALUE_DESERIALIZER_CLASS().key(), +deserializerClassName = props.getString(DataSourceWriteOptions.KAFKA_AVRO_VALUE_DESERIALIZER_CLASS().key(), DataSourceWriteOptions.KAFKA_AVRO_VALUE_DESERIALIZER_CLASS().defaultValue()); try { @@ -78,6 +82,7 @@ public class AvroKafkaSource extends AvroSource { throw new HoodieException(error, e); } +this.schemaProvider = schemaProvider; this.metrics = metrics; offsetGen = new KafkaOffsetGen(props); } @@ -91,12 +96,21 @@ public class AvroKafkaSource extends AvroSource { return new InputBatch<>(Option.empty(), CheckpointUtils.offsetsToStr(offsetRanges)); } JavaRDD newDataRDD = toRDD(offsetRanges); -return new InputBatch<>(Option.of(newDataRDD), KafkaOffsetGen.CheckpointUtils.offsetsToStr(offsetRanges)); +return new InputBatch<>(Option.of(newDataRDD), CheckpointUtils.offsetsToStr(offsetRanges)); } private JavaRDD toRDD(OffsetRange[] offsetRanges) { -return KafkaUtils.createRDD(sparkContext, offsetGen.getKafkaParams(), offsetRanges, -LocationStrategies.PreferConsistent()).map(obj -> (GenericRecord) obj.value()); +if (deserializerClassName.equals(ByteArrayDeserializer.class.getName())) { + if (schemaProvider == null) { +throw new HoodieException("Please provide a valid schema provider class when use ByteArrayDeserializer!"); + } + AvroConvertor 
convertor = new AvroConvertor(schemaProvider.getSourceSchema()); + return KafkaUtils.createRDD(sparkContext, offsetGen.getKafkaParams(), offsetRanges, + LocationStrategies.PreferConsistent()).map(obj -> convertor.fromAvroBinary(obj.value())); +} else { + return KafkaUtils.createRDD(sparkContext, offsetGen.getKafkaParams(), offsetRanges, + LocationStrategies.PreferConsistent()).map(obj -> (GenericRecord) obj.value()); +} } @Override
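The core of this change is a dispatch on the configured value deserializer: raw bytes from `ByteArrayDeserializer` must be decoded with a converter built from the schema provider, while other deserializers already hand back records that can be cast directly. A hedged, Kafka-free sketch of that branch, with Avro decoding stood in for by a UTF-8 String decode:

```java
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

public class DeserializerDispatch {
    // Sketch of the branch added to AvroKafkaSource#toRDD: when the value
    // deserializer yields raw bytes, decode them; otherwise cast the record.
    // String stands in for GenericRecord here purely for illustration.
    static Function<Object, String> pickMapper(String deserializerClassName) {
        if (deserializerClassName.endsWith("ByteArrayDeserializer")) {
            return obj -> new String((byte[]) obj, StandardCharsets.UTF_8);
        }
        return obj -> (String) obj;
    }

    public static void main(String[] args) {
        byte[] raw = "record".getBytes(StandardCharsets.UTF_8);
        System.out.println(
            pickMapper("org.apache.kafka.common.serialization.ByteArrayDeserializer").apply(raw));     // record
        System.out.println(
            pickMapper("org.apache.kafka.common.serialization.StringDeserializer").apply("record"));   // record
    }
}
```

This mirrors why the real patch throws when `ByteArrayDeserializer` is configured without a schema provider: with only bytes in hand, there is nothing to decode them against.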
[hudi] branch asf-site updated: [MINOR] Remove link to missing monitoring section (#3424)
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new de2ea79 [MINOR] Remove link to missing monitoring section (#3424) de2ea79 is described below commit de2ea7970cdabf20c5a930a948a125cba261da35 Author: Damon P. Cortesi AuthorDate: Mon Aug 9 01:55:20 2021 -0700 [MINOR] Remove link to missing monitoring section (#3424) The monitoring section doesn't exist in `deployment.md` so the link in the TOC was not working. Unsure if it was removed or what happened, but this PR removes the link to the missing section. --- website/docs/deployment.md | 1 - 1 file changed, 1 deletion(-) diff --git a/website/docs/deployment.md b/website/docs/deployment.md index 3b2366a..20bf723 100644 --- a/website/docs/deployment.md +++ b/website/docs/deployment.md @@ -13,7 +13,6 @@ Specifically, we will cover the following aspects. - [Upgrading Versions](#upgrading) : Picking up new releases of Hudi, guidelines and general best-practices. - [Migrating to Hudi](#migrating) : How to migrate your existing tables to Apache Hudi. - [Interacting via CLI](#cli) : Using the CLI to perform maintenance or deeper introspection. - - [Monitoring](#monitoring) : Tracking metrics from your hudi tables using popular tools. - [Troubleshooting](#troubleshooting) : Uncovering, triaging and resolving issues in production. ## Deploying
[jira] [Closed] (HUDI-2225) Add compaction example
[ https://issues.apache.org/jira/browse/HUDI-2225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang closed HUDI-2225. -- Fix Version/s: 0.9.0 Resolution: Done aa857beee00a764cee90d6e790ee4b0ab4ad4862 > Add compaction example > -- > > Key: HUDI-2225 > URL: https://issues.apache.org/jira/browse/HUDI-2225 > Project: Apache Hudi > Issue Type: Sub-task >Reporter: Sagar Sumit >Assignee: Sagar Sumit >Priority: Major > Labels: pull-request-available > Fix For: 0.9.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[hudi] branch master updated (b21ae68 -> aa857be)
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from b21ae68 [MINOR] Improving runtime of TestStructuredStreaming by 2 mins (#3382) add aa857be [HUDI-2225] Add a compaction job in hudi-examples (#3347) No new revisions were added by this update. Summary of changes: .../examples/spark/HoodieMorCompactionJob.scala| 113 + 1 file changed, 113 insertions(+) create mode 100644 hudi-examples/src/main/scala/org/apache/hudi/examples/spark/HoodieMorCompactionJob.scala
[jira] [Assigned] (HUDI-2244) Fix database alreadyExistsException while hive sync
[ https://issues.apache.org/jira/browse/HUDI-2244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang reassigned HUDI-2244: -- Assignee: Zheng yunhong > Fix database alreadyExistsException while hive sync > --- > > Key: HUDI-2244 > URL: https://issues.apache.org/jira/browse/HUDI-2244 > Project: Apache Hudi > Issue Type: Bug > Components: Hive Integration >Reporter: Zheng yunhong >Assignee: Zheng yunhong >Priority: Major > Labels: pull-request-available > Fix For: 0.9.0 > > > Fix database alreadyExistsException while hive sync. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (HUDI-2244) Fix database alreadyExistsException while hive sync
[ https://issues.apache.org/jira/browse/HUDI-2244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang closed HUDI-2244. -- Resolution: Fixed eedfadeb46d5538bc7efb2c455469f1b42e9385e > Fix database alreadyExistsException while hive sync > --- > > Key: HUDI-2244 > URL: https://issues.apache.org/jira/browse/HUDI-2244 > Project: Apache Hudi > Issue Type: Bug > Components: Hive Integration >Reporter: Zheng yunhong >Assignee: Zheng yunhong >Priority: Major > Labels: pull-request-available > Fix For: 0.9.0 > > > Fix database alreadyExistsException while hive sync. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[hudi] branch master updated (91c2213 -> eedfade)
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from 91c2213 [HUDI-2245] BucketAssigner generates the fileId evenly to avoid data skew (#3362) add eedfade [HUDI-2244] Fix database alreadyExists exception while hive sync (#3361) No new revisions were added by this update. Summary of changes: .../java/org/apache/hudi/hive/HiveSyncTool.java| 4 +- .../org/apache/hudi/hive/HoodieHiveClient.java | 11 +++-- .../org/apache/hudi/hive/TestHiveSyncTool.java | 47 ++ .../apache/hudi/hive/testutils/HiveTestUtil.java | 3 +- 4 files changed, 56 insertions(+), 9 deletions(-)
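The HUDI-2244 fix boils down to making database creation idempotent during hive sync: check for existence first rather than letting CREATE DATABASE surface an AlreadyExistsException. A minimal model of that guard, with the metastore faked as an in-memory set — none of this is Hudi's or Hive's actual API:

```java
import java.util.HashSet;
import java.util.Set;

public class IdempotentCreateDatabase {
    // The metastore is modeled as a set of database names; in the real fix
    // the existence check goes through the Hive client instead.
    private final Set<String> databases = new HashSet<>();

    boolean createDatabaseIfNotExists(String name) {
        if (databases.contains(name)) {
            return false; // already present, nothing to do
        }
        databases.add(name);
        return true;
    }

    public static void main(String[] args) {
        IdempotentCreateDatabase metastore = new IdempotentCreateDatabase();
        System.out.println(metastore.createDatabaseIfNotExists("hudi_db")); // true
        System.out.println(metastore.createDatabaseIfNotExists("hudi_db")); // false
    }
}
```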
[jira] [Closed] (HUDI-2230) "Task not serializable" exception due to non-serializable Codahale Timers
[ https://issues.apache.org/jira/browse/HUDI-2230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vinoyang closed HUDI-2230.
--------------------------
    Resolution: Fixed

8105cf588e28820b9c021c9ed0e59e3f8b6efa71

> "Task not serializable" exception due to non-serializable Codahale Timers
> -------------------------------------------------------------------------
>
>                 Key: HUDI-2230
>                 URL: https://issues.apache.org/jira/browse/HUDI-2230
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: metrics
>    Affects Versions: 0.9.0
>            Reporter: Dave Hagman
>            Assignee: Dave Hagman
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.9.0
>
>
> Steps to reproduce:
> * Enable graphite metrics via props file. Example:
> {noformat}
> hoodie.metrics.on=true
> hoodie.metrics.reporter.type=GRAPHITE
> hoodie.metrics.graphite.host=
> hoodie.metrics.graphite.port=
> hoodie.metrics.graphite.metric.prefix=
> {noformat}
> * Run the Deltastreamer
> * Note the following exception:
> {noformat}
> Exception in thread "main" org.apache.hudi.exception.HoodieException: org.apache.hudi.exception.HoodieException: Task not serializable
>     at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.lambda$sync$1(HoodieDeltaStreamer.java:165)
>     at org.apache.hudi.common.util.Option.ifPresent(Option.java:96)
>     at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.sync(HoodieDeltaStreamer.java:160)
>     at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.main(HoodieDeltaStreamer.java:501)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
>     at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:959)
>     at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
>     at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
>     at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
>     at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1038)
>     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1047)
>     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: java.util.concurrent.ExecutionException: org.apache.hudi.exception.HoodieException: Task not serializable
>     at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
>     at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
>     at org.apache.hudi.async.HoodieAsyncService.waitForShutdown(HoodieAsyncService.java:90)
>     at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.lambda$sync$1(HoodieDeltaStreamer.java:163)
>     ... 15 more
> Caused by: org.apache.hudi.exception.HoodieException: Task not serializable
>     at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.lambda$startService$0(HoodieDeltaStreamer.java:649)
>     at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.spark.SparkException: Task not serializable
>     at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:416)
>     at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:406)
>     at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:162)
>     at org.apache.spark.SparkContext.clean(SparkContext.scala:2502)
>     at org.apache.spark.rdd.RDD.$anonfun$map$1(RDD.scala:422)
>     at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>     at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
>     at org.apache.spark.rdd.RDD.withScope(RDD.scala:414)
>     at org.apache.spark.rdd.RDD.map(RDD.scala:421)
>     at org.apache.spark.api.java.JavaRDDLike.map(JavaRDDLike.scala:93)
>     at org.apache.spark.api.java.JavaRDDLik
[hudi] branch master updated (8fef50e -> 8105cf5)
This is an automated email from the ASF dual-hosted git repository.

vinoyang pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git.

    from 8fef50e  [HUDI-2044] Integrate consumers with rocksDB and compression within External Spillable Map (#3318)
     add 8105cf5  [HUDI-2230] Make codahale times transient to avoid serializable exceptions (#3345)

No new revisions were added by this update.

Summary of changes:
 .../hudi/utilities/deltastreamer/HoodieDeltaStreamerMetrics.java | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
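The pattern behind this fix can be sketched in plain Java. This is not Hudi's actual code (the class and field names below are illustrative stand-ins, and a stdlib class stands in for Codahale's `Timer`): when Spark serializes a closure, every object it captures must be `Serializable`, so a metrics holder carrying a non-serializable timer fails closure cleaning. Marking the timer field `transient` excludes it from serialization, which is the essence of the HUDI-2230 change.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Stand-in for com.codahale.metrics.Timer, which does not implement Serializable.
class NonSerializableTimer {
    long startNanos = System.nanoTime();
}

// Hypothetical metrics holder captured inside a Spark closure.
class MetricsSketch implements Serializable {
    // `transient`: the field is skipped during Java serialization, so
    // serializing the holder no longer trips over the timer.
    private transient NonSerializableTimer overallTimer = new NonSerializableTimer();

    // Transient fields come back null after deserialization, so re-create lazily.
    NonSerializableTimer getTimer() {
        if (overallTimer == null) {
            overallTimer = new NonSerializableTimer();
        }
        return overallTimer;
    }
}

public class TransientTimerDemo {
    // Returns true iff `o` survives a Java serialization round trip.
    static boolean serializes(Object o) {
        try (ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream())) {
            oos.writeObject(o); // throws NotSerializableException for bad graphs
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The raw timer fails; the holder with a transient timer succeeds.
        System.out.println("timer alone: " + serializes(new NonSerializableTimer()));   // false
        System.out.println("transient holder: " + serializes(new MetricsSketch()));     // true
    }
}
```

The same trade-off as in the real fix applies: the timer is not shipped to executors, so it must be re-created (or tolerated as absent) after deserialization, as `getTimer()` does here.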
[hudi] branch master updated (61148c1 -> 024cf01)
This is an automated email from the ASF dual-hosted git repository.

vinoyang pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git.

    from 61148c1  [HUDI-2176, 2178, 2179] Adding virtual key support to COW table (#3306)
     add 024cf01  [MINOR] Correct the words accroding in the comments to according (#3343)

No new revisions were added by this update.

Summary of changes:
 .../src/main/java/org/apache/hudi/util/RowDataToAvroConverters.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
[jira] [Updated] (HUDI-2216) the words 'fiels' in the comments is incorrect
[ https://issues.apache.org/jira/browse/HUDI-2216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vinoyang updated HUDI-2216:
---------------------------
    Issue Type: Improvement  (was: Bug)

> the words 'fiels' in the comments is incorrect
> ----------------------------------------------
>
>                 Key: HUDI-2216
>                 URL: https://issues.apache.org/jira/browse/HUDI-2216
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: Docs
>    Affects Versions: 0.9.0
>            Reporter: 董可伦
>            Assignee: 董可伦
>            Priority: Major
>              Labels: documentation, pull-request-available
>             Fix For: 0.9.0
>
>         Attachments: HUDI-2216.png
>
>
> the words 'fiels' in the comments of MergeIntoHoodieTableCommand is incorrect, it should be 'fields'

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Closed] (HUDI-2216) the words 'fiels' in the comments is incorrect
[ https://issues.apache.org/jira/browse/HUDI-2216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vinoyang closed HUDI-2216.
--------------------------
    Resolution: Done

a91296f14a037a148d949b2380ad503677e688c7

> the words 'fiels' in the comments is incorrect
> ----------------------------------------------
>
>                 Key: HUDI-2216
>                 URL: https://issues.apache.org/jira/browse/HUDI-2216
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: Docs
>    Affects Versions: 0.9.0
>            Reporter: 董可伦
>            Assignee: 董可伦
>            Priority: Trivial
>              Labels: documentation, pull-request-available
>             Fix For: 0.9.0
>
>         Attachments: HUDI-2216.png
>
>
> the words 'fiels' in the comments of MergeIntoHoodieTableCommand is incorrect, it should be 'fields'

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Updated] (HUDI-2216) the words 'fiels' in the comments is incorrect
[ https://issues.apache.org/jira/browse/HUDI-2216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vinoyang updated HUDI-2216:
---------------------------
    Priority: Trivial  (was: Major)

> the words 'fiels' in the comments is incorrect
> ----------------------------------------------
>
>                 Key: HUDI-2216
>                 URL: https://issues.apache.org/jira/browse/HUDI-2216
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: Docs
>    Affects Versions: 0.9.0
>            Reporter: 董可伦
>            Assignee: 董可伦
>            Priority: Trivial
>              Labels: documentation, pull-request-available
>             Fix For: 0.9.0
>
>         Attachments: HUDI-2216.png
>
>
> the words 'fiels' in the comments of MergeIntoHoodieTableCommand is incorrect, it should be 'fields'

--
This message was sent by Atlassian Jira
(v8.3.4#803005)