svn commit: r53549 - /release/hudi/0.8.0/

2022-04-03 Thread smarthi
Author: smarthi
Date: Sun Apr  3 14:26:54 2022
New Revision: 53549

Log:
Archiving Older Release 0.8.0

Removed:
release/hudi/0.8.0/



svn commit: r53548 - /release/hudi/0.7.0/

2022-04-03 Thread smarthi
Author: smarthi
Date: Sun Apr  3 14:26:27 2022
New Revision: 53548

Log:
Archiving Older Release 0.7.0

Removed:
release/hudi/0.7.0/



[hudi] branch asf-site updated: [MINOR] Add alibaba cloud to powered-by page (#1655)

2020-05-23 Thread smarthi
This is an automated email from the ASF dual-hosted git repository.

smarthi pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new 3184acb  [MINOR] Add alibaba cloud to powered-by page (#1655)
3184acb is described below

commit 3184acb50aa9e83312ae7cb7da41aaabfb7a4ab8
Author: leesf 
AuthorDate: Sun May 24 00:53:52 2020 +0800

[MINOR] Add alibaba cloud to powered-by page (#1655)
---
 docs/_docs/1_4_powered_by.md | 4 
 1 file changed, 4 insertions(+)

diff --git a/docs/_docs/1_4_powered_by.md b/docs/_docs/1_4_powered_by.md
index 04044d2..f84e303 100644
--- a/docs/_docs/1_4_powered_by.md
+++ b/docs/_docs/1_4_powered_by.md
@@ -26,6 +26,10 @@ Apache Hudi was originally developed at [Uber](https://uber.com), to achieve [lo
 It has been in production since Aug 2016, powering the massive [100PB data lake](https://eng.uber.com/uber-big-data-platform/), including highly business critical tables like core trips,riders,partners. It also 
 powers several incremental Hive ETL pipelines and being currently integrated into Uber's data dispersal system.
 
+### Alibaba Cloud
+Alibaba Cloud provides cloud computing services to online businesses and Alibaba's own e-commerce ecosystem, Apache Hudi is integrated into Alibaba Cloud [Data Lake Analytics](https://www.alibabacloud.com/help/product/70174.htm)
+offering real-time analysis on hudi dataset.
+
 ### Amazon Web Services
 Amazon Web Services is the World's leading cloud services provider. Apache Hudi is [pre-installed](https://aws.amazon.com/emr/features/hudi/) with the AWS Elastic Map Reduce 
 offering, providing means for AWS users to perform record-level updates/deletes and manage storage efficiently.



[incubator-hudi] branch master updated: HUDI-479: Eliminate or Minimize use of Guava if possible (#1159)

2020-03-28 Thread smarthi
smarthi pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new 8c30013  HUDI-479: Eliminate or Minimize use of Guava if possible (#1159)
8c30013 is described below

commit 8c3001363d80b29733470221c192a72f541381c5
Author: Suneel Marthi 
AuthorDate: Sat Mar 28 03:11:32 2020 -0400

HUDI-479: Eliminate or Minimize use of Guava if possible (#1159)
---
 .../apache/hudi/cli/commands/RollbacksCommand.java |   4 +-
 .../common/HoodieTestCommitMetadataGenerator.java  |  20 ++--
 .../org/apache/hudi/client/HoodieWriteClient.java  |   6 +-
 .../hudi/index/bloom/BloomIndexFileInfo.java   |   5 +-
 .../org/apache/hudi/io/HoodieAppendHandle.java |   2 +-
 .../apache/hudi/metrics/JmxMetricsReporter.java|  10 +-
 .../org/apache/hudi/metrics/JmxReporterServer.java |  18 ++--
 .../main/java/org/apache/hudi/metrics/Metrics.java |   5 +-
 .../compact/HoodieMergeOnReadTableCompactor.java   |   4 +-
 .../apache/hudi/table/rollback/RollbackHelper.java |   2 +-
 .../index/bloom/TestHoodieGlobalBloomIndex.java|  12 +--
 .../java/org/apache/hudi/table/TestCleaner.java|  51 +-
 .../strategy/TestHoodieCompactionStrategy.java |  37 +--
 .../apache/hudi/avro/MercifulJsonConverter.java|  25 +++--
 .../org/apache/hudi/common/model/HoodieRecord.java |   7 +-
 .../hudi/common/table/HoodieTableMetaClient.java   |   6 +-
 .../table/timeline/HoodieActiveTimeline.java   |  11 +-
 .../table/timeline/HoodieArchivedTimeline.java |   1 +
 .../table/timeline/HoodieDefaultTimeline.java  |  10 +-
 .../hudi/common/table/timeline/HoodieInstant.java  |   6 +-
 .../IncrementalTimelineSyncFileSystemView.java |   2 +-
 .../view/RemoteHoodieTableFileSystemView.java  |   2 +-
 .../org/apache/hudi/common/util/AvroUtils.java |  18 ++--
 .../org/apache/hudi/common/util/CleanerUtils.java  |  11 +-
 .../apache/hudi/common/util/CollectionUtils.java   | 111 +
 .../java/org/apache/hudi/common/util/FSUtils.java  |   4 +-
 .../org/apache/hudi/common/util/FileIOUtils.java   |  25 +
 .../apache/hudi/common/util/ReflectionUtils.java   |  69 +++--
 .../hudi/common/minicluster/HdfsTestService.java   |  10 +-
 .../common/minicluster/ZookeeperTestService.java   |   6 +-
 .../common/model/TestHoodieCommitMetadata.java |   1 +
 .../table/string/TestHoodieActiveTimeline.java |  12 +--
 .../table/view/TestIncrementalFSViewSync.java  |  22 ++--
 .../view/TestPriorityBasedFileSystemView.java  |   5 +-
 .../hudi/common/util/CompactionTestUtils.java  |  15 +--
 .../hudi/common/util/TestCompactionUtils.java  |  10 +-
 .../org/apache/hudi/common/util/TestFSUtils.java   |   2 +-
 .../realtime/HoodieParquetRealtimeInputFormat.java |   4 +-
 .../org/apache/hudi/hive/util/HiveTestService.java |   9 +-
 .../org/apache/hudi/integ/ITTestHoodieDemo.java|  40 
 .../org/apache/hudi/HoodieDataSourceHelpers.java   |   5 +-
 .../hudi/utilities/HoodieSnapshotExporter.java |   4 +-
 .../hudi/utilities/sources/HoodieIncrSource.java   |   2 +-
 .../apache/hudi/utilities/UtilitiesTestBase.java   |   4 +-
 pom.xml|   7 --
 style/checkstyle.xml   |   4 +-
 46 files changed, 429 insertions(+), 217 deletions(-)

diff --git a/hudi-cli/src/main/java/org/apache/hudi/cli/commands/RollbacksCommand.java b/hudi-cli/src/main/java/org/apache/hudi/cli/commands/RollbacksCommand.java
index 4a122c6..3993714 100644
--- a/hudi-cli/src/main/java/org/apache/hudi/cli/commands/RollbacksCommand.java
+++ b/hudi-cli/src/main/java/org/apache/hudi/cli/commands/RollbacksCommand.java
@@ -28,9 +28,9 @@ import org.apache.hudi.common.table.timeline.HoodieActiveTimeline;
 import org.apache.hudi.common.table.timeline.HoodieInstant;
 import org.apache.hudi.common.table.timeline.HoodieInstant.State;
 import org.apache.hudi.common.util.AvroUtils;
+import org.apache.hudi.common.util.CollectionUtils;
 import org.apache.hudi.common.util.collection.Pair;
 
-import com.google.common.collect.ImmutableSet;
 import org.springframework.shell.core.CommandMarker;
 import org.springframework.shell.core.annotation.CliCommand;
 import org.springframework.shell.core.annotation.CliOption;
@@ -123,7 +123,7 @@ public class RollbacksCommand implements CommandMarker {
   class RollbackTimeline extends HoodieActiveTimeline {
 
     public RollbackTimeline(HoodieTableMetaClient metaClient) {
-      super(metaClient, ImmutableSet.builder().add(HoodieTimeline.ROLLBACK_EXTENSION).build());
+      super(metaClient, CollectionUtils.createImmutableSet(HoodieTimeline.ROLLBACK_EXTENSION));
     }
   }
 }
diff --git a/hudi-cli/src/test/java/org/apache/hudi/cli/common/HoodieTestCommitMetadataGenerator.java b/hudi-cli/src/test/java/org/apache/hudi/cli/common
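
The RollbacksCommand hunk above swaps Guava's `ImmutableSet` builder for a `CollectionUtils.createImmutableSet` helper. The helper's body is not shown in this message, so the following is only a minimal sketch of how such a helper could be written with the JDK alone. The class name `ImmutableSetSketch` and the implementation are assumptions for illustration, not the code from the commit.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical stand-in; not the actual Hudi CollectionUtils source. */
final class ImmutableSetSketch {

  private ImmutableSetSketch() {
  }

  /** Copies the elements into a set that rejects later modification. */
  @SafeVarargs
  static <T> Set<T> createImmutableSet(T... elements) {
    // Copy first, so later changes to the caller's array cannot leak into the set.
    Set<T> copy = new LinkedHashSet<>(Arrays.asList(elements));
    // The unmodifiable view throws UnsupportedOperationException on add/remove.
    return Collections.unmodifiableSet(copy);
  }
}
```

A call such as `ImmutableSetSketch.createImmutableSet(".rollback")` would then take the place of the builder chain. Unlike Guava's `ImmutableSet`, this JDK-only version accepts null elements, which is one behavioral difference to keep in mind.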

[incubator-hudi] branch master updated: [MINOR] Update DOAP with 0.5.2 Release (#1448)

2020-03-25 Thread smarthi
smarthi pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new e101ea9  [MINOR] Update DOAP with 0.5.2 Release (#1448)
e101ea9 is described below

commit e101ea9bd4405a461bc78aad1af64499f797daed
Author: Suneel Marthi 
AuthorDate: Wed Mar 25 23:37:32 2020 -0400

[MINOR] Update DOAP with 0.5.2 Release (#1448)
---
 doap_HUDI.rdf | 5 +
 1 file changed, 5 insertions(+)

diff --git a/doap_HUDI.rdf b/doap_HUDI.rdf
index c33d201..af45a41 100644
--- a/doap_HUDI.rdf
+++ b/doap_HUDI.rdf
@@ -46,6 +46,11 @@
 2020-01-31
 0.5.1
   
+  
+Apache Hudi-incubating 0.5.2
+2020-03-26
+0.5.2
+  
 
 
   



[incubator-hudi] branch asf-site updated: [MINOR] Add link for yields.io usage (#1446)

2020-03-25 Thread smarthi
smarthi pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new 770ea3f  [MINOR] Add link for yields.io usage (#1446)
770ea3f is described below

commit 770ea3f75422fe2347113030c52edf8d0e96a64b
Author: vinoth chandar 
AuthorDate: Wed Mar 25 15:24:37 2020 -0700

[MINOR] Add link for yields.io usage (#1446)
---
 docs/_docs/1_4_powered_by.cn.md | 2 +-
 docs/_docs/1_4_powered_by.md| 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/_docs/1_4_powered_by.cn.md b/docs/_docs/1_4_powered_by.cn.md
index 82f9c02..500771f 100644
--- a/docs/_docs/1_4_powered_by.cn.md
+++ b/docs/_docs/1_4_powered_by.cn.md
@@ -20,7 +20,7 @@ Hudi还支持几个增量的Hive ETL管道,并且目前已集成到Uber的数
 
 ### Yields.io
 
-Yields.io是第一个使用AI在企业范围内进行自动模型验证和实时监控的金融科技平台。他们的数据湖由Hudi管理,他们还积极使用Hudi为增量式、跨语言/平台机器学习构建基础架构。
+[Yields.io](https://www.yields.io/Blog/Apache-Hudi-at-Yields)是第一个使用AI在企业范围内进行自动模型验证和实时监控的金融科技平台。他们的数据湖由Hudi管理,他们还积极使用Hudi为增量式、跨语言/平台机器学习构建基础架构。
 
 ### Yotpo
 
diff --git a/docs/_docs/1_4_powered_by.md b/docs/_docs/1_4_powered_by.md
index 229150e..bee6bb9 100644
--- a/docs/_docs/1_4_powered_by.md
+++ b/docs/_docs/1_4_powered_by.md
@@ -23,7 +23,7 @@ offering, providing means for AWS users to perform record-level updates/deletes
 
 ### Yields.io
 
-Yields.io is the first FinTech platform that uses AI for automated model validation and real-time monitoring on an enterprise-wide scale. Their data lake is managed by Hudi. They are also actively building their infrastructure for incremental, cross language/platform machine learning using Hudi.
+Yields.io is the first FinTech platform that uses AI for automated model validation and real-time monitoring on an enterprise-wide scale. Their [data lake](https://www.yields.io/Blog/Apache-Hudi-at-Yields) is managed by Hudi. They are also actively building their infrastructure for incremental, cross language/platform machine learning using Hudi.
 
 ### Yotpo
 



[incubator-hudi] branch master updated: [HUDI-629]: Replace Guava's Hashing with an equivalent in NumericUtils.java (#1350)

2020-03-13 Thread smarthi
smarthi pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new 99b7e9e  [HUDI-629]: Replace Guava's Hashing with an equivalent in NumericUtils.java (#1350)
99b7e9e is described below

commit 99b7e9eb9ef8827c1e06b7e8621b6be6403b061e
Author: Suneel Marthi 
AuthorDate: Fri Mar 13 20:28:05 2020 -0400

[HUDI-629]: Replace Guava's Hashing with an equivalent in NumericUtils.java (#1350)

* [HUDI-629]: Replace Guava's Hashing with an equivalent in NumericUtils.java
---
 hudi-client/pom.xml|  5 --
 .../apache/hudi/client/CompactionAdminClient.java  | 22 +++
 .../org/apache/hudi/client/HoodieCleanClient.java  |  4 +-
 .../org/apache/hudi/client/HoodieWriteClient.java  | 14 ++---
 .../apache/hudi/config/HoodieCompactionConfig.java | 12 ++--
 .../org/apache/hudi/config/HoodieWriteConfig.java  |  1 -
 .../bloom/BucketizedBloomCheckPartitioner.java |  5 +-
 .../apache/hudi/table/HoodieCopyOnWriteTable.java  |  6 +-
 .../apache/hudi/table/HoodieMergeOnReadTable.java  |  8 +--
 .../compact/HoodieMergeOnReadTableCompactor.java   | 10 ++--
 .../apache/hudi/table/rollback/RollbackHelper.java |  4 +-
 .../apache/hudi/config/TestHoodieWriteConfig.java  |  2 +-
 .../hudi/common/model/TimelineLayoutVersion.java   |  6 +-
 .../hudi/common/table/HoodieTableMetaClient.java   |  8 +--
 .../hudi/common/table/log/HoodieLogFileReader.java |  7 +--
 .../table/timeline/HoodieActiveTimeline.java   | 48 +++
 .../table/view/AbstractTableFileSystemView.java| 23 ---
 .../table/view/FileSystemViewStorageConfig.java|  5 +-
 .../table/view/HoodieTableFileSystemView.java  | 19 +++---
 .../view/RemoteHoodieTableFileSystemView.java  | 12 ++--
 .../table/view/RocksDbBasedFileSystemView.java |  6 +-
 .../org/apache/hudi/common/util/AvroUtils.java |  3 +-
 .../java/org/apache/hudi/common/util/FSUtils.java  |  5 +-
 .../hudi/common/util/FailSafeConsistencyGuard.java |  7 +--
 .../org/apache/hudi/common/util/NumericUtils.java  | 30 ++
 .../org/apache/hudi/common/util/RocksDBDAO.java| 17 +++---
 .../apache/hudi/common/util/ValidationUtils.java   | 70 ++
 .../common/util/queue/BoundedInMemoryQueue.java|  4 +-
 .../hudi/common/versioning/MetadataMigrator.java   |  7 +--
 .../versioning/clean/CleanV1MigrationHandler.java  |  9 +--
 .../versioning/clean/CleanV2MigrationHandler.java  |  4 +-
 .../compaction/CompactionV1MigrationHandler.java   | 11 ++--
 .../compaction/CompactionV2MigrationHandler.java   | 11 ++--
 .../hudi/common/minicluster/HdfsTestService.java   |  4 +-
 .../common/minicluster/ZookeeperTestService.java   |  9 +--
 .../hudi/common/table/log/TestHoodieLogFormat.java |  2 +-
 .../table/view/TestIncrementalFSViewSync.java  | 16 ++---
 .../apache/hudi/common/util/TestNumericUtils.java  | 26 
 .../realtime/HoodieParquetRealtimeInputFormat.java |  4 +-
 hudi-hive-sync/pom.xml |  5 --
 .../org/apache/hudi/hive/HoodieHiveClient.java |  4 +-
 .../hudi/hive/MultiPartKeysValueExtractor.java |  4 +-
 .../org/apache/hudi/hive/SchemaDifference.java |  2 +-
 .../org/apache/hudi/hive/util/HiveTestService.java |  8 +--
 hudi-integ-test/pom.xml|  7 ---
 .../timeline/service/FileSystemViewHandler.java|  4 +-
 .../hudi/utilities/HoodieWithTimelineServer.java   |  4 +-
 .../org/apache/hudi/utilities/UtilHelpers.java |  4 +-
 .../hudi/utilities/deltastreamer/DeltaSync.java| 10 ++--
 .../deltastreamer/HoodieDeltaStreamer.java |  4 +-
 .../sources/helpers/IncrSourceHelper.java  | 14 +++--
 51 files changed, 308 insertions(+), 228 deletions(-)

diff --git a/hudi-client/pom.xml b/hudi-client/pom.xml
index 347b4f2..06d6017 100644
--- a/hudi-client/pom.xml
+++ b/hudi-client/pom.xml
@@ -119,11 +119,6 @@
 
 
 
-  com.google.guava
-  guava
-
-
-
   com.beust
   jcommander
   test
diff --git a/hudi-client/src/main/java/org/apache/hudi/client/CompactionAdminClient.java b/hudi-client/src/main/java/org/apache/hudi/client/CompactionAdminClient.java
index 713fed4..7d2d664 100644
--- a/hudi-client/src/main/java/org/apache/hudi/client/CompactionAdminClient.java
+++ b/hudi-client/src/main/java/org/apache/hudi/client/CompactionAdminClient.java
@@ -36,13 +36,13 @@ import org.apache.hudi.common.util.AvroUtils;
 import org.apache.hudi.common.util.CompactionUtils;
 import org.apache.hudi.common.util.FSUtils;
 import org.apache.hudi.common.util.Option;
+import org.apache.hudi.common.util.ValidationUtils;
 import org.apache.hudi.common.util.collection.Pair;
 import org.apache.hudi.config.HoodieWriteConfig;
 import org.apache.hudi.exception.HoodieException;
 import org.apache.hudi.exception.HoodieIOExc
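
The subject of this commit is replacing Guava's `Hashing` with an equivalent in `NumericUtils.java`, but the diff is cut off before that file appears. As a rough illustration only, one JDK-only way to derive a `long` from a string digest is sketched below. The class name, the method name, the choice of MD5, and the big-endian folding of the first eight digest bytes are all assumptions; the sketch is not claimed to match the committed code or to be bit-compatible with Guava's output.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical illustration; not the NumericUtils code from the commit. */
final class DigestHashSketch {

  private DigestHashSketch() {
  }

  /** Hashes the UTF-8 bytes of the input with MD5 and folds 8 bytes into a long. */
  static long md5ToLong(String input) {
    try {
      MessageDigest md = MessageDigest.getInstance("MD5");
      byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
      long value = 0L;
      // Fold the first eight digest bytes, most significant byte first.
      for (int i = 0; i < Long.BYTES; i++) {
        value = (value << 8) | (digest[i] & 0xFFL);
      }
      return value;
    } catch (NoSuchAlgorithmException e) {
      // Wrapped so callers do not need to handle a checked exception.
      throw new IllegalStateException("MD5 digest unavailable", e);
    }
  }
}
```

The result is deterministic for a given input, which is the property a partitioner or bucketing scheme would rely on. The byte order matters if values must agree with hashes produced elsewhere.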

[incubator-hudi] branch master updated: [HUDI-688] Paring down the NOTICE file to minimum required notices (#1391)

2020-03-11 Thread smarthi
smarthi pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new dd7cf38  [HUDI-688] Paring down the NOTICE file to minimum required notices (#1391)
dd7cf38 is described below

commit dd7cf38a137ed43b8f00aa14c985bf8b106e256f
Author: vinoth chandar 
AuthorDate: Wed Mar 11 05:24:07 2020 -0700

[HUDI-688] Paring down the NOTICE file to minimum required notices (#1391)

- Based on analysis, we don't need to call out anything
 - We only do source releases at this time
 - Fix typo in LICENSE
---
 LICENSE |  2 +-
 NOTICE  | 81 -
 2 files changed, 1 insertion(+), 82 deletions(-)

diff --git a/LICENSE b/LICENSE
index 34d6be6..28dfacd 100644
--- a/LICENSE
+++ b/LICENSE
@@ -284,7 +284,7 @@ SOFTWARE.
 
 ---
 
-This product includes code from org.apache.hadoop.
+This product includes code from Apache Hadoop
 
 * org.apache.hudi.common.bloom.filter.InternalDynamicBloomFilter.java adapted from org.apache.hadoop.util.bloom.DynamicBloomFilter.java
 
diff --git a/NOTICE b/NOTICE
index c0469fa..ecd4479 100644
--- a/NOTICE
+++ b/NOTICE
@@ -3,84 +3,3 @@ Copyright 2019 and onwards The Apache Software Foundation
 
 This product includes software developed at
 The Apache Software Foundation (http://www.apache.org/).
-
-This project bundles the following dependencies
-
-
-Metrics
-Copyright 2010-2013 Coda Hale and Yammer, Inc.
-
-This product includes software developed by Coda Hale and Yammer, Inc.
-
--
-Guava
-Copyright (C) 2007 The Guava Authors
-
-Licensed under the Apache License, Version 2.0
-
--
-Kryo (https://github.com/EsotericSoftware/kryo)
-Copyright (c) 2008-2018, Nathan Sweet All rights reserved.
-
-Redistribution and use in source and binary forms, with or without modification, are permitted provided that the
-following conditions are met:
-
-Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
-Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
-
-Neither the name of Esoteric Software nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
-THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROF [...]
-
-
-Jackson JSON Processor
-
-This copy of Jackson JSON processor streaming parser/generator is licensed under the
-Apache (Software) License, version 2.0 ("the License").
-See the License for details about distribution rights, and the
-specific rights regarding derivate works.
-
-You may obtain a copy of the License at:
-
-http://www.apache.org/licenses/LICENSE-2.0
-
---
-
-Gson
-Copyright 2008 Google Inc.
-
-Licensed under the Apache License, Version 2.0 (the "License");
-you may not use this file except in compliance with the License.
-You may obtain a copy of the License at
-
-http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing, software
-distributed under the License is distributed on an "AS IS" BASIS,
-WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-See the License for the specific language governing permissions and
-limitations under the License.
-
-
-= Apache Hadoop 2.8.5 =
-Apache Hadoop
-Copyright 2009-2017 The Apache Software Foundation
-
-= Apache Hive 2.3.1 =
-Apache Hive
-Copyright 2008-2017 The Apache Software Foundation
-
-= Apache Spark 2.4.4 =
-Apache Spark
-Copyright 2014 and onwards The Apache Software Foundation
-
-= Apache Kafka 2.0.0 =
-Apache Kafka
-Copyright 2020 The Apache Software Foundation.
-
-= Apache HBase 1.2.3 =
-Apache HBase
-Copyright 2007-2019 The Apache Software Foundation.
-
-= Apache Avro 1.8.2 =
-Apache Avro
-Copyright 2010-2019 The Apache Software Foundation.
\ No newline at end of file



[incubator-hudi] branch master updated: [HUDI-670] Added test cases for TestDiskBasedMap. (#1379)

2020-03-11 Thread smarthi
smarthi pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new cf0a4c1  [HUDI-670] Added test cases for TestDiskBasedMap. (#1379)
cf0a4c1 is described below

commit cf0a4c19bc4ed850172e6ac938f57a0bf7e96353
Author: Prashant Wason 
AuthorDate: Wed Mar 11 05:03:03 2020 -0700

[HUDI-670] Added test cases for TestDiskBasedMap. (#1379)

* [HUDI-670] Added test cases for TestDiskBasedMap.

* Update TestDiskBasedMap.java

Co-authored-by: Suneel Marthi 
---
 .../common/util/collection/TestDiskBasedMap.java   | 25 ++
 1 file changed, 25 insertions(+)

diff --git a/hudi-common/src/test/java/org/apache/hudi/common/util/collection/TestDiskBasedMap.java b/hudi-common/src/test/java/org/apache/hudi/common/util/collection/TestDiskBasedMap.java
old mode 100644
new mode 100755
index 2cc726e..3fcfab5
--- a/hudi-common/src/test/java/org/apache/hudi/common/util/collection/TestDiskBasedMap.java
+++ b/hudi-common/src/test/java/org/apache/hudi/common/util/collection/TestDiskBasedMap.java
@@ -20,6 +20,7 @@ package org.apache.hudi.common.util.collection;
 
 import org.apache.hudi.common.HoodieCommonTestHarness;
 import org.apache.hudi.common.model.AvroBinaryTestPayload;
+import org.apache.hudi.common.model.HoodieAvroPayload;
 import org.apache.hudi.common.model.HoodieKey;
 import org.apache.hudi.common.model.HoodieRecord;
 import org.apache.hudi.common.model.HoodieRecordPayload;
@@ -42,9 +43,11 @@ import java.io.IOException;
 import java.io.UncheckedIOException;
 import java.net.URISyntaxException;
 import java.util.ArrayList;
+import java.util.HashMap;
 import java.util.HashSet;
 import java.util.Iterator;
 import java.util.List;
+import java.util.Map;
 import java.util.Set;
 import java.util.UUID;
 import java.util.stream.Collectors;
@@ -184,6 +187,28 @@ public class TestDiskBasedMap extends HoodieCommonTestHarness {
     assertTrue(payloadSize > 0);
   }
 
+  @Test
+  public void testPutAll() throws IOException, URISyntaxException {
+    DiskBasedMap records = new DiskBasedMap<>(basePath);
+    List iRecords = SchemaTestUtil.generateHoodieTestRecords(0, 100);
+    Map recordMap = new HashMap<>();
+    iRecords.forEach(r -> {
+      String key = ((GenericRecord) r).get(HoodieRecord.RECORD_KEY_METADATA_FIELD).toString();
+      String partitionPath = ((GenericRecord) r).get(HoodieRecord.PARTITION_PATH_METADATA_FIELD).toString();
+      HoodieRecord value = new HoodieRecord<>(new HoodieKey(key, partitionPath), new HoodieAvroPayload(Option.of((GenericRecord) r)));
+      recordMap.put(key, value);
+    });
+
+    records.putAll(recordMap);
+    // make sure records have spilled to disk
+    assertTrue(records.sizeOfFileOnDiskInBytes() > 0);
+
+    // make sure all added records are present
+    for (Map.Entry entry : records.entrySet()) {
+      assertTrue(recordMap.containsKey(entry.getKey()));
+    }
+  }
+
+
   /**
* @na: Leaving this test here for a quick performance test
*/
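
The final loop in `testPutAll` above confirms that every key read back from the map was one of the inserted keys. On its own, that direction does not show that every inserted key came back. A two-way check could look like the sketch below, which uses plain `java.util` maps as stand-ins for `DiskBasedMap`. The class and method names are hypothetical and are not part of the commit.

```java
import java.util.Map;

/** Hypothetical helper for illustration; not part of the Hudi test suite. */
final class MapRoundTripSketch {

  private MapRoundTripSketch() {
  }

  /** Returns true only when both maps hold exactly the same set of keys. */
  static <K, V> boolean sameKeys(Map<K, V> expected, Map<K, V> actual) {
    // Equal sizes plus containment in one direction implies equal key sets.
    return expected.size() == actual.size()
        && actual.keySet().containsAll(expected.keySet());
  }
}
```

In a test this would be a single assertion on `sameKeys(recordMap, records)`, assuming the map under test reports `size()` and `keySet()` consistently.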



[incubator-hudi] branch master updated: [HUDI-668] Added additional unit-tests for HUDI metrics. (#1380)

2020-03-09 Thread smarthi
smarthi pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new 77d5b92  [HUDI-668] Added additional unit-tests for HUDI metrics. (#1380)
77d5b92 is described below

commit 77d5b92d88d6583bdfc09e4c10ecfe7ddbb04806
Author: Prashant Wason 
AuthorDate: Mon Mar 9 20:15:42 2020 -0700

[HUDI-668] Added additional unit-tests for HUDI metrics. (#1380)
---
 .../apache/hudi/metrics/TestHoodieJmxMetrics.java  |   5 +-
 .../org/apache/hudi/metrics/TestHoodieMetrics.java | 117 -
 2 files changed, 119 insertions(+), 3 deletions(-)

diff --git a/hudi-client/src/test/java/org/apache/hudi/metrics/TestHoodieJmxMetrics.java b/hudi-client/src/test/java/org/apache/hudi/metrics/TestHoodieJmxMetrics.java
index c1e3d61..72b218b 100644
--- a/hudi-client/src/test/java/org/apache/hudi/metrics/TestHoodieJmxMetrics.java
+++ b/hudi-client/src/test/java/org/apache/hudi/metrics/TestHoodieJmxMetrics.java
@@ -21,6 +21,7 @@ package org.apache.hudi.metrics;
 import org.apache.hudi.config.HoodieMetricsConfig;
 import org.apache.hudi.config.HoodieWriteConfig;
 
+import org.junit.Before;
 import org.junit.Test;
 
 import static org.apache.hudi.metrics.Metrics.registerGauge;
@@ -31,9 +32,9 @@ import static org.mockito.Mockito.when;
 /**
  * Test for the Jmx metrics report.
  */
-public class TestHoodieJmxMetrics extends TestHoodieMetrics {
+public class TestHoodieJmxMetrics {
 
-  @Override
+  @Before
   public void start() {
     HoodieWriteConfig config = mock(HoodieWriteConfig.class);
     when(config.isMetricsOn()).thenReturn(true);
diff --git 
a/hudi-client/src/test/java/org/apache/hudi/metrics/TestHoodieMetrics.java 
b/hudi-client/src/test/java/org/apache/hudi/metrics/TestHoodieMetrics.java
old mode 100644
new mode 100755
index c71092d..d52bf8d
--- a/hudi-client/src/test/java/org/apache/hudi/metrics/TestHoodieMetrics.java
+++ b/hudi-client/src/test/java/org/apache/hudi/metrics/TestHoodieMetrics.java
@@ -18,24 +18,32 @@
 
 package org.apache.hudi.metrics;
 
+import org.apache.hudi.common.model.HoodieCommitMetadata;
 import org.apache.hudi.config.HoodieWriteConfig;
 
 import org.junit.Before;
 import org.junit.Test;
 
+import com.codahale.metrics.Timer;
+
+import java.util.Arrays;
+import java.util.Random;
+
 import static org.apache.hudi.metrics.Metrics.registerGauge;
 import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertTrue;
 import static org.mockito.Mockito.mock;
 import static org.mockito.Mockito.when;
 
 public class TestHoodieMetrics {
+  private HoodieMetrics metrics;
 
   @Before
   public void start() {
     HoodieWriteConfig config = mock(HoodieWriteConfig.class);
     when(config.isMetricsOn()).thenReturn(true);
     when(config.getMetricsReporterType()).thenReturn(MetricsReporterType.INMEMORY);
-    new HoodieMetrics(config, "raw_table");
+    metrics = new HoodieMetrics(config, "raw_table");
   }
 
   @Test
@@ -43,4 +51,111 @@ public class TestHoodieMetrics {
     registerGauge("metric1", 123L);
     assertEquals("123", Metrics.getInstance().getRegistry().getGauges().get("metric1").getValue().toString());
   }
+
+  @Test
+  public void testTimerCtx() throws InterruptedException {
+    Random rand = new Random();
+
+    // Index metrics
+    Timer.Context timer = metrics.getIndexCtx();
+    Thread.sleep(5); // Ensure timer duration is > 0
+    metrics.updateIndexMetrics("some_action", metrics.getDurationInMs(timer.stop()));
+    String metricName = metrics.getMetricsName("index", "some_action.duration");
+    long msec = (Long)Metrics.getInstance().getRegistry().getGauges().get(metricName).getValue();
+    assertTrue(msec > 0);
+
+    // Rollback metrics
+    timer = metrics.getRollbackCtx();
+    Thread.sleep(5); // Ensure timer duration is > 0
+    long numFilesDeleted = 1 + rand.nextInt();
+    metrics.updateRollbackMetrics(metrics.getDurationInMs(timer.stop()), numFilesDeleted);
+    metricName = metrics.getMetricsName("rollback", "duration");
+    msec = (Long)Metrics.getInstance().getRegistry().getGauges().get(metricName).getValue();
+    assertTrue(msec > 0);
+    metricName = metrics.getMetricsName("rollback", "numFilesDeleted");
+    assertEquals((long)Metrics.getInstance().getRegistry().getGauges().get(metricName).getValue(), numFilesDeleted);
+
+    // Clean metrics
+    timer = metrics.getRollbackCtx();
+    Thread.sleep(5); // Ensure timer duration is > 0
+    numFilesDeleted = 1 + rand.nextInt();
+    metrics.updateCleanMetrics(metrics.getDurationInMs(timer.stop()), (int)numFilesDeleted);
+    metricName = metrics.getMetricsName("clean", "duration");
+    msec = (Long)Metrics.getIns
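
The `Thread.sleep(5)` calls in `testTimerCtx` are commented "Ensure timer duration is > 0". A likely reason, assuming `getDurationInMs` converts a nanosecond reading into whole milliseconds, is that any interval shorter than one millisecond would truncate to zero and fail the `msec > 0` assertions. The sketch below shows that truncation with the JDK only. The class and method names are hypothetical and do not come from Hudi or the Codahale metrics library.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical illustration of nanosecond-to-millisecond truncation. */
final class ElapsedMillisSketch {

  private ElapsedMillisSketch() {
  }

  /** Converts a nanosecond interval to whole milliseconds, rounding down. */
  static long elapsedMillis(long startNanos, long endNanos) {
    return TimeUnit.NANOSECONDS.toMillis(endNanos - startNanos);
  }

  /** Times a piece of work with System.nanoTime and reports whole milliseconds. */
  static long timeMillis(Runnable work) {
    long start = System.nanoTime();
    work.run();
    return elapsedMillis(start, System.nanoTime());
  }
}
```

Here `elapsedMillis(0L, 999_999L)` yields 0 while `elapsedMillis(0L, 5_000_000L)` yields 5, which is why a short sleep keeps a millisecond gauge above zero.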

[incubator-hudi] branch master updated (5f8bf97 -> 415882f)

2020-03-07 Thread smarthi
smarthi pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git.


from 5f8bf97  [HUDI-671] Added unit-test for HBaseIndex (#1381)
 add 415882f  [HUDI-581] NOTICE need more work as it missing content form included 3rd party ALv2 licensed NOTICE files (#1354)

No new revisions were added by this update.

Summary of changes:
 NOTICE | 83 +-
 1 file changed, 82 insertions(+), 1 deletion(-)



[incubator-hudi] branch master updated: [HUDI-680] Update Jackson databind to 2.6.7.3 (#1385)

2020-03-07 Thread smarthi
smarthi pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new fdcd3b1  [HUDI-680] Update Jackson databind to 2.6.7.3 (#1385)
fdcd3b1 is described below

commit fdcd3b18b63d54e4b468a62d92c27497398d67ac
Author: Aki Tanaka 
AuthorDate: Sat Mar 7 14:22:19 2020 -0800

[HUDI-680] Update Jackson databind to 2.6.7.3 (#1385)
---
 pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pom.xml b/pom.xml
index 8d45a9c..52d2ce0 100644
--- a/pom.xml
+++ b/pom.xml
@@ -430,7 +430,7 @@
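       <dependency>
         <groupId>com.fasterxml.jackson.core</groupId>
         <artifactId>jackson-databind</artifactId>
-        <version>${fasterxml.version}.1</version>
+        <version>${fasterxml.version}.3</version>
       </dependency>
       <dependency>
         <groupId>com.fasterxml.jackson.datatype</groupId>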



[incubator-hudi] branch asf-site updated: [HUDI-645] Provide a statement page to describe how to report security issues (#1361)

2020-02-27 Thread smarthi
smarthi pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new 8938eec  [HUDI-645] Provide a statement page to describe how to report security issues (#1361)
8938eec is described below

commit 8938eec5508bb1c756824583998fb0567db22e28
Author: vinoyang 
AuthorDate: Thu Feb 27 22:00:53 2020 +0800

[HUDI-645] Provide a statement page to describe how to report security issues (#1361)
---
 docs/_config.yml   |  9 +
 docs/_pages/security.cn.md | 29 +
 docs/_pages/security.md| 28 
 3 files changed, 66 insertions(+)

diff --git a/docs/_config.yml b/docs/_config.yml
index e63fd83..bce12a7 100644
--- a/docs/_config.yml
+++ b/docs/_config.yml
@@ -58,6 +58,9 @@ author:
 - label: "Report Issues"
   icon: "fa fa-navicon"
   url: "https://issues.apache.org/jira/projects/HUDI/summary"
+- label: "Report Security Issues"
+  icon: "fa fa-navicon"
+  url: "/security"
 
 cn_author:
   name : "Quick Links"
@@ -81,6 +84,9 @@ cn_author:
 - label: "Report Issues"
   icon: "fa fa-navicon"
   url: "https://issues.apache.org/jira/projects/HUDI/summary"
+- label: "Report Security Issues"
+  icon: "fa fa-navicon"
+  url: "/cn/security"
 
 
 0.5.0_author:
@@ -105,6 +111,9 @@ cn_author:
 - label: "Report Issues"
   icon: "fa fa-navicon"
   url: "https://issues.apache.org/jira/projects/HUDI/summary"
+- label: "Report Security Issues"
+  icon: "fa fa-navicon"
+  url: "/security"
 
 
 # Layout Defaults
diff --git a/docs/_pages/security.cn.md b/docs/_pages/security.cn.md
new file mode 100644
index 000..b2e6877
--- /dev/null
+++ b/docs/_pages/security.cn.md
@@ -0,0 +1,29 @@
+---
+title: Security
+keywords: hudi, security
+permalink: /cn/security
+toc: true
+last_modified_at: 2019-12-30T15:59:57-04:00
+language: cn
+---
+
+## Reporting Security Issues
+
+The Apache Software Foundation takes a rigorous standpoint in annihilating the 
security issues in its software projects. Apache Hudi is highly sensitive and 
forthcoming to issues pertaining to its features and functionality.
+
+## Reporting Vulnerability
+
+If you have apprehensions regarding Hudi's security or you discover vulnerability or potential threat, don’t hesitate to get in touch with the [Apache Security Team](http://www.apache.org/security/) by dropping a mail at [secur...@apache.org](secur...@apache.org). In the mail, specify the description of the issue or potential threat. You are also urged to recommend the way to reproduce and replicate the issue. The Hudi community will get back to you after assessing and analysing the findings.
+
+**PLEASE PAY ATTENTION** to report the security issue on the security email before disclosing it on public domain.
+
+## Vulnerability Handling
+
+An overview of the vulnerability handling process is:
+
+* The reporter reports the vulnerability privately to Apache.
+* The appropriate project's security team works privately with the reporter to resolve the vulnerability.
+* A new release of the Apache product concerned is made that includes the fix.
+* The vulnerability is publically announced.
+
+A more detailed description of the process can be found [here](https://www.apache.org/security/committers.html).
\ No newline at end of file
diff --git a/docs/_pages/security.md b/docs/_pages/security.md
new file mode 100644
index 0000000..67898c2
--- /dev/null
+++ b/docs/_pages/security.md
@@ -0,0 +1,28 @@
+---
+title: Security
+keywords: hudi, security
+permalink: /security
+toc: true
+last_modified_at: 2019-12-30T15:59:57-04:00
+---
+
+## Reporting Security Issues
+
+The Apache Software Foundation takes a rigorous standpoint in annihilating the security issues in its software projects. Apache Hudi is highly sensitive and forthcoming to issues pertaining to its features and functionality.
+
+## Reporting Vulnerability
+
+If you have apprehensions regarding Hudi's security or you discover vulnerability or potential threat, don’t hesitate to get in touch with the [Apache Security Team](http://www.apache.org/security/) by dropping a mail at [secur...@apache.org](secur...@apache.org). In the mail, specify the description of the issue or potential threat. You are also urged to recommend the way to reproduce and replicate the issue. The Hudi community will get back to you after assessing and analysing the findings.
+
+**PLEASE PAY ATTENTION** to report the security issue on the security email before disclosing it on public domain.
+
+## Vulnerability Handling
+
+An ove

[incubator-hudi] branch master updated: [MINOR] Updated DOAP with 0.5.1 release (#1300)

2020-02-02 Thread smarthi
This is an automated email from the ASF dual-hosted git repository.

smarthi pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new 0026234  [MINOR] Updated DOAP with 0.5.1 release (#1300)
0026234 is described below

commit 00262340115986676fef8bbda0b8f08000c06442
Author: Suneel Marthi 
AuthorDate: Sun Feb 2 15:13:24 2020 +0100

[MINOR] Updated DOAP with 0.5.1 release (#1300)
---
 doap_HUDI.rdf | 5 +
 1 file changed, 5 insertions(+)

diff --git a/doap_HUDI.rdf b/doap_HUDI.rdf
index 7df689b..29baa24 100644
--- a/doap_HUDI.rdf
+++ b/doap_HUDI.rdf
@@ -41,6 +41,11 @@
 2019-10-24
 0.5.0
   
+  
+Apache Hudi-incubating 0.5.0
+2020-01-31
+0.5.1
+  
 
 
   



[incubator-hudi] branch master updated (362a9b9 -> c06ec8b)

2020-01-29 Thread smarthi
This is an automated email from the ASF dual-hosted git repository.

smarthi pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git.


from 362a9b9  [MINOR] Remove junit-dep dependency
 add c06ec8b  [MINOR] Fix assigning to configuration more times (#1291)

No new revisions were added by this update.

Summary of changes:
 .../hudi/index/TestHBaseQPSResourceAllocator.java  |  2 +-
 .../java/org/apache/hudi/index/TestHbaseIndex.java |  2 +-
 .../realtime/HoodieParquetRealtimeInputFormat.java | 41 ++
 3 files changed, 21 insertions(+), 24 deletions(-)



[incubator-hudi] branch master updated: [MINOR] Add missing licenses (#1271)

2020-01-22 Thread smarthi
This is an automated email from the ASF dual-hosted git repository.

smarthi pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new ed54eb2  [MINOR] Add missing licenses (#1271)
ed54eb2 is described below

commit ed54eb20a5c9c82c348bc39387ef73f835ad6821
Author: leesf <490081...@qq.com>
AuthorDate: Wed Jan 22 21:06:45 2020 +0800

[MINOR] Add missing licenses (#1271)
---
 docker/demo/presto-batch1.commands | 18 ++
 docker/demo/presto-batch2-after-compaction.commands| 18 ++
 docker/demo/presto-table-check.commands| 18 ++
 .../hudi/utilities/TestHiveIncrementalPuller.java  | 18 ++
 4 files changed, 72 insertions(+)

diff --git a/docker/demo/presto-batch1.commands b/docker/demo/presto-batch1.commands
index 3e39df8..35f2b51 100644
--- a/docker/demo/presto-batch1.commands
+++ b/docker/demo/presto-batch1.commands
@@ -1,3 +1,21 @@
+
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#  http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+# limitations under the License.
+
+
 select symbol, max(ts) from stock_ticks_cow group by symbol HAVING symbol = 'GOOG';
 select symbol, max(ts) from stock_ticks_mor_ro group by symbol HAVING symbol = 'GOOG';
 select symbol, ts, volume, open, close  from stock_ticks_cow where  symbol = 'GOOG';
diff --git a/docker/demo/presto-batch2-after-compaction.commands b/docker/demo/presto-batch2-after-compaction.commands
index dee4630..11d094b 100644
--- a/docker/demo/presto-batch2-after-compaction.commands
+++ b/docker/demo/presto-batch2-after-compaction.commands
@@ -1,2 +1,20 @@
+
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#  http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+# limitations under the License.
+
+
 select symbol, max(ts) from stock_ticks_mor_ro group by symbol HAVING symbol = 'GOOG';
 select symbol, ts, volume, open, close  from stock_ticks_mor_ro where  symbol = 'GOOG';
diff --git a/docker/demo/presto-table-check.commands b/docker/demo/presto-table-check.commands
index 26abcfc..bcbeeed 100644
--- a/docker/demo/presto-table-check.commands
+++ b/docker/demo/presto-table-check.commands
@@ -1 +1,19 @@
+
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#  http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+# limitations under the License.

[incubator-hudi] branch master updated (5471d8f -> 3f4966d)

2020-01-18 Thread smarthi
This is an automated email from the ASF dual-hosted git repository.

smarthi pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git.


from 5471d8f  [MINOR] Add toString method to TimelineLayoutVersion to make 
it more readable (#1244)
 add 3f4966d  [MINOR] Fix PMC in DOAP] (#1247)

No new revisions were added by this update.

Summary of changes:
 doap_HUDI.rdf | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)



[incubator-hudi] branch master updated (292c1e2 -> 5471d8f)

2020-01-17 Thread smarthi
This is an automated email from the ASF dual-hosted git repository.

smarthi pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git.


from 292c1e2  [HUDI-238] Make Hudi support Scala 2.12 (#1226)
 add 5471d8f  [MINOR] Add toString method to TimelineLayoutVersion to make 
it more readable (#1244)

No new revisions were added by this update.

Summary of changes:
 .../java/org/apache/hudi/common/model/TimelineLayoutVersion.java | 5 +
 1 file changed, 5 insertions(+)



[incubator-hudi] branch asf-site updated: [HUDI-443]: Added slides from Hadoop Summit 2019, Bangalore. Also updated Adoptions

2019-12-31 Thread smarthi
This is an automated email from the ASF dual-hosted git repository.

smarthi pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new 66634b2  [HUDI-443]: Added slides from Hadoop Summit 2019, Bangalore. 
Also updated Adoptions
 new 70c9fac  Merge pull request #1162 from pratyakshsharma/asf-site-local
66634b2 is described below

commit 66634b2a4fba7ad540fc52e2236a52c896a82b8f
Author: Pratyaksh Sharma 
AuthorDate: Tue Dec 31 15:49:17 2019 +0530

[HUDI-443]: Added slides from Hadoop Summit 2019, Bangalore. Also updated 
Adoptions
---
 docs/powered_by.md | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/docs/powered_by.md b/docs/powered_by.md
index 4453a5c..413f6a0 100644
--- a/docs/powered_by.md
+++ b/docs/powered_by.md
@@ -30,6 +30,9 @@ Yields.io is the first FinTech platform that uses AI for automated model validat
 
 Using Hudi at Yotpo for several usages. Firstly, integrated Hudi as a writer in their open source ETL framework https://github.com/YotpoLtd/metorikku and using as an output writer for a CDC pipeline, with events that are being generated from a database binlog streams to Kafka and then are written to S3.
  
+ Tathastu.ai
+
+[Tathastu.ai](https://www.tathastu.ai) offers the largest AI/ML playground of consumer data for data scientists, AI experts and technologists to build upon. They have built a CDC pipeline using Apache Hudi and Debezium. Data from Hudi datasets is being queried using Hive, Presto and Spark.
 
 ## Talks & Presentations
 
@@ -39,7 +42,6 @@ Using Hudi at Yotpo for several usages. Firstly, integrated Hudi as a writer in
 2. ["Hoodie: An Open Source Incremental Processing Framework From 
Uber"](http://www.dataengconf.com/hoodie-an-open-source-incremental-processing-framework-from-uber)
 - By Vinoth Chandar.
Apr 2017, DataEngConf, San Francisco, CA 
[Slides](https://www.slideshare.net/vinothchandar/hoodie-dataengconf-2017) 
[Video](https://www.youtube.com/watch?v=7Wudjc-v7CA)
 
-
 3. ["Incremental Processing on Large Analytical 
Datasets"](https://spark-summit.org/2017/events/incremental-processing-on-large-analytical-datasets/)
 - By Prasanna Rajaperumal
June 2017, Spark Summit 2017, San Francisco, CA. 
[Slides](https://www.slideshare.net/databricks/incremental-processing-on-large-analytical-datasets-with-prasanna-rajaperumal-and-vinoth-chandar)
 [Video](https://www.youtube.com/watch?v=3HS0lQX-cgo&feature=youtu.be)
 
@@ -56,9 +58,11 @@ Using Hudi at Yotpo for several usages. Firstly, integrated Hudi as a writer in
 
 8. ["Apache Hudi (Incubating) - The Past, Present and Future Of Efficient Data 
Lake 
Architectures"](https://docs.google.com/presentation/d/1FHhsvh70ZP6xXlHdVsAI0g__B_6Mpto5KQFlZ0b8-mM)
 - By Vinoth Chandar & Balaji Varadarajan
September 2019, ApacheCon NA 19, Las Vegas, NV, USA
-   
+  
 9. ["Insert, upsert, and delete data in Amazon S3 using Amazon 
EMR"](https://www.portal.reinvent.awsevents.com/connect/sessionDetail.ww?SESSION_ID=98662&csrftkn=YS67-AG7B-QIAV-ZZBK-E6TT-MD4Q-1HEP-747P)
 - By Paul Codding & Vinoth Chandar
-   December 2019, AWS re:Invent 2019, Las Vegas, NV, USA 
+  December 2019, AWS re:Invent 2019, Las Vegas, NV, USA  
+   
+10. ["Building Robust CDC Pipeline With Apache Hudi And Debezium"](https://www.slideshare.net/SyedKather/building-robust-cdc-pipeline-with-apache-hudi-and-debezium) - By Pratyaksh, Purushotham, Syed and Shaik December 2019, Hadoop Summit Bangalore, India
 
 ## Articles
 



[incubator-hudi] branch master updated: [HUDI-343]: Create a DOAP file for Hudi

2019-12-31 Thread smarthi
This is an automated email from the ASF dual-hosted git repository.

smarthi pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new 47c1f74  [HUDI-343]: Create a DOAP file for Hudi
 new 98c0d8c  Merge pull request #1160 from smarthi/HUDI-343
47c1f74 is described below

commit 47c1f746e2ddd061064af6026de9125c65bf114a
Author: Suneel Marthi 
AuthorDate: Tue Dec 31 02:05:21 2019 -0500

[HUDI-343]: Create a DOAP file for Hudi
---
 doap_HUDI.rdf | 59 +++
 1 file changed, 59 insertions(+)

diff --git a/doap_HUDI.rdf b/doap_HUDI.rdf
new file mode 100644
index 0000000..6484004
--- /dev/null
+++ b/doap_HUDI.rdf
@@ -0,0 +1,59 @@
+
+
+http://usefulinc.com/ns/doap#"
+ xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
+ xmlns:asfext="http://projects.apache.org/ns/asfext#"
+ xmlns:foaf="http://xmlns.com/foaf/0.1/">
+
+  https://hudi.apache.org">
+2019-12-31
+http://usefulinc.com/doap/licenses/asl20" />
+Apache Hudi
+https://hudi.apache.org" />
+https://hudi.apache.org" />
+Ingests and Manages storage of large analytical datasets
+Hudi (pronounced “Hoodie”) ingests and manages storage of large analytical datasets over DFS
+ (HDFS or cloud stores) and provides three logical views for query access.
+https://issues.apache.org/jira/browse/HUDI" />
+https://hudi.apache.org/community.html" />
+https://hudi.apache.org/community.html" />
+Java
+Scala
+http://projects.apache.org/category/library" />
+
+  
+Apache Hudi-incubating 0.5.0
+2019-10-24
+0.5.0
+  
+
+
+  
+https://github.com/apache/incubator-hudi.git"/>
+https://github.com/apache/incubator-hudi"/>
+  
+
+
+  
+Apache Hudi-incubating PPMC
+  mailto:d...@hudi.apache.org"/>
+  
+
+  
+
\ No newline at end of file



[incubator-hudi] branch master updated (e637d9e -> add4b1e)

2019-12-30 Thread smarthi
This is an automated email from the ASF dual-hosted git repository.

smarthi pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git.


from e637d9e  [HUDI-455] Redo hudi-client log statements using SLF4J (#1145)
 new bb90ded  [MINOR] Fix out of limits for results
 new 36c0e6b  [MINOR] Fix out of limits for results
 new 74b00d1  trigger rebuild
 new 619f501  Clean up code
 new add4b1e  Merge pull request #1143 from BigDataArtisans/outoflimit

The 695 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../org/apache/hudi/cli/commands/HoodieLogFileCommand.java   | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)