[GitHub] [incubator-hudi] codecov-commenter edited a comment on pull request #1592: [Hudi-822] decouple Hudi related logics from HoodieInputFormat

2020-05-22 Thread GitBox


codecov-commenter edited a comment on pull request #1592:
URL: https://github.com/apache/incubator-hudi/pull/1592#issuecomment-632985999


   # [Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1592?src=pr=h1) Report
   > Merging 
[#1592](https://codecov.io/gh/apache/incubator-hudi/pull/1592?src=pr=desc) 
into 
[master](https://codecov.io/gh/apache/incubator-hudi/commit/f34de3fb2738c8c36c937eba8df2a6848fafa886=desc)
 will **decrease** coverage by `0.05%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-hudi/pull/1592/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/incubator-hudi/pull/1592?src=pr=tree)
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #1592      +/-   ##
   =============================================
   - Coverage     18.21%   18.16%   -0.06%     
   - Complexity      856      857       +1     
   =============================================
     Files           348      351       +3     
     Lines         15334    15388      +54     
     Branches       1523     1525       +2     
   =============================================
   + Hits           2793     2795       +2     
   - Misses        12184    12236      +52     
     Partials        357      357              
   ```
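   A quick sanity check of the headline figures, assuming Codecov's usual definition `coverage = hits / lines` (values taken from the table above):

   ```python
   # Coverage before and after the PR, from the Hits and Lines rows of the table.
   hits_before, lines_before = 2793, 15334
   hits_after, lines_after = 2795, 15388

   cov_before = round(100 * hits_before / lines_before, 2)
   cov_after = round(100 * hits_after / lines_after, 2)

   print(cov_before)                        # 18.21
   print(cov_after)                         # 18.16
   print(round(cov_after - cov_before, 2))  # -0.05, i.e. the ~0.05% decrease
   ```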
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-hudi/pull/1592?src=pr=tree) | 
Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | 
[...g/apache/hudi/hadoop/HoodieParquetInputFormat.java](https://codecov.io/gh/apache/incubator-hudi/pull/1592/diff?src=pr=tree#diff-aHVkaS1oYWRvb3AtbXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaGFkb29wL0hvb2RpZVBhcnF1ZXRJbnB1dEZvcm1hdC5qYXZh)
 | `0.00% <0.00%> (ø)` | `0.00 <0.00> (ø)` | |
   | 
[...rg/apache/hudi/hadoop/HoodieROTablePathFilter.java](https://codecov.io/gh/apache/incubator-hudi/pull/1592/diff?src=pr=tree#diff-aHVkaS1oYWRvb3AtbXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaGFkb29wL0hvb2RpZVJPVGFibGVQYXRoRmlsdGVyLmphdmE=)
 | `0.00% <ø> (ø)` | `0.00 <0.00> (ø)` | |
   | 
[.../java/org/apache/hudi/hadoop/InputPathHandler.java](https://codecov.io/gh/apache/incubator-hudi/pull/1592/diff?src=pr=tree#diff-aHVkaS1oYWRvb3AtbXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaGFkb29wL0lucHV0UGF0aEhhbmRsZXIuamF2YQ==)
 | `0.00% <ø> (ø)` | `0.00 <0.00> (ø)` | |
   | 
[.../hadoop/realtime/AbstractRealtimeRecordReader.java](https://codecov.io/gh/apache/incubator-hudi/pull/1592/diff?src=pr=tree#diff-aHVkaS1oYWRvb3AtbXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaGFkb29wL3JlYWx0aW1lL0Fic3RyYWN0UmVhbHRpbWVSZWNvcmRSZWFkZXIuamF2YQ==)
 | `0.00% <0.00%> (ø)` | `0.00 <0.00> (ø)` | |
   | 
[...oop/realtime/HoodieParquetRealtimeInputFormat.java](https://codecov.io/gh/apache/incubator-hudi/pull/1592/diff?src=pr=tree#diff-aHVkaS1oYWRvb3AtbXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaGFkb29wL3JlYWx0aW1lL0hvb2RpZVBhcnF1ZXRSZWFsdGltZUlucHV0Rm9ybWF0LmphdmE=)
 | `0.00% <0.00%> (ø)` | `0.00 <0.00> (ø)` | |
   | 
[...hadoop/realtime/RealtimeCompactedRecordReader.java](https://codecov.io/gh/apache/incubator-hudi/pull/1592/diff?src=pr=tree#diff-aHVkaS1oYWRvb3AtbXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaGFkb29wL3JlYWx0aW1lL1JlYWx0aW1lQ29tcGFjdGVkUmVjb3JkUmVhZGVyLmphdmE=)
 | `0.00% <0.00%> (ø)` | `0.00 <0.00> (ø)` | |
   | 
[.../hadoop/realtime/RealtimeUnmergedRecordReader.java](https://codecov.io/gh/apache/incubator-hudi/pull/1592/diff?src=pr=tree#diff-aHVkaS1oYWRvb3AtbXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaGFkb29wL3JlYWx0aW1lL1JlYWx0aW1lVW5tZXJnZWRSZWNvcmRSZWFkZXIuamF2YQ==)
 | `0.00% <0.00%> (ø)` | `0.00 <0.00> (ø)` | |
   | 
[...a/org/apache/hudi/hadoop/utils/HoodieHiveUtil.java](https://codecov.io/gh/apache/incubator-hudi/pull/1592/diff?src=pr=tree#diff-aHVkaS1oYWRvb3AtbXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaGFkb29wL3V0aWxzL0hvb2RpZUhpdmVVdGlsLmphdmE=)
 | `0.00% <ø> (ø)` | `0.00 <0.00> (?)` | |
   | 
[...ache/hudi/hadoop/utils/HoodieInputFormatUtils.java](https://codecov.io/gh/apache/incubator-hudi/pull/1592/diff?src=pr=tree#diff-aHVkaS1oYWRvb3AtbXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaGFkb29wL3V0aWxzL0hvb2RpZUlucHV0Rm9ybWF0VXRpbHMuamF2YQ==)
 | `0.00% <0.00%> (ø)` | `0.00 <0.00> (?)` | |
   | 
[...i/hadoop/utils/HoodieRealtimeInputFormatUtils.java](https://codecov.io/gh/apache/incubator-hudi/pull/1592/diff?src=pr=tree#diff-aHVkaS1oYWRvb3AtbXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaGFkb29wL3V0aWxzL0hvb2RpZVJlYWx0aW1lSW5wdXRGb3JtYXRVdGlscy5qYXZh)
 | `0.00% <0.00%> (ø)` | `0.00 <0.00> (?)` | |
   | ... and [5 
more](https://codecov.io/gh/apache/incubator-hudi/pull/1592/diff?src=pr=tree-more)
 | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1592?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 

[GitHub] [incubator-hudi] codecov-commenter commented on pull request #1592: [Hudi-822] decouple Hudi related logics from HoodieInputFormat

2020-05-22 Thread GitBox


codecov-commenter commented on pull request #1592:
URL: https://github.com/apache/incubator-hudi/pull/1592#issuecomment-632985999



Build failed in Jenkins: hudi-snapshot-deployment-0.5 #286

2020-05-22 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 2.36 KB...]
/home/jenkins/tools/maven/apache-maven-3.5.4/conf:
logging
settings.xml
toolchains.xml

/home/jenkins/tools/maven/apache-maven-3.5.4/conf/logging:
simplelogger.properties

/home/jenkins/tools/maven/apache-maven-3.5.4/lib:
aopalliance-1.0.jar
cdi-api-1.0.jar
cdi-api.license
commons-cli-1.4.jar
commons-cli.license
commons-io-2.5.jar
commons-io.license
commons-lang3-3.5.jar
commons-lang3.license
ext
guava-20.0.jar
guice-4.2.0-no_aop.jar
jansi-1.17.1.jar
jansi-native
javax.inject-1.jar
jcl-over-slf4j-1.7.25.jar
jcl-over-slf4j.license
jsr250-api-1.0.jar
jsr250-api.license
maven-artifact-3.5.4.jar
maven-artifact.license
maven-builder-support-3.5.4.jar
maven-builder-support.license
maven-compat-3.5.4.jar
maven-compat.license
maven-core-3.5.4.jar
maven-core.license
maven-embedder-3.5.4.jar
maven-embedder.license
maven-model-3.5.4.jar
maven-model-builder-3.5.4.jar
maven-model-builder.license
maven-model.license
maven-plugin-api-3.5.4.jar
maven-plugin-api.license
maven-repository-metadata-3.5.4.jar
maven-repository-metadata.license
maven-resolver-api-1.1.1.jar
maven-resolver-api.license
maven-resolver-connector-basic-1.1.1.jar
maven-resolver-connector-basic.license
maven-resolver-impl-1.1.1.jar
maven-resolver-impl.license
maven-resolver-provider-3.5.4.jar
maven-resolver-provider.license
maven-resolver-spi-1.1.1.jar
maven-resolver-spi.license
maven-resolver-transport-wagon-1.1.1.jar
maven-resolver-transport-wagon.license
maven-resolver-util-1.1.1.jar
maven-resolver-util.license
maven-settings-3.5.4.jar
maven-settings-builder-3.5.4.jar
maven-settings-builder.license
maven-settings.license
maven-shared-utils-3.2.1.jar
maven-shared-utils.license
maven-slf4j-provider-3.5.4.jar
maven-slf4j-provider.license
org.eclipse.sisu.inject-0.3.3.jar
org.eclipse.sisu.inject.license
org.eclipse.sisu.plexus-0.3.3.jar
org.eclipse.sisu.plexus.license
plexus-cipher-1.7.jar
plexus-cipher.license
plexus-component-annotations-1.7.1.jar
plexus-component-annotations.license
plexus-interpolation-1.24.jar
plexus-interpolation.license
plexus-sec-dispatcher-1.4.jar
plexus-sec-dispatcher.license
plexus-utils-3.1.0.jar
plexus-utils.license
slf4j-api-1.7.25.jar
slf4j-api.license
wagon-file-3.1.0.jar
wagon-file.license
wagon-http-3.1.0-shaded.jar
wagon-http.license
wagon-provider-api-3.1.0.jar
wagon-provider-api.license

/home/jenkins/tools/maven/apache-maven-3.5.4/lib/ext:
README.txt

/home/jenkins/tools/maven/apache-maven-3.5.4/lib/jansi-native:
freebsd32
freebsd64
linux32
linux64
osx
README.txt
windows32
windows64

/home/jenkins/tools/maven/apache-maven-3.5.4/lib/jansi-native/freebsd32:
libjansi.so

/home/jenkins/tools/maven/apache-maven-3.5.4/lib/jansi-native/freebsd64:
libjansi.so

/home/jenkins/tools/maven/apache-maven-3.5.4/lib/jansi-native/linux32:
libjansi.so

/home/jenkins/tools/maven/apache-maven-3.5.4/lib/jansi-native/linux64:
libjansi.so

/home/jenkins/tools/maven/apache-maven-3.5.4/lib/jansi-native/osx:
libjansi.jnilib

/home/jenkins/tools/maven/apache-maven-3.5.4/lib/jansi-native/windows32:
jansi.dll

/home/jenkins/tools/maven/apache-maven-3.5.4/lib/jansi-native/windows64:
jansi.dll
Finished /home/jenkins/tools/maven/apache-maven-3.5.4 Directory Listing :
Detected current version as: 'HUDI_home= 0.6.0-SNAPSHOT'
[INFO] Scanning for projects...
[WARNING] 
[WARNING] Some problems were encountered while building the effective model for 
org.apache.hudi:hudi-spark_2.11:jar:0.6.0-SNAPSHOT
[WARNING] 'artifactId' contains an expression but should be a constant. @ 
org.apache.hudi:hudi-spark_${scala.binary.version}:[unknown-version], 

 line 26, column 15
[WARNING] 
[WARNING] Some problems were encountered while building the effective model for 
org.apache.hudi:hudi-timeline-service:jar:0.6.0-SNAPSHOT
[WARNING] 'build.plugins.plugin.(groupId:artifactId)' must be unique but found 
duplicate declaration of plugin org.jacoco:jacoco-maven-plugin @ 
org.apache.hudi:hudi-timeline-service:[unknown-version], 

 line 58, column 15
[WARNING] 
[WARNING] Some problems were encountered while building the effective model for 
org.apache.hudi:hudi-utilities_2.11:jar:0.6.0-SNAPSHOT
[WARNING] 'artifactId' contains an expression but should be a constant. @ 
org.apache.hudi:hudi-utilities_${scala.binary.version}:[unknown-version], 

 line 26, column 15
[WARNING] 
[WARNING] Some problems were encountered while building the effective model for 
org.apache.hudi:hudi-spark-bundle_2.11:jar:0.6.0-SNAPSHOT
[WARNING] 'artifactId' contains an expression but should be a constant. @ 

[GitHub] [incubator-hudi] garyli1019 commented on pull request #1592: [Hudi-822] decouple Hudi related logics from HoodieInputFormat

2020-05-22 Thread GitBox


garyli1019 commented on pull request #1592:
URL: https://github.com/apache/incubator-hudi/pull/1592#issuecomment-632977748


   Folks, putting these two commits together makes this too difficult to review. Let's do one step at a time.
   
   1. This PR decouples Hudi-related logic from HoodieInputFormat.
   2. After this, we can do Spark Datasource on MOR.
   3. Then, Spark Datasource incremental view on MOR.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (HUDI-920) Incremental view on MOR table using Spark Datasource

2020-05-22 Thread Yanjia Gary Li (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yanjia Gary Li updated HUDI-920:

Fix Version/s: 0.6.0

> Incremental view on MOR table using Spark Datasource
> 
>
> Key: HUDI-920
> URL: https://issues.apache.org/jira/browse/HUDI-920
> Project: Apache Hudi (incubating)
>  Issue Type: New Feature
>  Components: Spark Integration
>Reporter: Yanjia Gary Li
>Assignee: Yanjia Gary Li
>Priority: Major
> Fix For: 0.6.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HUDI-920) Incremental view on MOR table using Spark Datasource

2020-05-22 Thread Yanjia Gary Li (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yanjia Gary Li updated HUDI-920:

Status: Open  (was: New)

> Incremental view on MOR table using Spark Datasource
> 
>
> Key: HUDI-920
> URL: https://issues.apache.org/jira/browse/HUDI-920
> Project: Apache Hudi (incubating)
>  Issue Type: New Feature
>  Components: Spark Integration
>Reporter: Yanjia Gary Li
>Assignee: Yanjia Gary Li
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HUDI-920) Incremental view on MOR table using Spark Datasource

2020-05-22 Thread Yanjia Gary Li (Jira)
Yanjia Gary Li created HUDI-920:
---

 Summary: Incremental view on MOR table using Spark Datasource
 Key: HUDI-920
 URL: https://issues.apache.org/jira/browse/HUDI-920
 Project: Apache Hudi (incubating)
  Issue Type: New Feature
  Components: Spark Integration
Reporter: Yanjia Gary Li
Assignee: Yanjia Gary Li






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [incubator-hudi] leesf edited a comment on issue #661: Tracking ticket for reporting Hudi usages from the community

2020-05-22 Thread GitBox


leesf edited a comment on issue #661:
URL: https://github.com/apache/incubator-hudi/issues/661#issuecomment-632942498


   Hudi has been integrated into [Data Lake Analytics](https://www.alibabacloud.com/help/product/70174.htm) at Aliyun to provide a data lake solution for users on [OSS](https://www.alibabacloud.com/help/doc-detail/31883.htm). First, you write data to Hudi on OSS and sync it to DLA; after that, you can query the Hudi dataset via DLA.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [incubator-hudi] leesf removed a comment on issue #661: Tracking ticket for reporting Hudi usages from the community

2020-05-22 Thread GitBox


leesf removed a comment on issue #661:
URL: https://github.com/apache/incubator-hudi/issues/661#issuecomment-526755266


   Not using it in prod yet, but very early stages of investigating its usage.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [incubator-hudi] leesf edited a comment on issue #661: Tracking ticket for reporting Hudi usages from the community

2020-05-22 Thread GitBox


leesf edited a comment on issue #661:
URL: https://github.com/apache/incubator-hudi/issues/661#issuecomment-632942498


   Hudi has been integrated into [AliCloud Data Lake Analytics](https://www.alibabacloud.com/help/product/70174.htm) at Aliyun to provide a data lake solution for users. First, you write data to Hudi on [OSS](https://www.alibabacloud.com/help/doc-detail/31883.htm) and sync it to DLA; after that, you can query the Hudi dataset via DLA.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [incubator-hudi] leesf commented on issue #661: Tracking ticket for reporting Hudi usages from the community

2020-05-22 Thread GitBox


leesf commented on issue #661:
URL: https://github.com/apache/incubator-hudi/issues/661#issuecomment-632942498


   Hudi has been integrated into [AliCloud Data Lake Analytics](https://www.alibabacloud.com/help/product/70174.htm) at Aliyun to provide a solution for users to query Hudi tables via DLA.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




svn commit: r39731 - /dev/incubator/hudi/KEYS

2020-05-22 Thread vbalaji
Author: vbalaji
Date: Fri May 22 20:19:45 2020
New Revision: 39731

Log:
Adding nsivabalan KEY

Modified:
dev/incubator/hudi/KEYS

Modified: dev/incubator/hudi/KEYS
==
--- dev/incubator/hudi/KEYS (original)
+++ dev/incubator/hudi/KEYS Fri May 22 20:19:45 2020
@@ -369,3 +369,93 @@ k2TGkmCbCkVobp4AElQ28fQ/sAYtVWLO6wGEpFH/
 FA3V4MZ7SGwKjZZa6oep6lPoig/R4MfsDwQ2zW/vLFPel1am406v
 =Z0T0
 -END PGP PUBLIC KEY BLOCK-
+
+pub   rsa2048 2020-05-22 [SC]
+  24A1F32AD79C70D7873A799EF81F5FAD4BE73A33
+uid   [ultimate] Sivabalan Narayanan (CODE SIGNING KEY) 

+sig 3F81F5FAD4BE73A33 2020-05-22  Sivabalan Narayanan (CODE SIGNING 
KEY) 
+sub   rsa2048 2020-05-22 [E]
+sig  F81F5FAD4BE73A33 2020-05-22  Sivabalan Narayanan (CODE SIGNING 
KEY) 
+pub   rsa4096 2020-05-22 [SC]
+  001B66FA2B2543C151872CCC29A4FD82F1508833
+uid   [ultimate] Sivabalan Narayanan (CODE SIGNING 4096 KEY) 

+sig 329A4FD82F1508833 2020-05-22  Sivabalan Narayanan (CODE SIGNING 
4096 KEY) 
+sub   rsa4096 2020-05-22 [E]
+sig  29A4FD82F1508833 2020-05-22  Sivabalan Narayanan (CODE SIGNING 
4096 KEY) 
+-BEGIN PGP PUBLIC KEY BLOCK-
+mQENBF7IIq8BCACdgvhd/sX3ygampSY7YQ/EFkVwT2j0+gR16XKAWY0okTPoBhWQ
+HcX3Gi0D6eZp0XpA//CN9Mqpuuvv+BoC1W5MbiV4f42+owdP83FnxLNqovbXMDF3
+WOVXbTHHXuvYdCNaxYnJZlkoI3o+oEdtauPiaH4e1yoflbbQNwvP+4o2Xw5D2Hu/
+my7sDVv3+FQwk6yAjKjcanjOBdNGaZ1xEriI7XSGQPnx0qwBnABXOhpa8d8QLLxo
+qtUBuyWmCekWvvPOdwCC7ln+R/dNoZbrc9/0CdQ1v79+LF0v3oMTFrUBly1Mg+UB
+uxoGAMSKK4tiQpZmyY53A3ikdqMnYjYdi/dLABEBAAG0PVNpdmFiYWxhbiBOYXJh
+eWFuYW4gKENPREUgU0lHTklORyBLRVkpIDxzaXZhYmFsYW5AYXBhY2hlLm9yZz6J
+AU4EEwEIADgWIQQkofMq15xw14c6eZ74H1+tS+c6MwUCXsgirwIbAwULCQgHAgYV
+CgkICwIEFgIDAQIeAQIXgAAKCRD4H1+tS+c6M61gB/9m0gXvVda9vTUVWIvXm8ue
+yuzIIAo1/ua41vE4S3mZ2sObiDP+qPSl5g7pdHj+bMWiHrLqDg7r30TQI6v6lufV
+8XTaymP/1lfisaDBRh3x6fjgoMt8enLLXMOVJArnWaqs9VbpKq7U3LKeQrAJ6LNN
+0N9QOq+emzi4Xu8RhYl6m9mzGbFUdeDCW2BKkRgnuxORp8Ok3BsXnuqT8kgyzw4D
+lcKY+2sPpNuZ9qu05nkWKy1tNXI3RC+rOiOk11Egj/UJpYuTbJFZ0Bgn0sv7689t
+J6KdYooGyISEj4IhaiBEQnDKNeyMRd3LnG7h8iQWIxz33MWosAPDjby2DNDB8PP+
+uQENBF7IIq8BCADSjrusU/qlCs+ZxGRa6LFkb9i7LqyhcCUKLrNJ6ERD1yWkFuYn
+zP2ZC0Th4DFi3pxvW5gnj/Gu6ORmlvHpSaP+JU2XBivH2lYZaOE/YOFfUlesSZnO
+Yc+B2SX1/TJwj/GiwEDHdR82OMciQgkTLxcp43rIUqxxeOVsWmvyJQ9PMcyuUcHY
+RUcr52tNwvyDwueC5coo1tMzNUH8megWTMNkA+32EQZ6mPbrogBSaZ4UjngVfy5y
+q74GEp5vYsKfQ8la67vJMws7IefdHVbn17GtcA6y7sz0wxgMKl2I4iXjT7+nIFYZ
+CI+1uFE0+4rIyN4foR8Qrta/SwwbxOEbsNYPABEBAAGJATYEGAEIACAWIQQkofMq
+15xw14c6eZ74H1+tS+c6MwUCXsgirwIbDAAKCRD4H1+tS+c6MxFMCACBCfmyHzOL
+F7DGd96+UQu2fYnel4r4v35tNaEo81mYpseJvqrxy97Lw8uOgtJCyP1HGFFoeWD8
+k/Sxc8A1wwaE652OSYZsXtRJFfkEiKbvlHu++DemvsG00Sc3Id6st1WsltYCceE9
+Vyc2JkW4lIyngdaEQ684rRD3tnL6IdpU6+IkPFk4DJ9onE0nRYRhJh3VkpBsQaxH
+Kal7GVE9urk/u4jEDHIDD7sQBZDhK7q4X9oMMZAbNVy8qyYxdakAHwdBFiFwOwiV
+IeENggOCvFPjhJFUmj9V5ufOQ7hMCoOyFnuFDEjazy1rDcYc97twQZKz9wpnJ3sT
+Qg4pNVw7XS1nmQINBF7II6MBEADMYKgJp677iUZPeLj3ugh0Ov9fyIfznW2O6nRi
+GLd7XZjQURA8g1THzYXsoXcmBr1ACYc2Sc8qqoTQZiPNpdT4ONkA/7ynCiZQqGPm
+u7lvbg37rghc0oNyqC7eqEdzB+Hfc77v7D/bknmiljTHvAhs0aaxnFyDdfZ8NGDy
+OAi3CbeMHy3v396DyrHq36KYy5IUBQFNtkx6QGR3Wuou22DR2zQgw1OWtcGxZ1Ac
+gSdHIzX0YiWu82yaq0B0i6guGgH9HLuCsojjYxLMmHpcc09nFErt2vmqLF+FjzR4
+tC6rxfShrBvn0wCMhxrwSJfiMnd4x+xc2xg07oC6Hp+XJxOcBJ75o+Tb7XoL87w+
+y8N1uQbUoYGCu3AzT9Ugf/h0ofoGpxd0gtquYEwJiZLdqtUEm0+Ik3yhp7cyo1gf
+dSxLWn5cD+B7rPakeCMxyKsWnLAZ3RH9kLIjl+qhB0xY+0nh1sYRXtduycAR2LBQ
+ZDF3rHkE0XG9UsVnHFA0+3D4X8XXQ9kv1JqqVpvrmAnQsWKcl1C9qLwGxqi5I18Z
+L81Q3S43l7sB4Y/EKcGp0xDhSwmJnyLGcT+3lDBvgZX3Z3mzw4vy+0Nc0H2mqE1F
+Rbc31gbiyzr0hCgTP6H+ae6pTw7TgIm1oWuAiQj51Y6X7gNr+/KagofiJF+w+wAC
+JZu6eQARAQABtEJTaXZhYmFsYW4gTmFyYXlhbmFuIChDT0RFIFNJR05JTkcgNDA5
+NiBLRVkpIDxzaXZhYmFsYW5AYXBhY2hlLm9yZz6JAk4EEwEIADgWIQQAG2b6KyVD
+wVGHLMwppP2C8VCIMwUCXsgjowIbAwULCQgHAgYVCgkICwIEFgIDAQIeAQIXgAAK
+CRAppP2C8VCIM6DcD/97XGT8x+7rphtmIE4SDxXCo8c8+G7YUJik6bvYTlj7M+gh
+QwvMOaRHenxXbIuL4rxnqGgmM+mER8sU20Z4P5NMA1dcrIj6KkLz3I/tBxSL/ERR
+5mIrPZzIMxNfdq7I7LYFdQAabTdQ76GT7orUo6IiaWT4tiCLUq1frwf5eNwxwHv7
+VU0gcdQJa9xe93Vf+oEfTZPZdLK2xyfpfSXbzYigxA2nEMs+wtNLTPolcgS7YL9d
+UOgJ3ghGhvqKKTlL4T0VsGnef5pqGvJKf0fABX4Q8YkyilN5mU4ScrIjpdMsQs51
+DmXKeG5j97MJbGMh6dBsMvX0FlLTH4/W6zlsoiRfrLXPRYQG6fywfD7Bf0f8Mek8
+ezoykPZy1SxiPrXRmbYmbvjBUNbOAGjAjp5S2xx/g4mygPoBppidvX7yZI88AVl1
+Rt4vZnUuly96Ks6OGj8aStYtZBoAGYT8j84T/NhF6k084EON8wnezjB3emAAykuu
+yL/jxO6x7jtYVjxG/Kg6KWl9+0J9zp/GECrforYGgdrTg6s/TPKb/7KQSgc3aQs3
+IEzFo05a0B69FI8TIDAqy1QkmvowD2W9pJ/sblqThgi/QIGJhtXiqYZ1QBgTaST1
+6x/gj6CxnpeNvVHM8oXGo0Q9py+YYUwhnX0eo0h0aPlDxUMLsJP7BVVcmIKXN7kC
+DQReyCOjARAAq3IWwMJOwGmIOxfxnXno5ynZSRa7ZtJGZuA7V+spEEQhgfZ6SlrR
+m/Slo7WAdOyB48vHJX1vrNQLRxmgfCkWfikdY205Q6OzhoLua0PbO3jdQYrQTx4A
+rx3G6fali2h55SWCRR1AQ9Iuu0Zqjp1ZaDrS/VPBFQxxioMLPjJapbmULGSmxu6E
+FHJtzgrCjkvRTrCeDppOiisAd/F7vn93usx/JyeZ6DfN4J1MQJhAzEOTFOyIj5Vu
+9JbnLnx09Hqb+JUQI+RAgRyj7h6qtYCnJmyJfK15WkrzYn0SZ/pvYFoxgKwZhl5N

svn commit: r39730 - /release/incubator/hudi/KEYS

2020-05-22 Thread vbalaji
Author: vbalaji
Date: Fri May 22 20:18:41 2020
New Revision: 39730

Log:
Adding nsivabalan KEY

Modified:
release/incubator/hudi/KEYS

Modified: release/incubator/hudi/KEYS
==
--- release/incubator/hudi/KEYS (original)
+++ release/incubator/hudi/KEYS Fri May 22 20:18:41 2020
@@ -369,3 +369,93 @@ k2TGkmCbCkVobp4AElQ28fQ/sAYtVWLO6wGEpFH/
 FA3V4MZ7SGwKjZZa6oep6lPoig/R4MfsDwQ2zW/vLFPel1am406v
 =Z0T0
 -END PGP PUBLIC KEY BLOCK-
+
+pub   rsa2048 2020-05-22 [SC]
+  24A1F32AD79C70D7873A799EF81F5FAD4BE73A33
+uid   [ultimate] Sivabalan Narayanan (CODE SIGNING KEY) 

+sig 3F81F5FAD4BE73A33 2020-05-22  Sivabalan Narayanan (CODE SIGNING 
KEY) 
+sub   rsa2048 2020-05-22 [E]
+sig  F81F5FAD4BE73A33 2020-05-22  Sivabalan Narayanan (CODE SIGNING 
KEY) 
+pub   rsa4096 2020-05-22 [SC]
+  001B66FA2B2543C151872CCC29A4FD82F1508833
+uid   [ultimate] Sivabalan Narayanan (CODE SIGNING 4096 KEY) 

+sig 329A4FD82F1508833 2020-05-22  Sivabalan Narayanan (CODE SIGNING 
4096 KEY) 
+sub   rsa4096 2020-05-22 [E]
+sig  29A4FD82F1508833 2020-05-22  Sivabalan Narayanan (CODE SIGNING 
4096 KEY) 
+-BEGIN PGP PUBLIC KEY BLOCK-
+mQENBF7IIq8BCACdgvhd/sX3ygampSY7YQ/EFkVwT2j0+gR16XKAWY0okTPoBhWQ
+HcX3Gi0D6eZp0XpA//CN9Mqpuuvv+BoC1W5MbiV4f42+owdP83FnxLNqovbXMDF3
+WOVXbTHHXuvYdCNaxYnJZlkoI3o+oEdtauPiaH4e1yoflbbQNwvP+4o2Xw5D2Hu/
+my7sDVv3+FQwk6yAjKjcanjOBdNGaZ1xEriI7XSGQPnx0qwBnABXOhpa8d8QLLxo
+qtUBuyWmCekWvvPOdwCC7ln+R/dNoZbrc9/0CdQ1v79+LF0v3oMTFrUBly1Mg+UB
+uxoGAMSKK4tiQpZmyY53A3ikdqMnYjYdi/dLABEBAAG0PVNpdmFiYWxhbiBOYXJh
+eWFuYW4gKENPREUgU0lHTklORyBLRVkpIDxzaXZhYmFsYW5AYXBhY2hlLm9yZz6J
+AU4EEwEIADgWIQQkofMq15xw14c6eZ74H1+tS+c6MwUCXsgirwIbAwULCQgHAgYV
+CgkICwIEFgIDAQIeAQIXgAAKCRD4H1+tS+c6M61gB/9m0gXvVda9vTUVWIvXm8ue
+yuzIIAo1/ua41vE4S3mZ2sObiDP+qPSl5g7pdHj+bMWiHrLqDg7r30TQI6v6lufV
+8XTaymP/1lfisaDBRh3x6fjgoMt8enLLXMOVJArnWaqs9VbpKq7U3LKeQrAJ6LNN
+0N9QOq+emzi4Xu8RhYl6m9mzGbFUdeDCW2BKkRgnuxORp8Ok3BsXnuqT8kgyzw4D
+lcKY+2sPpNuZ9qu05nkWKy1tNXI3RC+rOiOk11Egj/UJpYuTbJFZ0Bgn0sv7689t
+J6KdYooGyISEj4IhaiBEQnDKNeyMRd3LnG7h8iQWIxz33MWosAPDjby2DNDB8PP+
+uQENBF7IIq8BCADSjrusU/qlCs+ZxGRa6LFkb9i7LqyhcCUKLrNJ6ERD1yWkFuYn
+zP2ZC0Th4DFi3pxvW5gnj/Gu6ORmlvHpSaP+JU2XBivH2lYZaOE/YOFfUlesSZnO
+Yc+B2SX1/TJwj/GiwEDHdR82OMciQgkTLxcp43rIUqxxeOVsWmvyJQ9PMcyuUcHY
+RUcr52tNwvyDwueC5coo1tMzNUH8megWTMNkA+32EQZ6mPbrogBSaZ4UjngVfy5y
+q74GEp5vYsKfQ8la67vJMws7IefdHVbn17GtcA6y7sz0wxgMKl2I4iXjT7+nIFYZ
+CI+1uFE0+4rIyN4foR8Qrta/SwwbxOEbsNYPABEBAAGJATYEGAEIACAWIQQkofMq
+15xw14c6eZ74H1+tS+c6MwUCXsgirwIbDAAKCRD4H1+tS+c6MxFMCACBCfmyHzOL
+F7DGd96+UQu2fYnel4r4v35tNaEo81mYpseJvqrxy97Lw8uOgtJCyP1HGFFoeWD8
+k/Sxc8A1wwaE652OSYZsXtRJFfkEiKbvlHu++DemvsG00Sc3Id6st1WsltYCceE9
+Vyc2JkW4lIyngdaEQ684rRD3tnL6IdpU6+IkPFk4DJ9onE0nRYRhJh3VkpBsQaxH
+Kal7GVE9urk/u4jEDHIDD7sQBZDhK7q4X9oMMZAbNVy8qyYxdakAHwdBFiFwOwiV
+IeENggOCvFPjhJFUmj9V5ufOQ7hMCoOyFnuFDEjazy1rDcYc97twQZKz9wpnJ3sT
+Qg4pNVw7XS1nmQINBF7II6MBEADMYKgJp677iUZPeLj3ugh0Ov9fyIfznW2O6nRi
+GLd7XZjQURA8g1THzYXsoXcmBr1ACYc2Sc8qqoTQZiPNpdT4ONkA/7ynCiZQqGPm
+u7lvbg37rghc0oNyqC7eqEdzB+Hfc77v7D/bknmiljTHvAhs0aaxnFyDdfZ8NGDy
+OAi3CbeMHy3v396DyrHq36KYy5IUBQFNtkx6QGR3Wuou22DR2zQgw1OWtcGxZ1Ac
+gSdHIzX0YiWu82yaq0B0i6guGgH9HLuCsojjYxLMmHpcc09nFErt2vmqLF+FjzR4
+tC6rxfShrBvn0wCMhxrwSJfiMnd4x+xc2xg07oC6Hp+XJxOcBJ75o+Tb7XoL87w+
+y8N1uQbUoYGCu3AzT9Ugf/h0ofoGpxd0gtquYEwJiZLdqtUEm0+Ik3yhp7cyo1gf
+dSxLWn5cD+B7rPakeCMxyKsWnLAZ3RH9kLIjl+qhB0xY+0nh1sYRXtduycAR2LBQ
+ZDF3rHkE0XG9UsVnHFA0+3D4X8XXQ9kv1JqqVpvrmAnQsWKcl1C9qLwGxqi5I18Z
+L81Q3S43l7sB4Y/EKcGp0xDhSwmJnyLGcT+3lDBvgZX3Z3mzw4vy+0Nc0H2mqE1F
+Rbc31gbiyzr0hCgTP6H+ae6pTw7TgIm1oWuAiQj51Y6X7gNr+/KagofiJF+w+wAC
+JZu6eQARAQABtEJTaXZhYmFsYW4gTmFyYXlhbmFuIChDT0RFIFNJR05JTkcgNDA5
+NiBLRVkpIDxzaXZhYmFsYW5AYXBhY2hlLm9yZz6JAk4EEwEIADgWIQQAG2b6KyVD
+wVGHLMwppP2C8VCIMwUCXsgjowIbAwULCQgHAgYVCgkICwIEFgIDAQIeAQIXgAAK
+CRAppP2C8VCIM6DcD/97XGT8x+7rphtmIE4SDxXCo8c8+G7YUJik6bvYTlj7M+gh
+QwvMOaRHenxXbIuL4rxnqGgmM+mER8sU20Z4P5NMA1dcrIj6KkLz3I/tBxSL/ERR
+5mIrPZzIMxNfdq7I7LYFdQAabTdQ76GT7orUo6IiaWT4tiCLUq1frwf5eNwxwHv7
+VU0gcdQJa9xe93Vf+oEfTZPZdLK2xyfpfSXbzYigxA2nEMs+wtNLTPolcgS7YL9d
+UOgJ3ghGhvqKKTlL4T0VsGnef5pqGvJKf0fABX4Q8YkyilN5mU4ScrIjpdMsQs51
+DmXKeG5j97MJbGMh6dBsMvX0FlLTH4/W6zlsoiRfrLXPRYQG6fywfD7Bf0f8Mek8
+ezoykPZy1SxiPrXRmbYmbvjBUNbOAGjAjp5S2xx/g4mygPoBppidvX7yZI88AVl1
+Rt4vZnUuly96Ks6OGj8aStYtZBoAGYT8j84T/NhF6k084EON8wnezjB3emAAykuu
+yL/jxO6x7jtYVjxG/Kg6KWl9+0J9zp/GECrforYGgdrTg6s/TPKb/7KQSgc3aQs3
+IEzFo05a0B69FI8TIDAqy1QkmvowD2W9pJ/sblqThgi/QIGJhtXiqYZ1QBgTaST1
+6x/gj6CxnpeNvVHM8oXGo0Q9py+YYUwhnX0eo0h0aPlDxUMLsJP7BVVcmIKXN7kC
+DQReyCOjARAAq3IWwMJOwGmIOxfxnXno5ynZSRa7ZtJGZuA7V+spEEQhgfZ6SlrR
+m/Slo7WAdOyB48vHJX1vrNQLRxmgfCkWfikdY205Q6OzhoLua0PbO3jdQYrQTx4A
+rx3G6fali2h55SWCRR1AQ9Iuu0Zqjp1ZaDrS/VPBFQxxioMLPjJapbmULGSmxu6E
+FHJtzgrCjkvRTrCeDppOiisAd/F7vn93usx/JyeZ6DfN4J1MQJhAzEOTFOyIj5Vu
+9JbnLnx09Hqb+JUQI+RAgRyj7h6qtYCnJmyJfK15WkrzYn0SZ/pvYFoxgKwZhl5N

[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1650: [HUDI-541]: replaced dataFile/df with baseFile/bf throughout code base

2020-05-22 Thread GitBox


bvaradar commented on a change in pull request #1650:
URL: https://github.com/apache/incubator-hudi/pull/1650#discussion_r429412110



##
File path: 
hudi-common/src/main/java/org/apache/hudi/common/util/RocksDBSchemaHelper.java
##
@@ -78,12 +78,12 @@ public String getPrefixForSliceViewByPartitionFile(String 
partitionPath, String
 return String.format("type=slice,part=%s,id=%s,instant=", partitionPath, 
fileId);
   }
 
-  public String getPrefixForDataFileViewByPartitionFile(String partitionPath, 
String fileId) {
-return String.format("type=df,part=%s,id=%s,instant=", partitionPath, 
fileId);
+  public String getPrefixForBaseFileViewByPartitionFile(String partitionPath, 
String fileId) {
+return String.format("type=bf,part=%s,id=%s,instant=", partitionPath, 
fileId);

Review comment:
   Yes, this is currently transient. It is fine.

##
File path: hudi-common/src/main/avro/HoodieCompactionOperation.avsc
##
@@ -41,7 +41,7 @@
  "default": null
   },
   {
- "name":"dataFilePath",
+ "name":"baseFilePath",

Review comment:
   @pratyakshsharma: Instead of renaming fields, add an alias?
   
   http://bigdatafindings.blogspot.com/2016/05/schema-evolution-with-avro.html
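   
   For illustration, a sketch of what the alias approach could look like for the field in the diff above (the `aliases` entry is the addition; the `type` shown is assumed from the `"default": null` context and may differ from the actual schema):
   
   ```json
   {
     "name": "baseFilePath",
     "aliases": ["dataFilePath"],
     "type": ["null", "string"],
     "default": null
   }
   ```
   
   With an alias, Avro's schema resolution matches records written under the old `dataFilePath` name against the renamed field, so previously serialized compaction plans stay readable.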

##
File path: 
hudi-cli/src/main/java/org/apache/hudi/cli/commands/CompactionCommand.java
##
@@ -370,7 +370,7 @@ private String printCompaction(HoodieCompactionPlan 
compactionPlan,
 List<Comparable[]> rows = new ArrayList<>();
 if ((null != compactionPlan) && (null != compactionPlan.getOperations())) {
   for (HoodieCompactionOperation op : compactionPlan.getOperations()) {
-rows.add(new Comparable[]{op.getPartitionPath(), op.getFileId(), 
op.getBaseInstantTime(), op.getDataFilePath(),
+rows.add(new Comparable[]{op.getPartitionPath(), op.getFileId(), 
op.getBaseInstantTime(), op.getBaseFilePath(),

Review comment:
   Yes, the name is serialized. @pratyakshsharma: Instead of renaming, can you add an alias to the field and see if that works?

##
File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/view/RemoteHoodieTableFileSystemView.java
##
@@ -71,18 +71,18 @@
 
   public static final String PENDING_COMPACTION_OPS = String.format("%s/%s", 
BASE_URL, "compactions/pending/");
 
-  public static final String LATEST_PARTITION_DATA_FILES_URL =
-  String.format("%s/%s", BASE_URL, "datafiles/latest/partition");
-  public static final String LATEST_PARTITION_DATA_FILE_URL =
-  String.format("%s/%s", BASE_URL, "datafile/latest/partition");
-  public static final String ALL_DATA_FILES = String.format("%s/%s", BASE_URL, 
"datafiles/all");
-  public static final String LATEST_ALL_DATA_FILES = String.format("%s/%s", 
BASE_URL, "datafiles/all/latest/");
-  public static final String LATEST_DATA_FILE_ON_INSTANT_URL = 
String.format("%s/%s", BASE_URL, "datafile/on/latest/");
-
-  public static final String LATEST_DATA_FILES_RANGE_INSTANT_URL =
-  String.format("%s/%s", BASE_URL, "datafiles/range/latest/");
-  public static final String LATEST_DATA_FILES_BEFORE_ON_INSTANT_URL =
-  String.format("%s/%s", BASE_URL, "datafiles/beforeoron/latest/");
+  public static final String LATEST_PARTITION_BASE_FILES_URL =
+  String.format("%s/%s", BASE_URL, "basefiles/latest/partition");

Review comment:
   Yes.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [incubator-hudi] pratyakshsharma commented on pull request #1650: [HUDI-541]: replaced dataFile/df with baseFile/bf throughout code base

2020-05-22 Thread GitBox


pratyakshsharma commented on pull request #1650:
URL: https://github.com/apache/incubator-hudi/pull/1650#issuecomment-632859031


   Here also the error seems to be similar to the one I faced in 
https://github.com/apache/incubator-hudi/pull/1558#issuecomment-632858646 
@vinothchandar 
   
   Please refer to dump files (if any exist) [date].dump, [date]-jvmRun[N].dump 
and [date].dumpstream.
   [ERROR] The forked VM terminated without properly saying goodbye. VM crash 
or System.exit called?
   [ERROR] Command was /bin/sh -c cd 
/home/travis/build/apache/incubator-hudi/hudi-client && 
/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xmx2g -jar 
/home/travis/build/apache/incubator-hudi/hudi-client/target/surefire/surefirebooter7263941085871558609.jar
 /home/travis/build/apache/incubator-hudi/hudi-client/target/surefire 
2020-05-22T18-17-23_954-jvmRun1 surefire2263453892439728728tmp 
surefire_31575674776281171723tmp
   [ERROR] Error occurred in starting fork, check output in log
   [ERROR] Process Exit Code: 1
   [ERROR] Crashed tests:
   [ERROR] org.apache.hudi.table.action.compact.TestAsyncCompaction
   [ERROR] org.apache.maven.surefire.booter.SurefireBooterForkException: The 
forked VM terminated without properly saying goodbye. VM crash or System.exit 
called?
   [ERROR] Command was /bin/sh -c cd 
/home/travis/build/apache/incubator-hudi/hudi-client && 
/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xmx2g -jar 
/home/travis/build/apache/incubator-hudi/hudi-client/target/surefire/surefirebooter7263941085871558609.jar
 /home/travis/build/apache/incubator-hudi/hudi-client/target/surefire 
2020-05-22T18-17-23_954-jvmRun1 surefire2263453892439728728tmp 
surefire_31575674776281171723tmp
   [ERROR] Error occurred in starting fork, check output in log
   [ERROR] Process Exit Code: 1







[GitHub] [incubator-hudi] pratyakshsharma commented on pull request #1558: [HUDI-796]: added deduping logic for upserts case

2020-05-22 Thread GitBox


pratyakshsharma commented on pull request #1558:
URL: https://github.com/apache/incubator-hudi/pull/1558#issuecomment-632858646


   > Do you know what the error is
   
   There is some error while forking, as below:
   
   [ERROR] Please refer to dump files (if any exist) [date].dump, 
[date]-jvmRun[N].dump and [date].dumpstream.
   [ERROR] The forked VM terminated without properly saying goodbye. VM crash 
or System.exit called?
   [ERROR] Command was /bin/sh -c cd 
/home/travis/build/apache/incubator-hudi/hudi-client && 
/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xmx2g -jar 
/home/travis/build/apache/incubator-hudi/hudi-client/target/surefire/surefirebooter4462331339368597651.jar
 /home/travis/build/apache/incubator-hudi/hudi-client/target/surefire 
2020-05-22T14-44-50_350-jvmRun1 surefire3441656546543475644tmp 
surefire_33480398663035356346tmp
   [ERROR] Error occurred in starting fork, check output in log
   [ERROR] Process Exit Code: 134
   [ERROR] Crashed tests:
   [ERROR] org.apache.hudi.client.TestTableSchemaEvolution
   [ERROR] org.apache.maven.surefire.booter.SurefireBooterForkException: The 
forked VM terminated without properly saying goodbye. VM crash or System.exit 
called?
   [ERROR] Command was /bin/sh -c cd 
/home/travis/build/apache/incubator-hudi/hudi-client && 
/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xmx2g -jar 
/home/travis/build/apache/incubator-hudi/hudi-client/target/surefire/surefirebooter4462331339368597651.jar
 /home/travis/build/apache/incubator-hudi/hudi-client/target/surefire 
2020-05-22T14-44-50_350-jvmRun1 surefire3441656546543475644tmp 
surefire_33480398663035356346tmp
   [ERROR] Error occurred in starting fork, check output in log
   [ERROR] Process Exit Code: 134
   
   @vinothchandar 







[GitHub] [incubator-hudi] vinothchandar commented on pull request #1650: [HUDI-541]: replaced dataFile/df with baseFile/bf throughout code base

2020-05-22 Thread GitBox


vinothchandar commented on pull request #1650:
URL: https://github.com/apache/incubator-hudi/pull/1650#issuecomment-632852432


   
   @bvaradar you can just look at my comments alone  







[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1650: [HUDI-541]: replaced dataFile/df with baseFile/bf throughout code base

2020-05-22 Thread GitBox


vinothchandar commented on a change in pull request #1650:
URL: https://github.com/apache/incubator-hudi/pull/1650#discussion_r429399564



##
File path: 
hudi-client/src/main/java/org/apache/hudi/table/action/rollback/MergeOnReadRollbackActionExecutor.java
##
@@ -124,7 +124,7 @@ public MergeOnReadRollbackActionExecutor(JavaSparkContext 
jsc,
 case HoodieTimeline.COMMIT_ACTION:
   LOG.info("Rolling back commit action. There are higher delta 
commits. So only rolling back this instant");
   partitionRollbackRequests.add(
-  
RollbackRequest.createRollbackRequestWithDeleteDataAndLogFilesAction(partitionPath,
 instantToRollback));
+  
RollbackRequest.createRollbackRequestWithDeleteBaseAndLogFilesAction(partitionPath,
 instantToRollback));

Review comment:
   there is no rollback plan that gets serialized etc.. so this should be 
fine 

##
File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/view/RemoteHoodieTableFileSystemView.java
##
@@ -71,18 +71,18 @@
 
   public static final String PENDING_COMPACTION_OPS = String.format("%s/%s", 
BASE_URL, "compactions/pending/");
 
-  public static final String LATEST_PARTITION_DATA_FILES_URL =
-  String.format("%s/%s", BASE_URL, "datafiles/latest/partition");
-  public static final String LATEST_PARTITION_DATA_FILE_URL =
-  String.format("%s/%s", BASE_URL, "datafile/latest/partition");
-  public static final String ALL_DATA_FILES = String.format("%s/%s", BASE_URL, 
"datafiles/all");
-  public static final String LATEST_ALL_DATA_FILES = String.format("%s/%s", 
BASE_URL, "datafiles/all/latest/");
-  public static final String LATEST_DATA_FILE_ON_INSTANT_URL = 
String.format("%s/%s", BASE_URL, "datafile/on/latest/");
-
-  public static final String LATEST_DATA_FILES_RANGE_INSTANT_URL =
-  String.format("%s/%s", BASE_URL, "datafiles/range/latest/");
-  public static final String LATEST_DATA_FILES_BEFORE_ON_INSTANT_URL =
-  String.format("%s/%s", BASE_URL, "datafiles/beforeoron/latest/");
+  public static final String LATEST_PARTITION_BASE_FILES_URL =
+  String.format("%s/%s", BASE_URL, "basefiles/latest/partition");

Review comment:
   since a job is deployed in one shot, this may be ok.. no need to support 
backwards compatibility of endpoints etc 

##
File path: 
hudi-common/src/main/java/org/apache/hudi/common/util/RocksDBSchemaHelper.java
##
@@ -78,12 +78,12 @@ public String getPrefixForSliceViewByPartitionFile(String 
partitionPath, String
 return String.format("type=slice,part=%s,id=%s,instant=", partitionPath, 
fileId);
   }
 
-  public String getPrefixForDataFileViewByPartitionFile(String partitionPath, 
String fileId) {
-return String.format("type=df,part=%s,id=%s,instant=", partitionPath, 
fileId);
+  public String getPrefixForBaseFileViewByPartitionFile(String partitionPath, 
String fileId) {
+return String.format("type=bf,part=%s,id=%s,instant=", partitionPath, 
fileId);

Review comment:
   Hmmm. this should be fine, since every time we deploy we rebuild the rocksdb cache? @bvaradar ?

##
File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/view/RocksDbBasedFileSystemView.java
##
@@ -167,15 +167,15 @@ protected void storePartitionView(String partitionPath, 
List fi
 rocksDB.prefixDelete(schemaHelper.getColFamilyForView(),
 schemaHelper.getPrefixForSliceViewByPartition(partitionPath));
 rocksDB.prefixDelete(schemaHelper.getColFamilyForView(),
-schemaHelper.getPrefixForDataFileViewByPartition(partitionPath));
+schemaHelper.getPrefixForBaseFileViewByPartition(partitionPath));

Review comment:
   another thing to ensure.. no rocksdb level schema changes due to this . 
Seems fine.. 

##
File path: 
hudi-cli/src/main/java/org/apache/hudi/cli/commands/CompactionCommand.java
##
@@ -370,7 +370,7 @@ private String printCompaction(HoodieCompactionPlan 
compactionPlan,
 List<Comparable[]> rows = new ArrayList<>();
 if ((null != compactionPlan) && (null != compactionPlan.getOperations())) {
   for (HoodieCompactionOperation op : compactionPlan.getOperations()) {
-rows.add(new Comparable[]{op.getPartitionPath(), op.getFileId(), 
op.getBaseInstantTime(), op.getDataFilePath(),
+rows.add(new Comparable[]{op.getPartitionPath(), op.getFileId(), 
op.getBaseInstantTime(), op.getBaseFilePath(),

Review comment:
   cc @bvaradar is this field name serialized with the plan? if so, we 
can't change it easily 









[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1616: [HUDI-786] Fixing read beyond inline length in InlineFS

2020-05-22 Thread GitBox


bvaradar commented on a change in pull request #1616:
URL: https://github.com/apache/incubator-hudi/pull/1616#discussion_r429392893



##
File path: 
hudi-common/src/main/java/org/apache/hudi/common/fs/inline/InLineFsDataInputStream.java
##
@@ -56,24 +56,29 @@ public long getPos() throws IOException {
 
   @Override
   public int read(long position, byte[] buffer, int offset, int length) throws 
IOException {
+if ((length - offset) > this.length) {
+  throw new IOException("Attempting to read past inline content");
+}
 return outerStream.read(startOffset + position, buffer, offset, length);
   }
 
   @Override
   public void readFully(long position, byte[] buffer, int offset, int length) 
throws IOException {
+if ((length - offset) > this.length) {
+  throw new IOException("Attempting to read past inline content");
+}
 outerStream.readFully(startOffset + position, buffer, offset, length);
   }
 
   @Override
   public void readFully(long position, byte[] buffer)
   throws IOException {
-outerStream.readFully(startOffset + position, buffer, 0, buffer.length);
+readFully(position, buffer, 0, buffer.length);
   }
 
   @Override
   public boolean seekToNewSource(long targetPos) throws IOException {

Review comment:
   @nsivabalan : Can we do bounds check first and throw error if it fails 
then delegate. Even if it is a different copy, the offsets are expected to be 
consistent across copies. 
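To make the discussed check concrete: a read of `length` bytes at `position` stays inside the inline content only if `position + length` does not exceed the inline length, so a check based on `length - offset` alone can miss out-of-bounds reads at non-zero positions. A minimal standalone sketch (hypothetical helper, not the actual `InLineFsDataInputStream` code):

```java
public class InlineBoundsCheck {

    // Position-aware bounds check for an embedded (inline) byte range:
    // a read of `length` bytes starting at `position` is valid only if it
    // ends at or before `inlineLength`.
    public static boolean withinInline(long position, int length, long inlineLength) {
        return position >= 0 && length >= 0 && position + length <= inlineLength;
    }

    public static void main(String[] args) {
        // Reading all 10 bytes of a 10-byte inline region is in bounds.
        System.out.println(withinInline(0, 10, 10));  // true
        // Reading 1 byte at position 10 runs past the inline content,
        // even though length - offset alone would look harmless.
        System.out.println(withinInline(10, 1, 10));  // false
    }
}
```

The same check can run before delegating to the outer stream, which matches the suggestion of doing the bounds check first and only then delegating.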









[GitHub] [incubator-hudi] afilipchik commented on pull request #1514: [HUDI-774] Addressing incorrect Spark to Avro schema generation

2020-05-22 Thread GitBox


afilipchik commented on pull request #1514:
URL: https://github.com/apache/incubator-hudi/pull/1514#issuecomment-632839321


   @umehrot2 yep, it is an attempt to fix the schema generated by spark-avro. Moving generation in-house makes sense, but, if I recall correctly, the issue is not coming from Spark itself but from the underlying library they are using. So, it can be a bit of work to rewrite it.
   
   On the test case -> the incoming dataset is transformed using Spark SQL with a schema derived from the query result (NullTargetConverter). Then we add a new field to the output, write a batch and run a compaction. At this point the new schema can't be used to read old data, as it will fail on the new non-default fields.







[GitHub] [incubator-hudi] garyli1019 commented on pull request #1574: [HUDI-701]Add unit test for HDFSParquetImportCommand

2020-05-22 Thread GitBox


garyli1019 commented on pull request #1574:
URL: https://github.com/apache/incubator-hudi/pull/1574#issuecomment-632830706


   Thanks @hddong . I am using Spark installed by brew, and `mkdir /tmp/spark-events/` fixed the issue.
   IMO if docker is not possible then we may consider using a bootstrap script to set up the environment for testing. Something like https://github.com/apache/impala/blob/master/bin/impala-config.sh. This script would be executed before every build.
   The downside is that it will impact the user's local environment variables.
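For anyone hitting the same failure, the fix mentioned above boils down to pre-creating Spark's default event-log directory before running the tests. A minimal sketch of what such a bootstrap step might contain (assumed setup, not a script from this repo):

```shell
# Spark's event logging fails if its log directory is missing,
# so create it up front; -p makes this idempotent.
mkdir -p /tmp/spark-events
```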







[GitHub] [incubator-hudi] pratyakshsharma commented on a change in pull request #1647: [HUDI-867]: fixed IllegalArgumentException from graphite metrics in deltaStreamer continuous mode

2020-05-22 Thread GitBox


pratyakshsharma commented on a change in pull request #1647:
URL: https://github.com/apache/incubator-hudi/pull/1647#discussion_r429381950



##
File path: 
hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java
##
@@ -416,10 +425,12 @@ public DeltaSync getDeltaSync() {
   jssc.setLocalProperty("spark.scheduler.pool", 
SchedulerConfGenerator.DELTASYNC_POOL_NAME);
 }
 try {
+  int iteration = 1;
   while (!isShutdownRequested()) {
 try {
   long start = System.currentTimeMillis();
-  Option scheduledCompactionInstant = deltaSync.syncOnce();
+  HoodieMetrics.setTableName(cfg.metricsTableName + "_" + 
iteration);

Review comment:
   @leesf yes. This PR only fixes this for continuous mode of 
HoodieDeltaStreamer. If you can point me to relevant code of spark datasource 
from where this can be executed, I can try fixing there as well. 









[GitHub] [incubator-hudi] garyli1019 commented on a change in pull request #1652: [HUDI-918] Fix kafkaOffsetGen can not read kafka data bug

2020-05-22 Thread GitBox


garyli1019 commented on a change in pull request #1652:
URL: https://github.com/apache/incubator-hudi/pull/1652#discussion_r429371998



##
File path: 
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##
@@ -207,6 +208,11 @@ public KafkaOffsetGen(TypedProperties props) {
 maxEventsToReadFromKafka = (maxEventsToReadFromKafka == Long.MAX_VALUE || 
maxEventsToReadFromKafka == Integer.MAX_VALUE)
 ? Config.maxEventsFromKafkaSource : maxEventsToReadFromKafka;
 long numEvents = sourceLimit == Long.MAX_VALUE ? maxEventsToReadFromKafka 
: sourceLimit;
+
+if (numEvents < toOffsets.size()) {

Review comment:
   Do you mean something like setting sourceLimit to 5 when you have 10 Kafka partitions? Why set sourceLimit this small?
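To illustrate the failure mode behind the `numEvents < toOffsets.size()` check: if the event budget is split evenly across partitions with integer division, a sourceLimit smaller than the partition count rounds every per-partition share down to zero. A simplified sketch (not Hudi's actual offset-range logic, just the arithmetic being discussed):

```java
import java.util.Arrays;

public class EventSplit {

    // Naive even split of an event budget across Kafka partitions;
    // integer division rounds each partition's share down.
    public static long[] split(long numEvents, int numPartitions) {
        long[] alloc = new long[numPartitions];
        Arrays.fill(alloc, numEvents / numPartitions);
        return alloc;
    }

    public static void main(String[] args) {
        // A sourceLimit of 5 over 10 partitions gives 0 events per partition,
        // so no Kafka data would be read at all.
        System.out.println(Arrays.toString(split(5, 10)));
    }
}
```

Guarding against this case (or allocating at least one event per partition) avoids silently reading nothing.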









[GitHub] [incubator-hudi] codecov-commenter edited a comment on pull request #1648: [HUDI-916]: added support for multiple input formats in TimestampBasedKeyGenerator

2020-05-22 Thread GitBox


codecov-commenter edited a comment on pull request #1648:
URL: https://github.com/apache/incubator-hudi/pull/1648#issuecomment-631889483


   # 
[Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1648?src=pr=h1) 
Report
   > Merging 
[#1648](https://codecov.io/gh/apache/incubator-hudi/pull/1648?src=pr=desc) 
into 
[master](https://codecov.io/gh/apache/incubator-hudi/commit/244d47494e2d4d5b3ca60e460e1feb9351fb8e69=desc)
 will **increase** coverage by `1.48%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-hudi/pull/1648/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/incubator-hudi/pull/1648?src=pr=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #1648      +/-   ##
   ============================================
   + Coverage     16.60%   18.09%   +1.48%
   - Complexity      800      856      +56
   ============================================
     Files           344      351       +7
     Lines         15172    15436     +264
     Branches       1512     1539      +27
   ============================================
   + Hits           2520     2793     +273
   + Misses        12320    12286      -34
   - Partials        332      357      +25
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-hudi/pull/1648?src=pr=tree) | 
Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | 
[...src/main/java/org/apache/hudi/DataSourceUtils.java](https://codecov.io/gh/apache/incubator-hudi/pull/1648/diff?src=pr=tree#diff-aHVkaS1zcGFyay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9EYXRhU291cmNlVXRpbHMuamF2YQ==)
 | `31.37% <0.00%> (-0.96%)` | `0.00 <0.00> (ø)` | |
   | 
[...e/hudi/exception/HoodieDeltaStreamerException.java](https://codecov.io/gh/apache/incubator-hudi/pull/1648/diff?src=pr=tree#diff-aHVkaS1zcGFyay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9leGNlcHRpb24vSG9vZGllRGVsdGFTdHJlYW1lckV4Y2VwdGlvbi5qYXZh)
 | `0.00% <ø> (ø)` | `0.00 <0.00> (?)` | |
   | 
[...apache/hudi/keygen/TimestampBasedKeyGenerator.java](https://codecov.io/gh/apache/incubator-hudi/pull/1648/diff?src=pr=tree#diff-aHVkaS1zcGFyay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9rZXlnZW4vVGltZXN0YW1wQmFzZWRLZXlHZW5lcmF0b3IuamF2YQ==)
 | `0.00% <0.00%> (ø)` | `0.00 <0.00> (?)` | |
   | 
[...e/hudi/keygen/parser/HoodieDateTimeParserImpl.java](https://codecov.io/gh/apache/incubator-hudi/pull/1648/diff?src=pr=tree#diff-aHVkaS1zcGFyay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9rZXlnZW4vcGFyc2VyL0hvb2RpZURhdGVUaW1lUGFyc2VySW1wbC5qYXZh)
 | `0.00% <0.00%> (ø)` | `0.00 <0.00> (?)` | |
   | 
[...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/incubator-hudi/pull/1648/diff?src=pr=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29uZmlnL0hvb2RpZVdyaXRlQ29uZmlnLmphdmE=)
 | `40.07% <0.00%> (-2.19%)` | `48.00% <0.00%> (ø%)` | |
   | 
[...ain/java/org/apache/hudi/avro/HoodieAvroUtils.java](https://codecov.io/gh/apache/incubator-hudi/pull/1648/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvYXZyby9Ib29kaWVBdnJvVXRpbHMuamF2YQ==)
 | `48.09% <0.00%> (-1.91%)` | `22.00% <0.00%> (ø%)` | |
   | 
[...apache/hudi/common/fs/HoodieWrapperFileSystem.java](https://codecov.io/gh/apache/incubator-hudi/pull/1648/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL2ZzL0hvb2RpZVdyYXBwZXJGaWxlU3lzdGVtLmphdmE=)
 | `21.98% <0.00%> (-0.71%)` | `28.00% <0.00%> (-1.00%)` | |
   | 
[...rg/apache/hudi/metrics/MetricsReporterFactory.java](https://codecov.io/gh/apache/incubator-hudi/pull/1648/diff?src=pr=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0cmljcy9NZXRyaWNzUmVwb3J0ZXJGYWN0b3J5LmphdmE=)
 | `0.00% <0.00%> (ø)` | `0.00% <0.00%> (ø%)` | |
   | 
[...apache/hudi/metrics/datadog/DatadogHttpClient.java](https://codecov.io/gh/apache/incubator-hudi/pull/1648/diff?src=pr=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0cmljcy9kYXRhZG9nL0RhdGFkb2dIdHRwQ2xpZW50LmphdmE=)
 | `0.00% <0.00%> (ø)` | `0.00% <0.00%> (?%)` | |
   | ... and [23 
more](https://codecov.io/gh/apache/incubator-hudi/pull/1648/diff?src=pr=tree-more)
 | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1648?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1648?src=pr=footer).
 Last update 
[244d474...85db4a0](https://codecov.io/gh/apache/incubator-hudi/pull/1648?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   




[GitHub] [incubator-hudi] codecov-commenter edited a comment on pull request #1648: [HUDI-916]: added support for multiple input formats in TimestampBasedKeyGenerator

2020-05-22 Thread GitBox


codecov-commenter edited a comment on pull request #1648:
URL: https://github.com/apache/incubator-hudi/pull/1648#issuecomment-631889483


   # 
[Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1648?src=pr=h1) 
Report
   > Merging 
[#1648](https://codecov.io/gh/apache/incubator-hudi/pull/1648?src=pr=desc) 
into 
[master](https://codecov.io/gh/apache/incubator-hudi/commit/244d47494e2d4d5b3ca60e460e1feb9351fb8e69=desc)
 will **increase** coverage by `1.48%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-hudi/pull/1648/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/incubator-hudi/pull/1648?src=pr=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #1648      +/-   ##
   ============================================
   + Coverage     16.60%   18.09%   +1.48%
   - Complexity      800      856      +56
   ============================================
     Files           344      351       +7
     Lines         15172    15436     +264
     Branches       1512     1539      +27
   ============================================
   + Hits           2520     2793     +273
   + Misses        12320    12286      -34
   - Partials        332      357      +25
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-hudi/pull/1648?src=pr=tree) | 
Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | 
[...src/main/java/org/apache/hudi/DataSourceUtils.java](https://codecov.io/gh/apache/incubator-hudi/pull/1648/diff?src=pr=tree#diff-aHVkaS1zcGFyay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9EYXRhU291cmNlVXRpbHMuamF2YQ==)
 | `31.37% <0.00%> (-0.96%)` | `0.00 <0.00> (ø)` | |
   | 
[...e/hudi/exception/HoodieDeltaStreamerException.java](https://codecov.io/gh/apache/incubator-hudi/pull/1648/diff?src=pr=tree#diff-aHVkaS1zcGFyay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9leGNlcHRpb24vSG9vZGllRGVsdGFTdHJlYW1lckV4Y2VwdGlvbi5qYXZh)
 | `0.00% <ø> (ø)` | `0.00 <0.00> (?)` | |
   | 
[...apache/hudi/keygen/TimestampBasedKeyGenerator.java](https://codecov.io/gh/apache/incubator-hudi/pull/1648/diff?src=pr=tree#diff-aHVkaS1zcGFyay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9rZXlnZW4vVGltZXN0YW1wQmFzZWRLZXlHZW5lcmF0b3IuamF2YQ==)
 | `0.00% <0.00%> (ø)` | `0.00 <0.00> (?)` | |
   | 
[...e/hudi/keygen/parser/HoodieDateTimeParserImpl.java](https://codecov.io/gh/apache/incubator-hudi/pull/1648/diff?src=pr=tree#diff-aHVkaS1zcGFyay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9rZXlnZW4vcGFyc2VyL0hvb2RpZURhdGVUaW1lUGFyc2VySW1wbC5qYXZh)
 | `0.00% <0.00%> (ø)` | `0.00 <0.00> (?)` | |
   | 
[...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/incubator-hudi/pull/1648/diff?src=pr=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29uZmlnL0hvb2RpZVdyaXRlQ29uZmlnLmphdmE=)
 | `40.07% <0.00%> (-2.19%)` | `48.00% <0.00%> (ø%)` | |
   | 
[...ain/java/org/apache/hudi/avro/HoodieAvroUtils.java](https://codecov.io/gh/apache/incubator-hudi/pull/1648/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvYXZyby9Ib29kaWVBdnJvVXRpbHMuamF2YQ==)
 | `48.09% <0.00%> (-1.91%)` | `22.00% <0.00%> (ø%)` | |
   | 
[...apache/hudi/common/fs/HoodieWrapperFileSystem.java](https://codecov.io/gh/apache/incubator-hudi/pull/1648/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL2ZzL0hvb2RpZVdyYXBwZXJGaWxlU3lzdGVtLmphdmE=)
 | `21.98% <0.00%> (-0.71%)` | `28.00% <0.00%> (-1.00%)` | |
   | 
[...rg/apache/hudi/metrics/MetricsReporterFactory.java](https://codecov.io/gh/apache/incubator-hudi/pull/1648/diff?src=pr=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0cmljcy9NZXRyaWNzUmVwb3J0ZXJGYWN0b3J5LmphdmE=)
 | `0.00% <0.00%> (ø)` | `0.00% <0.00%> (ø%)` | |
   | 
[...g/apache/hudi/metrics/datadog/DatadogReporter.java](https://codecov.io/gh/apache/incubator-hudi/pull/1648/diff?src=pr=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0cmljcy9kYXRhZG9nL0RhdGFkb2dSZXBvcnRlci5qYXZh)
 | `0.00% <0.00%> (ø)` | `0.00% <0.00%> (?%)` | |
   | 
[...apache/hudi/metrics/datadog/DatadogHttpClient.java](https://codecov.io/gh/apache/incubator-hudi/pull/1648/diff?src=pr=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0cmljcy9kYXRhZG9nL0RhdGFkb2dIdHRwQ2xpZW50LmphdmE=)
 | `0.00% <0.00%> (ø)` | `0.00% <0.00%> (?%)` | |
   | ... and [23 
more](https://codecov.io/gh/apache/incubator-hudi/pull/1648/diff?src=pr=tree-more)
 | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1648?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1648?src=pr=footer).
 Last update 

[GitHub] [incubator-hudi] vinothchandar commented on issue #661: Tracking ticket for reporting Hudi usages from the community

2020-05-22 Thread GitBox


vinothchandar commented on issue #661:
URL: https://github.com/apache/incubator-hudi/issues/661#issuecomment-632815250


   @maduxi do you mind sharing the company name or can we add this to the site? 
   
   @sungjuly same question. Can we add this to the site? 
   
   Please let me know 







[GitHub] [incubator-hudi] vinothchandar commented on pull request #1558: [HUDI-796]: added deduping logic for upserts case

2020-05-22 Thread GitBox


vinothchandar commented on pull request #1558:
URL: https://github.com/apache/incubator-hudi/pull/1558#issuecomment-632813623


   you can hop onto travis and click restart.. CI has been pretty stable for a 
while now.. Do you know what the error is 







[GitHub] [incubator-hudi] xushiyan edited a comment on pull request #1603: [HUDI-836] Add configs for Datadog metrics reporter

2020-05-22 Thread GitBox


xushiyan edited a comment on pull request #1603:
URL: https://github.com/apache/incubator-hudi/pull/1603#issuecomment-629712980


   @yanghua This is ready for review now. But not sure if the merging is 
blocked by generating 0.5.3 docs as this is for 0.6.0 docs.







[jira] [Resolved] (HUDI-836) Implement datadog metrics reporter

2020-05-22 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu resolved HUDI-836.
-
Resolution: Implemented

> Implement datadog metrics reporter
> --
>
> Key: HUDI-836
> URL: https://issues.apache.org/jira/browse/HUDI-836
> Project: Apache Hudi (incubating)
>  Issue Type: New Feature
>  Components: Common Core
>Reporter: Raymond Xu
>Assignee: Raymond Xu
>Priority: Major
>  Labels: bug-bash-0.6.0, pull-request-available
> Fix For: 0.6.0
>
>
> To implement a new metrics reporter type for datadog API



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [incubator-hudi] pratyakshsharma commented on pull request #1558: [HUDI-796]: added deduping logic for upserts case

2020-05-22 Thread GitBox


pratyakshsharma commented on pull request #1558:
URL: https://github.com/apache/incubator-hudi/pull/1558#issuecomment-632793624


   Wondering why the tests are crashing intermittently today. This happened twice with me today. Is there any way to re-trigger the Travis build apart from re-pushing the code? @vinothchandar @yanghua







[GitHub] [incubator-hudi] xushiyan commented on pull request #1572: [HUDI-836] Implement datadog metrics reporter

2020-05-22 Thread GitBox


xushiyan commented on pull request #1572:
URL: https://github.com/apache/incubator-hudi/pull/1572#issuecomment-632789214


   > @xushiyan may be we need to doc this or write a small post?
   
   Yes @vinothchandar will do the post soon. Config docs update is in #1603 . 
As for http-client, I have been using this implementation in production, 
looking good so far. Will definitely test it with more scenarios.







[GitHub] [incubator-hudi] pratyakshsharma commented on pull request #1433: [HUDI-728]: Implement custom key generator

2020-05-22 Thread GitBox


pratyakshsharma commented on pull request #1433:
URL: https://github.com/apache/incubator-hudi/pull/1433#issuecomment-632778203


   @nsivabalan I squashed the commits and force pushed after unstaging the 2 
parquet files, but they are still showing. 







[GitHub] [incubator-hudi] vinothchandar commented on pull request #1572: [HUDI-836] Implement datadog metrics reporter

2020-05-22 Thread GitBox


vinothchandar commented on pull request #1572:
URL: https://github.com/apache/incubator-hudi/pull/1572#issuecomment-632774712


   @xushiyan maybe we need to doc this or write a small post? 
   







[GitHub] [incubator-hudi] vinothchandar merged pull request #1572: [HUDI-836] Implement datadog metrics reporter

2020-05-22 Thread GitBox


vinothchandar merged pull request #1572:
URL: https://github.com/apache/incubator-hudi/pull/1572


   







[incubator-hudi] branch master updated: [HUDI-836] Implement datadog metrics reporter (#1572)

2020-05-22 Thread vinoth
This is an automated email from the ASF dual-hosted git repository.

vinoth pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new f34de3f  [HUDI-836] Implement datadog metrics reporter (#1572)
f34de3f is described below

commit f34de3fb2738c8c36c937eba8df2a6848fafa886
Author: Raymond Xu <2701446+xushi...@users.noreply.github.com>
AuthorDate: Fri May 22 09:14:21 2020 -0700

[HUDI-836] Implement datadog metrics reporter (#1572)

- Adds support for emitting metrics to datadog
- Tests, configs..
---
 .../apache/hudi/config/HoodieMetricsConfig.java    |   3 +
 .../hudi/config/HoodieMetricsDatadogConfig.java    | 127 +++
 .../org/apache/hudi/config/HoodieWriteConfig.java  |  44 ++
 .../main/java/org/apache/hudi/metrics/Metrics.java |   2 +-
 .../hudi/metrics/MetricsReporterFactory.java       |   4 +
 .../apache/hudi/metrics/MetricsReporterType.java   |   2 +-
 .../hudi/metrics/datadog/DatadogHttpClient.java    | 127 +++
 .../metrics/datadog/DatadogMetricsReporter.java    |  93 +++
 .../hudi/metrics/datadog/DatadogReporter.java      | 171 +
 .../metrics/datadog/TestDatadogHttpClient.java     | 152 ++
 .../datadog/TestDatadogMetricsReporter.java        |  77 ++
 .../hudi/metrics/datadog/TestDatadogReporter.java  | 105 +
 packaging/hudi-spark-bundle/pom.xml                |   2 +-
 13 files changed, 906 insertions(+), 3 deletions(-)

diff --git a/hudi-client/src/main/java/org/apache/hudi/config/HoodieMetricsConfig.java b/hudi-client/src/main/java/org/apache/hudi/config/HoodieMetricsConfig.java
index 4792d6f..42555ce 100644
--- a/hudi-client/src/main/java/org/apache/hudi/config/HoodieMetricsConfig.java
+++ b/hudi-client/src/main/java/org/apache/hudi/config/HoodieMetricsConfig.java
@@ -130,6 +130,9 @@ public class HoodieMetricsConfig extends DefaultHoodieConfig {
           DEFAULT_JMX_HOST);
       setDefaultOnCondition(props, !props.containsKey(JMX_PORT), JMX_PORT,
           String.valueOf(DEFAULT_JMX_PORT));
+      MetricsReporterType reporterType = MetricsReporterType.valueOf(props.getProperty(METRICS_REPORTER_TYPE));
+      setDefaultOnCondition(props, reporterType == MetricsReporterType.DATADOG,
+          HoodieMetricsDatadogConfig.newBuilder().fromProperties(props).build());
       return config;
     }
   }
diff --git a/hudi-client/src/main/java/org/apache/hudi/config/HoodieMetricsDatadogConfig.java b/hudi-client/src/main/java/org/apache/hudi/config/HoodieMetricsDatadogConfig.java
new file mode 100644
index 000..e6dcc28
--- /dev/null
+++ b/hudi-client/src/main/java/org/apache/hudi/config/HoodieMetricsDatadogConfig.java
@@ -0,0 +1,127 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.config;
+
+import org.apache.hudi.common.config.DefaultHoodieConfig;
+
+import javax.annotation.concurrent.Immutable;
+
+import java.util.Properties;
+
+import static org.apache.hudi.config.HoodieMetricsConfig.METRIC_PREFIX;
+
+/**
+ * Configs for Datadog reporter type.
+ * 
+ * {@link org.apache.hudi.metrics.MetricsReporterType#DATADOG}
+ */
+@Immutable
+public class HoodieMetricsDatadogConfig extends DefaultHoodieConfig {
+
+  public static final String DATADOG_PREFIX = METRIC_PREFIX + ".datadog";
+  public static final String DATADOG_REPORT_PERIOD_SECONDS = DATADOG_PREFIX + ".report.period.seconds";
+  public static final int DEFAULT_DATADOG_REPORT_PERIOD_SECONDS = 30;
+  public static final String DATADOG_API_SITE = DATADOG_PREFIX + ".api.site";
+  public static final String DATADOG_API_KEY = DATADOG_PREFIX + ".api.key";
+  public static final String DATADOG_API_KEY_SKIP_VALIDATION = DATADOG_PREFIX + ".api.key.skip.validation";
+  public static final boolean DEFAULT_DATADOG_API_KEY_SKIP_VALIDATION = false;
+  public static final String DATADOG_API_KEY_SUPPLIER = DATADOG_PREFIX + ".api.key.supplier";
+  public static final String DATADOG_API_TIMEOUT_SECONDS = DATADOG_PREFIX + ".api.timeout.seconds";
+  public static final int DEFAULT_DATADOG_API_TIMEOUT_SECONDS = 3;
+  public static 

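For reference, the new keys above are supplied through Hudi's write configs. A hypothetical properties snippet (assuming `METRIC_PREFIX` resolves to `hoodie.metrics`; the API key value is a placeholder, and only keys visible in the diff are used):

```properties
# Enable metrics and select the new Datadog reporter
hoodie.metrics.on=true
hoodie.metrics.reporter.type=DATADOG
# Datadog site and key (placeholder value; supply your own)
hoodie.metrics.datadog.api.site=US
hoodie.metrics.datadog.api.key=REPLACE_WITH_YOUR_KEY
# Defaults shown in the diff: 30s report period, 3s API timeout
hoodie.metrics.datadog.report.period.seconds=30
hoodie.metrics.datadog.api.timeout.seconds=3
```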
[GitHub] [incubator-hudi] vinothchandar commented on pull request #1572: [HUDI-836] Implement datadog metrics reporter

2020-05-22 Thread GitBox


vinothchandar commented on pull request #1572:
URL: https://github.com/apache/incubator-hudi/pull/1572#issuecomment-632774153


   Sorry folks, got delayed. This looks fine at a high level. I was mulling over 
whether using http-client directly would cause any issues for us, but it seems 
ok.
   
   If something comes up, we can fix forward. 







[GitHub] [incubator-hudi] leesf commented on a change in pull request #1647: [HUDI-867]: fixed IllegalArgumentException from graphite metrics in deltaStreamer continuous mode

2020-05-22 Thread GitBox


leesf commented on a change in pull request #1647:
URL: https://github.com/apache/incubator-hudi/pull/1647#discussion_r429313527



##
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java
##
@@ -416,10 +425,12 @@ public DeltaSync getDeltaSync() {
       jssc.setLocalProperty("spark.scheduler.pool", SchedulerConfGenerator.DELTASYNC_POOL_NAME);
     }
     try {
+      int iteration = 1;
       while (!isShutdownRequested()) {
         try {
           long start = System.currentTimeMillis();
-          Option scheduledCompactionInstant = deltaSync.syncOnce();
+          HoodieMetrics.setTableName(cfg.metricsTableName + "_" + iteration);

Review comment:
   @pratyakshsharma Thanks for the fix. I have a small question: when using 
Spark Streaming to write to Hudi (which seems similar to the continuous mode of 
DeltaStreamer), will the exception happen again? 
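For context on the bug this PR addresses: Dropwizard-style metric registries throw `IllegalArgumentException` when the same metric name is registered twice, which is why the patch suffixes the table name with the iteration count in continuous mode. A minimal sketch of the failure mode and the fix; the `ToyRegistry` below is a hypothetical stand-in, not Hudi's actual `HoodieMetrics`:

```java
import java.util.HashMap;
import java.util.Map;

public class DuplicateMetricSketch {
    // Toy registry mimicking Dropwizard's behavior of rejecting
    // a metric name that has already been registered.
    static class ToyRegistry {
        private final Map<String, Long> gauges = new HashMap<>();

        void register(String name, long value) {
            if (gauges.containsKey(name)) {
                throw new IllegalArgumentException("A metric named " + name + " already exists");
            }
            gauges.put(name, value);
        }
    }

    public static void main(String[] args) {
        ToyRegistry registry = new ToyRegistry();
        String tableName = "my_table";
        // Suffixing the name per iteration, as in the patch, avoids collisions.
        for (int iteration = 1; iteration <= 3; iteration++) {
            registry.register(tableName + "_" + iteration + ".commit.duration", 100L * iteration);
        }
        // Re-registering an existing name reproduces the reported exception.
        try {
            registry.register(tableName + "_1.commit.duration", 0L);
        } catch (IllegalArgumentException e) {
            System.out.println("duplicate rejected: " + e.getMessage());
        }
    }
}
```

leesf's question above amounts to asking whether a long-lived Spark Streaming writer would hit the same duplicate-registration path.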









[GitHub] [incubator-hudi] codecov-commenter edited a comment on pull request #1433: [HUDI-728]: Implement custom key generator

2020-05-22 Thread GitBox


codecov-commenter edited a comment on pull request #1433:
URL: https://github.com/apache/incubator-hudi/pull/1433#issuecomment-631535136


   # 
[Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1433?src=pr=h1) 
Report
   > Merging 
[#1433](https://codecov.io/gh/apache/incubator-hudi/pull/1433?src=pr=desc) 
into 
[master](https://codecov.io/gh/apache/incubator-hudi/commit/25a0080b2f6ddce0e528b2a72aea33a565f0e565=desc)
 will **increase** coverage by `1.52%`.
   > The diff coverage is `8.45%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-hudi/pull/1433/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/incubator-hudi/pull/1433?src=pr=tree)
   
   ```diff
   @@ Coverage Diff  @@
   ## master#1433  +/-   ##
   
   + Coverage 16.71%   18.24%   +1.52% 
   - Complexity  795  854  +59 
   
 Files   340  347   +7 
 Lines 1503015257 +227 
 Branches   1499 1525  +26 
   
   + Hits   2512 2783 +271 
   + Misses1218812122  -66 
   - Partials330  352  +22 
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-hudi/pull/1433?src=pr=tree) | 
Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | 
[...e/hudi/exception/HoodieDeltaStreamerException.java](https://codecov.io/gh/apache/incubator-hudi/pull/1433/diff?src=pr=tree#diff-aHVkaS1zcGFyay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9leGNlcHRpb24vSG9vZGllRGVsdGFTdHJlYW1lckV4Y2VwdGlvbi5qYXZh)
 | `0.00% <ø> (ø)` | `0.00 <0.00> (?)` | |
   | 
[...va/org/apache/hudi/keygen/ComplexKeyGenerator.java](https://codecov.io/gh/apache/incubator-hudi/pull/1433/diff?src=pr=tree#diff-aHVkaS1zcGFyay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9rZXlnZW4vQ29tcGxleEtleUdlbmVyYXRvci5qYXZh)
 | `0.00% <0.00%> (ø)` | `0.00 <0.00> (ø)` | |
   | 
[...ava/org/apache/hudi/keygen/CustomKeyGenerator.java](https://codecov.io/gh/apache/incubator-hudi/pull/1433/diff?src=pr=tree#diff-aHVkaS1zcGFyay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9rZXlnZW4vQ3VzdG9tS2V5R2VuZXJhdG9yLmphdmE=)
 | `0.00% <0.00%> (ø)` | `0.00 <0.00> (?)` | |
   | 
[...g/apache/hudi/keygen/GlobalDeleteKeyGenerator.java](https://codecov.io/gh/apache/incubator-hudi/pull/1433/diff?src=pr=tree#diff-aHVkaS1zcGFyay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9rZXlnZW4vR2xvYmFsRGVsZXRlS2V5R2VuZXJhdG9yLmphdmE=)
 | `0.00% <0.00%> (ø)` | `0.00 <0.00> (ø)` | |
   | 
[...apache/hudi/keygen/NonpartitionedKeyGenerator.java](https://codecov.io/gh/apache/incubator-hudi/pull/1433/diff?src=pr=tree#diff-aHVkaS1zcGFyay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9rZXlnZW4vTm9ucGFydGl0aW9uZWRLZXlHZW5lcmF0b3IuamF2YQ==)
 | `0.00% <0.00%> (ø)` | `0.00 <0.00> (ø)` | |
   | 
[...apache/hudi/keygen/TimestampBasedKeyGenerator.java](https://codecov.io/gh/apache/incubator-hudi/pull/1433/diff?src=pr=tree#diff-aHVkaS1zcGFyay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9rZXlnZW4vVGltZXN0YW1wQmFzZWRLZXlHZW5lcmF0b3IuamF2YQ==)
 | `0.00% <0.00%> (ø)` | `0.00 <0.00> (?)` | |
   | 
[...ava/org/apache/hudi/keygen/SimpleKeyGenerator.java](https://codecov.io/gh/apache/incubator-hudi/pull/1433/diff?src=pr=tree#diff-aHVkaS1zcGFyay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9rZXlnZW4vU2ltcGxlS2V5R2VuZXJhdG9yLmphdmE=)
 | `73.68% <75.00%> (+14.86%)` | `0.00 <0.00> (ø)` | |
   | 
[...ain/java/org/apache/hudi/avro/HoodieAvroUtils.java](https://codecov.io/gh/apache/incubator-hudi/pull/1433/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvYXZyby9Ib29kaWVBdnJvVXRpbHMuamF2YQ==)
 | `48.09% <0.00%> (-5.88%)` | `22.00% <0.00%> (ø%)` | |
   | 
[...in/scala/org/apache/hudi/AvroConversionUtils.scala](https://codecov.io/gh/apache/incubator-hudi/pull/1433/diff?src=pr=tree#diff-aHVkaS1zcGFyay9zcmMvbWFpbi9zY2FsYS9vcmcvYXBhY2hlL2h1ZGkvQXZyb0NvbnZlcnNpb25VdGlscy5zY2FsYQ==)
 | `45.45% <0.00%> (-4.55%)` | `0.00% <0.00%> (ø%)` | |
   | 
[...c/main/java/org/apache/hudi/index/HoodieIndex.java](https://codecov.io/gh/apache/incubator-hudi/pull/1433/diff?src=pr=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaW5kZXgvSG9vZGllSW5kZXguamF2YQ==)
 | `36.84% <0.00%> (-4.34%)` | `3.00% <0.00%> (ø%)` | |
   | ... and [36 
more](https://codecov.io/gh/apache/incubator-hudi/pull/1433/diff?src=pr=tree-more)
 | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1433?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1433?src=pr=footer).
 Last update 

[GitHub] [incubator-hudi] codecov-commenter edited a comment on pull request #1558: [HUDI-796]: added deduping logic for upserts case

2020-05-22 Thread GitBox


codecov-commenter edited a comment on pull request #1558:
URL: https://github.com/apache/incubator-hudi/pull/1558#issuecomment-630856735


   # 
[Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1558?src=pr=h1) 
Report
   > Merging 
[#1558](https://codecov.io/gh/apache/incubator-hudi/pull/1558?src=pr=desc) 
into 
[master](https://codecov.io/gh/apache/incubator-hudi/commit/a64afdfd17ac974e451bceb877f3d40a9c775253=desc)
 will **decrease** coverage by `53.41%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-hudi/pull/1558/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/incubator-hudi/pull/1558?src=pr=tree)
   
   ```diff
   @@  Coverage Diff  @@
   ## master#1558   +/-   ##
   =
   - Coverage 71.75%   18.33%   -53.42% 
   + Complexity 1089  855  -234 
   =
 Files   385  344   -41 
 Lines 1659915167 -1432 
 Branches   1668 1512  -156 
   =
   - Hits  11910 2781 -9129 
   - Misses 396212033 +8071 
   + Partials727  353  -374 
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-hudi/pull/1558?src=pr=tree) | 
Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | 
[...n/java/org/apache/hudi/io/AppendHandleFactory.java](https://codecov.io/gh/apache/incubator-hudi/pull/1558/diff?src=pr=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaW8vQXBwZW5kSGFuZGxlRmFjdG9yeS5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | 
[.../java/org/apache/hudi/client/HoodieReadClient.java](https://codecov.io/gh/apache/incubator-hudi/pull/1558/diff?src=pr=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpZW50L0hvb2RpZVJlYWRDbGllbnQuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | 
[.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/incubator-hudi/pull/1558/diff?src=pr=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0cmljcy9NZXRyaWNzUmVwb3J0ZXIuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | 
[.../java/org/apache/hudi/common/model/ActionType.java](https://codecov.io/gh/apache/incubator-hudi/pull/1558/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0FjdGlvblR5cGUuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | 
[...java/org/apache/hudi/io/HoodieRangeInfoHandle.java](https://codecov.io/gh/apache/incubator-hudi/pull/1558/diff?src=pr=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaW8vSG9vZGllUmFuZ2VJbmZvSGFuZGxlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | 
[.../java/org/apache/hudi/hadoop/InputPathHandler.java](https://codecov.io/gh/apache/incubator-hudi/pull/1558/diff?src=pr=tree#diff-aHVkaS1oYWRvb3AtbXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaGFkb29wL0lucHV0UGF0aEhhbmRsZXIuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | 
[...a/org/apache/hudi/exception/HoodieIOException.java](https://codecov.io/gh/apache/incubator-hudi/pull/1558/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvZXhjZXB0aW9uL0hvb2RpZUlPRXhjZXB0aW9uLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | 
[...org/apache/hudi/table/action/commit/SmallFile.java](https://codecov.io/gh/apache/incubator-hudi/pull/1558/diff?src=pr=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdGFibGUvYWN0aW9uL2NvbW1pdC9TbWFsbEZpbGUuamF2YQ==)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | 
[...rg/apache/hudi/index/bloom/KeyRangeLookupTree.java](https://codecov.io/gh/apache/incubator-hudi/pull/1558/diff?src=pr=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaW5kZXgvYmxvb20vS2V5UmFuZ2VMb29rdXBUcmVlLmphdmE=)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | 
[...g/apache/hudi/exception/HoodieInsertException.java](https://codecov.io/gh/apache/incubator-hudi/pull/1558/diff?src=pr=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvZXhjZXB0aW9uL0hvb2RpZUluc2VydEV4Y2VwdGlvbi5qYXZh)
 | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | ... and [307 
more](https://codecov.io/gh/apache/incubator-hudi/pull/1558/diff?src=pr=tree-more)
 | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1558?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1558?src=pr=footer).
 Last update 


[GitHub] [incubator-hudi] hddong commented on a change in pull request #1558: [HUDI-796]: added deduping logic for upserts case

2020-05-22 Thread GitBox


hddong commented on a change in pull request #1558:
URL: https://github.com/apache/incubator-hudi/pull/1558#discussion_r429251126



##
File path: hudi-cli/src/main/java/org/apache/hudi/cli/commands/SparkMain.java
##
@@ -263,13 +265,26 @@ private static int compact(JavaSparkContext jsc, String basePath, String tableNa
   }
 
   private static int deduplicatePartitionPath(JavaSparkContext jsc, String duplicatedPartitionPath,
-      String repairedOutputPath, String basePath, String dryRun) {
+      String repairedOutputPath, String basePath, boolean dryRun, String dedupeType) {
     DedupeSparkJob job = new DedupeSparkJob(basePath, duplicatedPartitionPath, repairedOutputPath, new SQLContext(jsc),
-        FSUtils.getFs(basePath, jsc.hadoopConfiguration()));
-    job.fixDuplicates(Boolean.parseBoolean(dryRun));
+        FSUtils.getFs(basePath, jsc.hadoopConfiguration()), getDedupeType(dedupeType));
+    job.fixDuplicates(dryRun);
     return 0;
   }
 
+  private static Enumeration.Value getDedupeType(String type) {
+    switch (type) {
+      case "insertType":
+        return DeDupeType.insertType();
+      case "updateType":
+        return DeDupeType.updateType();
+      case "upsertType":
+        return DeDupeType.upsertType();
+      default:
+        throw new IllegalArgumentException("Please provide valid dedupe type!");
+    }
+  }
+

Review comment:
   @pratyakshsharma: I mean that we can use 
`DeDupeType.withName("insertType")` to convert the `String` to the `Enum`. 
The `getDedupeType` function may not be needed here.
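The suggestion above relies on Scala's `Enumeration.withName`, which does the string-to-value lookup (and throws for unknown names) without a hand-written switch. The closest Java analogue is `Enum.valueOf`; a sketch with a hypothetical enum standing in for Hudi's actual Scala `DeDupeType`:

```java
public class DedupeTypeSketch {
    // Illustrative stand-in for the Scala Enumeration used in the PR.
    enum DeDupeType { INSERT_TYPE, UPDATE_TYPE, UPSERT_TYPE }

    // Enum.valueOf converts a string to the enum constant in one call,
    // throwing IllegalArgumentException for unknown names -- the same
    // behavior the manual switch in the diff implements by hand.
    static DeDupeType parse(String type) {
        return DeDupeType.valueOf(type);
    }

    public static void main(String[] args) {
        System.out.println(parse("UPSERT_TYPE")); // prints UPSERT_TYPE
        try {
            parse("bogus");
        } catch (IllegalArgumentException e) {
            System.out.println("invalid dedupe type rejected");
        }
    }
}
```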









[GitHub] [incubator-hudi] nsivabalan commented on pull request #1648: [HUDI-916]: added support for multiple input formats in TimestampBasedKeyGenerator

2020-05-22 Thread GitBox


nsivabalan commented on pull request #1648:
URL: https://github.com/apache/incubator-hudi/pull/1648#issuecomment-632683547


   Also, for book-keeping purposes, can you fix the description of this patch? 
I guess you missed copying it from the original. 







[GitHub] [incubator-hudi] nsivabalan commented on a change in pull request #1648: [HUDI-916]: added support for multiple input formats in TimestampBasedKeyGenerator

2020-05-22 Thread GitBox


nsivabalan commented on a change in pull request #1648:
URL: https://github.com/apache/incubator-hudi/pull/1648#discussion_r429230492



##
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/keygen/TimestampBasedKeyGenerator.java
##
@@ -53,44 +56,74 @@
 
   private final TimestampType timestampType;
 
-  private SimpleDateFormat inputDateFormat;
-
   private final String outputDateFormat;
+  private DateTimeFormatter inputFormatter;
+  private final String configInputDateFormatList;
+  private final String configInputDateFormatDelimiter;
 
   // TimeZone detailed settings reference
   // https://docs.oracle.com/javase/8/docs/api/java/util/TimeZone.html
-  private final TimeZone timeZone;
+  private final DateTimeZone inputDateTimeZone;
+  private final DateTimeZone outputDateTimeZone;
 
   /**
    * Supported configs.
    */
   static class Config {
 
     // One value from TimestampType above
-    private static final String TIMESTAMP_TYPE_FIELD_PROP = "hoodie.deltastreamer.keygen.timebased.timestamp.type";
-    private static final String INPUT_TIME_UNIT =
+    public static final String TIMESTAMP_TYPE_FIELD_PROP = "hoodie.deltastreamer.keygen.timebased.timestamp.type";
+    public static final String INPUT_TIME_UNIT =
         "hoodie.deltastreamer.keygen.timebased.timestamp.scalar.time.unit";
-    private static final String TIMESTAMP_INPUT_DATE_FORMAT_PROP =
+    // This prop can now accept a list of input date formats.
+    public static final String TIMESTAMP_INPUT_DATE_FORMAT_PROP =
         "hoodie.deltastreamer.keygen.timebased.input.dateformat";
-    private static final String TIMESTAMP_OUTPUT_DATE_FORMAT_PROP =
+    public static final String TIMESTAMP_INPUT_DATE_FORMAT_LIST_DELIMITER_REGEX_PROP = "hoodie.deltastreamer.keygen.timebased.input.dateformatlistdelimiterregex";

Review comment:
   Can we add dots (".") between the words: 
hoodie.deltastreamer.keygen.timebased.input.dateformat.list.delimiter.regex

##
File path: 
hudi-utilities/src/main/java/org/apache/hudi/utilities/keygen/TimestampBasedKeyGenerator.java
##
@@ -53,44 +56,74 @@
 
   private final TimestampType timestampType;
 
-  private SimpleDateFormat inputDateFormat;
-
   private final String outputDateFormat;
+  private DateTimeFormatter inputFormatter;
+  private final String configInputDateFormatList;
+  private final String configInputDateFormatDelimiter;
 
   // TimeZone detailed settings reference
   // https://docs.oracle.com/javase/8/docs/api/java/util/TimeZone.html
-  private final TimeZone timeZone;
+  private final DateTimeZone inputDateTimeZone;
+  private final DateTimeZone outputDateTimeZone;
 
   /**
* Supported configs.
*/
   static class Config {
 
 // One value from TimestampType above
-private static final String TIMESTAMP_TYPE_FIELD_PROP = 
"hoodie.deltastreamer.keygen.timebased.timestamp.type";
-private static final String INPUT_TIME_UNIT =
+public static final String TIMESTAMP_TYPE_FIELD_PROP = 
"hoodie.deltastreamer.keygen.timebased.timestamp.type";

Review comment:
   Why public? Isn't package-private sufficient? 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [incubator-hudi] codecov-commenter edited a comment on pull request #1652: [HUDI-918] Fix kafkaOffsetGen can not read kafka data bug

2020-05-22 Thread GitBox


codecov-commenter edited a comment on pull request #1652:
URL: https://github.com/apache/incubator-hudi/pull/1652#issuecomment-632676442


   # 
[Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1652?src=pr=h1) 
Report
   > Merging 
[#1652](https://codecov.io/gh/apache/incubator-hudi/pull/1652?src=pr=desc) 
into 
[master](https://codecov.io/gh/apache/incubator-hudi/commit/459356e292ea869ffe5f39235646dc474da76ea5=desc)
 will **increase** coverage by `1.73%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-hudi/pull/1652/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/incubator-hudi/pull/1652?src=pr=tree)
   
   ```diff
   @@ Coverage Diff  @@
   ## master#1652  +/-   ##
   
   + Coverage 16.59%   18.33%   +1.73% 
   - Complexity  798  855  +57 
   
 Files   344  344  
 Lines 1516015167   +7 
 Branches   1510 1512   +2 
   
   + Hits   2516 2781 +265 
   + Misses1231412033 -281 
   - Partials330  353  +23 
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-hudi/pull/1652?src=pr=tree) | 
Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | 
[...ain/java/org/apache/hudi/avro/HoodieAvroUtils.java](https://codecov.io/gh/apache/incubator-hudi/pull/1652/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvYXZyby9Ib29kaWVBdnJvVXRpbHMuamF2YQ==)
 | `48.09% <0.00%> (-1.91%)` | `22.00% <0.00%> (ø%)` | |
   | 
[...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/incubator-hudi/pull/1652/diff?src=pr=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29uZmlnL0hvb2RpZVdyaXRlQ29uZmlnLmphdmE=)
 | `42.25% <0.00%> (-0.30%)` | `48.00% <0.00%> (+1.00%)` | :arrow_down: |
   | 
[.../apache/hudi/common/util/ObjectSizeCalculator.java](https://codecov.io/gh/apache/incubator-hudi/pull/1652/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3V0aWwvT2JqZWN0U2l6ZUNhbGN1bGF0b3IuamF2YQ==)
 | `77.61% <0.00%> (ø)` | `25.00% <0.00%> (ø%)` | |
   | 
[...che/hudi/table/action/commit/BulkInsertHelper.java](https://codecov.io/gh/apache/incubator-hudi/pull/1652/diff?src=pr=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdGFibGUvYWN0aW9uL2NvbW1pdC9CdWxrSW5zZXJ0SGVscGVyLmphdmE=)
 | `0.00% <0.00%> (ø)` | `0.00% <0.00%> (ø%)` | |
   | 
[...di/common/table/timeline/HoodieActiveTimeline.java](https://codecov.io/gh/apache/incubator-hudi/pull/1652/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3RpbWVsaW5lL0hvb2RpZUFjdGl2ZVRpbWVsaW5lLmphdmE=)
 | `28.49% <0.00%> (+0.15%)` | `17.00% <0.00%> (+1.00%)` | |
   | 
[.../table/action/commit/BaseCommitActionExecutor.java](https://codecov.io/gh/apache/incubator-hudi/pull/1652/diff?src=pr=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdGFibGUvYWN0aW9uL2NvbW1pdC9CYXNlQ29tbWl0QWN0aW9uRXhlY3V0b3IuamF2YQ==)
 | `46.01% <0.00%> (+0.48%)` | `14.00% <0.00%> (ø%)` | |
   | 
[...le/view/IncrementalTimelineSyncFileSystemView.java](https://codecov.io/gh/apache/incubator-hudi/pull/1652/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3ZpZXcvSW5jcmVtZW50YWxUaW1lbGluZVN5bmNGaWxlU3lzdGVtVmlldy5qYXZh)
 | `4.51% <0.00%> (+0.56%)` | `4.00% <0.00%> (+1.00%)` | |
   | 
[...common/table/view/AbstractTableFileSystemView.java](https://codecov.io/gh/apache/incubator-hudi/pull/1652/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3ZpZXcvQWJzdHJhY3RUYWJsZUZpbGVTeXN0ZW1WaWV3LmphdmE=)
 | `8.59% <0.00%> (+2.34%)` | `6.00% <0.00%> (+1.00%)` | |
   | 
[...i/common/table/timeline/HoodieDefaultTimeline.java](https://codecov.io/gh/apache/incubator-hudi/pull/1652/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3RpbWVsaW5lL0hvb2RpZURlZmF1bHRUaW1lbGluZS5qYXZh)
 | `52.30% <0.00%> (+4.61%)` | `28.00% <0.00%> (+4.00%)` | |
   | 
[.../main/java/org/apache/hudi/common/util/Option.java](https://codecov.io/gh/apache/incubator-hudi/pull/1652/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3V0aWwvT3B0aW9uLmphdmE=)
 | `51.35% <0.00%> (+5.40%)` | `18.00% <0.00%> (+1.00%)` | |
   | ... and [13 
more](https://codecov.io/gh/apache/incubator-hudi/pull/1652/diff?src=pr=tree-more)
 | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1652?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`

[GitHub] [incubator-hudi] codecov-commenter commented on pull request #1652: [HUDI-918] Fix kafkaOffsetGen can not read kafka data bug

2020-05-22 Thread GitBox


codecov-commenter commented on pull request #1652:
URL: https://github.com/apache/incubator-hudi/pull/1652#issuecomment-632676442


   # 
[Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1652?src=pr=h1) 
Report
   > Merging 
[#1652](https://codecov.io/gh/apache/incubator-hudi/pull/1652?src=pr=desc) 
into 
[master](https://codecov.io/gh/apache/incubator-hudi/commit/459356e292ea869ffe5f39235646dc474da76ea5=desc)
 will **increase** coverage by `1.73%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-hudi/pull/1652/graphs/tree.svg?width=650=150=pr=VTTXabwbs2)](https://codecov.io/gh/apache/incubator-hudi/pull/1652?src=pr=tree)
   
   ```diff
   @@ Coverage Diff  @@
   ## master#1652  +/-   ##
   
   + Coverage 16.59%   18.33%   +1.73% 
   - Complexity  798  855  +57 
   
 Files   344  344  
 Lines 1516015167   +7 
 Branches   1510 1512   +2 
   
   + Hits   2516 2781 +265 
   + Misses1231412033 -281 
   - Partials330  353  +23 
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-hudi/pull/1652?src=pr=tree) | 
Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | 
[...ain/java/org/apache/hudi/avro/HoodieAvroUtils.java](https://codecov.io/gh/apache/incubator-hudi/pull/1652/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvYXZyby9Ib29kaWVBdnJvVXRpbHMuamF2YQ==)
 | `48.09% <0.00%> (-1.91%)` | `22.00% <0.00%> (ø%)` | |
   | 
[...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/incubator-hudi/pull/1652/diff?src=pr=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29uZmlnL0hvb2RpZVdyaXRlQ29uZmlnLmphdmE=)
 | `42.25% <0.00%> (-0.30%)` | `48.00% <0.00%> (+1.00%)` | :arrow_down: |
   | 
[.../apache/hudi/common/util/ObjectSizeCalculator.java](https://codecov.io/gh/apache/incubator-hudi/pull/1652/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3V0aWwvT2JqZWN0U2l6ZUNhbGN1bGF0b3IuamF2YQ==)
 | `77.61% <0.00%> (ø)` | `25.00% <0.00%> (ø%)` | |
   | 
[...che/hudi/table/action/commit/BulkInsertHelper.java](https://codecov.io/gh/apache/incubator-hudi/pull/1652/diff?src=pr=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdGFibGUvYWN0aW9uL2NvbW1pdC9CdWxrSW5zZXJ0SGVscGVyLmphdmE=)
 | `0.00% <0.00%> (ø)` | `0.00% <0.00%> (ø%)` | |
   | 
[...di/common/table/timeline/HoodieActiveTimeline.java](https://codecov.io/gh/apache/incubator-hudi/pull/1652/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3RpbWVsaW5lL0hvb2RpZUFjdGl2ZVRpbWVsaW5lLmphdmE=)
 | `28.49% <0.00%> (+0.15%)` | `17.00% <0.00%> (+1.00%)` | |
   | 
[.../table/action/commit/BaseCommitActionExecutor.java](https://codecov.io/gh/apache/incubator-hudi/pull/1652/diff?src=pr=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdGFibGUvYWN0aW9uL2NvbW1pdC9CYXNlQ29tbWl0QWN0aW9uRXhlY3V0b3IuamF2YQ==)
 | `46.01% <0.00%> (+0.48%)` | `14.00% <0.00%> (ø%)` | |
   | 
[...le/view/IncrementalTimelineSyncFileSystemView.java](https://codecov.io/gh/apache/incubator-hudi/pull/1652/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3ZpZXcvSW5jcmVtZW50YWxUaW1lbGluZVN5bmNGaWxlU3lzdGVtVmlldy5qYXZh)
 | `4.51% <0.00%> (+0.56%)` | `4.00% <0.00%> (+1.00%)` | |
   | 
[...common/table/view/AbstractTableFileSystemView.java](https://codecov.io/gh/apache/incubator-hudi/pull/1652/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3ZpZXcvQWJzdHJhY3RUYWJsZUZpbGVTeXN0ZW1WaWV3LmphdmE=)
 | `8.59% <0.00%> (+2.34%)` | `6.00% <0.00%> (+1.00%)` | |
   | 
[...i/common/table/timeline/HoodieDefaultTimeline.java](https://codecov.io/gh/apache/incubator-hudi/pull/1652/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3RpbWVsaW5lL0hvb2RpZURlZmF1bHRUaW1lbGluZS5qYXZh)
 | `52.30% <0.00%> (+4.61%)` | `28.00% <0.00%> (+4.00%)` | |
   | 
[.../main/java/org/apache/hudi/common/util/Option.java](https://codecov.io/gh/apache/incubator-hudi/pull/1652/diff?src=pr=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3V0aWwvT3B0aW9uLmphdmE=)
 | `51.35% <0.00%> (+5.40%)` | `18.00% <0.00%> (+1.00%)` | |
   | ... and [13 
more](https://codecov.io/gh/apache/incubator-hudi/pull/1652/diff?src=pr=tree-more)
 | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1652?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`

[GitHub] [incubator-hudi] nsivabalan commented on pull request #1433: [HUDI-728]: Implement custom key generator

2020-05-22 Thread GitBox


nsivabalan commented on pull request #1433:
URL: https://github.com/apache/incubator-hudi/pull/1433#issuecomment-632675414


   @pratyakshsharma: can you fix the build issue? 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [incubator-hudi] sassai commented on issue #1646: [SUPPORT]: Unable to query Hive table through Spark SQL

2020-05-22 Thread GitBox


sassai commented on issue #1646:
URL: https://github.com/apache/incubator-hudi/issues/1646#issuecomment-632666310


   Hi @bvaradar,
   
   Thanks for the reply. After some digging, I found a solution. 
   
   The problem was that Spark did not load the jars specified in 
`HIVE_AUX_JARS_PATH`, and on the Cloudera Data Platform `spark-defaults.conf` 
has the following property:
   
   ```console
   
spark.sql.hive.metastore.jars=${env:HADOOP_COMMON_HOME}/../hive/lib/*:${env:HADOOP_COMMON_HOME}/client/*
   ```
   
   To fix this problem I edited the `spark-defaults.conf` via the Cloudera 
Manager and added the path to the hudi-mr-bundle jars. 
   
   Here is a brief description on how to resolve the issue on CDP:
   
   1. Go to Cloudera Manager > Cluster > Spark > Configuration > search for 
"safety".
   
   2. Edit the snippet for `spark-conf/spark-defaults.conf` and add
   
   ```console
   
spark.sql.hive.metastore.jars=${env:HADOOP_COMMON_HOME}/../hive/lib/*:${env:HADOOP_COMMON_HOME}/client/*:/shared/jars/hive/*
   ```
   
   > Note: /shared/jars/hive/* is the path containing the hudi-mr-bundle 
(HIVE_AUX_JARS_PATH)
   
   3. Restart Spark



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [incubator-hudi] maduxi edited a comment on issue #661: Tracking ticket for reporting Hudi usages from the community

2020-05-22 Thread GitBox


maduxi edited a comment on issue #661:
URL: https://github.com/apache/incubator-hudi/issues/661#issuecomment-632323358







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [incubator-hudi] pratyakshsharma commented on a change in pull request #1558: [HUDI-796]: added deduping logic for upserts case

2020-05-22 Thread GitBox


pratyakshsharma commented on a change in pull request #1558:
URL: https://github.com/apache/incubator-hudi/pull/1558#discussion_r429177537



##
File path: hudi-cli/src/main/scala/org/apache/hudi/cli/DedupeSparkJob.scala
##
@@ -98,34 +97,92 @@ class DedupeSparkJob(basePath: String,
 ON h.`_hoodie_record_key` = d.dupe_key
   """
 val dupeMap = sqlContext.sql(dupeDataSql).collectAsList().groupBy(r => 
r.getString(0))
-val fileToDeleteKeyMap = new HashMap[String, HashSet[String]]()
+getDedupePlan(dupeMap)
+  }
 
-// Mark all files except the one with latest commits for deletion
+  private def getDedupePlan(dupeMap: Map[String, Buffer[Row]]): 
HashMap[String, HashSet[String]] = {
+val fileToDeleteKeyMap = new HashMap[String, HashSet[String]]()
 dupeMap.foreach(rt => {
   val (key, rows) = rt
-  var maxCommit = -1L
-
-  rows.foreach(r => {
-val c = r(3).asInstanceOf[String].toLong
-if (c > maxCommit)
-  maxCommit = c
-  })
-
-  rows.foreach(r => {
-val c = r(3).asInstanceOf[String].toLong
-if (c != maxCommit) {
-  val f = r(2).asInstanceOf[String].split("_")(0)
-  if (!fileToDeleteKeyMap.contains(f)) {
-fileToDeleteKeyMap(f) = HashSet[String]()
-  }
-  fileToDeleteKeyMap(f).add(key)
-}
-  })
+
+  dedupeType match {
+case DeDupeType.updateType =>
+  /*
+  This corresponds to the case where all duplicates have been updated 
at least once.
+  Once updated, duplicates are bound to have same commit time unless 
forcefully modified.
+  */
+  rows.init.foreach(r => {
+val f = r(2).asInstanceOf[String].split("_")(0)
+if (!fileToDeleteKeyMap.contains(f)) {
+  fileToDeleteKeyMap(f) = HashSet[String]()
+}
+fileToDeleteKeyMap(f).add(key)
+  })
+case DeDupeType.insertType =>
+  /*
+  This corresponds to the case where duplicates got created due to 
INSERT and have never been updated.
+  */
+  var maxCommit = -1L
+
+  rows.foreach(r => {
+val c = r(3).asInstanceOf[String].toLong
+if (c > maxCommit)
+  maxCommit = c
+  })
+  rows.foreach(r => {
+val c = r(3).asInstanceOf[String].toLong
+if (c != maxCommit) {
+  val f = r(2).asInstanceOf[String].split("_")(0)
+  if (!fileToDeleteKeyMap.contains(f)) {
+fileToDeleteKeyMap(f) = HashSet[String]()
+  }
+  fileToDeleteKeyMap(f).add(key)
+}
+  })
+
+case DeDupeType.upsertType =>
+  /*
+  This corresponds to the case where duplicates got created as a 
result of inserts as well as updates,
+  i.e few duplicate records have been updated, while others were never 
updated.
+   */
+  var maxCommit = -1L
+
+  rows.foreach(r => {
+val c = r(3).asInstanceOf[String].toLong
+if (c > maxCommit)
+  maxCommit = c
+  })
+  val rowsWithMaxCommit = new ListBuffer[Row]()
+  rows.foreach(r => {
+val c = r(3).asInstanceOf[String].toLong
+if (c != maxCommit) {
+  val f = r(2).asInstanceOf[String].split("_")(0)
+  if (!fileToDeleteKeyMap.contains(f)) {
+fileToDeleteKeyMap(f) = HashSet[String]()
+  }
+  fileToDeleteKeyMap(f).add(key)
+} else {
+  rowsWithMaxCommit += r
+}
+  })
+
+  rowsWithMaxCommit.toList.init.foreach(r => {
+val f = r(2).asInstanceOf[String].split("_")(0)
+if (!fileToDeleteKeyMap.contains(f)) {
+  fileToDeleteKeyMap(f) = HashSet[String]()
+}
+fileToDeleteKeyMap(f).add(key)
+  })
+
+case _ => throw new IllegalArgumentException("Please provide valid 
type for deduping!")
+  }
 })
+LOG.debug("fileToDeleteKeyMap size : " + fileToDeleteKeyMap.size + ", map: 
" + fileToDeleteKeyMap)

Review comment:
   Done. 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [incubator-hudi] pratyakshsharma commented on a change in pull request #1558: [HUDI-796]: added deduping logic for upserts case

2020-05-22 Thread GitBox


pratyakshsharma commented on a change in pull request #1558:
URL: https://github.com/apache/incubator-hudi/pull/1558#discussion_r429176562



##
File path: hudi-cli/src/main/scala/org/apache/hudi/cli/DeDupeType.scala
##
@@ -0,0 +1,28 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.cli
+
+object DeDupeType extends Enumeration {
+
+  type dedupeType = Value
+
+  val insertType = Value("insertType")
+  val updateType = Value("updateType")
+  val upsertType = Value("upsertType")

Review comment:
   Done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [incubator-hudi] pratyakshsharma commented on a change in pull request #1558: [HUDI-796]: added deduping logic for upserts case

2020-05-22 Thread GitBox


pratyakshsharma commented on a change in pull request #1558:
URL: https://github.com/apache/incubator-hudi/pull/1558#discussion_r429174118



##
File path: 
hudi-cli/src/main/java/org/apache/hudi/cli/commands/RepairsCommand.java
##
@@ -77,7 +77,9 @@ public String deduplicate(
   help = "Spark executor memory") final String sparkMemory,
   @CliOption(key = {"dryrun"},
   help = "Should we actually remove duplicates or just run and store 
result to repairedOutputPath",
-  unspecifiedDefaultValue = "true") final boolean dryRun)
+  unspecifiedDefaultValue = "true") final boolean dryRun,
+  @CliOption(key = {"dedupeType"}, help = "Check DeDupeType.scala for 
valid values",
+  unspecifiedDefaultValue = "insertType") final String dedupeType)

Review comment:
   Done. 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [incubator-hudi] pratyakshsharma commented on a change in pull request #1558: [HUDI-796]: added deduping logic for upserts case

2020-05-22 Thread GitBox


pratyakshsharma commented on a change in pull request #1558:
URL: https://github.com/apache/incubator-hudi/pull/1558#discussion_r429171990



##
File path: hudi-cli/src/main/java/org/apache/hudi/cli/commands/SparkMain.java
##
@@ -263,13 +265,26 @@ private static int compact(JavaSparkContext jsc, String 
basePath, String tableNa
   }
 
   private static int deduplicatePartitionPath(JavaSparkContext jsc, String 
duplicatedPartitionPath,
-  String repairedOutputPath, String basePath, String dryRun) {
+  String repairedOutputPath, String basePath, boolean dryRun, String 
dedupeType) {
 DedupeSparkJob job = new DedupeSparkJob(basePath, duplicatedPartitionPath, 
repairedOutputPath, new SQLContext(jsc),
-FSUtils.getFs(basePath, jsc.hadoopConfiguration()));
-job.fixDuplicates(Boolean.parseBoolean(dryRun));
+FSUtils.getFs(basePath, jsc.hadoopConfiguration()), 
getDedupeType(dedupeType));
+job.fixDuplicates(dryRun);
 return 0;
   }
 
+  private static Enumeration.Value getDedupeType(String type) {
+switch (type) {
+  case "insertType":
+return DeDupeType.insertType();
+  case "updateType":
+return DeDupeType.updateType();
+  case "upsertType":
+return DeDupeType.upsertType();
+  default:
+throw new IllegalArgumentException("Please provide valid dedupe 
type!");
+}
+  }
+

Review comment:
   But what difference does it make? 
   
   `DeDupeType.insertType()` and `DeDupeType.withName("insertType")` both 
return the same Value. 
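
For illustration, a hypothetical Java analogue of this point: a name-based lookup such as `Enum.valueOf` performs the same resolution the manual switch in `getDedupeType` implements, including the failure on an invalid name. The names below are illustrative, not actual Hudi code:

```java
public class DedupeTypeDemo {
    enum DedupeType { insertType, updateType, upsertType }

    // Enum.valueOf does the same name-based lookup as the manual switch,
    // and likewise rejects an unknown name.
    static DedupeType parse(String type) {
        try {
            return DedupeType.valueOf(type);
        } catch (IllegalArgumentException e) {
            throw new IllegalArgumentException("Please provide valid dedupe type!", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("insertType"));  // prints insertType
    }
}
```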





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [incubator-hudi] pratyakshsharma commented on pull request #1648: [HUDI-916]: added support for multiple input formats in TimestampBasedKeyGenerator

2020-05-22 Thread GitBox


pratyakshsharma commented on pull request #1648:
URL: https://github.com/apache/incubator-hudi/pull/1648#issuecomment-632611762


   > @pratyakshsharma : was this patch already reviewed as part of #1597? or do 
I need to review it from scratch
   
   The changes in https://github.com/apache/incubator-hudi/pull/1597 were made 
by creating a separate class (MultiFormatTimestampBasedKeyGenerator). In this 
PR, I have tried to include all the changes in the already existing 
TimestampBasedKeyGenerator class, so you will need to review it from scratch. 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [incubator-hudi] pratyakshsharma commented on a change in pull request #1647: [HUDI-867]: fixed IllegalArgumentException from graphite metrics in deltaStreamer continuous mode

2020-05-22 Thread GitBox


pratyakshsharma commented on a change in pull request #1647:
URL: https://github.com/apache/incubator-hudi/pull/1647#discussion_r429158430



##
File path: 
hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java
##
@@ -416,10 +425,12 @@ public DeltaSync getDeltaSync() {
   jssc.setLocalProperty("spark.scheduler.pool", 
SchedulerConfGenerator.DELTASYNC_POOL_NAME);
 }
 try {
+  int iteration = 1;
   while (!isShutdownRequested()) {
 try {
   long start = System.currentTimeMillis();
-  Option scheduledCompactionInstant = deltaSync.syncOnce();
+  HoodieMetrics.setTableName(cfg.metricsTableName + "_" + 
iteration);

Review comment:
   The IllegalArgumentException happens because the metric names are 
generated in the same way from the table name in each run. So we need some way 
of differentiating the metric names for every run, and the easiest way to do 
that is altering the table name like "tableName_iteration". We need to make 
this change in 2 places: HoodieDeltaStreamerMetrics and HoodieMetrics. 
   The table name passed with the syncOnce() method takes care of 
HoodieDeltaStreamerMetrics only. For all the other metrics, we need to reset 
the table name in the HoodieMetrics class. 
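
A minimal sketch of the naming scheme described above, using a plain map as a stand-in for a registry that rejects duplicate metric names (the class and metric names here are illustrative, not the actual Hudi or Graphite reporter API):

```java
import java.util.HashMap;
import java.util.Map;

public class MetricNameDemo {
    // Stand-in for a metric registry that rejects duplicate names, the way
    // a Graphite-backed registry does when the same table name is reused.
    static final Map<String, Object> REGISTRY = new HashMap<>();

    static void register(String name) {
        if (REGISTRY.containsKey(name)) {
            throw new IllegalArgumentException("A metric named " + name + " already exists");
        }
        REGISTRY.put(name, new Object());
    }

    public static void main(String[] args) {
        String metricsTableName = "my_table";
        // Suffixing the table name with the iteration number keeps each
        // run's metric names unique across the continuous-mode loop.
        for (int iteration = 1; iteration <= 3; iteration++) {
            register(metricsTableName + "_" + iteration + ".commit.duration");
        }
        System.out.println(REGISTRY.size());  // prints 3
    }
}
```

Without the iteration suffix, the second loop pass would hit the duplicate-name check, which is the failure mode the comment describes.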





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [incubator-hudi] pratyakshsharma commented on a change in pull request #1647: [HUDI-867]: fixed IllegalArgumentException from graphite metrics in deltaStreamer continuous mode

2020-05-22 Thread GitBox


pratyakshsharma commented on a change in pull request #1647:
URL: https://github.com/apache/incubator-hudi/pull/1647#discussion_r429158430



##
File path: 
hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java
##
@@ -416,10 +425,12 @@ public DeltaSync getDeltaSync() {
   jssc.setLocalProperty("spark.scheduler.pool", 
SchedulerConfGenerator.DELTASYNC_POOL_NAME);
 }
 try {
+  int iteration = 1;
   while (!isShutdownRequested()) {
 try {
   long start = System.currentTimeMillis();
-  Option scheduledCompactionInstant = deltaSync.syncOnce();
+  HoodieMetrics.setTableName(cfg.metricsTableName + "_" + 
iteration);

Review comment:
   The IllegalArgumentException happens because the metric names are 
generated in the same way from the table name in each run. So we need some way 
of differentiating the metric names for every run, and the easiest way to do 
that is altering the table name like "tableName_iteration". We need to make 
this change in 2 places: HoodieDeltaStreamerMetrics and HoodieMetrics. 
   The table name passed with the syncOnce() method takes care of 
HoodieDeltaStreamerMetrics only. For all the other metrics, we need to reset 
the table name in the HoodieMetrics class. 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [incubator-hudi] pratyakshsharma commented on a change in pull request #1647: [HUDI-867]: fixed IllegalArgumentException from graphite metrics in deltaStreamer continuous mode

2020-05-22 Thread GitBox


pratyakshsharma commented on a change in pull request #1647:
URL: https://github.com/apache/incubator-hudi/pull/1647#discussion_r429158430



##
File path: 
hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java
##
@@ -416,10 +425,12 @@ public DeltaSync getDeltaSync() {
   jssc.setLocalProperty("spark.scheduler.pool", 
SchedulerConfGenerator.DELTASYNC_POOL_NAME);
 }
 try {
+  int iteration = 1;
   while (!isShutdownRequested()) {
 try {
   long start = System.currentTimeMillis();
-  Option scheduledCompactionInstant = deltaSync.syncOnce();
+  HoodieMetrics.setTableName(cfg.metricsTableName + "_" + 
iteration);

Review comment:
   The IllegalArgumentException happens because the metric names are 
generated in the same way from the table name in each run. So we need some way 
of differentiating the metric names for every run, and the easiest way to do 
that is altering the table name like "tableName_iteration". We need to make 
this change in 2 places: HoodieDeltaStreamerMetrics and HoodieMetrics. 
   The table name passed with the syncOnce() method takes care of 
HoodieDeltaStreamerMetrics only. For all the other metrics, we need to reset 
the table name in the HoodieMetrics class. 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (HUDI-473) IllegalArgumentException in QuickstartUtils

2020-05-22 Thread Bhavani Sudha (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhavani Sudha updated HUDI-473:
---
Status: In Progress  (was: Open)

> IllegalArgumentException in QuickstartUtils 
> 
>
> Key: HUDI-473
> URL: https://issues.apache.org/jira/browse/HUDI-473
> Project: Apache Hudi (incubating)
>  Issue Type: Improvement
>  Components: Usability
>Reporter: zhangpu
>Assignee: Bhavani Sudha
>Priority: Minor
>  Labels: bug-bash-0.6.0, starter
>
>  First call dataGen.generateInserts to write the data. Then another process 
> calls dataGen.generateUpdates, which throws the following exception:
> Exception in thread "main" java.lang.IllegalArgumentException: bound must be 
> positive
>   at java.util.Random.nextInt(Random.java:388)
>   at 
> org.apache.hudi.QuickstartUtils$DataGenerator.generateUpdates(QuickstartUtils.java:163)
> Is the design reasonable?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HUDI-473) IllegalArgumentException in QuickstartUtils

2020-05-22 Thread Bhavani Sudha (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17113893#comment-17113893
 ] 

Bhavani Sudha commented on HUDI-473:


[~zhangpu-paul] I believe you can reproduce this exception when calling 
dataGen.generateUpdates before any other dataGen method (including 
dataGen.generateInserts). This happens because numExistingKeys would be 0, 
since no inserts have happened yet. The QuickstartUtils class is meant 
specifically for the [Hudi quickstart 
page|https://hudi.apache.org/docs/quick-start-guide.html] and works on the 
assumption that a user trying the quickstart page would first try an insert 
and then an update. The usage also assumes someone is trying this in a 
single process. 

If you are already using it that way, can you paste the spark-shell command 
you were using when you got this exception? I can debug this further.

Otherwise, I think this exception is expected. 
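
The bound check in the stack trace above can be reproduced in isolation. This minimal sketch (the class name `ZeroBoundDemo` is illustrative, not from Hudi) shows `java.util.Random.nextInt` rejecting the zero bound that an empty key list effectively produces:

```java
import java.util.Random;

public class ZeroBoundDemo {
    // nextInt(bound) requires bound > 0; generateUpdates effectively passes
    // numExistingKeys, which is still 0 before any inserts have run.
    static boolean throwsOnZeroBound() {
        try {
            new Random().nextInt(0);
            return false;
        } catch (IllegalArgumentException e) {
            return true;  // message: "bound must be positive"
        }
    }

    public static void main(String[] args) {
        System.out.println("nextInt(0) throws: " + throwsOnZeroBound());  // prints true
    }
}
```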

> IllegalArgumentException in QuickstartUtils 
> 
>
> Key: HUDI-473
> URL: https://issues.apache.org/jira/browse/HUDI-473
> Project: Apache Hudi (incubating)
>  Issue Type: Improvement
>  Components: Usability
>Reporter: zhangpu
>Assignee: Bhavani Sudha
>Priority: Minor
>  Labels: bug-bash-0.6.0, starter
>
>  First call dataGen.generateInserts to write the data,Then another process 
> call dataGen.generateUpdates ,Throws the following exception:
> Exception in thread "main" java.lang.IllegalArgumentException: bound must be 
> positive
>   at java.util.Random.nextInt(Random.java:388)
>   at 
> org.apache.hudi.QuickstartUtils$DataGenerator.generateUpdates(QuickstartUtils.java:163)
> Is the design reasonable?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HUDI-115) Enhance OverwriteWithLatestAvroPayload to also respect ordering value of record in storage

2020-05-22 Thread Bhavani Sudha (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhavani Sudha updated HUDI-115:
---
Status: Open  (was: New)

> Enhance OverwriteWithLatestAvroPayload to also respect ordering value of 
> record in storage
> --
>
> Key: HUDI-115
> URL: https://issues.apache.org/jira/browse/HUDI-115
> Project: Apache Hudi (incubating)
>  Issue Type: Improvement
>  Components: Spark Integration
>Reporter: Vinoth Chandar
>Assignee: Bhavani Sudha
>Priority: Major
>  Labels: bug-bash-0.6.0
> Fix For: 0.6.0
>
>
> https://lists.apache.org/thread.html/45035cc88901b37e3f985b72def90ee5529c4caf87e48d650c00327d@
>  
> context here 
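A hedged sketch of the requested behavior (illustrative only; the class and method names below are made up and are not Hudi's payload API): when combining an incoming record with the one already in storage, the record with the higher ordering value (e.g. an event timestamp) should win, rather than the incoming record always overwriting.

```java
public class OrderingValueDemo {
    // Minimal stand-in for a record carrying an ordering field.
    static final class Rec {
        final long orderingVal;
        final String data;
        Rec(long orderingVal, String data) {
            this.orderingVal = orderingVal;
            this.data = data;
        }
    }

    // Keep whichever side has the larger ordering value; ties go to incoming.
    static Rec combine(Rec stored, Rec incoming) {
        return incoming.orderingVal >= stored.orderingVal ? incoming : stored;
    }

    public static void main(String[] args) {
        Rec stored = new Rec(200L, "current");
        Rec lateArrival = new Rec(100L, "stale");
        // A late-arriving record with a smaller ordering value must not
        // clobber the newer record already in storage.
        System.out.println(combine(stored, lateArrival).data); // prints "current"
    }
}
```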





[jira] [Updated] (HUDI-115) Enhance OverwriteWithLatestAvroPayload to also respect ordering value of record in storage

2020-05-22 Thread Bhavani Sudha (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhavani Sudha updated HUDI-115:
---
Labels: bug-bash-0.6.0  (was: )

> Enhance OverwriteWithLatestAvroPayload to also respect ordering value of 
> record in storage
> --
>
> Key: HUDI-115
> URL: https://issues.apache.org/jira/browse/HUDI-115
> Project: Apache Hudi (incubating)
>  Issue Type: Improvement
>  Components: Spark Integration
>Reporter: Vinoth Chandar
>Assignee: Bhavani Sudha
>Priority: Major
>  Labels: bug-bash-0.6.0
> Fix For: 0.6.0
>
>
> https://lists.apache.org/thread.html/45035cc88901b37e3f985b72def90ee5529c4caf87e48d650c00327d@
>  
> context here 





[jira] [Assigned] (HUDI-473) IllegalArgumentException in QuickstartUtils

2020-05-22 Thread Bhavani Sudha (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhavani Sudha reassigned HUDI-473:
--

Assignee: Bhavani Sudha  (was: Bhavani Sudha Saktheeswaran)

> IllegalArgumentException in QuickstartUtils 
> 
>
> Key: HUDI-473
> URL: https://issues.apache.org/jira/browse/HUDI-473
> Project: Apache Hudi (incubating)
>  Issue Type: Improvement
>  Components: Usability
>Reporter: zhangpu
>Assignee: Bhavani Sudha
>Priority: Minor
>  Labels: bug-bash-0.6.0, starter
>
>  First call dataGen.generateInserts to write the data. Then, from another 
> process, call dataGen.generateUpdates; it throws the following exception:
> Exception in thread "main" java.lang.IllegalArgumentException: bound must be 
> positive
>   at java.util.Random.nextInt(Random.java:388)
>   at 
> org.apache.hudi.QuickstartUtils$DataGenerator.generateUpdates(QuickstartUtils.java:163)
> Is the design reasonable?





[jira] [Closed] (HUDI-790) Improve OverwriteWithLatestAvroPayload to consider value on disk as well before overwriting.

2020-05-22 Thread Bhavani Sudha (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhavani Sudha closed HUDI-790.
--
Resolution: Duplicate

> Improve OverwriteWithLatestAvroPayload to consider value on disk as well 
> before overwriting.
> 
>
> Key: HUDI-790
> URL: https://issues.apache.org/jira/browse/HUDI-790
> Project: Apache Hudi (incubating)
>  Issue Type: Improvement
>  Components: Writer Core
>Reporter: Bhavani Sudha
>Assignee: Bhavani Sudha
>Priority: Minor
>






[jira] [Updated] (HUDI-790) Improve OverwriteWithLatestAvroPayload to consider value on disk as well before overwriting.

2020-05-22 Thread Bhavani Sudha (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhavani Sudha updated HUDI-790:
---
Status: Open  (was: New)

> Improve OverwriteWithLatestAvroPayload to consider value on disk as well 
> before overwriting.
> 
>
> Key: HUDI-790
> URL: https://issues.apache.org/jira/browse/HUDI-790
> Project: Apache Hudi (incubating)
>  Issue Type: Improvement
>  Components: Writer Core
>Reporter: Bhavani Sudha
>Assignee: Bhavani Sudha
>Priority: Minor
>






[GitHub] [incubator-hudi] bhasudha commented on issue #1653: [SUPPORT]: Hudi Deltastreamer OffsetoutofRange Exception reading from Kafka topic (12 partitions)

2020-05-22 Thread GitBox


bhasudha commented on issue #1653:
URL: https://github.com/apache/incubator-hudi/issues/1653#issuecomment-632516157


   @prashanthpdesai sorry, I assumed you were referring to your own 
checkpointing. Your understanding is right: checkpoints are written to the 
hoodie commit metadata after each round of a DeltaStreamer run.
   
   The exception you described can occur when the supplied offsets are larger 
or smaller than what the server has for a given partition. I suspect this may 
be because the retention policy of the Kafka topic kicked in. It should be easy 
to check: a command like ```kafka-topics.sh --bootstrap-server server_ip:9092 
--describe --topic topic_name``` will print the topic config. Can we start 
debugging from there?
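To illustrate the failure mode (assumed logic for illustration, not DeltaStreamer's actual recovery code): Kafka raises an offset-out-of-range error when a requested offset falls outside the broker's [earliest, latest] range for a partition, which is exactly what happens when retention deletes segments a saved checkpoint still points at. One possible guard is clamping the checkpointed offset into the valid range:

```java
public class OffsetGuard {
    // Clamp a checkpointed offset into the broker's valid offset range.
    static long clamp(long checkpointed, long earliest, long latest) {
        if (checkpointed < earliest) {
            return earliest; // old data expired by the retention policy
        }
        if (checkpointed > latest) {
            return latest;   // checkpoint is ahead of the log end offset
        }
        return checkpointed;
    }

    public static void main(String[] args) {
        // Retention removed everything below offset 5000, but the last
        // checkpoint recorded offset 1200 -> resume from 5000 instead.
        System.out.println(clamp(1200L, 5000L, 9000L)); // prints 5000
    }
}
```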



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org