[GitHub] [hudi] Mathieu1124 commented on a change in pull request #1842: [HUDI-1037]Introduce a write committed callback hook

2020-07-21 Thread GitBox
Mathieu1124 commented on a change in pull request #1842: URL: https://github.com/apache/hudi/pull/1842#discussion_r458552435 ## File path: hudi-client/src/main/java/org/apache/hudi/config/HoodieWriteCommitCallbackConfig.java ## @@ -0,0 +1,106 @@ +/* + * Licensed to the Apache

[jira] [Updated] (HUDI-1050) Support filter pushdown and column pruning for MOR table on Spark Datasource

2020-07-21 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1050: - Fix Version/s: (was: 0.6.1) 0.6.0 > Support filter pushdown and column

[GitHub] [hudi] Mathieu1124 commented on a change in pull request #1842: [HUDI-1037]Introduce a write committed callback hook

2020-07-21 Thread GitBox
Mathieu1124 commented on a change in pull request #1842: URL: https://github.com/apache/hudi/pull/1842#discussion_r458550790 ## File path: hudi-client/src/main/java/org/apache/hudi/config/HoodieWriteCommitCallbackConfig.java ## @@ -0,0 +1,106 @@ +/* + * Licensed to the Apache

[jira] [Updated] (HUDI-781) Re-design test utilities

2020-07-21 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-781: Status: In Progress (was: Open) > Re-design test utilities > > >

[jira] [Assigned] (HUDI-781) Re-design test utilities

2020-07-21 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu reassigned HUDI-781: --- Assignee: Raymond Xu > Re-design test utilities > > > Key:

[jira] [Updated] (HUDI-781) Re-design test utilities

2020-07-21 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-781: Labels: pull-request-available (was: ) > Re-design test utilities > > >

[GitHub] [hudi] xushiyan opened a new pull request #1861: [HUDI-781] [WIP] Refactor test utils classes

2020-07-21 Thread GitBox
xushiyan opened a new pull request #1861: URL: https://github.com/apache/hudi/pull/1861 ## Committer checklist - [ ] Has a corresponding JIRA in PR title & commit - [ ] Commit message is descriptive of the change - [ ] CI is green - [ ] Necessary

[jira] [Updated] (HUDI-896) Parallelize CI testing to reduce CI wait time

2020-07-21 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-896: Status: In Progress (was: Open) > Parallelize CI testing to reduce CI wait time >

[jira] [Resolved] (HUDI-896) Parallelize CI testing to reduce CI wait time

2020-07-21 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu resolved HUDI-896. - Resolution: Done > Parallelize CI testing to reduce CI wait time >

[GitHub] [hudi] stackfun opened a new issue #1860: [SUPPORT] Issue when querying from Spark Datasource if COW table is being written to at the same time

2020-07-21 Thread GitBox
stackfun opened a new issue #1860: URL: https://github.com/apache/hudi/issues/1860 **Describe the problem you faced** In one pyspark job, I'm appending 10 rows to a COW table in a loop In another pyspark job, I'm doing a select count(*) on the same table in another loop.

Build failed in Jenkins: hudi-snapshot-deployment-0.5 #346

2020-07-21 Thread Apache Jenkins Server
See Changes: -- [...truncated 2.34 KB...] /home/jenkins/tools/maven/apache-maven-3.5.4/conf: logging settings.xml toolchains.xml

[GitHub] [hudi] satishkotha commented on pull request #1859: [HUDI-1072] Use replace metadata file to filter excluded files in views

2020-07-21 Thread GitBox
satishkotha commented on pull request #1859: URL: https://github.com/apache/hudi/pull/1859#issuecomment-662223025 > Reviewed 50%, high level, I feel the changes of excludeFileGroups is being forced into many of the `TableFileSystem` implementations. Need to think more if there is a way to

[GitHub] [hudi] satishkotha commented on a change in pull request #1859: [HUDI-1072] Use replace metadata file to filter excluded files in views

2020-07-21 Thread GitBox
satishkotha commented on a change in pull request #1859: URL: https://github.com/apache/hudi/pull/1859#discussion_r458513411 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/view/AbstractTableFileSystemView.java ## @@ -103,14 +105,19 @@ protected void

[GitHub] [hudi] satishkotha commented on pull request #1853: [HUDI-1072] Add replace metadata file to timeline

2020-07-21 Thread GitBox
satishkotha commented on pull request #1853: URL: https://github.com/apache/hudi/pull/1853#issuecomment-662220607 > High level, introducing `replace` action changes seem fine to me, interested in learning how old_file_group -> new_file_group mapping is stored and accessed. Yet to review

[GitHub] [hudi] satishkotha commented on a change in pull request #1853: [HUDI-1072] Add replace metadata file to timeline

2020-07-21 Thread GitBox
satishkotha commented on a change in pull request #1853: URL: https://github.com/apache/hudi/pull/1853#discussion_r458511328 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieTimeline.java ## @@ -126,6 +129,13 @@ */ HoodieTimeline

[GitHub] [hudi] satishkotha commented on a change in pull request #1853: [HUDI-1072] Add replace metadata file to timeline

2020-07-21 Thread GitBox
satishkotha commented on a change in pull request #1853: URL: https://github.com/apache/hudi/pull/1853#discussion_r458510963 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieActiveTimeline.java ## @@ -65,7 +65,8 @@ COMMIT_EXTENSION,

[GitHub] [hudi] satishkotha commented on a change in pull request #1853: [HUDI-1072] Add replace metadata file to timeline

2020-07-21 Thread GitBox
satishkotha commented on a change in pull request #1853: URL: https://github.com/apache/hudi/pull/1853#discussion_r458510585 ## File path: hudi-common/src/main/avro/HoodieReplaceMetadata.avsc ## @@ -0,0 +1,44 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

[GitHub] [hudi] nsivabalan commented on a change in pull request #1858: [WIP] [1014] Part 1: Adding Upgrade or downgrade infra

2020-07-21 Thread GitBox
nsivabalan commented on a change in pull request #1858: URL: https://github.com/apache/hudi/pull/1858#discussion_r458509862 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableConfig.java ## @@ -151,6 +154,27 @@ public HoodieTableType

[GitHub] [hudi] n3nash commented on a change in pull request #1859: [HUDI-1072] Use replace metadata file to filter excluded files in views

2020-07-21 Thread GitBox
n3nash commented on a change in pull request #1859: URL: https://github.com/apache/hudi/pull/1859#discussion_r458490961 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/view/AbstractTableFileSystemView.java ## @@ -103,14 +105,19 @@ protected void

[GitHub] [hudi] leesf commented on a change in pull request #1851: [HUDI-1113] Add user define metrics reporter

2020-07-21 Thread GitBox
leesf commented on a change in pull request #1851: URL: https://github.com/apache/hudi/pull/1851#discussion_r458478829 ## File path: hudi-client/src/main/java/org/apache/hudi/config/HoodieMetricsConfig.java ## @@ -58,6 +59,12 @@ public static final String

[GitHub] [hudi] n3nash commented on a change in pull request #1859: [HUDI-1072] Use replace metadata file to filter excluded files in views

2020-07-21 Thread GitBox
n3nash commented on a change in pull request #1859: URL: https://github.com/apache/hudi/pull/1859#discussion_r458478588 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/view/AbstractTableFileSystemView.java ## @@ -103,14 +105,19 @@ protected void

[GitHub] [hudi] leesf commented on a change in pull request #1851: [HUDI-1113] Add user define metrics reporter

2020-07-21 Thread GitBox
leesf commented on a change in pull request #1851: URL: https://github.com/apache/hudi/pull/1851#discussion_r458478400 ## File path: hudi-client/src/main/java/org/apache/hudi/config/HoodieMetricsConfig.java ## @@ -58,6 +59,12 @@ public static final String

[GitHub] [hudi] leesf commented on a change in pull request #1851: [HUDI-1113] Add user define metrics reporter

2020-07-21 Thread GitBox
leesf commented on a change in pull request #1851: URL: https://github.com/apache/hudi/pull/1851#discussion_r458477826 ## File path: hudi-client/src/main/java/org/apache/hudi/metrics/userdefined/DefaultUserDefinedMetricsReporter.java ## @@ -0,0 +1,48 @@ +/* + * Licensed to

[GitHub] [hudi] leesf commented on a change in pull request #1851: [HUDI-1113] Add user define metrics reporter

2020-07-21 Thread GitBox
leesf commented on a change in pull request #1851: URL: https://github.com/apache/hudi/pull/1851#discussion_r458477465 ## File path: hudi-client/src/main/java/org/apache/hudi/metrics/MetricsReporterType.java ## @@ -22,5 +22,5 @@ * Types of the reporter. Right now we only

[GitHub] [hudi] leesf commented on a change in pull request #1851: [HUDI-1113] Add user define metrics reporter

2020-07-21 Thread GitBox
leesf commented on a change in pull request #1851: URL: https://github.com/apache/hudi/pull/1851#discussion_r458477384 ## File path: hudi-client/src/main/java/org/apache/hudi/metrics/MetricsReporterFactory.java ## @@ -48,6 +51,10 @@ public static MetricsReporter

[GitHub] [hudi] n3nash commented on a change in pull request #1853: [HUDI-1072] Add replace metadata file to timeline

2020-07-21 Thread GitBox
n3nash commented on a change in pull request #1853: URL: https://github.com/apache/hudi/pull/1853#discussion_r458476891 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieTimeline.java ## @@ -126,6 +129,13 @@ */ HoodieTimeline

[GitHub] [hudi] n3nash commented on a change in pull request #1853: [HUDI-1072] Add replace metadata file to timeline

2020-07-21 Thread GitBox
n3nash commented on a change in pull request #1853: URL: https://github.com/apache/hudi/pull/1853#discussion_r458476453 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieActiveTimeline.java ## @@ -65,7 +65,8 @@ COMMIT_EXTENSION,

[GitHub] [hudi] n3nash commented on a change in pull request #1853: [HUDI-1072] Add replace metadata file to timeline

2020-07-21 Thread GitBox
n3nash commented on a change in pull request #1853: URL: https://github.com/apache/hudi/pull/1853#discussion_r458476085 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieActiveTimeline.java ## @@ -304,6 +305,22 @@ public HoodieInstant

[GitHub] [hudi] n3nash commented on a change in pull request #1853: [HUDI-1072] Add replace metadata file to timeline

2020-07-21 Thread GitBox
n3nash commented on a change in pull request #1853: URL: https://github.com/apache/hudi/pull/1853#discussion_r458475548 ## File path: hudi-common/src/main/avro/HoodieReplaceMetadata.avsc ## @@ -0,0 +1,44 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + *

[GitHub] [hudi] yanghua commented on a change in pull request #1770: [HUDI-708]Add temps show and unit test for TempViewCommand

2020-07-21 Thread GitBox
yanghua commented on a change in pull request #1770: URL: https://github.com/apache/hudi/pull/1770#discussion_r458466053 ## File path: hudi-cli/src/main/java/org/apache/hudi/cli/commands/TempViewCommand.java ## @@ -20,36 +20,55 @@ import org.apache.hudi.cli.HoodieCLI;

[GitHub] [hudi] bvaradar commented on issue #1852: [SUPPORT]

2020-07-21 Thread GitBox
bvaradar commented on issue #1852: URL: https://github.com/apache/hudi/issues/1852#issuecomment-662177092 MacBook-Pro:hudi balaji.varadarajan$ grep -c '\.clean.requested' ~/Downloads/dot_hoodie_folder.txt 16 MacBook-Pro:hudi balaji.varadarajan$ grep -c '\.deltacommit.requested'

[GitHub] [hudi] satishkotha opened a new pull request #1859: [HUDI-1072] Use replace metadata file to filter excluded files in views

2020-07-21 Thread GitBox
satishkotha opened a new pull request #1859: URL: https://github.com/apache/hudi/pull/1859 ## What is the purpose of the pull request Follow up on #1853 Use metadata and filter excluded files from views. Changed base views. If general approach looks good, I can update

[GitHub] [hudi] bvaradar commented on issue #1825: [SUPPORT] Compaction of parquet and meta file

2020-07-21 Thread GitBox
bvaradar commented on issue #1825: URL: https://github.com/apache/hudi/issues/1825#issuecomment-662170951 With 0.5.[1/2], Hudi stopped using renames for state transition. Hence, you are seeing separate state files for each action. All these files (except rollback) will be cleaned up as

[jira] [Updated] (HUDI-1118) Cleanup rollback files residing in .hoodie folder

2020-07-21 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-1118: - Fix Version/s: (was: 0.6.1) 0.6.0 > Cleanup rollback files

[jira] [Updated] (HUDI-1118) Cleanup rollback files residing in .hoodie folder

2020-07-21 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-1118: - Status: Open (was: New) > Cleanup rollback files residing in .hoodie folder >

[jira] [Created] (HUDI-1118) Cleanup rollback files residing in .hoodie folder

2020-07-21 Thread Balaji Varadarajan (Jira)
Balaji Varadarajan created HUDI-1118: Summary: Cleanup rollback files residing in .hoodie folder Key: HUDI-1118 URL: https://issues.apache.org/jira/browse/HUDI-1118 Project: Apache Hudi

[jira] [Comment Edited] (HUDI-1117) Add tdunning json library to spark and utilities bundle

2020-07-21 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17162402#comment-17162402 ] Balaji Varadarajan edited comment on HUDI-1117 at 7/22/20, 12:07 AM: -

[jira] [Commented] (HUDI-1117) Add tdunning json library to spark and utilities bundle

2020-07-21 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17162402#comment-17162402 ] Balaji Varadarajan commented on HUDI-1117: -- This can also be potentially solved by including

[jira] [Assigned] (HUDI-1117) Add tdunning json library to spark and utilities bundle

2020-07-21 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan reassigned HUDI-1117: Assignee: Balaji Varadarajan > Add tdunning json library to spark and utilities

[GitHub] [hudi] bvaradar commented on issue #1787: Exception During Insert

2020-07-21 Thread GitBox
bvaradar commented on issue #1787: URL: https://github.com/apache/hudi/issues/1787#issuecomment-662166742 @asheeshgarg : JSONException class is coming from https://mvnrepository.com/artifact/org.json/json There is licensing issue and hence not part of hudi bundle packages. The underlying

[jira] [Updated] (HUDI-1117) Add tdunning json library to spark and utilities bundle

2020-07-21 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-1117: - Status: Open (was: New) > Add tdunning json library to spark and utilities bundle >

[jira] [Created] (HUDI-1117) Add tdunning json library to spark and utilities bundle

2020-07-21 Thread Balaji Varadarajan (Jira)
Balaji Varadarajan created HUDI-1117: Summary: Add tdunning json library to spark and utilities bundle Key: HUDI-1117 URL: https://issues.apache.org/jira/browse/HUDI-1117 Project: Apache Hudi

[GitHub] [hudi] yihua commented on a change in pull request #1149: [HUDI-472] Introduce configurations and new modes of sorting for bulk_insert

2020-07-21 Thread GitBox
yihua commented on a change in pull request #1149: URL: https://github.com/apache/hudi/pull/1149#discussion_r458381933 ## File path: hudi-client/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java ## @@ -245,6 +250,16 @@ public int getMaxConsistencyCheckIntervalMs() {

[GitHub] [hudi] vinothchandar commented on a change in pull request #1858: [WIP] [1014] Part 1: Adding Upgrade or downgrade infra

2020-07-21 Thread GitBox
vinothchandar commented on a change in pull request #1858: URL: https://github.com/apache/hudi/pull/1858#discussion_r458412092 ## File path: hudi-client/src/main/java/org/apache/hudi/table/UpgradeDowngradeHelper.java ## @@ -0,0 +1,175 @@ +/* + * Licensed to the Apache

[GitHub] [hudi] vinothchandar commented on a change in pull request #1858: [WIP] [1014] Part 1: Adding Upgrade or downgrade infra

2020-07-21 Thread GitBox
vinothchandar commented on a change in pull request #1858: URL: https://github.com/apache/hudi/pull/1858#discussion_r458411863 ## File path: hudi-client/src/main/java/org/apache/hudi/client/AbstractHoodieWriteClient.java ## @@ -190,6 +192,7 @@ public HoodieMetrics

[GitHub] [hudi] nsivabalan commented on a change in pull request #1858: [WIP] [1014] Part 1: Adding Upgrade or downgrade infra

2020-07-21 Thread GitBox
nsivabalan commented on a change in pull request #1858: URL: https://github.com/apache/hudi/pull/1858#discussion_r458365915 ## File path: hudi-client/src/main/java/org/apache/hudi/client/AbstractHoodieWriteClient.java ## @@ -190,6 +192,7 @@ public HoodieMetrics getMetrics() {

[GitHub] [hudi] vinothchandar commented on pull request #1765: [HUDI-1049] 0.5.3 Patch - In inline compaction mode, previously failed compactions needs to be retried before new compactions

2020-07-21 Thread GitBox
vinothchandar commented on pull request #1765: URL: https://github.com/apache/hudi/pull/1765#issuecomment-662094985 Closing this in favor of #1857 This is an automated message from the Apache Git Service. To respond to the

[GitHub] [hudi] vinothchandar closed pull request #1765: [HUDI-1049] 0.5.3 Patch - In inline compaction mode, previously failed compactions needs to be retried before new compactions

2020-07-21 Thread GitBox
vinothchandar closed pull request #1765: URL: https://github.com/apache/hudi/pull/1765

[GitHub] [hudi] nsivabalan opened a new pull request #1858: [WIP] [1014] Part 1: Adding Upgrade or downgrade infra

2020-07-21 Thread GitBox
nsivabalan opened a new pull request #1858: URL: https://github.com/apache/hudi/pull/1858 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull request.* ## What is the purpose of the

[jira] [Resolved] (HUDI-92) Include custom names for spark HUDI spark DAG stages for easier understanding

2020-07-21 Thread Prashant Wason (Jira)
[ https://issues.apache.org/jira/browse/HUDI-92?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prashant Wason resolved HUDI-92. Resolution: Fixed > Include custom names for spark HUDI spark DAG stages for easier understanding >

[GitHub] [hudi] prashantwason commented on a change in pull request #1804: [HUDI-960] Implementation of the HFile base and log file format.

2020-07-21 Thread GitBox
prashantwason commented on a change in pull request #1804: URL: https://github.com/apache/hudi/pull/1804#discussion_r458355792 ## File path: hudi-client/src/main/java/org/apache/hudi/io/HoodieSortedMergeHandle.java ## @@ -0,0 +1,125 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] prashantwason commented on a change in pull request #1804: [HUDI-960] Implementation of the HFile base and log file format.

2020-07-21 Thread GitBox
prashantwason commented on a change in pull request #1804: URL: https://github.com/apache/hudi/pull/1804#discussion_r458355367 ## File path: hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieHFileReader.java ## @@ -0,0 +1,301 @@ +/* + * Licensed to the Apache

[GitHub] [hudi] prashantwason commented on a change in pull request #1804: [HUDI-960] Implementation of the HFile base and log file format.

2020-07-21 Thread GitBox
prashantwason commented on a change in pull request #1804: URL: https://github.com/apache/hudi/pull/1804#discussion_r458355188 ## File path: hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieFileReader.java ## @@ -34,7 +35,17 @@ public Set filterRowKeys(Set

[GitHub] [hudi] prashantwason commented on a change in pull request #1804: [HUDI-960] Implementation of the HFile base and log file format.

2020-07-21 Thread GitBox
prashantwason commented on a change in pull request #1804: URL: https://github.com/apache/hudi/pull/1804#discussion_r458355532 ## File path: hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieHFileReader.java ## @@ -0,0 +1,301 @@ +/* + * Licensed to the Apache

[GitHub] [hudi] prashantwason commented on a change in pull request #1804: [HUDI-960] Implementation of the HFile base and log file format.

2020-07-21 Thread GitBox
prashantwason commented on a change in pull request #1804: URL: https://github.com/apache/hudi/pull/1804#discussion_r458354886 ## File path: hudi-client/src/main/java/org/apache/hudi/io/HoodieSortedMergeHandle.java ## @@ -0,0 +1,125 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] xushiyan commented on issue #1846: [SUPPORT] HoodieSnapshotCopier example

2020-07-21 Thread GitBox
xushiyan commented on issue #1846: URL: https://github.com/apache/hudi/issues/1846#issuecomment-662059975 @tooptoop4 actually you should be able to achieve that with `HoodieDeltaStreamer`: just point the source to the existing hudi table and write to another dir, make sure set file size

[GitHub] [hudi] prashantwason commented on a change in pull request #1804: [HUDI-960] Implementation of the HFile base and log file format.

2020-07-21 Thread GitBox
prashantwason commented on a change in pull request #1804: URL: https://github.com/apache/hudi/pull/1804#discussion_r458319202 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/log/block/HoodieLogBlock.java ## @@ -110,7 +110,7 @@ public long

[jira] [Assigned] (HUDI-767) Support transformation when export to Hudi

2020-07-21 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu reassigned HUDI-767: --- Assignee: (was: Raymond Xu) > Support transformation when export to Hudi >

[GitHub] [hudi] xushiyan commented on issue #1846: [SUPPORT] HoodieSnapshotCopier example

2020-07-21 Thread GitBox
xushiyan commented on issue #1846: URL: https://github.com/apache/hudi/issues/1846#issuecomment-662038478 > @xushiyan I want to replace contents of existing table. ie read existing 10k small files from tableA and replace tableA with 20 big files @tooptoop4 as i mentioned,

[GitHub] [hudi] tooptoop4 commented on issue #1846: [SUPPORT] HoodieSnapshotCopier example

2020-07-21 Thread GitBox
tooptoop4 commented on issue #1846: URL: https://github.com/apache/hudi/issues/1846#issuecomment-662028709 @xushiyan I want to replace contents of existing table. ie read existing 10k small files from tableA and replace tableA with 20 big files

[GitHub] [hudi] vinothchandar opened a new pull request #1857: [HUDI-1029] In inline compaction mode, previously failed compactions …

2020-07-21 Thread GitBox
vinothchandar opened a new pull request #1857: URL: https://github.com/apache/hudi/pull/1857 …needs to be retried before new compactions - Prevents failed compactions from causing issues with future commits - Need to add tests ## *Tips* - *Thank you very much for

[hudi] branch hudi_test_suite_refactor updated (247d923 -> ea2c616)

2020-07-21 Thread nagarwal
This is an automated email from the ASF dual-hosted git repository. nagarwal pushed a change to branch hudi_test_suite_refactor in repository https://gitbox.apache.org/repos/asf/hudi.git. discard 247d923 [HUDI-394] Provide a basic implementation of test suite add ea2c616 [HUDI-394]

[GitHub] [hudi] xushiyan commented on issue #1846: [SUPPORT] HoodieSnapshotCopier example

2020-07-21 Thread GitBox
xushiyan commented on issue #1846: URL: https://github.com/apache/hudi/issues/1846#issuecomment-662007289 > Can I use it to read all 0.4.6 COW hoodie data from one path and write back into less files in 0.5.3 format on same path? IIUC, this is to perform write operation from one

[hudi] branch master updated: [HUDI-994] Move TestHoodieIndex test cases to unit tests (#1850)

2020-07-21 Thread vinoth
This is an automated email from the ASF dual-hosted git repository. vinoth pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 5e7ab11 [HUDI-994] Move TestHoodieIndex test

[GitHub] [hudi] vinothchandar merged pull request #1850: [HUDI-994] Move TestHoodieIndex test cases to unit tests

2020-07-21 Thread GitBox
vinothchandar merged pull request #1850: URL: https://github.com/apache/hudi/pull/1850

[GitHub] [hudi] garyli1019 commented on a change in pull request #1848: [HUDI-69] Support Spark Datasource for MOR table - RDD approach

2020-07-21 Thread GitBox
garyli1019 commented on a change in pull request #1848: URL: https://github.com/apache/hudi/pull/1848#discussion_r458247881 ## File path: hudi-spark/src/main/scala/org/apache/hudi/HudiMergeOnReadRDD.scala ## @@ -0,0 +1,195 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [hudi] garyli1019 commented on a change in pull request #1848: [HUDI-69] Support Spark Datasource for MOR table - RDD approach

2020-07-21 Thread GitBox
garyli1019 commented on a change in pull request #1848: URL: https://github.com/apache/hudi/pull/1848#discussion_r458245661 ## File path: hudi-spark/src/main/scala/org/apache/hudi/HudiMergeOnReadRDD.scala ## @@ -0,0 +1,195 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [hudi] garyli1019 commented on a change in pull request #1848: [HUDI-69] Support Spark Datasource for MOR table - RDD approach

2020-07-21 Thread GitBox
garyli1019 commented on a change in pull request #1848: URL: https://github.com/apache/hudi/pull/1848#discussion_r458245661 ## File path: hudi-spark/src/main/scala/org/apache/hudi/HudiMergeOnReadRDD.scala ## @@ -0,0 +1,195 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [hudi] garyli1019 commented on a change in pull request #1848: [HUDI-69] Support Spark Datasource for MOR table - RDD approach

2020-07-21 Thread GitBox
garyli1019 commented on a change in pull request #1848: URL: https://github.com/apache/hudi/pull/1848#discussion_r458242933 ## File path: hudi-spark/src/main/scala/org/apache/hudi/HudiMergeOnReadRDD.scala ## @@ -0,0 +1,195 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [hudi] garyli1019 commented on a change in pull request #1848: [HUDI-69] Support Spark Datasource for MOR table - RDD approach

2020-07-21 Thread GitBox
garyli1019 commented on a change in pull request #1848: URL: https://github.com/apache/hudi/pull/1848#discussion_r458241891 ## File path: hudi-spark/src/main/scala/org/apache/hudi/HudiMergeOnReadRDD.scala ## @@ -0,0 +1,195 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [hudi] garyli1019 commented on a change in pull request #1848: [HUDI-69] Support Spark Datasource for MOR table - RDD approach

2020-07-21 Thread GitBox
garyli1019 commented on a change in pull request #1848: URL: https://github.com/apache/hudi/pull/1848#discussion_r458237536 ## File path: hudi-spark/src/main/scala/org/apache/hudi/SnapshotRelation.scala ## @@ -0,0 +1,113 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [hudi] ssomuah edited a comment on issue #1852: [SUPPORT]

2020-07-21 Thread GitBox
ssomuah edited a comment on issue #1852: URL: https://github.com/apache/hudi/issues/1852#issuecomment-661970919 I don't see any exceptions in the driver logs or executor logs. I see these two warnings in driver logs ``` 20/07/21 13:12:28 WARN

[GitHub] [hudi] ssomuah commented on issue #1852: [SUPPORT]

2020-07-21 Thread GitBox
ssomuah commented on issue #1852: URL: https://github.com/apache/hudi/issues/1852#issuecomment-661970919 I don't see any exceptions in the driver logs or executor logs. I see these two warnings in driver logs ``` 20/07/21 13:12:28 WARN IncrementalTimelineSyncFileSystemView:

[GitHub] [hudi] garyli1019 commented on a change in pull request #1848: [HUDI-69] Support Spark Datasource for MOR table - RDD approach

2020-07-21 Thread GitBox
garyli1019 commented on a change in pull request #1848: URL: https://github.com/apache/hudi/pull/1848#discussion_r458232450 ## File path: hudi-spark/src/main/scala/org/apache/hudi/SnapshotRelation.scala ## @@ -0,0 +1,113 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [hudi] garyli1019 commented on a change in pull request #1848: [HUDI-69] Support Spark Datasource for MOR table - RDD approach

2020-07-21 Thread GitBox
garyli1019 commented on a change in pull request #1848: URL: https://github.com/apache/hudi/pull/1848#discussion_r458230756 ## File path: hudi-spark/src/main/scala/org/apache/hudi/SnapshotRelation.scala ## @@ -0,0 +1,113 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [hudi] garyli1019 commented on a change in pull request #1848: [HUDI-69] Support Spark Datasource for MOR table - RDD approach

2020-07-21 Thread GitBox
garyli1019 commented on a change in pull request #1848: URL: https://github.com/apache/hudi/pull/1848#discussion_r458224048 ## File path: hudi-spark/src/main/scala/org/apache/hudi/DataSourceOptions.scala ## @@ -110,6 +112,10 @@ object DataSourceReadOptions { */ val

[GitHub] [hudi] tooptoop4 commented on issue #1825: [SUPPORT] Compaction of parquet and meta file

2020-07-21 Thread GitBox
tooptoop4 commented on issue #1825: URL: https://github.com/apache/hudi/issues/1825#issuecomment-661955977 i'm facing the same entries under .hoodie

[GitHub] [hudi] garyli1019 commented on a change in pull request #1848: [HUDI-69] Support Spark Datasource for MOR table - RDD approach

2020-07-21 Thread GitBox
garyli1019 commented on a change in pull request #1848: URL: https://github.com/apache/hudi/pull/1848#discussion_r458215093 ## File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/realtime/AbstractRealtimeRecordReader.java ## @@ -147,12 +146,4 @@ public Schema

[GitHub] [hudi] garyli1019 commented on a change in pull request #1848: [HUDI-69] Support Spark Datasource for MOR table - RDD approach

2020-07-21 Thread GitBox
garyli1019 commented on a change in pull request #1848: URL: https://github.com/apache/hudi/pull/1848#discussion_r458213831 ## File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/config/HadoopSerializableConfiguration.java ## @@ -0,0 +1,43 @@ +/* + * Licensed to

[GitHub] [hudi] shenh062326 commented on a change in pull request #1819: [HUDI-1058] Make delete marker configurable

2020-07-21 Thread GitBox
shenh062326 commented on a change in pull request #1819: URL: https://github.com/apache/hudi/pull/1819#discussion_r458181162 ## File path: hudi-common/src/main/java/org/apache/hudi/common/model/OverwriteWithLatestAvroPayload.java ## @@ -66,8 +74,9 @@ public

[GitHub] [hudi] GurRonenExplorium edited a comment on issue #1856: [SUPPORT] HiveSyncTool fails on alter table cascade

2020-07-21 Thread GitBox
GurRonenExplorium edited a comment on issue #1856: URL: https://github.com/apache/hudi/issues/1856#issuecomment-661909397 Additional context, the hudi configuration: ``` val hudiOptions = Map[String, String]( HoodieWriteConfig.TABLE_NAME -> tableName,

[GitHub] [hudi] GurRonenExplorium commented on issue #1856: [SUPPORT] HiveSyncTool fails on alter table cascade

2020-07-21 Thread GitBox
GurRonenExplorium commented on issue #1856: URL: https://github.com/apache/hudi/issues/1856#issuecomment-661909397 Additional context, the hudi configuration: ``` val hudiOptions = Map[String, String]( HoodieWriteConfig.TABLE_NAME -> tableName,

[GitHub] [hudi] GurRonenExplorium opened a new issue #1856: [SUPPORT] HiveSyncTool fails on alter table cascade

2020-07-21 Thread GitBox
GurRonenExplorium opened a new issue #1856: URL: https://github.com/apache/hudi/issues/1856 **_Tips before filing an issue_** - Have you gone through our [FAQs](https://cwiki.apache.org/confluence/display/HUDI/FAQ)? - Join the mailing list to engage in conversations and get

[GitHub] [hudi] asheeshgarg commented on issue #1787: Exception During Insert

2020-07-21 Thread GitBox
asheeshgarg commented on issue #1787: URL: https://github.com/apache/hudi/issues/1787#issuecomment-661874955 @bvaradar any recommendation on this please. This is an automated message from the Apache Git Service. To respond

[GitHub] [hudi] asheeshgarg commented on issue #1825: [SUPPORT] Compaction of parquet and meta file

2020-07-21 Thread GitBox
asheeshgarg commented on issue #1825: URL: https://github.com/apache/hudi/issues/1825#issuecomment-661874341 @bvaradar so the insert are looking fine now the COW compaction is generating 2 parquet file for each date. I also set the following properties "hoodie.keep.min.commits":

[GitHub] [hudi] Mathieu1124 commented on a change in pull request #1842: [HUDI-1037]Introduce a write committed callback hook

2020-07-21 Thread GitBox
Mathieu1124 commented on a change in pull request #1842: URL: https://github.com/apache/hudi/pull/1842#discussion_r458060234 ## File path: hudi-client/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java ## @@ -632,6 +632,21 @@ public FileSystemViewStorageConfig

[jira] [Commented] (HUDI-1116) Support time travel using timestamp type

2020-07-21 Thread linshan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17161997#comment-17161997 ] linshan commented on HUDI-1116: --- hi, [~vbalaji] would you describe the problem in detail? I want to

[GitHub] [hudi] leesf commented on pull request #1855: [HUDI-871] Add support for Tencent Cloud Object Storage(COS)

2020-07-21 Thread GitBox
leesf commented on pull request #1855: URL: https://github.com/apache/hudi/pull/1855#issuecomment-661790185 close to retrigger This is an automated message from the Apache Git Service. To respond to the message, please log

[GitHub] [hudi] leesf closed pull request #1855: [HUDI-871] Add support for Tencent Cloud Object Storage(COS)

2020-07-21 Thread GitBox
leesf closed pull request #1855: URL: https://github.com/apache/hudi/pull/1855 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[jira] [Assigned] (HUDI-1109) Support Spark Structured Streaming read from Hudi table

2020-07-21 Thread linshan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] linshan reassigned HUDI-1109: - Assignee: linshan > Support Spark Structured Streaming read from Hudi table >

[jira] [Assigned] (HUDI-1116) Support time travel using timestamp type

2020-07-21 Thread linshan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] linshan reassigned HUDI-1116: - Assignee: linshan > Support time travel using timestamp type >

[jira] [Commented] (HUDI-871) Add support for Tencent cloud COS

2020-07-21 Thread deyzhong (Jira)
[ https://issues.apache.org/jira/browse/HUDI-871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17161900#comment-17161900 ] deyzhong commented on HUDI-871: --- I have submit a pr([https://github.com/apache/hudi/pull/1855]), please help

[jira] [Updated] (HUDI-871) Add support for Tencent cloud COS

2020-07-21 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-871: Labels: newbie pull-request-available starter (was: newbie starter) > Add support for Tencent cloud

[GitHub] [hudi] DeyinZhong opened a new pull request #1855: [HUDI-871] Add support for Tencent Cloud Object Storage(COS)

2020-07-21 Thread GitBox
DeyinZhong opened a new pull request #1855: URL: https://github.com/apache/hudi/pull/1855 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull request.* ## What is the purpose of the

[jira] [Commented] (HUDI-1116) Support time travel using timestamp type

2020-07-21 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17161883#comment-17161883 ] Balaji Varadarajan commented on HUDI-1116: -- One option is to provide a mapping utility which can

[GitHub] [hudi] lw309637554 commented on pull request #1810: [HUDI-875] Abstract hudi-sync-common, and support hudi-hive-sync

2020-07-21 Thread GitBox
lw309637554 commented on pull request #1810: URL: https://github.com/apache/hudi/pull/1810#issuecomment-661736470 @vinothchandar The pr is ready overall. Can you help to review ? This is an automated message from the Apache

[jira] [Updated] (HUDI-1116) Support time travel using timestamp type

2020-07-21 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-1116: - Status: Open (was: New) > Support time travel using timestamp type >

[jira] [Created] (HUDI-1116) Support time travel using timestamp type

2020-07-21 Thread Balaji Varadarajan (Jira)
Balaji Varadarajan created HUDI-1116: Summary: Support time travel using timestamp type Key: HUDI-1116 URL: https://issues.apache.org/jira/browse/HUDI-1116 Project: Apache Hudi Issue

[GitHub] [hudi] sbernauer commented on issue #1845: [SUPPORT] Support for Schema evolution. Facing an error

2020-07-21 Thread GitBox
sbernauer commented on issue #1845: URL: https://github.com/apache/hudi/issues/1845#issuecomment-661711698 @bvaradar, yes i am appending the field to end of the schema (as reproduced in the test). The definition of the event is outside my scope, we just consume this events ;)
