[jira] [Commented] (HUDI-1468) incremental read support with clustering

2021-08-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17395049#comment-17395049
 ] 

ASF GitHub Bot commented on HUDI-1468:
--

nsivabalan merged pull request #3419:
URL: https://github.com/apache/hudi/pull/3419


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> incremental read support with clustering
> ----------------------------------------
>
>                 Key: HUDI-1468
>                 URL: https://issues.apache.org/jira/browse/HUDI-1468
>             Project: Apache Hudi
>          Issue Type: Sub-task
>          Components: Incremental Pull
>    Affects Versions: 0.9.0
>            Reporter: satish
>            Assignee: liwei
>            Priority: Blocker
>              Labels: pull-request-available, release-blocker
>             Fix For: 0.9.0
>
>
> As part of clustering, metadata such as hoodie_commit_time changes for
> records that are clustered. This is specific to the
> SparkBulkInsertBasedRunClusteringStrategy implementation. Figure out a way to
> carry commit_time from the original record to support incremental queries.
> Also, incremental queries don't work with the 'replacecommit' used by clustering
> (HUDI-1264). Change the incremental query to work for replacecommits created by
> clustering.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HUDI-1468) incremental read support with clustering

2021-08-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17395039#comment-17395039
 ] 

ASF GitHub Bot commented on HUDI-1468:
--

hudi-bot edited a comment on pull request #3419:
URL: https://github.com/apache/hudi/pull/3419#issuecomment-893980561


   
   ## CI report:
   
   * 835c62fca2fad8622c358cfdb053d4cca86d1747 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1428) Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1448)
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
   




[jira] [Commented] (HUDI-1468) incremental read support with clustering

2021-08-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17395029#comment-17395029
 ] 

ASF GitHub Bot commented on HUDI-1468:
--

hudi-bot edited a comment on pull request #3419:
URL: https://github.com/apache/hudi/pull/3419#issuecomment-893980561


   
   ## CI report:
   
   * 835c62fca2fad8622c358cfdb053d4cca86d1747 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1428) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1448)
   
   


[jira] [Commented] (HUDI-1468) incremental read support with clustering

2021-08-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17395025#comment-17395025
 ] 

ASF GitHub Bot commented on HUDI-1468:
--

nsivabalan commented on pull request #3419:
URL: https://github.com/apache/hudi/pull/3419#issuecomment-894576839


   @hudi-bot run azure




[jira] [Commented] (HUDI-1468) incremental read support with clustering

2021-08-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17394608#comment-17394608
 ] 

ASF GitHub Bot commented on HUDI-1468:
--

hudi-bot edited a comment on pull request #3419:
URL: https://github.com/apache/hudi/pull/3419#issuecomment-893980561


   
   ## CI report:
   
   * 835c62fca2fad8622c358cfdb053d4cca86d1747 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1428)
   
   


[jira] [Commented] (HUDI-1468) incremental read support with clustering

2021-08-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17394576#comment-17394576
 ] 

ASF GitHub Bot commented on HUDI-1468:
--

hudi-bot edited a comment on pull request #3419:
URL: https://github.com/apache/hudi/pull/3419#issuecomment-893980561


   
   ## CI report:
   
   * 9a72ddd7f012528a3e7c0c9441f760bfb2a18f3d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1423)
   * 835c62fca2fad8622c358cfdb053d4cca86d1747 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1428)
   
   


[jira] [Commented] (HUDI-1468) incremental read support with clustering

2021-08-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17394574#comment-17394574
 ] 

ASF GitHub Bot commented on HUDI-1468:
--

hudi-bot edited a comment on pull request #3419:
URL: https://github.com/apache/hudi/pull/3419#issuecomment-893980561


   
   ## CI report:
   
   * 9a72ddd7f012528a3e7c0c9441f760bfb2a18f3d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1423)
   * 835c62fca2fad8622c358cfdb053d4cca86d1747 UNKNOWN
   
   


[jira] [Commented] (HUDI-1468) incremental read support with clustering

2021-08-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17394513#comment-17394513
 ] 

ASF GitHub Bot commented on HUDI-1468:
--

hudi-bot edited a comment on pull request #3419:
URL: https://github.com/apache/hudi/pull/3419#issuecomment-893980561


   
   ## CI report:
   
   * 9a72ddd7f012528a3e7c0c9441f760bfb2a18f3d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1423)
   
   


[jira] [Commented] (HUDI-1468) incremental read support with clustering

2021-08-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17394507#comment-17394507
 ] 

ASF GitHub Bot commented on HUDI-1468:
--

vinothchandar commented on pull request #3211:
URL: https://github.com/apache/hudi/pull/3211#issuecomment-893993045


   Closing in favor of #3419 




[jira] [Commented] (HUDI-1468) incremental read support with clustering

2021-08-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17394508#comment-17394508
 ] 

ASF GitHub Bot commented on HUDI-1468:
--

vinothchandar closed pull request #3211:
URL: https://github.com/apache/hudi/pull/3211


   




[jira] [Commented] (HUDI-1468) incremental read support with clustering

2021-08-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17394496#comment-17394496
 ] 

ASF GitHub Bot commented on HUDI-1468:
--

hudi-bot edited a comment on pull request #3419:
URL: https://github.com/apache/hudi/pull/3419#issuecomment-893980561


   
   ## CI report:
   
   * 9a72ddd7f012528a3e7c0c9441f760bfb2a18f3d Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1423)
   
   


[jira] [Commented] (HUDI-1468) incremental read support with clustering

2021-08-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17394492#comment-17394492
 ] 

ASF GitHub Bot commented on HUDI-1468:
--

hudi-bot commented on pull request #3419:
URL: https://github.com/apache/hudi/pull/3419#issuecomment-893980561


   
   ## CI report:
   
   * 9a72ddd7f012528a3e7c0c9441f760bfb2a18f3d UNKNOWN
   
   


[jira] [Commented] (HUDI-1468) incremental read support with clustering

2021-08-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17394491#comment-17394491
 ] 

ASF GitHub Bot commented on HUDI-1468:
--

codope opened a new pull request #3419:
URL: https://github.com/apache/hudi/pull/3419


   …metadata as part of clustering
   
   This PR is a re-work of #3211 with some minor changes and tests.
   
   
   ## Committer checklist
   
   - [ ] Has a corresponding JIRA in PR title & commit
   - [ ] Commit message is descriptive of the change
   - [ ] CI is green
   - [ ] Necessary doc changes done or have another open PR
   - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.




[jira] [Commented] (HUDI-1468) incremental read support with clustering

2021-08-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17393858#comment-17393858
 ] 

ASF GitHub Bot commented on HUDI-1468:
--

satishkotha commented on pull request #3211:
URL: https://github.com/apache/hudi/pull/3211#issuecomment-892786714


   > @codope @satishkotha what's the next step here?
   > Could I help somehow to get this moving along
   
   @codope is working on adding additional tests for this PR. He mentioned he opened https://github.com/codope/hudi/pull/3 
   I'll review that and merge it here sometime this week or early next week.




[jira] [Commented] (HUDI-1468) incremental read support with clustering

2021-08-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17393297#comment-17393297
 ] 

ASF GitHub Bot commented on HUDI-1468:
--

satishkotha commented on pull request #3211:
URL: https://github.com/apache/hudi/pull/3211#issuecomment-892786714


   > @codope @satishkotha what's the next step here?
   > Could I help somehow to get this moving along
   
   @codope is working on adding additional tests for this PR. He mentioned he opened https://github.com/codope/hudi/pull/3 
   I'll review that and merge it here sometime this week or early next week.




[jira] [Commented] (HUDI-1468) incremental read support with clustering

2021-08-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17391812#comment-17391812
 ] 

ASF GitHub Bot commented on HUDI-1468:
--

vinothchandar commented on pull request #3211:
URL: https://github.com/apache/hudi/pull/3211#issuecomment-891366331


   @codope @satishkotha what's the next step here? 
   Could I help somehow to get this moving along?




[jira] [Commented] (HUDI-1468) incremental read support with clustering

2021-07-09 Thread liwei (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17377966#comment-17377966
 ] 

liwei commented on HUDI-1468:
-

[~vinoth] Hello, if [https://github.com/apache/hudi/pull/3139/files] has landed, can this issue be closed?



[jira] [Commented] (HUDI-1468) incremental read support with clustering

2021-07-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17375529#comment-17375529
 ] 

ASF GitHub Bot commented on HUDI-1468:
--

codope commented on pull request #3211:
URL: https://github.com/apache/hudi/pull/3211#issuecomment-874070580


   @satishkotha A couple of high-level questions:
   * Would preserving the commit time be sufficient to support incremental reads? Won't we need incremental timeline support (#2388) as well?
   * I see that a new `SparkAllowUpdateStrategy` has been added. How are we handling update conflicts?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (HUDI-1468) incremental read support with clustering

2021-07-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17373838#comment-17373838
 ] 

ASF GitHub Bot commented on HUDI-1468:
--

codecov-commenter edited a comment on pull request #3211:
URL: https://github.com/apache/hudi/pull/3211#issuecomment-872810385


   # 
[Codecov](https://codecov.io/gh/apache/hudi/pull/3211?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
 Report
   > Merging 
[#3211](https://codecov.io/gh/apache/hudi/pull/3211?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
 (56f4484) into 
[master](https://codecov.io/gh/apache/hudi/commit/6eca06d074520140d7bc67b48bd2b9a5b76f0a87?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
 (6eca06d) will **increase** coverage by `18.25%`.
   > The diff coverage is `60.30%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/3211/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3211?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
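   ```diff
   @@              Coverage Diff              @@
   ##             master    #3211       +/-   ##
   =============================================
   + Coverage     47.51%   65.76%   +18.25%
   + Complexity     5429      796     -4633
   =============================================
     Files           922      101      -821
     Lines         40968     3529    -37439
     Branches       4105      351     -3754
   =============================================
   - Hits          19464     2321    -17143
   + Misses        19780     1070    -18710
   + Partials       1724      138     -1586
   ```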
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `65.76% <60.30%> (+31.18%)` | :arrow_up: |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `?` | |
   | huditimelineservice | `?` | |
   | hudiutilities | `?` | |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/3211?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
 | Coverage Δ | |
   |---|---|---|
   | 
[...trategy/SparkRecentDaysClusteringPlanStrategy.java](https://codecov.io/gh/apache/hudi/pull/3211/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpZW50L2NsdXN0ZXJpbmcvcGxhbi9zdHJhdGVneS9TcGFya1JlY2VudERheXNDbHVzdGVyaW5nUGxhblN0cmF0ZWd5LmphdmE=)
 | `100.00% <ø> (+24.39%)` | :arrow_up: |
   | 
[...SparkSelectedPartitionsClusteringPlanStrategy.java](https://codecov.io/gh/apache/hudi/pull/3211/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpZW50L2NsdXN0ZXJpbmcvcGxhbi9zdHJhdGVneS9TcGFya1NlbGVjdGVkUGFydGl0aW9uc0NsdXN0ZXJpbmdQbGFuU3RyYXRlZ3kuamF2YQ==)
 | `0.00% <0.00%> (ø)` | |
   | 
[.../run/strategy/SingleSparkJobExecutionStrategy.java](https://codecov.io/gh/apache/hudi/pull/3211/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpZW50L2NsdXN0ZXJpbmcvcnVuL3N0cmF0ZWd5L1NpbmdsZVNwYXJrSm9iRXhlY3V0aW9uU3RyYXRlZ3kuamF2YQ==)
 | `0.00% <0.00%> (ø)` | |
   | 
[...ring/update/strategy/SparkAllowUpdateStrategy.java](https://codecov.io/gh/apache/hudi/pull/3211/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpZW50L2NsdXN0ZXJpbmcvdXBkYXRlL3N0cmF0ZWd5L1NwYXJrQWxsb3dVcGRhdGVTdHJhdGVneS5qYXZh)
 | `0.00% <0.00%> (ø)` | |
   | 
[...SparkInsertOver

[jira] [Commented] (HUDI-1468) incremental read support with clustering

2021-07-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17373726#comment-17373726
 ] 

ASF GitHub Bot commented on HUDI-1468:
--

hudi-bot edited a comment on pull request #3211:
URL: https://github.com/apache/hudi/pull/3211#issuecomment-872639166


   
   ## CI report:
   
   * 56f44844fbb9f251f0840a556553f3862771d4fc Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=661)
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
   




[jira] [Commented] (HUDI-1468) incremental read support with clustering

2021-07-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17373725#comment-17373725
 ] 

ASF GitHub Bot commented on HUDI-1468:
--

hudi-bot edited a comment on pull request #3211:
URL: https://github.com/apache/hudi/pull/3211#issuecomment-872639166


   
   ## CI report:
   
   * c9c9a0d5343b65e690544dfcb85e71d915c455e1 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=637)
   * 56f44844fbb9f251f0840a556553f3862771d4fc UNKNOWN
   
   


[jira] [Commented] (HUDI-1468) incremental read support with clustering

2021-07-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17373361#comment-17373361
 ] 

ASF GitHub Bot commented on HUDI-1468:
--

codecov-commenter edited a comment on pull request #3211:
URL: https://github.com/apache/hudi/pull/3211#issuecomment-872810385








[jira] [Commented] (HUDI-1468) incremental read support with clustering

2021-07-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17373356#comment-17373356
 ] 

ASF GitHub Bot commented on HUDI-1468:
--

codecov-commenter edited a comment on pull request #3211:
URL: https://github.com/apache/hudi/pull/3211#issuecomment-872810385


   # 
[Codecov](https://codecov.io/gh/apache/hudi/pull/3211?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
 Report
   > Merging 
[#3211](https://codecov.io/gh/apache/hudi/pull/3211?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
 (c9c9a0d) into 
[master](https://codecov.io/gh/apache/hudi/commit/6eca06d074520140d7bc67b48bd2b9a5b76f0a87?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
 (6eca06d) will **decrease** coverage by `29.95%`.
   > The diff coverage is `48.58%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/3211/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3211?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
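   ```diff
   @@              Coverage Diff              @@
   ##             master    #3211       +/-   ##
   =============================================
   - Coverage     47.51%   17.55%   -29.96%
   + Complexity     5429      878     -4551
   =============================================
     Files           922      383      -539
     Lines         40968    15122    -25846
     Branches       4105     1297     -2808
   =============================================
   - Hits          19464     2655    -16809
   + Misses        19780    12303     -7477
   + Partials       1724      164     -1560
   ```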
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `20.91% <48.58%> (-13.67%)` | :arrow_down: |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `5.38% <ø> (-48.67%)` | :arrow_down: |
   | huditimelineservice | `?` | |
   | hudiutilities | `9.31% <ø> (-48.72%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/3211?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
 | Coverage Δ | |
   |---|---|---|
   | 
[...org/apache/hudi/config/HoodieClusteringConfig.java](https://codecov.io/gh/apache/hudi/pull/3211/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVDbHVzdGVyaW5nQ29uZmlnLmphdmE=)
 | `0.00% <0.00%> (-71.28%)` | :arrow_down: |
   | 
[...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3211/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh)
 | `0.00% <0.00%> (-42.79%)` | :arrow_down: |
   | 
[...n/java/org/apache/hudi/io/CreateHandleFactory.java](https://codecov.io/gh/apache/hudi/pull/3211/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2lvL0NyZWF0ZUhhbmRsZUZhY3RvcnkuamF2YQ==)
 | `0.00% <0.00%> (ø)` | |
   | 
[...in/java/org/apache/hudi/io/HoodieCreateHandle.java](https://codecov.io/gh/apache/hudi/pull/3211/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2lvL0hvb2RpZUNyZWF0ZUhhbmRsZS5qYXZh)
 | `0.00% <0.00%> (ø)` | |
   | 
[...rg/apache/hudi/io/HoodieUnboundedCreateHandle.java](https://codecov.io/gh/apache/hudi/pull/3211/diff?src=pr&el=tree&utm_medium=referral&utm_source=githu

[jira] [Commented] (HUDI-1468) incremental read support with clustering

2021-07-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17373324#comment-17373324
 ] 

ASF GitHub Bot commented on HUDI-1468:
--

codecov-commenter commented on pull request #3211:
URL: https://github.com/apache/hudi/pull/3211#issuecomment-872810385


   # 
[Codecov](https://codecov.io/gh/apache/hudi/pull/3211?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
 Report
   > Merging 
[#3211](https://codecov.io/gh/apache/hudi/pull/3211?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
 (c9c9a0d) into 
[master](https://codecov.io/gh/apache/hudi/commit/6eca06d074520140d7bc67b48bd2b9a5b76f0a87?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
 (6eca06d) will **decrease** coverage by `44.62%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hudi/pull/3211/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3211?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
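   ```diff
   @@              Coverage Diff              @@
   ##             master    #3211       +/-   ##
   =============================================
   - Coverage     47.51%    2.88%   -44.63%
   + Complexity     5429       82     -5347
   =============================================
     Files           922      282      -640
     Lines         40968    11593    -29375
     Branches       4105      946     -3159
   =============================================
   - Hits          19464      334    -19130
   + Misses        19780    11233     -8547
   + Partials       1724       26     -1698
   ```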
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `0.00% <0.00%> (-34.59%)` | :arrow_down: |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `5.38% <ø> (-48.67%)` | :arrow_down: |
   | huditimelineservice | `?` | |
   | hudiutilities | `9.31% <ø> (-48.72%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hudi/pull/3211?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
 | Coverage Δ | |
   |---|---|---|
   | 
[...org/apache/hudi/config/HoodieClusteringConfig.java](https://codecov.io/gh/apache/hudi/pull/3211/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVDbHVzdGVyaW5nQ29uZmlnLmphdmE=)
 | `0.00% <0.00%> (-71.28%)` | :arrow_down: |
   | 
[...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3211/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh)
 | `0.00% <0.00%> (-42.79%)` | :arrow_down: |
   | 
[...n/java/org/apache/hudi/io/CreateHandleFactory.java](https://codecov.io/gh/apache/hudi/pull/3211/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2lvL0NyZWF0ZUhhbmRsZUZhY3RvcnkuamF2YQ==)
 | `0.00% <0.00%> (ø)` | |
   | 
[...in/java/org/apache/hudi/io/HoodieCreateHandle.java](https://codecov.io/gh/apache/hudi/pull/3211/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2lvL0hvb2RpZUNyZWF0ZUhhbmRsZS5qYXZh)
 | `0.00% <0.00%> (ø)` | |
   | 
[...rg/apache/hudi/io/HoodieUnboundedCreateHandle.java](https://codecov.io/gh/apache/hudi/pull/3211/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&u

[jira] [Commented] (HUDI-1468) incremental read support with clustering

2021-07-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17373234#comment-17373234
 ] 

ASF GitHub Bot commented on HUDI-1468:
--

hudi-bot edited a comment on pull request #3211:
URL: https://github.com/apache/hudi/pull/3211#issuecomment-872639166


   
   ## CI report:
   
   * c9c9a0d5343b65e690544dfcb85e71d915c455e1 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=637)
   
   


[jira] [Commented] (HUDI-1468) incremental read support with clustering

2021-07-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17373214#comment-17373214
 ] 

ASF GitHub Bot commented on HUDI-1468:
--

hudi-bot edited a comment on pull request #3211:
URL: https://github.com/apache/hudi/pull/3211#issuecomment-872639166


   
   ## CI report:
   
   * ab7bacb26d44f383e7f61ec81531b34011f1383b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=629)
   * c9c9a0d5343b65e690544dfcb85e71d915c455e1 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=637)
   
   


[jira] [Commented] (HUDI-1468) incremental read support with clustering

2021-07-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17373212#comment-17373212
 ] 

ASF GitHub Bot commented on HUDI-1468:
--

hudi-bot edited a comment on pull request #3211:
URL: https://github.com/apache/hudi/pull/3211#issuecomment-872639166


   
   ## CI report:
   
   * ab7bacb26d44f383e7f61ec81531b34011f1383b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=629)
   * c9c9a0d5343b65e690544dfcb85e71d915c455e1 UNKNOWN
   
   


[jira] [Commented] (HUDI-1468) incremental read support with clustering

2021-07-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17373143#comment-17373143
 ] 

ASF GitHub Bot commented on HUDI-1468:
--

hudi-bot edited a comment on pull request #3211:
URL: https://github.com/apache/hudi/pull/3211#issuecomment-872639166


   
   ## CI report:
   
   * ab7bacb26d44f383e7f61ec81531b34011f1383b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=629)
   
   


[jira] [Commented] (HUDI-1468) incremental read support with clustering

2021-07-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17373124#comment-17373124
 ] 

ASF GitHub Bot commented on HUDI-1468:
--

hudi-bot edited a comment on pull request #3211:
URL: https://github.com/apache/hudi/pull/3211#issuecomment-872639166


   
   ## CI report:
   
   * ab7bacb26d44f383e7f61ec81531b34011f1383b Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=629)
   
   




[jira] [Commented] (HUDI-1468) incremental read support with clustering

2021-07-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17373122#comment-17373122
 ] 

ASF GitHub Bot commented on HUDI-1468:
--

hudi-bot commented on pull request #3211:
URL: https://github.com/apache/hudi/pull/3211#issuecomment-872639166


   
   ## CI report:
   
   * ab7bacb26d44f383e7f61ec81531b34011f1383b UNKNOWN
   
   




[jira] [Commented] (HUDI-1468) incremental read support with clustering

2021-07-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17373123#comment-17373123
 ] 

ASF GitHub Bot commented on HUDI-1468:
--

satishkotha commented on pull request #3211:
URL: https://github.com/apache/hudi/pull/3211#issuecomment-872639220


   @n3nash @vinothchandar this includes all the changes I made to support 
encryption-style use cases using the clustering framework. I still need to port 
some tests, but please take a look and add any comments.






[jira] [Commented] (HUDI-1468) incremental read support with clustering

2021-07-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17373121#comment-17373121
 ] 

ASF GitHub Bot commented on HUDI-1468:
--

satishkotha opened a new pull request #3211:
URL: https://github.com/apache/hudi/pull/3211


   
   ## What is the purpose of the pull request
   Support custom clustering strategies and preserve commit time to support 
incremental reads.
   
   ## Brief change log
   
   * Introduce a new way of running clustering, using 
SingleSparkJobExecutionStrategy, for use cases that don't need sorting.
   * Push more logic down into the clustering strategies to avoid an RDD union.
   * Make some performance improvements after running at large scale. Avoid 
collecting the RDD multiple times.
   * Preserve the Hoodie commit time (optional, for backward compatibility) while 
rewriting the data.
   
   ## Verify this pull request
   
   This change added tests and can be verified as follows:
   
   ## Committer checklist
   
- [ ] Has a corresponding JIRA in PR title & commit

- [ ] Commit message is descriptive of the change

- [ ] CI is green
   
- [ ] Necessary doc changes done or have another open PR
  
- [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA.




> incremental read support with clustering
> 
>
> Key: HUDI-1468
> URL: https://issues.apache.org/jira/browse/HUDI-1468
> Project: Apache Hudi
> Issue Type: Sub-task
> Components: Incremental Pull
> Affects Versions: 0.9.0
> Reporter: satish
> Assignee: liwei
> Priority: Blocker
> Fix For: 0.9.0
>
>
> As part of clustering, metadata such as hoodie_commit_time changes for 
> records that are clustered. This is specific to the 
> SparkBulkInsertBasedRunClusteringStrategy implementation. Figure out a way to 
> carry commit_time from the original record to support incremental queries.
> Also, incremental queries don't work with the 'replacecommit' used by 
> clustering (HUDI-1264). Change incremental queries to work for replacecommits 
> created by clustering.





[jira] [Commented] (HUDI-1468) incremental read support with clustering

2021-06-29 Thread Vinoth Chandar (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17371819#comment-17371819
 ] 

Vinoth Chandar commented on HUDI-1468:
--

[~309637554] do you plan to work on this? It would be good to have this in the 
next release.
