[GitHub] [hudi] xiarixiaoyao commented on pull request #4308: [HUDI-3008] Fixing HoodieFileIndex partition column parsing for nested fields

2021-12-17 Thread GitBox


xiarixiaoyao commented on pull request #4308:
URL: https://github.com/apache/hudi/pull/4308#issuecomment-997159950


   @hudi-bot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hudi] hudi-bot commented on pull request #4308: [HUDI-3008] Fixing HoodieFileIndex partition column parsing for nested fields

2021-12-17 Thread GitBox


hudi-bot commented on pull request #4308:
URL: https://github.com/apache/hudi/pull/4308#issuecomment-99715


   
   ## CI report:
   
   * a28311298525e0713ef000c79633a73162a304bc Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4438)
 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4474)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   






[GitHub] [hudi] hudi-bot removed a comment on pull request #4308: [HUDI-3008] Fixing HoodieFileIndex partition column parsing for nested fields

2021-12-17 Thread GitBox


hudi-bot removed a comment on pull request #4308:
URL: https://github.com/apache/hudi/pull/4308#issuecomment-996819976


   
   ## CI report:
   
   * a28311298525e0713ef000c79633a73162a304bc Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4438)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4346: [HUDI-3045] New ClusteringPlanStrategy to use regex choose partitions when building clustering plan

2021-12-17 Thread GitBox


hudi-bot commented on pull request #4346:
URL: https://github.com/apache/hudi/pull/4346#issuecomment-997153375


   
   ## CI report:
   
   * 2227d98a76c74d94538a57467fe4d72f0a0daeae Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4399)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4406)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4408)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4425)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4430)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4435)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4458)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4473)
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4346: [HUDI-3045] New ClusteringPlanStrategy to use regex choose partitions when building clustering plan

2021-12-17 Thread GitBox


hudi-bot removed a comment on pull request #4346:
URL: https://github.com/apache/hudi/pull/4346#issuecomment-997147548


   
   ## CI report:
   
   * 2227d98a76c74d94538a57467fe4d72f0a0daeae Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4399)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4406)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4408)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4425)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4430)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4435)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4458)
 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4473)
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4078: [HUDI-2833] Clean up unused archive files instead of expanding indefinitely.

2021-12-17 Thread GitBox


hudi-bot removed a comment on pull request #4078:
URL: https://github.com/apache/hudi/pull/4078#issuecomment-997147506


   
   ## CI report:
   
   * 8f8ae385baf21dacd4b9fedd3670133160001dc0 UNKNOWN
   * 019e161bb908731244e13cdf36d12781956f0114 UNKNOWN
   * 9b9620a298b45a57af6e596c9305a49ccc69345a Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4432)
 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4427)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4457)
 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4472)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4078: [HUDI-2833] Clean up unused archive files instead of expanding indefinitely.

2021-12-17 Thread GitBox


hudi-bot commented on pull request #4078:
URL: https://github.com/apache/hudi/pull/4078#issuecomment-997153336


   
   ## CI report:
   
   * 8f8ae385baf21dacd4b9fedd3670133160001dc0 UNKNOWN
   * 019e161bb908731244e13cdf36d12781956f0114 UNKNOWN
   * 9b9620a298b45a57af6e596c9305a49ccc69345a Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4432)
 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4427)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4457)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4472)
 
   
   




[jira] [Created] (HUDI-3060) DROP TABLE for spark sql

2021-12-17 Thread Forward Xu (Jira)
Forward Xu created HUDI-3060:


 Summary: DROP TABLE for spark sql
 Key: HUDI-3060
 URL: https://issues.apache.org/jira/browse/HUDI-3060
 Project: Apache Hudi
  Issue Type: New Feature
  Components: Spark Integration
Reporter: Forward Xu
Assignee: Forward Xu


drop table [if exists] ; 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [hudi] hudi-bot commented on pull request #4306: [HUDI-3014] add table option to set utc timezone

2021-12-17 Thread GitBox


hudi-bot commented on pull request #4306:
URL: https://github.com/apache/hudi/pull/4306#issuecomment-997149401


   
   ## CI report:
   
   * a39258ca69c6302da42cdb1fe1a0794676480952 UNKNOWN
   * a1ba1e2c81b74948a93589c3192ab24ef320107b UNKNOWN
   * c347bb78b3c799dce34db7a00c7f6a07c95ec777 UNKNOWN
   * d6a0ac9027bf12362b56729a86e9755dbe1c21db UNKNOWN
   * 4fd974b6b45e337f75bfaa9e6d54dc7e82cf1473 UNKNOWN
   * 0afe75ecfe523bdc74c8c37ba50de0cb0601166d UNKNOWN
   * 70457e0ba0b8dfd4ae63fd8c096abbbf051d6256 UNKNOWN
   * bfdbb7db27a02f6c414769e58aa8cb1e841c3a21 UNKNOWN
   * 8ebdbe56f2cec0198f5f19a518906d4d9b834b73 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4471)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4306: [HUDI-3014] add table option to set utc timezone

2021-12-17 Thread GitBox


hudi-bot removed a comment on pull request #4306:
URL: https://github.com/apache/hudi/pull/4306#issuecomment-997143022


   
   ## CI report:
   
   * a39258ca69c6302da42cdb1fe1a0794676480952 UNKNOWN
   * a1ba1e2c81b74948a93589c3192ab24ef320107b UNKNOWN
   * c347bb78b3c799dce34db7a00c7f6a07c95ec777 UNKNOWN
   * d6a0ac9027bf12362b56729a86e9755dbe1c21db UNKNOWN
   * 4fd974b6b45e337f75bfaa9e6d54dc7e82cf1473 UNKNOWN
   * 0afe75ecfe523bdc74c8c37ba50de0cb0601166d UNKNOWN
   * 70457e0ba0b8dfd4ae63fd8c096abbbf051d6256 UNKNOWN
   * bfdbb7db27a02f6c414769e58aa8cb1e841c3a21 UNKNOWN
   * a3e3b87be58f705d665f73e938977ac13b314657 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4467)
 
   * 8ebdbe56f2cec0198f5f19a518906d4d9b834b73 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4471)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4346: [HUDI-3045] New ClusteringPlanStrategy to use regex choose partitions when building clustering plan

2021-12-17 Thread GitBox


hudi-bot removed a comment on pull request #4346:
URL: https://github.com/apache/hudi/pull/4346#issuecomment-997110367


   
   ## CI report:
   
   * 2227d98a76c74d94538a57467fe4d72f0a0daeae Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4399)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4406)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4408)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4425)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4430)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4435)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4458)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4346: [HUDI-3045] New ClusteringPlanStrategy to use regex choose partitions when building clustering plan

2021-12-17 Thread GitBox


hudi-bot commented on pull request #4346:
URL: https://github.com/apache/hudi/pull/4346#issuecomment-997147548


   
   ## CI report:
   
   * 2227d98a76c74d94538a57467fe4d72f0a0daeae Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4399)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4406)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4408)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4425)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4430)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4435)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4458)
 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4473)
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4078: [HUDI-2833] Clean up unused archive files instead of expanding indefinitely.

2021-12-17 Thread GitBox


hudi-bot removed a comment on pull request #4078:
URL: https://github.com/apache/hudi/pull/4078#issuecomment-997105300


   
   ## CI report:
   
   * 8f8ae385baf21dacd4b9fedd3670133160001dc0 UNKNOWN
   * 019e161bb908731244e13cdf36d12781956f0114 UNKNOWN
   * 9b9620a298b45a57af6e596c9305a49ccc69345a Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4432)
 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4427)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4457)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4078: [HUDI-2833] Clean up unused archive files instead of expanding indefinitely.

2021-12-17 Thread GitBox


hudi-bot commented on pull request #4078:
URL: https://github.com/apache/hudi/pull/4078#issuecomment-997147506


   
   ## CI report:
   
   * 8f8ae385baf21dacd4b9fedd3670133160001dc0 UNKNOWN
   * 019e161bb908731244e13cdf36d12781956f0114 UNKNOWN
   * 9b9620a298b45a57af6e596c9305a49ccc69345a Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4432)
 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4427)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4457)
 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4472)
 
   
   




[GitHub] [hudi] zhangyue19921010 commented on pull request #4346: [HUDI-3045] New ClusteringPlanStrategy to use regex choose partitions when building clustering plan

2021-12-17 Thread GitBox


zhangyue19921010 commented on pull request #4346:
URL: https://github.com/apache/hudi/pull/4346#issuecomment-997147500


   @hudi-bot run azure






[GitHub] [hudi] zhangyue19921010 commented on pull request #4078: [HUDI-2833] Clean up unused archive files instead of expanding indefinitely.

2021-12-17 Thread GitBox


zhangyue19921010 commented on pull request #4078:
URL: https://github.com/apache/hudi/pull/4078#issuecomment-997147483


   @hudi-bot run azure






[GitHub] [hudi] hudi-bot commented on pull request #4349: [MINOR] remove unused import in HoodieFileIndex

2021-12-17 Thread GitBox


hudi-bot commented on pull request #4349:
URL: https://github.com/apache/hudi/pull/4349#issuecomment-997146359


   
   ## CI report:
   
   * 7e13f359550fed056bc3315d245736f7596d320b Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4470)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4349: [MINOR] remove unused import in HoodieFileIndex

2021-12-17 Thread GitBox


hudi-bot removed a comment on pull request #4349:
URL: https://github.com/apache/hudi/pull/4349#issuecomment-997140256


   
   ## CI report:
   
   * 29113e6aff644be7511d84ae8428a8597a5b10b2 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4465)
 
   * 7e13f359550fed056bc3315d245736f7596d320b Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4470)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4333: [HUDI-431] Adding support for Parquet in MOR `LogBlock`s

2021-12-17 Thread GitBox


hudi-bot removed a comment on pull request #4333:
URL: https://github.com/apache/hudi/pull/4333#issuecomment-997131996


   
   ## CI report:
   
   * 286aa8b95627eaaa01114567797186263a830774 UNKNOWN
   * e722499ee75403ab62f646fdabca1a2c59570164 UNKNOWN
   * de0d4385394dc5d820964cefc872f099cee7a02b UNKNOWN
   * 67cbb2f4ab421fb7a90e4c5d1061613ed331c837 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4460)
 
   * cecde3b6734576c5f2863ec2b4b90689600cb746 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4469)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4333: [HUDI-431] Adding support for Parquet in MOR `LogBlock`s

2021-12-17 Thread GitBox


hudi-bot commented on pull request #4333:
URL: https://github.com/apache/hudi/pull/4333#issuecomment-997144563


   
   ## CI report:
   
   * 286aa8b95627eaaa01114567797186263a830774 UNKNOWN
   * e722499ee75403ab62f646fdabca1a2c59570164 UNKNOWN
   * de0d4385394dc5d820964cefc872f099cee7a02b UNKNOWN
   * cecde3b6734576c5f2863ec2b4b90689600cb746 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4469)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4306: [HUDI-3014] add table option to set utc timezone

2021-12-17 Thread GitBox


hudi-bot commented on pull request #4306:
URL: https://github.com/apache/hudi/pull/4306#issuecomment-997143022


   
   ## CI report:
   
   * a39258ca69c6302da42cdb1fe1a0794676480952 UNKNOWN
   * a1ba1e2c81b74948a93589c3192ab24ef320107b UNKNOWN
   * c347bb78b3c799dce34db7a00c7f6a07c95ec777 UNKNOWN
   * d6a0ac9027bf12362b56729a86e9755dbe1c21db UNKNOWN
   * 4fd974b6b45e337f75bfaa9e6d54dc7e82cf1473 UNKNOWN
   * 0afe75ecfe523bdc74c8c37ba50de0cb0601166d UNKNOWN
   * 70457e0ba0b8dfd4ae63fd8c096abbbf051d6256 UNKNOWN
   * bfdbb7db27a02f6c414769e58aa8cb1e841c3a21 UNKNOWN
   * a3e3b87be58f705d665f73e938977ac13b314657 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4467)
 
   * 8ebdbe56f2cec0198f5f19a518906d4d9b834b73 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4471)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4306: [HUDI-3014] add table option to set utc timezone

2021-12-17 Thread GitBox


hudi-bot removed a comment on pull request #4306:
URL: https://github.com/apache/hudi/pull/4306#issuecomment-997139942


   
   ## CI report:
   
   * a39258ca69c6302da42cdb1fe1a0794676480952 UNKNOWN
   * a1ba1e2c81b74948a93589c3192ab24ef320107b UNKNOWN
   * c347bb78b3c799dce34db7a00c7f6a07c95ec777 UNKNOWN
   * d6a0ac9027bf12362b56729a86e9755dbe1c21db UNKNOWN
   * 4fd974b6b45e337f75bfaa9e6d54dc7e82cf1473 UNKNOWN
   * 0afe75ecfe523bdc74c8c37ba50de0cb0601166d UNKNOWN
   * 70457e0ba0b8dfd4ae63fd8c096abbbf051d6256 UNKNOWN
   * bfdbb7db27a02f6c414769e58aa8cb1e841c3a21 UNKNOWN
   * a3e3b87be58f705d665f73e938977ac13b314657 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4467)
 
   * 8ebdbe56f2cec0198f5f19a518906d4d9b834b73 UNKNOWN
   
   




[GitHub] [hudi] zztttt edited a comment on issue #4072: [SUPPORT]Exception in thread "main" java.io.FileNotFoundException: File does not exist: hdfs://localhost:9000/scala/table6

2021-12-17 Thread GitBox


zztttt edited a comment on issue #4072:
URL: https://github.com/apache/hudi/issues/4072#issuecomment-997140947


   > 
   
   Yes, I read the related documentation:
https://jaceklaskowski.gitbooks.io/mastering-spark-sql/content/spark-sql-hive-metastore.html
and found a sentence saying "You can configure javax.jdo.option properties in
hive-site.xml or using options with spark.hadoop prefix.", which let me achieve
the target. These configs are hard-coded in Scala.
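
For illustration, the second option in the quoted sentence (options with the `spark.hadoop` prefix) might look roughly like the sketch below. This is not the code from this issue: the object name `MetastoreConf`, the helper `withSparkHadoopPrefix`, and every connection value are assumptions chosen for the example. Only the `javax.jdo.option.*` property names and the `spark.hadoop.` prefix come from the quoted documentation and Hive's standard metastore settings.

```scala
// Minimal sketch under stated assumptions; all values are placeholders.
object MetastoreConf {
  // Hive metastore JDO properties, as they would otherwise appear in hive-site.xml.
  val jdoOptions: Map[String, String] = Map(
    "javax.jdo.option.ConnectionURL"        -> "jdbc:mysql://localhost:3306/metastore", // assumed value
    "javax.jdo.option.ConnectionDriverName" -> "com.mysql.cj.jdbc.Driver",              // assumed value
    "javax.jdo.option.ConnectionUserName"   -> "hive",                                  // assumed value
    "javax.jdo.option.ConnectionPassword"   -> "hive"                                   // assumed value
  )

  // Hypothetical helper: prepend "spark.hadoop." to every key so the
  // properties can be passed as Spark options instead of via hive-site.xml.
  def withSparkHadoopPrefix(options: Map[String, String]): Map[String, String] =
    options.map { case (key, value) => s"spark.hadoop.$key" -> value }

  def main(args: Array[String]): Unit =
    // Print only the resulting keys; the values may contain credentials.
    withSparkHadoopPrefix(jdoOptions).keys.toSeq.sorted.foreach(println)
}
```

With Spark on the classpath, each resulting pair could then be handed to `SparkSession.builder().config(key, value)` before calling `enableHiveSupport().getOrCreate()`. Hard-coding credentials this way is convenient for a local test but is usually better replaced by external configuration.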






[GitHub] [hudi] zztttt edited a comment on issue #4072: [SUPPORT]Exception in thread "main" java.io.FileNotFoundException: File does not exist: hdfs://localhost:9000/scala/table6

2021-12-17 Thread GitBox


zztttt edited a comment on issue #4072:
URL: https://github.com/apache/hudi/issues/4072#issuecomment-997140947


   > 
   
   Yes, I read the related documentation:
https://jaceklaskowski.gitbooks.io/mastering-spark-sql/content/spark-sql-hive-metastore.html
and found a sentence saying "You can configure javax.jdo.option properties in
hive-site.xml or using options with spark.hadoop prefix.", which let me achieve
the target. These configs are hard-coded in Scala.






[GitHub] [hudi] zztttt commented on issue #4072: [SUPPORT]Exception in thread "main" java.io.FileNotFoundException: File does not exist: hdfs://localhost:9000/scala/table6

2021-12-17 Thread GitBox


zztttt commented on issue #4072:
URL: https://github.com/apache/hudi/issues/4072#issuecomment-997140947


   > 
   
   Yes, I read the related documentation:
https://jaceklaskowski.gitbooks.io/mastering-spark-sql/content/spark-sql-hive-metastore.html
and found a sentence saying "You can configure javax.jdo.option properties in
hive-site.xml or using options with spark.hadoop prefix.", which let me achieve
the target.






[GitHub] [hudi] hudi-bot removed a comment on pull request #4349: [MINOR] remove unused import in HoodieFileIndex

2021-12-17 Thread GitBox


hudi-bot removed a comment on pull request #4349:
URL: https://github.com/apache/hudi/pull/4349#issuecomment-997139964


   
   ## CI report:
   
   * 29113e6aff644be7511d84ae8428a8597a5b10b2 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4465)
 
   * 7e13f359550fed056bc3315d245736f7596d320b UNKNOWN
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4349: [MINOR] remove unused import in HoodieFileIndex

2021-12-17 Thread GitBox


hudi-bot commented on pull request #4349:
URL: https://github.com/apache/hudi/pull/4349#issuecomment-997140256


   
   ## CI report:
   
   * 29113e6aff644be7511d84ae8428a8597a5b10b2 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4465)
 
   * 7e13f359550fed056bc3315d245736f7596d320b Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4470)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4349: [MINOR] remove unused import in HoodieFileIndex

2021-12-17 Thread GitBox


hudi-bot removed a comment on pull request #4349:
URL: https://github.com/apache/hudi/pull/4349#issuecomment-997133775


   
   ## CI report:
   
   * 29113e6aff644be7511d84ae8428a8597a5b10b2 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4465)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4349: [MINOR] remove unused import in HoodieFileIndex

2021-12-17 Thread GitBox


hudi-bot commented on pull request #4349:
URL: https://github.com/apache/hudi/pull/4349#issuecomment-997139964


   
   ## CI report:
   
   * 29113e6aff644be7511d84ae8428a8597a5b10b2 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4465)
 
   * 7e13f359550fed056bc3315d245736f7596d320b UNKNOWN
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4306: [HUDI-3014] add table option to set utc timezone

2021-12-17 Thread GitBox


hudi-bot commented on pull request #4306:
URL: https://github.com/apache/hudi/pull/4306#issuecomment-997139942


   
   ## CI report:
   
   * a39258ca69c6302da42cdb1fe1a0794676480952 UNKNOWN
   * a1ba1e2c81b74948a93589c3192ab24ef320107b UNKNOWN
   * c347bb78b3c799dce34db7a00c7f6a07c95ec777 UNKNOWN
   * d6a0ac9027bf12362b56729a86e9755dbe1c21db UNKNOWN
   * 4fd974b6b45e337f75bfaa9e6d54dc7e82cf1473 UNKNOWN
   * 0afe75ecfe523bdc74c8c37ba50de0cb0601166d UNKNOWN
   * 70457e0ba0b8dfd4ae63fd8c096abbbf051d6256 UNKNOWN
   * bfdbb7db27a02f6c414769e58aa8cb1e841c3a21 UNKNOWN
   * a3e3b87be58f705d665f73e938977ac13b314657 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4467)
 
   * 8ebdbe56f2cec0198f5f19a518906d4d9b834b73 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4306: [HUDI-3014] add table option to set utc timezone

2021-12-17 Thread GitBox


hudi-bot removed a comment on pull request #4306:
URL: https://github.com/apache/hudi/pull/4306#issuecomment-997138989


   
   ## CI report:
   
   * a39258ca69c6302da42cdb1fe1a0794676480952 UNKNOWN
   * a1ba1e2c81b74948a93589c3192ab24ef320107b UNKNOWN
   * c347bb78b3c799dce34db7a00c7f6a07c95ec777 UNKNOWN
   * d6a0ac9027bf12362b56729a86e9755dbe1c21db UNKNOWN
   * 4fd974b6b45e337f75bfaa9e6d54dc7e82cf1473 UNKNOWN
   * 0afe75ecfe523bdc74c8c37ba50de0cb0601166d UNKNOWN
   * 70457e0ba0b8dfd4ae63fd8c096abbbf051d6256 UNKNOWN
   * bfdbb7db27a02f6c414769e58aa8cb1e841c3a21 UNKNOWN
   * a3e3b87be58f705d665f73e938977ac13b314657 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4467)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4306: [HUDI-3014] add table option to set utc timezone

2021-12-17 Thread GitBox


hudi-bot commented on pull request #4306:
URL: https://github.com/apache/hudi/pull/4306#issuecomment-997138989


   
   ## CI report:
   
   * a39258ca69c6302da42cdb1fe1a0794676480952 UNKNOWN
   * a1ba1e2c81b74948a93589c3192ab24ef320107b UNKNOWN
   * c347bb78b3c799dce34db7a00c7f6a07c95ec777 UNKNOWN
   * d6a0ac9027bf12362b56729a86e9755dbe1c21db UNKNOWN
   * 4fd974b6b45e337f75bfaa9e6d54dc7e82cf1473 UNKNOWN
   * 0afe75ecfe523bdc74c8c37ba50de0cb0601166d UNKNOWN
   * 70457e0ba0b8dfd4ae63fd8c096abbbf051d6256 UNKNOWN
   * bfdbb7db27a02f6c414769e58aa8cb1e841c3a21 UNKNOWN
   * a3e3b87be58f705d665f73e938977ac13b314657 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4467)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4306: [HUDI-3014] add table option to set utc timezone

2021-12-17 Thread GitBox


hudi-bot removed a comment on pull request #4306:
URL: https://github.com/apache/hudi/pull/4306#issuecomment-997129617


   
   ## CI report:
   
   * a39258ca69c6302da42cdb1fe1a0794676480952 UNKNOWN
   * a1ba1e2c81b74948a93589c3192ab24ef320107b UNKNOWN
   * c347bb78b3c799dce34db7a00c7f6a07c95ec777 UNKNOWN
   * d6a0ac9027bf12362b56729a86e9755dbe1c21db UNKNOWN
   * 4fd974b6b45e337f75bfaa9e6d54dc7e82cf1473 UNKNOWN
   * 0afe75ecfe523bdc74c8c37ba50de0cb0601166d UNKNOWN
   * 70457e0ba0b8dfd4ae63fd8c096abbbf051d6256 UNKNOWN
   * 6ae6b2781237d7e4af95bd78062c3da765ebe9a2 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4434)
 
   * bfdbb7db27a02f6c414769e58aa8cb1e841c3a21 UNKNOWN
   * a3e3b87be58f705d665f73e938977ac13b314657 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4467)
 
   
   




[GitHub] [hudi] esaeki commented on issue #4348: [SUPPORT] How to set timezone for "_hoodie_commit_time" column?

2021-12-17 Thread GitBox


esaeki commented on issue #4348:
URL: https://github.com/apache/hudi/issues/4348#issuecomment-997135786


   Thank you for your response. 
   I am building a data lake for a Japanese client, and Japan Standard Time is 
UTC+9. For that reason, it would be better to adjust the time zone for proper 
data management.
   I would appreciate it if you could suggest an alternative.
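
   One possible workaround, shown here only as a minimal sketch, is to convert the value on the reading side. The sketch assumes the commit time is a `yyyyMMddHHmmss` string that was written in UTC; both the layout and the source zone are assumptions that should be checked against the actual table. The class and method names (`CommitTimeZoneSketch`, `convert`, `utcToJst`) are hypothetical and are not part of Hudi.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class CommitTimeZoneSketch {
  // Assumed layout of the commit-time string; adjust if the table uses another one.
  private static final DateTimeFormatter COMMIT_TIME_FORMAT =
      DateTimeFormatter.ofPattern("yyyyMMddHHmmss");

  /** Re-expresses a commit-time string written in zone "from" as wall-clock time in zone "to". */
  public static String convert(String commitTime, ZoneId from, ZoneId to) {
    return LocalDateTime.parse(commitTime, COMMIT_TIME_FORMAT)
        .atZone(from)               // interpret the digits in the source zone
        .withZoneSameInstant(to)    // same instant, different wall clock
        .format(COMMIT_TIME_FORMAT);
  }

  /** Convenience wrapper: assumes the value was written in UTC and shifts it to JST (UTC+9). */
  public static String utcToJst(String commitTime) {
    return convert(commitTime, ZoneOffset.UTC, ZoneId.of("Asia/Tokyo"));
  }

  public static void main(String[] args) {
    System.out.println(utcToJst("20211217093000"));
  }
}
```

   This leaves the stored commit time untouched and changes only how it is displayed, so it does not affect how Hudi orders instants on the timeline.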






[GitHub] [hudi] manojpec commented on a change in pull request #4363: [HUDI-3029] Transaction manager: avoid deadlock when doing begin and end transactions

2021-12-17 Thread GitBox


manojpec commented on a change in pull request #4363:
URL: https://github.com/apache/hudi/pull/4363#discussion_r771776844



##
File path: 
hudi-client/hudi-client-common/src/test/java/org/apache/hudi/client/transaction/TestTransactionManager.java
##
@@ -0,0 +1,208 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.client.transaction;
+
+import org.apache.hudi.client.transaction.lock.InProcessLockProvider;
+import org.apache.hudi.common.model.HoodieFailedWritesCleaningPolicy;
+import org.apache.hudi.common.model.WriteConcurrencyMode;
+import org.apache.hudi.common.testutils.HoodieCommonTestHarness;
+import org.apache.hudi.config.HoodieCompactionConfig;
+import org.apache.hudi.config.HoodieLockConfig;
+import org.apache.hudi.config.HoodieWriteConfig;
+import org.apache.hudi.exception.HoodieLockException;
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.BeforeEach;
+import org.junit.jupiter.api.Test;
+
+import java.io.IOException;
+import java.util.concurrent.CountDownLatch;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicBoolean;
+
+import static org.junit.jupiter.api.Assertions.assertDoesNotThrow;
+import static org.junit.jupiter.api.Assertions.assertThrows;
+
+public class TestTransactionManager extends HoodieCommonTestHarness {
+  HoodieWriteConfig writeConfig;
+  TransactionManager transactionManager;
+
+  @BeforeEach
+  private void init() throws IOException {
+    initPath();
+    initMetaClient();
+    this.writeConfig = getWriteConfig();
+    this.transactionManager = new TransactionManager(this.writeConfig, this.metaClient.getFs());
+  }
+
+  private HoodieWriteConfig getWriteConfig() {
+    return HoodieWriteConfig.newBuilder()
+        .withPath(basePath)
+        .withCompactionConfig(HoodieCompactionConfig.newBuilder()
+            .withFailedWritesCleaningPolicy(HoodieFailedWritesCleaningPolicy.LAZY)
+            .build())
+        .withWriteConcurrencyMode(WriteConcurrencyMode.OPTIMISTIC_CONCURRENCY_CONTROL)
+        .withLockConfig(HoodieLockConfig.newBuilder()
+            .withLockProvider(InProcessLockProvider.class)
+            .build())
+        .build();
+  }
+
+  @Test
+  public void testSingleWriterTransaction() {
+    transactionManager.beginTransaction();
+    transactionManager.endTransaction();
+  }
+
+  @Test
+  public void testSingleWriterNestedTransaction() {
+    transactionManager.beginTransaction();
+    assertThrows(HoodieLockException.class, () -> {
+      transactionManager.beginTransaction();
+    });
+
+    transactionManager.endTransaction();
+    assertThrows(HoodieLockException.class, () -> {
+      transactionManager.endTransaction();
+    });
+  }
+
+  @Test
+  public void testSingleWriterMultipleTransactions() {
+    for (int i = 0; i < 32; i++) {
+      transactionManager.beginTransaction();
+      transactionManager.endTransaction();
+    }
+  }
+
+  @Test
+  public void testMultiWriterTransactions() {
+    final int threadCount = 3;
+    final long awaitMaxTimeoutMs = 2000L;
+    final CountDownLatch latch = new CountDownLatch(threadCount);
+    final AtomicBoolean writer1Completed = new AtomicBoolean(false);
+    final AtomicBoolean writer2Completed = new AtomicBoolean(false);
+
+    // Let writer1 get the lock first, then wait for others
+    // to join the sync up point.
+    Thread writer1 = new Thread(() -> {
+      assertDoesNotThrow(() -> {
+        transactionManager.beginTransaction();
+      });
+      latch.countDown();
+      try {
+        latch.await(awaitMaxTimeoutMs, TimeUnit.MILLISECONDS);
+        // The following sleep makes sure writer2 attempts
+        // to take the lock and gets blocked on the lock which
+        // this thread is currently holding.
+        Thread.sleep(50);
+      } catch (InterruptedException e) {
+        //
+      }
+      assertDoesNotThrow(() -> {
+        transactionManager.endTransaction();
+      });
+      writer1Completed.set(true);
+    });
+    writer1.start();
+
+    // Writer2 will block on trying to acquire the lock
+    // and will eventually get the lock before the timeout.
+    Thread writer2 = new Thread(() -> {
+      latch.countDown()

[GitHub] [hudi] manojpec commented on a change in pull request #4363: [HUDI-3029] Transaction manager: avoid deadlock when doing begin and end transactions

2021-12-17 Thread GitBox


manojpec commented on a change in pull request #4363:
URL: https://github.com/apache/hudi/pull/4363#discussion_r771776738



##
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/transaction/TransactionManager.java
##
@@ -55,28 +55,32 @@ public void beginTransaction() {
 
   public void beginTransaction(Option currentTxnOwnerInstant,
                                Option lastCompletedTxnOwnerInstant) {
-    if (supportsOptimisticConcurrency) {
+    if (isOptimisticConcurrencyControlEnabled) {
       LOG.info("Transaction starting for " + currentTxnOwnerInstant
           + "with latest completed transaction instant " + lastCompletedTxnOwnerInstant);
       lockManager.lock();
-      this.currentTxnOwnerInstant = currentTxnOwnerInstant;
-      this.lastCompletedTxnOwnerInstant = lastCompletedTxnOwnerInstant;
+      reset(currentTxnOwnerInstant, lastCompletedTxnOwnerInstant);
       LOG.info("Transaction started for " + currentTxnOwnerInstant
           + "with latest completed transaction instant " + lastCompletedTxnOwnerInstant);
     }
   }
 
   public void endTransaction() {
-    if (supportsOptimisticConcurrency) {
+    if (isOptimisticConcurrencyControlEnabled) {

Review comment:
   Good catch. I will close the gap in the reset with a CAS-like operation. 








[GitHub] [hudi] nsivabalan commented on issue #3879: [SUPPORT] Incomplete Table Migration

2021-12-17 Thread GitBox


nsivabalan commented on issue #3879:
URL: https://github.com/apache/hudi/issues/3879#issuecomment-997135231


   @jardel-lima : let us know if you have any updates or if you can share the 
dataset. 






[GitHub] [hudi] hudi-bot commented on pull request #4363: [HUDI-3029] Transaction manager: avoid deadlock when doing begin and end transactions

2021-12-17 Thread GitBox


hudi-bot commented on pull request #4363:
URL: https://github.com/apache/hudi/pull/4363#issuecomment-997134928


   
   ## CI report:
   
   * f0555fa1c09b27744084d20199683a1f8e68d9b7 UNKNOWN
   * a10ec8a603b8297e0a69246b4d33866c9b7f5ad6 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4466)
 
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4363: [HUDI-3029] Transaction manager: avoid deadlock when doing begin and end transactions

2021-12-17 Thread GitBox


hudi-bot removed a comment on pull request #4363:
URL: https://github.com/apache/hudi/pull/4363#issuecomment-997128379


   
   ## CI report:
   
   * f0555fa1c09b27744084d20199683a1f8e68d9b7 UNKNOWN
   * 46bfd4cb47cb7cba1185b9e146cfc8396a91af88 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4464)
 
   * a10ec8a603b8297e0a69246b4d33866c9b7f5ad6 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4466)
 
   
   




[GitHub] [hudi] nsivabalan commented on issue #3890: [SUPPORT] Hudi Sync did not add previous partitions

2021-12-17 Thread GitBox


nsivabalan commented on issue #3890:
URL: https://github.com/apache/hudi/issues/3890#issuecomment-997134256


   @stym06 : Can you respond to my questions above? I would like to get to the 
bottom of this. Hive sync generally keeps track of the last synced time, so I 
am not sure how this could happen. If you were able to resolve the issue, feel 
free to close it out. 






[GitHub] [hudi] hudi-bot removed a comment on pull request #4349: [MINOR] remove unused import in HoodieFileIndex

2021-12-17 Thread GitBox


hudi-bot removed a comment on pull request #4349:
URL: https://github.com/apache/hudi/pull/4349#issuecomment-997126936


   
   ## CI report:
   
   * 1677eab2ead6910016c2ed0b67640c97757633bd Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4410)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4431)
 
   * 29113e6aff644be7511d84ae8428a8597a5b10b2 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4465)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4349: [MINOR] remove unused import in HoodieFileIndex

2021-12-17 Thread GitBox


hudi-bot commented on pull request #4349:
URL: https://github.com/apache/hudi/pull/4349#issuecomment-997133775


   
   ## CI report:
   
   * 29113e6aff644be7511d84ae8428a8597a5b10b2 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4465)
 
   
   




[GitHub] [hudi] YannByron commented on issue #4200: spark-sql query timestamp partition error

2021-12-17 Thread GitBox


YannByron commented on issue #4200:
URL: https://github.com/apache/hudi/issues/4200#issuecomment-997132424


   @nsivabalan I'll look into this in the next few days and reply ASAP. 






[GitHub] [hudi] hudi-bot removed a comment on pull request #4333: [HUDI-431] Adding support for Parquet in MOR `LogBlock`s

2021-12-17 Thread GitBox


hudi-bot removed a comment on pull request #4333:
URL: https://github.com/apache/hudi/pull/4333#issuecomment-997129628


   
   ## CI report:
   
   * 286aa8b95627eaaa01114567797186263a830774 UNKNOWN
   * e722499ee75403ab62f646fdabca1a2c59570164 UNKNOWN
   * de0d4385394dc5d820964cefc872f099cee7a02b UNKNOWN
   * 67cbb2f4ab421fb7a90e4c5d1061613ed331c837 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4460)
 
   * cecde3b6734576c5f2863ec2b4b90689600cb746 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4333: [HUDI-431] Adding support for Parquet in MOR `LogBlock`s

2021-12-17 Thread GitBox


hudi-bot commented on pull request #4333:
URL: https://github.com/apache/hudi/pull/4333#issuecomment-997131996


   
   ## CI report:
   
   * 286aa8b95627eaaa01114567797186263a830774 UNKNOWN
   * e722499ee75403ab62f646fdabca1a2c59570164 UNKNOWN
   * de0d4385394dc5d820964cefc872f099cee7a02b UNKNOWN
   * 67cbb2f4ab421fb7a90e4c5d1061613ed331c837 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4460)
 
   * cecde3b6734576c5f2863ec2b4b90689600cb746 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4469)
 
   
   




[GitHub] [hudi] YannByron commented on issue #4154: [SUPPORT] INSERT OVERWRITE operation does not work when using Spark SQL

2021-12-17 Thread GitBox


YannByron commented on issue #4154:
URL: https://github.com/apache/hudi/issues/4154#issuecomment-997131446


   @nsivabalan I failed to reproduce this. @danny0405 can you reproduce this 
issue?
   And @BenjMaq, did you execute just these three steps: `create table`, 
`insert into`, and `insert overwrite`? Were there any other commits?






[GitHub] [hudi] alexeykudinkin commented on a change in pull request #4363: [HUDI-3029] Transaction manager: avoid deadlock when doing begin and end transactions

2021-12-17 Thread GitBox


alexeykudinkin commented on a change in pull request #4363:
URL: https://github.com/apache/hudi/pull/4363#discussion_r771773363



##
File path: 
hudi-client/hudi-client-common/src/test/java/org/apache/hudi/client/transaction/TestTransactionManager.java
##
@@ -0,0 +1,208 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.client.transaction;
+
+import org.apache.hudi.client.transaction.lock.InProcessLockProvider;
+import org.apache.hudi.common.model.HoodieFailedWritesCleaningPolicy;
+import org.apache.hudi.common.model.WriteConcurrencyMode;
+import org.apache.hudi.common.testutils.HoodieCommonTestHarness;
+import org.apache.hudi.config.HoodieCompactionConfig;
+import org.apache.hudi.config.HoodieLockConfig;
+import org.apache.hudi.config.HoodieWriteConfig;
+import org.apache.hudi.exception.HoodieLockException;
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.BeforeEach;
+import org.junit.jupiter.api.Test;
+
+import java.io.IOException;
+import java.util.concurrent.CountDownLatch;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicBoolean;
+
+import static org.junit.jupiter.api.Assertions.assertDoesNotThrow;
+import static org.junit.jupiter.api.Assertions.assertThrows;
+
+public class TestTransactionManager extends HoodieCommonTestHarness {
+  HoodieWriteConfig writeConfig;
+  TransactionManager transactionManager;
+
+  @BeforeEach
+  private void init() throws IOException {
+    initPath();
+    initMetaClient();
+    this.writeConfig = getWriteConfig();
+    this.transactionManager = new TransactionManager(this.writeConfig, this.metaClient.getFs());
+  }
+
+  private HoodieWriteConfig getWriteConfig() {
+    return HoodieWriteConfig.newBuilder()
+        .withPath(basePath)
+        .withCompactionConfig(HoodieCompactionConfig.newBuilder()
+            .withFailedWritesCleaningPolicy(HoodieFailedWritesCleaningPolicy.LAZY)
+            .build())
+        .withWriteConcurrencyMode(WriteConcurrencyMode.OPTIMISTIC_CONCURRENCY_CONTROL)
+        .withLockConfig(HoodieLockConfig.newBuilder()
+            .withLockProvider(InProcessLockProvider.class)
+            .build())
+        .build();
+  }
+
+  @Test
+  public void testSingleWriterTransaction() {
+    transactionManager.beginTransaction();
+    transactionManager.endTransaction();
+  }
+
+  @Test
+  public void testSingleWriterNestedTransaction() {
+    transactionManager.beginTransaction();
+    assertThrows(HoodieLockException.class, () -> {
+      transactionManager.beginTransaction();
+    });
+
+    transactionManager.endTransaction();
+    assertThrows(HoodieLockException.class, () -> {
+      transactionManager.endTransaction();
+    });
+  }
+
+  @Test
+  public void testSingleWriterMultipleTransactions() {
+    for (int i = 0; i < 32; i++) {
+      transactionManager.beginTransaction();
+      transactionManager.endTransaction();
+    }
+  }
+
+  @Test
+  public void testMultiWriterTransactions() {
+    final int threadCount = 3;
+    final long awaitMaxTimeoutMs = 2000L;
+    final CountDownLatch latch = new CountDownLatch(threadCount);
+    final AtomicBoolean writer1Completed = new AtomicBoolean(false);
+    final AtomicBoolean writer2Completed = new AtomicBoolean(false);
+
+    // Let writer1 get the lock first, then wait for others
+    // to join the sync up point.
+    Thread writer1 = new Thread(() -> {
+      assertDoesNotThrow(() -> {
+        transactionManager.beginTransaction();
+      });
+      latch.countDown();
+      try {
+        latch.await(awaitMaxTimeoutMs, TimeUnit.MILLISECONDS);
+        // The following sleep makes sure writer2 attempts
+        // to take the lock and gets blocked on the lock which
+        // this thread is currently holding.
+        Thread.sleep(50);
+      } catch (InterruptedException e) {
+        //
+      }
+      assertDoesNotThrow(() -> {
+        transactionManager.endTransaction();
+      });
+      writer1Completed.set(true);
+    });
+    writer1.start();
+
+    // Writer2 will block on trying to acquire the lock
+    // and will eventually get the lock before the timeout.
+    Thread writer2 = new Thread(() -> {
+      latch.count

[jira] [Updated] (HUDI-3029) TransactionManager synchronized begin/endTransaction() leading to deadlock

2021-12-17 Thread Manoj Govindassamy (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-3029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HUDI-3029:
-
Description: 
I see the TransactionManager has begin and end transactions as synchronized 
methods. Depending on the lock provider implementation, this can have adverse 
effects. If the lock provider's lock() or tryLock() call blocks (which is 
generally the case), then the following sequence will lead to a deadlock.

Client 1: beginTransaction() => txn manager instance lock acquired, lock() 
went through, instance lock released

Client 2: beginTransaction() => txn manager instance lock acquired, lock() is 
blocking 

Client 1: endTransaction() => Waiting to lock the txn manager instance to enter 
the synchronized method

 

 
{noformat}
public synchronized void beginTransaction(Option currentTxnOwnerInstant,
    Option lastCompletedTxnOwnerInstant) {
  if (supportsOptimisticConcurrency) {
    this.lastCompletedTxnOwnerInstant = lastCompletedTxnOwnerInstant;
    lockManager.setLatestCompletedWriteInstant(lastCompletedTxnOwnerInstant);
    LOG.info("Latest completed transaction instant " + lastCompletedTxnOwnerInstant);
    this.currentTxnOwnerInstant = currentTxnOwnerInstant;
    LOG.info("Transaction starting with transaction owner " + currentTxnOwnerInstant);
    lockManager.lock();
    LOG.info("Transaction started");
  }
}

public synchronized void endTransaction() {
  if (supportsOptimisticConcurrency) {
    LOG.info("Transaction ending with transaction owner " + currentTxnOwnerInstant);
    lockManager.unlock();
    LOG.info("Transaction ended");
    this.lastCompletedTxnOwnerInstant = Option.empty();
    lockManager.resetLatestCompletedWriteInstant();
  }
}{noformat}
 

 

The current model may appear to work when the lock provider's tryLock() 
implementation sleeps or retries with a timeout. However, the transaction 
manager layer cannot make assumptions about the lock provider implementation.
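
As an illustration only, here is a minimal, self-contained sketch of the non-synchronized alternative. It is not the actual TransactionManager: `TxnManagerSketch`, `providerLock`, and `runTwoWriters` are hypothetical names, and a plain `ReentrantLock` stands in for a blocking lock provider. The assumption is that leaving begin/end unsynchronized means a writer blocked in lock() holds no monitor that the lock owner needs in order to unlock.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class TxnManagerSketch {
  // Stand-in for the lock provider: lock() blocks until the lock is free.
  private final ReentrantLock providerLock = new ReentrantLock();

  // Deliberately NOT synchronized: a caller blocked here holds no instance monitor.
  public void beginTransaction() {
    providerLock.lock();
  }

  // Also not synchronized, so the lock owner can always get in to unlock.
  public void endTransaction() {
    providerLock.unlock();
  }

  /** Returns true if both writers finish within the timeout, i.e. no deadlock occurred. */
  public static boolean runTwoWriters(long timeoutMs) {
    TxnManagerSketch mgr = new TxnManagerSketch();
    CountDownLatch writer1HoldsLock = new CountDownLatch(1);
    CountDownLatch done = new CountDownLatch(2);

    Thread writer1 = new Thread(() -> {
      mgr.beginTransaction();
      writer1HoldsLock.countDown();
      try {
        Thread.sleep(50); // give writer2 time to block inside beginTransaction()
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      mgr.endTransaction();
      done.countDown();
    });

    Thread writer2 = new Thread(() -> {
      try {
        writer1HoldsLock.await(); // start only after writer1 owns the lock
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
      mgr.beginTransaction(); // blocks until writer1 ends its transaction
      mgr.endTransaction();
      done.countDown();
    });

    writer1.setDaemon(true);
    writer2.setDaemon(true);
    writer1.start();
    writer2.start();
    try {
      return done.await(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("completed without deadlock: " + runTwoWriters(2000));
  }
}
```

If both methods were marked synchronized, the second writer would block in lock() while holding the instance monitor, and the first writer could never enter endTransaction(), which is the sequence described above.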

 

cc: [~nishith29]  [~shivnarayan] 

  was:
I see the TransactionManager has begin and end transactions as synchronized 
methods. Based on the lock provider implementation, this can have adverse 
effects. Say the lock provider has the blocking call for the lock() or 
tryLock() (which is genereally the case), then the following sequence will lead 
to a deadlock.

Client 1: beginTransaction() => txn manager instance lock acquired,  lock() 
went through, instance lock released

Client 2: beginTransaction() => txn manager instance lock acquired, lock() is 
blocking 

Client 3: endTransaction() => Waiting to lock the txn manager instance to enter 
the synchronized method

 

 
{noformat}
public synchronized void beginTransaction(Option currentTxnOwnerInstant,
                                          Option lastCompletedTxnOwnerInstant) {
  if (supportsOptimisticConcurrency) {
    this.lastCompletedTxnOwnerInstant = lastCompletedTxnOwnerInstant;
    lockManager.setLatestCompletedWriteInstant(lastCompletedTxnOwnerInstant);
    LOG.info("Latest completed transaction instant " + lastCompletedTxnOwnerInstant);
    this.currentTxnOwnerInstant = currentTxnOwnerInstant;
    LOG.info("Transaction starting with transaction owner " + currentTxnOwnerInstant);
    lockManager.lock();
    LOG.info("Transaction started");
  }
}

public synchronized void endTransaction() {
  if (supportsOptimisticConcurrency) {
    LOG.info("Transaction ending with transaction owner " + currentTxnOwnerInstant);
    lockManager.unlock();
    LOG.info("Transaction ended");
    this.lastCompletedTxnOwnerInstant = Option.empty();
    lockManager.resetLatestCompletedWriteInstant();
  }
}{noformat}
 

 

The current model may appear to work only because the lock provider's 
tryLock() implementation sleeps or retries with a timeout. However, the 
transaction manager layer cannot make assumptions about the lock provider 
implementation.

 

cc: [~nishith29]  [~shivnarayan] 


> TransactionManager synchronized begin/endTransaction() leading to deadlock 
> ---
>
> Key: HUDI-3029
> URL: https://issues.apache.org/jira/browse/HUDI-3029
> Project: Apache Hudi
>  Issue Type: Task
>  Components: Writer Core
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 0.11.0
>
>
> The TransactionManager declares its begin and end transaction methods as 
> synchronized. Depending on the lock provider implementation, this can have 
> adverse effects. If the lock provider's lock() or tryLock() is a blocking 
> call (which is generally the case), the following sequence will lead to a 
> deadlock.
> Client 1: beginTransaction() => txn manager instance lock acquired,  lock() 
> went through, insta

[GitHub] [hudi] alexeykudinkin commented on a change in pull request #4363: [HUDI-3029] Transaction manager: avoid deadlock when doing begin and end transactions

2021-12-17 Thread GitBox


alexeykudinkin commented on a change in pull request #4363:
URL: https://github.com/apache/hudi/pull/4363#discussion_r771772917



##
File path: 
hudi-client/hudi-client-common/src/test/java/org/apache/hudi/client/transaction/TestTransactionManager.java
##
@@ -0,0 +1,208 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.client.transaction;
+
+import org.apache.hudi.client.transaction.lock.InProcessLockProvider;
+import org.apache.hudi.common.model.HoodieFailedWritesCleaningPolicy;
+import org.apache.hudi.common.model.WriteConcurrencyMode;
+import org.apache.hudi.common.testutils.HoodieCommonTestHarness;
+import org.apache.hudi.config.HoodieCompactionConfig;
+import org.apache.hudi.config.HoodieLockConfig;
+import org.apache.hudi.config.HoodieWriteConfig;
+import org.apache.hudi.exception.HoodieLockException;
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.BeforeEach;
+import org.junit.jupiter.api.Test;
+
+import java.io.IOException;
+import java.util.concurrent.CountDownLatch;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicBoolean;
+
+import static org.junit.jupiter.api.Assertions.assertDoesNotThrow;
+import static org.junit.jupiter.api.Assertions.assertThrows;
+
+public class TestTransactionManager extends HoodieCommonTestHarness {
+  HoodieWriteConfig writeConfig;
+  TransactionManager transactionManager;
+
+  @BeforeEach
+  private void init() throws IOException {
+initPath();
+initMetaClient();
+this.writeConfig = getWriteConfig();
+this.transactionManager = new TransactionManager(this.writeConfig, 
this.metaClient.getFs());
+  }
+
+  private HoodieWriteConfig getWriteConfig() {
+return HoodieWriteConfig.newBuilder()
+.withPath(basePath)
+.withCompactionConfig(HoodieCompactionConfig.newBuilder()
+
.withFailedWritesCleaningPolicy(HoodieFailedWritesCleaningPolicy.LAZY)
+.build())
+
.withWriteConcurrencyMode(WriteConcurrencyMode.OPTIMISTIC_CONCURRENCY_CONTROL)
+.withLockConfig(HoodieLockConfig.newBuilder()
+.withLockProvider(InProcessLockProvider.class)
+.build())
+.build();
+  }
+
+  @Test
+  public void testSingleWriterTransaction() {
+transactionManager.beginTransaction();
+transactionManager.endTransaction();
+  }
+
+  @Test
+  public void testSingleWriterNestedTransaction() {
+transactionManager.beginTransaction();
+assertThrows(HoodieLockException.class, () -> {
+  transactionManager.beginTransaction();
+});
+
+transactionManager.endTransaction();
+assertThrows(HoodieLockException.class, () -> {
+  transactionManager.endTransaction();
+});
+  }
+
+  @Test
+  public void testSingleWriterMultipleTransactions() {
+for (int i = 0; i < 32; i++) {
+  transactionManager.beginTransaction();
+  transactionManager.endTransaction();
+}
+  }
+
+  @Test
+  public void testMultiWriterTransactions() {
+final int threadCount = 3;
+final long awaitMaxTimeoutMs = 2000L;
+final CountDownLatch latch = new CountDownLatch(threadCount);
+final AtomicBoolean writer1Completed = new AtomicBoolean(false);
+final AtomicBoolean writer2Completed = new AtomicBoolean(false);
+
+// Let writer1 get the lock first, then wait for others
+// to join the sync up point.
+Thread writer1 = new Thread(() -> {
+  assertDoesNotThrow(() -> {
+transactionManager.beginTransaction();
+  });
+  latch.countDown();
+  try {
+latch.await(awaitMaxTimeoutMs, TimeUnit.MILLISECONDS);
+// The following sleep makes sure writer2 attempts
+// to lock and gets blocked on the lock which
+// this thread is currently holding.
+Thread.sleep(50);
+  } catch (InterruptedException e) {
+//
+  }
+  assertDoesNotThrow(() -> {
+transactionManager.endTransaction();
+  });
+  writer1Completed.set(true);
+});
+writer1.start();
+
+// Writer2 will block on trying to acquire the lock
+// and will eventually get the lock before the timeout.
+Thread writer2 = new Thread(() -> {
+  latch.count

[GitHub] [hudi] nsivabalan commented on a change in pull request #4363: [HUDI-3029] Transaction manager: avoid deadlock when doing begin and end transactions

2021-12-17 Thread GitBox


nsivabalan commented on a change in pull request #4363:
URL: https://github.com/apache/hudi/pull/4363#discussion_r771772721



##
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/transaction/TransactionManager.java
##
@@ -55,28 +55,32 @@ public void beginTransaction() {
 
   public void beginTransaction(Option currentTxnOwnerInstant,
Option 
lastCompletedTxnOwnerInstant) {
-if (supportsOptimisticConcurrency) {
+if (isOptimisticConcurrencyControlEnabled) {
   LOG.info("Transaction starting for " + currentTxnOwnerInstant
   + "with latest completed transaction instant " + 
lastCompletedTxnOwnerInstant);
   lockManager.lock();
-  this.currentTxnOwnerInstant = currentTxnOwnerInstant;
-  this.lastCompletedTxnOwnerInstant = lastCompletedTxnOwnerInstant;
+  reset(currentTxnOwnerInstant, lastCompletedTxnOwnerInstant);
   LOG.info("Transaction started for " + currentTxnOwnerInstant
   + "with latest completed transaction instant " + 
lastCompletedTxnOwnerInstant);
 }
   }
 
   public void endTransaction() {
-if (supportsOptimisticConcurrency) {
+if (isOptimisticConcurrencyControlEnabled) {

Review comment:
   Yes, but the failure happens only at L 72, right?








[GitHub] [hudi] manojpec commented on a change in pull request #4363: [HUDI-3029] Transaction manager: avoid deadlock when doing begin and end transactions

2021-12-17 Thread GitBox


manojpec commented on a change in pull request #4363:
URL: https://github.com/apache/hudi/pull/4363#discussion_r771772619



##
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/transaction/TransactionManager.java
##
@@ -55,28 +55,32 @@ public void beginTransaction() {
 
   public void beginTransaction(Option currentTxnOwnerInstant,
Option 
lastCompletedTxnOwnerInstant) {
-if (supportsOptimisticConcurrency) {
+if (isOptimisticConcurrencyControlEnabled) {
   LOG.info("Transaction starting for " + currentTxnOwnerInstant
   + "with latest completed transaction instant " + 
lastCompletedTxnOwnerInstant);
   lockManager.lock();
-  this.currentTxnOwnerInstant = currentTxnOwnerInstant;
-  this.lastCompletedTxnOwnerInstant = lastCompletedTxnOwnerInstant;
+  reset(currentTxnOwnerInstant, lastCompletedTxnOwnerInstant);
   LOG.info("Transaction started for " + currentTxnOwnerInstant
   + "with latest completed transaction instant " + 
lastCompletedTxnOwnerInstant);
 }
   }
 
   public void endTransaction() {
-if (supportsOptimisticConcurrency) {
+if (isOptimisticConcurrencyControlEnabled) {

Review comment:
   writer2's end transaction will fail because it doesn't hold the lock.
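
A small sketch of this point, assuming a lock whose unlock() checks ownership. It uses java.util.concurrent.locks.ReentrantLock as a stand-in rather than Hudi's lock provider, and the class and method names are hypothetical: writer2 times out on tryLock(), still unlocks in its finally block, and that unlock is rejected while writer1 keeps the lock.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

public class NonOwnerUnlockSketch {

  /**
   * writer1 holds the lock; writer2 times out acquiring it and still unlocks in finally.
   * Returns true if writer2's unlock was rejected and writer1 kept the lock.
   */
  public static boolean nonOwnerUnlockIsRejected() {
    ReentrantLock lock = new ReentrantLock(); // stand-in, not Hudi's lock provider
    AtomicBoolean rejected = new AtomicBoolean(false);
    lock.lock(); // writer1 (the calling thread) begins its transaction
    try {
      Thread writer2 = new Thread(() -> {
        boolean acquired = false;
        try {
          // writer1 never releases during this window, so this times out.
          acquired = lock.tryLock(50, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        } finally {
          try {
            lock.unlock(); // unconditional "end transaction" in finally
          } catch (IllegalMonitorStateException e) {
            rejected.set(!acquired); // unlock by a non-owner is refused
          }
        }
      });
      writer2.start();
      writer2.join();
      return rejected.get() && lock.isHeldByCurrentThread();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } finally {
      lock.unlock(); // writer1 ends its own transaction
    }
  }

  public static void main(String[] args) {
    System.out.println("non-owner unlock rejected: " + nonOwnerUnlockIsRejected());
  }
}
```

Whether any owner state is reset before the unlock fails depends on the order of statements in the real endTransaction(), which this sketch does not model.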








[GitHub] [hudi] nsivabalan commented on a change in pull request #4363: [HUDI-3029] Transaction manager: avoid deadlock when doing begin and end transactions

2021-12-17 Thread GitBox


nsivabalan commented on a change in pull request #4363:
URL: https://github.com/apache/hudi/pull/4363#discussion_r771772587



##
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/transaction/TransactionManager.java
##
@@ -55,28 +55,32 @@ public void beginTransaction() {
 
   public void beginTransaction(Option currentTxnOwnerInstant,
Option 
lastCompletedTxnOwnerInstant) {
-if (supportsOptimisticConcurrency) {
+if (isOptimisticConcurrencyControlEnabled) {
   LOG.info("Transaction starting for " + currentTxnOwnerInstant
   + "with latest completed transaction instant " + 
lastCompletedTxnOwnerInstant);
   lockManager.lock();
-  this.currentTxnOwnerInstant = currentTxnOwnerInstant;
-  this.lastCompletedTxnOwnerInstant = lastCompletedTxnOwnerInstant;
+  reset(currentTxnOwnerInstant, lastCompletedTxnOwnerInstant);
   LOG.info("Transaction started for " + currentTxnOwnerInstant
   + "with latest completed transaction instant " + 
lastCompletedTxnOwnerInstant);
 }
   }
 
   public void endTransaction() {
-if (supportsOptimisticConcurrency) {
+if (isOptimisticConcurrencyControlEnabled) {

Review comment:
   Help me understand something. Let's say writer1 acquires the lock and 
takes a long time to release it. writer2 tries to acquire the lock, but times out. 
In the finally block of any transaction handling code, we do end transaction, right? 
In this case, when writer2 fails to acquire the lock, will end transaction be called? 
   If yes, wouldn't writer2 reset the transaction owner at L 71?








[GitHub] [hudi] hudi-bot removed a comment on pull request #4333: [HUDI-431] Adding support for Parquet in MOR `LogBlock`s

2021-12-17 Thread GitBox


hudi-bot removed a comment on pull request #4333:
URL: https://github.com/apache/hudi/pull/4333#issuecomment-997118146


   
   ## CI report:
   
   * 286aa8b95627eaaa01114567797186263a830774 UNKNOWN
   * e722499ee75403ab62f646fdabca1a2c59570164 UNKNOWN
   * de0d4385394dc5d820964cefc872f099cee7a02b UNKNOWN
   * 67cbb2f4ab421fb7a90e4c5d1061613ed331c837 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4460)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4306: [HUDI-3014] add table option to set utc timezone

2021-12-17 Thread GitBox


hudi-bot commented on pull request #4306:
URL: https://github.com/apache/hudi/pull/4306#issuecomment-997129617


   
   ## CI report:
   
   * a39258ca69c6302da42cdb1fe1a0794676480952 UNKNOWN
   * a1ba1e2c81b74948a93589c3192ab24ef320107b UNKNOWN
   * c347bb78b3c799dce34db7a00c7f6a07c95ec777 UNKNOWN
   * d6a0ac9027bf12362b56729a86e9755dbe1c21db UNKNOWN
   * 4fd974b6b45e337f75bfaa9e6d54dc7e82cf1473 UNKNOWN
   * 0afe75ecfe523bdc74c8c37ba50de0cb0601166d UNKNOWN
   * 70457e0ba0b8dfd4ae63fd8c096abbbf051d6256 UNKNOWN
   * 6ae6b2781237d7e4af95bd78062c3da765ebe9a2 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4434)
 
   * bfdbb7db27a02f6c414769e58aa8cb1e841c3a21 UNKNOWN
   * a3e3b87be58f705d665f73e938977ac13b314657 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4467)
 
   
   




[GitHub] [hudi] hudi-bot commented on pull request #4333: [HUDI-431] Adding support for Parquet in MOR `LogBlock`s

2021-12-17 Thread GitBox


hudi-bot commented on pull request #4333:
URL: https://github.com/apache/hudi/pull/4333#issuecomment-997129628


   
   ## CI report:
   
   * 286aa8b95627eaaa01114567797186263a830774 UNKNOWN
   * e722499ee75403ab62f646fdabca1a2c59570164 UNKNOWN
   * de0d4385394dc5d820964cefc872f099cee7a02b UNKNOWN
   * 67cbb2f4ab421fb7a90e4c5d1061613ed331c837 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4460)
 
   * cecde3b6734576c5f2863ec2b4b90689600cb746 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4306: [HUDI-3014] add table option to set utc timezone

2021-12-17 Thread GitBox


hudi-bot removed a comment on pull request #4306:
URL: https://github.com/apache/hudi/pull/4306#issuecomment-997128969


   
   ## CI report:
   
   * a39258ca69c6302da42cdb1fe1a0794676480952 UNKNOWN
   * a1ba1e2c81b74948a93589c3192ab24ef320107b UNKNOWN
   * c347bb78b3c799dce34db7a00c7f6a07c95ec777 UNKNOWN
   * d6a0ac9027bf12362b56729a86e9755dbe1c21db UNKNOWN
   * 4fd974b6b45e337f75bfaa9e6d54dc7e82cf1473 UNKNOWN
   * 0afe75ecfe523bdc74c8c37ba50de0cb0601166d UNKNOWN
   * 70457e0ba0b8dfd4ae63fd8c096abbbf051d6256 UNKNOWN
   * 6ae6b2781237d7e4af95bd78062c3da765ebe9a2 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4434)
 
   * bfdbb7db27a02f6c414769e58aa8cb1e841c3a21 UNKNOWN
   * a3e3b87be58f705d665f73e938977ac13b314657 UNKNOWN
   
   




[GitHub] [hudi] manojpec commented on a change in pull request #4363: [HUDI-3029] Transaction manager: avoid deadlock when doing begin and end transactions

2021-12-17 Thread GitBox


manojpec commented on a change in pull request #4363:
URL: https://github.com/apache/hudi/pull/4363#discussion_r771772308



##
File path: 
hudi-client/hudi-client-common/src/test/java/org/apache/hudi/client/transaction/TestTransactionManager.java
##
@@ -0,0 +1,208 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.client.transaction;
+
+import org.apache.hudi.client.transaction.lock.InProcessLockProvider;
+import org.apache.hudi.common.model.HoodieFailedWritesCleaningPolicy;
+import org.apache.hudi.common.model.WriteConcurrencyMode;
+import org.apache.hudi.common.testutils.HoodieCommonTestHarness;
+import org.apache.hudi.config.HoodieCompactionConfig;
+import org.apache.hudi.config.HoodieLockConfig;
+import org.apache.hudi.config.HoodieWriteConfig;
+import org.apache.hudi.exception.HoodieLockException;
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.BeforeEach;
+import org.junit.jupiter.api.Test;
+
+import java.io.IOException;
+import java.util.concurrent.CountDownLatch;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicBoolean;
+
+import static org.junit.jupiter.api.Assertions.assertDoesNotThrow;
+import static org.junit.jupiter.api.Assertions.assertThrows;
+
+public class TestTransactionManager extends HoodieCommonTestHarness {
+  HoodieWriteConfig writeConfig;
+  TransactionManager transactionManager;
+
+  @BeforeEach
+  private void init() throws IOException {
+initPath();
+initMetaClient();
+this.writeConfig = getWriteConfig();
+this.transactionManager = new TransactionManager(this.writeConfig, 
this.metaClient.getFs());
+  }
+
+  private HoodieWriteConfig getWriteConfig() {
+return HoodieWriteConfig.newBuilder()
+.withPath(basePath)
+.withCompactionConfig(HoodieCompactionConfig.newBuilder()
+
.withFailedWritesCleaningPolicy(HoodieFailedWritesCleaningPolicy.LAZY)
+.build())
+
.withWriteConcurrencyMode(WriteConcurrencyMode.OPTIMISTIC_CONCURRENCY_CONTROL)
+.withLockConfig(HoodieLockConfig.newBuilder()
+.withLockProvider(InProcessLockProvider.class)
+.build())
+.build();
+  }
+
+  @Test
+  public void testSingleWriterTransaction() {
+transactionManager.beginTransaction();
+transactionManager.endTransaction();
+  }
+
+  @Test
+  public void testSingleWriterNestedTransaction() {
+transactionManager.beginTransaction();
+assertThrows(HoodieLockException.class, () -> {
+  transactionManager.beginTransaction();
+});
+
+transactionManager.endTransaction();
+assertThrows(HoodieLockException.class, () -> {
+  transactionManager.endTransaction();
+});
+  }
+
+  @Test
+  public void testSingleWriterMultipleTransactions() {
+for (int i = 0; i < 32; i++) {
+  transactionManager.beginTransaction();
+  transactionManager.endTransaction();
+}
+  }
+
+  @Test
+  public void testMultiWriterTransactions() {
+final int threadCount = 3;
+final long awaitMaxTimeoutMs = 2000L;
+final CountDownLatch latch = new CountDownLatch(threadCount);
+final AtomicBoolean writer1Completed = new AtomicBoolean(false);
+final AtomicBoolean writer2Completed = new AtomicBoolean(false);
+
+// Let writer1 get the lock first, then wait for others
+// to join the sync up point.
+Thread writer1 = new Thread(() -> {
+  assertDoesNotThrow(() -> {
+transactionManager.beginTransaction();
+  });
+  latch.countDown();
+  try {
+latch.await(awaitMaxTimeoutMs, TimeUnit.MILLISECONDS);
+// The following sleep makes sure writer2 attempts
+// to lock and gets blocked on the lock which
+// this thread is currently holding.
+Thread.sleep(50);
+  } catch (InterruptedException e) {
+//
+  }
+  assertDoesNotThrow(() -> {
+transactionManager.endTransaction();
+  });
+  writer1Completed.set(true);
+});
+writer1.start();
+
+// Writer2 will block on trying to acquire the lock
+// and will eventually get the lock before the timeout.
+Thread writer2 = new Thread(() -> {
+  latch.countDown()

[GitHub] [hudi] manojpec commented on a change in pull request #4363: [HUDI-3029] Transaction manager: avoid deadlock when doing begin and end transactions

2021-12-17 Thread GitBox


manojpec commented on a change in pull request #4363:
URL: https://github.com/apache/hudi/pull/4363#discussion_r771772205



##
File path: 
hudi-client/hudi-client-common/src/test/java/org/apache/hudi/client/transaction/TestTransactionManager.java
##
@@ -0,0 +1,208 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.client.transaction;
+
+import org.apache.hudi.client.transaction.lock.InProcessLockProvider;
+import org.apache.hudi.common.model.HoodieFailedWritesCleaningPolicy;
+import org.apache.hudi.common.model.WriteConcurrencyMode;
+import org.apache.hudi.common.testutils.HoodieCommonTestHarness;
+import org.apache.hudi.config.HoodieCompactionConfig;
+import org.apache.hudi.config.HoodieLockConfig;
+import org.apache.hudi.config.HoodieWriteConfig;
+import org.apache.hudi.exception.HoodieLockException;
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.BeforeEach;
+import org.junit.jupiter.api.Test;
+
+import java.io.IOException;
+import java.util.concurrent.CountDownLatch;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicBoolean;
+
+import static org.junit.jupiter.api.Assertions.assertDoesNotThrow;
+import static org.junit.jupiter.api.Assertions.assertThrows;
+
+public class TestTransactionManager extends HoodieCommonTestHarness {
+  HoodieWriteConfig writeConfig;
+  TransactionManager transactionManager;
+
+  @BeforeEach
+  private void init() throws IOException {
+initPath();
+initMetaClient();
+this.writeConfig = getWriteConfig();
+this.transactionManager = new TransactionManager(this.writeConfig, 
this.metaClient.getFs());
+  }
+
+  private HoodieWriteConfig getWriteConfig() {
+return HoodieWriteConfig.newBuilder()
+.withPath(basePath)
+.withCompactionConfig(HoodieCompactionConfig.newBuilder()
+
.withFailedWritesCleaningPolicy(HoodieFailedWritesCleaningPolicy.LAZY)
+.build())
+
.withWriteConcurrencyMode(WriteConcurrencyMode.OPTIMISTIC_CONCURRENCY_CONTROL)
+.withLockConfig(HoodieLockConfig.newBuilder()
+.withLockProvider(InProcessLockProvider.class)
+.build())
+.build();
+  }
+
+  @Test
+  public void testSingleWriterTransaction() {
+transactionManager.beginTransaction();
+transactionManager.endTransaction();
+  }
+
+  @Test
+  public void testSingleWriterNestedTransaction() {
+transactionManager.beginTransaction();
+assertThrows(HoodieLockException.class, () -> {
+  transactionManager.beginTransaction();
+});
+
+transactionManager.endTransaction();
+assertThrows(HoodieLockException.class, () -> {
+  transactionManager.endTransaction();
+});
+  }
+
+  @Test
+  public void testSingleWriterMultipleTransactions() {
+for (int i = 0; i < 32; i++) {
+  transactionManager.beginTransaction();
+  transactionManager.endTransaction();
+}
+  }
+
+  @Test
+  public void testMultiWriterTransactions() {
+final int threadCount = 3;
+final long awaitMaxTimeoutMs = 2000L;
+final CountDownLatch latch = new CountDownLatch(threadCount);
+final AtomicBoolean writer1Completed = new AtomicBoolean(false);
+final AtomicBoolean writer2Completed = new AtomicBoolean(false);
+
+// Let writer1 get the lock first, then wait for others
+// to join the sync up point.
+Thread writer1 = new Thread(() -> {
+  assertDoesNotThrow(() -> {
+transactionManager.beginTransaction();
+  });
+  latch.countDown();
+  try {
+latch.await(awaitMaxTimeoutMs, TimeUnit.MILLISECONDS);
+// The following sleep makes sure writer2 attempts
+// to lock and gets blocked on the lock which
+// this thread is currently holding.
+Thread.sleep(50);
+  } catch (InterruptedException e) {
+//
+  }
+  assertDoesNotThrow(() -> {
+transactionManager.endTransaction();
+  });
+  writer1Completed.set(true);
+});
+writer1.start();
+
+// Writer2 will block on trying to acquire the lock
+// and will eventually get the lock before the timeout.
+Thread writer2 = new Thread(() -> {
+  latch.countDown()

[GitHub] [hudi] hudi-bot commented on pull request #4306: [HUDI-3014] add table option to set utc timezone

2021-12-17 Thread GitBox


hudi-bot commented on pull request #4306:
URL: https://github.com/apache/hudi/pull/4306#issuecomment-997128969


   
   ## CI report:
   
   * a39258ca69c6302da42cdb1fe1a0794676480952 UNKNOWN
   * a1ba1e2c81b74948a93589c3192ab24ef320107b UNKNOWN
   * c347bb78b3c799dce34db7a00c7f6a07c95ec777 UNKNOWN
   * d6a0ac9027bf12362b56729a86e9755dbe1c21db UNKNOWN
   * 4fd974b6b45e337f75bfaa9e6d54dc7e82cf1473 UNKNOWN
   * 0afe75ecfe523bdc74c8c37ba50de0cb0601166d UNKNOWN
   * 70457e0ba0b8dfd4ae63fd8c096abbbf051d6256 UNKNOWN
   * 6ae6b2781237d7e4af95bd78062c3da765ebe9a2 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4434)
 
   * bfdbb7db27a02f6c414769e58aa8cb1e841c3a21 UNKNOWN
   * a3e3b87be58f705d665f73e938977ac13b314657 UNKNOWN
   
   




[GitHub] [hudi] hudi-bot removed a comment on pull request #4306: [HUDI-3014] add table option to set utc timezone

2021-12-17 Thread GitBox


hudi-bot removed a comment on pull request #4306:
URL: https://github.com/apache/hudi/pull/4306#issuecomment-997126921


   
   ## CI report:
   
   * a39258ca69c6302da42cdb1fe1a0794676480952 UNKNOWN
   * a1ba1e2c81b74948a93589c3192ab24ef320107b UNKNOWN
   * c347bb78b3c799dce34db7a00c7f6a07c95ec777 UNKNOWN
   * d6a0ac9027bf12362b56729a86e9755dbe1c21db UNKNOWN
   * 4fd974b6b45e337f75bfaa9e6d54dc7e82cf1473 UNKNOWN
   * 0afe75ecfe523bdc74c8c37ba50de0cb0601166d UNKNOWN
   * 70457e0ba0b8dfd4ae63fd8c096abbbf051d6256 UNKNOWN
   * 6ae6b2781237d7e4af95bd78062c3da765ebe9a2 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4434)
 
   * bfdbb7db27a02f6c414769e58aa8cb1e841c3a21 UNKNOWN
   
   




[hudi] branch master updated (7784249 -> 4785244)

2021-12-17 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository.

sivabalan pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git.


from 7784249  [HUDI-2962] InProcess lock provider to guard single writer 
process with async table operations (#4259)
 add 4785244  [HUDI-3043] De-coupling multi writer tests (#4362)

No new revisions were added by this update.

Summary of changes:
 .../TestHoodieDeltaStreamerWithMultiWriter.java| 22 ++
 1 file changed, 18 insertions(+), 4 deletions(-)


[GitHub] [hudi] nsivabalan merged pull request #4362: [HUDI-3043] De-coupling multi writer tests

2021-12-17 Thread GitBox


nsivabalan merged pull request #4362:
URL: https://github.com/apache/hudi/pull/4362


   






[GitHub] [hudi] manojpec commented on a change in pull request #4363: [HUDI-3029] Transaction manager: avoid deadlock when doing begin and end transactions

2021-12-17 Thread GitBox


manojpec commented on a change in pull request #4363:
URL: https://github.com/apache/hudi/pull/4363#discussion_r771771785



##
File path: 
hudi-client/hudi-client-common/src/test/java/org/apache/hudi/client/transaction/TestTransactionManager.java
##
@@ -0,0 +1,208 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.client.transaction;
+
+import org.apache.hudi.client.transaction.lock.InProcessLockProvider;
+import org.apache.hudi.common.model.HoodieFailedWritesCleaningPolicy;
+import org.apache.hudi.common.model.WriteConcurrencyMode;
+import org.apache.hudi.common.testutils.HoodieCommonTestHarness;
+import org.apache.hudi.config.HoodieCompactionConfig;
+import org.apache.hudi.config.HoodieLockConfig;
+import org.apache.hudi.config.HoodieWriteConfig;
+import org.apache.hudi.exception.HoodieLockException;
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.BeforeEach;
+import org.junit.jupiter.api.Test;
+
+import java.io.IOException;
+import java.util.concurrent.CountDownLatch;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicBoolean;
+
+import static org.junit.jupiter.api.Assertions.assertDoesNotThrow;
+import static org.junit.jupiter.api.Assertions.assertThrows;
+
+public class TestTransactionManager extends HoodieCommonTestHarness {
+  HoodieWriteConfig writeConfig;
+  TransactionManager transactionManager;
+
+  @BeforeEach
+  private void init() throws IOException {
+    initPath();
+    initMetaClient();
+    this.writeConfig = getWriteConfig();
+    this.transactionManager = new TransactionManager(this.writeConfig, this.metaClient.getFs());
+  }
+
+  private HoodieWriteConfig getWriteConfig() {
+    return HoodieWriteConfig.newBuilder()
+        .withPath(basePath)
+        .withCompactionConfig(HoodieCompactionConfig.newBuilder()
+            .withFailedWritesCleaningPolicy(HoodieFailedWritesCleaningPolicy.LAZY)
+            .build())
+        .withWriteConcurrencyMode(WriteConcurrencyMode.OPTIMISTIC_CONCURRENCY_CONTROL)
+        .withLockConfig(HoodieLockConfig.newBuilder()
+            .withLockProvider(InProcessLockProvider.class)
+            .build())
+        .build();
+  }
+
+  @Test
+  public void testSingleWriterTransaction() {
+    transactionManager.beginTransaction();
+    transactionManager.endTransaction();
+  }
+
+  @Test
+  public void testSingleWriterNestedTransaction() {
+    transactionManager.beginTransaction();
+    assertThrows(HoodieLockException.class, () -> {
+      transactionManager.beginTransaction();
+    });
+
+    transactionManager.endTransaction();
+    assertThrows(HoodieLockException.class, () -> {
+      transactionManager.endTransaction();
+    });
+  }
+
+  @Test
+  public void testSingleWriterMultipleTransactions() {

Review comment:
   This tests that the same thread is able to do multiple transactions.








[GitHub] [hudi] hudi-bot removed a comment on pull request #4363: [HUDI-3029] Transaction manager: avoid deadlock when doing begin and end transactions

2021-12-17 Thread GitBox


hudi-bot removed a comment on pull request #4363:
URL: https://github.com/apache/hudi/pull/4363#issuecomment-997126961


   
   ## CI report:
   
   * f0555fa1c09b27744084d20199683a1f8e68d9b7 UNKNOWN
   * 46bfd4cb47cb7cba1185b9e146cfc8396a91af88 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4464)
   * a10ec8a603b8297e0a69246b4d33866c9b7f5ad6 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   






[GitHub] [hudi] hudi-bot commented on pull request #4363: [HUDI-3029] Transaction manager: avoid deadlock when doing begin and end transactions

2021-12-17 Thread GitBox


hudi-bot commented on pull request #4363:
URL: https://github.com/apache/hudi/pull/4363#issuecomment-997128379


   
   ## CI report:
   
   * f0555fa1c09b27744084d20199683a1f8e68d9b7 UNKNOWN
   * 46bfd4cb47cb7cba1185b9e146cfc8396a91af88 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4464)
   * a10ec8a603b8297e0a69246b4d33866c9b7f5ad6 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4466)
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   






[jira] [Assigned] (HUDI-3059) save point rollback not working with hudi-cli

2021-12-17 Thread sivabalan narayanan (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-3059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sivabalan narayanan reassigned HUDI-3059:
-

Assignee: sivabalan narayanan

> save point rollback not working with hudi-cli
> -
>
> Key: HUDI-3059
> URL: https://issues.apache.org/jira/browse/HUDI-3059
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Usability
>Reporter: sivabalan narayanan
>Assignee: sivabalan narayanan
>Priority: Major
>  Labels: sev:critical
>
> Ref issue:
> [https://github.com/apache/hudi/issues/3870]
>  
>  # create Hudi dataset
>  # add some data so there are multiple commits
>  # create a savepoint
>  # try to rollback savepoint
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HUDI-3059) save point rollback not working with hudi-cli

2021-12-17 Thread sivabalan narayanan (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-3059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sivabalan narayanan updated HUDI-3059:
--
Labels: sev:critical  (was: )

> save point rollback not working with hudi-cli
> -
>
> Key: HUDI-3059
> URL: https://issues.apache.org/jira/browse/HUDI-3059
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Usability
>Reporter: sivabalan narayanan
>Priority: Major
>  Labels: sev:critical
>
> Ref issue:
> [https://github.com/apache/hudi/issues/3870]
>  
>  # create Hudi dataset
>  # add some data so there are multiple commits
>  # create a savepoint
>  # try to rollback savepoint
>  
>  





[jira] [Created] (HUDI-3059) save point rollback not working with hudi-cli

2021-12-17 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-3059:
-

 Summary: save point rollback not working with hudi-cli
 Key: HUDI-3059
 URL: https://issues.apache.org/jira/browse/HUDI-3059
 Project: Apache Hudi
  Issue Type: Bug
  Components: Usability
Reporter: sivabalan narayanan


Ref issue:

[https://github.com/apache/hudi/issues/3870]

 
 # create Hudi dataset
 # add some data so there are multiple commits
 # create a savepoint
 # try to rollback savepoint

 

 





[jira] [Assigned] (HUDI-3058) SqlQueryEqualityPreCommitValidator errors with java.util.ConcurrentModificationException

2021-12-17 Thread sivabalan narayanan (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-3058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sivabalan narayanan reassigned HUDI-3058:
-

Assignee: satish

> SqlQueryEqualityPreCommitValidator errors with 
> java.util.ConcurrentModificationException
> 
>
> Key: HUDI-3058
> URL: https://issues.apache.org/jira/browse/HUDI-3058
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Usability
>Affects Versions: 0.10.0
>Reporter: sivabalan narayanan
>Assignee: satish
>Priority: Major
>  Labels: sev:high
> Fix For: 0.11.0
>
>
> Ref issue: [https://github.com/apache/hudi/issues/4109]
>  
> Faced a ConcurrentModificationException when trying to test 
> SqlQueryEqualityPreCommitValidator in the quickstart guide.
> *To Reproduce*
> Steps to reproduce the behavior:
>  # Insert data without any pre-commit validations
>  # Update data (ensuring the updates don't touch the fare column in the quickstart 
> example) with the following pre-commit validator props
> {{option("hoodie.precommit.validators", 
> "org.apache.hudi.client.validator.SqlQueryEqualityPreCommitValidator").
> option("hoodie.precommit.validators.equality.sql.queries", "select sum(fare) 
> from ").}}
> stacktrace:
> {code:java}
> org.apache.hudi.exception.HoodieUpsertException: Failed to upsert for commit time 20211124114945342
> at org.apache.hudi.table.action.commit.AbstractWriteHelper.write(AbstractWriteHelper.java:62)
> at org.apache.hudi.table.action.commit.SparkUpsertCommitActionExecutor.execute(SparkUpsertCommitActionExecutor.java:46)
> at org.apache.hudi.table.HoodieSparkCopyOnWriteTable.upsert(HoodieSparkCopyOnWriteTable.java:111)
> at org.apache.hudi.table.HoodieSparkCopyOnWriteTable.upsert(HoodieSparkCopyOnWriteTable.java:95)
> at org.apache.hudi.client.SparkRDDWriteClient.upsert(SparkRDDWriteClient.java:174)
> at org.apache.hudi.DataSourceUtils.doWriteOperation(DataSourceUtils.java:214)
> at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:276)
> at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:164)
> at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:45)
> at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
> at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
> at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:86)
> at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
> at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
> at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
> at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
> at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
> at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
> at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:80)
> at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:80)
> at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676)
> at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676)
> at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
> at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
> at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
> at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:676)
> at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:285)
> at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:271)
> at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:229)
> ... 70 elided
> Caused by: java.util.ConcurrentModificationException
> at java.util.HashMap$ValueSpliterator.forEachRemaining(HashMap.java:1633)
> at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:743)
> at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
> at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
> at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
> at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
> at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
> at java.util.stream.ReferencePipeline.forEach(Referen
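
The `Caused by` frames in the trace above (`HashMap$ValueSpliterator.forEachRemaining` reached through `Streams$ConcatSpliterator` and `forEach`) are what the JDK produces when a `HashMap` is structurally modified while a stream over its values is being consumed. Whether that is what happens inside the validator is not established by this issue; the following is only a minimal, standalone sketch of that general failure mode and of one common way around it. The class and method names (`CmeSketch`, `mutatingDuringForEachThrows`, `mutatingOverSnapshotSize`) are hypothetical and are not Hudi code.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Stream;

// Hypothetical demo class; it does not reproduce Hudi's validator code.
final class CmeSketch {

  // Streams over the values of a HashMap (through Stream.concat, as in the
  // trace) and adds new keys from inside forEach.
  // Returns true if the traversal fails with ConcurrentModificationException.
  static boolean mutatingDuringForEachThrows() {
    Map<String, Integer> fares = new HashMap<>();
    fares.put("a", 1);
    fares.put("b", 2);
    try {
      Stream.concat(fares.values().stream(), Stream.<Integer>empty())
          .forEach(v -> fares.put("new-" + v, v)); // structural change mid-traversal
      return false;
    } catch (ConcurrentModificationException e) {
      return true;
    }
  }

  // Same mutation, but the traversal runs over a snapshot of the values,
  // so the map can be changed freely. Returns the final map size.
  static int mutatingOverSnapshotSize() {
    Map<String, Integer> fares = new HashMap<>();
    fares.put("a", 1);
    fares.put("b", 2);
    List<Integer> snapshot = new ArrayList<>(fares.values());
    snapshot.forEach(v -> fares.put("new-" + v, v));
    return fares.size();
  }
}
```

Iterating over a copy trades a small allocation for safety; when the modification comes from another thread, a concurrent map or external synchronization would be needed instead.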

[jira] [Updated] (HUDI-3058) SqlQueryEqualityPreCommitValidator errors with java.util.ConcurrentModificationException

2021-12-17 Thread sivabalan narayanan (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-3058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sivabalan narayanan updated HUDI-3058:
--
Labels: sev:high  (was: )

> SqlQueryEqualityPreCommitValidator errors with 
> java.util.ConcurrentModificationException
> 
>
> Key: HUDI-3058
> URL: https://issues.apache.org/jira/browse/HUDI-3058
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Usability
>Affects Versions: 0.10.0
>Reporter: sivabalan narayanan
>Priority: Major
>  Labels: sev:high
> Fix For: 0.11.0
>
>
> Ref issue: [https://github.com/apache/hudi/issues/4109]
>  
> Faced a ConcurrentModificationException when trying to test 
> SqlQueryEqualityPreCommitValidator in the quickstart guide.
> *To Reproduce*
> Steps to reproduce the behavior:
>  # Insert data without any pre-commit validations
>  # Update data (ensuring the updates don't touch the fare column in the quickstart 
> example) with the following pre-commit validator props
> {{option("hoodie.precommit.validators", 
> "org.apache.hudi.client.validator.SqlQueryEqualityPreCommitValidator").
> option("hoodie.precommit.validators.equality.sql.queries", "select sum(fare) 
> from ").}}
> stacktrace:
> {code:java}
> org.apache.hudi.exception.HoodieUpsertException: Failed to upsert for commit time 20211124114945342
> at org.apache.hudi.table.action.commit.AbstractWriteHelper.write(AbstractWriteHelper.java:62)
> at org.apache.hudi.table.action.commit.SparkUpsertCommitActionExecutor.execute(SparkUpsertCommitActionExecutor.java:46)
> at org.apache.hudi.table.HoodieSparkCopyOnWriteTable.upsert(HoodieSparkCopyOnWriteTable.java:111)
> at org.apache.hudi.table.HoodieSparkCopyOnWriteTable.upsert(HoodieSparkCopyOnWriteTable.java:95)
> at org.apache.hudi.client.SparkRDDWriteClient.upsert(SparkRDDWriteClient.java:174)
> at org.apache.hudi.DataSourceUtils.doWriteOperation(DataSourceUtils.java:214)
> at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:276)
> at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:164)
> at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:45)
> at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
> at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
> at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:86)
> at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
> at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
> at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
> at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
> at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
> at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
> at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:80)
> at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:80)
> at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676)
> at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676)
> at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
> at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
> at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
> at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:676)
> at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:285)
> at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:271)
> at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:229)
> ... 70 elided
> Caused by: java.util.ConcurrentModificationException
> at java.util.HashMap$ValueSpliterator.forEachRemaining(HashMap.java:1633)
> at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:743)
> at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
> at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
> at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
> at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
> at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
> at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
> at ja

[GitHub] [hudi] alexeykudinkin commented on a change in pull request #4363: [HUDI-3029] Transaction manager: avoid deadlock when doing begin and end transactions

2021-12-17 Thread GitBox


alexeykudinkin commented on a change in pull request #4363:
URL: https://github.com/apache/hudi/pull/4363#discussion_r771769971



##
File path: 
hudi-client/hudi-client-common/src/test/java/org/apache/hudi/client/transaction/TestTransactionManager.java
##
@@ -0,0 +1,208 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.client.transaction;
+
+import org.apache.hudi.client.transaction.lock.InProcessLockProvider;
+import org.apache.hudi.common.model.HoodieFailedWritesCleaningPolicy;
+import org.apache.hudi.common.model.WriteConcurrencyMode;
+import org.apache.hudi.common.testutils.HoodieCommonTestHarness;
+import org.apache.hudi.config.HoodieCompactionConfig;
+import org.apache.hudi.config.HoodieLockConfig;
+import org.apache.hudi.config.HoodieWriteConfig;
+import org.apache.hudi.exception.HoodieLockException;
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.BeforeEach;
+import org.junit.jupiter.api.Test;
+
+import java.io.IOException;
+import java.util.concurrent.CountDownLatch;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicBoolean;
+
+import static org.junit.jupiter.api.Assertions.assertDoesNotThrow;
+import static org.junit.jupiter.api.Assertions.assertThrows;
+
+public class TestTransactionManager extends HoodieCommonTestHarness {
+  HoodieWriteConfig writeConfig;
+  TransactionManager transactionManager;
+
+  @BeforeEach
+  private void init() throws IOException {
+    initPath();
+    initMetaClient();
+    this.writeConfig = getWriteConfig();
+    this.transactionManager = new TransactionManager(this.writeConfig, this.metaClient.getFs());
+  }
+
+  private HoodieWriteConfig getWriteConfig() {
+    return HoodieWriteConfig.newBuilder()
+        .withPath(basePath)
+        .withCompactionConfig(HoodieCompactionConfig.newBuilder()
+            .withFailedWritesCleaningPolicy(HoodieFailedWritesCleaningPolicy.LAZY)
+            .build())
+        .withWriteConcurrencyMode(WriteConcurrencyMode.OPTIMISTIC_CONCURRENCY_CONTROL)
+        .withLockConfig(HoodieLockConfig.newBuilder()
+            .withLockProvider(InProcessLockProvider.class)
+            .build())
+        .build();
+  }
+
+  @Test
+  public void testSingleWriterTransaction() {
+    transactionManager.beginTransaction();
+    transactionManager.endTransaction();
+  }
+
+  @Test
+  public void testSingleWriterNestedTransaction() {
+    transactionManager.beginTransaction();
+    assertThrows(HoodieLockException.class, () -> {
+      transactionManager.beginTransaction();
+    });
+
+    transactionManager.endTransaction();
+    assertThrows(HoodieLockException.class, () -> {
+      transactionManager.endTransaction();
+    });
+  }
+
+  @Test
+  public void testSingleWriterMultipleTransactions() {

Review comment:
   I'm not sure I understand what exactly we're testing with this one.

##
File path: 
hudi-client/hudi-client-common/src/test/java/org/apache/hudi/client/transaction/TestTransactionManager.java
##
@@ -0,0 +1,208 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.client.transaction;
+
+import org.apache.hudi.client.transaction.lock.InProcessLockProvider;
+import org.apache.hudi.common.model.HoodieFailedWritesCleaningPolicy;
+import org.apache.hudi.common.model.WriteConcurrencyMode;
+import org.apache.hudi.common.testutils.HoodieCommonTestHarness;
+import org.apac

[GitHub] [hudi] nsivabalan commented on issue #4135: [SUPPORT] Zordering clustering on a moderate size dataset taking large amounts of time.

2021-12-17 Thread GitBox


nsivabalan commented on issue #4135:
URL: https://github.com/apache/hudi/issues/4135#issuecomment-997127574


   Hey folks, are there any pending items to be resolved? If not, can we close 
this one out?
   






[GitHub] [hudi] manojpec commented on a change in pull request #3989: [HUDI-2589] RFC-37: Metadata table based bloom index

2021-12-17 Thread GitBox


manojpec commented on a change in pull request #3989:
URL: https://github.com/apache/hudi/pull/3989#discussion_r771612579



##
File path: rfc/rfc-37/rfc-37.md
##
@@ -0,0 +1,286 @@
+
+# RFC-37: Metadata based Bloom Index
+
+## Proposers
+- @nsivabalan
+- @manojpec
+
+## Approvers
+ - @vinothchandar
+ - @satishkotha
+
+## Status
+JIRA: https://issues.apache.org/jira/browse/HUDI-2703
+
+## Abstract
+Hudi maintains several indices to locate/map incoming records to file groups during writes. The most commonly
+used record index is the HoodieBloomIndex. Larger tables and global indexes have performance issues,
+as the bloom filters from a large number of data files need to be read and looked up. Reading from several
+files over cloud object storage like S3 also faces request throttling issues. We are proposing to
+build a new Metadata index (metadata table based bloom index) to boost the performance of the existing bloom index.
+
+## Background
+HoodieBloomIndex is used to find the location of incoming records during every write. The bloom index assists Hudi in
+deterministically routing records to a given file group and in distinguishing inserts from updates. This aggregate bloom
+index is built from several bloom filters stored in the base file footers. Prior to the bloom filter lookup, file
+pruning for the incoming records is also done based on the record key min/max stats stored in the base file footers.
+In this RFC, we plan to build a new index for the bloom filters under the metadata table to assist in
+bloom index based record location tagging.
+
+## Design
+HoodieBloomIndex involves the following steps to find the right location of incoming records:
+1. Find all the interested partitions and list all their data files.
+2. File Pruning: Load record key min/max details from all the interested data file footers. Filter files and generate
+   a files-to-keys mapping for the incoming records based on the key ranges, using a range interval tree built from
+   the previously loaded min/max details.
+3. Bloom Filter lookup: Filter files and prune the files-to-keys mapping for the incoming keys based on the bloom
+   filter key lookup.
+4. Final lookup in the actual data files to find the right location of every incoming record.
+
+As we can see from steps 2 and 3, we need the min and max values for "_hoodie_record_key" and the bloom filters
+from all interested data files to perform the location tagging. In this design, we will add these key stats and
+bloom filters to the metadata table and thereby be able to quickly load the interested details and do faster lookups.
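
The file pruning and bloom filter lookup steps listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not Hudi's implementation: `BloomLookupSketch`, `ToyBloomFilter`, `FileStats`, `fileOf` and `candidateFiles` are hypothetical names, the bloom filter is a toy built on `java.util.BitSet`, and a linear scan over the files stands in for the range interval tree mentioned in step 2.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical sketch of "min/max pruning, then bloom filter lookup".
final class BloomLookupSketch {

  // Toy bloom filter: numHashes probes into a fixed-size bit set.
  static final class ToyBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int numHashes;

    ToyBloomFilter(int size, int numHashes) {
      this.bits = new BitSet(size);
      this.size = size;
      this.numHashes = numHashes;
    }

    private int probe(String key, int i) {
      int h = key.hashCode() * 31 + i * 0x9E3779B9;
      return Math.floorMod(h, size);
    }

    void add(String key) {
      for (int i = 0; i < numHashes; i++) {
        bits.set(probe(key, i));
      }
    }

    // May return true for a key that was never added (false positive),
    // but never returns false for a key that was added.
    boolean mightContain(String key) {
      for (int i = 0; i < numHashes; i++) {
        if (!bits.get(probe(key, i))) {
          return false;
        }
      }
      return true;
    }
  }

  // Per-file stats: record key range plus the file's bloom filter.
  static final class FileStats {
    final String fileId;
    final String minKey;
    final String maxKey;
    final ToyBloomFilter bloom;

    FileStats(String fileId, String minKey, String maxKey, ToyBloomFilter bloom) {
      this.fileId = fileId;
      this.minKey = minKey;
      this.maxKey = maxKey;
      this.bloom = bloom;
    }
  }

  // Builds the stats for a file holding the given record keys.
  static FileStats fileOf(String fileId, String... keys) {
    ToyBloomFilter bloom = new ToyBloomFilter(1024, 3);
    String min = keys[0];
    String max = keys[0];
    for (String k : keys) {
      bloom.add(k);
      if (k.compareTo(min) < 0) min = k;
      if (k.compareTo(max) > 0) max = k;
    }
    return new FileStats(fileId, min, max, bloom);
  }

  // Step 2 (range pruning) followed by step 3 (bloom filter lookup).
  // Surviving files are only candidates; step 4 still has to confirm the
  // key against the actual data file.
  static List<String> candidateFiles(String recordKey, List<FileStats> files) {
    List<String> candidates = new ArrayList<>();
    for (FileStats f : files) {
      boolean inRange = f.minKey.compareTo(recordKey) <= 0
          && recordKey.compareTo(f.maxKey) <= 0;
      if (inRange && f.bloom.mightContain(recordKey)) {
        candidates.add(f.fileId);
      }
    }
    return candidates;
  }
}
```

For example, with one file built from keys `key_001`, `key_050`, `key_100` and another from `key_200`, `key_300`, a lookup for `key_050` yields only the first file, while `key_150` falls outside both ranges and yields no candidates. The sketch takes both inputs (ranges and bloom filters) as already loaded, which is the part the design above proposes to serve from the metadata table.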
+
+The metadata table already has one partition, `files`, to help in partition file listing. For the metadata table based
+indices, we are proposing to add the following two new partitions:
+1. `bloom_filter` - for the file level bloom filter
+2. `column_stats` - for the key range stats
+
+Why metadata table: 
+The metadata table uses HBase HFile, a map file format, to store and retrieve data. HFile is an indexed file format and
+supports map-like faster lookups by keys. Since we will be storing stats/bloom filters for every file and the index will do
+lookups based on files, we should be able to benefit from the faster lookups in HFile.
+
+
+
+The following sections talk about the different partitions and key formats, and then dive into the data and control flows.
+
+### MetaIndex/BloomFilter:
+
+A new partition `bloom_filter` will be added under the metadata table. Bloom filters from all the base files in the
+data table will be added here. The metadata table is already in the HFile format. The existing metadata payload schema will
+be extended and shared for this partition as well. The type field will be used to detect the bloom filter payload record.
+Here is the schema for the bloom filter payload record.
+```
+   {
+       "doc": "Metadata about base file bloom filters",
+       "name": "BloomFilterMetadata",
+       "type": [
+           "null",
+           {
+               "doc": "Base FileID and its BloomFilter details",
+               "name": "HoodieMetadataBloomFilter",
+               "type": "record",
+               "fields": [
+                   {
+                       "doc": "Version/type of the bloom filter metadata",
+                       "name": "version",
+                       "type": "string"
+                   },
+                   {
+                       "doc": "Instant timestamp when this metadata was created/updated",
+                       "name": "timestamp",
+                       "type": "string"
+                   },
+                   {
+                       "doc": "Bloom filter binary byte array",
+                       "name": "bloomfilter",
+                       "type": "bytes"
+                   },
+                   {
+                       "doc": "True if

[GitHub] [hudi] nsivabalan commented on issue #4184: [SUPPORT]parquet is not a Parquet file (too small length:4)

2021-12-17 Thread GitBox


nsivabalan commented on issue #4184:
URL: https://github.com/apache/hudi/issues/4184#issuecomment-997127027


   @bhasudha @bvaradar @leesf @danny0405 : have you folks encountered this 
before? A parquet file of size 4 bytes.






[GitHub] [hudi] hudi-bot removed a comment on pull request #4349: [MINOR] remove unused import in HoodieFileIndex

2021-12-17 Thread GitBox


hudi-bot removed a comment on pull request #4349:
URL: https://github.com/apache/hudi/pull/4349#issuecomment-997126501


   
   ## CI report:
   
   * 1677eab2ead6910016c2ed0b67640c97757633bd Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4410) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4431)
   * 29113e6aff644be7511d84ae8428a8597a5b10b2 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   






[GitHub] [hudi] hudi-bot removed a comment on pull request #4306: [HUDI-3014] add table option to set utc timezone

2021-12-17 Thread GitBox


hudi-bot removed a comment on pull request #4306:
URL: https://github.com/apache/hudi/pull/4306#issuecomment-996752326


   
   ## CI report:
   
   * a39258ca69c6302da42cdb1fe1a0794676480952 UNKNOWN
   * a1ba1e2c81b74948a93589c3192ab24ef320107b UNKNOWN
   * c347bb78b3c799dce34db7a00c7f6a07c95ec777 UNKNOWN
   * d6a0ac9027bf12362b56729a86e9755dbe1c21db UNKNOWN
   * 4fd974b6b45e337f75bfaa9e6d54dc7e82cf1473 UNKNOWN
   * 0afe75ecfe523bdc74c8c37ba50de0cb0601166d UNKNOWN
   * 70457e0ba0b8dfd4ae63fd8c096abbbf051d6256 UNKNOWN
   * 6ae6b2781237d7e4af95bd78062c3da765ebe9a2 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4434)
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   






[GitHub] [hudi] hudi-bot removed a comment on pull request #4363: [HUDI-3029] Transaction manager: avoid deadlock when doing begin and end transactions

2021-12-17 Thread GitBox


hudi-bot removed a comment on pull request #4363:
URL: https://github.com/apache/hudi/pull/4363#issuecomment-997126514


   
   ## CI report:
   
   * d017173b44682dd26fa7238635ba9eb8fd750a1a Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4461)
   * f0555fa1c09b27744084d20199683a1f8e68d9b7 UNKNOWN
   * 46bfd4cb47cb7cba1185b9e146cfc8396a91af88 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4464)
   * a10ec8a603b8297e0a69246b4d33866c9b7f5ad6 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   






[GitHub] [hudi] hudi-bot commented on pull request #4349: [MINOR] remove unused import in HoodieFileIndex

2021-12-17 Thread GitBox


hudi-bot commented on pull request #4349:
URL: https://github.com/apache/hudi/pull/4349#issuecomment-997126936


   
   ## CI report:
   
   * 1677eab2ead6910016c2ed0b67640c97757633bd Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4410)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4431)
 
   * 29113e6aff644be7511d84ae8428a8597a5b10b2 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4465)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   






[GitHub] [hudi] hudi-bot commented on pull request #4363: [HUDI-3029] Transaction manager: avoid deadlock when doing begin and end transactions

2021-12-17 Thread GitBox


hudi-bot commented on pull request #4363:
URL: https://github.com/apache/hudi/pull/4363#issuecomment-997126961


   
   ## CI report:
   
   * f0555fa1c09b27744084d20199683a1f8e68d9b7 UNKNOWN
   * 46bfd4cb47cb7cba1185b9e146cfc8396a91af88 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4464)
 
   * a10ec8a603b8297e0a69246b4d33866c9b7f5ad6 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   






[GitHub] [hudi] yihua commented on issue #4230: [SUPPORT] org.apache.hudi.exception.HoodieRemoteException: Failed to create marker file

2021-12-17 Thread GitBox


yihua commented on issue #4230:
URL: https://github.com/apache/hudi/issues/4230#issuecomment-997126934


   > This is happening also in 'Delete archive instants'
   > 
   @h7kanna This could be due to a file system (FS) timeout. The writer may still 
proceed with retries after the exception. Do you see this consistently failing 
the write actions?






[GitHub] [hudi] hudi-bot commented on pull request #4306: [HUDI-3014] add table option to set utc timezone

2021-12-17 Thread GitBox


hudi-bot commented on pull request #4306:
URL: https://github.com/apache/hudi/pull/4306#issuecomment-997126921


   
   ## CI report:
   
   * a39258ca69c6302da42cdb1fe1a0794676480952 UNKNOWN
   * a1ba1e2c81b74948a93589c3192ab24ef320107b UNKNOWN
   * c347bb78b3c799dce34db7a00c7f6a07c95ec777 UNKNOWN
   * d6a0ac9027bf12362b56729a86e9755dbe1c21db UNKNOWN
   * 4fd974b6b45e337f75bfaa9e6d54dc7e82cf1473 UNKNOWN
   * 0afe75ecfe523bdc74c8c37ba50de0cb0601166d UNKNOWN
   * 70457e0ba0b8dfd4ae63fd8c096abbbf051d6256 UNKNOWN
   * 6ae6b2781237d7e4af95bd78062c3da765ebe9a2 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4434)
 
   * bfdbb7db27a02f6c414769e58aa8cb1e841c3a21 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   






[GitHub] [hudi] nsivabalan commented on issue #4200: spark-sql query timestamp partition error

2021-12-17 Thread GitBox


nsivabalan commented on issue #4200:
URL: https://github.com/apache/hudi/issues/4200#issuecomment-997126706


   @YannByron : Can we please follow up on this one? If it's a bug, please 
file a tracking jira and close this one out. But let's try to work towards a fix 
if it's a valid bug. 
   






[GitHub] [hudi] hudi-bot removed a comment on pull request #4363: [HUDI-3029] Transaction manager: avoid deadlock when doing begin and end transactions

2021-12-17 Thread GitBox


hudi-bot removed a comment on pull request #4363:
URL: https://github.com/apache/hudi/pull/4363#issuecomment-997119164


   
   ## CI report:
   
   * d017173b44682dd26fa7238635ba9eb8fd750a1a Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4461)
 
   * f0555fa1c09b27744084d20199683a1f8e68d9b7 UNKNOWN
   * 46bfd4cb47cb7cba1185b9e146cfc8396a91af88 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4464)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   






[GitHub] [hudi] hudi-bot commented on pull request #4363: [HUDI-3029] Transaction manager: avoid deadlock when doing begin and end transactions

2021-12-17 Thread GitBox


hudi-bot commented on pull request #4363:
URL: https://github.com/apache/hudi/pull/4363#issuecomment-997126514


   
   ## CI report:
   
   * d017173b44682dd26fa7238635ba9eb8fd750a1a Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4461)
 
   * f0555fa1c09b27744084d20199683a1f8e68d9b7 UNKNOWN
   * 46bfd4cb47cb7cba1185b9e146cfc8396a91af88 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4464)
 
   * a10ec8a603b8297e0a69246b4d33866c9b7f5ad6 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   






[GitHub] [hudi] hudi-bot commented on pull request #4349: [MINOR] remove unused import in HoodieFileIndex

2021-12-17 Thread GitBox


hudi-bot commented on pull request #4349:
URL: https://github.com/apache/hudi/pull/4349#issuecomment-997126501


   
   ## CI report:
   
   * 1677eab2ead6910016c2ed0b67640c97757633bd Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4410)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4431)
 
   * 29113e6aff644be7511d84ae8428a8597a5b10b2 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   






[GitHub] [hudi] hudi-bot removed a comment on pull request #4349: [MINOR] remove unused import in HoodieFileIndex

2021-12-17 Thread GitBox


hudi-bot removed a comment on pull request #4349:
URL: https://github.com/apache/hudi/pull/4349#issuecomment-996699650


   
   ## CI report:
   
   * 1677eab2ead6910016c2ed0b67640c97757633bd Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4410)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=4431)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   






[GitHub] [hudi] nsivabalan commented on issue #4221: [SUPPORT] hudi mor table has a lack of data

2021-12-17 Thread GitBox


nsivabalan commented on issue #4221:
URL: https://github.com/apache/hudi/issues/4221#issuecomment-997126353


   If the conversation has been taken offline, can we close this out? But please 
file a tracking jira if this is triaged as a bug. 
   






[GitHub] [hudi] manojpec commented on a change in pull request #4363: [HUDI-3029] Transaction manager: avoid deadlock when doing begin and end transactions

2021-12-17 Thread GitBox


manojpec commented on a change in pull request #4363:
URL: https://github.com/apache/hudi/pull/4363#discussion_r771769770



##
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/transaction/TransactionManager.java
##
@@ -35,44 +35,43 @@
 public class TransactionManager implements Serializable {
 
   private static final Logger LOG = 
LogManager.getLogger(TransactionManager.class);
-
   private final LockManager lockManager;
+  private final boolean supportsOptimisticConcurrency;

Review comment:
   Right. Without changing the original config and all its usages, I am going 
with an `isOptimisticConcurrencyControlEnabled` flag in this class. 








[GitHub] [hudi] manojpec commented on a change in pull request #4363: [HUDI-3029] Transaction manager: avoid deadlock when doing begin and end transactions

2021-12-17 Thread GitBox


manojpec commented on a change in pull request #4363:
URL: https://github.com/apache/hudi/pull/4363#discussion_r771769708



##
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/transaction/TransactionManager.java
##
@@ -35,44 +35,43 @@
 public class TransactionManager implements Serializable {
 
   private static final Logger LOG = 
LogManager.getLogger(TransactionManager.class);
-
   private final LockManager lockManager;
+  private final boolean supportsOptimisticConcurrency;
   private Option currentTxnOwnerInstant;
   private Option lastCompletedTxnOwnerInstant;
-  private boolean supportsOptimisticConcurrency;
 
   public TransactionManager(HoodieWriteConfig config, FileSystem fs) {
 this.lockManager = new LockManager(config, fs);
 this.supportsOptimisticConcurrency = 
config.getWriteConcurrencyMode().supportsOptimisticConcurrencyControl();
   }
 
-  public synchronized void beginTransaction() {
+  public void beginTransaction() {
 if (supportsOptimisticConcurrency) {
   LOG.info("Transaction starting without a transaction owner");
   lockManager.lock();
-  LOG.info("Transaction started");
+  LOG.info("Transaction started without a transaction owner");
 }
   }
 
-  public synchronized void beginTransaction(Option 
currentTxnOwnerInstant, Option lastCompletedTxnOwnerInstant) {
+  public void beginTransaction(Option currentTxnOwnerInstant,
+   Option 
lastCompletedTxnOwnerInstant) {
 if (supportsOptimisticConcurrency) {
-  this.lastCompletedTxnOwnerInstant = lastCompletedTxnOwnerInstant;
-  lockManager.setLatestCompletedWriteInstant(lastCompletedTxnOwnerInstant);
-  LOG.info("Latest completed transaction instant " + 
lastCompletedTxnOwnerInstant);
-  this.currentTxnOwnerInstant = currentTxnOwnerInstant;
-  LOG.info("Transaction starting with transaction owner " + 
currentTxnOwnerInstant);
+  LOG.info("Transaction starting for " + currentTxnOwnerInstant
+  + "with latest completed transaction instant " + 
lastCompletedTxnOwnerInstant);
   lockManager.lock();
-  LOG.info("Transaction started");
+  this.currentTxnOwnerInstant = currentTxnOwnerInstant;
+  this.lastCompletedTxnOwnerInstant = lastCompletedTxnOwnerInstant;
+  LOG.info("Transaction started for " + currentTxnOwnerInstant
+  + "with latest completed transaction instant " + 
lastCompletedTxnOwnerInstant);
 }
   }
 
-  public synchronized void endTransaction() {
+  public void endTransaction() {
 if (supportsOptimisticConcurrency) {
   LOG.info("Transaction ending with transaction owner " + 
currentTxnOwnerInstant);
-  lockManager.unlock();
-  LOG.info("Transaction ended");
   this.lastCompletedTxnOwnerInstant = Option.empty();
-  lockManager.resetLatestCompletedWriteInstant();
+  lockManager.unlock();

Review comment:
   sounds good, fixed it. 








[GitHub] [hudi] nsivabalan commented on issue #4241: [SUPPORT] Disaster Recovery (DR) Setup? Questions.

2021-12-17 Thread GitBox


nsivabalan commented on issue #4241:
URL: https://github.com/apache/hudi/issues/4241#issuecomment-997125988


   @xushiyan @bhasudha @bvaradar @yanghua : Do you folks have any pointers in 
this regard? 






[GitHub] [hudi] nsivabalan commented on issue #4154: [SUPPORT] INSERT OVERWRITE operation does not work when using Spark SQL

2021-12-17 Thread GitBox


nsivabalan commented on issue #4154:
URL: https://github.com/apache/hudi/issues/4154#issuecomment-997125124


   @YannByron @danny0405 : Can either of you triage this? We might need a fix 
if it's a bug. Feel free to file a tracking jira and work towards it. 
   






[GitHub] [hudi] nsivabalan commented on issue #4131: [SUPPORT] org.apache.hudi.exception.HoodieException: The value of can not be null

2021-12-17 Thread GitBox


nsivabalan commented on issue #4131:
URL: https://github.com/apache/hudi/issues/4131#issuecomment-997122358


   @YannByron : Can you please look into this issue? 






[GitHub] [hudi] alexeykudinkin commented on a change in pull request #4363: [HUDI-3029] Transaction manager: avoid deadlock when doing begin and end transactions

2021-12-17 Thread GitBox


alexeykudinkin commented on a change in pull request #4363:
URL: https://github.com/apache/hudi/pull/4363#discussion_r771766731



##
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/transaction/TransactionManager.java
##
@@ -35,44 +35,43 @@
 public class TransactionManager implements Serializable {
 
   private static final Logger LOG = 
LogManager.getLogger(TransactionManager.class);
-
   private final LockManager lockManager;
+  private final boolean supportsOptimisticConcurrency;

Review comment:
   The name of the flag is misleading: I had to go and check what it actually 
refers to in order to fully understand its semantics. This one is really about 
enabling/disabling concurrency control (CC), which you can disable if you only 
have a single writer.

##
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/transaction/TransactionManager.java
##
@@ -35,44 +35,43 @@
 public class TransactionManager implements Serializable {
 
   private static final Logger LOG = 
LogManager.getLogger(TransactionManager.class);
-
   private final LockManager lockManager;
+  private final boolean supportsOptimisticConcurrency;
   private Option currentTxnOwnerInstant;
   private Option lastCompletedTxnOwnerInstant;
-  private boolean supportsOptimisticConcurrency;
 
   public TransactionManager(HoodieWriteConfig config, FileSystem fs) {
 this.lockManager = new LockManager(config, fs);
 this.supportsOptimisticConcurrency = 
config.getWriteConcurrencyMode().supportsOptimisticConcurrencyControl();
   }
 
-  public synchronized void beginTransaction() {
+  public void beginTransaction() {
 if (supportsOptimisticConcurrency) {
   LOG.info("Transaction starting without a transaction owner");
   lockManager.lock();
-  LOG.info("Transaction started");
+  LOG.info("Transaction started without a transaction owner");
 }
   }
 
-  public synchronized void beginTransaction(Option 
currentTxnOwnerInstant, Option lastCompletedTxnOwnerInstant) {
+  public void beginTransaction(Option currentTxnOwnerInstant,
+   Option 
lastCompletedTxnOwnerInstant) {
 if (supportsOptimisticConcurrency) {
-  this.lastCompletedTxnOwnerInstant = lastCompletedTxnOwnerInstant;
-  lockManager.setLatestCompletedWriteInstant(lastCompletedTxnOwnerInstant);
-  LOG.info("Latest completed transaction instant " + 
lastCompletedTxnOwnerInstant);
-  this.currentTxnOwnerInstant = currentTxnOwnerInstant;
-  LOG.info("Transaction starting with transaction owner " + 
currentTxnOwnerInstant);
+  LOG.info("Transaction starting for " + currentTxnOwnerInstant
+  + "with latest completed transaction instant " + 
lastCompletedTxnOwnerInstant);
   lockManager.lock();
-  LOG.info("Transaction started");
+  this.currentTxnOwnerInstant = currentTxnOwnerInstant;
+  this.lastCompletedTxnOwnerInstant = lastCompletedTxnOwnerInstant;
+  LOG.info("Transaction started for " + currentTxnOwnerInstant
+  + "with latest completed transaction instant " + 
lastCompletedTxnOwnerInstant);
 }
   }
 
-  public synchronized void endTransaction() {
+  public void endTransaction() {
 if (supportsOptimisticConcurrency) {
   LOG.info("Transaction ending with transaction owner " + 
currentTxnOwnerInstant);
-  lockManager.unlock();
-  LOG.info("Transaction ended");
   this.lastCompletedTxnOwnerInstant = Option.empty();
-  lockManager.resetLatestCompletedWriteInstant();
+  lockManager.unlock();

Review comment:
   I would suggest creating a `reset(Instant, Instant)` method that you can 
invoke from both `lock` and `unlock`.
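
   The suggestion above could be sketched roughly as follows. This is only an 
illustration under stated assumptions, not the code in the PR: 
`TransactionManagerSketch` is a hypothetical class, `ReentrantLock` stands in 
for `LockManager`, and `Optional<String>` stands in for the instant options. 
It also assumes "`lock` and `unlock`" refers to the begin and end transaction 
paths.

```java
import java.util.Optional;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical, simplified stand-in for TransactionManager; not the Hudi class.
final class TransactionManagerSketch {
  // ReentrantLock stands in for Hudi's LockManager (assumption for this sketch).
  private final ReentrantLock lock = new ReentrantLock();
  // Optional<String> stands in for the instant options held by the real class.
  private Optional<String> currentTxnOwnerInstant = Optional.empty();
  private Optional<String> lastCompletedTxnOwnerInstant = Optional.empty();

  void beginTransaction(Optional<String> newTxnOwnerInstant,
                        Optional<String> lastCompletedInstant) {
    lock.lock();
    // State is assigned only after the lock is held.
    reset(newTxnOwnerInstant, lastCompletedInstant);
  }

  void endTransaction() {
    // State is cleared while the lock is still held, then the lock is released.
    reset(Optional.empty(), Optional.empty());
    lock.unlock();
  }

  // Single place that assigns both fields, shared by the begin and end paths.
  private void reset(Optional<String> txnOwnerInstant,
                     Optional<String> lastCompletedInstant) {
    this.currentTxnOwnerInstant = txnOwnerInstant;
    this.lastCompletedTxnOwnerInstant = lastCompletedInstant;
  }

  Optional<String> currentTxnOwnerInstant() {
    return currentTxnOwnerInstant;
  }

  Optional<String> lastCompletedTxnOwnerInstant() {
    return lastCompletedTxnOwnerInstant;
  }

  boolean isLocked() {
    return lock.isLocked();
  }

  public static void main(String[] args) {
    TransactionManagerSketch txn = new TransactionManagerSketch();
    txn.beginTransaction(Optional.of("20211217120000"), Optional.of("20211217110000"));
    System.out.println("owner during txn: " + txn.currentTxnOwnerInstant());
    txn.endTransaction();
    System.out.println("owner after txn: " + txn.currentTxnOwnerInstant());
  }
}
```

   Keeping both assignments in one helper means the begin and end paths cannot 
drift apart, and the fields are only touched while the lock is held.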

##
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/transaction/TransactionManager.java
##
@@ -35,44 +35,43 @@
 public class TransactionManager implements Serializable {
 
   private static final Logger LOG = 
LogManager.getLogger(TransactionManager.class);
-
   private final LockManager lockManager;
+  private final boolean supportsOptimisticConcurrency;
   private Option currentTxnOwnerInstant;
   private Option lastCompletedTxnOwnerInstant;
-  private boolean supportsOptimisticConcurrency;
 
   public TransactionManager(HoodieWriteConfig config, FileSystem fs) {
 this.lockManager = new LockManager(config, fs);
 this.supportsOptimisticConcurrency = 
config.getWriteConcurrencyMode().supportsOptimisticConcurrencyControl();
   }
 
-  public synchronized void beginTransaction() {
+  public void beginTransaction() {
 if (supportsOptimisticConcurrency) {
   LOG.info("Transaction starting without a transaction owner");
   lockManager.lock();
-  LOG.info("Transaction started");
+  LOG.info("Transaction started without a transaction owner");
 }
   }
 
-  public synchronized void beginTransaction(Option 
currentTxnOwnerInstant, Option lastCompletedTxnOwnerInstant) {
+  public void beginTransaction(Option 

[GitHub] [hudi] nsivabalan commented on issue #4340: [SUPPORT] Incremental read fails when no commit in the particular zone

2021-12-17 Thread GitBox


nsivabalan commented on issue #4340:
URL: https://github.com/apache/hudi/issues/4340#issuecomment-997121672


   @fireking77 : May I know what you mean by timezone here? Do you mean the 
case where there is no commit between the begin time and the end time? 






[GitHub] [hudi] hudi-bot commented on pull request #4361: [WIP][DO_NOT_MERGE] Test failure testing5

2021-12-17 Thread GitBox


hudi-bot commented on pull request #4361:
URL: https://github.com/apache/hudi/pull/4361#issuecomment-997120007


   
   ## CI report:
   
   * 3f3780a32d11ac67c935870beaa460b67363dbbe UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   






[GitHub] [hudi] nsivabalan closed issue #4109: [SUPPORT] SqlQueryEqualityPreCommitValidator errors with java.util.ConcurrentModificationException

2021-12-17 Thread GitBox


nsivabalan closed issue #4109:
URL: https://github.com/apache/hudi/issues/4109


   






[GitHub] [hudi] nsivabalan commented on issue #4109: [SUPPORT] SqlQueryEqualityPreCommitValidator errors with java.util.ConcurrentModificationException

2021-12-17 Thread GitBox


nsivabalan commented on issue #4109:
URL: https://github.com/apache/hudi/issues/4109#issuecomment-997119344


   Have filed a tracking 
[jira](https://issues.apache.org/jira/browse/HUDI-3058). Will close this out. 
@satishkotha : Once you have a PR, let me know. I can help review. 






[jira] [Commented] (HUDI-3058) SqlQueryEqualityPreCommitValidator errors with java.util.ConcurrentModificationException

2021-12-17 Thread sivabalan narayanan (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-3058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17461747#comment-17461747
 ] 

sivabalan narayanan commented on HUDI-3058:
---

Proposed fix: 

ConcurrentModificationException seems to be coming from here
[https://github.com/apache/hudi/blob/master/hudi-common/src/main/java/org/apache/hudi/common/table/view/HoodieTablePreCommitFileSystemView.java#L83]

We need to redo this logic to avoid newFilesWrittenForPartition.remove(...).

Simple option to try out:
Replace line 72 with

{{Map<String, HoodieBaseFile> newFilesWrittenForPartition = new 
ConcurrentHashMap<>(filesWritten.stream()
.filter(file -> partitionStr.equals(file.getPartitionPath()))
.collect(Collectors.toMap(HoodieWriteStat::getFileId, writeStat -> 
new HoodieBaseFile(new Path(tableMetaClient.getBasePath(), 
writeStat.getPath()).toString()))));}}
The above is more of a short-term workaround. A better option is probably to 
avoid modifying the Map in the first place. This can be done by grouping based 
on fileId, i.e., replace lines 78-88 with:

{{Map<String, HoodieBaseFile> baseFilesForCommittedFileIds = committedBaseFiles
// Remove files replaced by current inflight commit
.filter(baseFile -> 
!replacedFileIdsForPartition.contains(baseFile.getFileId()))
.collect(Collectors.toMap(HoodieBaseFile::getFileId, baseFile -> 
baseFile));

baseFilesForCommittedFileIds.putAll(newFilesWrittenForPartition);
return baseFilesForCommittedFileIds.values().stream();}}
This needs some more testing. I can send PR next week.
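
To illustrate the second option outside of Hudi, here is a minimal, 
self-contained sketch. Everything in it is hypothetical: 
`PreCommitViewSketch`, its nested `BaseFile` class, and `latestPaths` are 
stand-ins rather than Hudi APIs, and the merge function and `LinkedHashMap` 
supplier are additions made only to keep the example deterministic. The point 
is that the committed files are first collected into their own map, and the 
newly written files are merged in with `putAll`, so no map is modified while a 
stream over it is being consumed.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch; none of these types are Hudi classes.
final class PreCommitViewSketch {

  // Minimal stand-in for a base file: a file id plus a path.
  static final class BaseFile {
    private final String fileId;
    private final String path;

    BaseFile(String fileId, String path) {
      this.fileId = fileId;
      this.path = path;
    }

    String getFileId() {
      return fileId;
    }

    String getPath() {
      return path;
    }
  }

  // Returns the paths visible once the inflight commit is taken into account.
  static List<String> latestPaths(Stream<BaseFile> committedBaseFiles,
                                  Set<String> replacedFileIds,
                                  Map<String, BaseFile> newFilesWritten) {
    // Collect committed files by file id, dropping the replaced ones.
    // The merge function (keep the later entry) is an assumption of this sketch.
    Map<String, BaseFile> merged = committedBaseFiles
        .filter(f -> !replacedFileIds.contains(f.getFileId()))
        .collect(Collectors.toMap(BaseFile::getFileId, Function.identity(),
            (older, newer) -> newer, LinkedHashMap::new));
    // Newly written files override committed files with the same file id.
    merged.putAll(newFilesWritten);
    return merged.values().stream().map(BaseFile::getPath).collect(Collectors.toList());
  }

  public static void main(String[] args) {
    Stream<BaseFile> committed = Stream.of(
        new BaseFile("f1", "f1_v1.parquet"),
        new BaseFile("f2", "f2_v1.parquet"),
        new BaseFile("f3", "f3_v1.parquet"));
    Set<String> replaced = new HashSet<>(Arrays.asList("f3"));
    Map<String, BaseFile> written = new LinkedHashMap<>();
    written.put("f2", new BaseFile("f2", "f2_v2.parquet"));
    written.put("f4", new BaseFile("f4", "f4_v1.parquet"));
    // prints [f1_v1.parquet, f2_v2.parquet, f4_v1.parquet]
    System.out.println(latestPaths(committed, replaced, written));
  }
}
```

Unlike the `ConcurrentHashMap` workaround, this shape does not depend on the 
map tolerating modification during traversal, since the only map that is 
mutated is one no stream is currently reading.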

> SqlQueryEqualityPreCommitValidator errors with 
> java.util.ConcurrentModificationException
> 
>
> Key: HUDI-3058
> URL: https://issues.apache.org/jira/browse/HUDI-3058
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Usability
>Affects Versions: 0.10.0
>Reporter: sivabalan narayanan
>Priority: Major
> Fix For: 0.11.0
>
>
> Ref issue: [https://github.com/apache/hudi/issues/4109]
>  
> Faced concurrentModificationException when trying to test 
> SqlQueryEqualityPreCommitValidator in quickstart guide
> *To Reproduce*
> Steps to reproduce the behavior:
>  # Insert data without any pre commit validations
>  # Update data (ensured the updates dont touch the fare column in quickstart 
> example) with the following precommit validator props
> {{option("hoodie.precommit.validators", 
> "org.apache.hudi.client.validator.SqlQueryEqualityPreCommitValidator").
> option("hoodie.precommit.validators.equality.sql.queries", "select sum(fare) 
> from ").}}
> stacktrace:
> {code:java}
> org.apache.hudi.exception.HoodieUpsertException: Failed to upsert for commit 
> time 20211124114945342
> at 
> org.apache.hudi.table.action.commit.AbstractWriteHelper.write(AbstractWriteHelper.java:62)
> at 
> org.apache.hudi.table.action.commit.SparkUpsertCommitActionExecutor.execute(SparkUpsertCommitActionExecutor.java:46)
> at 
> org.apache.hudi.table.HoodieSparkCopyOnWriteTable.upsert(HoodieSparkCopyOnWriteTable.java:111)
> at 
> org.apache.hudi.table.HoodieSparkCopyOnWriteTable.upsert(HoodieSparkCopyOnWriteTable.java:95)
> at 
> org.apache.hudi.client.SparkRDDWriteClient.upsert(SparkRDDWriteClient.java:174)
> at org.apache.hudi.DataSourceUtils.doWriteOperation(DataSourceUtils.java:214)
> at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:276)
> at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:164)
> at 
> org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:45)
> at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
> at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
> at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:86)
> at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
> at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
> at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
> at 
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
> at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
> at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
> at 
> org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:80)
> at 
> org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:80)
> at 
> org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676)
> at 
> org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676)
> at 

[jira] [Updated] (HUDI-3058) SqlQueryEqualityPreCommitValidator errors with java.util.ConcurrentModificationException

2021-12-17 Thread sivabalan narayanan (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-3058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sivabalan narayanan updated HUDI-3058:
--
Affects Version/s: 0.10.0

> SqlQueryEqualityPreCommitValidator errors with 
> java.util.ConcurrentModificationException
> 
>
> Key: HUDI-3058
> URL: https://issues.apache.org/jira/browse/HUDI-3058
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: Usability
>Affects Versions: 0.10.0
>Reporter: sivabalan narayanan
>Priority: Major
> Fix For: 0.11.0
>
>
> Faced concurrentModificationException when trying to test 
> SqlQueryEqualityPreCommitValidator in quickstart guide
> *To Reproduce*
> Steps to reproduce the behavior:
>  # Insert data without any pre commit validations
>  # Update data (ensured the updates dont touch the fare column in quickstart 
> example) with the following precommit validator props
> {{option("hoodie.precommit.validators", 
> "org.apache.hudi.client.validator.SqlQueryEqualityPreCommitValidator").
> option("hoodie.precommit.validators.equality.sql.queries", "select sum(fare) 
> from ").}}
> stacktrace:
> {code:java}
> org.apache.hudi.exception.HoodieUpsertException: Failed to upsert for commit time 20211124114945342
>     at org.apache.hudi.table.action.commit.AbstractWriteHelper.write(AbstractWriteHelper.java:62)
>     at org.apache.hudi.table.action.commit.SparkUpsertCommitActionExecutor.execute(SparkUpsertCommitActionExecutor.java:46)
>     at org.apache.hudi.table.HoodieSparkCopyOnWriteTable.upsert(HoodieSparkCopyOnWriteTable.java:111)
>     at org.apache.hudi.table.HoodieSparkCopyOnWriteTable.upsert(HoodieSparkCopyOnWriteTable.java:95)
>     at org.apache.hudi.client.SparkRDDWriteClient.upsert(SparkRDDWriteClient.java:174)
>     at org.apache.hudi.DataSourceUtils.doWriteOperation(DataSourceUtils.java:214)
>     at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:276)
>     at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:164)
>     at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:45)
>     at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
>     at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
>     at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:86)
>     at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
>     at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
>     at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
>     at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>     at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
>     at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
>     at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:80)
>     at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:80)
>     at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676)
>     at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676)
>     at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
>     at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
>     at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
>     at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:676)
>     at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:285)
>     at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:271)
>     at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:229)
>     ... 70 elided
> Caused by: java.util.ConcurrentModificationException
>     at java.util.HashMap$ValueSpliterator.forEachRemaining(HashMap.java:1633)
>     at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:743)
>     at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
>     at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
>     at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
>     at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
>     at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>     at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
>     at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:272)
> at java.util.HashMap

[jira] [Updated] (HUDI-3058) SqlQueryEqualityPreCommitValidator errors with java.util.ConcurrentModificationException

2021-12-17 Thread sivabalan narayanan (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-3058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sivabalan narayanan updated HUDI-3058:
--
Description: 
Ref issue: [https://github.com/apache/hudi/issues/4109]

 

Encountered a java.util.ConcurrentModificationException while testing 
SqlQueryEqualityPreCommitValidator with the quickstart guide.

*To Reproduce*

Steps to reproduce the behavior:
 # Insert data without any pre-commit validations
 # Update data (ensuring the updates don't touch the fare column in the quickstart 
example) with the following pre-commit validator props

{{option("hoodie.precommit.validators", 
"org.apache.hudi.client.validator.SqlQueryEqualityPreCommitValidator").
option("hoodie.precommit.validators.equality.sql.queries", "select sum(fare) 
from ").}}
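
For readers unfamiliar with the exception class in the stack trace below: its top frame is HashMap$ValueSpliterator.forEachRemaining, which is where the JDK reports that a HashMap was structurally modified while a stream was traversing its values. The following is a minimal, standalone sketch of that general JDK behaviour only. It is not Hudi code and makes no claim about what SparkValidatorUtils actually does; the class name CmeSketch and both method names are hypothetical, chosen for illustration.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of Hudi.
final class CmeSketch {

    // Streams over the map's values and adds a new key from inside the
    // traversal. Returns true if the JDK reports the structural change.
    static boolean throwsWhenMutatedDuringStream(Map<String, Integer> map) {
        try {
            map.values().stream().forEach(v -> map.put("extra-" + v, v));
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Copies the values first, so the traversal runs over the snapshot and
    // adding keys to the map no longer interferes with it.
    static List<Integer> mutateFromSnapshot(Map<String, Integer> map) {
        List<Integer> snapshot = new ArrayList<>(map.values());
        snapshot.forEach(v -> map.put("extra-" + v, v));
        return snapshot;
    }

    public static void main(String[] args) {
        Map<String, Integer> first = new HashMap<>();
        first.put("a", 1);
        first.put("b", 2);
        System.out.println("CME while streaming: " + throwsWhenMutatedDuringStream(first));

        Map<String, Integer> second = new HashMap<>();
        second.put("a", 1);
        second.put("b", 2);
        List<Integer> snapshot = mutateFromSnapshot(second);
        System.out.println("snapshot size: " + snapshot.size() + ", map size: " + second.size());
    }
}
```

Iterating over a copy is one common way to avoid this class of error in general; whether it is the appropriate fix for this issue is not established here.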

stacktrace:
{code:java}
org.apache.hudi.exception.HoodieUpsertException: Failed to upsert for commit time 20211124114945342
    at org.apache.hudi.table.action.commit.AbstractWriteHelper.write(AbstractWriteHelper.java:62)
    at org.apache.hudi.table.action.commit.SparkUpsertCommitActionExecutor.execute(SparkUpsertCommitActionExecutor.java:46)
    at org.apache.hudi.table.HoodieSparkCopyOnWriteTable.upsert(HoodieSparkCopyOnWriteTable.java:111)
    at org.apache.hudi.table.HoodieSparkCopyOnWriteTable.upsert(HoodieSparkCopyOnWriteTable.java:95)
    at org.apache.hudi.client.SparkRDDWriteClient.upsert(SparkRDDWriteClient.java:174)
    at org.apache.hudi.DataSourceUtils.doWriteOperation(DataSourceUtils.java:214)
    at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:276)
    at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:164)
    at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:45)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:86)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
    at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:80)
    at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:80)
    at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676)
    at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676)
    at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
    at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
    at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:676)
    at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:285)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:271)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:229)
    ... 70 elided
Caused by: java.util.ConcurrentModificationException
    at java.util.HashMap$ValueSpliterator.forEachRemaining(HashMap.java:1633)
    at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:743)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
    at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
    at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:272)
    at java.util.HashMap$KeySpliterator.forEachRemaining(HashMap.java:1556)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
    at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:566)
    at org.apache.hudi.client.utils.SparkValidatorUtils.getRecordsFromPendingCommits(SparkValidatorUtils.java:159)
    at org.apache.hudi.client.utils.SparkValidatorUtils.runValidators(SparkValidatorUtils.java:78)
