[jira] [Updated] (HUDI-5404) add flink bundle validation

2022-12-17 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-5404:
-
Sprint: 2022/12/12

> add flink bundle validation
> ---
>
> Key: HUDI-5404
> URL: https://issues.apache.org/jira/browse/HUDI-5404
> Project: Apache Hudi
>  Issue Type: Test
>  Components: tests-ci
>Reporter: Raymond Xu
>Assignee: Raymond Xu
>Priority: Blocker
> Fix For: 0.12.2
>
>
> Validate Flink bundles via GitHub Actions CI



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [hudi] hudi-bot commented on pull request #7497: [HUDI-5412] Send the boostrap event if the JM also rebooted

2022-12-17 Thread GitBox


hudi-bot commented on PR #7497:
URL: https://github.com/apache/hudi/pull/7497#issuecomment-1356676335

   
   ## CI report:
   
   * f6bba36d1adc3a52889839246129a89b754f8a43 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13837)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #7497: [HUDI-5412] Send the boostrap event if the JM also rebooted

2022-12-17 Thread GitBox


hudi-bot commented on PR #7497:
URL: https://github.com/apache/hudi/pull/7497#issuecomment-1356674047

   
   ## CI report:
   
   * f6bba36d1adc3a52889839246129a89b754f8a43 UNKNOWN
   
   



[jira] [Updated] (HUDI-5412) Send the boostrap event if the JM also rebooted

2022-12-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-5412:
-
Labels: pull-request-available  (was: )

> Send the boostrap event if the JM also rebooted
> ---
>
> Key: HUDI-5412
> URL: https://issues.apache.org/jira/browse/HUDI-5412
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: flink
>Reporter: Danny Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.12.2, 0.13.0
>
>






[GitHub] [hudi] danny0405 opened a new pull request, #7497: [HUDI-5412] Send the boostrap event if the JM also rebooted

2022-12-17 Thread GitBox


danny0405 opened a new pull request, #7497:
URL: https://github.com/apache/hudi/pull/7497

   ### Change Logs
   
   When the JM and the TM executors both restart, the initial instant on the JM 
is null, so all writers wait for the instant to bootstrap until the timeout 
triggers.
   
   ### Impact
   
   Fix the timeout caused by a JM crash.
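
For illustration only, here is a minimal sketch of the stall described above and of how a bootstrap event can unblock it. `CoordinatorSketch`, `onBootstrapEvent`, and `awaitInstant` are hypothetical names invented for this example; they are not Hudi's or Flink's actual classes, and the real change in this PR may work differently.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical stand-in for the JM-side coordinator; not Hudi's actual class.
class CoordinatorSketch {
  // After a JM restart no instant exists yet, so this future starts out incomplete.
  private final CompletableFuture<String> instant = new CompletableFuture<>();

  // Called when a restarted writer announces itself; the first event starts an instant.
  void onBootstrapEvent(int writerId) {
    instant.complete("instant-started-for-writer-" + writerId);
  }

  // Writers block here until an instant exists or the timeout elapses.
  String awaitInstant(long timeoutMillis) throws TimeoutException {
    try {
      return instant.get(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for an instant", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("instant could not be started", e);
    }
  }

  public static void main(String[] args) throws TimeoutException {
    // Without a bootstrap event the writer can only run into the timeout.
    CoordinatorSketch stalled = new CoordinatorSketch();
    try {
      stalled.awaitInstant(50);
    } catch (TimeoutException e) {
      System.out.println("no bootstrap event: writer timed out");
    }

    // With a bootstrap event the coordinator has an instant to hand out.
    CoordinatorSketch recovered = new CoordinatorSketch();
    recovered.onBootstrapEvent(0);
    System.out.println(recovered.awaitInstant(50));
  }
}
```

The sketch models only the waiting behaviour; checkpointing and event transport are left out on purpose.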
   
   ### Risk level (write none, low medium or high below)
   
   none
   
   ### Documentation Update
   
   _Describe any necessary documentation update if there is any new feature, 
config, or user-facing change_
   
   - _The config description must be updated if new configs are added or the 
default value of the configs are changed_
   - _Any new feature or user-facing change requires updating the Hudi website. 
Please create a Jira ticket, attach the
 ticket number here and follow the 
[instruction](https://hudi.apache.org/contribute/developer-setup#website) to 
make
 changes to the website._
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   





[jira] [Created] (HUDI-5412) Send the boostrap event if the JM also rebooted

2022-12-17 Thread Danny Chen (Jira)
Danny Chen created HUDI-5412:


 Summary: Send the boostrap event if the JM also rebooted
 Key: HUDI-5412
 URL: https://issues.apache.org/jira/browse/HUDI-5412
 Project: Apache Hudi
  Issue Type: Bug
  Components: flink
Reporter: Danny Chen
 Fix For: 0.12.2, 0.13.0








svn commit: r58796 - /release/hudi/KEYS

2022-12-17 Thread sivabalan
Author: sivabalan
Date: Sun Dec 18 04:04:16 2022
New Revision: 58796

Log:
Adding keys for satish

Modified:
release/hudi/KEYS

Modified: release/hudi/KEYS
==
--- release/hudi/KEYS (original)
+++ release/hudi/KEYS Sun Dec 18 04:04:16 2022
@@ -,3 +,63 @@ vVmwNnpErMRCa+GMaulS06s2mkJdLVX8EW5z3BLz
 RRaeFMCVTqi/Xw==
 =LZ7a
 -----END PGP PUBLIC KEY BLOCK-----
+pub   rsa4096 2022-11-19 [SC] [expires: 2026-11-19]
+  6DA0B39A13C2658D22AE7D14D08C4B6BD98EA659
+uid   [ultimate] hudi 
+sig 3D08C4B6BD98EA659 2022-11-19  hudi 
+sub   rsa4096 2022-11-19 [E] [expires: 2026-11-19]
+sig  D08C4B6BD98EA659 2022-11-19  hudi 
+
+-----BEGIN PGP PUBLIC KEY BLOCK-----
+
+mQINBGN5EkYBEACeR5uneVMbI5W5BtHEibjXEuskBHT7Z+MiU158YQEAmVvy+9NK
+TBntrIpNioqgdQlyZypcGQvuse11+TYh0AbCmK6y+8iOqi7EiDsSsdDOwTR7ROXJ
+Br1/ZLQbjb9poPFQxOzSLQWeQT6ETU6wZwrZ8ZC/ZJ1hGKX+SsDjqHWGZhPZInfB
+c/uqmB1advGZEdWRKSN4b4IIcOL69vO50NfGFTbu5n6MQjpFGnBoW9Ed3IO+UsvN
+3K7opD6/DYth5F88shvW5YEEuS/yHBHKHPs4XAqVCtjUozDmMmpbIEKuF3Id9eO6
+l+mhAUBLy9Tj6dgkIyQ6nkjzJwe0BsijyL/U1O5XFpnP+ETK1QsrPbQrQ4ykFv3R
+LV8qnXilyiM3iKtxDwvEtZPF7fMtGwhQEKDBwWu0zfsVZ1kQSvuvYC/T83BCQ2Rx
+faP+Xy3bc/979WtvxM30yxp2aQ+ZcCLB6ORF95irXTVu3Z6QSZRGCUfJmAfckXOv
+u7mLh4wH2Lua8ppruqE8Ic1cpl/VQOLuYOBM0yMuJSpAUuSVt9k8XKdZt8zrg0BV
+NdIo9uir7lf0csKtFR79vPJq0YBOo7sj56C6KDQZLQ6B9Dx54qGr03RZTAeidb2R
+qTaNSrvJzVtMNdPwgHtn507OGt2ZJOo4cIbl+Im5IzMesvC4uRzrqZTrJwARAQAB
+tBhodWRpIDxzYXRpc2hAYXBhY2hlLm9yZz6JAlQEEwEIAD4WIQRtoLOaE8JljSKu
+fRTQjEtr2Y6mWQUCY3kSRgIbAwUJB4YfXQULCQgHAgYVCgkICwIEFgIDAQIeAQIX
+gAAKCRDQjEtr2Y6mWdyuD/9AgaM08CYxbAYDtPAb6uC1edCZbvPzkP98us4m8jL/
+979grfvgyPkH2c87f8ec/JlGIOZSaDZOsNO9hhsCyfT3SrN/DQnIqlimEkh4k7Wb
+DGp3aktP5Qv80BtExkIca8J92Z7Cs5FRub9Vp51bqfS9wDgBZDvTbOpXc2snJgK5
+Bh9JlfFUyb4ev6pFizrT/sL5COhkqYKgyunl8fMOiX2hgl/aNyOjOCOQrHQNpW8d
+EsdTvj8+IVkadeCkD5+lqaNS1cY0U7ycGpciwjGZ4aNypb27lF2L0o5zmKT0u2yD
+gtOg28RpMV6uz4rQWNibz1vH/USGxIdd67dPFVYshUqicdhjmH4848qGkkXvhgqE
+wL0e5EH9HxbN6VocxZ9YHvnNfA8hy2K0sJTm+TMQJvD7dW112LLX3u8XLmDp4URQ
+bYwtEw82VfcbYIZbXUIWY5NPLNevDKVs3SXmkdXXz0OzsX0ODb+pp3rW/NztOeQ9
+4huvwXLmm9WiKiTgz7SQXvhZNpi6sUIlX82yEHr/+KbCXRTsz4xmSMBrqKufbTrF
+P/QyH6mONeXdsCb50jkMG8L2TzFEQdElchInobcfAZ0E2SuZ8Rmm6HdB2iS4SN6O
+jexAkN0VVi63f2Zsl2XjZhckW3x/X52CzlyAPc6m0NxsrEYjsmdzX0ACn0g/6KCj
+M7kCDQRjeRJGARAAs8p3JcS7icMBJIl1MHDjF0nGBMrway7HpqzROfnXJLogjUu9
+L0ASGojloytecQzcDGDbB8zuF2o+qyu0EtEzMc8m2PrRgBmOg+TMEZOovCSjiIEZ
+/w7ZOlfOU8Iva3fBbAg++oFb24LEOC+z1gjcUie0QlvReZWLWvZ1ATD3y0oWapqR
+IyyqOHaSF/l8cIRvv2kgigEvLch8iVuVHnc/ZOjyQ5iEbBZpe/ejg08dlPU2VCdO
+jGcL17JOqoltCKsQmK+xBnAHQL5VNcTd9fEo6FeiUIofyf/9d/LpqPVjo1zo6Eld
+9hk7q3I5Ms4Lh+cbtslnzi7t7U+cI+Zs0s7G0FcMBIXzqdgiHP3doQVm1Viex12f
+Wp+lN+QJDmyo+wEtkxbWSXKutiL0OAdSmO/1Cx901ygSTw5F/lmTxzqn3oc8F3vq
+lMQKn+WpKRcMwWeQU3nhtSi/zw7Zto/LyWmt8JQdQUoFYAXIXIbQaihP1k3COnnC
+wPj3XbzIBUWCM2642jcX/ieUrsKLRu7/WoVf0L6CPLqk8QKPBI85HUooB5oA8ZzX
+U4Io/VzRZ9plo1q8I9JOR9g4HLoGh4GovWlfjsifa/h5j2W1o7Z/Ix9Ze5fDvKW+
+0wvRONQKiAeiKYy0k/SfX6WBxHhMNg9VmaWT6bhCKstSJX/Mo1uu4d+BUxkAEQEA
+AYkCPAQYAQgAJhYhBG2gs5oTwmWNIq59FNCMS2vZjqZZBQJjeRJGAhsMBQkHhh9d
+AAoJENCMS2vZjqZZYs0P/RspAruxRd5dhWc0YA1KM8BkGg7UZDa1o3EYBkX/clm5
+QaeI2ozTphVPACyonCSxsH1AkC8Vi5TFkg3PKHMe7MAlPDxlW94nLnBBIk+ncDeL
+kz+CI1oFDXF1KrohSyzgxTfw9wHMn5vsBMJ+Of1+YSSNhTN5XmMgA1qwz9po6SU2
+FgzTrfrMSv8E7vANusqcl+hfGpdUg6oOA9LRJziIzd+Zrddq0urq49qAaDF3VEq6
+kh/nAMtvRiT/idLJE6z0O1INRpj6Bq8J6JsadM9CSsVHYn4Vn/38rJTl4FPJpaxn
+Hyfn/j+BsaWCr1mCRqVsUexcIhDQCtND9mVkYt0RBJaCJ+jVpGReoXcxL/yqpV2H
+rQyKTQYYJmBTUimbvX30ct+7UH7wM7llTcRRqF+EnVU+5+y8AMtbDlIoByX1NnPM
+qgJYRxlxbo79FGg03fKA0NRSxszpZb2BqGuVZtVfXMLCorMpua/5S8KriHujGLlx
+KJtkpiC/npPXxvWvVyi/4h184Xrp7wCQ0ITapxnCHaxLHdBak2kSCbhBuAcNnalJ
+FDeahocca6V+Sxosds8J9keQNIzz+HAoFbdGBBRsPv/3rZbG50s4CTGGxPpiBIpR
+bTekUhOFAo/Xl12LSY0Wv5c7YEWWgbFH9qfKg5srtYEGqjJe0yWKWpzQKZVuSMA4
+=pbmq
+-----END PGP PUBLIC KEY BLOCK-----
+
+




svn commit: r58795 - in /dev/hudi/hudi-0.12.2-rc1: ./ hudi-0.12.2-rc1.src.tgz hudi-0.12.2-rc1.src.tgz.asc hudi-0.12.2-rc1.src.tgz.sha512

2022-12-17 Thread satish
Author: satish
Date: Sun Dec 18 03:52:01 2022
New Revision: 58795

Log:
artifacts for 0.12.2-rc1

Added:
dev/hudi/hudi-0.12.2-rc1/
dev/hudi/hudi-0.12.2-rc1/hudi-0.12.2-rc1.src.tgz   (with props)
dev/hudi/hudi-0.12.2-rc1/hudi-0.12.2-rc1.src.tgz.asc
dev/hudi/hudi-0.12.2-rc1/hudi-0.12.2-rc1.src.tgz.sha512

Added: dev/hudi/hudi-0.12.2-rc1/hudi-0.12.2-rc1.src.tgz
==
Binary file - no diff available.

Propchange: dev/hudi/hudi-0.12.2-rc1/hudi-0.12.2-rc1.src.tgz
--
svn:mime-type = application/octet-stream

Added: dev/hudi/hudi-0.12.2-rc1/hudi-0.12.2-rc1.src.tgz.asc
==
--- dev/hudi/hudi-0.12.2-rc1/hudi-0.12.2-rc1.src.tgz.asc (added)
+++ dev/hudi/hudi-0.12.2-rc1/hudi-0.12.2-rc1.src.tgz.asc Sun Dec 18 03:52:01 2022
@@ -0,0 +1,16 @@
+-----BEGIN PGP SIGNATURE-----
+
+iQIzBAABCAAdFiEEbaCzmhPCZY0irn0U0IxLa9mOplkFAmOeUYwACgkQ0IxLa9mO
+plkMSxAAmo3SlEaMiPPb8d0k/nkl/Zsn1DkgE7h3xqWsao3afFe7SmWeOsqEkZn5
+thjor2Ayz7ocHN0A+UlY7v5xbKx6TV7ME9RiaO2xFtRu4WCm9Hpt8l/6dgPg2HSR
+lk6wnU7A1zCliC4ZlIdmn0psqKJ9eQkNXmKqwEE8jRtZQ4g7kUsqg6mHSzdnXpV5
+M4q9TE87fb539tb2zlbUeHCrMF9Xcz6DqtHkF7/fH6/ps4EDGedsmoOAKhxOPDCC
+K2+G2BEpFzT/zojo/7fRMn9VNDOAbTaA058fg3XsR7+xtGCWnlhwjaJYJaIidFVG
+cbgpa5tHfDRcrkIDi3JYcUhVLJemUr05XxV4MEYpaDGyBKxkRr9TU4di1gnOHEPU
+aCNZ66NQMwmfPfr7+52UWBIksM+X0pnbQFyFa3d+nxxxuR+HgSi97ZAUNx4FXWl0
+1ktNnOdUqZOsA2Xs/JCvl2k8q7fLcyU7sB1cy8Fqa4US7/sVVeWUJLRvlljluLNP
+Bu4/JN2m9+FJ+j0zG0wCZEiYUgRLNjKQVMuKxrA5HcuZDBhmICOt0OKl8LY0dc+/
+fikq/rsP+EuNrw7UCPt6cg+ab2fv7hpkQqaxF6ob/6YpOTs5Sl0tSdtiSBKDrvPQ
+RX8dvb3Y2vrc/10nT82F7j/d7u3tnL82W+1aTLKtZpia4K68L4Q=
+=Q2f9
+-----END PGP SIGNATURE-----

Added: dev/hudi/hudi-0.12.2-rc1/hudi-0.12.2-rc1.src.tgz.sha512
==
--- dev/hudi/hudi-0.12.2-rc1/hudi-0.12.2-rc1.src.tgz.sha512 (added)
+++ dev/hudi/hudi-0.12.2-rc1/hudi-0.12.2-rc1.src.tgz.sha512 Sun Dec 18 03:52:01 2022
@@ -0,0 +1 @@
+9132b6f66ef431e24c99e8da6c7da189b100148681c366e7008b334756b34de7878ee27be2a6b4211f33beb6d22f5e4e60832b8d0b64b23de09c95be7a09e407  hudi-0.12.2-rc1.src.tgz
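
A `.sha512` file like the one added above holds a hex digest followed by the artifact name. As a hedged sketch (not part of the release tooling), the check can be reproduced with the JDK's `MessageDigest`. The class name `Sha512CheckSketch` and its helpers are made up for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative helper; the name and structure are assumptions, not release tooling.
class Sha512CheckSketch {
  // Streams the file through SHA-512 and returns the digest as lowercase hex.
  static String sha512Hex(Path file) throws IOException, NoSuchAlgorithmException {
    MessageDigest digest = MessageDigest.getInstance("SHA-512");
    try (InputStream in = Files.newInputStream(file)) {
      byte[] buffer = new byte[8192];
      int read;
      while ((read = in.read(buffer)) != -1) {
        digest.update(buffer, 0, read);
      }
    }
    StringBuilder hex = new StringBuilder();
    for (byte b : digest.digest()) {
      hex.append(String.format("%02x", b));
    }
    return hex.toString();
  }

  // Compares the file's digest with the first whitespace-separated token of a checksum line.
  static boolean matches(Path file, String checksumLine)
      throws IOException, NoSuchAlgorithmException {
    String expected = checksumLine.trim().split("\\s+")[0];
    return expected.equalsIgnoreCase(sha512Hex(file));
  }

  public static void main(String[] args) throws Exception {
    if (args.length != 2) {
      System.err.println("usage: Sha512CheckSketch <artifact> <artifact.sha512>");
      return;
    }
    String checksumLine =
        new String(Files.readAllBytes(Paths.get(args[1])), StandardCharsets.UTF_8);
    boolean ok = matches(Paths.get(args[0]), checksumLine);
    System.out.println(ok ? "checksum OK" : "checksum MISMATCH");
  }
}
```

Running it with the `.src.tgz` and its `.sha512` file as arguments prints whether the two agree; the signature in the `.asc` file still needs a separate PGP check.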




[hudi] branch asf-site updated: [MINOR] Adding more blogs to our website, dec 17, 2022 (#7496)

2022-12-17 Thread leesf
This is an automated email from the ASF dual-hosted git repository.

leesf pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new 6ab4d15a73 [MINOR] Adding more blogs to our website, dec 17, 2022 
(#7496)
6ab4d15a73 is described below

commit 6ab4d15a73354e0eb76cfa1557dd8b3cb7f0388e
Author: Sivabalan Narayanan 
AuthorDate: Sat Dec 17 19:48:19 2022 -0800

[MINOR] Adding more blogs to our website, dec 17, 2022 (#7496)
---
 ...lake-on-AWS-using-Amazon-EMR\342\200\223Part-1.mdx" |  17 +
 .../2022-11-22-aws_hudi_best_practices_part1.jpeg  | Bin 0 -> 50517 bytes
 2 files changed, 17 insertions(+)

diff --git 
"a/website/blog/2022-11-22-Build-your-Apache-Hudi-data-lake-on-AWS-using-Amazon-EMR\342\200\223Part-1.mdx"
 
"b/website/blog/2022-11-22-Build-your-Apache-Hudi-data-lake-on-AWS-using-Amazon-EMR\342\200\223Part-1.mdx"
new file mode 100644
index 00..79fdf65b90
--- /dev/null
+++ 
"b/website/blog/2022-11-22-Build-your-Apache-Hudi-data-lake-on-AWS-using-Amazon-EMR\342\200\223Part-1.mdx"
@@ -0,0 +1,17 @@
+---
+title: "Build your Apache Hudi data lake on AWS using Amazon EMR – Part 1"
+authors:
+- name: Suthan Phillips
+- name: Dylan Qu
+category: blog
+image: /assets/images/blog/2022-11-22-aws_hudi_best_practices_part1.jpeg
+tags:
+- how-to
+- best-practices
+- amazon
+---
+
+import Redirect from '@site/src/components/Redirect';
+
+<Redirect url="https://aws.amazon.com/blogs/big-data/part-1-build-your-apache-hudi-data-lake-on-aws-using-amazon-emr/">Redirecting... please wait!! </Redirect>
+
diff --git 
a/website/static/assets/images/2022-11-22-aws_hudi_best_practices_part1.jpeg 
b/website/static/assets/images/2022-11-22-aws_hudi_best_practices_part1.jpeg
new file mode 100644
index 00..f44875941d
Binary files /dev/null and 
b/website/static/assets/images/2022-11-22-aws_hudi_best_practices_part1.jpeg 
differ



[GitHub] [hudi] leesf merged pull request #7496: [MINOR] Updating blog articles

2022-12-17 Thread GitBox


leesf merged PR #7496:
URL: https://github.com/apache/hudi/pull/7496





[hudi] branch master updated: [HUDI-5377] Write call stack information to lock file (#7440)

2022-12-17 Thread leesf

leesf pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new 6c7271f60a [HUDI-5377] Write call stack information to lock file 
(#7440)
6c7271f60a is described below

commit 6c7271f60a4bd398c469a44dd5efb9d31f068a2b
Author: HunterXHunter <1356469...@qq.com>
AuthorDate: Sun Dec 18 11:46:25 2022 +0800

[HUDI-5377] Write call stack information to lock file (#7440)
---
 .../lock/FileSystemBasedLockProvider.java  | 44 +-
 .../hudi/client/transaction/lock/LockInfo.java | 67 ++
 .../hudi/client/transaction/lock/LockManager.java  |  2 +-
 .../org/apache/hudi/common/lock/LockProvider.java  |  4 ++
 4 files changed, 113 insertions(+), 4 deletions(-)

diff --git 
a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/transaction/lock/FileSystemBasedLockProvider.java
 
b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/transaction/lock/FileSystemBasedLockProvider.java
index 4135ef9acd..efa644f4b0 100644
--- 
a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/transaction/lock/FileSystemBasedLockProvider.java
+++ 
b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/transaction/lock/FileSystemBasedLockProvider.java
@@ -20,6 +20,8 @@
 package org.apache.hudi.client.transaction.lock;
 
 import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FSDataOutputStream;
 import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.fs.Path;
 import org.apache.hudi.common.config.LockConfiguration;
@@ -27,6 +29,7 @@ import org.apache.hudi.common.fs.FSUtils;
 import org.apache.hudi.common.lock.LockProvider;
 import org.apache.hudi.common.lock.LockState;
 import org.apache.hudi.common.table.HoodieTableMetaClient;
+import org.apache.hudi.common.util.FileIOUtils;
 import org.apache.hudi.common.util.StringUtils;
 import org.apache.hudi.common.util.ValidationUtils;
 import org.apache.hudi.config.HoodieWriteConfig;
@@ -37,6 +40,7 @@ import org.apache.log4j.Logger;
 
 import java.io.IOException;
 import java.io.Serializable;
+import java.text.SimpleDateFormat;
 import java.util.concurrent.TimeUnit;
 
 import static 
org.apache.hudi.common.config.LockConfiguration.FILESYSTEM_LOCK_EXPIRE_PROP_KEY;
@@ -50,13 +54,14 @@ import static 
org.apache.hudi.common.config.LockConfiguration.FILESYSTEM_LOCK_PA
 public class FileSystemBasedLockProvider implements LockProvider, 
Serializable {
 
   private static final Logger LOG = 
LogManager.getLogger(FileSystemBasedLockProvider.class);
-
   private static final String LOCK_FILE_NAME = "lock";
-
   private final int lockTimeoutMinutes;
   private final transient FileSystem fs;
   private final transient Path lockFile;
   protected LockConfiguration lockConfiguration;
+  private SimpleDateFormat sdf;
+  private LockInfo lockInfo;
+  private String currentOwnerLockInfo;
 
   public FileSystemBasedLockProvider(final LockConfiguration 
lockConfiguration, final Configuration configuration) {
 checkRequiredProps(lockConfiguration);
@@ -68,6 +73,8 @@ public class FileSystemBasedLockProvider implements 
LockProvider, Serial
 }
 this.lockTimeoutMinutes = 
lockConfiguration.getConfig().getInteger(FILESYSTEM_LOCK_EXPIRE_PROP_KEY);
 this.lockFile = new Path(lockDirectory + Path.SEPARATOR + LOCK_FILE_NAME);
+this.lockInfo = new LockInfo();
+this.sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS");
 this.fs = FSUtils.getFs(this.lockFile.toString(), configuration);
   }
 
@@ -92,6 +99,7 @@ public class FileSystemBasedLockProvider implements 
LockProvider, Serial
 fs.delete(this.lockFile, true);
 LOG.warn("Delete expired lock file: " + this.lockFile);
   } else {
+reloadCurrentOwnerLockInfo();
 return false;
   }
 }
@@ -122,6 +130,11 @@ public class FileSystemBasedLockProvider implements 
LockProvider, Serial
 return this.lockFile.toString();
   }
 
+  @Override
+  public String getCurrentOwnerLockInfo() {
+return currentOwnerLockInfo;
+  }
+
   private boolean checkIfExpired() {
 if (lockTimeoutMinutes == 0) {
   return false;
@@ -139,7 +152,32 @@ public class FileSystemBasedLockProvider implements 
LockProvider, Serial
 
   private void acquireLock() {
 try {
-  fs.create(this.lockFile, false).close();
+  if (!fs.exists(this.lockFile)) {
+FSDataOutputStream fos = fs.create(this.lockFile, false);
+initLockInfo();
+fos.writeBytes(lockInfo.toString());
+fos.close();
+  }
+} catch (IOException e) {
+  throw new 
HoodieIOException(generateLogStatement(LockState.FAILED_TO_ACQUIRE), e);
+}
+  }
+
+  public void initLockInfo() {
+lockInfo.setLockCreateTime(sdf.format(System.currentT
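
The diff is cut off in the archive at this point. As a rough, self-contained sketch of the idea in the commit title (record who holds the lock, and from where, inside the lock file), the following uses `java.nio` on a local path instead of Hadoop's `FileSystem`. Everything except the `lockCreateTime` notion is an assumption: `LockFileSketch`, the `lockThreadName` field, and the text layout are invented for illustration and are not the contents of the actual `LockInfo` class.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical sketch; not the FileSystemBasedLockProvider implementation.
class LockFileSketch {
  // Builds a human-readable description of the would-be lock owner.
  static String describeOwner() {
    StringBuilder sb = new StringBuilder();
    String now = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS").format(new Date());
    sb.append("lockCreateTime=").append(now).append('\n');
    sb.append("lockThreadName=").append(Thread.currentThread().getName()).append('\n');
    // The call stack shows which code path took the lock.
    for (StackTraceElement frame : Thread.currentThread().getStackTrace()) {
      sb.append("  at ").append(frame).append('\n');
    }
    return sb.toString();
  }

  // Creates the lock file only if it does not exist yet; returns false if it is already held.
  static boolean tryAcquire(Path lockFile) throws IOException {
    try {
      Files.write(lockFile, describeOwner().getBytes(StandardCharsets.UTF_8),
          StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
      return true;
    } catch (FileAlreadyExistsException e) {
      return false;
    }
  }

  // Reads back the owner description so a blocked caller can log who holds the lock.
  static String currentOwner(Path lockFile) throws IOException {
    return new String(Files.readAllBytes(lockFile), StandardCharsets.UTF_8);
  }

  public static void main(String[] args) throws IOException {
    Path lockFile = Files.createTempDirectory("lock-sketch").resolve("lock");
    System.out.println("first attempt acquired: " + tryAcquire(lockFile));
    System.out.println("second attempt acquired: " + tryAcquire(lockFile));
    System.out.println(currentOwner(lockFile));
  }
}
```

`CREATE_NEW` makes creation fail when the file already exists, which plays the role that `fs.create(this.lockFile, false)` has in the diff.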

[GitHub] [hudi] leesf merged pull request #7440: [HUDI-5377] Write call stack information to lock file

2022-12-17 Thread GitBox


leesf merged PR #7440:
URL: https://github.com/apache/hudi/pull/7440





[hudi] annotated tag release-0.12.2-rc1 updated (94db72e2c9 -> c97d0ad4cb)

2022-12-17 Thread satish

satish pushed a change to annotated tag release-0.12.2-rc1
in repository https://gitbox.apache.org/repos/asf/hudi.git


*** WARNING: tag release-0.12.2-rc1 was modified! ***

from 94db72e2c9 (commit)
  to c97d0ad4cb (tag)
 tagging 94db72e2c9435307b554fb1d2fbc8e0431f9ff38 (commit)
 replaces release-0.12.1
  by Satish Kotha
  on Sat Dec 17 19:41:44 2022 -0800

- Log -
0.12.2
-----BEGIN PGP SIGNATURE-----

iQIzBAABCAAdFiEEbaCzmhPCZY0irn0U0IxLa9mOplkFAmOei/gACgkQ0IxLa9mO
plmxaw//fcfahtYq3lx4xWKib+shGalD18f72vLFqQsITwTBQrx1Ci6XDR8rHV2s
ZkPW2tOTJLDD/hIv22aNE9Q9IZESYu3sFhkLmLx/HIyXtspSRp7+hZEdgKnG0P/6
3KdBxSMBhXt23L+qA2E68B3XC/0tgyjl/6pt48fEZgIXd4t7F902l8X2fxUmEsNB
rGtCN4/auNGpq/ZqYsC7BN0fLEgxNtlSh3nR4Urtw5EulZQOKBxf2TWYJw9qM71n
HbVRLOPBhUWyeCTRdnK5ipvK6p1sfUuLLRbbEysRnVA8r8VBxvFCkl6GaopLPaWp
/LDo9UYxk+nM3BYw6RzjTXdp4QFfSbgEVsFFqKVhWFMPqnJ46muBEHJFPhGWFGo1
LcTQtiAAyI9ZX+TS7qfxYzPaLrrRkHcazlPnnqChGYvfsypHuKaTmZdnvQw35F3L
XHwgqTg0Wog7twWVsmGQrJOLQM688Up/nqro1wHekEeJq2l3cyqyD3d9wC0wlDzi
5R6XehyWZACXZgJVDcBmsLy663sGJKVq8x8fSl0qmy1wjd/TcVWnUEjX855UeRTM
17FZ0J5FZ06vg9OGKdks9ZG6yOJCdtEhhIbwOT8TL3TLvIKTpqfRzMUV+U8hOLWv
r1WrC4zQura/Vh0ACNCGr9OdN/XW/c0W67Q9LywgqfGh5lEQ/W4=
=Jvv5
-----END PGP SIGNATURE-----
---


No new revisions were added by this update.

Summary of changes:



[GitHub] [hudi] danny0405 commented on a diff in pull request #7482: [MINOR] Fix the inconsistent behavior to calculate the value count between COW and MOR

2022-12-17 Thread GitBox


danny0405 commented on code in PR #7482:
URL: https://github.com/apache/hudi/pull/7482#discussion_r1051514947


##
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java:
##
@@ -148,6 +148,8 @@ class ColumnStats {
 final Object fieldVal = 
convertValueForSpecificDataTypes(field.schema(), 
genericRecord.get(field.name()), false);
 final Schema fieldSchema = 
getNestedFieldSchemaFromWriteSchema(genericRecord.getSchema(), field.name());
 
+colStats.valueCount++;
+
 if (fieldVal != null && canCompare(fieldSchema)) {

Review Comment:
   This seems like the correct fix. Can we add a test case?
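
To make the behaviour under review concrete, here is a small hedged sketch of column statistics in which the value count is incremented for every record, before the null check, as in the hunk above. `ColumnStatsSketch` is a simplified, hypothetical model restricted to `Integer` values; it is not the `ColumnStats` class from `HoodieTableMetadataUtil`, and the null-count handling is an assumption.

```java
import java.util.Arrays;
import java.util.List;

// Simplified, hypothetical model of per-column statistics.
class ColumnStatsSketch {
  long valueCount;
  long nullCount;
  Integer min;
  Integer max;

  void accept(Integer value) {
    // Count every record up front, whether or not the value is null or comparable.
    valueCount++;
    if (value == null) {
      nullCount++;
      return;
    }
    if (min == null || value < min) {
      min = value;
    }
    if (max == null || value > max) {
      max = value;
    }
  }

  static ColumnStatsSketch of(List<Integer> values) {
    ColumnStatsSketch stats = new ColumnStatsSketch();
    for (Integer value : values) {
      stats.accept(value);
    }
    return stats;
  }

  public static void main(String[] args) {
    ColumnStatsSketch stats = of(Arrays.asList(3, null, 1));
    System.out.println(stats.valueCount + " values, " + stats.nullCount + " null, min="
        + stats.min + ", max=" + stats.max);
  }
}
```

A test case along these lines would feed records that include nulls and check that the value count equals the record count.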






[GitHub] [hudi] Virmaline commented on issue #6278: [SUPPORT] Deltastreamer fails with data and timestamp related exception after upgrading to EMR 6.5 and spark3

2022-12-17 Thread GitBox


Virmaline commented on issue #6278:
URL: https://github.com/apache/hudi/issues/6278#issuecomment-1356496864

   Can you post your exact spark-submit command? Do you know why it's failing? 
What are the data type and value in the column?





[hudi] 01/01: Bumping mvn version to 0.12.2-1

2022-12-17 Thread satish

satish pushed a commit to branch release-0.12.2
in repository https://gitbox.apache.org/repos/asf/hudi.git

commit 94db72e2c9435307b554fb1d2fbc8e0431f9ff38
Author: Satish Kotha 
AuthorDate: Sat Dec 17 14:33:55 2022 -0800

Bumping mvn version to 0.12.2-1
---
 docker/hoodie/hadoop/base/pom.xml  | 2 +-
 docker/hoodie/hadoop/base_java11/pom.xml   | 2 +-
 docker/hoodie/hadoop/datanode/pom.xml  | 2 +-
 docker/hoodie/hadoop/historyserver/pom.xml | 2 +-
 docker/hoodie/hadoop/hive_base/pom.xml | 2 +-
 docker/hoodie/hadoop/namenode/pom.xml  | 2 +-
 docker/hoodie/hadoop/pom.xml   | 2 +-
 docker/hoodie/hadoop/prestobase/pom.xml| 2 +-
 docker/hoodie/hadoop/spark_base/pom.xml| 2 +-
 docker/hoodie/hadoop/sparkadhoc/pom.xml| 2 +-
 docker/hoodie/hadoop/sparkmaster/pom.xml   | 2 +-
 docker/hoodie/hadoop/sparkworker/pom.xml   | 2 +-
 docker/hoodie/hadoop/trinobase/pom.xml | 2 +-
 docker/hoodie/hadoop/trinocoordinator/pom.xml  | 2 +-
 docker/hoodie/hadoop/trinoworker/pom.xml   | 2 +-
 hudi-aws/pom.xml   | 4 ++--
 hudi-cli/pom.xml   | 2 +-
 hudi-client/hudi-client-common/pom.xml | 4 ++--
 hudi-client/hudi-flink-client/pom.xml  | 4 ++--
 hudi-client/hudi-java-client/pom.xml   | 4 ++--
 hudi-client/hudi-spark-client/pom.xml  | 4 ++--
 hudi-client/pom.xml| 2 +-
 hudi-common/pom.xml| 2 +-
 hudi-examples/hudi-examples-common/pom.xml | 2 +-
 hudi-examples/hudi-examples-flink/pom.xml  | 2 +-
 hudi-examples/hudi-examples-java/pom.xml   | 2 +-
 hudi-examples/hudi-examples-spark/pom.xml  | 2 +-
 hudi-examples/pom.xml  | 2 +-
 hudi-flink-datasource/hudi-flink/pom.xml   | 4 ++--
 hudi-flink-datasource/hudi-flink1.13.x/pom.xml | 4 ++--
 hudi-flink-datasource/hudi-flink1.14.x/pom.xml | 4 ++--
 hudi-flink-datasource/hudi-flink1.15.x/pom.xml | 4 ++--
 hudi-flink-datasource/pom.xml  | 4 ++--
 hudi-gcp/pom.xml   | 2 +-
 hudi-hadoop-mr/pom.xml | 2 +-
 hudi-integ-test/pom.xml| 2 +-
 hudi-kafka-connect/pom.xml | 4 ++--
 hudi-spark-datasource/hudi-spark-common/pom.xml| 4 ++--
 hudi-spark-datasource/hudi-spark/pom.xml   | 4 ++--
 hudi-spark-datasource/hudi-spark2-common/pom.xml   | 2 +-
 hudi-spark-datasource/hudi-spark2/pom.xml  | 4 ++--
 hudi-spark-datasource/hudi-spark3-common/pom.xml   | 2 +-
 hudi-spark-datasource/hudi-spark3.1.x/pom.xml  | 4 ++--
 hudi-spark-datasource/hudi-spark3.2.x/pom.xml  | 4 ++--
 hudi-spark-datasource/hudi-spark3.2plus-common/pom.xml | 2 +-
 hudi-spark-datasource/hudi-spark3.3.x/pom.xml  | 4 ++--
 hudi-spark-datasource/pom.xml  | 2 +-
 hudi-sync/hudi-adb-sync/pom.xml| 2 +-
 hudi-sync/hudi-datahub-sync/pom.xml| 2 +-
 hudi-sync/hudi-hive-sync/pom.xml   | 2 +-
 hudi-sync/hudi-sync-common/pom.xml | 2 +-
 hudi-sync/pom.xml  | 2 +-
 hudi-tests-common/pom.xml  | 2 +-
 hudi-timeline-service/pom.xml  | 2 +-
 hudi-utilities/pom.xml | 2 +-
 packaging/hudi-aws-bundle/pom.xml  | 2 +-
 packaging/hudi-datahub-sync-bundle/pom.xml | 2 +-
 packaging/hudi-flink-bundle/pom.xml| 2 +-
 packaging/hudi-gcp-bundle/pom.xml  | 2 +-
 packaging/hudi-hadoop-mr-bundle/pom.xml| 2 +-
 packaging/hudi-hive-sync-bundle/pom.xml| 2 +-
 packaging/hudi-integ-test-bundle/pom.xml   | 2 +-
 packaging/hudi-kafka-connect-bundle/pom.xml| 2 +-
 packaging/hudi-presto-bundle/pom.xml   | 2 +-
 packaging/hudi-spark-bundle/pom.xml| 2 +-
 packaging/hudi-timeline-server-bundle/pom.xml  | 2 +-
 packaging/hudi-trino-bundle/pom.xml| 2 +-
 packaging/hudi-utilities-bundle/pom.xml| 2 +-
 packaging/hudi-utilities-slim-bundle/pom.xml   | 2 +-
 pom.xml| 2 +-
 70 files changed, 87 insertions(+), 87 deletions(-)

diff --git a/docker/hoodie/hadoop/base/pom.xml 
b/docker/hoodie/hadoop/base/pom.xml
index 39ceb4006b..d8715bd077 100644
--- a/docker/hoodie/hadoop/base/pom.xml
+++ b/dock

[hudi] branch release-0.12.2 created (now 94db72e2c9)

2022-12-17 Thread satish

satish pushed a change to branch release-0.12.2
in repository https://gitbox.apache.org/repos/asf/hudi.git


  at 94db72e2c9 Bumping mvn version to 0.12.2-1

This branch includes the following new commits:

 new 94db72e2c9 Bumping mvn version to 0.12.2-1

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.




[GitHub] [hudi] hudi-bot commented on pull request #7495: [HUDI-5383] Release 0.12.2 branch test dec17

2022-12-17 Thread GitBox


hudi-bot commented on PR #7495:
URL: https://github.com/apache/hudi/pull/7495#issuecomment-1356484995

   
   ## CI report:
   
   * 0345a01d81ea529191f26895ec1471c27b4e54bf UNKNOWN
   * 90310cdec15f97d47c46e76650f07fa5e68488d7 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13835)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #7455: [DO_NOT_MERGE] Release 0.12.2 blockers candidate

2022-12-17 Thread GitBox


hudi-bot commented on PR #7455:
URL: https://github.com/apache/hudi/pull/7455#issuecomment-1356463124

   
   ## CI report:
   
   * 9ad6939e4e9bacfb1a324bab216198a56f410c9d UNKNOWN
   * de5191707931a08b984239297c45eb4e0e80c9cd Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13834)
 
   
   



[GitHub] [hudi] nsivabalan opened a new pull request, #7496: [MINOR] Updating blog articles

2022-12-17 Thread GitBox


nsivabalan opened a new pull request, #7496:
URL: https://github.com/apache/hudi/pull/7496

   ### Change Logs
   
   Adding blog articles
   
   ### Impact
   
   More awareness and visibility for Hudi
   
   ### Risk level (write none, low medium or high below)
   
   none.
   
   ### Documentation Update
   
   _Describe any necessary documentation update if there is any new feature, 
config, or user-facing change_
   
   - _The config description must be updated if new configs are added or the 
default value of the configs are changed_
   - _Any new feature or user-facing change requires updating the Hudi website. 
Please create a Jira ticket, attach the
 ticket number here and follow the 
[instruction](https://hudi.apache.org/contribute/developer-setup#website) to 
make
 changes to the website._
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   





[jira] [Closed] (HUDI-5409) Avoid file index and use fs view cache in COW input format

2022-12-17 Thread Alexey Kudinkin (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Kudinkin closed HUDI-5409.
-
Resolution: Fixed

> Avoid file index and use fs view cache in COW input format
> --
>
> Key: HUDI-5409
> URL: https://issues.apache.org/jira/browse/HUDI-5409
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Sagar Sumit
>Assignee: Sagar Sumit
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 0.12.2
>
>
> For the Trino-Hive connector querying Hudi tables, we observed a performance 
> regression with the latest hudi-trino-bundle compared with version 0.8.0.





[jira] [Created] (HUDI-5411) Make sure Trino does not re-instantiates Hive's InputFormat for every partition during file listing

2022-12-17 Thread Alexey Kudinkin (Jira)
Alexey Kudinkin created HUDI-5411:
-

 Summary: Make sure Trino does not re-instantiates Hive's 
InputFormat for every partition during file listing
 Key: HUDI-5411
 URL: https://issues.apache.org/jira/browse/HUDI-5411
 Project: Apache Hudi
  Issue Type: Bug
  Components: trino-presto
Reporter: Alexey Kudinkin
Assignee: Sagar Sumit
 Fix For: 0.13.0


To unblock 0.12.2, we've implemented a stop-gap that falls back to 
FileSystemView-based listing (HUDI-5409).

This is not an appropriate long-term solution, though. We need to fix it 
properly by avoiding re-instantiation of InputFormats within Trino itself 
(so that we can properly use the FileIndex and the metadata table (MT)).
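
As a loose illustration of why repeated instantiation hurts and what a cache buys, the sketch below keeps one listing per table base path and reuses it for every partition. `ListingCacheSketch`, `filesFor`, and the lister function are hypothetical names for this example only; they do not correspond to Hudi's FileSystemView or to Trino's Hive connector code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical cache keyed by table base path; not Hudi's or Trino's actual code.
class ListingCacheSketch {
  private final Map<String, List<String>> listingsByBasePath = new ConcurrentHashMap<>();
  private final Function<String, List<String>> expensiveLister;
  // Counts how often the expensive listing really ran.
  final AtomicInteger listingsBuilt = new AtomicInteger();

  ListingCacheSketch(Function<String, List<String>> expensiveLister) {
    this.expensiveLister = expensiveLister;
  }

  // Returns the cached listing for the table, building it only on first use.
  List<String> filesFor(String tableBasePath) {
    return listingsByBasePath.computeIfAbsent(tableBasePath, path -> {
      listingsBuilt.incrementAndGet();
      return expensiveLister.apply(path);
    });
  }

  public static void main(String[] args) {
    ListingCacheSketch cache =
        new ListingCacheSketch(path -> Arrays.asList(path + "/part-0.parquet"));
    // Three partitions of the same table trigger a single listing.
    cache.filesFor("/warehouse/tbl");
    cache.filesFor("/warehouse/tbl");
    cache.filesFor("/warehouse/tbl");
    System.out.println("listings built: " + cache.listingsBuilt.get());
  }
}
```

If a fresh object were created per partition instead, each one would start with an empty cache, which is the behaviour this ticket wants to avoid inside Trino.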





[jira] [Assigned] (HUDI-5409) Avoid file index and use fs view cache in COW input format

2022-12-17 Thread Alexey Kudinkin (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Kudinkin reassigned HUDI-5409:
-

Assignee: Sagar Sumit

> Avoid file index and use fs view cache in COW input format
> --
>
> Key: HUDI-5409
> URL: https://issues.apache.org/jira/browse/HUDI-5409
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Sagar Sumit
>Assignee: Sagar Sumit
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 0.12.2
>
>
> For Trino-Hive connector querying Hudi tables, we observed a perf regression 
> with latest hudi-trino-bundle vs that of version 0.8.0.





[jira] [Updated] (HUDI-5409) Avoid file index and use fs view cache in COW input format

2022-12-17 Thread Alexey Kudinkin (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Kudinkin updated HUDI-5409:
--
Priority: Blocker  (was: Major)

> Avoid file index and use fs view cache in COW input format
> --
>
> Key: HUDI-5409
> URL: https://issues.apache.org/jira/browse/HUDI-5409
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Sagar Sumit
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 0.12.2
>
>
> For Trino-Hive connector querying Hudi tables, we observed a perf regression 
> with latest hudi-trino-bundle vs that of version 0.8.0.





[GitHub] [hudi] hudi-bot commented on pull request #7495: [HUDI-5383] Release 0.12.2 blockers candidate test dec17

2022-12-17 Thread GitBox


hudi-bot commented on PR #7495:
URL: https://github.com/apache/hudi/pull/7495#issuecomment-1356404376

   
   ## CI report:
   
   * e8a193c2f7527f53c2c84d323ac892107b9c7bed Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13833)
 
   * 0345a01d81ea529191f26895ec1471c27b4e54bf UNKNOWN
   * 90310cdec15f97d47c46e76650f07fa5e68488d7 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13835)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #7455: [DO_NOT_MERGE] Release 0.12.2 blockers candidate

2022-12-17 Thread GitBox


hudi-bot commented on PR #7455:
URL: https://github.com/apache/hudi/pull/7455#issuecomment-1356404341

   
   ## CI report:
   
   * 9ad6939e4e9bacfb1a324bab216198a56f410c9d UNKNOWN
   * 3a4ffaf9f7414b00526fe741a5fdfef1268e8ff0 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13832)
 
   * de5191707931a08b984239297c45eb4e0e80c9cd Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13834)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #7495: [HUDI-5383] Release 0.12.2 blockers candidate test dec17

2022-12-17 Thread GitBox


hudi-bot commented on PR #7495:
URL: https://github.com/apache/hudi/pull/7495#issuecomment-1356403152

   
   ## CI report:
   
   * e8a193c2f7527f53c2c84d323ac892107b9c7bed Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13833)
 
   * 0345a01d81ea529191f26895ec1471c27b4e54bf UNKNOWN
   * 90310cdec15f97d47c46e76650f07fa5e68488d7 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #7455: [DO_NOT_MERGE] Release 0.12.2 blockers candidate

2022-12-17 Thread GitBox


hudi-bot commented on PR #7455:
URL: https://github.com/apache/hudi/pull/7455#issuecomment-1356403109

   
   ## CI report:
   
   * 9ad6939e4e9bacfb1a324bab216198a56f410c9d UNKNOWN
   * 0a530808e8e90f5a419ebee0350e792325b4057c Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13827)
 
   * 3a4ffaf9f7414b00526fe741a5fdfef1268e8ff0 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13832)
 
   * de5191707931a08b984239297c45eb4e0e80c9cd UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #7495: [HUDI-5383] Release 0.12.2 blockers candidate test dec17

2022-12-17 Thread GitBox


hudi-bot commented on PR #7495:
URL: https://github.com/apache/hudi/pull/7495#issuecomment-1356402289

   
   ## CI report:
   
   * e8a193c2f7527f53c2c84d323ac892107b9c7bed Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13833)
 
   * 0345a01d81ea529191f26895ec1471c27b4e54bf UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #7413: [HUDI-5321] Fix inconsistencies in arePartitionRecordsSorted and try to limit lots of small files during bulk insert

2022-12-17 Thread GitBox


hudi-bot commented on PR #7413:
URL: https://github.com/apache/hudi/pull/7413#issuecomment-1356402227

   
   ## CI report:
   
   * 101759e2156d55c90bf79d37899ca3b2bd8ea3d4 UNKNOWN
   * 6080a34977a48a57e9f5775067788847d43075eb Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13830)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[hudi] branch release-0.12.2-blockers-candidate updated (3a4ffaf9f7 -> de51917079)

2022-12-17 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository.

sivabalan pushed a change to branch release-0.12.2-blockers-candidate
in repository https://gitbox.apache.org/repos/asf/hudi.git


from 3a4ffaf9f7 [HUDI-5409] Avoid file index and use fs view cache in COW 
input format (#7493)
 add de51917079 Fixing build failures

No new revisions were added by this update.

Summary of changes:
 hudi-cli/src/main/scala/org/apache/hudi/cli/SparkHelpers.scala  | 2 +-
 .../hudi/client/functional/TestHoodieClientOnMergeOnReadStorage.java| 0
 .../test/java/org/apache/hudi/execution/TestDisruptorMessageQueue.java  | 0
 hudi-common/src/main/java/org/apache/hudi/BaseHoodieTableFileIndex.java | 2 +-
 .../src/main/scala/org/apache/hudi/cdc/HoodieCDCRDD.scala   | 0
 5 files changed, 2 insertions(+), 2 deletions(-)
 delete mode 100644 
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/client/functional/TestHoodieClientOnMergeOnReadStorage.java
 delete mode 100644 
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/execution/TestDisruptorMessageQueue.java
 delete mode 100644 
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/cdc/HoodieCDCRDD.scala



[GitHub] [hudi] hudi-bot commented on pull request #7495: [HUDI-5383] Release 0.12.2 blockers candidate test dec17

2022-12-17 Thread GitBox


hudi-bot commented on PR #7495:
URL: https://github.com/apache/hudi/pull/7495#issuecomment-1356357086

   
   ## CI report:
   
   * e8a193c2f7527f53c2c84d323ac892107b9c7bed Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13833)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #7455: [DO_NOT_MERGE] Release 0.12.2 blockers candidate

2022-12-17 Thread GitBox


hudi-bot commented on PR #7455:
URL: https://github.com/apache/hudi/pull/7455#issuecomment-1356356992

   
   ## CI report:
   
   * 9ad6939e4e9bacfb1a324bab216198a56f410c9d UNKNOWN
   * 0a530808e8e90f5a419ebee0350e792325b4057c Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13827)
 
   * 3a4ffaf9f7414b00526fe741a5fdfef1268e8ff0 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13832)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #7495: [HUDI-5383] Release 0.12.2 blockers candidate test dec17

2022-12-17 Thread GitBox


hudi-bot commented on PR #7495:
URL: https://github.com/apache/hudi/pull/7495#issuecomment-1356355467

   
   ## CI report:
   
   * e8a193c2f7527f53c2c84d323ac892107b9c7bed UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #7455: [DO_NOT_MERGE] Release 0.12.2 blockers candidate

2022-12-17 Thread GitBox


hudi-bot commented on PR #7455:
URL: https://github.com/apache/hudi/pull/7455#issuecomment-1356355317

   
   ## CI report:
   
   * 9ad6939e4e9bacfb1a324bab216198a56f410c9d UNKNOWN
   * 0a530808e8e90f5a419ebee0350e792325b4057c Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13827)
 
   * 3a4ffaf9f7414b00526fe741a5fdfef1268e8ff0 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] nsivabalan opened a new pull request, #7495: [HUDI-5383] Release 0.12.2 blockers candidate test dec17

2022-12-17 Thread GitBox


nsivabalan opened a new pull request, #7495:
URL: https://github.com/apache/hudi/pull/7495

   ### Change Logs
   
   Testing release branch for 0.12.2 
   
   ### Impact
   
   Azure CI testing patch
   
   ### Risk level (write none, low medium or high below)
   
   low
   
   ### Documentation Update
   
   N/A
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   





[hudi] branch release-0.12.2-blockers-candidate updated (0a530808e8 -> 3a4ffaf9f7)

2022-12-17 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository.

sivabalan pushed a change to branch release-0.12.2-blockers-candidate
in repository https://gitbox.apache.org/repos/asf/hudi.git


from 0a530808e8 [HUDI-5357] Optimize deployment of release artifacts (#7419)
 add 31959f99aa [HUDI-5403] Turn off metadata-table-based file listing in 
BaseHoodieTableFileIndex (#7488)
 add 3a4ffaf9f7 [HUDI-5409] Avoid file index and use fs view cache in COW 
input format (#7493)

No new revisions were added by this update.

Summary of changes:
 .../TestHoodieClientOnMergeOnReadStorage.java  |   0
 .../hudi/execution/TestDisruptorMessageQueue.java  |   0
 .../org/apache/hudi/BaseHoodieTableFileIndex.java  |   5 +
 .../hudi/metadata/HoodieTableMetadataUtil.java |  46 ---
 .../hadoop/HoodieCopyOnWriteTableInputFormat.java  | 144 ++---
 .../HoodieMergeOnReadTableInputFormat.java |   9 +-
 .../hudi/hadoop/utils/HoodieInputFormatUtils.java  |   2 +-
 .../org/apache/hudi/integ/ITTestHoodieSanity.java  |  12 +-
 .../org/apache/hudi/BaseFileOnlyRelation.scala |   4 +-
 .../scala/org/apache/hudi/HoodieFileIndex.scala|  24 ++--
 .../org/apache/hudi/HoodieSparkConfUtils.scala |  10 +-
 .../scala/org/apache/hudi/cdc/HoodieCDCRDD.scala   |   0
 12 files changed, 166 insertions(+), 90 deletions(-)
 create mode 100644 
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/client/functional/TestHoodieClientOnMergeOnReadStorage.java
 create mode 100644 
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/execution/TestDisruptorMessageQueue.java
 create mode 100644 
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/cdc/HoodieCDCRDD.scala



[hudi] branch master updated: [HUDI-5409] Avoid file index and use fs view cache in COW input format (#7493)

2022-12-17 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository.

sivabalan pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new cc1c1e7b33 [HUDI-5409] Avoid file index and use fs view cache in COW 
input format (#7493)
cc1c1e7b33 is described below

commit cc1c1e7b33d9c95e5a2ba0e9a1db428d1e1b2a00
Author: Sagar Sumit 
AuthorDate: Sat Dec 17 23:01:08 2022 +0530

[HUDI-5409] Avoid file index and use fs view cache in COW input format 
(#7493)

- This PR falls back to the original code path using fs view cache as in 
0.10.1 or earlier, instead of creating file index.

- Query engines using initial InputFormat based integration will not be 
using file index. Instead directly fetch file status from fs view cache.
---
 .../hudi/execution/TestDisruptorMessageQueue.java  |   4 +-
 .../hadoop/HoodieCopyOnWriteTableInputFormat.java  | 144 ++---
 .../HoodieMergeOnReadTableInputFormat.java |  30 ++---
 .../hudi/hadoop/utils/HoodieInputFormatUtils.java  |   2 +-
 4 files changed, 119 insertions(+), 61 deletions(-)

diff --git 
a/hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/execution/TestDisruptorMessageQueue.java
 
b/hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/execution/TestDisruptorMessageQueue.java
index 76c22f96e7..7d324e5296 100644
--- 
a/hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/execution/TestDisruptorMessageQueue.java
+++ 
b/hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/execution/TestDisruptorMessageQueue.java
@@ -39,6 +39,7 @@ import org.apache.spark.TaskContext;
 import org.apache.spark.TaskContext$;
 import org.junit.jupiter.api.AfterEach;
 import org.junit.jupiter.api.BeforeEach;
+import org.junit.jupiter.api.Disabled;
 import org.junit.jupiter.api.Test;
 import org.junit.jupiter.api.Timeout;
 import scala.Tuple2;
@@ -85,10 +86,11 @@ public class TestDisruptorMessageQueue extends 
HoodieClientTestHarness {
 
   // Test to ensure that we are reading all records from queue iterator in the 
same order
   // without any exceptions.
+  @Disabled("Disabled for unblocking 0.12.2 release. Disruptor queue is not 
part of this minor release. Tracked in HUDI-5410")
   @SuppressWarnings("unchecked")
   @Test
   @Timeout(value = 60)
-  public void testRecordReading() throws Exception {
+  public void testRecordReading() {
 
 final List hoodieRecords = 
dataGen.generateInserts(instantTime, 100);
 ArrayList beforeRecord = new ArrayList<>();
diff --git 
a/hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieCopyOnWriteTableInputFormat.java
 
b/hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieCopyOnWriteTableInputFormat.java
index 140e7ff5b6..ce441bf2e2 100644
--- 
a/hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieCopyOnWriteTableInputFormat.java
+++ 
b/hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieCopyOnWriteTableInputFormat.java
@@ -18,21 +18,9 @@
 
 package org.apache.hudi.hadoop;
 
-import org.apache.avro.Schema;
-import org.apache.hadoop.fs.FileStatus;
-import org.apache.hadoop.fs.FileSystem;
-import org.apache.hadoop.fs.Path;
-import org.apache.hadoop.io.ArrayWritable;
-import org.apache.hadoop.io.NullWritable;
-import org.apache.hadoop.mapred.FileInputFormat;
-import org.apache.hadoop.mapred.FileSplit;
-import org.apache.hadoop.mapred.InputSplit;
-import org.apache.hadoop.mapred.JobConf;
-import org.apache.hadoop.mapred.RecordReader;
-import org.apache.hadoop.mapred.Reporter;
-import org.apache.hadoop.mapreduce.Job;
 import org.apache.hudi.common.config.TypedProperties;
 import org.apache.hudi.common.engine.HoodieLocalEngineContext;
+import org.apache.hudi.common.fs.FSUtils;
 import org.apache.hudi.common.model.FileSlice;
 import org.apache.hudi.common.model.HoodieBaseFile;
 import org.apache.hudi.common.model.HoodieLogFile;
@@ -42,7 +30,8 @@ import org.apache.hudi.common.table.HoodieTableMetaClient;
 import org.apache.hudi.common.table.TableSchemaResolver;
 import org.apache.hudi.common.table.timeline.HoodieInstant;
 import org.apache.hudi.common.table.timeline.HoodieTimeline;
-import org.apache.hudi.common.util.CollectionUtils;
+import org.apache.hudi.common.table.view.FileSystemViewManager;
+import org.apache.hudi.common.table.view.HoodieTableFileSystemView;
 import org.apache.hudi.common.util.Option;
 import org.apache.hudi.common.util.StringUtils;
 import org.apache.hudi.exception.HoodieException;
@@ -50,21 +39,42 @@ import org.apache.hudi.exception.HoodieIOException;
 import org.apache.hudi.hadoop.realtime.HoodieVirtualKeyInfo;
 import org.apache.hudi.hadoop.utils.HoodieHiveUtils;
 import org.apache.hudi.hadoop.utils.HoodieInputFormatUtils;
+import org.apache.hudi.metadata.HoodieTableMetadataUtil;
+
+import org.apache.avro.Schema;
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;

[GitHub] [hudi] nsivabalan merged pull request #7493: [HUDI-5409] Avoid file index and use fs view cache in COW input format

2022-12-17 Thread GitBox


nsivabalan merged PR #7493:
URL: https://github.com/apache/hudi/pull/7493





[GitHub] [hudi] nsivabalan commented on pull request #7493: [HUDI-5409] Avoid file index and use fs view cache in COW input format

2022-12-17 Thread GitBox


nsivabalan commented on PR #7493:
URL: https://github.com/apache/hudi/pull/7493#issuecomment-1356346520

   There is one flaky test (col stats). CI is green otherwise. 
   https://user-images.githubusercontent.com/513218/208254075-9df3cad7-6abc-4ac6-82ce-37465149d717.png
   





[GitHub] [hudi] hudi-bot commented on pull request #7413: [HUDI-5321] Fix inconsistencies in arePartitionRecordsSorted and try to limit lots of small files during bulk insert

2022-12-17 Thread GitBox


hudi-bot commented on PR #7413:
URL: https://github.com/apache/hudi/pull/7413#issuecomment-1356328384

   
   ## CI report:
   
   * 101759e2156d55c90bf79d37899ca3b2bd8ea3d4 UNKNOWN
   * 9b10597d8621e96aac63a6a3e7e334a1eafe4e60 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13771)
 
   * 6080a34977a48a57e9f5775067788847d43075eb Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13830)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #7413: [HUDI-5321] Fix inconsistencies in arePartitionRecordsSorted and try to limit lots of small files during bulk insert

2022-12-17 Thread GitBox


hudi-bot commented on PR #7413:
URL: https://github.com/apache/hudi/pull/7413#issuecomment-1356327505

   
   ## CI report:
   
   * 101759e2156d55c90bf79d37899ca3b2bd8ea3d4 UNKNOWN
   * 9b10597d8621e96aac63a6a3e7e334a1eafe4e60 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13771)
 
   * 6080a34977a48a57e9f5775067788847d43075eb UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] jonvex commented on a diff in pull request #7413: [HUDI-5321] Fix inconsistencies in arePartitionRecordsSorted and try to limit lots of small files during bulk insert

2022-12-17 Thread GitBox


jonvex commented on code in PR #7413:
URL: https://github.com/apache/hudi/pull/7413#discussion_r1051423297


##
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestInsertTable.scala:
##
@@ -1046,4 +1048,62 @@ class TestInsertTable extends HoodieSparkSqlTestBase {
   )
 }
   }
+
+  /**
+   * This test is to make sure that bulk insert doesn't create a bunch of tiny 
files if
+   * hoodie.bulkinsert.user.defined.partitioner.sort.columns doesn't start 
with the partition columns
+   *
+   */
+  forAll(BulkInsertSortMode.values().toList) { (sortMode: BulkInsertSortMode) 
=>
+val sortModeName = sortMode.name()
+test(s"Test Bulk Insert with BulkInsertSortMode: '$sortModeName'") {

Review Comment:
   I did this when I made TestSparkSqlCoreFlow and it seems to work. It is a 
bit annoying that you need to comment out the other tests in the class if you 
just want to run this test, but when we made the core flow test, it seemed to 
be the best way to do this






[GitHub] [hudi] hudi-bot commented on pull request #7493: [HUDI-5409] Avoid file index and use fs view cache in COW input format

2022-12-17 Thread GitBox


hudi-bot commented on PR #7493:
URL: https://github.com/apache/hudi/pull/7493#issuecomment-1356299825

   
   ## CI report:
   
   * 081d6925c0bd2ee8a122a6c023b2f28fdf0532d0 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13826)
 
   * 0c697f08b6ebd6119187039a19c114dc61f39038 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13829)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #7493: [HUDI-5409] Avoid file index and use fs view cache in COW input format

2022-12-17 Thread GitBox


hudi-bot commented on PR #7493:
URL: https://github.com/apache/hudi/pull/7493#issuecomment-1356296290

   
   ## CI report:
   
   * 081d6925c0bd2ee8a122a6c023b2f28fdf0532d0 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13826)
 
   * 0c697f08b6ebd6119187039a19c114dc61f39038 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] idatya opened a new issue, #7494: FileNotFoundException while writing dataframe to local file system

2022-12-17 Thread GitBox


idatya opened a new issue, #7494:
URL: https://github.com/apache/hudi/issues/7494

   I am following https://hudi.apache.org/docs/quick-start-guide
   and using Spark version 3.3.1 and Python version 3.8.10.
   
   I get a FileNotFoundException when the command below is executed:
   
   ```
   df.write.format("hudi"). \
   options(**hudi_options). \
   mode("overwrite"). \
   save(basePath)
   ```
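
For context on the `save(basePath)` call: the URI scheme of the base path determines which filesystem the write targets, and the error log in this report shows the path resolved as `file:/tmp/hudi_trips_cow_4` on an executor host. The following is a minimal sketch, using only Python's standard library, of how one might check whether a base path resolves to a node-local filesystem. The helper names `path_scheme` and `is_node_local` are hypothetical and are not part of Hudi or PySpark, and whether a node-local path is the cause of the failure here is not established in this report.

```python
from urllib.parse import urlparse


def path_scheme(base_path: str) -> str:
    """Return the URI scheme of a table base path.

    Assumption for this sketch: a path without an explicit scheme is
    treated as a local 'file' path. Real deployments may resolve it
    against a configured default filesystem instead.
    """
    scheme = urlparse(base_path).scheme
    return scheme or "file"


def is_node_local(base_path: str) -> bool:
    """True when the path points at a filesystem private to one machine."""
    return path_scheme(base_path) == "file"


if __name__ == "__main__":
    for candidate in ("file:/tmp/hudi_trips_cow_4", "s3a://bucket/hudi_trips_cow"):
        print(candidate, "node-local:", is_node_local(candidate))
```

A check like this only inspects the string; it does not contact any filesystem, so it can run on the driver before the write is attempted.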
   Below is the error log:
   
   ```
   22/12/17 14:54:34 WARN TaskSetManager: Lost task 0.0 in stage 1.0 (TID 2) 
(ip-10-177-165-98.ec2.internal executor 1): java.io.FileNotFoundException: File 
file:/tmp/hudi_trips_cow_4 does not exist
   at 
org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:597)
   at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1972)
   at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:2014)
   at 
org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:761)
   at 
org.apache.hudi.metadata.HoodieBackedTableMetadataWriter.lambda$listAllPartitions$a9d991ce$1(HoodieBackedTableMetadataWriter.java:634)
   at 
org.apache.spark.api.java.JavaPairRDD$.$anonfun$toScalaFunction$1(JavaPairRDD.scala:1070)
   at scala.collection.Iterator$$anon$10.next(Iterator.scala:461)
   at scala.collection.Iterator.foreach(Iterator.scala:943)
   at scala.collection.Iterator.foreach$(Iterator.scala:943)
   at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
   at scala.collection.generic.Growable.$plus$plus$eq(Growable.scala:62)
   at 
scala.collection.generic.Growable.$plus$plus$eq$(Growable.scala:53)
   at 
scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:105)
   at 
scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:49)
   at scala.collection.TraversableOnce.to(TraversableOnce.scala:366)
   at scala.collection.TraversableOnce.to$(TraversableOnce.scala:364)
   at scala.collection.AbstractIterator.to(Iterator.scala:1431)
   at scala.collection.TraversableOnce.toBuffer(TraversableOnce.scala:358)
   at scala.collection.TraversableOnce.toBuffer$(TraversableOnce.scala:358)
   at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1431)
   at scala.collection.TraversableOnce.toArray(TraversableOnce.scala:345)
   at scala.collection.TraversableOnce.toArray$(TraversableOnce.scala:339)
   at scala.collection.AbstractIterator.toArray(Iterator.scala:1431)
   at org.apache.spark.rdd.RDD.$anonfun$collect$2(RDD.scala:1021)
   at org.apache.spark.SparkContext.$anonfun$runJob$5(SparkContext.scala:2268)
   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
   at org.apache.spark.scheduler.Task.run(Task.scala:136)
   at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548)
   at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1504)
   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551)
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   at java.lang.Thread.run(Thread.java:750)
   
   22/12/17 14:54:34 ERROR TaskSetManager: Task 0 in stage 1.0 failed 4 times; aborting job
   Traceback (most recent call last):
     File "", line 1, in 
     File "/home/doe/spark/python/pyspark/sql/readwriter.py", line 968, in save
       self._jwrite.save(path)
     File "/home/doe/spark/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py", line 1321, in __call__
     File "/home/doe/spark/python/pyspark/sql/utils.py", line 190, in deco
       return f(*a, **kw)
     File "/home/doe/spark/python/lib/py4j-0.10.9.5-src.zip/py4j/protocol.py", line 326, in get_return_value
   py4j.protocol.Py4JJavaError: An error occurred while calling o76.save.
   : org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1.0 (TID 5) (ip-10-177-165-98.ec2.internal executor 1): java.io.FileNotFoundException: File file:/tmp/hudi_trips_cow_4 does not exist
   at org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:597)
   at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1972)
   at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:2014)
   at org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:761)
   at org.apache.hudi.metadata.HoodieBackedTableMetadataWriter.lambda$listAllPartitions$a9d991ce$1(HoodieBackedTableMetadataWriter.java:634)
   at org.apache.spark.api.java.JavaPairRDD$.$anonfun$toScalaFunction$1(JavaPairRDD.scala:1070)
   

[GitHub] [hudi] hudi-bot commented on pull request #7440: [HUDI-5377] Write call stack information to lock file

2022-12-17 Thread GitBox


hudi-bot commented on PR #7440:
URL: https://github.com/apache/hudi/pull/7440#issuecomment-1356294332

   
   ## CI report:
   
   * 67e64ca0d35342d303f5c0027db72ec4c14f1890 UNKNOWN
   * 391cc64f7aaabdc0f72c85fa3ac03036d09ef43a UNKNOWN
   * 63fe7e7c8a882d757cbea6a4d26b7aba4bdad748 UNKNOWN
   * a509172c60864820f6758716c4e832645a97a57f UNKNOWN
   * 34b4683a3581aee7cab95fd94ac56a2c984c9001 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13828)
 
   
   



[jira] [Created] (HUDI-5410) Fix flaky testRecordReading

2022-12-17 Thread Sagar Sumit (Jira)
Sagar Sumit created HUDI-5410:
-

 Summary: Fix flaky testRecordReading
 Key: HUDI-5410
 URL: https://issues.apache.org/jira/browse/HUDI-5410
 Project: Apache Hudi
  Issue Type: Test
Reporter: Sagar Sumit
Assignee: Sagar Sumit
 Fix For: 0.13.0








[GitHub] [hudi] hudi-bot commented on pull request #7440: [HUDI-5377] Write call stack information to lock file

2022-12-17 Thread GitBox


hudi-bot commented on PR #7440:
URL: https://github.com/apache/hudi/pull/7440#issuecomment-1356234839

   
   ## CI report:
   
   * 67e64ca0d35342d303f5c0027db72ec4c14f1890 UNKNOWN
   * 391cc64f7aaabdc0f72c85fa3ac03036d09ef43a UNKNOWN
   * 63fe7e7c8a882d757cbea6a4d26b7aba4bdad748 UNKNOWN
   * a509172c60864820f6758716c4e832645a97a57f UNKNOWN
   * af1ed3ce37f1204e6a817585f9cea845c791066b Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13788)
 
   * 34b4683a3581aee7cab95fd94ac56a2c984c9001 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13828)
 
   
   



[jira] [Assigned] (HUDI-5357) Optimize release artifacts' deployment

2022-12-17 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu reassigned HUDI-5357:


Assignee: Raymond Xu

> Optimize release artifacts' deployment 
> ---
>
> Key: HUDI-5357
> URL: https://issues.apache.org/jira/browse/HUDI-5357
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: dependencies
>Reporter: Raymond Xu
>Assignee: Raymond Xu
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 0.12.2
>
>
> - Avoid duplicate uploads by narrowing down to bundle modules
> - Overwrite avro.version for flink bundles





[jira] [Closed] (HUDI-5357) Optimize release artifacts' deployment

2022-12-17 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu closed HUDI-5357.

Resolution: Fixed

> Optimize release artifacts' deployment 
> ---
>
> Key: HUDI-5357
> URL: https://issues.apache.org/jira/browse/HUDI-5357
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: dependencies
>Reporter: Raymond Xu
>Assignee: Raymond Xu
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 0.12.2
>
>
> - Avoid duplicate uploads by narrowing down to bundle modules
> - Overwrite avro.version for flink bundles





[GitHub] [hudi] hudi-bot commented on pull request #7440: [HUDI-5377] Write call stack information to lock file

2022-12-17 Thread GitBox


hudi-bot commented on PR #7440:
URL: https://github.com/apache/hudi/pull/7440#issuecomment-1356232019

   
   ## CI report:
   
   * 67e64ca0d35342d303f5c0027db72ec4c14f1890 UNKNOWN
   * 391cc64f7aaabdc0f72c85fa3ac03036d09ef43a UNKNOWN
   * 63fe7e7c8a882d757cbea6a4d26b7aba4bdad748 UNKNOWN
   * a509172c60864820f6758716c4e832645a97a57f UNKNOWN
   * af1ed3ce37f1204e6a817585f9cea845c791066b Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13788)
 
   * 34b4683a3581aee7cab95fd94ac56a2c984c9001 UNKNOWN
   
   



[jira] [Updated] (HUDI-5409) Avoid file index and use fs view cache in COW input format

2022-12-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-5409:
-
Labels: pull-request-available  (was: )

> Avoid file index and use fs view cache in COW input format
> --
>
> Key: HUDI-5409
> URL: https://issues.apache.org/jira/browse/HUDI-5409
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Sagar Sumit
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.12.2
>
>
> For Trino-Hive connector querying Hudi tables, we observed a perf regression 
> with latest hudi-trino-bundle vs that of version 0.8.0.





[GitHub] [hudi] hudi-bot commented on pull request #7493: [HUDI-5409] Avoid file index and use fs view cache in COW input format

2022-12-17 Thread GitBox


hudi-bot commented on PR #7493:
URL: https://github.com/apache/hudi/pull/7493#issuecomment-1356229669

   
   ## CI report:
   
   * 081d6925c0bd2ee8a122a6c023b2f28fdf0532d0 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13826)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #7455: [DO_NOT_MERGE] Release 0.12.2 blockers candidate

2022-12-17 Thread GitBox


hudi-bot commented on PR #7455:
URL: https://github.com/apache/hudi/pull/7455#issuecomment-1356229607

   
   ## CI report:
   
   * 9ad6939e4e9bacfb1a324bab216198a56f410c9d UNKNOWN
   * 0a530808e8e90f5a419ebee0350e792325b4057c Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13827)
 
   
   



[GitHub] [hudi] danny0405 commented on issue #5552: [SUPPORT] It failed to compile raw hudi src with error "oodieTableMetadataUtil.java:[189,7] no suitable method found for collect(java.util.stream.Co

2022-12-17 Thread GitBox


danny0405 commented on issue #5552:
URL: https://github.com/apache/hudi/issues/5552#issuecomment-1356227282

   @alexeykudinkin Can we fix the compile error for JDK 11 then? JDK 11 
probably has the most users, so it would be better to support it.





[GitHub] [hudi] bydeath opened a new issue, #5552: [SUPPORT] It failed to compile raw hudi src with error "oodieTableMetadataUtil.java:[189,7] no suitable method found for collect(java.util.stream.Col

2022-12-17 Thread GitBox


bydeath opened a new issue, #5552:
URL: https://github.com/apache/hudi/issues/5552

   **Describe the problem you faced**
   
   My first attempt to compile the Hudi source failed.
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   
   1. git clone https://github.com/apache/hudi.git && cd hudi
   2. mvn clean package -DskipTests
   
   **Stacktrace**
   
   
   [ERROR] hudi-src/hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java:[189,7] no suitable method found for collect(java.util.stream.Collector,capture#1 of ?,java.util.Map>>)
     method java.util.stream.Stream.collect(java.util.function.Supplier,java.util.function.BiConsumer,java.util.function.BiConsumer) is not applicable
       (cannot infer type-variable(s) R
         (actual and formal argument lists differ in length))
     method java.util.stream.Stream.collect(java.util.stream.Collector) is not applicable
       (cannot infer type-variable(s) R,A
         (argument mismatch; java.util.stream.Collector,capture#1 of ?,java.util.Map>> cannot be converted to java.util.stream.Collector))
   [ERROR] hudi-src/hudi-common/src/main/java/org/apache/hudi/common/util/ParquetUtils.java:[332,11] no suitable method found for collect(java.util.stream.Collector,capture#2 of ?,java.util.Map>>>)
     method java.util.stream.Stream.collect(java.util.function.Supplier,java.util.function.BiConsumer,java.util.function.BiConsumer) is not applicable
       (cannot infer type-variable(s) R
         (actual and formal argument lists differ in length))
     method java.util.stream.Stream.collect(java.util.stream.Collector) is not applicable
       (cannot infer type-variable(s) R,A
         (argument mismatch; java.util.stream.Collector,capture#2 of ?,java.util.Map>>> cannot be converted to java.util.stream.Collector))
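
For readers hitting a similar javac message: errors of the form "cannot infer type-variable(s) R,A" on `Stream.collect` mean the compiler could not work out the collector's type parameters. A generic workaround, shown here as a minimal sketch and not as the change made in Hudi, is to assign the collector to an explicitly typed variable first. The class and method names below are hypothetical and unrelated to `HoodieTableMetadataUtil` or `ParquetUtils`; whether this resolves the failure above depends on the JDK build in use, which the report does not state.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Collectors;

// Hypothetical illustration only; not code from the Hudi repository.
public class CollectInferenceSketch {

  public static Map<String, Long> countByFirstLetter(List<String> names) {
    // Declaring the collector's full type up front gives javac the target type
    // directly, instead of leaving it to infer R and A inside Stream.collect.
    Collector<String, ?, Map<String, Long>> byFirstLetter =
        Collectors.groupingBy(name -> name.substring(0, 1), Collectors.counting());
    return names.stream().collect(byFirstLetter);
  }

  public static void main(String[] args) {
    System.out.println(countByFirstLetter(Arrays.asList("hudi", "hive", "spark")));
  }
}
```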
   





[GitHub] [hudi] lucabem commented on issue #6278: [SUPPORT] Deltastreamer fails with data and timestamp related exception after upgrading to EMR 6.5 and spark3

2022-12-17 Thread GitBox


lucabem commented on issue #6278:
URL: https://github.com/apache/hudi/issues/6278#issuecomment-1356180670

   Not in my case; I'm still having this issue.





[GitHub] [hudi] hudi-bot commented on pull request #7355: [HUDI-5308] Hive query returns null when the where clause has a partition field

2022-12-17 Thread GitBox


hudi-bot commented on PR #7355:
URL: https://github.com/apache/hudi/pull/7355#issuecomment-1356173416

   
   ## CI report:
   
   * a5229bd6556c395f28b6ca71904c0f0a97d408ab Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13825)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #7455: [DO_NOT_MERGE] Release 0.12.2 blockers candidate

2022-12-17 Thread GitBox


hudi-bot commented on PR #7455:
URL: https://github.com/apache/hudi/pull/7455#issuecomment-1356142628

   
   ## CI report:
   
   * 9ad6939e4e9bacfb1a324bab216198a56f410c9d UNKNOWN
   * d971f3e16869bfed4f1911c4eb7c5c025f3ff8e9 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13815)
 
   * 0a530808e8e90f5a419ebee0350e792325b4057c Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13827)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #7455: [DO_NOT_MERGE] Release 0.12.2 blockers candidate

2022-12-17 Thread GitBox


hudi-bot commented on PR #7455:
URL: https://github.com/apache/hudi/pull/7455#issuecomment-1356139669

   
   ## CI report:
   
   * 9ad6939e4e9bacfb1a324bab216198a56f410c9d UNKNOWN
   * d971f3e16869bfed4f1911c4eb7c5c025f3ff8e9 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13815)
 
   * 0a530808e8e90f5a419ebee0350e792325b4057c UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #7355: [HUDI-5308] Hive query returns null when the where clause has a partition field

2022-12-17 Thread GitBox


hudi-bot commented on PR #7355:
URL: https://github.com/apache/hudi/pull/7355#issuecomment-1356139579

   
   ## CI report:
   
   * c60978aaf0dd183b05139dda6bd741ea43877f42 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13715)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13736)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13744)
 
   * a5229bd6556c395f28b6ca71904c0f0a97d408ab Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13825)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #7482: [MINOR] Fix the inconsistent behavior to calculate the value count between COW and MOR

2022-12-17 Thread GitBox


hudi-bot commented on PR #7482:
URL: https://github.com/apache/hudi/pull/7482#issuecomment-1356137528

   
   ## CI report:
   
   * dd5d168ff9a755b24c53762d69d884373f47d017 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13799)
 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13823)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #7355: [HUDI-5308] Hive query returns null when the where clause has a partition field

2022-12-17 Thread GitBox


hudi-bot commented on PR #7355:
URL: https://github.com/apache/hudi/pull/7355#issuecomment-1356137437

   
   ## CI report:
   
   * c60978aaf0dd183b05139dda6bd741ea43877f42 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13715)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13736)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13744)
 
   * a5229bd6556c395f28b6ca71904c0f0a97d408ab UNKNOWN
   
   



[jira] [Updated] (HUDI-5409) Avoid file index and use fs view cache in COW input format

2022-12-17 Thread Sagar Sumit (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sagar Sumit updated HUDI-5409:
--
Fix Version/s: 0.12.2

> Avoid file index and use fs view cache in COW input format
> --
>
> Key: HUDI-5409
> URL: https://issues.apache.org/jira/browse/HUDI-5409
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Sagar Sumit
>Priority: Major
> Fix For: 0.12.2
>
>
> For Trino-Hive connector querying Hudi tables, we observed a perf regression 
> with latest hudi-trino-bundle vs that of version 0.8.0.





[jira] [Updated] (HUDI-5409) Avoid file index and use fs view cache in COW input format

2022-12-17 Thread Sagar Sumit (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sagar Sumit updated HUDI-5409:
--
Status: In Progress  (was: Open)

> Avoid file index and use fs view cache in COW input format
> --
>
> Key: HUDI-5409
> URL: https://issues.apache.org/jira/browse/HUDI-5409
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Sagar Sumit
>Priority: Major
> Fix For: 0.12.2
>
>
> For Trino-Hive connector querying Hudi tables, we observed a perf regression 
> with latest hudi-trino-bundle vs that of version 0.8.0.





[jira] [Updated] (HUDI-5409) Avoid file index and use fs view cache in COW input format

2022-12-17 Thread Sagar Sumit (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sagar Sumit updated HUDI-5409:
--
Status: Patch Available  (was: In Progress)

> Avoid file index and use fs view cache in COW input format
> --
>
> Key: HUDI-5409
> URL: https://issues.apache.org/jira/browse/HUDI-5409
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Sagar Sumit
>Priority: Major
> Fix For: 0.12.2
>
>
> For Trino-Hive connector querying Hudi tables, we observed a perf regression 
> with latest hudi-trino-bundle vs that of version 0.8.0.





[jira] [Created] (HUDI-5409) Avoid file index and use fs view cache in COW input format

2022-12-17 Thread Sagar Sumit (Jira)
Sagar Sumit created HUDI-5409:
-

 Summary: Avoid file index and use fs view cache in COW input format
 Key: HUDI-5409
 URL: https://issues.apache.org/jira/browse/HUDI-5409
 Project: Apache Hudi
  Issue Type: Bug
Reporter: Sagar Sumit


For Trino-Hive connector querying Hudi tables, we observed a perf regression 
with latest hudi-trino-bundle vs that of version 0.8.0.





[hudi] branch release-0.12.2-blockers-candidate updated (d971f3e168 -> 0a530808e8)

2022-12-17 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository.

sivabalan pushed a change to branch release-0.12.2-blockers-candidate
in repository https://gitbox.apache.org/repos/asf/hudi.git


from d971f3e168 [HUDI-5104] Add feature flag to disable HoodieFileIndex and fall back to HoodieROTablePathFilter (#7088)
 add 3fb6d337c4 [HUDI-5251] Split GitHub actions CI by spark and flink (#7265)
 add 0a530808e8 [HUDI-5357] Optimize deployment of release artifacts (#7419)

No new revisions were added by this update.

Summary of changes:
 .github/workflows/bot.yml                 | 54 +--
 hudi-examples/hudi-examples-flink/pom.xml |  3 +-
 hudi-flink-datasource/hudi-flink/pom.xml  |  3 +-
 packaging/hudi-flink-bundle/pom.xml       |  3 +-
 pom.xml                                   |  4 +++
 scripts/release/deploy_staging_jars.sh    | 39 --
 6 files changed, 67 insertions(+), 39 deletions(-)



[GitHub] [hudi] yihua commented on a diff in pull request #7493: Avoid file index and use fs view cache in COW input format

2022-12-17 Thread GitBox


yihua commented on code in PR #7493:
URL: https://github.com/apache/hudi/pull/7493#discussion_r1051360692


##
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieCopyOnWriteTableInputFormat.java:
##
@@ -236,28 +249,53 @@ private List listStatusForSnapshotMode(JobConf job,
   boolean shouldIncludePendingCommits =
   HoodieHiveUtils.shouldIncludePendingCommits(job, tableMetaClient.getTableConfig().getTableName());
 
-  HiveHoodieTableFileIndex fileIndex =
-  new HiveHoodieTableFileIndex(
-  engineContext,
-  tableMetaClient,
-  props,
-  HoodieTableQueryType.SNAPSHOT,
-  partitionPaths,
-  queryCommitInstant,
-  shouldIncludePendingCommits);
-
-  Map> partitionedFileSlices = fileIndex.listFileSlices();
-
-  Option virtualKeyInfoOpt = getHoodieVirtualKeyInfo(tableMetaClient);
-
-  targetFiles.addAll(
-  partitionedFileSlices.values()
-  .stream()
-  .flatMap(Collection::stream)
-  .filter(fileSlice -> checkIfValidFileSlice(fileSlice))
-  .map(fileSlice -> createFileStatusUnchecked(fileSlice, fileIndex, virtualKeyInfoOpt))
-  .collect(Collectors.toList())
-  );
+  if (conf.getBoolean(ENABLE.key(), DEFAULT_METADATA_ENABLE_FOR_READERS) && HoodieTableMetadataUtil.isFilesPartitionAvailable(tableMetaClient)) {
+HiveHoodieTableFileIndex fileIndex =
+new HiveHoodieTableFileIndex(
+engineContext,
+tableMetaClient,
+props,
+HoodieTableQueryType.SNAPSHOT,
+partitionPaths,
+queryCommitInstant,
+shouldIncludePendingCommits);
+
+Map> partitionedFileSlices = fileIndex.listFileSlices();
+
+Option virtualKeyInfoOpt = getHoodieVirtualKeyInfo(tableMetaClient);
+
+targetFiles.addAll(
+partitionedFileSlices.values()
+.stream()
+.flatMap(Collection::stream)
+.filter(fileSlice -> checkIfValidFileSlice(fileSlice))
+.map(fileSlice -> createFileStatusUnchecked(fileSlice, fileIndex, virtualKeyInfoOpt))
+.collect(Collectors.toList())
+);
+  } else {
+HoodieTimeline timeline = tableMetaClient.getActiveTimeline();
+
+try {
+  HoodieTableFileSystemView fsView = fsViewCache.computeIfAbsent(tableMetaClient, hoodietableMetaClient ->

Review Comment:
   +1 Essentially, this is the old logic from the 0.8.0 release, which is the 
fallback when the metadata table is not used: 
https://github.com/apache/hudi/blob/release-0.8.0/hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieInputFormatUtils.java#L443
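
For readers following the thread, the shape of the branch under review can be sketched roughly as below. This is a simplified, self-contained illustration and not Hudi code: the class, the method names, and the use of plain strings and functions in place of `HoodieTableMetaClient` and `HoodieTableFileSystemView` are all assumptions made for the example. It shows only the two ideas in the diff: take the metadata-backed listing when it is enabled and available, otherwise fall back to a file system view that is built once per table and reused from a cache via `computeIfAbsent`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical, simplified stand-in for the branch under review; none of these
// names are Hudi APIs. A "view" here is just a function from partition to files.
public class SnapshotListingSketch {

  private final Map<String, Function<String, List<String>>> fsViewCache = new ConcurrentHashMap<>();

  // Counts how many views were actually built, to make cache reuse observable.
  public int viewsBuilt = 0;

  public List<String> listFiles(String tablePath, List<String> partitions,
                                boolean metadataEnabled, boolean filesPartitionAvailable) {
    List<String> targetFiles = new ArrayList<>();
    if (metadataEnabled && filesPartitionAvailable) {
      // Metadata-backed path: stands in for the file-index listing.
      for (String partition : partitions) {
        targetFiles.add("index:" + tablePath + "/" + partition);
      }
    } else {
      // Fallback path: build the view once per table, then reuse it from the cache.
      Function<String, List<String>> view = fsViewCache.computeIfAbsent(tablePath, table -> {
        viewsBuilt++;
        return partition -> Collections.singletonList("fsview:" + table + "/" + partition);
      });
      for (String partition : partitions) {
        targetFiles.addAll(view.apply(partition));
      }
    }
    return targetFiles;
  }

  public static void main(String[] args) {
    SnapshotListingSketch sketch = new SnapshotListingSketch();
    System.out.println(sketch.listFiles("/tmp/t", Collections.singletonList("p1"), false, false));
    System.out.println(sketch.listFiles("/tmp/t", Collections.singletonList("p2"), false, false));
    System.out.println("views built: " + sketch.viewsBuilt);
  }
}
```

Calling `listFiles` repeatedly for the same table on the fallback path builds the view only on the first call, which is the cost the cache in the diff is meant to avoid.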






[GitHub] [hudi] nsivabalan commented on a diff in pull request #7493: Avoid file index and use fs view cache in COW input format

2022-12-17 Thread GitBox


nsivabalan commented on code in PR #7493:
URL: https://github.com/apache/hudi/pull/7493#discussion_r1051358907


##
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieCopyOnWriteTableInputFormat.java:
##
@@ -236,28 +249,53 @@ private List listStatusForSnapshotMode(JobConf job,
   boolean shouldIncludePendingCommits =
   HoodieHiveUtils.shouldIncludePendingCommits(job, tableMetaClient.getTableConfig().getTableName());
 
-  HiveHoodieTableFileIndex fileIndex =
-  new HiveHoodieTableFileIndex(
-  engineContext,
-  tableMetaClient,
-  props,
-  HoodieTableQueryType.SNAPSHOT,
-  partitionPaths,
-  queryCommitInstant,
-  shouldIncludePendingCommits);
-
-  Map> partitionedFileSlices = fileIndex.listFileSlices();
-
-  Option virtualKeyInfoOpt = getHoodieVirtualKeyInfo(tableMetaClient);
-
-  targetFiles.addAll(
-  partitionedFileSlices.values()
-  .stream()
-  .flatMap(Collection::stream)
-  .filter(fileSlice -> checkIfValidFileSlice(fileSlice))
-  .map(fileSlice -> createFileStatusUnchecked(fileSlice, fileIndex, virtualKeyInfoOpt))
-  .collect(Collectors.toList())
-  );
+  if (conf.getBoolean(ENABLE.key(), DEFAULT_METADATA_ENABLE_FOR_READERS) && HoodieTableMetadataUtil.isFilesPartitionAvailable(tableMetaClient)) {
+HiveHoodieTableFileIndex fileIndex =
+new HiveHoodieTableFileIndex(
+engineContext,
+tableMetaClient,
+props,
+HoodieTableQueryType.SNAPSHOT,
+partitionPaths,
+queryCommitInstant,
+shouldIncludePendingCommits);
+
+Map> partitionedFileSlices = fileIndex.listFileSlices();
+
+Option virtualKeyInfoOpt = getHoodieVirtualKeyInfo(tableMetaClient);
+
+targetFiles.addAll(
+partitionedFileSlices.values()
+.stream()
+.flatMap(Collection::stream)
+.filter(fileSlice -> checkIfValidFileSlice(fileSlice))
+.map(fileSlice -> createFileStatusUnchecked(fileSlice, fileIndex, virtualKeyInfoOpt))
+.collect(Collectors.toList())
+);
+  } else {
+HoodieTimeline timeline = tableMetaClient.getActiveTimeline();
+
+try {
+  HoodieTableFileSystemView fsView = fsViewCache.computeIfAbsent(tableMetaClient, hoodietableMetaClient ->
+  FileSystemViewManager.createInMemoryFileSystemViewWithTimeline(engineContext, hoodietableMetaClient, buildMetadataConfig(job), timeline));
+
+  List filteredBaseFiles = new ArrayList<>();
+
+  for (Path p : entry.getValue()) {
+String relativePartitionPath = FSUtils.getRelativePartitionPath(new Path(tableMetaClient.getBasePath()), p);
+List matched = fsView.getLatestBaseFiles(relativePartitionPath).collect(Collectors.toList());

Review Comment:
   Minor: variable naming.
   ```
   List latestBaseFiles
   ```






[GitHub] [hudi] hudi-bot commented on pull request #7493: Avoid file index and use fs view cache in COW input format

2022-12-17 Thread GitBox


hudi-bot commented on PR #7493:
URL: https://github.com/apache/hudi/pull/7493#issuecomment-1356114195

   
   ## CI report:
   
   * 081d6925c0bd2ee8a122a6c023b2f28fdf0532d0 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13826)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #7493: Avoid file index and use fs view cache in COW input format

2022-12-17 Thread GitBox


hudi-bot commented on PR #7493:
URL: https://github.com/apache/hudi/pull/7493#issuecomment-1356112261

   
   ## CI report:
   
   * 081d6925c0bd2ee8a122a6c023b2f28fdf0532d0 UNKNOWN
   
   