Re: [PR] DOCS- update blogs for new content [hudi]

2023-12-23 Thread via GitHub


bhasudha commented on code in PR #10383:
URL: https://github.com/apache/hudi/pull/10383#discussion_r1435780600


##
website/blog/2023-12-01-Getting-started-with-Apache-Hudi.mdx:
##
@@ -0,0 +1,20 @@
+---
+title: "Getting started with Apache Hudi"
+excerpt: "Getting started with Apache Hudi"
+author: DataCouch
+category: blog
+image: /assets/images/blog/2023-12-01-Getting-started-with-Apache-Hudi.png
+tags:
+- apache hudi
+- spark

Review Comment:
   apache spark



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] DOCS- update blogs for new content [hudi]

2023-12-23 Thread via GitHub


bhasudha commented on code in PR #10383:
URL: https://github.com/apache/hudi/pull/10383#discussion_r1435780699


##
website/blog/2023-12-06-Apache-Hudi-From-Zero-To-One-blog-7.mdx:
##
@@ -0,0 +1,26 @@
+---
+title: "Apache Hudi: From Zero To One (7/10)"
+excerpt: "Concurrently run writers and table services"
+author: Shiyan Xu
+category: blog
+image: /assets/images/blog/2023-12-06-Apache-Hudi-From-Zero-To-One-blog-7.png
+tags:
+- blog
+- apache hudi
+- concurrency
+- dataumagic

Review Comment:
   dataumagic -> datumagic






Re: [PR] DOCS- update blogs for new content [hudi]

2023-12-23 Thread via GitHub


bhasudha commented on code in PR #10383:
URL: https://github.com/apache/hudi/pull/10383#discussion_r1435780255


##
website/blog/2023-11-28-Apache-Hudi-Part-1-History-Getting-Started.mdx:
##
@@ -0,0 +1,21 @@
+---
+title: "Apache Hudi (Part 1): History, Getting Started"
+excerpt: "Apache Hudi (Part 1): History, Getting Started"
+author: Dipankar Mazumdar
+category: blog
+image: /assets/images/blog/2023-11-28-Apache-Hudi-Part-1-History-Getting-Started.png

Review Comment:
   Looks like the image file is missing from the commit.






Re: [PR] DOCS- update blogs for new content [hudi]

2023-12-23 Thread via GitHub


bhasudha commented on code in PR #10383:
URL: https://github.com/apache/hudi/pull/10383#discussion_r1435778261


##
website/blog/2023-11-22-Introducing-Apache-Hudi-support-with-AWS-Glue-crawlers.mdx:
##
@@ -0,0 +1,16 @@
+---
+title: "Introducing Apache Hudi support with AWS Glue crawlers"
+excerpt: "Introducing Apache Hudi support with AWS Glue crawlers"
+author: Noritaka Sekiyama, Kyle Duong, Sandeep Adwankar
+category: blog
+image: /assets/images/blog/2023-11-22-Introducing-Apache-Hudi-support-with-AWS-Glue-crawlers.png
+tags:
+- apache hudi
+- aws

Review Comment:
   Since there is already `aws glue`, we can remove this one.






Re: [PR] DOCS- update blogs for new content [hudi]

2023-12-23 Thread via GitHub


bhasudha commented on code in PR #10383:
URL: https://github.com/apache/hudi/pull/10383#discussion_r1435778120


##
website/blog/2023-11-19-Hudi-Streamer-DeltaStreamer-Hands-On-Guide-Local-Ingestion-from-Parquet-Source.mdx:
##
@@ -0,0 +1,20 @@
+---
+title: "Hudi Streamer (Delta Streamer) Hands-On Guide: Local Ingestion from Parquet Source"
+excerpt: "Hudi Streamer (Delta Streamer) Hands-On Guide: Local Ingestion from Parquet Source"
+author: Soumil Shah
+category: blog
+image: /assets/images/blog/2023-11-19-Hudi-Streamer-DeltaStreamer-Hands-On-Guide-Local-Ingestion-from-Parquet-Source.png
+tags:
+- apache hudi
+- hudi streamer
+- how-to
+- parquet

Review Comment:
   apache parquet






Re: [PR] DOCS- update blogs for new content [hudi]

2023-12-23 Thread via GitHub


bhasudha commented on code in PR #10383:
URL: https://github.com/apache/hudi/pull/10383#discussion_r1435777975


##
website/blog/2023-11-14-What-is-an-Open-Table-Format-and-Why-to-use-one.mdx:
##
@@ -0,0 +1,19 @@
+---
+title: "What is an Open Table Format? & Why to use one?"

Review Comment:
   Let's remove this one.






Re: [PR] DOCS- update blogs for new content [hudi]

2023-12-23 Thread via GitHub


bhasudha commented on code in PR #10383:
URL: https://github.com/apache/hudi/pull/10383#discussion_r1435777964


##
website/blog/2023-11-14-What-is-an-Open-Table-Format-and-Why-to-use-one.mdx:
##
@@ -0,0 +1,19 @@
+---
+title: "What is an Open Table Format? & Why to use one?"

Review Comment:
   Skip this. It does not talk much about Hudi.






Re: [PR] DOCS- update blogs for new content [hudi]

2023-12-23 Thread via GitHub


bhasudha commented on code in PR #10383:
URL: https://github.com/apache/hudi/pull/10383#discussion_r1435777589


##
website/blog/2023-11-13-Apache-Hudi-From-Zero-To-One-blog-6.mdx:
##
@@ -0,0 +1,27 @@
+---
+title: "Apache Hudi: From Zero To One (6/10)"
+excerpt: "Demystify clustering and space-filling curves"
+author: Shiyan Xu
+category: blog
+image: /assets/images/blog/2023-11-16-Apache-Hudi-From-Zero-To-One-blog-6.png

Review Comment:
   2023-11-13 for the image name?






Re: [PR] DOCS- update blogs for new content [hudi]

2023-12-23 Thread via GitHub


bhasudha commented on code in PR #10383:
URL: https://github.com/apache/hudi/pull/10383#discussion_r1435776395


##
website/blog/2023-09-22-Exploring-the-Architecture-of-Apache-Iceberg-Delta-Lake-and-Apache-Hudi.mdx:
##
@@ -0,0 +1,21 @@
+---
+title: "Exploring the Architecture of Apache Iceberg, Delta Lake, and Apache Hudi"
+excerpt: "Exploring the Architecture of Apache Iceberg, Delta Lake, and Apache Hudi"
+author: Alex Merced
+category: blog
+image: /assets/images/blog/2023-09-22-Exploring-the-Architecture-of-Apache-Iceberg-Delta-Lake-and-Apache-Hudi.png
+tags:
+- apache hudi
+- apache iceberg
+- blog
+- apache hudi

Review Comment:
   repeated?






Re: [PR] [HUDI-7140] [DNM] Trial Patch to test CI run [hudi]

2023-12-23 Thread via GitHub


hudi-bot commented on PR #10176:
URL: https://github.com/apache/hudi/pull/10176#issuecomment-1868450660

   
   ## CI report:
   
   * d1a43dc3694b6a51aa830fe2b78340503c6909b5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21688)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] DOCS- update blogs for new content [hudi]

2023-12-23 Thread via GitHub


bhasudha commented on code in PR #10383:
URL: https://github.com/apache/hudi/pull/10383#discussion_r1435775000


##
website/blog/2023-09-06-Apache-Hudi-From-Zero-To-One-blog-2.mdx:
##
@@ -0,0 +1,31 @@
+---
+title: "Apache Hudi: From Zero To One (2/10)"
+excerpt: "Dive into read operation flow and query types"
+author: Shiyan Xu
+category: blog
+image: /assets/images/blog/2023-09-06-Apache-Hudi-From-Zero-To-One-blog-2.png
+tags:
+- blog
+- apache hudi
+- query types

Review Comment:
   `query types` -> `queries`






Re: [PR] DOCS- update blogs for new content [hudi]

2023-12-23 Thread via GitHub


bhasudha commented on code in PR #10383:
URL: https://github.com/apache/hudi/pull/10383#discussion_r1435774973


##
website/blog/2023-09-06-Apache-Hudi-From-Zero-To-One-blog-2.mdx:
##
@@ -0,0 +1,31 @@
+---
+title: "Apache Hudi: From Zero To One (2/10)"
+excerpt: "Dive into read operation flow and query types"
+author: Shiyan Xu
+category: blog
+image: /assets/images/blog/2023-09-06-Apache-Hudi-From-Zero-To-One-blog-2.png
+tags:
+- blog
+- apache hudi
+- query types
+- read operations

Review Comment:
   Change to `reads`. Let's not introduce word families as tags for the same 
word. Let's keep to an existing similar tag where we already have one. Please 
always check the full list of tags for the closest existing tag, and add a 
new one only when absolutely needed.
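
The tag guidance in this thread could be checked mechanically before a PR is opened. Below is a minimal sketch in Python. It assumes a hypothetical canonical tag set and alias map built only from the review comments here; neither is part of the Hudi website tooling, and the function name is illustrative.

```python
# Hypothetical canonical tags and aliases, collected from the review comments
# in this thread. The authoritative tag list lives on the Hudi website.
CANONICAL_TAGS = {
    "apache hudi", "apache spark", "apache parquet",
    "blog", "queries", "reads", "datumagic",
}
ALIASES = {
    "spark": "apache spark",
    "parquet": "apache parquet",
    "query types": "queries",
    "read operations": "reads",
    "dataumagic": "datumagic",
}


def normalize_tags(tags):
    """Return (normalized, unknown).

    normalized: canonical tags in first-seen order, without duplicates.
    unknown: input tags that match neither a canonical tag nor an alias.
    """
    normalized, unknown = [], []
    for raw in tags:
        tag = raw.strip().lower()
        tag = ALIASES.get(tag, tag)  # map word-family variants to one tag
        if tag not in CANONICAL_TAGS:
            unknown.append(raw)
            continue
        if tag not in normalized:  # drop repeated tags
            normalized.append(tag)
    return normalized, unknown
```

Anything returned in `unknown` would be a candidate for a new tag and so deserves a second look against the existing list.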






Re: [PR] DOCS- update blogs for new content [hudi]

2023-12-23 Thread via GitHub


bhasudha commented on code in PR #10383:
URL: https://github.com/apache/hudi/pull/10383#discussion_r1435774727


##
website/blog/2023-09-06-Apache-Hudi-From-Zero-To-One-blog-2.mdx:
##
@@ -0,0 +1,31 @@
+---
+title: "Apache Hudi: From Zero To One (2/10)"
+excerpt: "Dive into read operation flow and query types"
+author: Shiyan Xu
+category: blog
+image: /assets/images/blog/2023-09-06-Apache-Hudi-From-Zero-To-One-blog-2.png
+tags:
+- blog
+- apache hudi
+- query types
+- read operations
+- datumagic
+- spark

Review Comment:
   qualify fully -> `apache spark`






Re: [PR] DOCS- update blogs for new content [hudi]

2023-12-23 Thread via GitHub


bhasudha commented on code in PR #10383:
URL: https://github.com/apache/hudi/pull/10383#discussion_r1435770831


##
website/blog/2022-12-09-Apache-Hudi-2022-A-year-in-Review.md:
##


Review Comment:
   I was mistaken. It seems this blog is already there, under the 2022-12-29 date: 
https://hudi.apache.org/blog/2022/12/29/Apache-Hudi-2022-A-Year-In-Review






Re: [PR] [HUDI-7140] [DNM] Trial Patch to test CI run [hudi]

2023-12-23 Thread via GitHub


hudi-bot commented on PR #10176:
URL: https://github.com/apache/hudi/pull/10176#issuecomment-1868430646

   
   ## CI report:
   
   * 73914cebbda35a22a2ede05065732c6bc9e03448 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21635)
 
   * d1a43dc3694b6a51aa830fe2b78340503c6909b5 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21688)
 
   
   
   





svn commit: r66288 - in /dev/hudi/hudi-0.14.1-rc1: hudi-0.14.1-rc1.src.tgz hudi-0.14.1-rc1.src.tgz.asc hudi-0.14.1-rc1.src.tgz.sha512

2023-12-23 Thread sivabalan
Author: sivabalan
Date: Sun Dec 24 04:25:05 2023
New Revision: 66288

Log:
Adding rc1 source release

Modified:
dev/hudi/hudi-0.14.1-rc1/hudi-0.14.1-rc1.src.tgz
dev/hudi/hudi-0.14.1-rc1/hudi-0.14.1-rc1.src.tgz.asc
dev/hudi/hudi-0.14.1-rc1/hudi-0.14.1-rc1.src.tgz.sha512

Modified: dev/hudi/hudi-0.14.1-rc1/hudi-0.14.1-rc1.src.tgz
==
Binary files - no diff available.

Modified: dev/hudi/hudi-0.14.1-rc1/hudi-0.14.1-rc1.src.tgz.asc
==
--- dev/hudi/hudi-0.14.1-rc1/hudi-0.14.1-rc1.src.tgz.asc (original)
+++ dev/hudi/hudi-0.14.1-rc1/hudi-0.14.1-rc1.src.tgz.asc Sun Dec 24 04:25:05 
2023
@@ -1,16 +1,16 @@
 -----BEGIN PGP SIGNATURE-----
 
-iQIzBAABCAAdFiEErNUqBmM9s7LH0OpWQsotPtWJUSIFAmWD5yoACgkQQsotPtWJ
-USL7Hg//VhxqeYYSIHoJbPb0VLgVy+MPd9QlKzhT507Mcey6+93Y3gFefIlnoocv
-r69krqu/uY4CrEPTSZRNzlBp0ChOm4NUa8Ws6sYVNQ/3ihBrmbgIzxmJV9w8e2Yd
-EDeD/a30G3ebeoqPSoOwzBQP6YpyYiJFL9l3sFjptB7IdmzjxZVyVQixDPJoBDT/
-F80Hg0DexK2ZUtTtZcRuwVMnfbDtQDMAhR5FnkIhOtn+7tsifLtlt2KiRCYrLc6L
-vL+zZiPU/iq5nPgltErmJKHaaX/3QzCsx++QqQuoMnIqsE7LBmmrw6jANTv4Cwna
-eDPMoBHZoC+eAJR7WdQOsYGNe/lGvYiKy68gkRqqhbHE/NDWhZ1Kvx4GwGSpAsAv
-GWekBlh+2rx3wFTurmLpzQ9w8mxfy3vJqRxWPtOaEuxhsF3D/BpQoJc8oxoeVl9d
-r4M30Xrn3s6wWW8P8DbdEbJr6K6xwQh5WSU2+s3IEDRvbqHrGXw4Rkngj1QQuGze
-hbh+EQLIV8J+COEvfvEgaM8iX6obLEsHISqNZCCJOkojujzZHdVBPYYeGjvC5cLi
-cxIF+JG47Q1TxVrt//lKVMw4R5BFseJcG8R3BWBeFvo6bdt1ZKscCQuRKJb6eNjf
-mUgQYPcTKozCratbifopMZ+8O98mKlWeO7wQCZjCS1sYfw8ABsU=
-=tLVq
+iQIzBAABCAAdFiEErNUqBmM9s7LH0OpWQsotPtWJUSIFAmWHogYACgkQQsotPtWJ
+USLssg/+NNPHJhgP7vf49irD8rL1z5cb22iaegnRLdxg3PYdHHHILSuruF6E8+iL
+4T83MNuKFNrWAtWO6SyHAyTebjri+9dxmtqzqyksjLf2qF5s8opTMStLMoVLksMu
+tQalrmPmkIxLsHpmD62xgxeKvP4jMM/lKZtmD6mlK1pLFCFF/DGZ2hfk5pnit6KO
+2zU7l2dFHjNF+4/WZStzX80fFUAGDuCkZAfwQSxaMKRGTcb+kiM3FgMfCvh4O/hx
+siS9EX1x78cLhUymihohkmswrfz6hJc6ykD8Jm5DAvnl2oLbNzyi3NR5JAZRe3bT
++MxF7TsmFCHnRVIBWgYQZ1FjMMavosWaSrN9I1eq6NEnY5xaIdif+w4n81XEJYxb
+Vecrm0ZlTSrCS2ydVoNbZVy0EraOxlMLkPubz6XOezQVmREV05xJIX9RVf8WTtAt
+tkBsskKTMYNjJpr3rjfn1YsgpiqvFn0d5UhQ/vPE8cJ5TGGzDscLxrpSViLNSG4d
+UW3cWfl0QCnqbhXhc4PjdF9+bDzVkT1y3bHrJ1oYbVisIj3Q8YGX1mQSy0t/N+Ky
+ESySh31dofmT3CVARzSWbTfyK53oTsZYDb+BWBWUziectgue36tlEw6Gr00BcPwr
+k1p6gYL/CFvDPZJK1JMzy7KVF+CVYABRLtiaKL/WcGd2H6jyEsY=
+=4T0B
 -----END PGP SIGNATURE-----

Modified: dev/hudi/hudi-0.14.1-rc1/hudi-0.14.1-rc1.src.tgz.sha512
==
--- dev/hudi/hudi-0.14.1-rc1/hudi-0.14.1-rc1.src.tgz.sha512 (original)
+++ dev/hudi/hudi-0.14.1-rc1/hudi-0.14.1-rc1.src.tgz.sha512 Sun Dec 24 04:25:05 
2023
@@ -1 +1 @@
-ca9facc49f462008a84bd6ceb6ae8170a10f49d8b0af6fa4aa8058676fabf77d8931005c6cb56be86ca941567b1ce1d551fd1d06e72d526ce7e8ab26a3d59b3b
  hudi-0.14.1-rc1.src.tgz
+4940fe3c108f9899a3fa1da543990fe88254b158c104d09d9eec86bf69375a4a29909c2cb6d377dcb070242021f87237cd232c6dfd27c6247135cdb912626e42
  hudi-0.14.1-rc1.src.tgz
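
A downloaded copy of the source release can be checked against the `.sha512` file updated in this commit. The following is a minimal sketch in Python, assuming the checksum file keeps the `<hex digest>  <file name>` layout shown in the diff above. The function names are illustrative and not part of any Hudi release tooling.

```python
import hashlib


def sha512_of(path, chunk_size=1 << 20):
    """Compute the SHA-512 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_checksum_line(line):
    """Split '<hex digest>  <file name>' into (digest, file name)."""
    expected, name = line.split(maxsplit=1)
    return expected.lower(), name.strip()


def verify(artifact_path, checksum_path):
    """Return True if the artifact's SHA-512 matches the checksum file."""
    with open(checksum_path, encoding="ascii") as fh:
        expected, _name = parse_checksum_line(fh.read())
    return sha512_of(artifact_path) == expected
```

For this release candidate, the call would look like `verify("hudi-0.14.1-rc1.src.tgz", "hudi-0.14.1-rc1.src.tgz.sha512")`. The `.asc` signature is a separate check and is not covered by this sketch.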




Re: [PR] [HUDI-7140] [DNM] Trial Patch to test CI run [hudi]

2023-12-23 Thread via GitHub


hudi-bot commented on PR #10176:
URL: https://github.com/apache/hudi/pull/10176#issuecomment-1868429824

   
   ## CI report:
   
   * 73914cebbda35a22a2ede05065732c6bc9e03448 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21635)
 
   * d1a43dc3694b6a51aa830fe2b78340503c6909b5 UNKNOWN
   
   
   





(hudi) annotated tag release-0.14.1-rc1 updated (52309055f0c -> e3990f4860d)

2023-12-23 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository.

sivabalan pushed a change to annotated tag release-0.14.1-rc1
in repository https://gitbox.apache.org/repos/asf/hudi.git


*** WARNING: tag release-0.14.1-rc1 was modified! ***

from 52309055f0c (commit)
  to e3990f4860d (tag)
 tagging 52309055f0ccac2f860c9f784e0610095f7d5d1d (commit)
 replaces release-0.14.0
  by sivabalan
  on Sat Dec 23 19:51:29 2023 -0800

- Log -
0.14.1
-----BEGIN PGP SIGNATURE-----

iQIzBAABCAAdFiEErNUqBmM9s7LH0OpWQsotPtWJUSIFAmWHqsEACgkQQsotPtWJ
USLdow//Xv5WthkSvB0lXewJCzx9BLhYQF3bSjzx42OXyAs4exThiQF+F8CD7Ny+
HCLc0lP2CcE1w4P2Fd2uz+aZD3fAMasRWyyM+dH3zpbGKtpfHq3WG7fBLxCxw0eH
naBZOaT19IW0jleASlcKu4UVGVQGQGFmk8U3gSQxkraoUneMMuVKLl98KNpJ3YS1
PshXuZv/CFLybEQcbf0h0/PcexLs4SiGqxiKG79bqaGH6gROmwv+po+5EgZfU0ej
q4NKHL7UVYbndgFciz+JUZPlMT/N+wOK4ygR7WPTZ2pEdrvInfhU3MJDojbRdDum
JcXrAaPau5PsDElolTGhH1+rCQ0JBa0G/Sdf2SRAYNNUym4BbJEDOTnMm4ZfVYdZ
MQB3+zGwMXztzbiLKi05jLOR4sYxLD4FVcV2oowrqUP9JMZekGBoOuuAc/spzjsj
mj3/NA54hEA14g6Duy9ln9v6GOFsP1MQV7eMYV1H9mcbMqg8tGPol5lLheUqdOy7
avp572XnAEJC+YgyOXXN8Wk2cDelAouB7CiVP4qzAHA6qX7bxuTen1ppitE/O1Vb
jV+JcbqQH3cBmr2akTWEkTmf1oxPcJEAa6yi0XEPDDeMJZ7MpPqdo2+LTN/jiuA7
00wjVdaA/Uj3h0rejkCyNnp/jz0JBj/0y8YrIe2h7vyweJnO8jg=
=cXcF
-----END PGP SIGNATURE-----
---


No new revisions were added by this update.

Summary of changes:



(hudi) annotated tag release-0.14.1-rc1 deleted (was 4e883eb3881)

2023-12-23 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository.

sivabalan pushed a change to annotated tag release-0.14.1-rc1
in repository https://gitbox.apache.org/repos/asf/hudi.git


*** WARNING: tag release-0.14.1-rc1 was deleted! ***

   tag was  4e883eb3881

The revisions that were on this annotated tag are still contained in
other references; therefore, this change does not discard any commits
from the repository.



(hudi) branch release-0.14.1 updated: Revert "Add cachedSchema per batch, fix idempotency with getSourceSchema calls"

2023-12-23 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository.

sivabalan pushed a commit to branch release-0.14.1
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/release-0.14.1 by this push:
 new 52309055f0c Revert "Add cachedSchema per batch, fix idempotency with 
getSourceSchema calls"
52309055f0c is described below

commit 52309055f0ccac2f860c9f784e0610095f7d5d1d
Author: sivabalan 
AuthorDate: Sat Dec 23 18:59:55 2023 -0800

Revert "Add cachedSchema per batch, fix idempotency with getSourceSchema 
calls"

This reverts commit dff42eb468cafe43e9208c0ae738c91184ded673.
---
 .../utilities/schema/FilebasedSchemaProvider.java  | 29 +
 .../hudi/utilities/schema/SchemaProvider.java  |  5 ---
 .../utilities/schema/SchemaRegistryProvider.java   | 36 +-
 .../apache/hudi/utilities/streamer/StreamSync.java |  5 +--
 .../schema/TestSchemaRegistryProvider.java | 20 
 5 files changed, 16 insertions(+), 79 deletions(-)

diff --git 
a/hudi-utilities/src/main/java/org/apache/hudi/utilities/schema/FilebasedSchemaProvider.java
 
b/hudi-utilities/src/main/java/org/apache/hudi/utilities/schema/FilebasedSchemaProvider.java
index 9dbf66325d7..3ca97b01f95 100644
--- 
a/hudi-utilities/src/main/java/org/apache/hudi/utilities/schema/FilebasedSchemaProvider.java
+++ 
b/hudi-utilities/src/main/java/org/apache/hudi/utilities/schema/FilebasedSchemaProvider.java
@@ -45,11 +45,6 @@ public class FilebasedSchemaProvider extends SchemaProvider {
 
   private final FileSystem fs;
 
-  private final String sourceFile;
-  private final String targetFile;
-  private final boolean shouldSanitize;
-  private final String invalidCharMask;
-
   protected Schema sourceSchema;
 
   protected Schema targetSchema;
@@ -57,21 +52,18 @@ public class FilebasedSchemaProvider extends SchemaProvider 
{
   public FilebasedSchemaProvider(TypedProperties props, JavaSparkContext jssc) 
{
 super(props, jssc);
 checkRequiredConfigProperties(props, 
Collections.singletonList(FilebasedSchemaProviderConfig.SOURCE_SCHEMA_FILE));
-this.sourceFile = getStringWithAltKeys(props, 
FilebasedSchemaProviderConfig.SOURCE_SCHEMA_FILE);
-this.targetFile = getStringWithAltKeys(props, 
FilebasedSchemaProviderConfig.TARGET_SCHEMA_FILE, sourceFile);
-this.shouldSanitize = SanitizationUtils.shouldSanitize(props);
-this.invalidCharMask = SanitizationUtils.getInvalidCharMask(props);
+String sourceFile = getStringWithAltKeys(props, 
FilebasedSchemaProviderConfig.SOURCE_SCHEMA_FILE);
+boolean shouldSanitize = SanitizationUtils.shouldSanitize(props);
+String invalidCharMask = SanitizationUtils.getInvalidCharMask(props);
 this.fs = FSUtils.getFs(sourceFile, jssc.hadoopConfiguration(), true);
-this.sourceSchema = parseSchema(this.sourceFile);
+this.sourceSchema = readAvroSchemaFromFile(sourceFile, this.fs, 
shouldSanitize, invalidCharMask);
 if (containsConfigProperty(props, 
FilebasedSchemaProviderConfig.TARGET_SCHEMA_FILE)) {
-  this.targetSchema = parseSchema(this.targetFile);
+  this.targetSchema = readAvroSchemaFromFile(
+  getStringWithAltKeys(props, 
FilebasedSchemaProviderConfig.TARGET_SCHEMA_FILE),
+  this.fs, shouldSanitize, invalidCharMask);
 }
   }
 
-  private Schema parseSchema(String schemaFile) {
-return readAvroSchemaFromFile(schemaFile, this.fs, shouldSanitize, 
invalidCharMask);
-  }
-
   @Override
   public Schema getSourceSchema() {
 return sourceSchema;
@@ -95,11 +87,4 @@ public class FilebasedSchemaProvider extends SchemaProvider {
 }
 return SanitizationUtils.parseAvroSchema(schemaStr, sanitizeSchema, 
invalidCharMask);
   }
-
-  // Per write batch, refresh the schemas from the file
-  @Override
-  public void refresh() {
-this.sourceSchema = parseSchema(this.sourceFile);
-this.targetSchema = parseSchema(this.targetFile);
-  }
 }
diff --git 
a/hudi-utilities/src/main/java/org/apache/hudi/utilities/schema/SchemaProvider.java
 
b/hudi-utilities/src/main/java/org/apache/hudi/utilities/schema/SchemaProvider.java
index 5c8ca8f6c1b..2410798d355 100644
--- 
a/hudi-utilities/src/main/java/org/apache/hudi/utilities/schema/SchemaProvider.java
+++ 
b/hudi-utilities/src/main/java/org/apache/hudi/utilities/schema/SchemaProvider.java
@@ -56,9 +56,4 @@ public abstract class SchemaProvider implements Serializable {
 // by default, use source schema as target for hoodie table as well
 return getSourceSchema();
   }
-
-  //every schema provider has the ability to refresh itself, which will mean 
something different per provider.
-  public void refresh() {
-
-  }
 }
diff --git 
a/hudi-utilities/src/main/java/org/apache/hudi/utilities/schema/SchemaRegistryProvider.java
 
b/hudi-utilities/src/main/java/org/apache/hudi/utilities/schema/SchemaRegistryProvider.java
index f31e867e96e..c3541e6aab0 100644
--- 

Re: [PR] [HUDI-3016][RFC-43] Proposal to implement Table Service Manager [hudi]

2023-12-23 Thread via GitHub


zyclove commented on PR #4309:
URL: https://github.com/apache/hudi/pull/4309#issuecomment-1868420871

   @xushiyan @yuzhaojing @danny0405 
   Hi, can version 1.0 support this feature? It is very much needed. 
Please push this forward.





Re: [I] [SUPPORT] Bulk insert does not handle hoodie metrics [hudi]

2023-12-23 Thread via GitHub


parisni commented on issue #10395:
URL: https://github.com/apache/hudi/issues/10395#issuecomment-1868406313

   I took a look at the code, and I don't see any reason why the bulk-insert 
operation would not report metrics. Like the other operations, it goes through this path: 
https://github.com/apache/hudi/blob/c2da8aaa5fadb1b3984f6fde2a034c806b501fc5/hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/DefaultSource.scala#L146
   which, among other things, reports the metrics: 
https://github.com/apache/hudi/blob/c2da8aaa5fadb1b3984f6fde2a034c806b501fc5/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metrics/Metrics.java#L118
   
   I can also confirm that when I set `hoodie.metrics.reporter.type=CONSOLE`, 
bulk_insert logs the metrics.
   The problem occurs when using Datadog as the reporter: it works fine with any 
operation except bulk_insert.
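
The comparison described here can be written down as two sets of write options that differ only in the reporter type. This is a minimal sketch in Python: the table name `metrics_repro` and the helper `hudi_write_options` are made up for illustration, and the Datadog reporter would need further `hoodie.metrics.datadog.*` settings that are omitted.

```python
def hudi_write_options(table_name, operation, reporter_type):
    """Build Hudi Spark datasource write options with metrics turned on."""
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.operation": operation,
        "hoodie.metrics.on": "true",
        "hoodie.metrics.reporter.type": reporter_type,
    }


# Same table and operation, only the reporter differs.
console_opts = hudi_write_options("metrics_repro", "bulk_insert", "CONSOLE")
datadog_opts = hudi_write_options("metrics_repro", "bulk_insert", "DATADOG")

# Keys whose values differ between the working and the failing setup.
differing_keys = sorted(
    key for key in console_opts if console_opts[key] != datadog_opts[key]
)

# With a SparkSession and a DataFrame `df` available, each run would be:
#   df.write.format("hudi").options(**console_opts).mode("append").save(base_path)
```

Keeping everything but the reporter type fixed narrows the report to the interaction between bulk_insert and the Datadog reporter.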





(hudi) branch asf-site updated: DOCS-added-video-content (#10385)

2023-12-23 Thread bhavanisudha
This is an automated email from the ASF dual-hosted git repository.

bhavanisudha pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new 1c88cf3e397 DOCS-added-video-content (#10385)
1c88cf3e397 is described below

commit 1c88cf3e39746f03cf6db35c115a5feb716cd53e
Author: nadine farah 
AuthorDate: Sat Dec 23 08:45:22 2023 -0800

DOCS-added-video-content (#10385)

* initial commit for content video add

added soumil's videos

updated tags and added videos

updated author tags

* More fixes

- change mdx to md format
- fix tag inconsistencies
- delete irrelevant guides
- change thumbnails

-

Co-authored-by: Bhavani Sudha Saktheeswaran 
<2179254+bhasu...@users.noreply.github.com>
---
 README.md  |   4 +++-
 ...-Serverless-Architecture-in-Hudi-Data-Lakes.png | Bin 0 -> 247259 bytes
 ...Guide-Local-Ingestion-from-Parquet-Source-1.png | Bin 0 -> 122866 bytes
 ...-On-Guide-Local-Ingestion-from-CSV-Source-2.png | Bin 0 -> 122377 bytes
 ...bles-using-Hudi-MultiTable-Delta-Streamer-3.png | Bin 0 -> 119607 bytes
 ...l-from-Postgres-to-Hudi-using-deltastreamer.png | Bin 0 -> 842209 bytes
 ...mer-in-continous-Mode-and-SQL-transformer-5.png | Bin 0 -> 127113 bytes
 ...ngest-data-from-Kafka-Topic-Hands-on-Labs-6.png | Bin 0 -> 122382 bytes
 .../video_blogs/2023-11-24-hudi-table-types.png| Bin 0 -> 323601 bytes
 ...zium-kafka-schema-registry-deltastreamer-7a.png | Bin 0 -> 230759 bytes
 ...zium-kafka-schema-registry-deltastreamer-7b.png | Bin 0 -> 55192 bytes
 ...tadata-table-Record-Level-Index-HBase-Index.png | Bin 0 -> 114468 bytes
 ...Streamer-in-Continuous-Mode-Hands-on-Labs-8.png | Bin 0 -> 126485 bytes
 ...ache-Hudi-DeltaStreamer-with-Hands-on-Lab-9.png | Bin 0 -> 380565 bytes
 ...-in-Incremental-Fashion-Bronze-to-Silver-10.png | Bin 0 -> 407136 bytes
 ...r-on-Local-Machine-for-Begineers-Easy-Setup.png | Bin 0 -> 124505 bytes
 ...ift-Server-and-Hudi-with-Beeline-in-Minutes.png | Bin 0 -> 125195 bytes
 ...ng-and-AvroKafkaSource-Consumption-11-Guide.png | Bin 0 -> 390151 bytes
 ...Data-using-Hue-and-Presto-CLI-Hands-on-Labs.png | Bin 0 -> 124038 bytes
 ...0-14-and-RLI-on-AWS-Glue-Step-by-Step-Guide.png | Bin 0 -> 123948 bytes
 ...d_Real_Time_Apache_Hudi_Transaction_Datalake.md |   2 +-
 ...eltastreamer_and_AWS_DMS_Hands_on_Lab_Part_1.md |   2 +-
 ...eltastreamer_and_AWS_DMS_Hands_on_Lab_Part_2.md |   2 +-
 ...eltastreamer_and_AWS_DMS_Hands_on_Lab_Part_3.md |   2 +-
 ...eltastreamer_and_AWS_DMS_Hands_on_Lab_Part_4.md |   2 +-
 ...eltastreamer_and_AWS_DMS_Hands_on_Lab_Part_5.md |   2 +-
 ...R_Serverless_Hands_on_Lab_step_by_step_guide.md |   2 +-
 ...t_Driven_Approach_using_Lambdas_Event_Bridge.md |   2 +-
 ...t_Apache_Hudi_Transformers_with_Hands_on_Lab.md |   2 +-
 ...c-Data-Platforms-Like-a-Pro-Final-Part-Demo.md} |   1 -
 ...m-Data-Processing-with-Python-Hands-on-Labs.md} |   0
 ...d-Apache-Flink-Hands-on-Guide-for-Beginners.md} |   0
 ...n-S3-with-Apache-Flink-CDC-Connector-Python.md} |   0
 ...ional-Datalakes-on-S3-using-PyFLink-Locally.md} |   0
 ...nerating-Primary-Keys-for-Modern-Data-Lakes.md} |   0
 ...h-DynamoDB-for-Faster-Commit-Time-Retrieval.md} |   0
 ...16-Hudi-0-14-0-Deep-Dive-Record-Level-Index.md} |   0
 ...-Course-for-beginner-Operations-Type-Part-5.md} |   0
 ...r-Data-Lake-using-Elastic-Search-and-Kibana.md} |   0
 ...our-Medallion-Architecture-with-Apache-Hudi.md} |   0
 ...g-Serverless-Architecture-in-Hudi-Data-Lakes.md |  15 +++
 ...-Guide-Local-Ingestion-from-Parquet-Source-1.md |  18 ++
 ...s-On-Guide-Local-Ingestion-from-CSV-Source-2.md |  20 
 ...ables-using-Hudi-MultiTable-Delta-Streamer-3.md |  16 
 ...ll-from-Postgres-to-Hudi-using-deltastreamer.md |  16 
 ...amer-in-continous-Mode-and-SQL-transformer-5.md |  17 +
 ...ingest-data-from-Kafka-Topic-Hands-on-Labs-6.md |  19 +++
 website/videoBlog/2023-11-24-hudi-table-types.md   |  16 
 ...ezium-kafka-schema-registry-deltastreamer-7a.md |  21 +
 ...ezium-kafka-schema-registry-deltastreamer-7b.md |  20 
 ...etadata-table-Record-Level-Index-HBase-Index.md |  17 +
 ...aStreamer-in-Continuous-Mode-Hands-on-Labs-8.md |  19 +++
 ...pache-Hudi-DeltaStreamer-with-Hands-on-Lab-9.md |  17 +
 ...e-in-Incremental-Fashion-Bronze-to-Silver-10.md |  20 
 ...er-on-Local-Machine-for-Begineers-Easy-Setup.md |  19 +++
 ...rift-Server-and-Hudi-with-Beeline-in-Minutes.md |  18 ++
 ...ing-and-AvroKafkaSource-Consumption-11-Guide.md |  20 
 ...-Data-using-Hue-and-Presto-CLI-Hands-on-Labs.md |  19 +++
 

Re: [PR] DOCS-added-video-content [hudi]

2023-12-23 Thread via GitHub


bhasudha merged PR #10385:
URL: https://github.com/apache/hudi/pull/10385





Re: [PR] DOCS-added-video-content [hudi]

2023-12-23 Thread via GitHub


bhasudha commented on code in PR #10385:
URL: https://github.com/apache/hudi/pull/10385#discussion_r1435645589


##
website/videoBlog/2023-12-19-How-to-Use-Apache-Hudi-0-14-and-RLI-on-AWS-Glue-Step-by-Step-Guide.mdx:
##
@@ -0,0 +1,18 @@
+---
+title: "How to Use Apache Hudi 0.14 and RLI (record level index) on AWS Glue 
Step by Step Guide"
+last_modified_at: 2023-12-20T16:54:38.964863-07:00
+authors:
+- name: Soumil Shah
+category: blog
+image: 
/assets/images/video_blogs/2023-12-19-How-to-Use-Apache-Hudi-0-14-and-RLI-on-AWS-Glue-Step-by-Step-Guide.png
+navigate: "https://www.youtube.com/watch?v=HJ6QQN408AE"
+tags:
+- guide
+- beginner
+- record level index

Review Comment:
   add `indexing` tag






Re: [PR] DOCS-added-video-content [hudi]

2023-12-23 Thread via GitHub


bhasudha commented on code in PR #10385:
URL: https://github.com/apache/hudi/pull/10385#discussion_r1435645468


##
website/videoBlog/2023-12-12-Apache-Hudi-DeltaStreamer-in-Action-Python-Publishing-and-AvroKafkaSource-Consumption-11-Guide.mdx:
##
@@ -0,0 +1,18 @@
+---
+title: "Apache Hudi Delta Streamer in Action: Python Publishing and 
AvroKafkaSource Consumption (#11 Guide)"
+last_modified_at: 2023-12-20T16:54:38.964863-07:00
+authors:
+- name: Soumil Shah
+category: blog
+image: 
/assets/images/video_blogs/2023-12-12-Apache-Hudi-DeltaStreamer-in-Action-Python-Publishing-and-AvroKafkaSource-Consumption-11-Guide.png
+navigate: "https://www.youtube.com/watch?v=FSpt4jSH_O0"
+tags:
+- guide
+- beginner
+- deltastreamer
+- AvroKafkaSource

Review Comment:
   Avoid class names as tags; they are implementation details.



##
website/videoBlog/2023-12-16-Learn-How-to-Setup-Hudi-on-EMR-with-Hive-and-Query-Data-using-Hue-and-Presto-CLI-Hands-on-Labs.mdx:
##
@@ -0,0 +1,21 @@
+---
+title: "Learn How to Setup Hudi on EMR with Hive and Query Data using Hue and 
Presto CLI Hands on Labs"
+last_modified_at: 2023-12-20T16:54:38.964863-07:00
+authors:
+- name: Soumil Shah
+category: blog
+image: 
/assets/images/video_blogs/2023-12-16-Learn-How-to-Setup-Hudi-on-EMR-with-Hive-and-Query-Data-using-Hue-and-Presto-CLI-Hands-on-Labs.png
+navigate: "https://www.youtube.com/watch?v=oav6aEldk1o"
+tags:
+- guide
+- aws
+- beginner
+- hive

Review Comment:
   apache hive






Re: [PR] DOCS-added-video-content [hudi]

2023-12-23 Thread via GitHub


bhasudha commented on code in PR #10385:
URL: https://github.com/apache/hudi/pull/10385#discussion_r1435645350


##
website/videoBlog/2023-12-11-Simplifying-Big-Data-Setting-Up-SparkSQL-Hive-Thrift-Server-and-Hudi-with-Beeline-in-Minutes.mdx:
##
@@ -0,0 +1,18 @@
+---
+title: "Simplifying Big Data: Setting Up Spark SQL, Hive Thrift Server, and 
Hudi with Beeline in Minutes"
+last_modified_at: 2023-12-20T16:54:38.964863-07:00
+authors:
+- name: Soumil Shah
+category: blog
+image: 
/assets/images/video_blogs/2023-12-11-Simplifying-Big-Data-Setting-Up-SparkSQL-Hive-Thrift-Server-and-Hudi-with-Beeline-in-Minutes.png
+navigate: "https://www.youtube.com/watch?v=lCorHcx2mvc"
+tags:
+- guide
+- aws

Review Comment:
   Avoid the plain `aws` tag and use fully qualified tags instead, for example `aws emr`.






Re: [PR] DOCS-added-video-content [hudi]

2023-12-23 Thread via GitHub


bhasudha commented on code in PR #10385:
URL: https://github.com/apache/hudi/pull/10385#discussion_r1435645073


##
website/videoBlog/2023-12-09-Learn-How-to-use-DBT-with-Spark-and-Thrift-Server-on-Local-Machine-for-Begineers-Easy-Setup.mdx:
##
@@ -0,0 +1,18 @@
+---
+title: "Learn How to use DBT with Spark and Thrift Server on Local Machine for 
Begineers Easy Setup"
+last_modified_at: 2023-12-20T16:54:38.964863-07:00
+authors:
+- name: Soumil Shah
+category: blog
+image: 
/assets/images/video_blogs/2023-12-09-Learn-How-to-use-DBT-with-Spark-and-Thrift-Server-on-Local-Machine-for-Begineers-Easy-Setup.png
+navigate: "https://www.youtube.com/watch?v=k1HSFPlunlM"
+tags:
+- guide
+- beginner
+- spark

Review Comment:
   Tag with fully qualified names - `apache spark`






Re: [PR] DOCS-added-video-content [hudi]

2023-12-23 Thread via GitHub


bhasudha commented on code in PR #10385:
URL: https://github.com/apache/hudi/pull/10385#discussion_r1435644838


##
website/videoBlog/2023-11-21-RFC-14-Step-by-Step-Guide-for-Incremental-Data-Pull-from-Postgres-to-Hudi-using-deltastreamer.mdx:
##
@@ -0,0 +1,16 @@
+---
+title: "RFC-14: Step-by-Step Guide for Incremental Data Pull from Postgres to 
Hudi using DeltaStreamer (#4)"
+last_modified_at: 2023-12-20T16:54:38.964863-07:00
+authors:
+- name: Soumil Shah
+category: blog
+image: 
/assets/images/video_blogs/2023-11-21-RFC-14-Step-by-Step-Guide-for-Incremental-Data-Pull-from-Postgres-to-Hudi-using-deltastreamer.png
+navigate: "https://www.youtube.com/watch?v=kqQ0SVwfBig"
+tags:
+- guide
+- beginner
+- deltastreamer

Review Comment:
   Let's remember to add the `hudi streamer` tag wherever we add `deltastreamer`.
   For reference, DeltaStreamer was renamed in recent releases.
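
The tagging conventions raised in these review comments could be checked mechanically. A minimal sketch, assuming a hypothetical helper `normalize_tags` (not part of the Hudi repository) and rule tables built only from the comments in this thread:

```python
# Hypothetical helper: the rule tables below are assumptions drawn from the
# review comments in this thread, not an official Hudi tag list.
QUALIFIED = {"spark": "apache spark", "hive": "apache hive"}
COMPANIONS = {"deltastreamer": "hudi streamer", "record level index": "indexing"}
AMBIGUOUS = {"aws"}


def normalize_tags(tags):
    """Return (normalized_tags, warnings) for a list of front-matter tags."""
    out, warnings = [], []
    for tag in tags:
        # Replace short names with their fully qualified form.
        tag = QUALIFIED.get(tag, tag)
        # Broad tags cannot be qualified automatically, so only warn.
        if tag in AMBIGUOUS:
            warnings.append(f"'{tag}' is too broad; use a qualified tag such as 'aws emr'")
        if tag not in out:
            out.append(tag)
        # Add the companion tag the reviewer asked for, if it is missing.
        companion = COMPANIONS.get(tag)
        if companion and companion not in out:
            out.append(companion)
    return out, warnings
```

For example, `normalize_tags(["guide", "beginner", "deltastreamer"])` keeps the three tags and appends `hudi streamer`.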






Re: [PR] DOCS-added-video-content [hudi]

2023-12-23 Thread via GitHub


bhasudha commented on code in PR #10385:
URL: https://github.com/apache/hudi/pull/10385#discussion_r1435644596


##
website/static/assets/images/video_blogs/2023-12-11-Simplifying-Big-Data-Setting-Up-SparkSQL-Hive-Thrift-Server-and-Hudi-with-Beeline-in-Minutes.png:
##


Review Comment:
   Let's avoid this type of thumbnail. It doesn't convey what the video guide is about very well.






Re: [PR] DOCS-added-video-content [hudi]

2023-12-23 Thread via GitHub


bhasudha commented on code in PR #10385:
URL: https://github.com/apache/hudi/pull/10385#discussion_r1435644467


##
website/static/assets/images/video_blogs/2023-11-19-Hudi-Streamer-Hands-On-Guide-Local-Ingestion-from-Parquet-Source-1.png:
##


Review Comment:
   Let's avoid this type of thumbnail. It doesn't convey what the video guide is about very well.



##
website/static/assets/images/video_blogs/2023-11-20-Hudi-Streamer-Hands-On-Guide-Local-Ingestion-from-CSV-Source-2.png:
##


Review Comment:
   Let's avoid this type of thumbnail. It doesn't convey what the video guide is about very well.






Re: [I] [SUPPORT] Seeking Assistance with Hudi Integration Issue in Spark Thrift Server and DBT [hudi]

2023-12-23 Thread via GitHub


soumilshah1995 closed issue #10287: [SUPPORT] Seeking Assistance with Hudi 
Integration Issue in Spark Thrift Server and DBT
URL: https://github.com/apache/hudi/issues/10287





Re: [I] [SUPPORT] Seeking Assistance with Hudi Integration Issue in Spark Thrift Server and DBT [hudi]

2023-12-23 Thread via GitHub


soumilshah1995 commented on issue #10287:
URL: https://github.com/apache/hudi/issues/10287#issuecomment-1868324146

   ![Screenshot 2023-12-23 at 11 13 29 
AM](https://github.com/apache/hudi/assets/39345855/1029e731-be52-4ff8-81b8-1753c342de44)
   
   
   I will be creating YouTube videos on this, which should help everyone.
   





(hudi) branch asf-site updated: DOC-added talks/presentations (#10399)

2023-12-23 Thread bhavanisudha
This is an automated email from the ASF dual-hosted git repository.

bhavanisudha pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new ca9d62964af DOC-added talks/presentations (#10399)
ca9d62964af is described below

commit ca9d62964af76d4b1f529e48c0feaebc27ebd0ee
Author: nadine farah 
AuthorDate: Sat Dec 23 06:37:16 2023 -0800

DOC-added talks/presentations (#10399)

* added talks/presentations

added talks from open source data summit that were hudi focused

* More fixes

Fixed order of talks, added conference names and fixed some links

-

Co-authored-by: Bhavani Sudha Saktheeswaran 
<2179254+bhasu...@users.noreply.github.com>
---
 README.md  | 15 +
 website/src/pages/talks.md | 84 ++
 2 files changed, 70 insertions(+), 29 deletions(-)

diff --git a/README.md b/README.md
index 9415e9fff12..a4ae4b3d96c 100644
--- a/README.md
+++ b/README.md
@@ -157,6 +157,21 @@ Example: When you change any file in 
`versioned_docs/version-0.7.0/`, it will on
 ## Configs
 Configs can be automatically updated by following these steps documented at 
../hudi-utils/README.md
 
+## Talks
+
+When adding a talk, please follow these guidelines.
+
+1. Ensure the entry is of the format 
+   "[Title](Hyperlink to video/resources)" - By , , 
. ,  .
+2. Please ensure the talks are in chronological order.
+3. Try to add links to videos and slide decks when possible. If they are not 
available in same page, feel free to add 
+   [Slides](Slides link) towards the end like for example:
+
+:::note
+   ["Hoodie: An Open Source Incremental Processing Framework From 
Uber"](http://www.dataengconf.com/hoodie-an-open-source-incremental-processing-framework-from-uber)
 - By Vinoth Chandar.
+   Apr 2017, DataEngConf, San Francisco, CA 
[Slides](https://www.slideshare.net/vinothchandar/hoodie-dataengconf-2017) 
[Video](https://www.youtube.com/watch?v=7Wudjc-v7CA)
+:::
+
 ## Blogs
 
 When adding a new blog, please follow these guidelines.
diff --git a/website/src/pages/talks.md b/website/src/pages/talks.md
index bea37571700..ddcfacb5fad 100644
--- a/website/src/pages/talks.md
+++ b/website/src/pages/talks.md
@@ -49,59 +49,85 @@ last_modified_at: 2019-12-31T15:59:57-04:00
 
 18. ["Next Generation Data lakes using Apache 
Hudi"](https://docs.google.com/presentation/d/1y-ryRwCdTbqQHGr_bn3lxM_B8L1L5nsZOIXlJsDl_wU/edit?usp=sharing)
 - By Balaji Varadarajan and Sivabalan Narayanan, Sep 2020, 
["ApacheCon"](https://www.apachecon.com/)
 
-19. ["Building Large-Scale, Transactional Data Lakes using Apache 
Hudi"](https://www.dbta.com/DataSummit/Fall2020/Agenda.aspx) - By Nishith 
Agarwal, Data Summit 2020
+19. ["Apache Hudi on Amazon 
EMR"](https://pages.awscloud.com/rs/112-TZM-766/images/EV_analytics-sprint-week-apache-hundi-amazon-emr_Sep-2020.pdf)
 - By the AWS team. September 2020
 
-20. ["Landing practice of Apache Hudi in 
T3go"](https://drive.google.com/file/d/1ULVPkjynaw-07wsutLcZm-4rVXf8E8N8/view?usp=sharing)
 - By VinoYang and XianghuWang, November 2020, Qcon.
+20. ["Building Large-Scale, Transactional Data Lakes using Apache 
Hudi"](https://www.dbta.com/DataSummit/Fall2020/Agenda.aspx) - By Nishith 
Agarwal, Data Summit 2020
 
-21. ["Meetup talk by Nishith 
Agarwal"](https://www.meetup.com/UberEvents/events/274924537/) - Uber Data 
Platforms Meetup, Dec 2020
+21. ["Landing practice of Apache Hudi in 
T3go"](https://drive.google.com/file/d/1ULVPkjynaw-07wsutLcZm-4rVXf8E8N8/view?usp=sharing)
 - By VinoYang and XianghuWang, November 2020, Qcon.
 
-22. ["Apache Hudi learning series: Understanding Hudi 
internals"](https://www.slideshare.net/NishithAgarwal3/hudi-architecture-fundamentals-and-capabilities)
 - By Abhishek Modi, Balajee Nagasubramaniam, Prashant Wason, Satish Kotha, 
Nishith Agarwal, Feb 2021, Uber Meetup
+22. ["Meetup talk by Nishith 
Agarwal"](https://www.meetup.com/UberEvents/events/274924537/) - Uber Data 
Platforms Meetup, Dec 2020
 
-23. ["Apache Hudi Meetup at Uber with talks from AWS, CityStorageSystems & 
Uber"](https://youtu.be/iXBInMLbjo0) - By Udit Mehrotra, Wenning Ding (AWS), 
Alexander Filipchik (CityStorageSystems), Prashant Wason, Satish Kotha (Uber), 
Feb 2021
+23. ["Apache Hudi learning series: Understanding Hudi 
internals"](https://www.slideshare.net/NishithAgarwal3/hudi-architecture-fundamentals-and-capabilities)
 - By Abhishek Modi, Balajee Nagasubramaniam, Prashant Wason, Satish Kotha, 
Nishith Agarwal, Feb 2021, Uber Meetup
 
-24. ["Apache Hudi: The Streaming Data Lake 
Platform"](https://docs.google.com/presentation/d/1lVpbYV7qytAZPdwx4X9DD9ii0qFh7n9WGKJ0XQ4VpIs/edit?usp=sharing)
 - By Nishith Agarwal, Sivabalan Narayanan, 
+24. ["Apache Hudi Meetup at Uber with talks from AWS, CityStorageSystems & 
Uber"](https://youtu.be/iXBInMLbjo0) - By Udit Mehrotra, Wenning 

Re: [PR] DOC-added talks/presentations [hudi]

2023-12-23 Thread via GitHub


bhasudha merged PR #10399:
URL: https://github.com/apache/hudi/pull/10399





Re: [PR] DOC-added talks/presentations [hudi]

2023-12-23 Thread via GitHub


bhasudha commented on PR #10399:
URL: https://github.com/apache/hudi/pull/10399#issuecomment-1868304376

   @nfarah86 Thanks for the PR. I reviewed and fixed a few things. For future reference, please ensure:
   - The talks are in chronological order.
   - We mention the conference name and stick to the format.

   I added these to README.md as well for reference.
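
The chronological-order rule could be checked with a small script. A minimal sketch, assuming each talk entry contains a month name and a four-digit year (as in the README example, "Apr 2017"); `talk_date` and `out_of_order` are hypothetical names, and the month matching is a heuristic rather than a full date parser:

```python
import re
from datetime import date

MONTHS = {m: i for i, m in enumerate(
    ["jan", "feb", "mar", "apr", "may", "jun",
     "jul", "aug", "sep", "oct", "nov", "dec"], start=1)}

# A word followed by a four-digit year; only its first three letters are kept.
DATE_RE = re.compile(r"\b([A-Za-z]{3})[a-z]*\.?\s+(\d{4})\b")


def talk_date(entry):
    """Return the first 'Month YYYY' found in an entry as a date, or None."""
    for month, year in DATE_RE.findall(entry):
        key = month.lower()
        if key in MONTHS:
            return date(int(year), MONTHS[key], 1)
    return None


def out_of_order(entries):
    """Return (previous, current) pairs where a talk is dated before its predecessor."""
    problems = []
    last = None
    for entry in entries:
        d = talk_date(entry)
        if d is None:
            continue  # entries without a recognizable date are skipped
        if last is not None and d < last[0]:
            problems.append((last[1], entry))
        last = (d, entry)
    return problems
```

Entries that give only a year, such as "Data Summit 2020", are skipped by this sketch, so they would still need a manual check.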





Re: [PR] [MINOR] Fixed unit tests [hudi]

2023-12-23 Thread via GitHub


geserdugarov commented on PR #10362:
URL: https://github.com/apache/hudi/pull/10362#issuecomment-1868303781

   I don't understand what is happening with CI. I've changed 2 unit tests:
   
   - `TestJavaHoodieBackedMetadata`, from `hudi-client/hudi-java-client`,
   - `TestHoodieDeltaStreamer`, from `hudi-utilities`.
   
   Both are Java tests.
   
   ### Azure CI
   
   `hudi-client/hudi-java-client` is not included in the Azure CI.
   `hudi-utilities` is included in the Azure CI, in the `UT FT other modules` job at the `UT other modules` stage.
   So `TestHoodieDeltaStreamer` is the only test that could break the Azure CI.
   But the last log line from the `UT other modules` stage is
   > [INFO] Running org.apache.hudi.utilities.sources.TestSqlSource
   
   before
   > This job was abandoned. We have detected that logs from the agent may have 
not finished uploading. We have included our in-memory record of all log lines 
uploaded before we lost contact with the agent:
   
   My change in this test couldn't break it this way; only a test failure would be possible. Maybe my change alters the test ordering, and the unit test run hangs in `@AfterAll`/`@AfterEach` of some test class or in `@BeforeAll`/`@BeforeEach` of another one. But I couldn't reproduce the problem: this part of the CI job passes locally without any issue.
   
   ### GitHub Actions
   
   My change in `TestJavaHoodieBackedMetadata` from `hudi-client/hudi-java-client` should affect only the `test-hudi-hadoop-mr-and-hudi-java-client` job, not `test-spark`.
   And I see that `test-hudi-hadoop-mr-and-hudi-java-client` is OK, but there are hangs in `test-spark` and a failure in the `TestDataSourceForBootstrap` Scala test after
   
   > 2023-12-23T04:01:07.0996155Z 4017081 [Executor task launch worker for task 
372] ERROR org.apache.spark.executor.Executor [] - Exception in task 0.0 in 
stage 133.0 (TID 372)
   2023-12-23T04:01:07.0997116Z java.lang.OutOfMemoryError: GC overhead limit 
exceeded
   
   @danny0405, @yihua Could you please suggest what else I can try?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org