Re: [VOTE] Release 0.10.1, release candidate #2

2022-01-24 Thread Mehrotra, Udit
+1 binding - Compilation for Spark 2 and Spark 3 [OK] - RC validation [OK] - QuickStart [OK] Thanks, Udit On 1/25/22, 12:11 AM, "Balaji Varadarajan" wrote: CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the s

Re: [DISCUSS] Hudi is the data lake platform

2021-04-13 Thread Mehrotra, Udit
Agree with the rebranding, Vinoth. Hudi is not just a "table format" and we need to do justice to all the cool auxiliary features/services we have built. Also, a timeline metadata service in particular would be a really big win if we move towards something like that. On 4/13/21, 11:01 AM, "Pratya

Re: [VOTE] Release 0.8.0, release candidate #1

2021-03-31 Thread Mehrotra, Udit
+1 - Release Validation Script [OK] - Compile with Spark 2/Spark 3 [OK] - Ran QuickStart with Spark 2/Spark 3 on EMR [OK] Thanks, Udit On 3/31/21, 5:16 PM, "Vinoth Chandar" wrote:

Re: Congrats to our newest committers!

2020-12-03 Thread Mehrotra, Udit
Huge congrats guys! Well deserved indeed. On 12/3/20, 11:44 AM, "Prashant Wason" wrote: Thanks everyone. Ove

Re: [DISCUSS] 0.7.0 release timelines

2020-12-02 Thread Mehrotra, Udit
Option 2 seems better to me as well. On 12/2/20, 11:24 AM, "Vinoth Chandar" wrote: +1 for 2 from me as well.

Re: [DISCUSS] Formalizing the release process

2020-09-08 Thread Mehrotra, Udit
+1 on the process. On 9/8/20, 5:11 PM, "Vinoth Chandar" wrote: >, bit skeptical on minor version releases every mo

Re: [DISCUSS] New Community Weekly Sync up Time

2020-09-08 Thread Mehrotra, Udit
I am okay with this too. On 9/8/20, 5:33 PM, "Raymond Xu" wrote: I'm ok with 1 hr earlier. On Tue, Sep 8, 202

Re: Date handling in HUDI

2020-07-21 Thread Mehrotra, Udit
Hi Tanu/Balaji, I have not really faced the issue mentioned here. AFAIK, the Date and Timestamp types should work fine. The logical Date type is represented as an INT in Avro, which is why you see the integer ingested there (see https://avro.apache.org/docs/current/spec.html#Date). But it should not hav
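
To illustrate the point about the Avro date logical type: per the Avro specification, a date is stored as an int counting days since the Unix epoch (1970-01-01). The sketch below only shows how such an integer maps to a calendar date and back; the function names are made up for illustration and are not part of Hudi or the Avro libraries.

```python
from datetime import date, timedelta

# The Avro `date` logical type annotates an int: days since the Unix epoch.
EPOCH = date(1970, 1, 1)


def avro_date_to_date(days: int) -> date:
    """Convert an Avro date value (days since 1970-01-01) to a calendar date."""
    return EPOCH + timedelta(days=days)


def date_to_avro_date(d: date) -> int:
    """Convert a calendar date to the int that Avro would store for it."""
    return (d - EPOCH).days


if __name__ == "__main__":
    stored = date_to_avro_date(date(2020, 7, 21))
    print(stored, avro_date_to_date(stored))
```

So an integer showing up in a date column is usually the raw encoding rather than corrupted data; whether a given reader converts it back depends on that reader honoring the logical type.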

Re: Hudi + AWS Glue NoSuchMethodError: org.apache.http.conn.ssl.SSLConnectionSocketFactory.

2020-06-15 Thread Mehrotra, Udit
Hi Anton, This issue seems specific to AWS EMR, and it would be best for you to open a support ticket with the service if you are able to reproduce the issue with the Hudi version that is supported on the EMR release. On a parallel note, if you really want to build Hudi 0.5.3 you do not have to bu

Re: [VOTE] Release 0.5.3, release candidate #2

2020-06-11 Thread Mehrotra, Udit
+1 (non-binding) - Integration tests succeeded locally - Release validation script succeeded: Checking Signature -e Signature Check - [OK] Checking for binary files in source release -e No Binary Files in Source Release? - [OK] Ch

Re: Apache Hudi Graduation vote on general@incubator

2020-05-22 Thread Mehrotra, Udit
Congrats Vinoth and to this amazing community on a major milestone! On 5/22/20, 10:11 AM, "Pratyaksh Sharma" wrote:

Re: [DISCUSS] Bug bash?

2020-04-23 Thread Mehrotra, Udit
+1 Happy to participate On 4/23/20, 6:32 PM, "vino yang" wrote: +1 Shiyan Xu 于2020年4月24日周五 上

Re: Pyspark with hudi scripts

2020-04-08 Thread Mehrotra, Udit
Hi Yaswanth, PFA an example I prepared some time back which can help you get started. Thanks, Udit On 4/8/20, 3:21 PM, "Atluri Yaswanth" wrote:

Re: Apache Hudi Example - Python

2020-04-08 Thread Mehrotra, Udit
Hi Priya, PFA an example I prepared some time back which can help you get started. Thanks, Udit On 4/8/20, 3:44 PM, "Priyadarshini Shivram" wrote:

Re: Question

2020-03-17 Thread Mehrotra, Udit
Hi Zaidi, You should be able to use Hudi 0.5.1 in the next EMR release, which should be fairly soon, but we can't give you an ETA. Meanwhile, there is nothing really stopping you from building your own Hudi 0.5.1 jars and replacing the ones on the EMR cluster. The jars are located on the master node at /usr/l

Re: running Hudi in AWS Glue Spark

2020-03-06 Thread Mehrotra, Udit
Hi Jorge, AWS Glue service itself does not support Hudi. However, you can use Glue as a metastore with Hudi on EMR. Hope that answers your question. Thanks, Udit Mehrotra SDE | AWS EMR On 3/6/20, 9:44 AM, "Vinoth Chandar" wrote:

Re: Apache Hudi on AWS EMR

2020-02-27 Thread Mehrotra, Udit
write into Hudi, but as Databricks providing "cloudfiles" for failure handling, Is there something in EMR? or do we need to manually handle this failure by introducing SQS and SNS?

Re: [DISCUSS] Relocate spark-avro dependency by maven-shade-plugin

2020-02-19 Thread Mehrotra, Udit
Here are my 2 cents on this. @Vinoth just to add to your points: >I think SchemaConverters does import things like types and those will be >tied to the spark version. if there are new types for e.g, our bundled >spark-avro may not recognize them for e.g.. If new types are added, with

Re: Apache Hudi on AWS EMR

2020-02-19 Thread Mehrotra, Udit
Hi Udit, Just a quick question on Presto EMR. Does EMR Presto support Hudi jars in its classpath? On Tue, Feb 18, 2020 at 12:03 PM Mehrotra, Udit wrote: > Workaround provided by Gary can help querying Hudi tables through Athena > for Copy On Write tab

Re: Hudi on EMR syncing GLUE catalog issue

2020-02-18 Thread Mehrotra, Udit
Hi Igor, As of the current implementation, Hudi submits queries (creating tables, syncing partitions, etc.) directly to the Hive server instead of communicating directly with the metastore. Thus, while launching the EMR cluster, you should install Hive on the cluster as well. Also enable glue catalo

Re: Apache Hudi on AWS EMR

2020-02-18 Thread Mehrotra, Udit
ve not tested it though. The feature is in Preview already. Thanks Raghu -Original Message- From: Shiyan Xu Sent: Tuesday, February 18, 2020 6:20 AM To: dev@hudi.apache.org

Re: Apache Hudi on AWS EMR

2020-02-12 Thread Mehrotra, Udit
Hi Raghvendra, You would have to rewrite your Parquet dataset in Hudi format. Here are the links you can follow to get started: https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-hudi-work-with-dataset.html https://hudi.apache.org/docs/querying_data.html#spark-incr-pull Thanks, Udit On 2/

Re: [VOTE] Release 0.5.1-incubating, release candidate #1

2020-01-23 Thread Mehrotra, Udit
+1 (non-binding) - Ran validate release script - OK - Ran unit tests on my mac - OK - Ran docker tests - OK - Ran COW/MOR examples on EMR for Insert/Update/Delete use-cases, and querying through hive/presto - OK Thanks, Udit Mehrotra SDE AWS | EMR On 1/23/20, 2:42 PM, "Bhavani Sudha" wrote:

Re: EMR + HUDI

2019-11-15 Thread Mehrotra, Udit
Yes, Hudi is finally an officially supported application on AWS EMR. Thanks to all the Hudi PMC members for their willingness and guidance towards making this happen! Look forward to continued collaboration with you guys, to continue to improve and grow this project/community. Thanks, Udit On