+1 (binding)
- Write and read COW table through spark
Balaji.V
On Friday, July 12, 2024 at 06:07:55 AM PDT, Lokesh Jain
wrote:
+1 (non-binding)
- Verified checksums and signatures
- Ran quickstart
Regards
Lokesh
On 2024/07/08 17:57:13 sagar sumit wrote:
> Hi everyone,
>
> Please
+1 (binding)
Balaji.V
On Sunday, June 2, 2024 at 06:48:01 PM PDT, Sivabalan
wrote:
+1
Ran deltastreamer tests, meta sync tests and Quick start. All good on my end.
On Thu, 30 May 2024 at 16:07, Yexiang Chang wrote:
>
> +1, I verified 0.15.0 hudi-spark-bundle and hudi-hadoop-mr-bundle
+1 (binding)
Ran validate stage test
Checking Checksum of Source Release
Checksum Check of Source Release - [OK]
Checking Signature
Signature Check - [OK]
Checking for binary files in the source files
No Binary Files in the source files? - [OK]
Checking for DISCLAIMER
+1 (binding)
Ran release validation script.
$ ./release/validate_staged_release.sh --release=0.12.2 --rc_num=1
/tmp/validation_scratch_dir_001 ~/code/oss/hudi/scripts
Downloading from svn co https://dist.apache.org
+1 (binding)
On Monday, August 15, 2022 at 08:42:08 AM PDT, Rahil C
wrote:
+1
-Rahil C
On Mon, Aug 15, 2022 at 8:07 AM Nishith wrote:
> +1 (binding)
>
> -Nishith
>
> > On Aug 15, 2022, at 12:20 AM, Shiyan Xu
> wrote:
> >
> > +1 (binding)
> >
> > Manually ran deltastreamer job with
+1 on option B.
Balaji.V
On Thu, Feb 17, 2022 at 11:20 PM Nishith wrote:
> +1 to B for the same reasons
>
> -Nishith
>
> > On Feb 17, 2022, at 9:22 PM, Vinoth Chandar wrote:
> >
> > +1 on B as well. same rationale as Raymond's. I think we have all major
> > chunks landed or PRs up.
> > Love
+1 binding. RC passed.
Balaji.V
On Monday, January 24, 2022, 10:28:58 AM PST, Bhavani Sudha
wrote:
+1 binding
Ran RC check, quickstart and some IDE tests.
Thanks,
Sudha
On Mon, Jan 24, 2022 at 9:23 AM sagar sumit wrote:
> +1
>
> - Builds for Spark2/3 [OK]
> - Spark quickstart
+1 (binding)
- Package build successful
- Overnight staging test: data validation successful for COW upsert workload.
On Monday, December 6, 2021, 06:40:32 AM PST, vino yang
wrote:
+1 (binding)
- build successfully
- ran spark quickstart
- verified checksum
Best,
Vino
Y Ethan
+1 (binding)
$ ./release/validate_staged_release.sh --release=${RC_VERSION} --rc_num=2
...Downloading from svn co https://dist.apache.org/repos/dist//dev/hudi
Validating hudi-0.9.0-rc2 with release type "dev"
Checking Checksum of Source Release
Checksum Check of Source Release - [OK]
% Total
+1
Balaji.V
On Wed, Aug 11, 2021 at 7:12 PM Bhavani Sudha
wrote:
> +1
>
> Thanks,
> Sudha
>
> On Wed, Aug 11, 2021 at 7:08 PM vino yang wrote:
>
> > +1
> >
> > Best,
> > Vino
> >
> > Pratyaksh Sharma wrote on Thursday, August 12, 2021 at 2:16 AM:
> >
> > > +1
> > >
> > > I have never used it, but we can try this
Welcome to the Apache Hudi community!
I have given you contributor permissions. Looking forward to your contributions!
Balaji.V
On Monday, January 25, 2021, 06:23:57 PM PST, jiangjiguang719
wrote:
Hi,
I want to contribute to Apache Hudi.
Would you please give me the contributor
+1 (binding)
1. Ran release validation script successfully.
2. Build successful.
3. Quickstart succeeded.
Checking Checksum of Source Release
Checksum Check of Source Release - [OK]
% Total % Received % Xferd Average Speed Time Time Time Current
Very Well deserved !! Many congratulations to Satish and Prashant.
Balaji.V
On Thursday, December 3, 2020, 11:07:09 AM PST, Bhavani Sudha
wrote:
Congratulations Satish and Prashant!
On Thu, Dec 3, 2020 at 11:03 AM Pratyaksh Sharma wrote:
Congratulations Satish and Prashant!
On Fri,
+1 for (2)
On Wednesday, December 2, 2020, 08:09:29 AM PST, vino yang
wrote:
+1 for option 2
Gary Li wrote on Wednesday, December 2, 2020 at 4:01 PM:
> vote for option 2.
>
> From: nishith agarwal
> Sent: Wednesday, December 2, 2020 3:16 PM
> To: dev@hudi.apache.org
>
Regarding RDD vs. DataFrame, the historical reason is that RDDs provided more
control, with the low-level API Hudi needed to manage various aspects of
writing.
On a related note, if you look at the current approach with Flink support, the
input batch is getting parameterized to support
+1
On Sunday, November 1, 2020, 09:13:44 PM PST, Gary Li
wrote:
+1 for biweekly meeting.
Gary Li
From: Vinoth Chandar
Sent: Friday, October 30, 2020 2:01:22 PM
To: dev@hudi.apache.org ; us...@hudi.apache.org
Subject: Re: Reg weekly sync meeting + users list as well.
On Thu, Oct 29,
Hi Selvaraj, I have replied in the jira.
Thanks,
Balaji.V
On Sunday, November 1, 2020, 01:17:05 AM PST, selvaraj
periyasamy wrote:
Team,
Could you look into HUDI-1365? Performance is heavily impacted for
some reason.
Thanks,
Selva
Welcome to Apache Hudi community. I have added you as a contributor in Jira.
Balaji.V
On Wednesday, October 28, 2020, 08:11:00 PM PDT, jack_zhangsj
wrote:
Hi,
I want to contribute to Apache Hudi. Would you please give me the contributor
permission? My JIRA ID is jack_zhangsj .
tools as well and our choice would be based on ease of use and amount of
changes.
When would be a good time to chat today or tomorrow?
Thanks,
Roopa
From: Balaji Varadarajan
Date: Thursday, October 22, 2020 at 9:24 PM
To: "dev@hudi.apache.org"
Cc: DL-AIE
Subject: Re: [EXT] Re: Bucketing in Hudi
Hi Roopa,
Bucketing is a more general concept. I think what you are referring to is how
to
a spark bucketed table having metadata different from Hive
bucketed tables as Spark cannot understand Hive’s hashing algorithm.
Is this something that Hudi might support?
Thanks,
Roopa
From: Balaji Varadarajan
Date: Wednesday, October 21, 2020 at 9:01 PM
To: "dev@hudi.apache.org"
Hudi supports pluggable indexing (HoodieIndex), and the phases of index lookup
are nicely abstracted out. We have a Jira for supporting bucket indexing:
https://issues.apache.org/jira/browse/HUDI-55
You can get bucket indexing done by implementing that interface along with
additional changes
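The core idea behind bucket indexing can be sketched as follows. This is only an illustration of hash-based bucket assignment, not Hudi's HoodieIndex interface (its method signatures are not shown in this thread); the helper name `bucket_for_key` and the bucket count of 8 are hypothetical.

```python
import zlib


def bucket_for_key(record_key: str, num_buckets: int) -> int:
    """Map a record key to a bucket id in [0, num_buckets).

    Hypothetical helper for illustration only. CRC32 is used because it is
    stable across processes, unlike Python's built-in hash() for strings.
    """
    if num_buckets <= 0:
        raise ValueError("num_buckets must be positive")
    return zlib.crc32(record_key.encode("utf-8")) % num_buckets


# The same key always lands in the same bucket, so locating a record
# needs no separate index lookup.
keys = ["user-1", "user-2", "user-3", "user-1"]
assignments = [bucket_for_key(k, 8) for k in keys]
```

Because the mapping is deterministic, an update for an existing key is routed to the bucket that already holds it.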
Fixing Satish's incorrect email address.
On Wednesday, October 21, 2020, 06:19:43 PM
PDT, Balaji Varadarajan wrote:
cc Satish who implemented Insert Overwrite support.
We have recently landed Insert Overwrite support in Hudi. Partition level
deletion is a logical extension of this feature
cc Satish who implemented Insert Overwrite support.
We have recently landed Insert Overwrite support in Hudi. Partition level
deletion is a logical extension of this feature but is not available
yet. I have added a Jira to track this:
https://issues.apache.org/jira/browse/HUDI-1350
We are planning to add parallel writing to Hudi (at different partition levels)
in the next release.
Balaji.V
On Friday, October 16, 2020, 11:54:51 PM PDT, tanu dua
wrote:
Hi,
Do we have support for concurrent writes in 0.6? I have a similar
requirement to ingest in parallel from
NULL
Bucket Columns: [] NULL
Sort Columns: [] NULL
Storage Desc Params: NULL NULL
serialization.format 1
On Fri, 9 Oct 2020 at 19:07, Balaji Varadarajan
wrote:
> Can you paste the detailed h
in hive / hue?
Regards,
Ranganath
On Thu, 1 Oct 2020 at 09:45, Balaji Varadarajan
wrote:
> Assuming commit1 happened before commit2, this is what you should expect
> when running a standard query through query engines.
> Balaji.V
>
> On Tuesday, September 29, 2020, 03:04:17 PM
Assuming commit1 happened before commit2, this is what you should expect when
running a standard query through query engines.
Balaji.V
On Tuesday, September 29, 2020, 03:04:17 PM PDT, Ranganath Tirumala
wrote:
Hi,
Is there a way we can query to get the latest record across commits?
Hi Jialun,
There is no outside documentation for this case except the Javadocs
(https://issues.apache.org/jira/browse/HUDI-1277). The payload interfaces are
themselves first-class citizens of Hudi (
ath ending with `/`
(a directory path). To me, this seems to be a corner case not being
covered. Could you kindly confirm the expectation please? Thanks.
On Tue, Sep 8, 2020 at 8:58 PM Balaji Varadarajan
wrote:
> Hi Raymond,
> IIRC, we need to give a glob path for HoodieROTablePathFilter to work
Added. Welcome to Hudi community.
Balaji.V
On Tuesday, September 8, 2020, 09:31:37 PM PDT, Mani Jindal
wrote:
Hi team
Please guide me on how I can request contributor access for Jira so
that I can assign some Jira tickets to myself and contribute to the Hudi
community.
JIRA
Deleted.
Thanks,
Balaji.V
On Tuesday, September 8, 2020, 08:51:36 PM PDT, Raymond Xu
wrote:
I think there is a mistakenly created version tag 0.60 in JIRA; the number
does not seem to follow the release format.
Anyone care to delete this?
Hi Raymond,
IIRC, we need to give a glob path for HoodieROTablePathFilter to work
correctly (e.g., "base/partition/*"). The path cache is at the partition level,
not the table level, so we need to extract the partition path correctly to
use it as the lookup key. To extract partition-path, the
+1
On Tuesday, September 8, 2020, 05:54:52 PM PDT, Mehrotra, Udit
wrote:
I am okay with this too.
On 9/8/20, 5:33 PM, "Raymond Xu" wrote:
CAUTION: This email originated from outside of the organization. Do not
click links or open attachments unless you can confirm the sender
Hi Ji,
Moving this discussion to https://github.com/apache/hudi/issues/2063 which you
have opened. I have added a possible workaround in the comments. Please try it
out and respond in the issue.
Thanks,
Balaji.V
On Monday, September 7, 2020, 10:11:13 AM PDT, Jl Liu (cadl)
wrote:
Udit, Gary, Raymond and Pratyaksh,
Many congratulations :) Well deserved. Looking forward to your continued
contributions.
Balaji.V
On Thursday, September 3, 2020, 07:19:45 PM PDT, Sivabalan
wrote:
Congrats to all 3. Much deserved and really excited to see more committers
On Thu,
+1. All current and future contributors/committers need to read this.
Balaji.V
On Wednesday, September 2, 2020, 01:11:46 AM PDT, vino yang
wrote:
+1 to have the coding guidelines.
Left some comments.
Best,
Vino
Vinoth Chandar wrote on Wednesday, September 2, 2020 at 9:51 AM:
> Hello all,
>
> Put together a
+1 on the process.
Balaji.V
On Tuesday, September 1, 2020, 04:56:55 PM PDT, Gary Li
wrote:
+1
Gary Li
From: Bhavani Sudha
Sent: Wednesday, September 2, 2020 3:11:06 AM
To: us...@hudi.apache.org
Cc: dev@hudi.apache.org
Subject: Re: [DISCUSS] Formalizing the release process
+1 on the
-executors 200 --executor-cores 1 --conf
spark.executor.memoryOverhead=4096 --conf
spark.shuffle.service.enabled=true --class
com.test.cdp.reporting.trr.TRREngine
/home/seperiya/transformation-engine.jar
Thanks,
Selva
On Sat, Aug 29, 2020 at 12:55 PM Balaji Varadarajan
wrote:
> Hi Selvaraj,
>
+1. This would be a great contribution as all developers will benefit from
this work.
On Monday, August 31, 2020, 08:07:08 AM PDT, Vinoth Chandar
wrote:
+1 this is a great way to also ramp on the code base
On Sun, Aug 30, 2020 at 8:00 AM Sivabalan wrote:
> As Hudi matures as a
Hi Felix,
For read side performance, we are focussed on adding clustering support
(https://cwiki.apache.org/confluence/display/HUDI/RFC+-+19+Clustering+data+for+speed+and+query+performance)
and consolidated metadata
dating
spark-streaming-kafka artifact from 0.8_2.11/2.12 to 0.10_2.11/2.12.
- *IMPORTANT* This version requires your runtime spark version to be
upgraded to 2.4+.
Thanks,
Selva
On Sat, Aug 29, 2020 at 1:16 AM Balaji Varadarajan
wrote:
> From the hudiLogs.txt, I find only HoodieROTable
From the hudiLogs.txt, I find only HoodieROTablePathFilter-related logs
repeating, which suggests this is the read side. We recommend using the
latest version. I tried 2.3.3 and ran the quickstart without issues. Give it a shot
and let us know if there are any issues.
Balaji.V
On Friday,
M selvaraj periyasamy <
selvaraj.periyasamy1...@gmail.com> wrote:
> Thanks Balaji.
>
> could you please provide more info on how to get it done and pass it to
> hudi?
>
> Thanks,
> Selva
>
> On Fri, Aug 21, 2020 at 12:33 PM Balaji Varadarajan
> wrote:
>
> >
+1 (binding)
1. Ran long-running structured streaming writes on fake data and verified
that compactions and ingestion are happening without errors.
2. Ran both scala and python based quickstart without any errors. There was
an issue in the documented quickstart steps (not in hudi) for python
example.
Thanks for the detailed email, David. We discussed this in last week's
community meeting, and Vinoth had ideas on how to implement this. This is
something that can be supported by the timeline layout that Hudi has. It would
be a new feature (new write operation) that basically appends the
Hi Selvaraj,
Even though the incoming batch has non-null values for the new column, existing
data does not have this column. So you need to make sure the new column is
nullable in the Avro schema and that the schema is backwards compatible.
Balaji.V
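A minimal sketch of the nullable-column advice above, assuming Avro schemas represented as plain Python dicts. The record name `Trip`, the field names, and the helper `added_fields_are_nullable` are hypothetical; the check follows the common Avro convention that an added field is a union with `"null"` listed first and a `null` default, so that older data without the column can still be read.

```python
# Existing schema: data already written does not have the new column.
old_schema = {
    "type": "record",
    "name": "Trip",  # hypothetical record name
    "fields": [{"name": "id", "type": "string"}],
}

# Evolved schema: the new column is a nullable union with a null default.
new_schema = {
    "type": "record",
    "name": "Trip",
    "fields": [
        {"name": "id", "type": "string"},
        {"name": "rider_note", "type": ["null", "string"], "default": None},
    ],
}


def added_fields_are_nullable(old: dict, new: dict) -> bool:
    """Return True if every field added in `new` is a null-first union
    with an explicit null default (hypothetical helper, illustration only)."""
    old_names = {f["name"] for f in old["fields"]}
    for field in new["fields"]:
        if field["name"] in old_names:
            continue
        ftype = field["type"]
        is_null_first_union = isinstance(ftype, list) and ftype[:1] == ["null"]
        has_null_default = "default" in field and field["default"] is None
        if not (is_null_first_union and has_null_default):
            return False
    return True


compatible = added_fields_are_nullable(old_schema, new_schema)
```

A check like this could run before a write to catch a new column that was declared as a plain non-null type.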
On Friday, August 21, 2020, 10:06:40 AM PDT, selvaraj
Welcome Trevor to Hudi community. It looks like you have been added to the
contributor role.
Balaji.V
On Thursday, August 20, 2020, 11:07:47 AM PDT, wowtua...@gmail.com
wrote:
I want to contribute to Apache Hudi.
Would you please give me the permission as a contributor ?
My JIRA
Hi linshan,
Sorry for the delay in responding. It is better to discuss code changes over
a draft PR. Can you open one and tag us there? At a high level, it looks like you
are using Spark Datasource v2 APIs while currently the structured streaming
write is implemented using V1 API. Let's discuss
+1. This should be good to have as an option. If everybody agrees, please go
ahead with RFC and we can discuss details there.
Balaji.V
On Tuesday, August 18, 2020, 04:37:18 PM PDT, Abhishek Modi
wrote:
Hi everyone!
I was hoping to discuss adding support for making `_hoodie_record_key`
Please see answers inline...
On Sunday, July 19, 2020, 10:08:09 PM PDT, Lian Jiang
wrote:
Hi,
I have a kafka topic using a kafka s3 connector to dump data into s3 hourly in
parquet format. These parquet files are partitioned in ingestion time and each
record has fields which are
Gary/Udit,
As you are familiar with this part of it, can you please answer this question?
Thanks,
Balaji.V
On Monday, July 20, 2020, 08:18:16 AM PDT, tanu dua
wrote:
Hi Guys,
May I know how you handle date and timestamp types in Hudi?
When I set DataTypes as Date in StructType it’s
Hi Sivaprakash,
You can configure cleaner to clean the older file versions which contain those
records to be deleted. You can take a look at
https://cwiki.apache.org/confluence/display/HUDI/FAQ#FAQ-WhatdoestheHudicleanerdo
for more details.
Balaji.V
On Friday, July 17, 2020, 07:47:55 AM
Hi Sivaprakash,
Uniqueness of records is determined by the record key you specify to hudi. Hudi
supports filtering out existing records (by record key). By default, it would
upsert all incoming records.
Please look at
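The difference between the default upsert behaviour and filtering out existing records can be sketched in plain Python. This models a table as a dict keyed by record key and is not Hudi's API; the function names `upsert` and `insert_new_only` are hypothetical.

```python
def upsert(table: dict, incoming: list, key_field: str) -> dict:
    """Return a new table in which each incoming record replaces any
    existing record that has the same record key."""
    result = dict(table)
    for record in incoming:
        result[record[key_field]] = record
    return result


def insert_new_only(table: dict, incoming: list, key_field: str) -> dict:
    """Return a new table in which incoming records whose key already
    exists are filtered out and only new keys are added."""
    result = dict(table)
    for record in incoming:
        result.setdefault(record[key_field], record)
    return result


existing = {"k1": {"id": "k1", "value": 1}}
batch = [{"id": "k1", "value": 2}, {"id": "k2", "value": 3}]

upserted = upsert(existing, batch, "id")           # k1 is overwritten
filtered = insert_new_only(existing, batch, "id")  # k1 keeps its old value
```

In both cases the record key alone decides whether two records are "the same", which is why the choice of key determines uniqueness.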
+1
On Monday, June 29, 2020, 5:34 PM, Vinoth Chandar wrote:
+1 as well. (sorry , for jumping in late)
On Sun, Jun 28, 2020 at 11:36 AM Shiyan Xu
wrote:
> Thanks for the +1. Filed https://issues.apache.org/jira/browse/HUDI-1058
>
> On Sat, Jun 27, 2020 at
Hi Mario,
Timeline Server was designed to serve Hudi metadata for Hudi writers and
readers. It may not be suitable for serving arbitrary data. But it is an
interesting thought. Can you elaborate on what kind of business metadata
you are looking for? Is this something you are planning to store
Thanks for using Hudi. Looking at pom definitions between 0.5.1 and 0.5.2, I
don't see any difference that could cause this issue. As it works with 0.5.2, I
am assuming you are not blocked. Let us know otherwise.
Balaji.V
On Wednesday, May 20, 2020, 01:17:08 PM PDT, Lian Jiang
wrote:
Terrific job :) We are marching on !!
Balaji.V
On Tuesday, May 19, 2020, 05:16:57 PM PDT, Sivabalan
wrote:
wow ! 19 binding votes. Great :)
On Tue, May 19, 2020 at 1:55 AM lamber-ken wrote:
>
>
>
> Great job! And good luck to the Apache Hudi project.
>
>
>
>
> Best,
> Lamber-Ken
>
>
> > created, the person holding such office to serve at the direction of the
> > Board of Directors as the chair of the Apache Hudi Project, and to have
> > primary responsibility for management of the projects within the scope of
> > responsib
+1 on Sudha being RM and targeting next release for mid may.
Balaji.V
On 2020/04/23 14:27:46, Vinoth Chandar wrote:
> Thanks all. Encourage everyone to chime in more, so we can make a decision
> here!
>
> On Thu, Apr 23, 2020 at 6:29 AM Sivabalan wrote:
>
> > sounds good. We could go with a
+1. Would also be great if folks sign-up for testing/trying out the master
branch in their real environments
On Wednesday, April 22, 2020, 02:48:13 PM PDT, Bhavani Sudha
wrote:
+1 Sounds like a good idea
On Wed, Apr 22, 2020 at 1:51 PM Vinoth Chandar wrote:
> Just floating a very
+1
On Wednesday, April 22, 2020, 08:35:30 AM PDT, leesf
wrote:
+1
Vinoth Chandar wrote on Wednesday, April 22, 2020 at 2:24 PM:
> +1 from me as well
>
> On Mon, Apr 20, 2020 at 9:37 PM vino yang wrote:
>
> > Hi Raymond,
> >
> > Thanks for opening this discussion.
> >
> > IMHO, as Hudi's user base grows,
e. Let me know
> your thoughts. It would be good to nail other details like whether/how to
> deal with external index management with this API.
> Thanks,
> Balaji.V
> On Thursday, April 16, 2020, 10:46:19 AM PDT, Balaji Varadarajan
> wrote:
>
>
> +1 from me. This is a really cool fe
+1 from me. This is a really cool feature.
Yes, A new file slice (empty parquet) is indeed generated for every file group
in a partition.
Regarding cleaning these "empty" file slices eventually by cleaner (to avoid
cases where there are too many of them lying around) in a safe way, we can
Congratulations Sudha :) Well deserved. Welcome to PPMC.
Balaji.V
On Tuesday, April 7, 2020, 03:04:37 PM PDT, Gary Li
wrote:
Congrats Sudha! Appreciated all the work you have done!
On Tue, Apr 7, 2020 at 2:57 PM Y Ethan Guo wrote:
> Congrats!!!
>
> On Tue, Apr 7, 2020 at 2:55 PM
Many Congratulations Lamber-Ken. Well deserved !!
Balaji.V
On Tuesday, April 7, 2020, 02:23:51 PM PDT, Y Ethan Guo
wrote:
Congrats!!!
On Tue, Apr 7, 2020 at 2:22 PM Gary Li wrote:
> Congrats lamber! Well deserved!
>
> On Tue, Apr 7, 2020 at 2:18 PM Vinoth Chandar wrote:
>
> >
Agree. The triaging process makes sense to me.
Balaji.V
On Monday, April 6, 2020, 09:54:24 AM PDT, Vinoth Chandar
wrote:
Hi,
I feel there are couple of action items here..
a) JIRA to track work for slack-ML integration
b) Document the support triaging process : Slack (level 1) ->
> > proceeding". So probably the embedded timeline server can recreate the
> view
> > next time it comes back up?
> >
> > Thanks
> > Prashant
> >
> >
> > On Wed, Mar 18, 2020 at 11:37 AM Balaji Varadarajan
> > wrote:
>
With 0.5.1, the key-generator classes are relocated to org.apache.hudi.keygen.
You can find the information in release notes in
https://hudi.incubator.apache.org/releases.html#release-051-incubating-docs
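As a sketch of what the relocation means for writer configuration, the snippet below uses the Spark datasource option `hoodie.datasource.write.keygenerator.class` with `SimpleKeyGenerator` as the example class. The helper `relocate_keygen_class` is hypothetical and assumes only the package move described above (from `org.apache.hudi` to `org.apache.hudi.keygen`).

```python
OLD_PREFIX = "org.apache.hudi."
NEW_PREFIX = "org.apache.hudi.keygen."


def relocate_keygen_class(class_name: str) -> str:
    """Rewrite a pre-0.5.1 key-generator class name to its new package.

    Hypothetical helper: only top-level org.apache.hudi classes whose
    simple name ends in "KeyGenerator" are rewritten; anything else
    (including custom classes) is returned unchanged.
    """
    if class_name.startswith(NEW_PREFIX):
        return class_name
    if class_name.startswith(OLD_PREFIX) and class_name.endswith("KeyGenerator"):
        simple_name = class_name[len(OLD_PREFIX):]
        if "." not in simple_name:
            return NEW_PREFIX + simple_name
    return class_name


# Writer options as they might have looked before upgrading to 0.5.1.
old_options = {
    "hoodie.datasource.write.keygenerator.class": "org.apache.hudi.SimpleKeyGenerator",
}

# Same options with the key-generator class pointing at the new package.
new_options = {
    key: relocate_keygen_class(value) for key, value in old_options.items()
}
```

Jobs that set this option explicitly need the new class name after upgrading; jobs relying on the default are unaffected by this sketch.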
Balaji.V
On Saturday, March 21, 2020, 01:47:48 PM PDT, FO O
wrote:
Hi,
When
Prashanth,
I think we should not be reverting clean operations here. Cleans are done on
the oldest file slices and a restore/rollback is not completely undoing the
work of clean that happened before it.
For incremental timeline syncing, embedded timeline server needs to read these
clean
+1 on Vinoth's suggestion to wait until the lower level (write-client) is
refactored and reorganized first. We can then look at Data-Source and
DeltaStreamer to determine how best to organize them.
Balaji.V
On Sunday, March 8, 2020, 11:06:13 PM PDT, Vinoth Chandar
wrote:
>> make
+1 on cutting the branch.
Vino, let us know in this thread if you run into any problems in the release
process.
Balaji.V
On Saturday, February 29, 2020, 9:19 AM, Vinoth Chandar
wrote:
Great! Can we cut the release candidate branch 0.5.2 right away so that
Awesome, Pratyaksh. Would you mind opening a PR to document it?
Balaji.V
On Wednesday, February 26, 2020, 11:14 PM, Pratyaksh Sharma
wrote:
Hi,
I figured out the issue yesterday. Thank you for helping me out.
On Thu, Feb 27, 2020 at 12:01 AM
+1. Let's do it :)
Balaji.V
On Mon, Feb 24, 2020 at 6:36 PM Shiyan Xu
wrote:
> +1 great reading and values!
>
> On Mon, 24 Feb 2020, 15:31 nishith agarwal, wrote:
>
> > +100
> > - Reduces index lookup time hence improves job runtime
> > - Paves the way for streaming style ingestion
> > -
See if you can have a generic implementation where individual fields in the
partition-path can be configured with their own key-generator class. Currently,
TimestampBasedKeyGenerator is the only type specific custom generator. If we
are anticipating more such classes for specialized types,
> > So if I'm not wrong, the code will be marking all partitions which got
> > UPDATE data for partition update. Hence time consuming.
> >
> > Regards,
> > Purushotham Pushpavanth
> >
> >
> >
> > On Mon, 20 Jan 2020 at 08:58, Balaji Varadaraja
+1 as well. Looks great.
Balaji.V
On Thursday, January 23, 2020, 08:17:47 AM PST, Vinoth Chandar
wrote:
Looks good . +1 !
On Wed, Jan 22, 2020 at 11:44 PM lamberken wrote:
>
>
> Hello everyone,
>
>
> I redrew the Hudi data lake architecture diagram on the landing page. If you
> have
+1 (binding)
Ran the following validation steps:
1. Checked out RC candidate source code and compiled successfully
2. Ran Apache Hudi quickstart steps successfully on 0.5.1-rc1
3. Ran long-running deltastreamer test for half a day without any
exceptions.
4. Compliance : Ran
-depth=immediates*
leesf wrote on Tuesday, January 21, 2020 at 3:07 PM:
> Hi balaji,
>
> I could not find a way to create a folder under dev/incubator/hudi.
> Do I lack permissions? Please advise. Thanks.
>
> Balaji Varadarajan wrote on Tuesday, January 21, 2020 at 2:14 PM:
>
>>
>> Hi Leesf,
>> TH
Hi Leesf,
The staging directories are intentionally empty. The directories corresponding
to the 0.5.0-incubating release were deleted from the staging directory as the last
step of the release. You can create a folder "0.5.1-incubating" under
dev/incubator/hudi and add the source release tarballs
Hi Purushotham,
I am unable to reproduce locally the same partitions getting hive-synced. Can you
add the following log message in HoodieHiveClient.java, run the code, and
send us the logs?
diff --git a/hudi-hive/src/main/java/org/apache/hudi/hive/HoodieHiveClient.java
+1 Sunday should give breathing space to fix the blockers.
Balaji.V
On Wednesday, January 15, 2020, 06:50:28 AM PST, Vinoth Chandar
wrote:
+1 from me. I feel sunday is good in general, because the weekend gives
enough time for taking care of last minute things
On Wed, Jan 15, 2020 at
IIUC, this would look like a digest email summarizing discussion threads, jira
and PR activities.
+1
Balaji.V
On Sunday, January 5, 2020, 07:49:22 AM PST, leesf
wrote:
Hi all,
As Hudi attracts more attention recently and the community is developing
quickly as more and more
Added your id. Looking forward to your contributions :) Welcome !!
Balaji.V
On Thursday, January 2, 2020, 05:44:51 PM PST, 谢雄
wrote:
Hi,
I want to contribute to Apache Hudi.
Would you please give me the contributor permission?
My JIRA ID is helloteddy.
+1 Thanks for doing this Vinoth. Covers all aspects of contribution in
detail. Big +1 to code/RFC review etiquettes.
Balaji.V
On Sat, Dec 28, 2019 at 7:20 PM vino yang wrote:
> Hi Vinoth,
>
> big +1 from my side.
>
> Thanks for spending time improving the contribution guidelines.
>
> It looks
t the plan into multiple subtasks?
Thanks,
Nicholas
At 2019-12-14 00:18:12, "Vinoth Chandar" wrote:
>+1 (per asf policy)
>
>+100 per my own excitement :) .. Happy to review this!
>
>On Fri, Dec 13, 2019 at 3:07 AM Balaji Varadarajan
>wrote:
>
>> With Apache Hud
Thanks Shahidha for the quick response.
Pratyaksh, I am ok with making the behavior consistent with other Key
generators. Please go ahead and submit a PR.
Thanks,
Balaji.V
On Thu, Dec 12, 2019 at 10:34 PM Pratyaksh Sharma
wrote:
> Hi Shahida,
>
> Thank you for the clarification. Actually I
With Apache Hudi growing in popularity, one of the fundamental challenges
for users has been about efficiently migrating their historical datasets to
Apache Hudi. Apache Hudi maintains per record metadata to perform core
operations such as upserts and incremental pull. To take advantage of
Hudi’s
Hello all,
In the spirit of making Apache Hudi (incubating) releases at regular
cadence, we are starting this thread to kickstart the planning and
preparatory work for next release (0.5.1).
As discussed in yesterday's meeting, the current plan is to have a release
by end of Jan 2020.
As
I have cancelled the weekly (9 pm PST) meeting just now. I guess many of us
are traveling or on vacation. We will meet next week at the same time.
Balaji.V
Hi Gurudatt,
From the stack-trace, it looks like you are using CombineInputFormat as
your default input format for the hive session. If your intention is to
use combined input format, can you instead try setting default (set
hive.input.format=) to
I updated the FAQ section to set defaults correctly and add more
information related to this :
https://cwiki.apache.org/confluence/display/HUDI/FAQ#FAQ-WhatdoestheHudicleanerdo
The cleaner retention configuration is based on counts (number of commits
to be retained) with the assumption that users
+1 on the exporter tool idea.
On Mon, Nov 11, 2019 at 10:36 PM vino yang wrote:
> Hi Shiyan,
>
> +1 for this proposal, Also, it looks like an exporter tool.
>
> @Vinoth Chandar Any thoughts about where to place it?
>
> Best,
> Vino
>
> Vinoth Chandar wrote on Tuesday, November 12, 2019 at 8:58 AM:
>
> > We can
Regarding (1), as the exception is happening inside the parquet reader
(outside Hudi), can you use Spark 2.3 (instead of Spark 2.4, which brings
in a particular version of avro/parquet) to create and ingest a brand new
dataset and try it out? This would hopefully help isolate the issue.
Regarding (2),
Agree with all 3 changes. The naming now looks more consistent than
earlier. +1 on them
Depending on whether we are renaming Input formats for (1) and (2) - this
could require some migration steps for
Balaji.V
On Mon, Nov 11, 2019 at 7:38 PM vino yang wrote:
> Hi Vinoth,
>
> Thanks for
+1. This would be a powerful feature which would open up use-cases
requiring repeatable query results.
Balaji.V
On Mon, Nov 11, 2019 at 8:12 AM nishith agarwal wrote:
> Folks,
>
> Starting a discussion thread for enabling time-travel for Hudi datasets.
> Please provide feedback on the RFC
Brandon,
Great initiative and thoughts. Thanks for writing a detailed description of
what you are looking to achieve.
Here are some of my comments/thoughts:
1. HUDI-326 : There is some work that is happening in this direction.
But, we should be able to collaborate on this. Siva has opened
Hello Apache Hudi Community,
The Podling Project Management Committee (PPMC) for Apache Hudi
(Incubating) has invited Bhavani Sudha Saktheeswaran to become a committer
and we are
pleased to announce that she has accepted.
Bhavani Sudha has made a great impact by fixing critical issues in Hudi,
Thanks Sudha. The following times work for me :
Mon, Tue, Thursday - 9 p.m to 12 a.m PST
Wed - 5:00 to 6:00 am and 9:30 p.m to 12 a.m PST
On Wed, Nov 6, 2019 at 12:31 PM Vinoth Chandar wrote:
> Interested.
>
> Mon-Thu 5AM-6:30AM PST
> Mon-Thu 9PM-10:30PM PST
>
>
> On Wed, Nov 6, 2019 at
I have a different opinion on this. Usually, in production deployments
(at least whatever I am aware of), database is generally managed at the
org/group level. Privacy policies like ACLs are usually done at database
level and would need first level management by admins. With such a setup,
its