nrlm1 commented on issue #10863:
URL: https://github.com/apache/hudi/issues/10863#issuecomment-1996220054
Thank you for looking into this issue. We are not specifying any operation;
the expectation is to use the default "upsert".
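The default-operation question in this thread can be sketched as a tiny resolver (illustrative helper, not Hudi source; "write.operation" is the Flink SQL key and "hoodie.datasource.write.operation" the Spark datasource key, both defaulting to upsert):

```python
# Illustrative sketch: how a writer might resolve the operation type
# when the user does not set one, falling back to the default "upsert".
DEFAULT_OPERATION = "upsert"

def resolve_operation(options: dict) -> str:
    """Return the configured write operation, or the default if unset."""
    return options.get(
        "write.operation",                              # Flink SQL key
        options.get("hoodie.datasource.write.operation",  # Spark key
                    DEFAULT_OPERATION))

print(resolve_operation({}))                             # -> upsert
print(resolve_operation({"write.operation": "insert"}))  # -> insert
```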
--
This is an automated message from the Apache Git Service.
To respond
danny0405 commented on issue #10863:
URL: https://github.com/apache/hudi/issues/10863#issuecomment-1996190248
Did you use `upsert` as the operation name or just `insert` ?
ennox108 opened a new issue, #10863:
URL: https://github.com/apache/hudi/issues/10863
I am trying to run a Flink job to get data from SQL server to S3.
I am doing offline compaction, but whenever it is triggered I end up with
fewer records than before the compaction. Based on the com
joshhamann commented on issue #10822:
URL: https://github.com/apache/hudi/issues/10822#issuecomment-1983870364
You can see the timestamps in the above screenshots from the Spark UI if
that works. For instance, the test job, which is processing more data, goes
from around 23:18 to 23:23 (an
ad1happy2go commented on issue #10822:
URL: https://github.com/apache/hudi/issues/10822#issuecomment-1983745782
@joshhamann That's the correct understanding. If we are not using global
bloom and your incremental dataset only had data from very few partitions,
then the index lookup stage w
joshhamann commented on issue #10822:
URL: https://github.com/apache/hudi/issues/10822#issuecomment-1981391484
Here is my configuration:
{'hoodie.table.name': 'analytics_events',
'hoodie.datasource.write.recordkey.field': 'event_uuid',
'hoodie.datasource.write.partitionpath.field':
ad1happy2go commented on issue #10823:
URL: https://github.com/apache/hudi/issues/10823#issuecomment-1980552281
@ShrutiBansal309 I was able to reproduce this issue. It occurs even when we
just try to read this table.
JIRA - https://issues.apache.org/jira/browse/HUDI-7485
Reproducible C
ad1happy2go commented on issue #10822:
URL: https://github.com/apache/hudi/issues/10822#issuecomment-1980135128
@joshhamann Can you please provide the writer configuration to look into
this more.
If you are using the upsert operation type, the load to a new Hudi table will
be expected to
ShrutiBansal309 opened a new issue, #10823:
URL: https://github.com/apache/hudi/issues/10823
**Issue**
I am using Hudi 0.14.0 and Spark 3.4.0 on EMR cluster 6.15.0.
I have a service that writes a Dataset to a table in Hudi located on
S3. I am facing issues when trying to delete data fr
joshhamann opened a new issue, #10822:
URL: https://github.com/apache/hudi/issues/10822
**Describe the problem you faced**
We have a production transform job using AWS Glue version 4.0, Hudi version
0.12.1 that loads data into a Hudi table on S3. At some point, this job
started taking
CTTY commented on issue #10415:
URL: https://github.com/apache/hudi/issues/10415#issuecomment-1977660648
This looks similar to this issue: https://github.com/apache/hudi/issues/7487
where a user ran into S3 throttling due to too many S3 calls.
Was wondering if you could check if the
chenbodeng719 commented on issue #5777:
URL: https://github.com/apache/hudi/issues/5777#issuecomment-1970657108
> @chenbodeng719 Can you please create a new issue with all the details
about hudi/spark versions and steps to reproduce. Thanks.
ok
ad1happy2go commented on issue #5777:
URL: https://github.com/apache/hudi/issues/5777#issuecomment-1970644646
@chenbodeng719 Can you please create a new issue with all the details about
hudi/spark versions and steps to reproduce. Thanks.
chenbodeng719 commented on issue #5777:
URL: https://github.com/apache/hudi/issues/5777#issuecomment-1970368618
@nsivabalan I have the same issue. Below is my Flink Hudi config.
```
CREATE TABLE hudi_sink(
new_uid STRING PRIMARY KEY NOT ENFORCED,
```
Toroidals opened a new issue, #10779:
URL: https://github.com/apache/hudi/issues/10779
**_Tips before filing an issue_**
- Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)? yes
- Join the mailing list to engage in conversations and get faster support at
dev-
danny0405 commented on issue #10754:
URL: https://github.com/apache/hudi/issues/10754#issuecomment-1963209068
you are right, we should encode the partition path for these special
characters.
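The fix described above can be sketched with Python's `urllib.parse`: percent-encoding the partition value keeps "/" from being interpreted as a path separator (a minimal sketch of the idea, not Hudi's actual encoder):

```python
# Sketch: percent-encode special characters in partition values so a value
# like "2024/01/01" cannot be split into extra directory levels.
from urllib.parse import quote, unquote

def encode_partition_value(value: str) -> str:
    # safe="" so "/" is encoded too (quote() leaves "/" alone by default)
    return quote(value, safe="")

encoded = encode_partition_value("2024/01/01")
print(encoded)           # 2024%2F01%2F01
print(unquote(encoded))  # round-trips back to 2024/01/01
```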
eshu opened a new issue, #10754:
URL: https://github.com/apache/hudi/issues/10754
When the partition column contains the slash character ("/"), Hudi may
write the data incorrectly or fail to read it back.
Test (I use some helpers to write and read Hudi data; they write data
t
yihua commented on issue #10566:
URL: https://github.com/apache/hudi/issues/10566#issuecomment-1962825044
test
yihua commented on issue #10566:
URL: https://github.com/apache/hudi/issues/10566#issuecomment-1962806613
test
alberttwong closed issue #10695: [SUPPORT] Hudi wants to write the database in
s3://datalake
URL: https://github.com/apache/hudi/issues/10695
ad1happy2go commented on issue #10566:
URL: https://github.com/apache/hudi/issues/10566#issuecomment-1919591261
@CTTY I was trying to reproduce this issue, but got into some other setup
issue. Will get back to you soon on this.
ad1happy2go commented on issue #10112:
URL: https://github.com/apache/hudi/issues/10112#issuecomment-1919292489
@zyclove Did you get a chance to try this? Did this PR fix your issue?
Please share your insights here. Thanks in advance.
ad1happy2go commented on issue #10415:
URL: https://github.com/apache/hudi/issues/10415#issuecomment-1919021038
Thanks for trying @ergophobiac. @CTTY any insights here ?
ad1happy2go commented on issue #10458:
URL: https://github.com/apache/hudi/issues/10458#issuecomment-1918953387
I will work on updating the docs. Thanks @stayrascal
stayrascal commented on issue #10458:
URL: https://github.com/apache/hudi/issues/10458#issuecomment-1911319106
The official document should change the value of 'cdc.enabled' to 'false',
or change the value of 'table.type' to 'COPY_ON_WRITE', because only COW tables
support cdc mode for Flin
CTTY opened a new issue, #10566:
URL: https://github.com/apache/hudi/issues/10566
**_Tips before filing an issue_**
- Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)?
- Join the mailing list to engage in conversations and get faster support at
dev-subscr...
soumilshah1995 closed issue #10499: [SUPPORT] Hudi DeltaStreamer with
Flattening Transformer
URL: https://github.com/apache/hudi/issues/10499
soumilshah1995 commented on issue #10499:
URL: https://github.com/apache/hudi/issues/10499#issuecomment-1909122022
I would need some time to play with the flattening transformer;
need to set up a test project to see if it works.
Let me close this and reopen it later again as I would be doing th
ad1happy2go commented on issue #10499:
URL: https://github.com/apache/hudi/issues/10499#issuecomment-1907473634
@soumilshah1995 Let us know your findings and in case you need any help.
Thanks.
ad1happy2go commented on issue #10507:
URL: https://github.com/apache/hudi/issues/10507#issuecomment-1904404899
@zeeshan-media If I understand you correctly, with column stats you got
properly sized files, but with RLI you are getting small files. Can you message
me on Slack when you see this; we c
LIKE-HUB opened a new issue, #10543:
URL: https://github.com/apache/hudi/issues/10543
**_Tips before filing an issue_**
- Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)?
- Join the mailing list to engage in conversations and get faster support at
dev-subsc
zeeshan-media commented on issue #10507:
URL: https://github.com/apache/hudi/issues/10507#issuecomment-1897991450
Yes, it was 7.2 MB each. I was using COW mode. I had not used faker data
for that purpose; it was authentic data with 3 million records. The
configurations were the same, only d
soumilshah1995 commented on issue #10499:
URL: https://github.com/apache/hudi/issues/10499#issuecomment-1897068762
Let me get back to this issue after some more tries;
want to try out a few things.
ad1happy2go commented on issue #10507:
URL: https://github.com/apache/hudi/issues/10507#issuecomment-1896142856
@zeeshan-media Just to be sure, the data files are 7.2 MB each? The number of
record keys will affect the record_index size, not the data size.
Ideally small files should merge and c
soumilshah1995 commented on issue #8894:
URL: https://github.com/apache/hudi/issues/8894#issuecomment-1894700142
Hey buddy,
it depends on how you have partitioned your tables. If you have partitioned
tables hive-style,
state='Connecticut' should work.
Let's connect on Slack for
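The hive-style partition selector mentioned above can be illustrated with a tiny sketch (hypothetical helpers, not a Hudi API): hive-style layouts name partition directories `column=value`, so a delete-partitions list has to match that exact form:

```python
# Sketch: hive-style partition paths are "column=value" directory names;
# a delete-partitions request must use the same form to match.
def hive_partition_path(column: str, value: str) -> str:
    return f"{column}={value}"

def partitions_to_delete(existing: list, column: str, values: list) -> list:
    """Return existing partition paths matching the requested values."""
    wanted = {hive_partition_path(column, v) for v in values}
    return [p for p in existing if p in wanted]

existing = ["state=Connecticut", "state=Texas", "state=Maine"]
print(partitions_to_delete(existing, "state", ["Connecticut"]))
# ['state=Connecticut']
```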
007vedant commented on issue #8894:
URL: https://github.com/apache/hudi/issues/8894#issuecomment-1894364961
Hi @soumilshah1995
I have a use case of deleting specific partitions from my Hudi table. I
verified that the deletion works when I just specify the list of partitions to
be deleted
zeeshan-media commented on issue #10507:
URL: https://github.com/apache/hudi/issues/10507#issuecomment-1894111848
@ad1happy2go each file is 7 MB. I used Amazon EMR PySpark version
3.4.1.
ad1happy2go commented on issue #10507:
URL: https://github.com/apache/hudi/issues/10507#issuecomment-1893959075
@zeeshan-media In the first write, what is the size of those 30 files? The
number of files should not depend on RLI anyway. Ideally small file handling
should take place in upser
zeeshan-media commented on issue #10507:
URL: https://github.com/apache/hudi/issues/10507#issuecomment-1893874309
@ad1happy2go by parquet table I meant the Hudi output directory, as it is in
parquet format. My output Hudi directory is 210 MB of data, which contains
30 small files. Record
ad1happy2go commented on issue #10507:
URL: https://github.com/apache/hudi/issues/10507#issuecomment-1893862799
@zeeshan-media Hudi upsert path involves tagging of existing data to find
out which records are updated, which will not use RECORD_INDEX (doesn't matter
as there is no existing da
zeeshan-media commented on issue #10507:
URL: https://github.com/apache/hudi/issues/10507#issuecomment-1893778705
@ad1happy2go does it mean that the first time we run the job, the
record index will not be used, because it is creating 30 files (7 MB each) for
approximately 210 MB (3 mil
ad1happy2go commented on issue #10507:
URL: https://github.com/apache/hudi/issues/10507#issuecomment-1893742984
@zeeshan-media Thanks for raising this.
I tried the code and realised that the first time data is written to
an empty table, it gives this warning as record_index is not
zeeshan-media opened a new issue, #10507:
URL: https://github.com/apache/hudi/issues/10507
### Problem Detail:
I am trying the Hudi record index on my machine. Although my pyspark job runs
smoothly and data is written, along with creation of the record_index file in
Hudi's metadata table, it
soumilshah1995 commented on issue #10499:
URL: https://github.com/apache/hudi/issues/10499#issuecomment-1892837784
The following works:
```
spark-submit \
--class org.apache.hudi.utilities.streamer.HoodieStreamer \
--packages org.apache.hudi:hudi-spark3.4-bundle_2.12:0.14.
```
soumilshah1995 commented on issue #10499:
URL: https://github.com/apache/hudi/issues/10499#issuecomment-1892807362
Also tried:
```
spark-submit \
--class org.apache.hudi.utilities.streamer.HoodieStreamer \
--packages org.apache.hudi:hudi-spark3.4-bundle_2.12:0.14.0
```
soumilshah1995 opened a new issue, #10499:
URL: https://github.com/apache/hudi/issues/10499
Hello, I'm currently experimenting with the Hudi delta streamer and working
on creating part 12 of the delta streamer playlist. For the next video, my goal
is to cover the Hudi SQL-based transformer
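As a hedged sketch of where the truncated `spark-submit` invocations in this thread are heading, HoodieStreamer takes its transformer via `--transformer-class`; the jar path, base path, and table name below are placeholders, not values from the thread:

```python
# Sketch: assembling HoodieStreamer arguments with the flattening
# transformer. Only the class names come from the thread; the jar path,
# target path, and table name are placeholders.
transformer = "org.apache.hudi.utilities.transform.FlatteningTransformer"
cmd = [
    "spark-submit",
    "--class", "org.apache.hudi.utilities.streamer.HoodieStreamer",
    "--packages", "org.apache.hudi:hudi-spark3.4-bundle_2.12:0.14.0",
    "hudi-utilities.jar",                       # placeholder jar
    "--table-type", "COPY_ON_WRITE",            # placeholder table type
    "--target-base-path", "s3://bucket/table",  # placeholder path
    "--target-table", "demo_table",             # placeholder name
    "--transformer-class", transformer,
]
print(" ".join(cmd))
```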
ergophobiac commented on issue #10415:
URL: https://github.com/apache/hudi/issues/10415#issuecomment-1886257556
Hello @ad1happy2go ,
We ran a test with the same configurations, just one addition:
spark.hadoop.fs.s3a.connection.maximum=2000. (We found a resource saying the
default on EMR
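The tuning described above can be sketched as a small helper (illustrative, not a Spark API) that raises the S3A connection pool ceiling to at least the tested value; 2000 is the number from this comment, not a general recommendation:

```python
# Sketch: ensure fs.s3a.connection.maximum is at least a given floor,
# since a too-small pool can throttle concurrent Hudi S3 operations.
def with_min_s3a_connections(conf: dict, minimum: int = 2000) -> dict:
    """Return a copy of conf with the S3A pool raised to at least `minimum`."""
    key = "spark.hadoop.fs.s3a.connection.maximum"
    out = dict(conf)
    out[key] = str(max(int(conf.get(key, "0")), minimum))
    return out

print(with_min_s3a_connections({}))
```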
ad1happy2go commented on issue #10458:
URL: https://github.com/apache/hudi/issues/10458#issuecomment-1884412880
Created a JIRA to track - https://issues.apache.org/jira/browse/HUDI-7287
ad1happy2go commented on issue #10458:
URL: https://github.com/apache/hudi/issues/10458#issuecomment-1880868263
@nicholasxu Thanks for raising this. I am also getting this error while
querying with 'read.streaming.enabled' and 'cdc.enabled' set to true. Normal
reads are running fine. We will
ergophobiac commented on issue #10415:
URL: https://github.com/apache/hudi/issues/10415#issuecomment-1874167209
Hey @ad1happy2go, we have a test case running, we'll observe till we're sure
it's stable and let you know how it turns out.
ad1happy2go commented on issue #10415:
URL: https://github.com/apache/hudi/issues/10415#issuecomment-1874092759
@ergophobiac Did you get a chance to try this out?
ad1happy2go commented on issue #10415:
URL: https://github.com/apache/hudi/issues/10415#issuecomment-1870277355
@ergophobiac Are you setting fs.s3a.connection.maximum to a higher value?
Can you try increasing it?
ergophobiac opened a new issue, #10415:
URL: https://github.com/apache/hudi/issues/10415
**Describe the problem you faced**
Stack: Hudi 0.13.1, EMR 6.13.0, Spark 3.4.1
We are writing to an MOR table in S3, using a Spark Structured Streaming job
on EMR. Once this job has run for a
parisni closed issue #10402: [SUPPORT] Hudi 0.14.1-rc1 has trouble with spark
3.2
URL: https://github.com/apache/hudi/issues/10402
parisni commented on issue #10402:
URL: https://github.com/apache/hudi/issues/10402#issuecomment-1867821807
> It worked fine for me.
Good to know, sorry for the inconvenience.
> Can you confirm if scala version is same for your spark installation and
hudi is same.
Yes it's sc
ad1happy2go commented on issue #10402:
URL: https://github.com/apache/hudi/issues/10402#issuecomment-1867801673
@parisni It worked fine for me. Can you confirm that the Scala version is the
same for your Spark installation and Hudi.
https://github.com/apache/hudi/assets/63430370/93cea61
parisni commented on issue #10402:
URL: https://github.com/apache/hudi/issues/10402#issuecomment-1867662880
@nsivabalan as the release manager for 0.14.1 maybe ?
parisni opened a new issue, #10402:
URL: https://github.com/apache/hudi/issues/10402
```python
# spark-3.2.4-bin-hadoop3.2/bin/pyspark --jars
/projects/hudi/packaging/hudi-spark-bundle/target/hudi-spark3.2-bundle_2.12-0.14.1-rc1.jar
--conf 'spark.serializer=org.apache.spark.serializer.Kr
```
LIKE-HUB opened a new issue, #10400:
URL: https://github.com/apache/hudi/issues/10400
**_Tips before filing an issue_**
- Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)?
yes
- Join the mailing list to engage in conversations and get faster support at
dev-su
zyclove commented on issue #10359:
URL: https://github.com/apache/hudi/issues/10359#issuecomment-1863737221
> @zyclove Have you started seeing this issue after hudi upgrade only? Not
sure if it's related to hudi.
Yes, Hudi 0.12.3 works OK, but after upgrading to 0.14.x we see this issue.
ad1happy2go commented on issue #10138:
URL: https://github.com/apache/hudi/issues/10138#issuecomment-1863070904
Sorry for the delay here. @abhisheksahani91 Check out the nice blog from
@nsivabalan on the timeline server:
https://medium.com/@simpsons/timeline-server-in-apache-hudi-b5be25f85e47
ad1happy2go commented on issue #10359:
URL: https://github.com/apache/hudi/issues/10359#issuecomment-1862289560
@zyclove Have you started seeing this issue only after the Hudi upgrade? Not
sure if it's related to Hudi.
zyclove commented on issue #10359:
URL: https://github.com/apache/hudi/issues/10359#issuecomment-1861986296
https://issues.apache.org/jira/browse/HDFS-8429
https://issues.apache.org/jira/browse/HADOOP-11333
It seems like an HDFS bug.
Does the config net.core.wmem_default help?
zyclove commented on issue #10359:
URL: https://github.com/apache/hudi/issues/10359#issuecomment-1861977466
```
Thread 11806: (state = IN_NATIVE_TRANS)
- org.apache.hadoop.net.unix.DomainSocketWatcher.doPoll0(int,
org.apache.hadoop.net.unix.DomainSocketWatcher$FdSet) @bci=0 (Interpret
```
zyclove opened a new issue, #10359:
URL: https://github.com/apache/hudi/issues/10359
After upgrading to community version 0.14.1, there is still a
chance that the task gets stuck. The YARN task has completed, yet the task
cannot be viewed through yarn app -list. It requires
khajaasmath786 opened a new issue, #10356:
URL: https://github.com/apache/hudi/issues/10356
Here is the intermittent error we face in our current Hudi data pipeline with
Apache Spark.
Traceback (most recent call last):
File
"/mnt/tmp/spark-d5dc3d59-8086-4598-b0f8-345b495e8dd1/baxaws-e
young138120 opened a new issue, #10320:
URL: https://github.com/apache/hudi/issues/10320
**Describe the problem you faced**
I run spark job to write data to hudi, and init spark session like this:
![image](https://github.com/apache/hudi/assets/11519151/37f69790-5cbd-44b4-94be-f2613e71f
Amar1404 closed issue #10311: [SUPPORT]
URL: https://github.com/apache/hudi/issues/10311
Amar1404 opened a new issue, #10311:
URL: https://github.com/apache/hudi/issues/10311
**_Tips before filing an issue_**
- Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)?
- Join the mailing list to engage in conversations and get faster support at
dev-subsc
soumilshah1995 closed issue #8565: [SUPPORT] Hudi Bootstrap with METADATA_ONLY
with Hive Sync Fails on EMR Serverlkess 6.10
URL: https://github.com/apache/hudi/issues/8565
soumilshah1995 closed issue #8400: [SUPPORT] Hudi Offline Compaction in EMR
Serverless 6.10 for YouTube Video
URL: https://github.com/apache/hudi/issues/8400
abhisheksahani91 commented on issue #10270:
URL: https://github.com/apache/hudi/issues/10270#issuecomment-1849503205
@ad1happy2go
I have scaled the infra, and compaction execution time has been reduced from
20 minutes to 10.
But I have one doubt: on every compaction, the number of
abhisheksahani91 commented on issue #10270:
URL: https://github.com/apache/hudi/issues/10270#issuecomment-1847465601
@ad1happy2go
The way we conducted the performance test for Hudi in our pre-production
environment is as follows:
1. Bootstrapping the table: We ingested data over K
abhisheksahani91 commented on issue #10270:
URL: https://github.com/apache/hudi/issues/10270#issuecomment-1845776273
@ad1happy2go
1. Are you setting any additional spark configurations:
No, I am not setting any additional spark configuration.
2. Total size of the table
ad1happy2go commented on issue #10270:
URL: https://github.com/apache/hudi/issues/10270#issuecomment-1845540298
@abhisheksahani91 Are you setting any additional Spark configurations? How
much data do you have in the table? Can you check the compaction timeline file
and see how many file groups
abhisheksahani91 commented on issue #10270:
URL: https://github.com/apache/hudi/issues/10270#issuecomment-1844997033
Please help with this as this is impacting our production pipeline.
abhisheksahani91 opened a new issue, #10270:
URL: https://github.com/apache/hudi/issues/10270
**_Tips before filing an issue_**
- Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)?
- Join the mailing list to engage in conversations and get faster support at
d
zyclove commented on issue #10235:
URL: https://github.com/apache/hudi/issues/10235#issuecomment-1838182121
SparkMetadataTableRecordIndex
fileGroupSize =
hoodieTable.getMetadataTable().getNumFileGroupsForPartition(MetadataPartitionType.RECORD_INDEX);
Why not 512 fileGroupSi
zyclove commented on issue #10235:
URL: https://github.com/apache/hudi/issues/10235#issuecomment-1838138200
@danny0405 After setting hoodie.metadata.enable=true, it now uses RECORD_INDEX.
But the following stage is still very, very slow.
![image](https://github.com/apache/hudi/assets/15028279/fa2
danny0405 commented on issue #10235:
URL: https://github.com/apache/hudi/issues/10235#issuecomment-1837959640
hoodie.metadata.table -> hoodie.metadata.enable
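The one-line correction above can be captured as a key-rename sketch (an illustrative helper; only the mapping from the wrong key to `hoodie.metadata.enable` comes from the comment):

```python
# Sketch: rewrite a mistyped config key to the correct one.
# "hoodie.metadata.table" is not a valid key; "hoodie.metadata.enable" is.
KEY_FIXES = {"hoodie.metadata.table": "hoodie.metadata.enable"}

def fix_config_keys(conf: dict) -> dict:
    """Return conf with any known-wrong keys renamed to their correct form."""
    return {KEY_FIXES.get(k, k): v for k, v in conf.items()}

print(fix_config_keys({"hoodie.metadata.table": "true"}))
# {'hoodie.metadata.enable': 'true'}
```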
zyclove commented on issue #10235:
URL: https://github.com/apache/hudi/issues/10235#issuecomment-1837949851
@danny0405 why is it back to GLOBAL_SIMPLE?
![image](https://github.com/apache/hudi/assets/15028279/20107e0d-46eb-4e28-9a5a-0fc8750cbc34)
23/12/04 14:39:29 WARN SparkMetadataTa
zyclove commented on issue #10235:
URL: https://github.com/apache/hudi/issues/10235#issuecomment-1837946117
@danny0405
why is it back to GLOBAL_SIMPLE?
https://github.com/apache/hudi/assets/15028279/9cddf011-e25c-4c0f-9b40-c2d7fdd17cf9
23/12/04 14:39:29 WARN SparkMetadataTableRe
zyclove opened a new issue, #10235:
URL: https://github.com/apache/hudi/issues/10235
**Describe the problem you faced**
The Spark job is too slow in the following stage. Adjusting CPU, memory, and
concurrency has no effect.
Which stage can be optimized or skipped?
![image](ht
XenosK opened a new issue, #10204:
URL: https://github.com/apache/hudi/issues/10204
**_Tips before filing an issue_**
- Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)?
- Join the mailing list to engage in conversations and get faster support at
dev-subscr.
xushiyan closed issue #6125: [SUPPORT] hudi-examples-dbt not running with spark
thrift server
URL: https://github.com/apache/hudi/issues/6125
xushiyan commented on issue #6125:
URL: https://github.com/apache/hudi/issues/6125#issuecomment-1828246693
closing as solution provided
blackcheckren commented on issue #9029:
URL: https://github.com/apache/hudi/issues/9029#issuecomment-1827003764
I also encountered this problem, but the corresponding directory on S3 could
not be deleted after dozens of manual attempts, and the log showed that a
folder with the same name wa
abhisheksahani91 commented on issue #10138:
URL: https://github.com/apache/hudi/issues/10138#issuecomment-1825219956
@ad1happy2go Testing has concluded; with the recommended changes I am not
observing the connection issue or any performance issue.
But can you pleas
abhisheksahani91 commented on issue #10138:
URL: https://github.com/apache/hudi/issues/10138#issuecomment-1824003259
@ad1happy2go Can you please explain whether there is any disadvantage to
disabling the timeline server?
xushiyan commented on issue #6125:
URL: https://github.com/apache/hudi/issues/6125#issuecomment-181787
@sambhav13 I'm updating the instructions in the dbt example (using spark 3.2
and hudi 0.14.0). Please check this out and let us know if it helps.
https://github.com/apache/hudi/
ad1happy2go commented on issue #10138:
URL: https://github.com/apache/hudi/issues/10138#issuecomment-1822164822
@abhisheksahani91 This looks related to the following, which is yet to be fixed:
https://github.com/apache/hudi/pull/5269
To unblock, you can disable the timeline server for n
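A minimal sketch of the suggested workaround, assuming the standard write-config key `hoodie.embed.timeline.server`; the table name shown is a placeholder. The trade-off: with the embedded server off, executors list the timeline from storage themselves instead of asking the driver-side server.

```python
# Sketch: merge the timeline-server workaround into existing writer options.
workaround = {"hoodie.embed.timeline.server": "false"}

writer_options = {"hoodie.table.name": "my_table"}  # placeholder options
writer_options.update(workaround)
print(writer_options["hoodie.embed.timeline.server"])  # false
```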
abhisheksahani91 commented on issue #10138:
URL: https://github.com/apache/hudi/issues/10138#issuecomment-1821565443
@ad1happy2go
I also want to add that the connection refused error is observed when I
am generating high load on Hudi ingestion
https://github.com/apache/hudi/a
abhisheksahani91 commented on issue #10138:
URL: https://github.com/apache/hudi/issues/10138#issuecomment-1821325376
@ad1happy2go
Schema evolution is working now.
I did not change anything further and added the same properties you
mentioned.
The only issue is the connection was r
abhisheksahani91 commented on issue #10138:
URL: https://github.com/apache/hudi/issues/10138#issuecomment-1820976149
@ad1happy2go Today I again tried from scratch. At first I inserted the
records, and later I changed the schema to add a new field and sent the update
with the new column.
This t
ad1happy2go commented on issue #10138:
URL: https://github.com/apache/hudi/issues/10138#issuecomment-1820968721
@abhisheksahani91 I tried hard to reproduce the issue in my local
setup with Hudi version 0.12.1 but was unable to. Can you try to
reproduce once like below -
abhisheksahani91 commented on issue #10138:
URL: https://github.com/apache/hudi/issues/10138#issuecomment-1819095816
@ad1happy2go thanks for all the support. Actually I am blocked from taking
Hudi live in production. Can you please help me with an ETA for this?
zyclove commented on issue #10131:
URL: https://github.com/apache/hudi/issues/10131#issuecomment-1818731721
@ad1happy2go
Can bulk mode avoid generating small files, directly outputting the 128 MB
result file instead of merging it later?
If hoodie.clustering is turned on, can small files be automatically
ad1happy2go commented on issue #10131:
URL: https://github.com/apache/hudi/issues/10131#issuecomment-1818465534
@zyclove Bulk_insert mode doesn't merge small files during ingestion, so
you have to run clustering after bulk_insert to optimize file size.
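The bulk_insert-then-cluster approach can be sketched with standard inline clustering keys; the size thresholds below are illustrative values, not recommendations from this thread:

```python
# Sketch: inline clustering options commonly used to compact the small
# files a bulk_insert leaves behind. Sizes here are illustrative only.
clustering_options = {
    "hoodie.clustering.inline": "true",
    "hoodie.clustering.inline.max.commits": "4",
    # target ~128 MB output files; treat files under 100 MB as "small"
    "hoodie.clustering.plan.strategy.target.file.max.bytes": str(128 * 1024 * 1024),
    "hoodie.clustering.plan.strategy.small.file.limit": str(100 * 1024 * 1024),
}
print(clustering_options["hoodie.clustering.plan.strategy.target.file.max.bytes"])
```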
ad1happy2go commented on issue #10138:
URL: https://github.com/apache/hudi/issues/10138#issuecomment-1818393135
Thanks @abhisheksahani91. I will work on reproducing this.
abhisheksahani91 commented on issue #10138:
URL: https://github.com/apache/hudi/issues/10138#issuecomment-1818379062
@ad1happy2go
I have added the config
"--hoodie-conf", "hoodie.schema.on.read.enable=true"
and the schema is:
{
"name": "newCol",
"type": [
"null",