hudi-bot commented on PR #3391:
URL: https://github.com/apache/hudi/pull/3391#issuecomment-1116948868
## CI report:
* f9b524a53651db3e83dc922c08762bbae4e84233 Azure:
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=8418
[
https://issues.apache.org/jira/browse/HUDI-4001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
董可伦 reassigned HUDI-4001:
-
Assignee: 董可伦
> "hoodie.datasource.write.operation" from table config should not be used as
> write operation
>
BalaMahesh opened a new issue, #5494:
URL: https://github.com/apache/hudi/issues/5494
**_Tips before filing an issue_**
- Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)?
- Join the mailing list to engage in conversations and get faster support at
dev-subsc
rahil-c commented on issue #5484:
URL: https://github.com/apache/hudi/issues/5484#issuecomment-1116919671
Hi @jasondavindev, just curious about your setup for using Hudi 0.11 on AWS EMR.
The most recently offered version of Hudi on EMR is `0.9.0`
https://docs.aws.amazon.com/emr/latest/ReleaseGui
hudi-bot commented on PR #3391:
URL: https://github.com/apache/hudi/pull/3391#issuecomment-1116916248
## CI report:
* 223c320447bc9adc8fccaabb9c590bed159b375d Azure:
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5574
rahil-c commented on issue #5298:
URL: https://github.com/apache/hudi/issues/5298#issuecomment-1116915923
@kasured If you have opened a case with AWS EMR support, we have a backport
of the fix for hudi 0.9.0 we can provide you. Let us know so we can close this
thread out for now.
--
This
hudi-bot commented on PR #3391:
URL: https://github.com/apache/hudi/pull/3391#issuecomment-1116914799
## CI report:
* 223c320447bc9adc8fccaabb9c590bed159b375d Azure:
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=5574
ksoullpwk commented on issue #5281:
URL: https://github.com/apache/hudi/issues/5281#issuecomment-1116900488
My expected scope for this issue is only the properties file. As for the
rest, handling the data, I think that should be done by users.
The issue is I didn't know about t
leobiscassi closed issue #5485: [SUPPORT] Hudi Delta Streamer doesn't recognize
hive style date partition on S3
URL: https://github.com/apache/hudi/issues/5485
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above
leobiscassi commented on issue #5485:
URL: https://github.com/apache/hudi/issues/5485#issuecomment-1116871276
@yihua nice, I'll work on this and submit a PR, thanks. 👍🏽
stackls opened a new issue, #5493:
URL: https://github.com/apache/hudi/issues/5493
While processing 200 tables sequentially with Hudi for delta records, 3 to 4
tables fail at random on each run with one of the two errors below.
It's not the same tables that fail after ea
vinothchandar commented on PR #5366:
URL: https://github.com/apache/hudi/pull/5366#issuecomment-1116753207
@bschell is this still WIP?
yihua commented on issue #5485:
URL: https://github.com/apache/hudi/issues/5485#issuecomment-1116735679
Feel free to close the issue if all good.
yihua commented on issue #5485:
URL: https://github.com/apache/hudi/issues/5485#issuecomment-1116735323
@leobiscassi no problem. I agree that the docs can be improved around the
key generator and partition field. If you already have something in mind, I
encourage you to put up a PR on improving
vinothchandar commented on PR #5436:
URL: https://github.com/apache/hudi/pull/5436#issuecomment-1116735024
@danny0405 @YannByron
I see the major sticking point is -
Option A) separate `.cdc` folder, that contains the CDC log (similar to redo
logs in databases)
Option B) do
Ethan Guo created HUDI-4035:
---
Summary: Improve point lookup in Metadata Table
Key: HUDI-4035
URL: https://issues.apache.org/jira/browse/HUDI-4035
Project: Apache Hudi
Issue Type: Task
R
Ethan Guo created HUDI-4034:
---
Summary: Improve log merging performance for Metadata Table
Key: HUDI-4034
URL: https://issues.apache.org/jira/browse/HUDI-4034
Project: Apache Hudi
Issue Type: Task
[
https://issues.apache.org/jira/browse/HUDI-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ethan Guo closed HUDI-1015.
---
Resolution: Duplicate
> Audit all getAllPartitionPaths() calls and keep em out of fast path
>
Ethan Guo created HUDI-4033:
---
Summary: Aggregated cols stats at partition level in col stats
partition in MDT
Key: HUDI-4033
URL: https://issues.apache.org/jira/browse/HUDI-4033
Project: Apache Hudi
Ethan Guo created HUDI-4032:
---
Summary: Remove double file-listing in SparkHoodieFileIndex
Key: HUDI-4032
URL: https://issues.apache.org/jira/browse/HUDI-4032
Project: Apache Hudi
Issue Type: Task
vinothchandar commented on code in PR #5436:
URL: https://github.com/apache/hudi/pull/5436#discussion_r864305872
##
rfc/rfc-51/rfc-51.md:
##
@@ -0,0 +1,233 @@
+
+
+# RFC-50: Hudi CDC
+
+# Proposers
+
+- @Yann Byron
+
+# Approvers
+
+- @Raymond
+
+# Statue
+JIRA:
[https://issues
vinothchandar commented on code in PR #5436:
URL: https://github.com/apache/hudi/pull/5436#discussion_r864304303
##
rfc/rfc-51/rfc-51.md:
##
@@ -0,0 +1,233 @@
+
+
+# RFC-50: Hudi CDC
+
+# Proposers
+
+- @Yann Byron
+
+# Approvers
+
+- @Raymond
+
+# Statue
+JIRA:
[https://issues
vinothchandar commented on code in PR #5436:
URL: https://github.com/apache/hudi/pull/5436#discussion_r864303561
##
rfc/rfc-51/rfc-51.md:
##
@@ -0,0 +1,233 @@
+
+
+# RFC-50: Hudi CDC
+
+# Proposers
+
+- @Yann Byron
+
+# Approvers
+
+- @Raymond
+
+# Statue
+JIRA:
[https://issues
leobiscassi commented on issue #5485:
URL: https://github.com/apache/hudi/issues/5485#issuecomment-1116656782
> That's correct. If you want a Spark-like read to include the partition field
from the partition path, you may consider SqlSource or SQL transformer.
When I use the `ParquetDFS
yihua commented on issue #5485:
URL: https://github.com/apache/hudi/issues/5485#issuecomment-1116479729
> what you are saying is that independent of the datatype / style of the
partitions from source dataset they won't be considered as fields, since Hudi
Delta Streamer just list all the par
yihua commented on issue #5481:
URL: https://github.com/apache/hudi/issues/5481#issuecomment-1116452731
@MikeBuh Thanks for the clarification. What is the input size of your batch
reload? A similar principle can be applied here for calculating the
parallelism. To be conservative at fir
ashah-lightbox opened a new issue, #5492:
URL: https://github.com/apache/hudi/issues/5492
**Describe the problem you faced**
I tried `_hoodie_is_delete` in a PySpark EMR notebook and it works as desired.
Below is my attached example performed in EMR -
https://gist.github.com/as
[
https://issues.apache.org/jira/browse/HUDI-4022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan updated HUDI-4022:
--
Sprint: 2022/05/02
> Add support to validate table's internal state with integ test infr
[
https://issues.apache.org/jira/browse/HUDI-4028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan updated HUDI-4028:
--
Sprint: 2022/05/02
> Add failur injection tests to integ test framework
> --
[
https://issues.apache.org/jira/browse/HUDI-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan updated HUDI-4027:
--
Sprint: 2022/05/02
> add support to test non-core write operations (insert overwrite, de
[
https://issues.apache.org/jira/browse/HUDI-4020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan updated HUDI-4020:
--
Sprint: 2022/05/02
> Add support to multi-writer tests to integ test framework (4 concur
[
https://issues.apache.org/jira/browse/HUDI-4017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan updated HUDI-4017:
--
Sprint: 2022/05/02
> Spark sql tests as part of github actions for diff spark versions
>
[
https://issues.apache.org/jira/browse/HUDI-4016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan updated HUDI-4016:
--
Sprint: 2022/05/02
> Prepare a document to list all tests to be done as part of release
[
https://issues.apache.org/jira/browse/HUDI-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan updated HUDI-4018:
--
Sprint: 2022/05/02
> Prepare minimal set of yamls to be tested against any write mode an
[
https://issues.apache.org/jira/browse/HUDI-3957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan updated HUDI-3957:
--
Sprint: 2022/05/02
> Support spark2 and scala12 testing w/ integ test bundle
> -
[
https://issues.apache.org/jira/browse/HUDI-4019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan updated HUDI-4019:
--
Sprint: 2022/05/02
> Add ability to test async clustering w/ integ test framework
>
[
https://issues.apache.org/jira/browse/HUDI-2464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan closed HUDI-2464.
-
Resolution: Fixed
> Create comprehensive spark datasource yamls similar to deltastreamer
>
[
https://issues.apache.org/jira/browse/HUDI-1590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan updated HUDI-1590:
--
Sprint: 2022/05/02
> Support async clustering w/ test suite job
> --
[
https://issues.apache.org/jira/browse/HUDI-3989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan updated HUDI-3989:
--
Sprint: (was: 2022/05/02)
> Prepare golden datasets for testing
>
[
https://issues.apache.org/jira/browse/HUDI-3990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan updated HUDI-3990:
--
Sprint: (was: 2022/05/02)
> Integrate query engines read validation for each commit in
[
https://issues.apache.org/jira/browse/HUDI-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan updated HUDI-3668:
--
Sprint: Hudi-Sprint-Apr-19, Hudi-Sprint-Apr-25 (was: Hudi-Sprint-Apr-19,
Hudi-Sprint-Ap
[
https://issues.apache.org/jira/browse/HUDI-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan closed HUDI-2466.
-
Resolution: Fixed
> Add and validate comprehensive yamls for spark dml
>
This is an automated email from the ASF dual-hosted git repository.
yihua pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git
The following commit(s) were added to refs/heads/master by this push:
new 3343cbb47b [MINOR] Update RFC status (#5486)
3343cb
yihua merged PR #5486:
URL: https://github.com/apache/hudi/pull/5486
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org
[
https://issues.apache.org/jira/browse/HUDI-4028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan updated HUDI-4028:
--
Epic Link: HUDI-3303 (was: HUDI-4015)
> Add failur injection tests to integ test framew
[
https://issues.apache.org/jira/browse/HUDI-4029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan updated HUDI-4029:
--
Epic Link: HUDI-3303 (was: HUDI-4015)
> test out different lock providers using our int
[
https://issues.apache.org/jira/browse/HUDI-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan closed HUDI-4015.
-
Resolution: Duplicate
> Integ test Infra
>
>
> Key: HUDI-
[
https://issues.apache.org/jira/browse/HUDI-3991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan updated HUDI-3991:
--
Description: Make integ test bundle slim and run tests w/ actual bundles
> Provide bundl
liuzhuang2017 closed pull request #5491: [MINOR] Update the committer list is
sorted by the first name
URL: https://github.com/apache/hudi/pull/5491
[
https://issues.apache.org/jira/browse/HUDI-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan updated HUDI-4027:
--
Epic Link: HUDI-3303 (was: HUDI-4015)
> add support to test non-core write operations (
[
https://issues.apache.org/jira/browse/HUDI-4030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan closed HUDI-4030.
-
Resolution: Duplicate
> add ability to test spark-sql with integ test infra
>
liuzhuang2017 opened a new pull request, #5491:
URL: https://github.com/apache/hudi/pull/5491
## *Tips*
- *Thank you very much for contributing to Apache Hudi.*
- *Please review https://hudi.apache.org/contribute/how-to-contribute before
opening a pull request.*
## What is the p
[
https://issues.apache.org/jira/browse/HUDI-4026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan updated HUDI-4026:
--
Epic Link: HUDI-3303 (was: HUDI-4015)
> Add support for spark streaming writes to integ
[
https://issues.apache.org/jira/browse/HUDI-4025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan updated HUDI-4025:
--
Epic Link: HUDI-3303 (was: HUDI-4015)
> Add support to validate presto, trino and hive
[
https://issues.apache.org/jira/browse/HUDI-4022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan updated HUDI-4022:
--
Epic Link: HUDI-3303 (was: HUDI-4015)
> Add support to validate table's internal state
[
https://issues.apache.org/jira/browse/HUDI-4024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan closed HUDI-4024.
-
Resolution: Fixed
> Make integ test bundle slim and run tests w/ actual bundles
>
[
https://issues.apache.org/jira/browse/HUDI-4019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan updated HUDI-4019:
--
Epic Link: HUDI-3303 (was: HUDI-4015)
> Add ability to test async clustering w/ integ t
[
https://issues.apache.org/jira/browse/HUDI-4020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan updated HUDI-4020:
--
Epic Link: HUDI-3303 (was: HUDI-4015)
> Add support to multi-writer tests to integ test
[
https://issues.apache.org/jira/browse/HUDI-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan updated HUDI-4018:
--
Epic Link: HUDI-3303 (was: HUDI-4015)
> Prepare minimal set of yamls to be tested again
[
https://issues.apache.org/jira/browse/HUDI-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan reassigned HUDI-4018:
-
Assignee: sivabalan narayanan
> Prepare minimal set of yamls to be tested against
[
https://issues.apache.org/jira/browse/HUDI-4017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan updated HUDI-4017:
--
Epic Link: HUDI-3303 (was: HUDI-4015)
> Spark sql tests as part of github actions for d
[
https://issues.apache.org/jira/browse/HUDI-4016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan updated HUDI-4016:
--
Epic Link: HUDI-3303 (was: HUDI-4015)
> Prepare a document to list all tests to be done
JavierLopezT opened a new issue, #5490:
URL: https://github.com/apache/hudi/issues/5490
Hello. I am facing an issue, and I am not even sure that it's Hudi's fault,
but I am totally lost. Sorry if it turns out not to be due to Hudi.
I have code that reads a Hudi commit file (JSON), takes so
[
https://issues.apache.org/jira/browse/HUDI-64?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Raymond Xu updated HUDI-64:
---
Priority: Minor (was: Major)
> Estimation of compression ratio & other dynamic storage knobs based on
> histor
[
https://issues.apache.org/jira/browse/HUDI-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Raymond Xu updated HUDI-2669:
-
Sprint: 2022/05/02
> Upgrade Java toolset/runtime to JDK11
> -
>
>
[
https://issues.apache.org/jira/browse/HUDI-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Raymond Xu updated HUDI-2003:
-
Priority: Minor (was: Major)
> Auto Compute Compression ratio for input data to output parquet/orc file s
[
https://issues.apache.org/jira/browse/HUDI-10?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Raymond Xu updated HUDI-10:
---
Priority: Minor (was: Major)
> Auto tune bulk insert parallelism #555
> --
[
https://issues.apache.org/jira/browse/HUDI-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Raymond Xu updated HUDI-2669:
-
Fix Version/s: 0.12.0
> Upgrade Java toolset/runtime to JDK11
> -
>
>
[
https://issues.apache.org/jira/browse/HUDI-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Raymond Xu updated HUDI-2669:
-
Component/s: performance
(was: code-quality)
Epic Link: HUDI-3249
Issue Typ
[
https://issues.apache.org/jira/browse/HUDI-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Raymond Xu updated HUDI-1461:
-
Epic Link: (was: HUDI-3249)
> Bulk insert v2 creates additional small files
> --
[
https://issues.apache.org/jira/browse/HUDI-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Raymond Xu updated HUDI-2928:
-
Sprint: Hudi-Sprint-Jan-10, 2022/05/02 (was: Hudi-Sprint-Jan-10)
> Evaluate rebasing Hudi's default compr
[
https://issues.apache.org/jira/browse/HUDI-2754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Raymond Xu reassigned HUDI-2754:
Assignee: Jintao
> Performance improvement for IncrementalRelation
> --
[
https://issues.apache.org/jira/browse/HUDI-2754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Raymond Xu updated HUDI-2754:
-
Sprint: Cont' improve - 2022/03/7, 2022/05/02 (was: Cont' improve -
2022/03/7)
> Performance improvement
[
https://issues.apache.org/jira/browse/HUDI-2754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Raymond Xu updated HUDI-2754:
-
Reviewers: Alexey Kudinkin
> Performance improvement for IncrementalRelation
> ---
[
https://issues.apache.org/jira/browse/HUDI-413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Raymond Xu updated HUDI-413:
Sprint: 2022/04/25
> Use ColumnIndex in parquet to speed up scans
> -
[
https://issues.apache.org/jira/browse/HUDI-64?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Raymond Xu reassigned HUDI-64:
--
Assignee: (was: Forward Xu)
> Estimation of compression ratio & other dynamic storage knobs based on
[
https://issues.apache.org/jira/browse/HUDI-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Raymond Xu reassigned HUDI-2003:
Assignee: (was: Raymond Xu)
> Auto Compute Compression ratio for input data to output parquet/o
[
https://issues.apache.org/jira/browse/HUDI-2754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Raymond Xu updated HUDI-2754:
-
Epic Link: HUDI-3249
> Performance improvement for IncrementalRelation
> -
[
https://issues.apache.org/jira/browse/HUDI-1041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Raymond Xu updated HUDI-1041:
-
Epic Link: HUDI-3249
> Cache the explodeRecordRDDWithFileComparisons
> --
[
https://issues.apache.org/jira/browse/HUDI-411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Raymond Xu updated HUDI-411:
Epic Link: HUDI-1238
> Quantify the benefit of sizing files using benchmarks
> --
[
https://issues.apache.org/jira/browse/HUDI-413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Raymond Xu updated HUDI-413:
Epic Link: HUDI-3249
> Use ColumnIndex in parquet to speed up scans
> ---
[
https://issues.apache.org/jira/browse/HUDI-872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Raymond Xu updated HUDI-872:
Epic Link: HUDI-1238
> Implement JMH benchmarks for all core classes
> -
[
https://issues.apache.org/jira/browse/HUDI-3741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Raymond Xu closed HUDI-3741.
Assignee: Danny Chen
Resolution: Fixed
> Fix flink bucket index bulk insert generates too many small f
[
https://issues.apache.org/jira/browse/HUDI-3728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Raymond Xu closed HUDI-3728.
Assignee: Danny Chen
Resolution: Fixed
> Set the sort operator parallelism for flink bucket bulk inser
[
https://issues.apache.org/jira/browse/HUDI-3918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Raymond Xu updated HUDI-3918:
-
Epic Link: HUDI-3249
> Improve flink bulk_insert performace for partitioned table
> --
[
https://issues.apache.org/jira/browse/HUDI-3808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Raymond Xu closed HUDI-3808.
Assignee: Danny Chen
Resolution: Fixed
> Flink bulk_insert timestamp(3) can not be read by Spark
> ---
[
https://issues.apache.org/jira/browse/HUDI-3918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Raymond Xu updated HUDI-3918:
-
Component/s: performance
> Improve flink bulk_insert performace for partitioned table
> --
[
https://issues.apache.org/jira/browse/HUDI-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Raymond Xu updated HUDI-1461:
-
Epic Link: HUDI-3249
> Bulk insert v2 creates additional small files
> ---
[
https://issues.apache.org/jira/browse/HUDI-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Raymond Xu updated HUDI-1461:
-
Component/s: performance
> Bulk insert v2 creates additional small files
> ---
[
https://issues.apache.org/jira/browse/HUDI-3993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Raymond Xu updated HUDI-3993:
-
Fix Version/s: 0.12.0
> Avoid calling into Spark UDF in Bulk Insert
>
vinothchandar commented on PR #5436:
URL: https://github.com/apache/hudi/pull/5436#issuecomment-1116162592
@danny0405 catching up here. Let's keep the discussions on GH (I'll leave
some comments on the doc as well) so everyone in the community can discover
them more easily.
Ideally, love
[
https://issues.apache.org/jira/browse/HUDI-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Raymond Xu updated HUDI-2928:
-
Component/s: performance
storage-management
Epic Link: HUDI-3249
> Evaluate rebasin
[
https://issues.apache.org/jira/browse/HUDI-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Raymond Xu updated HUDI-2928:
-
Issue Type: Improvement (was: Task)
> Evaluate rebasing Hudi's default compression from Gzip to Zstd
> --
[
https://issues.apache.org/jira/browse/HUDI-2928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Raymond Xu updated HUDI-2928:
-
Fix Version/s: 0.12.0
> Evaluate rebasing Hudi's default compression from Gzip to Zstd
> -
leobiscassi commented on issue #5485:
URL: https://github.com/apache/hudi/issues/5485#issuecomment-1116157734
Hi @yihua, thanks for the answer! About your points:
(1) Thank you, I didn't notice this possibility, these folders are annoying
😓
(2) (3)
> the parquet files you
[
https://issues.apache.org/jira/browse/HUDI-1976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Raymond Xu updated HUDI-1976:
-
Sprint: Hudi-Sprint-Apr-19, Hudi-Sprint-Apr-25 (was: Hudi-Sprint-Apr-19,
Hudi-Sprint-Apr-25, 2022/04/25)
[
https://issues.apache.org/jira/browse/HUDI-3883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Raymond Xu updated HUDI-3883:
-
Epic Link: HUDI-3249
> File-sizing issues when writing COW table to S3
> -
parisni opened a new issue, #5489:
URL: https://github.com/apache/hudi/issues/5489
hudi 0.11.0
spark 3.2.1 / spark 2.4.x
When comments are added to the schema, hudi_sync doesn't add them to the Hive
table. Even when the feature is activate
```
+ spark3.2-comments.py 08_pyspark
nsivabalan commented on issue #5455:
URL: https://github.com/apache/hudi/issues/5455#issuecomment-1116086600
@bhasudha: Do we need to add an FAQ entry for this? I'll let you make the call.
parisni commented on issue #5482:
URL: https://github.com/apache/hudi/issues/5482#issuecomment-1116057329
I cannot really share the whole code, but parts of it.
> Also, do the timeouts prevent the ingestion from proceeding?
Yes: only 5 commits complete, but I am attempting 6 operations