[GitHub] [hudi] Gatsby-Lee commented on issue #4873: Processing time very Slow Updating records into Hudi Dataset(MOR) using AWS Glue

2022-03-08 Thread GitBox
Gatsby-Lee commented on issue #4873: URL: https://github.com/apache/hudi/issues/4873#issuecomment-1062606418 @pmgod8922 Hi, my config is pretty much like the one in this issue, with some changes to use MoR: https://github.com/apache/hudi/issues/4896 ``` 'hoodie.compact.inline': 'false'

[GitHub] [hudi] Gatsby-Lee commented on issue #4873: Processing time very Slow Updating records into Hudi Dataset(MOR) using AWS Glue

2022-03-03 Thread GitBox
Gatsby-Lee commented on issue #4873: URL: https://github.com/apache/hudi/issues/4873#issuecomment-1058590211 @cafelo-pfdrive Thank you. Have you figured out or confirmed whether async table compaction runs in AWS Glue Streaming? I haven't confirmed it yet, but based on what I saw b

[GitHub] [hudi] Gatsby-Lee commented on issue #4873: Processing time very Slow Updating records into Hudi Dataset(MOR) using AWS Glue

2022-03-03 Thread GitBox
Gatsby-Lee commented on issue #4873: URL: https://github.com/apache/hudi/issues/4873#issuecomment-1058432631 @cafelo-pfdrive Unfortunately, it seems I didn't document what I found when I investigated compaction. I guess that when compaction runs, you might be able to find `xxx.compaction.request` i

[GitHub] [hudi] Gatsby-Lee commented on issue #4873: Processing time very Slow Updating records into Hudi Dataset(MOR) using AWS Glue

2022-02-26 Thread GitBox
Gatsby-Lee commented on issue #4873: URL: https://github.com/apache/hudi/issues/4873#issuecomment-1053272675 @cafelo-pfdrive I think compaction is triggered according to the compaction strategy (the default is num_commit). * compaction.trigger.strategy * metadata.compaction.delt

[GitHub] [hudi] Gatsby-Lee commented on issue #4873: Processing time very Slow Updating records into Hudi Dataset(MOR) using AWS Glue

2022-02-24 Thread GitBox
Gatsby-Lee commented on issue #4873: URL: https://github.com/apache/hudi/issues/4873#issuecomment-1050255151 @cafelo-pfdrive Oh, you need compaction if you want to use MoR. In Glue (if I am right), table compaction doesn't happen automatically for you. (You might have to setu
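
As a rough illustration of the point in the comment above (a MoR table needs compaction to be configured explicitly), here is a minimal sketch of the write options involved. It is not taken from the thread: the helper name `mor_compaction_options`, the table name, and the default of 5 delta commits are assumptions chosen for the example. The option keys are standard Hudi write configs, but the right values depend on the workload.

```python
def mor_compaction_options(table_name, inline=True, max_delta_commits=5):
    """Build Hudi write options for a MERGE_ON_READ table.

    Hypothetical helper for illustration only; the defaults here are
    assumptions, not values recommended in the thread.
    """
    if max_delta_commits < 1:
        raise ValueError("max_delta_commits must be at least 1")
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.table.type": "MERGE_ON_READ",
        "hoodie.datasource.write.operation": "upsert",
        # Run compaction as part of the write instead of relying on a
        # separate compaction job.
        "hoodie.compact.inline": str(inline).lower(),
        # Compact after this many delta commits have accumulated.
        "hoodie.compact.inline.max.delta.commits": str(max_delta_commits),
    }


# Example: options for a hypothetical table named "events".
options = mor_compaction_options("events", inline=True, max_delta_commits=3)
for key, value in sorted(options.items()):
    print(f"{key}={value}")
```

In a Spark or Glue job, a dict like this would typically be passed along with the record key and precombine settings, for example `df.write.format("hudi").options(**options).mode("append").save(path)`, where `df` and `path` are placeholders.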

[GitHub] [hudi] Gatsby-Lee commented on issue #4873: Processing time very Slow Updating records into Hudi Dataset(MOR) using AWS Glue

2022-02-24 Thread GitBox
Gatsby-Lee commented on issue #4873: URL: https://github.com/apache/hudi/issues/4873#issuecomment-1050113792 @cafelo-pfdrive I saw that you use MERGE_ON_READ in AWS Glue (I use Glue as well). How do you run table compaction? What values do you use for these configs? * hoodie.comp

[GitHub] [hudi] Gatsby-Lee commented on issue #4873: Processing time very Slow Updating records into Hudi Dataset(MOR) using AWS Glue

2022-02-23 Thread GitBox
Gatsby-Lee commented on issue #4873: URL: https://github.com/apache/hudi/issues/4873#issuecomment-1049512318 @nsivabalan I have a question. In the reported config, there are three fields. Do all three fields have to be "timestamp characteristics"? -- This is an automated message fro

[GitHub] [hudi] Gatsby-Lee commented on issue #4873: Processing time very Slow Updating records into Hudi Dataset(MOR) using AWS Glue

2022-02-23 Thread GitBox
Gatsby-Lee commented on issue #4873: URL: https://github.com/apache/hudi/issues/4873#issuecomment-1049511235 @cafelo-pfdrive It is something that increases incrementally.