Re: [I] [SUPPORT] hudi RECORD_INDEX is too slow in "Building workload profile" stage . why is HoodieGlobalSimpleIndex ? [hudi]

2024-08-02 Thread via GitHub
Gatsby-Lee commented on issue #10235: URL: https://github.com/apache/hudi/issues/10235#issuecomment-2264980204 Any updates on this? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific co

Re: [I] [SUPPORT] hudi RECORD_INDEX is too slow in "Building workload profile" stage . why is HoodieGlobalSimpleIndex ? [hudi]

2023-12-04 Thread via GitHub
zyclove commented on issue #10235: URL: https://github.com/apache/hudi/issues/10235#issuecomment-1838182121 SparkMetadataTableRecordIndex fileGroupSize = hoodieTable.getMetadataTable().getNumFileGroupsForPartition(MetadataPartitionType.RECORD_INDEX); Why not 512 fileGroupSi

Re: [I] [SUPPORT] hudi RECORD_INDEX is too slow in "Building workload profile" stage . why is HoodieGlobalSimpleIndex ? [hudi]

2023-12-04 Thread via GitHub
zyclove commented on issue #10235: URL: https://github.com/apache/hudi/issues/10235#issuecomment-1838138200 @danny0405 After setting hoodie.metadata.enable=true, the index is now RECORD_INDEX, but the following stage is still very slow. ![image](https://github.com/apache/hudi/assets/15028279/fa2

Re: [I] [SUPPORT] hudi RECORD_INDEX is too slow in "Building workload profile" stage . why is HoodieGlobalSimpleIndex ? [hudi]

2023-12-03 Thread via GitHub
danny0405 commented on issue #10235: URL: https://github.com/apache/hudi/issues/10235#issuecomment-1837959640 hoodie.metadata.table -> hoodie.metadata.enable
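
For readers following the key rename above, here is a minimal sketch of how the corrected options might be assembled and checked before being passed to a Spark writer. The helper names `record_index_options` and `find_misnamed_keys` and the table name `my_table` are hypothetical and not part of Hudi; the option keys are assumed to match Hudi 0.14-style configuration and should be checked against the configuration reference for the version in use.

```python
# Hypothetical helpers for illustration only; neither function is a Hudi API.

def record_index_options(table_name: str) -> dict:
    """Build writer options that request the record-level index.

    Assumption: these keys follow Hudi 0.14-style configuration names.
    """
    return {
        "hoodie.table.name": table_name,
        # Correct key from the thread (not "hoodie.metadata.table").
        "hoodie.metadata.enable": "true",
        # Assumed key that asks the metadata table to build the record index.
        "hoodie.metadata.record.index.enable": "true",
        "hoodie.index.type": "RECORD_INDEX",
    }


def find_misnamed_keys(options: dict) -> dict:
    """Return {wrong_key: suggested_key} for known misspellings in options."""
    known_typos = {"hoodie.metadata.table": "hoodie.metadata.enable"}
    return {key: known_typos[key] for key in options if key in known_typos}


opts = record_index_options("my_table")
# Simulate the mistake discussed in the thread and report it.
bad_opts = {**opts, "hoodie.metadata.table": "true"}
print(find_misnamed_keys(bad_opts))
```

With PySpark, a dict like `opts` would typically be passed through `df.write.format("hudi").options(**opts)`. Whether this alone avoids the GLOBAL_SIMPLE fallback reported in the thread is not confirmed by the messages shown here.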

Re: [I] [SUPPORT] hudi RECORD_INDEX is too slow in "Building workload profile" stage . why is HoodieGlobalSimpleIndex ? [hudi]

2023-12-03 Thread via GitHub
zyclove commented on issue #10235: URL: https://github.com/apache/hudi/issues/10235#issuecomment-1837949851 @danny0405 Why does it fall back to GLOBAL_SIMPLE? ![image](https://github.com/apache/hudi/assets/15028279/20107e0d-46eb-4e28-9a5a-0fc8750cbc34) 23/12/04 14:39:29 WARN SparkMetadataTa

Re: [I] [SUPPORT] hudi RECORD_INDEX is too slow in "Building workload profile" stage . why is HoodieGlobalSimpleIndex ? [hudi]

2023-12-03 Thread via GitHub
zyclove commented on issue #10235: URL: https://github.com/apache/hudi/issues/10235#issuecomment-1837946117 @danny0405 Why does it fall back to GLOBAL_SIMPLE? https://github.com/apache/hudi/assets/15028279/9cddf011-e25c-4c0f-9b40-c2d7fdd17cf9 23/12/04 14:39:29 WARN SparkMetadataTableRe

[I] [SUPPORT] hudi RECORD_INDEX is too slow in "Building workload profile" stage . why is HoodieGlobalSimpleIndex ? [hudi]

2023-12-03 Thread via GitHub
zyclove opened a new issue, #10235: URL: https://github.com/apache/hudi/issues/10235 **Describe the problem you faced** The Spark job is too slow in the following stage. Adjusting CPU, memory, and concurrency has no effect. Which stage can be optimized or skipped? ![image](ht