[GitHub] [hudi] clownxc commented on pull request #8472: [HUDI-5298] Optimize WriteStatus storing HoodieRecord

2023-05-07 Thread via GitHub
clownxc commented on PR #8472: URL: https://github.com/apache/hudi/pull/8472#issuecomment-1537368074 > @clownxc : For failed records, we need to have them logged elsewhere and so no need to deflate. For exception cases, the write status should be marked as failure. So, I don't see any reaso

[GitHub] [hudi] clownxc commented on pull request #8472: [HUDI-5298] Optimize WriteStatus storing HoodieRecord

2023-05-06 Thread via GitHub
clownxc commented on PR #8472: URL: https://github.com/apache/hudi/pull/8472#issuecomment-1537272950 Following the suggestion from @prashantwason, I ran the following test: ```java WriteStatus status = new WriteStatus(true, 1.0); String partitionPath = HoodieTestD

[GitHub] [hudi] clownxc commented on pull request #8472: [HUDI-5298] Optimize WriteStatus storing HoodieRecord

2023-05-05 Thread via GitHub
clownxc commented on PR #8472: URL: https://github.com/apache/hudi/pull/8472#issuecomment-1536940199 > @clownxc If I understand correctly, the memory savings are coming from dropping the "data" part of the HoodieRecord? I noticed that HoodieRecord has only 2 additional members - sealed (boo

[GitHub] [hudi] clownxc commented on pull request #8472: [HUDI-5298] Optimize WriteStatus storing HoodieRecord

2023-05-05 Thread via GitHub
clownxc commented on PR #8472: URL: https://github.com/apache/hudi/pull/8472#issuecomment-1536939863 > this interesting optimization This interesting optimization was reported by @nsivabalan and has not been implemented for a long time.

[GitHub] [hudi] clownxc commented on pull request #8472: [HUDI-5298] Optimize WriteStatus storing HoodieRecord

2023-05-05 Thread via GitHub
clownxc commented on PR #8472: URL: https://github.com/apache/hudi/pull/8472#issuecomment-1536932897 > @clownxc If I understand correctly, the memory savings are coming from dropping the "data" part of the HoodieRecord? I noticed that HoodieRecord has only 2 additional members -

[GitHub] [hudi] clownxc commented on pull request #8472: [HUDI-5298] Optimize WriteStatus storing HoodieRecord

2023-04-22 Thread via GitHub
clownxc commented on PR #8472: URL: https://github.com/apache/hudi/pull/8472#issuecomment-1518674030 > It is great if we can have numbers to illustrate the gains after the patch, like the cost reduction for memory or something. The memory occupied by WriteStatus after optimization is

[GitHub] [hudi] clownxc commented on pull request #8472: [HUDI-5298] Optimize WriteStatus storing HoodieRecord

2023-04-22 Thread via GitHub
clownxc commented on PR #8472: URL: https://github.com/apache/hudi/pull/8472#issuecomment-1518672224 > It is great if we can have numbers to illustrate the gains after the patch, like the cost reduction for memory or something. I did a test based on your suggestion: The number of H

[GitHub] [hudi] clownxc commented on pull request #8472: [HUDI-5298] Optimize WriteStatus storing HoodieRecord

2023-04-18 Thread via GitHub
clownxc commented on PR #8472: URL: https://github.com/apache/hudi/pull/8472#issuecomment-1513964845 > It is great if we can have numbers to illustrate the gains after the patch, like the cost reduction for memory or something. I would be happy to do it.