[GitHub] [hudi] DavidZ1 commented on issue #8267: [SUPPORT] Why some delta commit logs files are not converted to parquet ?
DavidZ1 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1493220098

We run compaction offline. The compaction job starts normally, but after running for a while it falls far behind: the instant time it is working on is still yesterday's. We looked through the `FlinkCompactionConfig` parameters and so far have not found any that can be tuned.

![1680408973530](https://user-images.githubusercontent.com/30795397/229331242-30ee4d8a-2be9-46ab-baa7-36126cb33c7d.png)
![1680409026244](https://user-images.githubusercontent.com/30795397/229331245-8fb078e5-d425-4073-a7cd-35cbbb56c023.png)

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
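To put a number on "the instant time is still yesterday", the lag can be computed from the instant time itself. The sketch below is only an illustration: it assumes the 17-digit `yyyyMMddHHmmssSSS` instant format (as in the instant `20230328130031810` quoted in another comment in this thread), assumes the instant and the current time are in the same time zone, and uses a made-up "now" value. The function names are hypothetical and not part of Hudi.

```python
from datetime import datetime, timedelta


def parse_instant(instant: str) -> datetime:
    """Parse a 17-digit instant time (yyyyMMddHHmmssSSS) into a datetime."""
    if len(instant) != 17 or not instant.isdigit():
        raise ValueError(f"unexpected instant format: {instant!r}")
    # The first 14 digits are the timestamp down to seconds ...
    base = datetime.strptime(instant[:14], "%Y%m%d%H%M%S")
    # ... and the last 3 digits are milliseconds.
    return base + timedelta(milliseconds=int(instant[14:]))


def compaction_lag(instant: str, now: datetime) -> timedelta:
    """How far behind `now` the given compaction instant is."""
    return now - parse_instant(instant)


# Hypothetical "now"; in practice this would be datetime.now().
now = datetime(2023, 3, 29, 9, 0, 0)
lag = compaction_lag("20230328130031810", now)
print(f"compaction is {lag} behind")
```

Tracking this value over time would show whether the offline job is catching up or falling further behind.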
DavidZ1 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1493216537

Yes, I understand. We are testing the effect of different bucket numbers.
DavidZ1 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1491339024

No, we have stopped the job. Log file cleanup is enabled by default, as the DAG diagram shows.
DavidZ1 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1490228324

We have currently stopped the offline compaction job. We use the bucket index, with the bucket number set to 100. This file group currently has 13 parquet files, and the log files are still there; they have not been cleaned up.

![513108a072bac3e368d7791deff778e](https://user-images.githubusercontent.com/30795397/228837762-b4f34036-6b42-4854-952a-9bce101bbe23.png)
![e30ee1a4cc28e919ae2f12b4d96a002](https://user-images.githubusercontent.com/30795397/228837778-57158599-7fce-4526-aad0-532dc574ed98.png)
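As background for the bucket-number setting, here is a minimal sketch of hash bucketing in general: every record key is mapped to one of a fixed number of buckets, so the number of file groups per partition is bounded by the bucket number no matter how many keys arrive. CRC32 is a stand-in hash chosen for this sketch and is not Hudi's actual hash function; the record keys and the `bucket_for` helper are hypothetical.

```python
import zlib

NUM_BUCKETS = 100  # the bucket number reported in this comment


def bucket_for(record_key: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Map a record key to a bucket id in [0, num_buckets)."""
    # Stand-in hash for illustration only; NOT the hash Hudi uses.
    return zlib.crc32(record_key.encode("utf-8")) % num_buckets


# Hypothetical record keys.
keys = [f"device-{i}" for i in range(10_000)]
used_buckets = {bucket_for(k) for k in keys}

# However many keys are written, at most NUM_BUCKETS buckets are used.
print(f"{len(keys)} keys -> {len(used_buckets)} buckets")
```

Under this model, several parquet files inside one file group would presumably be successive base-file versions of the same bucket rather than additional buckets.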
DavidZ1 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1488144130

> > there is actually no fileId parquet file
>
> Confused by your words, can you re-organize it a little?

Sorry. We found that instant `20230328130031810` has been compacted, but there is no compaction record for file ID `-0b69-4b13-a1b2-677b800e0729`.
DavidZ1 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1487993254

Yes, we checked the archived compaction file and found that the corresponding commit has completed compaction, but there is actually no parquet file for that fileId.

![556c395225d6cb7ec60bfc97c4b32fe](https://user-images.githubusercontent.com/30795397/228440036-33be49b9-8581-42e9-ac9b-99adfb8a9541.png)
![289f141a5bfa99362f94c6d7194](https://user-images.githubusercontent.com/30795397/228440081-baeef86b-97b1-4370-90be-3a050d842b33.png)
![1680069696475](https://user-images.githubusercontent.com/30795397/228440518-059af26a-d3a8-4a3c-ba4f-e453c3d152a5.png)
DavidZ1 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1486562349

Our data is written in real time, and compaction runs offline as a separate Flink job, but we found that some log files have still not been compacted. The Hudi CLI shows that the compactions completed normally, with nothing abnormal.

1. DAG
![1679997323126](https://user-images.githubusercontent.com/30795397/228200049-4a1e934b-5b33-43cb-bf22-d34cd75a314f.png)
2. No parquet file
![1679997363383](https://user-images.githubusercontent.com/30795397/228200213-99b4ab44-0249-40ce-b1fc-9979dbfabdec.png)
No parquet files for 13 hour partitions
![1679997491851](https://user-images.githubusercontent.com/30795397/228200789-5c5b8a19-f373-4f9a-9439-590fe6f08969.png)
3. Compaction status
![1679997449250](https://user-images.githubusercontent.com/30795397/228200591-ec0986ac-a02c-4e82-8ccf-62bf6fd4d846.png)
DavidZ1 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1484881742

Thanks for the reply.

> * the `--service` param has no value, it is a non-valued param

We tried it and it works without problems.
DavidZ1 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1482523743

Using the Hudi CLI, we found that compaction has completed for some of the instants. We have a few questions:

1. Why is the `Data File Path` null in the Hudi CLI output?
![1679650806693](https://user-images.githubusercontent.com/30795397/227484026-005e5a52-3e12-41d6-9fc3-18949793c9ac.png)
![1679651061728](https://user-images.githubusercontent.com/30795397/227484590-fb028ec5-eee5-4194-b292-de60d90fdf19.png)
2. Why is the size of the `xxx.compaction.inflight` meta file zero?
![1679651190521](https://user-images.githubusercontent.com/30795397/227485559-1fca9d52-4104-45ea-b984-6d78d05497be.png)
DavidZ1 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1482471170

When I used the `--service` parameter, I got a parsing error:

![1679647857839](https://user-images.githubusercontent.com/30795397/227473299-f4a9b91f-c5b9-4404-90b0-ce433e7135ad.png)

We found that after setting the JCommander `arity`, it runs.
DavidZ1 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1482134241

Thank you for your answer. I used the Hudi CLI to inspect the table's compaction progress and found that many compactions are `inflight`. Does that mean the Flink job's asynchronous compaction is too slow and needs more resources, or are these offline compaction plans?

![f69ab074618168c8e40026c67a7a728](https://user-images.githubusercontent.com/30795397/227402647-ffa831c3-176b-4d3b-98c3-e08559aa5621.png)
![df6b4140d34c0fd4ae763196aac7f4a](https://user-images.githubusercontent.com/30795397/227402674-0e0bd4fa-670e-4a13-bece-63a0125627ad.png)
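As a rough cross-check of what the CLI reports, pending compactions can also be counted from the timeline file names. The sketch below assumes a timeline layout in which a compaction instant leaves `<instant>.compaction.requested`, then `<instant>.compaction.inflight`, and finally `<instant>.commit` in the `.hoodie` directory, with earlier state files kept alongside later ones. That layout is an assumption of this sketch, not something stated in the thread, and every file name except the instant `20230328130031810` is invented for illustration. The Hudi CLI remains the authoritative view.

```python
from collections import defaultdict

# Rank of each timeline state; a higher rank supersedes a lower one.
_RANK = {"requested": 0, "inflight": 1, "completed": 2}


def classify(file_name):
    """Map a timeline file name to (instant, state), or None if it is unrelated."""
    instant, _, suffix = file_name.partition(".")
    if suffix == "compaction.requested":
        return instant, "requested"
    if suffix == "compaction.inflight":
        return instant, "inflight"
    if suffix == "commit":
        return instant, "completed"
    return None


def compaction_states(file_names):
    """Return {instant: most advanced state} for instants that have a compaction plan."""
    seen = defaultdict(set)
    for name in file_names:
        parsed = classify(name)
        if parsed is not None:
            seen[parsed[0]].add(parsed[1])
    return {
        instant: max(states, key=_RANK.__getitem__)
        for instant, states in seen.items()
        if "requested" in states  # skip commits that never had a compaction plan
    }


# Hypothetical listing of a table's .hoodie directory.
listing = [
    "20230328130031810.compaction.requested",
    "20230328130031810.compaction.inflight",
    "20230328130031810.commit",
    "20230328140000000.compaction.requested",
    "20230328140000000.compaction.inflight",
    "20230328150000000.compaction.requested",
    "20230328150500000.deltacommit",
]

states = compaction_states(listing)
pending = sorted(i for i, s in states.items() if s != "completed")
print(f"{len(pending)} pending compaction(s): {pending}")
```

A steadily growing pending list would point to the compaction job not keeping up with the writer, whichever job executes the plans.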