[GitHub] [hudi] DavidZ1 commented on issue #8267: [SUPPORT] Why some delta commit logs files are not converted to parquet ?

2023-04-01 Thread via GitHub


DavidZ1 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1493220098

   We run compaction offline. It starts normally, but after running for a 
while the compaction falls far behind, and the instant being compacted is 
still from yesterday.
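
To put a number on "delayed a lot", the lag can be computed from the instant time itself. This is only a minimal sketch: it assumes the 17-digit `yyyyMMddHHmmssSSS` instant format (as in `20230328130031810`) and that the instant and the current time use the same time zone. The helper names `parse_instant` and `compaction_lag` are made up for illustration and are not part of Hudi.

```python
from datetime import datetime, timedelta


def parse_instant(instant: str) -> datetime:
    """Parse a 17-digit instant time (yyyyMMddHHmmssSSS) into a datetime."""
    if len(instant) != 17 or not instant.isdigit():
        raise ValueError(f"unexpected instant time format: {instant!r}")
    # The first 14 digits are the second-level timestamp, the last 3 are milliseconds.
    seconds_part = datetime.strptime(instant[:14], "%Y%m%d%H%M%S")
    return seconds_part + timedelta(milliseconds=int(instant[14:]))


def compaction_lag(oldest_pending_instant: str, now: datetime) -> timedelta:
    """Return how far the oldest pending compaction instant is behind `now`."""
    return now - parse_instant(oldest_pending_instant)


if __name__ == "__main__":
    # Hypothetical "current time", chosen only to make the example reproducible.
    now = datetime(2023, 3, 29, 13, 0, 31, 810000)
    print(compaction_lag("20230328130031810", now))
```

Tracking this value over time would show whether the offline job is catching up or falling further behind.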
   
   We looked through the `FlinkCompactionConfig` parameters, but so far we 
have not found any that can be tuned to help.
   
   
![1680408973530](https://user-images.githubusercontent.com/30795397/229331242-30ee4d8a-2be9-46ab-baa7-36126cb33c7d.png)
   
   
![1680409026244](https://user-images.githubusercontent.com/30795397/229331245-8fb078e5-d425-4073-a7cd-35cbbb56c023.png)
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] DavidZ1 commented on issue #8267: [SUPPORT] Why some delta commit logs files are not converted to parquet ?

2023-04-01 Thread via GitHub


DavidZ1 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1493216537

   Yes, I understand. We are testing the effect of different bucket numbers.





[GitHub] [hudi] DavidZ1 commented on issue #8267: [SUPPORT] Why some delta commit logs files are not converted to parquet ?

2023-03-30 Thread via GitHub


DavidZ1 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1491339024

   No, we have stopped the job. The log file cleanup logic is enabled by 
default, as the DAG diagram shows.
   
   
   





[GitHub] [hudi] DavidZ1 commented on issue #8267: [SUPPORT] Why some delta commit logs files are not converted to parquet ?

2023-03-30 Thread via GitHub


DavidZ1 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1490228324

   We have currently stopped the offline compaction job. We use the bucket 
index with the bucket number set to 100. This file group currently has 13 
parquet files, and its log files are still there; they have not been cleaned 
up.
   
   
![513108a072bac3e368d7791deff778e](https://user-images.githubusercontent.com/30795397/228837762-b4f34036-6b42-4854-952a-9bce101bbe23.png)
   
   
![e30ee1a4cc28e919ae2f12b4d96a002](https://user-images.githubusercontent.com/30795397/228837778-57158599-7fce-4526-aad0-532dc574ed98.png)
   
   





[GitHub] [hudi] DavidZ1 commented on issue #8267: [SUPPORT] Why some delta commit logs files are not converted to parquet ?

2023-03-29 Thread via GitHub


DavidZ1 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1488144130

   > > there is actually no filedId parquet file
   > 
   > Confused by your words, can you re-organize it a little?
   
   Sorry, we found that instant `20230328130031810` has been compacted, but 
there is no compaction record for fileId `-0b69-4b13-a1b2-677b800e0729`.





[GitHub] [hudi] DavidZ1 commented on issue #8267: [SUPPORT] Why some delta commit logs files are not converted to parquet ?

2023-03-29 Thread via GitHub


DavidZ1 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1487993254

   Yes, we checked the compaction archive file and found that the corresponding 
commit has completed compaction, but there is actually no parquet file for 
that fileId.
   
   
   
![556c395225d6cb7ec60bfc97c4b32fe](https://user-images.githubusercontent.com/30795397/228440036-33be49b9-8581-42e9-ac9b-99adfb8a9541.png)
   
   
![289f141a5bfa99362f94c6d7194](https://user-images.githubusercontent.com/30795397/228440081-baeef86b-97b1-4370-90be-3a050d842b33.png)
   
   
   
![1680069696475](https://user-images.githubusercontent.com/30795397/228440518-059af26a-d3a8-4a3c-ba4f-e453c3d152a5.png)
   





[GitHub] [hudi] DavidZ1 commented on issue #8267: [SUPPORT] Why some delta commit logs files are not converted to parquet ?

2023-03-28 Thread via GitHub


DavidZ1 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1486562349

   At present, our data is written in real time and compaction runs offline in 
a separate Flink job, but we found that some log files are still not 
compacted.
   
   The Hudi CLI shows that the compactions completed normally, with nothing 
abnormal.
   
   1. DAG
   
![1679997323126](https://user-images.githubusercontent.com/30795397/228200049-4a1e934b-5b33-43cb-bf22-d34cd75a314f.png)
   
   2. No parquet file
   
![1679997363383](https://user-images.githubusercontent.com/30795397/228200213-99b4ab44-0249-40ce-b1fc-9979dbfabdec.png)
   
   no parquet files for 13 hour partitions
   
![1679997491851](https://user-images.githubusercontent.com/30795397/228200789-5c5b8a19-f373-4f9a-9439-590fe6f08969.png)
   
   
   3. Compaction status
   
![1679997449250](https://user-images.githubusercontent.com/30795397/228200591-ec0986ac-a02c-4e82-8ccf-62bf6fd4d846.png)
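
A check like the one in the screenshots can be scripted over a partition's file listing. This is only a rough sketch under assumptions: it assumes base files are named `<fileId>_<writeToken>_<instantTime>.parquet` and log files are named `.<fileId>_<baseInstant>.log.<version>_<writeToken>`, and the function name `groups_without_base_file` and the sample file names are made up for illustration.

```python
import re
from collections import defaultdict

# Assumed naming: <fileId>_<writeToken>_<instantTime>.parquet
BASE_RE = re.compile(r"^(?P<file_id>[^_.][^_]*)_(?P<token>[^_]+)_(?P<instant>\d+)\.parquet$")
# Assumed naming: .<fileId>_<baseInstant>.log.<version>_<writeToken>
LOG_RE = re.compile(r"^\.(?P<file_id>[^_]+)_(?P<instant>\d+)\.log\.(?P<version>\d+)(?:_(?P<token>.+))?$")


def groups_without_base_file(file_names):
    """Map fileId -> number of log files, for file groups that have no parquet base file."""
    has_base = set()
    log_counts = defaultdict(int)
    for name in file_names:
        base = BASE_RE.match(name)
        if base:
            has_base.add(base["file_id"])
            continue
        log = LOG_RE.match(name)
        if log:
            log_counts[log["file_id"]] += 1
    return {fid: count for fid, count in log_counts.items() if fid not in has_base}


if __name__ == "__main__":
    # Hypothetical listing of one partition directory.
    listing = [
        ".fg1_20230328130031810.log.1_0-1-0",
        ".fg1_20230328130031810.log.2_0-1-0",
        "fg2_0-1-0_20230328130031810.parquet",
        ".fg2_20230328130031810.log.1_0-1-0",
        ".hoodie_partition_metadata",
    ]
    print(groups_without_base_file(listing))
```

Comparing the reported file IDs against the file IDs listed in the compaction plans would show whether those file groups were ever scheduled for compaction.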
   
   
   





[GitHub] [hudi] DavidZ1 commented on issue #8267: [SUPPORT] Why some delta commit logs files are not converted to parquet ?

2023-03-27 Thread via GitHub


DavidZ1 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1484881742

   Thanks for the reply.
   
   > * the `--service` param has no value, it is a non-valued param
   
   We tried it, and it works without problems.





[GitHub] [hudi] DavidZ1 commented on issue #8267: [SUPPORT] Why some delta commit logs files are not converted to parquet ?

2023-03-24 Thread via GitHub


DavidZ1 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1482523743

   Using the Hudi CLI, we found that compaction has completed for some of the 
instants. We have a few questions:
   
   1. Why is the `Data File Path` null in the Hudi CLI output?
   
   
![1679650806693](https://user-images.githubusercontent.com/30795397/227484026-005e5a52-3e12-41d6-9fc3-18949793c9ac.png)
   
   
![1679651061728](https://user-images.githubusercontent.com/30795397/227484590-fb028ec5-eee5-4194-b292-de60d90fdf19.png)
   
   2. Why is the size of the `xxx.compaction.inflight` meta file zero?
   
![1679651190521](https://user-images.githubusercontent.com/30795397/227485559-1fca9d52-4104-45ea-b984-6d78d05497be.png)
   
   
   
   





[GitHub] [hudi] DavidZ1 commented on issue #8267: [SUPPORT] Why some delta commit logs files are not converted to parquet ?

2023-03-24 Thread via GitHub


DavidZ1 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1482471170

   When I used the `--service` parameter, I got a parsing error:
   
   
![1679647857839](https://user-images.githubusercontent.com/30795397/227473299-f4a9b91f-c5b9-4404-90b0-ce433e7135ad.png)
   
   We found that it runs once the `JCommander` `arity` is set.





[GitHub] [hudi] DavidZ1 commented on issue #8267: [SUPPORT] Why some delta commit logs files are not converted to parquet ?

2023-03-23 Thread via GitHub


DavidZ1 commented on issue #8267:
URL: https://github.com/apache/hudi/issues/8267#issuecomment-1482134241

   Thank you for your answer.
   
   I used the Hudi CLI to look at the compaction progress of the table and 
found that many compactions are `inflight`. Does this mean that the 
asynchronous compaction in the Flink job is too slow and needs more 
resources, or is it an offline compaction plan?
   
   
   
![f69ab074618168c8e40026c67a7a728](https://user-images.githubusercontent.com/30795397/227402647-ffa831c3-176b-4d3b-98c3-e08559aa5621.png)
   
   
![df6b4140d34c0fd4ae763196aac7f4a](https://user-images.githubusercontent.com/30795397/227402674-0e0bd4fa-670e-4a13-bece-63a0125627ad.png)
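
The same view can be approximated from the file names in the `.hoodie` directory. This is a sketch under assumptions: it assumes the timeline layout where a compaction leaves `<instant>.compaction.requested` and `<instant>.compaction.inflight` files and is marked complete by `<instant>.commit`. The function name `pending_compactions` and the sample instants are made up for illustration.

```python
def pending_compactions(timeline_files):
    """Map instant -> 'requested' or 'inflight' for compactions that have no completed commit."""
    requested, inflight, completed = set(), set(), set()
    for name in timeline_files:
        # Split "<instant>.<suffix>" at the first dot.
        instant, _, suffix = name.partition(".")
        if suffix == "compaction.requested":
            requested.add(instant)
        elif suffix == "compaction.inflight":
            inflight.add(instant)
        elif suffix == "commit":
            completed.add(instant)

    pending = {}
    for instant in sorted(requested | inflight):
        if instant in completed:
            continue  # compaction finished; the older state files simply remain
        pending[instant] = "inflight" if instant in inflight else "requested"
    return pending


if __name__ == "__main__":
    # Hypothetical timeline listing.
    files = [
        "20230323100000000.compaction.requested",
        "20230323100000000.compaction.inflight",
        "20230323100000000.commit",
        "20230323110000000.compaction.requested",
        "20230323110000000.compaction.inflight",
        "20230323120000000.compaction.requested",
        "20230323120500000.deltacommit",
    ]
    for instant, state in pending_compactions(files).items():
        print(instant, state)
```

A growing number of `requested` or `inflight` entries over time would suggest that compaction is not keeping up with the write rate.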
   
   
   
   

