[ 
https://issues.apache.org/jira/browse/HUDI-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17391877#comment-17391877
 ] 

ASF GitHub Bot commented on HUDI-2247:
--------------------------------------

danny0405 commented on a change in pull request #3363:
URL: https://github.com/apache/hudi/pull/3363#discussion_r681386305



##########
File path: 
hudi-flink/src/main/java/org/apache/hudi/sink/partitioner/profile/WriteProfiles.java
##########
@@ -131,7 +133,7 @@ public static void clean(String path) {
         })
         // filter out crushed files
         .filter(Objects::nonNull)
-        .filter(fileStatus -> fileStatus.getLen() > 0)
+        .filter(fileStatus -> fileStatus.getLen() > MAGIC.length)
         .collect(Collectors.toList());

Review comment:
       The committed file can be a log file. What do you mean by `log file 
still filter by fileSize > 0`?
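
To make the question concrete, here is a minimal sketch of a filter that treats the two file kinds differently. It is not the code in this pull request. `FileInfo` is a hypothetical stand-in for Hadoop's `FileStatus`, the `.parquet` suffix check is an assumed way to tell base files from log files, and `MAGIC` is assumed to be the 4-byte parquet marker `PAR1`.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

final class FileLengthFilterSketch {
    // Assumption: MAGIC is the 4-byte parquet marker "PAR1".
    static final byte[] MAGIC = "PAR1".getBytes(StandardCharsets.US_ASCII);

    // Hypothetical stand-in for Hadoop's FileStatus: only a name and a length.
    static final class FileInfo {
        final String name;
        final long len;

        FileInfo(String name, long len) {
            this.name = name;
            this.len = len;
        }
    }

    // Parquet files must be longer than the magic bytes;
    // any other file (for example a log file) only needs to be non-empty.
    static boolean isValid(FileInfo file) {
        if (file.name.endsWith(".parquet")) { // assumed naming convention
            return file.len > MAGIC.length;
        }
        return file.len > 0;
    }

    // Drops null entries first, then files that fail the length check.
    static List<FileInfo> filterValid(List<FileInfo> files) {
        return files.stream()
            .filter(Objects::nonNull)
            .filter(FileLengthFilterSketch::isValid)
            .collect(Collectors.toList());
    }
}
```

With this split, a 4-byte parquet file is dropped while a 1-byte log file is kept. Applying `MAGIC.length` to every file, as in the hunk above, would also drop log files of 4 bytes or fewer.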




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Filter file where length less than parquet MAGIC length
> -------------------------------------------------------
>
>                 Key: HUDI-2247
>                 URL: https://issues.apache.org/jira/browse/HUDI-2247
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: Flink Integration
>            Reporter: yuzhaojing
>            Assignee: yuzhaojing
>            Priority: Major
>              Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)
