Yun Gao created FLINK-20295:
-------------------------------

             Summary: File Source lost data when reading from directories 
created by FileSystemTableSink with JSON format
                 Key: FLINK-20295
                 URL: https://issues.apache.org/jira/browse/FLINK-20295
             Project: Flink
          Issue Type: Bug
            Reporter: Yun Gao
             Fix For: 1.12.0
         Attachments: compaction.tgz

When testing the compaction functionality of the FileSystemTableSink, I found 
that with the JSON format the produced directories cannot be read 
correctly by the file source: only part of the records are read.


Checking the produced directories shows that they contain the expected 
number of records, so the issue seems to be on the source side.
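
A check of this kind can be sketched as follows. This is a minimal 
illustration, not the code used for this report. It assumes that the JSON 
format writes one record per line and that files whose names start with "." 
or "_" are not final output and should be skipped. The class name 
RecordCounter is hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Hypothetical helper: counts line-delimited records under a directory tree.
public class RecordCounter {

    /** Sums the non-empty lines of every visible regular file under root. */
    public static long countRecords(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> !hasHiddenName(p))
                    .mapToLong(RecordCounter::countNonEmptyLines)
                    .sum();
        }
    }

    // Assumption: only the file name is checked, not its parent directories.
    private static boolean hasHiddenName(Path p) {
        String name = p.getFileName().toString();
        return name.startsWith(".") || name.startsWith("_");
    }

    // Assumption: files are UTF-8 text with one record per line.
    private static long countNonEmptyLines(Path file) {
        try (Stream<String> lines = Files.lines(file)) {
            return lines.filter(l -> !l.trim().isEmpty()).count();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: RecordCounter <directory>");
            return;
        }
        System.out.println(countRecords(Paths.get(args[0])));
    }
}
```

Running it against the unpacked attachment and comparing the result with the 
number of records returned by the file source would show on which side the 
records go missing.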

 

The issue only occurs with the JSON format.

The data is produced by 
[FileCompactionTest|https://github.com/gaoyunhaii/flink1.12test/blob/main/src/main/java/FileCompactionTest.java]
 and read by 
[FileCompactionCheckTest|https://github.com/gaoyunhaii/flink1.12test/blob/main/src/main/java/FileCompactionCheckTest.java].
An example tar file of the produced directories, containing 8000 records, is 
also attached.

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
