[ https://issues.apache.org/jira/browse/FLINK-20538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17245886#comment-17245886 ]
Jark Wu commented on FLINK-20538:
---------------------------------

cc [~lzljs3620320] could you have a look?

> sink.rolling-policy.file-size does not work in filesystem connector
> -------------------------------------------------------------------
>
>                 Key: FLINK-20538
>                 URL: https://issues.apache.org/jira/browse/FLINK-20538
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / FileSystem
>    Affects Versions: 1.11.1
>            Reporter: zhuxiaoshang
>            Priority: Major
>
> When I use the SQL filesystem connector to write data to HDFS and set
> sink.rolling-policy.file-size to 50MB, the setting does not seem to take
> effect: there are still files larger than 100MB.
> My table DDL is:
>
> {code:java}
> CREATE TABLE cpc_bd_recall_log_hdfs (
>   log_timestamp BIGINT,
>   ip STRING,
>   `raw` STRING,
>   `day` STRING, `hour` STRING, `minute` STRING
> ) PARTITIONED BY (`day`, `hour`, `minute`) WITH (
>   'connector'='filesystem',
>   'path'='hdfs://xxx/test.db/hdfs_test',
>   'format'='parquet',
>   'parquet.compression'='SNAPPY',
>   'sink.rolling-policy.file-size' = '50MB',
>   'sink.partition-commit.policy.kind' = 'success-file',
>   'sink.partition-commit.delay'='60s'
> );
> {code}
> The HDFS files are:
>
> {code:java}
> 0 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/_SUCCESS
> -rw-r--r-- 3 hadoop hadoop 31.7 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-0-2500
> -rw-r--r-- 3 hadoop hadoop 121.8 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-0-2501
> -rw-r--r-- 3 hadoop hadoop 31.9 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-1-2499
> -rw-r--r-- 3 hadoop hadoop 122.0 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-1-2500
> -rw-r--r-- 3 hadoop hadoop 31.8 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-10-2501
> -rw-r--r-- 3 hadoop hadoop 121.8 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-10-2502
> -rw-r--r-- 3 hadoop hadoop 31.9 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-11-2500
> -rw-r--r-- 3 hadoop hadoop 122.2 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-11-2501
> -rw-r--r-- 3 hadoop hadoop 31.9 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-12-2500
> -rw-r--r-- 3 hadoop hadoop 122.2 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-12-2501
> -rw-r--r-- 3 hadoop hadoop 31.8 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-13-2499
> -rw-r--r-- 3 hadoop hadoop 122.0 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-13-2500
> -rw-r--r-- 3 hadoop hadoop 31.6 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-14-2500
> -rw-r--r-- 3 hadoop hadoop 122.1 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-14-2501
> -rw-r--r-- 3 hadoop hadoop 31.9 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-15-2498
> -rw-r--r-- 3 hadoop hadoop 121.8 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-15-2499
> -rw-r--r-- 3 hadoop hadoop 31.7 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-16-2501
> -rw-r--r-- 3 hadoop hadoop 122.0 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-16-2502
> -rw-r--r-- 3 hadoop hadoop 31.7 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-17-2500
> -rw-r--r-- 3 hadoop hadoop 122.5 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-17-2501
> -rw-r--r-- 3 hadoop hadoop 31.8 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-18-2500
> -rw-r--r-- 3 hadoop hadoop 121.7 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-18-2501
> -rw-r--r-- 3 hadoop hadoop 31.9 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-19-2501
> -rw-r--r-- 3 hadoop hadoop 121.7 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-19-2502
> -rw-r--r-- 3 hadoop hadoop 31.6 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-2-2499
> -rw-r--r-- 3 hadoop hadoop 121.6 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-2-2500
> -rw-r--r-- 3 hadoop hadoop 31.8 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-3-2500
> -rw-r--r-- 3 hadoop hadoop 121.8 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-3-2501
> -rw-r--r-- 3 hadoop hadoop 31.6 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-4-2499
> -rw-r--r-- 3 hadoop hadoop 122.1 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-4-2500
> -rw-r--r-- 3 hadoop hadoop 31.6 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-5-2499
> -rw-r--r-- 3 hadoop hadoop 121.8 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-5-2500
> -rw-r--r-- 3 hadoop hadoop 31.8 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-6-2499
> -rw-r--r-- 3 hadoop hadoop 121.5 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-6-2500
> -rw-r--r-- 3 hadoop hadoop 31.6 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-7-2500
> -rw-r--r-- 3 hadoop hadoop 122.0 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-7-2501
> -rw-r--r-- 3 hadoop hadoop 31.7 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-8-2501
> -rw-r--r-- 3 hadoop hadoop 122.0 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-8-2502
> -rw-r--r-- 3 hadoop hadoop 31.9 M 2020-12-04 14:55
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-9-2501
> -rw-r--r-- 3 hadoop hadoop 121.9 M 2020-12-04 14:56
> hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-9-2502
> {code}
>
> However, when I dig into the source code, I see that writing an element to a
> bucket invokes `shouldRollOnEvent` in TableRollingPolicy.
> I don't understand how this can happen. Is it a bug, or am I getting
> something wrong?
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
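
For anyone trying to reason about the question above, here is a minimal Java sketch of one way a per-record size check can still let files grow well past the threshold: the check only sees bytes already flushed to the file, while the writer buffers records in memory. This is an illustration under assumptions, not Flink's code and not a confirmed diagnosis of this issue. `SizeRollSketch`, `PartFile`, `simulate`, and the `flushThreshold` parameter are all hypothetical names; only `shouldRollOnEvent` is borrowed from the report, and its body here is invented.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical simulation; none of these types are Flink APIs. */
final class SizeRollSketch {

    /** One part file: bytes already flushed vs. bytes still held by the writer. */
    static final class PartFile {
        long flushedBytes;   // what a size check on the output stream can see
        long bufferedBytes;  // what the writer still buffers in memory

        long finalSize() {
            return flushedBytes + bufferedBytes;
        }
    }

    /** Assumed size check: it compares only the flushed bytes with the limit. */
    static boolean shouldRollOnEvent(PartFile file, long maxBytes) {
        return file.flushedBytes > maxBytes;
    }

    /**
     * Writes recordCount records of recordBytes each. The writer flushes its
     * buffer only once it holds at least flushThreshold bytes. Returns the
     * final size of every file, in the order the files were closed.
     */
    static List<Long> simulate(int recordCount, long recordBytes,
                               long flushThreshold, long maxBytes) {
        List<Long> closedSizes = new ArrayList<>();
        PartFile current = new PartFile();
        for (int i = 0; i < recordCount; i++) {
            // The check runs before every record, as described in the report.
            if (shouldRollOnEvent(current, maxBytes)) {
                closedSizes.add(current.finalSize());
                current = new PartFile();
            }
            current.bufferedBytes += recordBytes;
            if (current.bufferedBytes >= flushThreshold) {
                current.flushedBytes += current.bufferedBytes;
                current.bufferedBytes = 0;
            }
        }
        closedSizes.add(current.finalSize());
        return closedSizes;
    }

    public static void main(String[] args) {
        // Units are arbitrary; read them as MB for comparison with the report.
        System.out.println(simulate(300, 1, 128, 50)); // writer buffers 128 units
        System.out.println(simulate(300, 1, 1, 50));   // writer flushes every record
    }
}
```

With a limit of 50 and a writer that buffers 128 units, the simulated files come out at 128, 128, and 44 units; when every record is flushed immediately they come out at 51 units each, with a final file of 45. Whether the Parquet writer behind this connector behaves like the buffered case is an assumption that would need to be checked against the Flink source.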