[ https://issues.apache.org/jira/browse/FLINK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16362280#comment-16362280 ]
ASF GitHub Bot commented on FLINK-3655:
---------------------------------------

Github user zentol commented on a diff in the pull request:

    https://github.com/apache/flink/pull/5415#discussion_r167848724

    --- Diff: flink-core/src/main/java/org/apache/flink/api/common/io/FileInputFormat.java ---
    @@ -404,9 +486,39 @@ public FileBaseStatistics getStatistics(BaseStatistics cachedStats) throws IOExc
         return null;
       }

    -  protected FileBaseStatistics getFileStats(FileBaseStatistics cachedStats, Path filePath, FileSystem fs,
    -      ArrayList<FileStatus> files) throws IOException {
    -
    +  protected FileBaseStatistics getFileStats(FileBaseStatistics cachedStats, Path[] filePaths, ArrayList<FileStatus> files) throws IOException {
    +
    +    // shortcut for a single path that preserves cached stats
    +    if (filePaths.length == 1) {
    --- End diff --

    not sure if this optimization is really worth it


> Allow comma-separated or multiple directories to be specified for FileInputFormat
> ---------------------------------------------------------------------------------
>
>                 Key: FLINK-3655
>                 URL: https://issues.apache.org/jira/browse/FLINK-3655
>             Project: Flink
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 1.0.0
>            Reporter: Gna Phetsarath
>            Assignee: Fabian Hueske
>            Priority: Major
>              Labels: starter
>             Fix For: 1.5.0
>
>
> Allow comma-separated or multiple directories to be specified for
> FileInputFormat so that a DataSource will process the directories
> sequentially.
>
> env.readFile("/data/2016/01/01/*/*,/data/2016/01/02/*/*,/data/2016/01/03/*/*")
> in Scala
>    env.readFile(paths: Seq[String])
> or
>    env.readFile(path: String, otherPaths: String*)
> Wildcard support would be a bonus.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
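
For illustration only, the comma-separated form requested in the issue description could be split into individual paths roughly as below. This is a minimal sketch, not the code from the pull request: `PathListParser` and its `split` method are hypothetical names and are not part of Flink, and the sketch assumes that no path itself contains a comma.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Flink's API.
final class PathListParser {

    private PathListParser() {
    }

    // Splits a comma-separated string into individual, trimmed path strings.
    // Empty segments (for example from a trailing comma) are skipped.
    // Assumption: a path never contains a literal comma.
    static List<String> split(String commaSeparated) {
        List<String> paths = new ArrayList<>();
        for (String part : commaSeparated.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                paths.add(trimmed);
            }
        }
        return paths;
    }

    public static void main(String[] args) {
        // The example input is taken from the issue description.
        List<String> paths =
            split("/data/2016/01/01/*/*,/data/2016/01/02/*/*,/data/2016/01/03/*/*");
        for (String path : paths) {
            System.out.println(path);
        }
    }
}
```

Each resulting string would still have to be turned into a path object and have its wildcards expanded by whatever mechanism the input format uses; the sketch covers only the splitting step.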