[jira] [Comment Edited] (FLINK-3655) Allow comma-separated or multiple directories to be specified for FileInputFormat
[ https://issues.apache.org/jira/browse/FLINK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15226474#comment-15226474 ]

Maximilian Michels edited comment on FLINK-3655 at 4/5/16 3:37 PM:
-------------------------------------------------------------------

Sounds good. It is important to maintain backwards compatibility. I'm not sure about the "comma-separated Path string": file names may contain commas, so we might skip that for now and do the path list first.

I think we could also use {{readFile(FileInputFormat inputFormat, String... filePaths)}}, which makes the file paths available as a {{String[] filePaths}} array.

was (Author: mxm):
Sounds good. It is important to maintain backwards compatibility. I'm not sure about the "comma-separated Path string". File names may contain commas. So we might skip that for now and do the path list first. I think we could also use {{readFile(FileInputFormat inputFormat, String.. filePaths}} which will return the filePath as a {{String[] filepaths}} array.

> Allow comma-separated or multiple directories to be specified for FileInputFormat
> ---------------------------------------------------------------------------------
>
>                 Key: FLINK-3655
>                 URL: https://issues.apache.org/jira/browse/FLINK-3655
>             Project: Flink
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 1.0.0
>            Reporter: Gna Phetsarath
>            Priority: Minor
>              Labels: starter
>
> Allow comma-separated or multiple directories to be specified for FileInputFormat so that a DataSource will process the directories sequentially.
>
>    env.readFile("/data/2016/01/01/*/*,/data/2016/01/02/*/*,/data/2016/01/03/*/*")
>
> in Scala
>    env.readFile(paths: Seq[String])
> or
>    env.readFile(path: String, otherPaths: String*)
>
> Wildcard support would be a bonus.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
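
The concern about commas in file names, and the varargs alternative, can be illustrated with a small standalone sketch. It does not use Flink at all: `PathArgsSketch`, `collectPaths`, and `splitOnComma` are hypothetical names invented for this illustration, and plain strings stand in for Flink paths. It only shows that a Java varargs parameter arrives as a `String[]`, while naive comma splitting breaks a path that itself contains a comma.

```java
import java.util.Arrays;
import java.util.List;

public class PathArgsSketch {

    // Hypothetical stand-in for a varargs overload such as
    // readFile(inputFormat, String... filePaths): the compiler packs the
    // arguments into a String[] for the method body.
    static List<String> collectPaths(String... filePaths) {
        return Arrays.asList(filePaths);
    }

    // Naive handling of a comma-separated path string. Any comma inside a
    // file name is treated as a separator, which is the problem raised above.
    static List<String> splitOnComma(String commaSeparated) {
        return Arrays.asList(commaSeparated.split(","));
    }

    public static void main(String[] args) {
        // Two paths passed separately: the comma in "a,b.csv" is preserved.
        System.out.println(collectPaths("/data/a,b.csv", "/data/c.csv"));

        // The same two paths joined by a comma are split into three pieces.
        System.out.println(splitOnComma("/data/a,b.csv,/data/c.csv"));
    }
}
```

With separate arguments the caller never has to escape anything, which is one reason to start with the path list and leave the comma-separated form for later.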
[jira] [Comment Edited] (FLINK-3655) Allow comma-separated or multiple directories to be specified for FileInputFormat
[ https://issues.apache.org/jira/browse/FLINK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15224354#comment-15224354 ]

Tian, Li edited comment on FLINK-3655 at 4/4/16 3:44 PM:
---------------------------------------------------------

I think we may need to use "List<Path> filePaths" instead of "Path filePath" in FileInputFormat. To do this, we should also:
1. modify the current implementations to support multiple input paths
2. add methods such as setFilePaths and getFilePaths to FileInputFormat, and support a comma-separated Path string in ExecutionEnvironment
3. for backward compatibility, let FileInputFormat.setFilePath set the inputPaths to a one-element list

was (Author: tianli):
I think we may need to use List instead of a single Path in FileInputFormat. In this way, we should also
1. modify current implementations to support multiple input paths
2. add functions like setFilePaths, getFilePaths to FileInputFormat, and support comma-separated Path string in ExecutionEnvironment
3. for backward compatibility, let FileInputFormat.setFilePath set the inputPaths to a one-element list
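
The three steps proposed in this comment could look roughly like the following sketch. This is not Flink's FileInputFormat: `MultiPathFormatSketch` is a hypothetical standalone class, plain `String` values stand in for Flink's `Path` type so the example compiles without Flink, and the validation in `setFilePaths` is an added assumption. Only the method names `setFilePaths`, `getFilePaths`, and `setFilePath` come from the comment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class MultiPathFormatSketch {

    // A list of input paths replaces the former single path field.
    private List<String> filePaths = new ArrayList<>();

    // New setter for several input paths (step 2 of the proposal).
    public void setFilePaths(List<String> paths) {
        if (paths == null || paths.isEmpty()) {
            throw new IllegalArgumentException("At least one input path is required.");
        }
        // Copy defensively so later changes by the caller do not leak in.
        this.filePaths = new ArrayList<>(paths);
    }

    // New getter returning a read-only view of the configured paths.
    public List<String> getFilePaths() {
        return Collections.unmodifiableList(filePaths);
    }

    // Existing single-path setter, kept for backward compatibility (step 3):
    // it stores the path as a one-element list.
    public void setFilePath(String path) {
        setFilePaths(Collections.singletonList(path));
    }

    public static void main(String[] args) {
        MultiPathFormatSketch format = new MultiPathFormatSketch();

        // Old-style call still works and yields a one-element list.
        format.setFilePath("/data/2016/01/01");
        System.out.println(format.getFilePaths());

        // New-style call with several directories.
        format.setFilePaths(Arrays.asList("/data/2016/01/01", "/data/2016/01/02"));
        System.out.println(format.getFilePaths());
    }
}
```

Routing the old setter through the new one keeps a single source of truth for the configured paths, so existing callers need no changes.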