Re: Continuous File monitoring not reading nested files

2017-01-10 Thread Kostas Kloudas
Aljoscha is right! Any contribution is more than welcome. Kostas > On Jan 10, 2017, at 3:48 PM, Aljoscha Krettek wrote: > > Yes, please go ahead with the fix! :-) > > (If I'm not mistaken Kostas is working on other stuff right now.) > > On Mon, 9 Jan 2017 at 23:19

Re: Continuous File monitoring not reading nested files

2017-01-10 Thread Aljoscha Krettek
Yes, please go ahead with the fix! :-) (If I'm not mistaken Kostas is working on other stuff right now.) On Mon, 9 Jan 2017 at 23:19 Yassine MARZOUGUI wrote: > Hi, > > I found the root cause of the problem : the listEligibleFiles method in >

Re: Continuous File monitoring not reading nested files

2017-01-09 Thread Yassine MARZOUGUI
Hi, I found the root cause of the problem: the listEligibleFiles method in ContinuousFileMonitoringFunction scans only the topmost files and ignores the nested files. By fixing that I was able to get the expected output. I created a Jira issue: https://issues.apache.org/jira/browse/FLINK-5432.
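
To illustrate the difference described above, here is a minimal sketch of a top-level-only listing versus a recursive one. It is not the Flink code and not the actual fix: it uses plain java.nio.file instead of Flink's file system classes, the class name NestedFileLister and the recursive flag are made up for the example, and the method name is borrowed from the message only to make the connection clear.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the scanning logic discussed in the thread.
class NestedFileLister {

    /**
     * Collects the regular files under dir. Subdirectories are only
     * descended into when recursive is true; otherwise they are skipped,
     * which mirrors the "topmost files only" behaviour reported above.
     */
    static List<Path> listEligibleFiles(Path dir, boolean recursive) throws IOException {
        List<Path> files = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                if (Files.isDirectory(entry)) {
                    if (recursive) {
                        // Descend into the nested directory and keep its files.
                        files.addAll(listEligibleFiles(entry, true));
                    }
                } else {
                    files.add(entry);
                }
            }
        }
        return files;
    }

    public static void main(String[] args) throws IOException {
        // Build a small tree: mydir/top.txt and mydir/nested/inner.txt
        Path root = Files.createTempDirectory("mydir");
        Files.createFile(root.resolve("top.txt"));
        Path nested = Files.createDirectory(root.resolve("nested"));
        Files.createFile(nested.resolve("inner.txt"));

        System.out.println(listEligibleFiles(root, false).size()); // top-level files only
        System.out.println(listEligibleFiles(root, true).size());  // nested files included
    }
}
```

With the flag off, only top.txt is returned; with it on, inner.txt is picked up as well.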

Re: Continuous File monitoring not reading nested files

2017-01-09 Thread Yassine MARZOUGUI
Hi Kostas, I debugged the code and the nestedFileEnumeration parameter was always true during the execution. I noticed, however, that in the following loop in ContinuousFileMonitoringFunction, for some reason, the fileStatus was null for files in nested folders, and non-null for files directly

Re: Continuous File monitoring not reading nested files

2017-01-09 Thread Kostas Kloudas
Yes, thanks for the effort. I will look into it. Kostas > On Jan 9, 2017, at 4:24 PM, Lukas Kircher > wrote: > > Thanks for your suggestions: > > @Timo > 1) Regarding the recursive.file.enumeration parameter: I think what counts > here is the

Re: Continuous File monitoring not reading nested files

2017-01-09 Thread Lukas Kircher
Thanks for your suggestions: @Timo 1) Regarding the recursive.file.enumeration parameter: I think what counts here is the enumerateNestedFiles parameter in FileInputFormat.java. Calling the setter for enumerateNestedFiles is expected to overwrite recursive.file.enumeration. Not literally - I

Re: Continuous File monitoring not reading nested files

2017-01-09 Thread Kostas Kloudas
Hi Yassine, I suspect that the problem is in the way the input format (and not the reader) scans nested files, but could you check whether, in the code that is executed by the tasks, the nestedFileEnumeration parameter is still true? I am asking in order to pin down whether the problem is in the way we

Re: Continuous File monitoring not reading nested files

2017-01-09 Thread Kostas Kloudas
Hi Lukas, Are you sure that the tempFile.deleteOnExit() does not remove the files before the test completes? I am just asking to be sure. Also, from the code, I suppose that you run it locally. I suspect that the problem is in the way the input format scans nested files, but could you see if in
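
On the deleteOnExit() question, a quick way to check is the sketch below. It relies only on the documented behaviour of java.io.File.deleteOnExit(), which registers the file for deletion when the JVM terminates normally and does not delete it at the time of the call. The class and method names are made up for the example, and it says nothing about how the original test was structured.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical helper, not taken from the test discussed in the thread.
class DeleteOnExitCheck {

    /**
     * Creates a temp file, registers it with deleteOnExit(), and reports
     * whether it is still present while the JVM keeps running.
     */
    static boolean stillExistsAfterRegistration() throws IOException {
        File tempFile = File.createTempFile("nested-test", ".txt");
        tempFile.deleteOnExit(); // deletion is deferred until JVM shutdown
        return tempFile.exists();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(stillExistsAfterRegistration());
    }
}
```

If this reports that the file exists, the files should remain readable for as long as the test runs in the same JVM.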

Fwd: Continuous File monitoring not reading nested files

2017-01-09 Thread Lukas Kircher
Hi all, this is probably related to the problem that I reported in December. In case it helps, you can find a self-contained example below. I haven't looked deeply into the problem, but it seems like the correct file splits are determined but somehow not processed. If I read from HDFS nested

Re: Continuous File monitoring not reading nested files

2017-01-09 Thread Yassine MARZOUGUI
Hi, Any updates on this issue? Thank you. Best, Yassine On Dec 20, 2016 6:15 PM, "Aljoscha Krettek" wrote: +kostas, who probably has the most experience with this by now. Do you have an idea what might be going on? On Fri, 16 Dec 2016 at 15:45 Yassine MARZOUGUI

Re: Continuous File monitoring not reading nested files

2016-12-20 Thread Aljoscha Krettek
+kostas, who probably has the most experience with this by now. Do you have an idea what might be going on? On Fri, 16 Dec 2016 at 15:45 Yassine MARZOUGUI wrote: > Looks like this is not specific to the continuous file monitoring, I'm > having the same issue (files in

Re: Continuous File monitoring not reading nested files

2016-12-16 Thread Yassine MARZOUGUI
Looks like this is not specific to the continuous file monitoring; I'm having the same issue (files in nested directories are not read) when using: env.readFile(fileInputFormat, "hdfs:///shared/mydir", FileProcessingMode.PROCESS_ONCE, -1L) 2016-12-16 11:12 GMT+01:00 Yassine MARZOUGUI

Continuous File monitoring not reading nested files

2016-12-16 Thread Yassine MARZOUGUI
Hi all, I'm using the following code to continuously process files from a directory "mydir". final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(); FileInputFormat fileInputFormat = new TextInputFormat(new Path("hdfs:///shared/mydir"));