Hi,

Hadoop FileInputFormats (by default) also ignore hidden files (files starting 
with “.” or “_”). You can override this behaviour in Flink by subclassing 
TextInputFormat and overriding the accept() method. You can use a custom input 
format with ExecutionEnvironment.readFile().
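
A rough sketch of what that could look like (untested; in the Flink 1.3 DataSet API the filter hook is FileInputFormat#acceptFile(FileStatus), and the directory path below is just a placeholder):

```scala
import org.apache.flink.api.java.io.TextInputFormat
import org.apache.flink.api.scala._
import org.apache.flink.core.fs.{FileStatus, Path}

// Input format that also accepts hidden files (names starting with "." or "_").
class IncludeHiddenTextInputFormat(path: Path) extends TextInputFormat(path) {
  override def acceptFile(fileStatus: FileStatus): Boolean = {
    // The default implementation rejects names starting with "_" or ".";
    // here we simply accept everything the enumeration finds.
    true
  }
}

object ReadHiddenFiles {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    val dir = "file:///path/to/dir" // placeholder
    val lines = env.readFile(new IncludeHiddenTextInputFormat(new Path(dir)), dir)
    lines.print()
  }
}
```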

Regarding BucketingSink, you can change both the prefixes and suffixes of the 
various files using configuration methods.
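
For example (an untested sketch against the flink-connector-filesystem API; the setter names should exist on BucketingSink, the defaults noted in comments are to the best of my recollection, and the path is a placeholder):

```scala
import org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink

val sink = new BucketingSink[String]("hdfs:///base/path") // placeholder path
sink.setInProgressPrefix("")      // default is "_"
sink.setInProgressSuffix(".in-progress")
sink.setPendingPrefix("")         // default is "_"
sink.setPendingSuffix(".pending")
sink.setPartPrefix("part")        // prefix of the finished part files
```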

Best,
Aljoscha

> On 27. Jun 2017, at 11:53, Adarsh Jain <eradarshj...@gmail.com> wrote:
> 
> Thanks Stefan, my colleague Shashank has filed a bug for this in JIRA:
> 
> https://issues.apache.org/jira/browse/FLINK-6993
> 
> Regards,
> Adarsh
> 
> On Fri, Jun 23, 2017 at 8:19 PM, Stefan Richter <s.rich...@data-artisans.com> wrote:
> Hi,
> 
> I suggest that you simply open an issue for this in our JIRA, describing the 
> improvement idea. That should be the fastest way to get this changed.
> 
> Best,
> Stefan
> 
>> On 23.06.2017 at 15:08, Adarsh Jain <eradarshj...@gmail.com> wrote:
>> 
>> Hi Stefan,
>> 
>> I think I found the problem: try it with a file whose name starts with an 
>> underscore, like "_part-1-0.csv".
>> 
>> While saving, Flink prepends a "_" to the file name; however, while reading at 
>> folder level it does not pick up those files.
>> 
>> Can you suggest a setting so that it does not prepend an underscore while 
>> saving a file?
>> 
>> Regards,
>> Adarsh
>> 
>> On Fri, Jun 23, 2017 at 3:24 PM, Stefan Richter <s.rich...@data-artisans.com> wrote:
>> No, that doesn’t make a difference and also works.
>> 
>>> On 23.06.2017 at 11:40, Adarsh Jain <eradarshj...@gmail.com> wrote:
>>> 
>>> I am using "val env = ExecutionEnvironment.getExecutionEnvironment", can 
>>> this be the problem?
>>> 
>>> With "import org.apache.flink.api.scala.ExecutionEnvironment"
>>> 
>>> Using scala in my program.
>>> 
>>> Regards,
>>> Adarsh 
>>> 
>>> On Fri, Jun 23, 2017 at 3:01 PM, Stefan Richter <s.rich...@data-artisans.com> wrote:
>>> I just copy-pasted your code, added the missing "val env = 
>>> LocalEnvironment.createLocalEnvironment()", and replaced the string with a 
>>> local directory containing some test files that I created. No other changes.
>>> 
>>>> On 23.06.2017 at 11:25, Adarsh Jain <eradarshj...@gmail.com> wrote:
>>>> 
>>>> Hi Stefan,
>>>> 
>>>> Thanks for your efforts in checking; it still doesn't work for me. 
>>>> 
>>>> Can you copy-paste the code you used? Maybe I am making some silly mistake 
>>>> and am not able to figure it out.
>>>> 
>>>> Thanks again.
>>>> 
>>>> Regards,
>>>> Adarsh
>>>> 
>>>> 
>>>> On Fri, Jun 23, 2017 at 2:32 PM, Stefan Richter <s.rich...@data-artisans.com> wrote:
>>>> Hi,
>>>> 
>>>> I tried this out on the current master and the 1.3 release, and both work 
>>>> for me: everything behaves exactly as expected for file names, a directory, 
>>>> and even nested directories.
>>>> 
>>>> Best,
>>>> Stefan
>>>> 
>>>>> On 22.06.2017 at 21:13, Adarsh Jain <eradarshj...@gmail.com> wrote:
>>>>> 
>>>>> Hi Stefan,
>>>>> 
>>>>> Yes, you understood right: when I give the full path up to the filename it 
>>>>> works fine; however, when I give the path only up to the directory it does 
>>>>> not read the data and doesn't print any exceptions either ... I am also 
>>>>> not sure why it is behaving like this.
>>>>> 
>>>>> It should be easy to replicate, in case you can try. That would be really helpful.
>>>>> 
>>>>> Regards,
>>>>> Adarsh
>>>>> 
>>>>> On Thu, Jun 22, 2017 at 9:00 PM, Stefan Richter <s.rich...@data-artisans.com> wrote:
>>>>> Hi,
>>>>> 
>>>>> I am not sure I am getting the problem right: the code works if you use a 
>>>>> file name, but it does not work for directories? What exactly is not 
>>>>> working? Do you get any exceptions?
>>>>> 
>>>>> Best,
>>>>> Stefan
>>>>> 
>>>>>> On 22.06.2017 at 17:01, Adarsh Jain <eradarshj...@gmail.com> wrote:
>>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>> I am trying to use "Recursive Traversal of the Input Path Directory" in 
>>>>>> Flink 1.3 using Scala; a snippet of my code is below. If I give the exact 
>>>>>> file name, it works fine. Ref: 
>>>>>> https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/batch/index.html
>>>>>> 
>>>>>> import org.apache.flink.api.java.utils.ParameterTool
>>>>>> import org.apache.flink.api.java.{DataSet, ExecutionEnvironment}
>>>>>> import org.apache.flink.configuration.Configuration
>>>>>> 
>>>>>> val config = new Configuration
>>>>>> config.setBoolean("recursive.file.enumeration", true)
>>>>>> 
>>>>>> val featuresSource: String = 
>>>>>>   "file:///Users/adarsh/Documents/testData/featurecsv/31c710ac40/2017/06/22"
>>>>>> 
>>>>>> val testInput = env.readTextFile(featuresSource).withParameters(config)
>>>>>> testInput.print()
>>>>>> 
>>>>>> Please guide how to fix this.
>>>>>> 
>>>>>> Regards,
>>>>>> Adarsh
>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> 
>>> 
>>> 
>> 
>> 
> 
> 
