[ 
https://issues.apache.org/jira/browse/NIFI-4434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16570998#comment-16570998
 ] 

ASF GitHub Bot commented on NIFI-4434:
--------------------------------------

Github user jtstorck commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2937#discussion_r208072541
  
    --- Diff: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/ListHDFS.java ---
    @@ -462,11 +523,15 @@ private String getPerms(final FsAction action) {
     
         private PathFilter createPathFilter(final ProcessContext context) {
             final Pattern filePattern = Pattern.compile(context.getProperty(FILE_FILTER).getValue());
    --- End diff --
    
    @ottobackwards The FILE_FILTER property does not currently support
    Expression Language (EL). The processor could be updated to enable EL
    for that property, but doing so is outside the scope of this PR.


> ListHDFS applies File Filter also to subdirectory names in recursive search
> ---------------------------------------------------------------------------
>
>                 Key: NIFI-4434
>                 URL: https://issues.apache.org/jira/browse/NIFI-4434
>             Project: Apache NiFi
>          Issue Type: Bug
>    Affects Versions: 1.3.0
>            Reporter: Holger Frydrych
>            Assignee: Jeff Storck
>            Priority: Major
>
> The File Filter regex configured in the ListHDFS processor is applied not 
> just to files found, but also to subdirectories. 
> If you try to set up a recursive search to list e.g. all csv files in a 
> directory hierarchy via a regex like ".*\.csv", it will only pick up csv 
> files in the base directory, not in any subdirectory. This is because 
> subdirectories don't typically match that regex pattern.
> To fix this, either subdirectories should not be matched against the file 
> filter, or the file filter should be applied to the full path of all files 
> (relative to the base directory). The GetHDFS processor offers both options 
> via a switch.
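
The behavior described in the issue can be reproduced without Hadoop. The sketch below is only an illustration, not the ListHDFS implementation: the `FileFilterSketch` class, its `Entry` type, and all method names are hypothetical, and the directory tree is a small in-memory stand-in for an HDFS listing. It contrasts the reported behavior (filter applied to every entry, so non-matching subdirectories are never descended into) with the two fixes proposed in the description.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical, Hadoop-free model of a recursive listing; not the ListHDFS code.
class FileFilterSketch {

    // One entry in the in-memory tree; children stay empty for plain files.
    static final class Entry {
        final String name;
        final boolean directory;
        final List<Entry> children = new ArrayList<>();

        Entry(String name, boolean directory) {
            this.name = name;
            this.directory = directory;
        }

        Entry add(Entry child) {
            children.add(child);
            return this;
        }
    }

    // Reported behavior: the regex is applied to every entry name, so a
    // subdirectory such as "sub" fails ".*\.csv" and is skipped entirely.
    static void listFilteringEverything(Entry dir, String prefix, Pattern filter, List<String> out) {
        for (Entry e : dir.children) {
            if (!filter.matcher(e.name).matches()) {
                continue;
            }
            if (e.directory) {
                listFilteringEverything(e, prefix + e.name + "/", filter, out);
            } else {
                out.add(prefix + e.name);
            }
        }
    }

    // First proposed fix: always recurse into directories, filter file names only.
    static void listFilteringFilesOnly(Entry dir, String prefix, Pattern filter, List<String> out) {
        for (Entry e : dir.children) {
            if (e.directory) {
                listFilteringFilesOnly(e, prefix + e.name + "/", filter, out);
            } else if (filter.matcher(e.name).matches()) {
                out.add(prefix + e.name);
            }
        }
    }

    // Second proposed fix: match the regex against the path relative to the base directory.
    static void listFilteringFullPath(Entry dir, String prefix, Pattern filter, List<String> out) {
        for (Entry e : dir.children) {
            String relativePath = prefix + e.name;
            if (e.directory) {
                listFilteringFullPath(e, relativePath + "/", filter, out);
            } else if (filter.matcher(relativePath).matches()) {
                out.add(relativePath);
            }
        }
    }

    // Sample tree: base/{a.csv, notes.txt, sub/{b.csv}}
    static Entry sampleTree() {
        return new Entry("base", true)
                .add(new Entry("a.csv", false))
                .add(new Entry("notes.txt", false))
                .add(new Entry("sub", true).add(new Entry("b.csv", false)));
    }

    public static void main(String[] args) {
        Pattern filter = Pattern.compile(".*\\.csv");

        List<String> everything = new ArrayList<>();
        listFilteringEverything(sampleTree(), "", filter, everything);
        System.out.println(everything); // [a.csv]

        List<String> filesOnly = new ArrayList<>();
        listFilteringFilesOnly(sampleTree(), "", filter, filesOnly);
        System.out.println(filesOnly); // [a.csv, sub/b.csv]

        List<String> fullPath = new ArrayList<>();
        listFilteringFullPath(sampleTree(), "", filter, fullPath);
        System.out.println(fullPath); // [a.csv, sub/b.csv]
    }
}
```

With the regex ".*\.csv" both fixes return the same files here, because "." also matches the "/" separator. The two would differ for a regex that constrains directory names, which only the full-path variant can express.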



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
