[ https://issues.apache.org/jira/browse/NIFI-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15419019#comment-15419019 ]

ASF GitHub Bot commented on NIFI-2553:
--------------------------------------

Github user YolandaMDavis commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/843#discussion_r74612515
  
    --- Diff: nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/AbstractHadoopProcessor.java ---
    @@ -286,8 +310,10 @@ HdfsResources resetHDFSResources(String configResources, String dir, ProcessCont
                     }
                 }
     
    +            final Path workingDir = fs.getWorkingDirectory();
                 getLogger().info("Initialized a new HDFS File System with working dir: {} default block size: {} default replication: {} config: {}",
    -                    new Object[] { fs.getWorkingDirectory(), fs.getDefaultBlockSize(new Path(dir)), fs.getDefaultReplication(new Path(dir)), config.toString() });
    +                    new Object[]{workingDir, fs.getDefaultBlockSize(workingDir), fs.getDefaultReplication(workingDir), config.toString()});
    --- End diff ---
    
    Noted this on the Jira ([NIFI-2553](https://issues.apache.org/jira/browse/NIFI-2553)) 
    but wanted to mention it here as well. Understood on the use of the working 
    directory for block size; I'm curious why a path is required at all, given 
    getDefaultBlockSize's implementation (noted in the Jira comments). There is 
    a small risk that an implementation change on the Hadoop side makes the path 
    genuinely relevant (for example, block-size settings at the directory level). 
    So something to keep in the back of our minds, perhaps, is checking whether 
    the directory exists first and, if not, falling back to the default (see the 
    sketch below).
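
A minimal sketch of that existence-check fallback, assuming Hadoop's FileSystem API; the class and method names below are illustrative, not from the PR:

{code}
import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class DefaultBlockSizeSketch {

    // Pass the configured directory to getDefaultBlockSize only when it
    // exists; otherwise fall back to the working directory, which is always
    // resolvable. Note that new Path(dir) itself can throw
    // IllegalArgumentException for an invalid URI, which is the bug that
    // NIFI-2553 tracks.
    static long defaultBlockSizeFor(final FileSystem fs, final String dir) throws IOException {
        final Path dirPath = new Path(dir);
        return fs.exists(dirPath)
                ? fs.getDefaultBlockSize(dirPath)
                : fs.getDefaultBlockSize(fs.getWorkingDirectory());
    }
}
{code}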


> HDFS processors throwing exception from OnSchedule when directory is an 
> invalid URI
> -----------------------------------------------------------------------------------
>
>                 Key: NIFI-2553
>                 URL: https://issues.apache.org/jira/browse/NIFI-2553
>             Project: Apache NiFi
>          Issue Type: Bug
>    Affects Versions: 1.0.0, 0.7.0
>            Reporter: Bryan Bende
>            Assignee: Bryan Bende
>            Priority: Minor
>             Fix For: 1.0.0
>
>
> If you enter a directory string that results in an invalid URI, the HDFS 
> processors will throw an unexpected exception from OnScheduled because of a 
> logging statement in AbstractHadoopProcessor:
> {code}
> getLogger().info("Initialized a new HDFS File System with working dir: {} default block size: {} default replication: {} config: {}",
>                     new Object[] { fs.getWorkingDirectory(), fs.getDefaultBlockSize(new Path(dir)), fs.getDefaultReplication(new Path(dir)), config.toString() });
> {code}
> An example input for the directory that can produce this problem:
> data_${literal('testing'):substring(0,4)}
> In addition to this, FetchHDFS, ListHDFS, GetHDFS, and PutHDFS all create new 
> Path instances in their onTrigger methods from the same directory, outside of 
> a try/catch which would result in throwing a ProcessException (if it got past 
> the logging issue above).
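
A minimal sketch of the guard the issue describes, assuming NiFi's ProcessException; the class and helper names are illustrative:

{code}
import org.apache.hadoop.fs.Path;
import org.apache.nifi.processor.exception.ProcessException;

class PathGuardSketch {

    // Sketch only: new Path(dir) throws IllegalArgumentException for an
    // invalid URI; converting it to a ProcessException lets the framework
    // handle the failure, per the suggestion for the onTrigger methods of
    // FetchHDFS, ListHDFS, GetHDFS, and PutHDFS.
    static Path toPath(final String dir) {
        try {
            return new Path(dir);
        } catch (final IllegalArgumentException e) {
            throw new ProcessException("Directory '" + dir + "' is not a valid path", e);
        }
    }
}
{code}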



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
