[ https://issues.apache.org/jira/browse/NIFI-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15417583#comment-15417583 ]

Bryan Bende commented on NIFI-2553:
-----------------------------------

For the logging error, I propose we use the working directory as the Path, such 
as:
{code}
final Path workingDir = fs.getWorkingDirectory();
getLogger().info("Initialized a new HDFS File System with working dir: {} default block size: {} default replication: {} config: {}",
        new Object[]{workingDir, fs.getDefaultBlockSize(workingDir), fs.getDefaultReplication(workingDir), config.toString()});
{code}

I don't see why there would be any difference between logging the block size 
and replication for the working directory versus the directory from the 
property, since both are on the same filesystem.

For the processors' onTrigger methods, we should catch the exception and route 
the FlowFile to failure, along the lines of the sketch below.
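
Roughly what I have in mind, sketched against a processor like PutHDFS (the 
DIRECTORY property, REL_FAILURE relationship, and surrounding method shape are 
illustrative here, not the final patch):
{code}
@Override
public void onTrigger(final ProcessContext context, final ProcessSession session) throws ProcessException {
    FlowFile flowFile = session.get();
    if (flowFile == null) {
        return;
    }

    final String dirValue = context.getProperty(DIRECTORY).evaluateAttributeExpressions(flowFile).getValue();

    final Path dirPath;
    try {
        dirPath = new Path(dirValue);
    } catch (final IllegalArgumentException e) {
        // an invalid URI in the directory should not escape onTrigger as an
        // unexpected exception; log it and route the FlowFile to failure instead
        getLogger().error("Directory {} is not a valid path; routing {} to failure",
                new Object[]{dirValue, flowFile}, e);
        session.transfer(session.penalize(flowFile), REL_FAILURE);
        return;
    }

    // ... existing put/fetch logic continues with dirPath ...
}
{code}
ListHDFS has no incoming FlowFile, so there it would likely just log and yield 
rather than route to failure.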

I'll put together a PR to address this.

> HDFS processors throwing exception from OnSchedule when directory is an 
> invalid URI
> -----------------------------------------------------------------------------------
>
>                 Key: NIFI-2553
>                 URL: https://issues.apache.org/jira/browse/NIFI-2553
>             Project: Apache NiFi
>          Issue Type: Bug
>    Affects Versions: 1.0.0, 0.7.0
>            Reporter: Bryan Bende
>            Assignee: Bryan Bende
>            Priority: Minor
>             Fix For: 1.0.0
>
>
> If you enter a directory string that results in an invalid URI, the HDFS 
> processors will throw an unexpected exception from OnScheduled because of a 
> logging statement in AbstractHadoopProcessor:
> {code}
> getLogger().info("Initialized a new HDFS File System with working dir: {} default block size: {} default replication: {} config: {}",
>         new Object[] { fs.getWorkingDirectory(), fs.getDefaultBlockSize(new Path(dir)), fs.getDefaultReplication(new Path(dir)), config.toString() });
> {code}
> An example input for the directory that can produce this problem:
> data_${literal('testing'):substring(0,4)}
> In addition to this, FetchHDFS, ListHDFS, GetHDFS, and PutHDFS all create new 
> Path instances in their onTrigger methods from the same directory, outside of 
> a try/catch which would result in throwing a ProcessException (if it got past 
> the logging issue above).



