[ 
https://issues.apache.org/jira/browse/MINIFI-95?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15445055#comment-15445055
 ] 

Randy Gelhausen commented on MINIFI-95:
---------------------------------------

I added logging statements to GetFile::performListing. Subdirectories do not 
appear to be recognized as directories: their d_type is reported as 0 
(DT_UNKNOWN), so they are assumed to be regular files.

This is on CentOS 7: cat /etc/redhat-release -> CentOS Linux release 7.2.1511 
(Core) 

I created /test/a/test.txt and ran a GetFile processor with recursion true on 
/test:
[2016-08-29 07:01:45.451] [minifi log] [info] MiNiFi started
[2016-08-29 07:01:45.451] [minifi log] [info] Checking: /test/.
[2016-08-29 07:01:45.451] [minifi log] [info] /test/.: entry->d_type: 4
[2016-08-29 07:01:45.451] [minifi log] [info] entry->d_type & DT_DIR == true
[2016-08-29 07:01:45.451] [minifi log] [info] Checking: /test/..
[2016-08-29 07:01:45.451] [minifi log] [info] /test/..: entry->d_type: 4
[2016-08-29 07:01:45.451] [minifi log] [info] entry->d_type & DT_DIR == true
[2016-08-29 07:01:45.451] [minifi log] [info] Checking: /test/a
[2016-08-29 07:01:45.451] [minifi log] [info] /test/a: entry->d_type: 0

Since d_type is not DT_DIR, MiNiFi never attempts to list the files in /test/a.

> MiNiFi-cpp GetFile processor sends directories as files
> -------------------------------------------------------
>
>                 Key: MINIFI-95
>                 URL: https://issues.apache.org/jira/browse/MINIFI-95
>             Project: Apache NiFi MiNiFi
>          Issue Type: Bug
>          Components: C++, Core Framework
>            Reporter: Randy Gelhausen
>              Labels: c++, cpp, minifi-cpp
>
> When Recurse Subdirectories is true (in the example below it is blank, which 
> should default to true), MiNiFi-cpp sends a flowfile for each file in each 
> subdirectory, but _also_ sends a flowfile for each subdirectory itself.
> Feel free to close if this is expected behavior, but I assumed GetFile would 
> send only files, not directories.
> Processors:
> - name: GetNMon
>   class: org.apache.nifi.processors.standard.GetFile
>   max concurrent tasks: 1
>   scheduling strategy: TIMER_DRIVEN
>   scheduling period: 0 sec
>   penalization period: 30 sec
>   yield period: 1 sec
>   run duration nanos: 0
>   auto-terminated relationships list: []
>   Properties:
>     Batch Size:
>     File Filter:
>     Ignore Hidden Files:
>     Input Directory: /host-data/metrics/nmon/HOSTNAME
>     Keep Source File:
>     Maximum File Age:
>     Maximum File Size:
>     Minimum File Age: 10 sec
>     Minimum File Size:
>     Path Filter:
>     Polling Interval:
>     Recurse Subdirectories:



