[ https://issues.apache.org/jira/browse/MINIFI-95?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15445055#comment-15445055 ]
Randy Gelhausen commented on MINIFI-95:
---
I added logging statements to GetFile::performListing. It seems that
subdirectories are not recognized as directories: their d_type comes back as 0
(DT_UNKNOWN) rather than 4 (DT_DIR), so they are assumed to be regular files.
This is on CentOS 7: cat /etc/redhat-release -> CentOS Linux release 7.2.1511
(Core)
I created /test/a/test.txt and ran a GetFile processor with recursion true on
/test:
[2016-08-29 07:01:45.451] [minifi log] [info] MiNiFi started
[2016-08-29 07:01:45.451] [minifi log] [info] Checking: /test/.
[2016-08-29 07:01:45.451] [minifi log] [info] /test/.: entry->d_type: 4
[2016-08-29 07:01:45.451] [minifi log] [info] entry->d_type & DT_DIR == true
[2016-08-29 07:01:45.451] [minifi log] [info] Checking: /test/..
[2016-08-29 07:01:45.451] [minifi log] [info] /test/..: entry->d_type: 4
[2016-08-29 07:01:45.451] [minifi log] [info] entry->d_type & DT_DIR == true
[2016-08-29 07:01:45.451] [minifi log] [info] Checking: /test/a
[2016-08-29 07:01:45.451] [minifi log] [info] /test/a: entry->d_type: 0
Since d_type is not DT_DIR, MiNiFi never attempts to list the files in /test/a.
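One possible way to handle this, sketched below, is to fall back to lstat()
whenever readdir() reports DT_UNKNOWN, which the Linux readdir(3) man page says
applications must be prepared for. This is only a sketch under assumptions, not
the actual MiNiFi code: EntryKind and classify() are hypothetical names, and
treating symlinks and other special files as "Other" is an arbitrary choice.
It compares d_type with == because the DT_* values are enumerated constants
rather than bit flags.

```cpp
#include <dirent.h>
#include <sys/stat.h>
#include <string>

// Hypothetical classification used only for this sketch.
enum class EntryKind { Directory, RegularFile, Other };

// Classify a directory entry from its full path and the d_type value
// returned by readdir(). When d_type is DT_UNKNOWN, ask the filesystem
// directly with lstat() instead of assuming a regular file.
EntryKind classify(const std::string &path, unsigned char d_type) {
  if (d_type == DT_DIR) return EntryKind::Directory;
  if (d_type == DT_REG) return EntryKind::RegularFile;
  if (d_type != DT_UNKNOWN) return EntryKind::Other;  // symlink, socket, ...

  struct stat sb;
  if (lstat(path.c_str(), &sb) != 0) return EntryKind::Other;  // vanished or unreadable
  if (S_ISDIR(sb.st_mode)) return EntryKind::Directory;
  if (S_ISREG(sb.st_mode)) return EntryKind::RegularFile;
  return EntryKind::Other;
}
```

In a listing loop this would be called as classify(dir + "/" + entry->d_name,
entry->d_type), recursing on Directory and emitting a flowfile only for
RegularFile.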
> MiNiFi-cpp GetFile processor sends directories as files
> ---
>
> Key: MINIFI-95
> URL: https://issues.apache.org/jira/browse/MINIFI-95
> Project: Apache NiFi MiNiFi
> Issue Type: Bug
> Components: C++, Core Framework
> Reporter: Randy Gelhausen
> Labels: c++, cpp, minifi-cpp
>
> When Recurse Subdirectories is true (in the example below it is blank, which
> should default to true), MiNiFi-cpp sends a flowfile for each file in each
> subdirectory, but _also_ sends a flowfile for each subdirectory itself.
> Feel free to close if this is expected behavior, but I assumed GetFile would
> send only files, not directories.
> Processors:
>   - name: GetNMon
>     class: org.apache.nifi.processors.standard.GetFile
>     max concurrent tasks: 1
>     scheduling strategy: TIMER_DRIVEN
>     scheduling period: 0 sec
>     penalization period: 30 sec
>     yield period: 1 sec
>     run duration nanos: 0
>     auto-terminated relationships list: []
>     Properties:
>       Batch Size:
>       File Filter:
>       Ignore Hidden Files:
>       Input Directory: /host-data/metrics/nmon/HOSTNAME
>       Keep Source File:
>       Maximum File Age:
>       Maximum File Size:
>       Minimum File Age: 10 sec
>       Minimum File Size:
>       Path Filter:
>       Polling Interval:
>       Recurse Subdirectories: