[jira] [Commented] (MINIFI-95) MiNiFi-cpp GetFile processor sends directories as files

2016-08-29 Thread Randy Gelhausen (JIRA)

[ https://issues.apache.org/jira/browse/MINIFI-95?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15445073#comment-15445073 ]

Randy Gelhausen commented on MINIFI-95:
---

Just to confirm, the filesystem itself does distinguish directories from regular files:
[root@2ef5d2331f09 a]# stat test.txt 
  File: 'test.txt'
  Size: 4   Blocks: 8  IO Block: 4096   regular file
Device: fd05h/64773d   Inode: 167794696   Links: 1
Access: (0644/-rw-r--r--)  Uid: (0/root)   Gid: (0/root)
Access: 2016-08-29 06:59:51.914447813 +
Modify: 2016-08-29 06:57:25.699454601 +
Change: 2016-08-29 07:01:41.452442727 +
 Birth: -
[root@2ef5d2331f09 a]# cd ..
[root@2ef5d2331f09 test]# stat a/
  File: 'a/'
  Size: 21  Blocks: 0  IO Block: 4096   directory
Device: fd05h/64773d   Inode: 301997110   Links: 2
Access: (0755/drwxr-xr-x)  Uid: (0/root)   Gid: (0/root)
Access: 2016-08-29 07:12:53.337411533 +
Modify: 2016-08-29 07:01:41.452442727 +
Change: 2016-08-29 07:01:41.452442727 +
 Birth: -

> MiNiFi-cpp GetFile processor sends directories as files
> ---
>
> Key: MINIFI-95
> URL: https://issues.apache.org/jira/browse/MINIFI-95
> Project: Apache NiFi MiNiFi
> Issue Type: Bug
> Components: C++, Core Framework
> Reporter: Randy Gelhausen
> Labels: c++, cpp, minifi-cpp
>
> When Recurse Subdirectories is true, MiNiFi-cpp sends a flowfile for each 
> subdirectory, but does not send flowfiles for the files inside those 
> subdirectories.
> Processors:
> - name: GetNMon
>   class: org.apache.nifi.processors.standard.GetFile
>   max concurrent tasks: 1
>   scheduling strategy: TIMER_DRIVEN
>   scheduling period: 0 sec
>   penalization period: 30 sec
>   yield period: 1 sec
>   run duration nanos: 0
>   auto-terminated relationships list: []
>   Properties:
>     Batch Size:
>     File Filter:
>     Ignore Hidden Files:
>     Input Directory: /host-data/metrics/nmon/HOSTNAME
>     Keep Source File: false
>     Maximum File Age:
>     Maximum File Size:
>     Minimum File Age: 10 sec
>     Minimum File Size:
>     Path Filter:
>     Polling Interval:
>     Recurse Subdirectories: true



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MINIFI-95) MiNiFi-cpp GetFile processor sends directories as files

2016-08-29 Thread Randy Gelhausen (JIRA)

[ https://issues.apache.org/jira/browse/MINIFI-95?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15445055#comment-15445055 ]

Randy Gelhausen commented on MINIFI-95:
---

I added logging statements to GetFile::performListing. It seems that 
subdirectories are not recognized as directories: they are reported with 
d_type 0, which appears to be "DT_UNKNOWN", and are then assumed to be 
regular files.

This is on CentOS 7: cat /etc/redhat-release -> CentOS Linux release 7.2.1511 
(Core) 

I created /test/a/test.txt and ran a GetFile processor on /test with Recurse 
Subdirectories set to true:
[2016-08-29 07:01:45.451] [minifi log] [info] MiNiFi started
[2016-08-29 07:01:45.451] [minifi log] [info] Checking: /test/.
[2016-08-29 07:01:45.451] [minifi log] [info] /test/.: entry->d_type: 4
[2016-08-29 07:01:45.451] [minifi log] [info] entry->d_type & DT_DIR == true
[2016-08-29 07:01:45.451] [minifi log] [info] Checking: /test/..
[2016-08-29 07:01:45.451] [minifi log] [info] /test/..: entry->d_type: 4
[2016-08-29 07:01:45.451] [minifi log] [info] entry->d_type & DT_DIR == true
[2016-08-29 07:01:45.451] [minifi log] [info] Checking: /test/a
[2016-08-29 07:01:45.451] [minifi log] [info] /test/a: entry->d_type: 0

Because d_type for /test/a is not DT_DIR, MiNiFi never attempts to list the 
files in /test/a.

> MiNiFi-cpp GetFile processor sends directories as files
> ---
>
> Key: MINIFI-95
> URL: https://issues.apache.org/jira/browse/MINIFI-95
> Project: Apache NiFi MiNiFi
> Issue Type: Bug
> Components: C++, Core Framework
> Reporter: Randy Gelhausen
> Labels: c++, cpp, minifi-cpp
>
> When Recurse Subdirectories is true (in the example below it is blank, which 
> should default to true), MiNiFi-cpp sends a flowfile for each file in each 
> subdirectory, but _also_ sends a flowfile for each subdirectory itself.
> Feel free to close if this is expected behavior, but I assumed GetFile would 
> send only files, not directories.
> Processors:
> - name: GetNMon
>   class: org.apache.nifi.processors.standard.GetFile
>   max concurrent tasks: 1
>   scheduling strategy: TIMER_DRIVEN
>   scheduling period: 0 sec
>   penalization period: 30 sec
>   yield period: 1 sec
>   run duration nanos: 0
>   auto-terminated relationships list: []
>   Properties:
>     Batch Size:
>     File Filter:
>     Ignore Hidden Files:
>     Input Directory: /host-data/metrics/nmon/HOSTNAME
>     Keep Source File:
>     Maximum File Age:
>     Maximum File Size:
>     Minimum File Age: 10 sec
>     Minimum File Size:
>     Path Filter:
>     Polling Interval:
>     Recurse Subdirectories:


