[ https://issues.apache.org/jira/browse/NIFI-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16660413#comment-16660413 ]
ASF GitHub Bot commented on NIFI-5629:
--------------------------------------

Github user adyoun2 commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/3033#discussion_r227328073

    --- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/GetFile.java ---
    @@ -308,32 +321,45 @@ public boolean accept(final File file) {
             };
         }
    -    private Set<File> performListing(final File directory, final FileFilter filter, final boolean recurseSubdirectories) {
    -        Path p = directory.toPath();
    -        if (!Files.isWritable(p) || !Files.isReadable(p)) {
    -            throw new IllegalStateException("Directory '" + directory + "' does not have sufficient permissions (i.e., not writable and readable)");
    -        }
    -        final Set<File> queue = new HashSet<>();
    -        if (!directory.exists()) {
    -            return queue;
    -        }
    -
    -        final File[] children = directory.listFiles();
    -        if (children == null) {
    -            return queue;
    -        }
    -
    -        for (final File child : children) {
    -            if (child.isDirectory()) {
    +    private Set<File> performListing(final File directory, final FileFilter filter, final boolean recurseSubdirectories, final int batchSize) {
    +        try {
    +            if (directoryStream == null || !this.fileIterator.hasNext()) {
    +                final Path p = directory.toPath();
    +                if (!Files.isReadable(p) || !Files.isWritable(p)) {
    +                    throw new IllegalStateException("Directory '" + directory + "' does not have sufficient permissions (i.e., not writable and readable)");
    +                }
    +
    +                if (!directory.exists()) {
    +                    return Collections.emptySet();
    +                }
    +
    +                Stream<Path> listStream;
                     if (recurseSubdirectories) {
    -                    queue.addAll(performListing(child, filter, recurseSubdirectories));
    +                    listStream = Files.walk(p);
    --- End diff --

    I assume you'll want the same flags set here?
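
For readers following the thread, here is a minimal sketch of the lazy, batched listing idea the diff moves toward. It is not the code from the pull request. `BatchedLister`, `nextBatch`, and the `options` parameter are hypothetical names, and the sketch assumes the "flags" in the question are `FileVisitOption` values such as `FOLLOW_LINKS`, which `Files.walk` accepts as varargs; the thread itself does not say which flags are meant.

```java
import java.io.Closeable;
import java.io.IOException;
import java.nio.file.FileVisitOption;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper, not part of NiFi: lists a directory lazily,
// handing out at most one batch of files per call.
class BatchedLister implements Closeable {
    private final Path directory;
    private final boolean recurse;
    private final FileVisitOption[] options; // assumed "flags", e.g. FOLLOW_LINKS
    private Stream<Path> stream;             // kept open between calls
    private Iterator<Path> iterator;

    BatchedLister(Path directory, boolean recurse, FileVisitOption... options) {
        this.directory = directory;
        this.recurse = recurse;
        this.options = options;
    }

    // Returns up to batchSize regular files. Once the listing is exhausted,
    // the stream is closed and the next call starts a fresh listing.
    List<Path> nextBatch(int batchSize) throws IOException {
        if (iterator == null) {
            if (!Files.isDirectory(directory)) {
                return Collections.emptyList();
            }
            // Files.walk takes FileVisitOption varargs; Files.list takes none.
            Stream<Path> source = recurse ? Files.walk(directory, options) : Files.list(directory);
            stream = source.filter(Files::isRegularFile);
            iterator = stream.iterator();
        }
        List<Path> batch = new ArrayList<>();
        while (batch.size() < batchSize && iterator.hasNext()) {
            batch.add(iterator.next());
        }
        if (!iterator.hasNext()) {
            close(); // release the directory handle; re-list on the next call
        }
        return batch;
    }

    @Override
    public void close() {
        if (stream != null) {
            stream.close();
        }
        stream = null;
        iterator = null;
    }
}
```

A caller would invoke `nextBatch(batchSize)` once per trigger, so a vast directory is traversed incrementally instead of being listed in full before batching is applied. Whether the recursive and non-recursive branches should share options depends on what the non-recursive branch of the actual patch does, which this excerpt does not show.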

> GetFile becomes slow listing vast directories
> ---------------------------------------------
>
>                 Key: NIFI-5629
>                 URL: https://issues.apache.org/jira/browse/NIFI-5629
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>    Affects Versions: 1.6.0
>            Reporter: Adam
>            Priority: Minor
>
> GetFile repeatedly lists entire directories before applying batching, meaning
> for vast directories it spends a long time listing directories.
>
> Pull request to follow.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)