[jira] [Commented] (NIFI-11780) http.san.rfc822name SSL Context Service
[ https://issues.apache.org/jira/browse/NIFI-11780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17740394#comment-17740394 ]

David Handermann commented on NIFI-11780:
-----------------------------------------

You're welcome [~namaste_nerd], and thanks for providing the additional details. This makes sense as an addition to the HandleHttpRequest Processor.

Given that X.509 certificates can contain any number of Subject Alternative Names of various types, it sounds like the best approach would be to introduce numbered attributes that include the type and value of each SAN. For example:

- http.client.cert.san.0.value = username@domain
- http.client.cert.san.0.type = rfc822Name
- http.client.cert.san.1.value = 192.168.1.1
- http.client.cert.san.1.type = iPAddress

The numbered approach with type and value should be capable of covering other potential use cases with a standard way to represent Subject Alternative Names.

> http.san.rfc822name SSL Context Service
> ---------------------------------------
>
>                 Key: NIFI-11780
>                 URL: https://issues.apache.org/jira/browse/NIFI-11780
>             Project: Apache NiFi
>          Issue Type: New Feature
>            Reporter: angela estes
>           Priority: Major
>
> I would love to put in a feature request to extract the subject alternative
> name rfc822Name in the SSL Context Service and put it in an attribute similar
> to http.subject.dn?
>
> perhaps http.san.rfc822name ?
>
> if we could get this added to NiFi itself, we think we could get rid of our
> Apache httpd front-end proxy
>
> ironically y'all are doing something similar to this in your NiFi Toolkit to
> generate certs, just not in NiFi itself
>
> here is the Java reference:
>
> https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/security/cert/X509Certificate.html#getSubjectAlternativeNames()
>
> i don't know how y'all derive the actual NiFi attribute names exactly, but we
> think that part is up to them. we are just suggesting http.san.X in case
> someone wanted to implement the other SAN attributes from that API call
>
> this is the reference to the NiFi security utils:
>
> https://www.javadoc.io/doc/org.apache.nifi/nifi-security-utils/1.9.2/org/apache/nifi/security/util/CertificateUtils.html#getSubjectAlternativeNames-java.security.cert.X509Certificate-
>
>
> THANK YOU!

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
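To make the proposed numbered type/value scheme concrete, here is a hedged sketch (the class and method names are mine, not NiFi code) that maps the `Collection` returned by the standard `X509Certificate#getSubjectAlternativeNames()` API onto the attribute names suggested in the comment:

```java
import java.security.cert.CertificateParsingException;
import java.security.cert.X509Certificate;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SanAttributes {

    // GeneralName type codes as defined by RFC 5280 and documented for
    // X509Certificate#getSubjectAlternativeNames()
    private static final String[] SAN_TYPE_NAMES = {
            "otherName", "rfc822Name", "dNSName", "x400Address",
            "directoryName", "ediPartyName", "uniformResourceIdentifier",
            "iPAddress", "registeredID"
    };

    // Builds the proposed numbered attributes from the Collection returned by
    // getSubjectAlternativeNames(): each element is a List whose first entry is
    // the Integer type code and whose second entry is the value.
    static Map<String, String> toAttributes(final Collection<List<?>> sans) {
        final Map<String, String> attributes = new LinkedHashMap<>();
        if (sans == null) {
            return attributes; // certificate has no SubjectAlternativeName extension
        }
        int index = 0;
        for (List<?> san : sans) {
            final int typeCode = (Integer) san.get(0);
            final String typeName = typeCode >= 0 && typeCode < SAN_TYPE_NAMES.length
                    ? SAN_TYPE_NAMES[typeCode] : String.valueOf(typeCode);
            attributes.put("http.client.cert.san." + index + ".type", typeName);
            attributes.put("http.client.cert.san." + index + ".value", String.valueOf(san.get(1)));
            index++;
        }
        return attributes;
    }

    static Map<String, String> toAttributes(final X509Certificate cert) throws CertificateParsingException {
        return toAttributes(cert.getSubjectAlternativeNames());
    }
}
```

A processor could merge the returned map into the FlowFile attributes alongside the existing http.subject.dn attribute; the `http.client.cert.san` prefix is only the naming proposed in the comment above.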
[jira] [Commented] (NIFI-11780) http.san.rfc822name SSL Context Service
[ https://issues.apache.org/jira/browse/NIFI-11780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17740357#comment-17740357 ]

angela estes commented on NIFI-11780:
-------------------------------------

[~exceptionfactory], thank you for the super quick reply! This would be for deriving the RFC822 SAN from a client certificate performing a GET, POST, PUT, etc. request to the HandleHttpRequest processor. While this could also be used with the ListenHTTP processor, the HandleHttpRequest processor is more robust and is a better use case for us. TYVM!

> http.san.rfc822name SSL Context Service
[GitHub] [nifi] Lehel44 commented on a diff in pull request #7240: NIFI-11178: Improve ListHDFS performance, incremental loading refactor.
Lehel44 commented on code in PR #7240:
URL: https://github.com/apache/nifi/pull/7240#discussion_r1253703892

## nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/util/FileStatusIterable.java:

@@ -0,0 +1,124 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.processors.hadoop.util;
+
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.RemoteIterator;
+import org.apache.hadoop.security.UserGroupInformation;
+import org.apache.nifi.processor.exception.ProcessException;
+
+import java.io.IOException;
+import java.security.PrivilegedExceptionAction;
+import java.util.ArrayDeque;
+import java.util.Deque;
+import java.util.Iterator;
+import java.util.NoSuchElementException;
+import java.util.concurrent.atomic.AtomicLong;
+
+public class FileStatusIterable implements Iterable<FileStatus> {
+
+    private final Path path;
+    private final boolean recursive;
+    private final FileSystem hdfs;
+    private final UserGroupInformation userGroupInformation;
+    private final AtomicLong totalFileCount = new AtomicLong();
+
+    public FileStatusIterable(final Path path, final boolean recursive, final FileSystem hdfs, final UserGroupInformation userGroupInformation) {
+        this.path = path;
+        this.recursive = recursive;
+        this.hdfs = hdfs;
+        this.userGroupInformation = userGroupInformation;
+    }
+
+    @Override
+    public Iterator<FileStatus> iterator() {
+        return new FileStatusIterator();
+    }
+
+    public long getTotalFileCount() {
+        return totalFileCount.get();
+    }
+
+    class FileStatusIterator implements Iterator<FileStatus> {
+
+        private static final String IO_ERROR_MESSAGE = "IO error occurred while iterating HDFS";
+
+        private final Deque<FileStatus> dirStatuses;
+
+        private FileStatus nextFileStatus;
+        private RemoteIterator<FileStatus> hdfsIterator;

Review Comment:
   Thanks for the suggestion. The outer custom iterator class is also called FileStatusIterator; I'd avoid the "hiding field" code smell, remoteIterator works well.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
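The class under review wraps Hadoop's RemoteIterator, but the traversal pattern itself is easy to isolate. Below is a hedged, Hadoop-free sketch (class and field names are mine, not from the PR) of the same idea: files are produced lazily, and pending directories are tracked on an explicit Deque instead of the call stack.

```java
import java.io.File;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Simplified sketch of the FileStatusIterable pattern: lazy depth-first listing
// driven by an explicit Deque rather than recursion.
class LazyFileIterable implements Iterable<File> {

    private final File root;
    private final boolean recursive;

    LazyFileIterable(final File root, final boolean recursive) {
        this.root = root;
        this.recursive = recursive;
    }

    @Override
    public Iterator<File> iterator() {
        final Deque<File> pendingDirs = new ArrayDeque<>();
        final Deque<File> pendingFiles = new ArrayDeque<>();
        pendingDirs.push(root);

        return new Iterator<File>() {
            @Override
            public boolean hasNext() {
                // Expand directories only until at least one plain file is available,
                // so large trees are never materialized all at once.
                while (pendingFiles.isEmpty() && !pendingDirs.isEmpty()) {
                    final File[] children = pendingDirs.pop().listFiles();
                    if (children == null) {
                        continue; // not a readable directory
                    }
                    for (File child : children) {
                        if (child.isDirectory()) {
                            if (recursive) {
                                pendingDirs.push(child);
                            }
                        } else {
                            pendingFiles.push(child);
                        }
                    }
                }
                return !pendingFiles.isEmpty();
            }

            @Override
            public File next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return pendingFiles.pop();
            }
        };
    }
}
```

The real FileStatusIterable additionally runs each filesystem call inside `UserGroupInformation.doAs(...)` and counts emitted files; those concerns are omitted here.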
[jira] [Updated] (NIFI-11774) Upgrade gRPC to 1.56.1
[ https://issues.apache.org/jira/browse/NIFI-11774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Handermann updated NIFI-11774:
------------------------------------
    Issue Type: Improvement  (was: Task)

> Upgrade gRPC to 1.56.1
> ----------------------
>
>                 Key: NIFI-11774
>                 URL: https://issues.apache.org/jira/browse/NIFI-11774
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>            Reporter: Pierre Villard
>            Assignee: Pierre Villard
>           Priority: Major
>             Fix For: 2.0.0, 1.23.0
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> https://github.com/grpc/grpc-java/releases
[jira] [Commented] (NIFI-11774) Upgrade gRPC to 1.56.1
[ https://issues.apache.org/jira/browse/NIFI-11774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17740332#comment-17740332 ]

ASF subversion and git services commented on NIFI-11774:
--------------------------------------------------------

Commit 2901993142cb246f22283523c3098fe978995e3a in nifi's branch refs/heads/support/nifi-1.x from Pierre Villard
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=2901993142 ]

NIFI-11774 Upgraded gRPC from 1.55.1 to 1.56.1

This closes #7456

Signed-off-by: David Handermann

(cherry picked from commit b3372900b3533d8e0ca5b99089bc51159281d3c7)

> Upgrade gRPC to 1.56.1
[jira] [Updated] (NIFI-11774) Upgrade gRPC to 1.56.1
[ https://issues.apache.org/jira/browse/NIFI-11774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Handermann updated NIFI-11774:
------------------------------------
    Fix Version/s: 2.0.0
                   1.23.0
                       (was: 1.latest)
                       (was: 2.latest)
       Resolution: Fixed
           Status: Resolved  (was: Patch Available)

> Upgrade gRPC to 1.56.1
[jira] [Commented] (NIFI-11774) Upgrade gRPC to 1.56.1
[ https://issues.apache.org/jira/browse/NIFI-11774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17740331#comment-17740331 ]

ASF subversion and git services commented on NIFI-11774:
--------------------------------------------------------

Commit b3372900b3533d8e0ca5b99089bc51159281d3c7 in nifi's branch refs/heads/main from Pierre Villard
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=b3372900b3 ]

NIFI-11774 Upgraded gRPC from 1.55.1 to 1.56.1

This closes #7456

Signed-off-by: David Handermann

> Upgrade gRPC to 1.56.1
[GitHub] [nifi] mattyb149 commented on a diff in pull request #7460: NIFI-11779 - Override endpoint in PutBigQuery
mattyb149 commented on code in PR #7460:
URL: https://github.com/apache/nifi/pull/7460#discussion_r1253674327

## nifi-nar-bundles/nifi-gcp-bundle/nifi-gcp-processors/src/main/java/org/apache/nifi/processors/gcp/bigquery/PutBigQuery.java:

@@ -124,6 +127,15 @@ public class PutBigQuery extends AbstractBigQueryProcessor {
             .required(true)
             .build();
 
+    public static final PropertyDescriptor BIGQUERY_API_URL = new PropertyDescriptor.Builder()
+            .name("bigquery-api-url")
+            .displayName("BigQuery API URL")
+            .description("Overrides the default BigQuery URL.")

Review Comment:
   Good point! Then the changed logic stays the same and doesn't need the `if` clause
[jira] [Commented] (NIFI-11449) add autocommit property to PutDatabaseRecord processor
[ https://issues.apache.org/jira/browse/NIFI-11449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17740328#comment-17740328 ]

Abdelrahim Ahmad commented on NIFI-11449:
-----------------------------------------

Hi [~sabonyi],

Thanks for your reply. I saw the processor you mentioned, but it works only with a Hive database or HDFS. It doesn't support object storage such as MinIO, AWS, or GCP, so it cannot be used with modern data lakehouse systems.

Best regards
AA

> add autocommit property to PutDatabaseRecord processor
> ------------------------------------------------------
>
>                 Key: NIFI-11449
>                 URL: https://issues.apache.org/jira/browse/NIFI-11449
>             Project: Apache NiFi
>          Issue Type: New Feature
>          Components: Extensions
>    Affects Versions: 1.21.0
>         Environment: Any NiFi deployment
>            Reporter: Abdelrahim Ahmad
>           Priority: Blocker
>              Labels: Trino, autocommit, database, iceberg, putdatabaserecord
>
> The issue is with the {{PutDatabaseRecord}} processor in Apache NiFi. When
> using the processor with the Trino JDBC driver or Dremio JDBC driver to write
> to an Iceberg catalog, it disables the autocommit feature. This leads to
> errors such as "Catalog only supports writes using autocommit: iceberg".
> The autocommit feature needs to be exposed in the processor so that it can be
> enabled or disabled.
> Enabling autocommit in the NiFi PutDatabaseRecord processor is important for
> Delta Lake, Iceberg, and Hudi, as it ensures data consistency and integrity by
> allowing atomic writes to be performed in the underlying database. This would
> allow the processor to be used with a wider range of databases.
> Improving this processor would allow NiFi to be the main tool for ingesting
> data into these new technologies, so we don't have to deal with another tool
> to do so.
> BUT:
> I have reviewed the {{PutDatabaseRecord}} processor in NiFi. It inserts
> records one by one into the database using a prepared statement, and commits
> the transaction at the end of the loop that processes each record.
> This approach can be inefficient and slow when inserting large volumes of data
> into tables that are optimized for bulk ingestion, such as Delta Lake,
> Iceberg, and Hudi tables.
> These tables use various techniques to optimize the performance of bulk
> ingestion, such as partitioning, clustering, and indexing. Inserting records
> one by one using a prepared statement can bypass these optimizations, leading
> to poor performance and potentially causing issues such as excessive disk
> usage, increased memory consumption, and decreased query performance.
> To avoid these issues, it is recommended to add a new processor, or add a
> feature to the current one, that uses a bulk insert method with the autocommit
> feature when inserting large volumes of data into Delta Lake, Iceberg, and
> Hudi tables.
>
> P.S.: PutSQL does have autocommit, but it has the same performance problem
> described above.
> Thanks and best regards :)
> Abdelrahim Ahmad
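The requested behavior can be sketched in plain JDBC (the table name and helper below are illustrative assumptions, not NiFi code): with autocommit enabled each statement commits itself, which catalogs such as Trino's Iceberg connector require, while with autocommit disabled the batch is committed as one explicit transaction, as PutDatabaseRecord does today.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Sketch of an autocommit toggle around a batch of prepared-statement inserts.
class AutoCommitExample {

    static void insertRecords(final Connection conn, final boolean autoCommit, final String[] names) throws SQLException {
        conn.setAutoCommit(autoCommit);
        try (PreparedStatement ps = conn.prepareStatement("INSERT INTO users(name) VALUES (?)")) {
            for (String name : names) {
                ps.setString(1, name);
                ps.executeUpdate(); // commits immediately when autoCommit is true
            }
        }
        if (!autoCommit) {
            conn.commit(); // one explicit transaction covering the whole batch
        }
    }
}
```

Exposing `autoCommit` as a processor property would let the same write path serve both conventional databases (transactional batches) and autocommit-only catalogs.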
[GitHub] [nifi] exceptionfactory commented on a diff in pull request #7460: NIFI-11779 - Override endpoint in PutBigQuery
exceptionfactory commented on code in PR #7460:
URL: https://github.com/apache/nifi/pull/7460#discussion_r1253663148

## nifi-nar-bundles/nifi-gcp-bundle/nifi-gcp-processors/src/main/java/org/apache/nifi/processors/gcp/bigquery/PutBigQuery.java:

@@ -124,6 +127,15 @@ public class PutBigQuery extends AbstractBigQueryProcessor {
             .required(true)
             .build();
 
+    public static final PropertyDescriptor BIGQUERY_API_URL = new PropertyDescriptor.Builder()
+            .name("bigquery-api-url")
+            .displayName("BigQuery API URL")
+            .description("Overrides the default BigQuery URL.")

Review Comment:
   Along these lines, perhaps it would be clearer to make this a required property, with the default value supplied.
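To make the suggestion concrete, a sketch of the reviewed property made required with a supplied default. This is a sketch under stated assumptions, not the PR's implementation: the default endpoint value shown ("bigquerystorage.googleapis.com:443", the public BigQuery Storage Write API endpoint) and the validator choice are assumptions, not taken from the diff.

```java
import org.apache.nifi.components.PropertyDescriptor;
import org.apache.nifi.processor.util.StandardValidators;

// Required property with a default value, so downstream code never needs a
// null check or an `if` clause for the unset case.
public static final PropertyDescriptor BIGQUERY_API_URL = new PropertyDescriptor.Builder()
        .name("bigquery-api-url")
        .displayName("BigQuery API URL")
        .description("BigQuery endpoint to use; defaults to the public BigQuery endpoint "
                + "and can be overridden with a private endpoint such as "
                + "https://mybigquery.p.googleapis.com.")
        .defaultValue("bigquerystorage.googleapis.com:443")
        .required(true)
        .addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
        .build();
```

This is a declaration fragment against the NiFi API rather than a standalone program, so no test is attached.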
[jira] [Commented] (NIFI-11780) http.san.rfc822name SSL Context Service
[ https://issues.apache.org/jira/browse/NIFI-11780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17740322#comment-17740322 ]

David Handermann commented on NIFI-11780:
-----------------------------------------

Thanks for submitting this feature request [~namaste_nerd]. The SSL Context Service is not specific to HTTP communication, so some additional details would be useful to clarify what you are requesting. Is this request for the ListenHTTP Processor, the HandleHttpRequest Processor, or some other extension component? It sounds like you are requesting the Subject Alternative Name for the client certificate, is that correct? If you could clarify those details, that would be helpful.

> http.san.rfc822name SSL Context Service
[GitHub] [nifi] mattyb149 commented on a diff in pull request #7460: NIFI-11779 - Override endpoint in PutBigQuery
mattyb149 commented on code in PR #7460:
URL: https://github.com/apache/nifi/pull/7460#discussion_r1253585746

## nifi-nar-bundles/nifi-gcp-bundle/nifi-gcp-processors/src/main/java/org/apache/nifi/processors/gcp/bigquery/PutBigQuery.java:

@@ -124,6 +127,15 @@ public class PutBigQuery extends AbstractBigQueryProcessor {
             .required(true)
             .build();
 
+    public static final PropertyDescriptor BIGQUERY_API_URL = new PropertyDescriptor.Builder()
+            .name("bigquery-api-url")
+            .displayName("BigQuery API URL")
+            .description("Overrides the default BigQuery URL.")

Review Comment:
   Is it possible to add the actual default BigQuery URL to the description and/or when you would want to override it?
[jira] [Created] (NIFI-11780) http.san.rfc822name SSL Context Service
angela estes created NIFI-11780:
-----------------------------------

             Summary: http.san.rfc822name SSL Context Service
                 Key: NIFI-11780
                 URL: https://issues.apache.org/jira/browse/NIFI-11780
             Project: Apache NiFi
          Issue Type: New Feature
            Reporter: angela estes

I would love to put in a feature request to extract the subject alternative name rfc822Name in the SSL Context Service and put it in an attribute similar to http.subject.dn?

perhaps http.san.rfc822name ?

if we could get this added to NiFi itself, we think we could get rid of our Apache httpd front-end proxy

ironically y'all are doing something similar to this in your NiFi Toolkit to generate certs, just not in NiFi itself

here is the Java reference:

https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/security/cert/X509Certificate.html#getSubjectAlternativeNames()

i don't know how y'all derive the actual NiFi attribute names exactly, but we think that part is up to them. we are just suggesting http.san.X in case someone wanted to implement the other SAN attributes from that API call

this is the reference to the NiFi security utils:

https://www.javadoc.io/doc/org.apache.nifi/nifi-security-utils/1.9.2/org/apache/nifi/security/util/CertificateUtils.html#getSubjectAlternativeNames-java.security.cert.X509Certificate-

THANK YOU!
[jira] [Updated] (NIFI-11779) Override endpoint in PutBigQuery
[ https://issues.apache.org/jira/browse/NIFI-11779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pierre Villard updated NIFI-11779:
----------------------------------
    Status: Patch Available  (was: Open)

> Override endpoint in PutBigQuery
> --------------------------------
>
>                 Key: NIFI-11779
>                 URL: https://issues.apache.org/jira/browse/NIFI-11779
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>            Reporter: Pierre Villard
>            Assignee: Pierre Villard
>           Priority: Major
>             Fix For: 1.latest, 2.latest
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Similarly to NIFI-11439 we should provide the option to override the endpoint
> in the PutBigQuery processor so that a user can specify a private endpoint.
> For example, something like: https://mybigquery.p.googleapis.com
[GitHub] [nifi] pvillard31 opened a new pull request, #7460: NIFI-11779 - Override endpoint in PutBigQuery
pvillard31 opened a new pull request, #7460:
URL: https://github.com/apache/nifi/pull/7460

   # Summary
   
   [NIFI-11779](https://issues.apache.org/jira/browse/NIFI-11779) - Override endpoint in PutBigQuery
   
   Similarly to [NIFI-11439](https://issues.apache.org/jira/browse/NIFI-11439) we should provide the option to override the endpoint in the PutBigQuery processor so that a user can specify a private endpoint. For example, something like: https://mybigquery.p.googleapis.com/
   
   See #7172
   
   # Tracking
   
   Please complete the following tracking steps prior to pull request creation.
   
   ### Issue Tracking
   
   - [ ] [Apache NiFi Jira](https://issues.apache.org/jira/browse/NIFI) issue created
   
   ### Pull Request Tracking
   
   - [ ] Pull Request title starts with Apache NiFi Jira issue number, such as `NIFI-0`
   - [ ] Pull Request commit message starts with Apache NiFi Jira issue number, such as `NIFI-0`
   
   ### Pull Request Formatting
   
   - [ ] Pull Request based on current revision of the `main` branch
   - [ ] Pull Request refers to a feature branch with one commit containing changes
   
   # Verification
   
   Please indicate the verification steps performed prior to pull request creation.
   
   ### Build
   
   - [ ] Build completed using `mvn clean install -P contrib-check`
     - [ ] JDK 17
   
   ### Licensing
   
   - [ ] New dependencies are compatible with the [Apache License 2.0](https://apache.org/licenses/LICENSE-2.0) according to the [License Policy](https://www.apache.org/legal/resolved.html)
   - [ ] New dependencies are documented in applicable `LICENSE` and `NOTICE` files
   
   ### Documentation
   
   - [ ] Documentation formatting appears as expected in rendered files
[jira] [Created] (NIFI-11779) Override endpoint in PutBigQuery
Pierre Villard created NIFI-11779:
-------------------------------------

             Summary: Override endpoint in PutBigQuery
                 Key: NIFI-11779
                 URL: https://issues.apache.org/jira/browse/NIFI-11779
             Project: Apache NiFi
          Issue Type: Improvement
          Components: Extensions
            Reporter: Pierre Villard
            Assignee: Pierre Villard
             Fix For: 1.latest, 2.latest

Similarly to NIFI-11439 we should provide the option to override the endpoint in the PutBigQuery processor so that a user can specify a private endpoint.

For example, something like: https://mybigquery.p.googleapis.com
[GitHub] [nifi-minifi-cpp] lordgamez opened a new pull request, #1602: MINIFICPP-2157 Move response node implementations to source files
lordgamez opened a new pull request, #1602:
URL: https://github.com/apache/nifi-minifi-cpp/pull/1602

   https://issues.apache.org/jira/browse/MINIFICPP-2157
   
   ---
   
   Thank you for submitting a contribution to Apache NiFi - MiNiFi C++.
   
   In order to streamline the review of the contribution we ask you to ensure the following steps have been taken:
   
   ### For all changes:
   
   - [ ] Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?
   - [ ] Does your PR title start with MINIFICPP- followed by the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
   - [ ] Has your PR been rebased against the latest commit within the target branch (typically main)?
   - [ ] Is your initial contribution a single, squashed commit?
   
   ### For code changes:
   
   - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] If applicable, have you updated the LICENSE file?
   - [ ] If applicable, have you updated the NOTICE file?
   
   ### For documentation related changes:
   
   - [ ] Have you ensured that format looks appropriate for the output in which it is rendered?
   
   ### Note:
   
   Please ensure that once the PR is submitted, you check GitHub Actions CI results for build issues and submit an update to your PR as soon as possible.
[GitHub] [nifi] tpalfy commented on a diff in pull request #7240: NIFI-11178: Improve ListHDFS performance, incremental loading refactor.
tpalfy commented on code in PR #7240:
URL: https://github.com/apache/nifi/pull/7240#discussion_r1253211213

## nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/ListHDFS.java:

@@ -286,392 +200,141 @@ protected Collection<ValidationResult> customValidate(ValidationContext context)
             problems.add(new ValidationResult.Builder().valid(false).subject("GetHDFS Configuration")
                     .explanation(MIN_AGE.getDisplayName() + " cannot be greater than " + MAX_AGE.getDisplayName()).build());
         }
-
         return problems;
     }
 
-    protected String getKey(final String directory) {
-        return getIdentifier() + ".lastListingTime." + directory;
-    }
-
     @Override
     public void onPropertyModified(final PropertyDescriptor descriptor, final String oldValue, final String newValue) {
         super.onPropertyModified(descriptor, oldValue, newValue);
         if (isConfigurationRestored() && (descriptor.equals(DIRECTORY) || descriptor.equals(FILE_FILTER))) {
-            this.resetState = true;
+            resetState = true;
         }
     }
 
-    /**
-     * Determines which of the given FileStatus's describes a File that should be listed.
-     *
-     * @param statuses the eligible FileStatus objects that we could potentially list
-     * @param context processor context with properties values
-     * @return a Set containing only those FileStatus objects that we want to list
-     */
-    Set<FileStatus> determineListable(final Set<FileStatus> statuses, ProcessContext context) {
-        final long minTimestamp = this.latestTimestampListed;
-        final TreeMap<Long, List<FileStatus>> orderedEntries = new TreeMap<>();
-
-        final Long minAgeProp = context.getProperty(MIN_AGE).asTimePeriod(TimeUnit.MILLISECONDS);
-        // NIFI-4144 - setting to MIN_VALUE so that in case the file modification time is in
-        // the future relative to the nifi instance, files are not skipped.
-        final long minimumAge = (minAgeProp == null) ? Long.MIN_VALUE : minAgeProp;
-        final Long maxAgeProp = context.getProperty(MAX_AGE).asTimePeriod(TimeUnit.MILLISECONDS);
-        final long maximumAge = (maxAgeProp == null) ? Long.MAX_VALUE : maxAgeProp;
-
-        // Build a sorted map to determine the latest possible entries
-        for (final FileStatus status : statuses) {
-            if (status.getPath().getName().endsWith("_COPYING_")) {
-                continue;
-            }
-
-            final long fileAge = System.currentTimeMillis() - status.getModificationTime();
-            if (minimumAge > fileAge || fileAge > maximumAge) {
-                continue;
-            }
-
-            final long entityTimestamp = status.getModificationTime();
-
-            if (entityTimestamp > latestTimestampListed) {
-                latestTimestampListed = entityTimestamp;
-            }
-
-            // New entries are all those that occur at or after the associated timestamp
-            final boolean newEntry = entityTimestamp >= minTimestamp && entityTimestamp > latestTimestampEmitted;
-
-            if (newEntry) {
-                List<FileStatus> entitiesForTimestamp = orderedEntries.get(status.getModificationTime());
-                if (entitiesForTimestamp == null) {
-                    entitiesForTimestamp = new ArrayList<FileStatus>();
-                    orderedEntries.put(status.getModificationTime(), entitiesForTimestamp);
-                }
-                entitiesForTimestamp.add(status);
-            }
-        }
-
-        final Set<FileStatus> toList = new HashSet<>();
-
-        if (orderedEntries.size() > 0) {
-            long latestListingTimestamp = orderedEntries.lastKey();
-
-            // If the last listing time is equal to the newest entries previously seen,
-            // another iteration has occurred without new files and special handling is needed to avoid starvation
-            if (latestListingTimestamp == minTimestamp) {
-                // We are done if the latest listing timestamp is equal to the last processed time,
-                // meaning we handled those items originally passed over
-                if (latestListingTimestamp == latestTimestampEmitted) {
-                    return Collections.emptySet();
-                }
-            } else {
-                // Otherwise, newest entries are held back one cycle to avoid issues in writes occurring exactly when the listing is being performed to avoid missing data
-                orderedEntries.remove(latestListingTimestamp);
-            }
-
-            for (List<FileStatus> timestampEntities : orderedEntries.values()) {
-                for (FileStatus status : timestampEntities) {
-                    toList.add(status);
-                }
-            }
-        }
-
-        return toList;
-    }
-
     @OnScheduled
     public void resetStateIfNecessary(final ProcessContext context) throws IOException {
         if (resetState) {
-            getLogger().debug("Property
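The most subtle part of the removed determineListable method is its starvation-avoidance rule. Here is a standalone sketch of just that rule (names and the use of plain Strings in place of FileStatus are mine): entries at the newest timestamp are held back one listing cycle in case writes are still landing at that timestamp, unless nothing newer than the previous cycle was found, in which case the held-back entries are finally emitted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Simplified model of the timestamp hold-back logic in determineListable.
class ListableSelection {

    static List<String> selectListable(final TreeMap<Long, List<String>> byTimestamp,
                                       final long lastListed, final long lastEmitted) {
        final List<String> toList = new ArrayList<>();
        if (byTimestamp.isEmpty()) {
            return toList;
        }
        final long newest = byTimestamp.lastKey();
        if (newest == lastListed) {
            // No new files since the previous cycle: emit the held-back entries,
            // unless they were already emitted.
            if (newest == lastEmitted) {
                return toList;
            }
        } else {
            // Hold back the newest entries one cycle, in case writes are still
            // occurring at that timestamp while the listing runs.
            byTimestamp.remove(newest);
        }
        byTimestamp.values().forEach(toList::addAll);
        return toList;
    }
}
```

Each cycle the caller would record the newest timestamp seen as `lastListed` and the newest timestamp emitted as `lastEmitted`, mirroring the latestTimestampListed and latestTimestampEmitted fields in the original code.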
[GitHub] [nifi] tpalfy commented on a diff in pull request #7240: NIFI-11178: Improve ListHDFS performance, incremental loading refactor.
tpalfy commented on code in PR #7240: URL: https://github.com/apache/nifi/pull/7240#discussion_r1253194313

## nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/ListHDFS.java:
@@ -286,392 +200,141 @@ protected Collection<ValidationResult> customValidate(ValidationContext context)
             problems.add(new ValidationResult.Builder().valid(false).subject("GetHDFS Configuration")
                 .explanation(MIN_AGE.getDisplayName() + " cannot be greater than " + MAX_AGE.getDisplayName()).build());
         }
-
         return problems;
     }
 
-    protected String getKey(final String directory) {
-        return getIdentifier() + ".lastListingTime." + directory;
-    }
-
     @Override
     public void onPropertyModified(final PropertyDescriptor descriptor, final String oldValue, final String newValue) {
         super.onPropertyModified(descriptor, oldValue, newValue);
         if (isConfigurationRestored() && (descriptor.equals(DIRECTORY) || descriptor.equals(FILE_FILTER))) {
-            this.resetState = true;
+            resetState = true;
         }
     }
 
-    /**
-     * Determines which of the given FileStatus's describes a File that should be listed.
-     *
-     * @param statuses the eligible FileStatus objects that we could potentially list
-     * @param context processor context with properties values
-     * @return a Set containing only those FileStatus objects that we want to list
-     */
-    Set<FileStatus> determineListable(final Set<FileStatus> statuses, ProcessContext context) {
-        final long minTimestamp = this.latestTimestampListed;
-        final TreeMap<Long, List<FileStatus>> orderedEntries = new TreeMap<>();
-
-        final Long minAgeProp = context.getProperty(MIN_AGE).asTimePeriod(TimeUnit.MILLISECONDS);
-        // NIFI-4144 - setting to MIN_VALUE so that in case the file modification time is in
-        // the future relative to the nifi instance, files are not skipped.
-        final long minimumAge = (minAgeProp == null) ? Long.MIN_VALUE : minAgeProp;
-        final Long maxAgeProp = context.getProperty(MAX_AGE).asTimePeriod(TimeUnit.MILLISECONDS);
-        final long maximumAge = (maxAgeProp == null) ? Long.MAX_VALUE : maxAgeProp;
-
-        // Build a sorted map to determine the latest possible entries
-        for (final FileStatus status : statuses) {
-            if (status.getPath().getName().endsWith("_COPYING_")) {
-                continue;
-            }
-
-            final long fileAge = System.currentTimeMillis() - status.getModificationTime();
-            if (minimumAge > fileAge || fileAge > maximumAge) {
-                continue;
-            }
-
-            final long entityTimestamp = status.getModificationTime();
-
-            if (entityTimestamp > latestTimestampListed) {
-                latestTimestampListed = entityTimestamp;
-            }
-
-            // New entries are all those that occur at or after the associated timestamp
-            final boolean newEntry = entityTimestamp >= minTimestamp && entityTimestamp > latestTimestampEmitted;
-
-            if (newEntry) {
-                List<FileStatus> entitiesForTimestamp = orderedEntries.get(status.getModificationTime());
-                if (entitiesForTimestamp == null) {
-                    entitiesForTimestamp = new ArrayList<FileStatus>();
-                    orderedEntries.put(status.getModificationTime(), entitiesForTimestamp);
-                }
-                entitiesForTimestamp.add(status);
-            }
-        }
-
-        final Set<FileStatus> toList = new HashSet<>();
-
-        if (orderedEntries.size() > 0) {
-            long latestListingTimestamp = orderedEntries.lastKey();
-
-            // If the last listing time is equal to the newest entries previously seen,
-            // another iteration has occurred without new files and special handling is needed to avoid starvation
-            if (latestListingTimestamp == minTimestamp) {
-                // We are done if the latest listing timestamp is equal to the last processed time,
-                // meaning we handled those items originally passed over
-                if (latestListingTimestamp == latestTimestampEmitted) {
-                    return Collections.emptySet();
-                }
-            } else {
-                // Otherwise, newest entries are held back one cycle to avoid issues in writes occurring exactly when the listing is being performed to avoid missing data
-                orderedEntries.remove(latestListingTimestamp);
-            }
-
-            for (List<FileStatus> timestampEntities : orderedEntries.values()) {
-                for (FileStatus status : timestampEntities) {
-                    toList.add(status);
-                }
-            }
-        }
-
-        return toList;
-    }
-
     @OnScheduled
     public void resetStateIfNecessary(final ProcessContext context) throws IOException {
         if (resetState) {
-            getLogger().debug("Property
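The starvation-avoidance rule in the removed `determineListable` (newest-timestamp entries are held back one cycle unless no newer files have appeared since the last run) can be sketched in isolation. This is a simplified illustration, not the NiFi/Hadoop API: the `Timestamp`/`FileId` types and the grouping of files by modification time are stand-ins.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <vector>

using Timestamp = int64_t;
using FileId = int;

// Sketch of the hold-back rule: files are grouped by modification timestamp;
// the newest group is held back one listing cycle so writes that land exactly
// at listing time are not missed, unless holding it back would starve the
// listing (no newer files since the previous run).
std::set<FileId> determineListable(const std::map<Timestamp, std::vector<FileId>>& byTimestamp,
                                   Timestamp lastListed, Timestamp lastEmitted) {
  std::set<FileId> toList;
  if (byTimestamp.empty()) {
    return toList;
  }
  std::map<Timestamp, std::vector<FileId>> ordered = byTimestamp;
  const Timestamp newest = ordered.rbegin()->first;
  if (newest == lastListed) {
    // No newer files since the previous run: emit the held-back group now,
    // unless it was already emitted.
    if (newest == lastEmitted) {
      return toList;  // nothing left to do
    }
  } else {
    // Newer files appeared: hold the newest timestamp group back one cycle.
    ordered.erase(newest);
  }
  for (const auto& [ts, files] : ordered) {
    toList.insert(files.begin(), files.end());
  }
  return toList;
}
```

On the first pass over two timestamp groups, only the older group is emitted; a second pass with no newer files releases the held-back group.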
[GitHub] [nifi] tpalfy commented on a diff in pull request #7240: NIFI-11178: Improve ListHDFS performance, incremental loading refactor.
tpalfy commented on code in PR #7240: URL: https://github.com/apache/nifi/pull/7240#discussion_r1253186098

## nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/ListHDFS.java:
@@ -286,392 +200,141 @@ protected Collection<ValidationResult> customValidate(ValidationContext context)
             problems.add(new ValidationResult.Builder().valid(false).subject("GetHDFS Configuration")
                 .explanation(MIN_AGE.getDisplayName() + " cannot be greater than " + MAX_AGE.getDisplayName()).build());
         }
-
         return problems;
     }
 
-    protected String getKey(final String directory) {
-        return getIdentifier() + ".lastListingTime." + directory;
-    }
-
     @Override
     public void onPropertyModified(final PropertyDescriptor descriptor, final String oldValue, final String newValue) {
         super.onPropertyModified(descriptor, oldValue, newValue);
         if (isConfigurationRestored() && (descriptor.equals(DIRECTORY) || descriptor.equals(FILE_FILTER))) {
-            this.resetState = true;
+            resetState = true;
        }
    }
 
-    /**
-     * Determines which of the given FileStatus's describes a File that should be listed.
-     *
-     * @param statuses the eligible FileStatus objects that we could potentially list
-     * @param context processor context with properties values
-     * @return a Set containing only those FileStatus objects that we want to list
-     */
-    Set<FileStatus> determineListable(final Set<FileStatus> statuses, ProcessContext context) {
-        final long minTimestamp = this.latestTimestampListed;
-        final TreeMap<Long, List<FileStatus>> orderedEntries = new TreeMap<>();
-
-        final Long minAgeProp = context.getProperty(MIN_AGE).asTimePeriod(TimeUnit.MILLISECONDS);
-        // NIFI-4144 - setting to MIN_VALUE so that in case the file modification time is in
-        // the future relative to the nifi instance, files are not skipped.
-        final long minimumAge = (minAgeProp == null) ? Long.MIN_VALUE : minAgeProp;
-        final Long maxAgeProp = context.getProperty(MAX_AGE).asTimePeriod(TimeUnit.MILLISECONDS);
-        final long maximumAge = (maxAgeProp == null) ? Long.MAX_VALUE : maxAgeProp;
-
-        // Build a sorted map to determine the latest possible entries
-        for (final FileStatus status : statuses) {
-            if (status.getPath().getName().endsWith("_COPYING_")) {
-                continue;
-            }
-
-            final long fileAge = System.currentTimeMillis() - status.getModificationTime();
-            if (minimumAge > fileAge || fileAge > maximumAge) {
-                continue;
-            }
-
-            final long entityTimestamp = status.getModificationTime();
-
-            if (entityTimestamp > latestTimestampListed) {
-                latestTimestampListed = entityTimestamp;
-            }
-
-            // New entries are all those that occur at or after the associated timestamp
-            final boolean newEntry = entityTimestamp >= minTimestamp && entityTimestamp > latestTimestampEmitted;
-
-            if (newEntry) {
-                List<FileStatus> entitiesForTimestamp = orderedEntries.get(status.getModificationTime());
-                if (entitiesForTimestamp == null) {
-                    entitiesForTimestamp = new ArrayList<FileStatus>();
-                    orderedEntries.put(status.getModificationTime(), entitiesForTimestamp);
-                }
-                entitiesForTimestamp.add(status);
-            }
-        }
-
-        final Set<FileStatus> toList = new HashSet<>();
-
-        if (orderedEntries.size() > 0) {
-            long latestListingTimestamp = orderedEntries.lastKey();
-
-            // If the last listing time is equal to the newest entries previously seen,
-            // another iteration has occurred without new files and special handling is needed to avoid starvation
-            if (latestListingTimestamp == minTimestamp) {
-                // We are done if the latest listing timestamp is equal to the last processed time,
-                // meaning we handled those items originally passed over
-                if (latestListingTimestamp == latestTimestampEmitted) {
-                    return Collections.emptySet();
-                }
-            } else {
-                // Otherwise, newest entries are held back one cycle to avoid issues in writes occurring exactly when the listing is being performed to avoid missing data
-                orderedEntries.remove(latestListingTimestamp);
-            }
-
-            for (List<FileStatus> timestampEntities : orderedEntries.values()) {
-                for (FileStatus status : timestampEntities) {
-                    toList.add(status);
-                }
-            }
-        }
-
-        return toList;
-    }
-
     @OnScheduled
     public void resetStateIfNecessary(final ProcessContext context) throws IOException {
         if (resetState) {
-            getLogger().debug("Property
[GitHub] [nifi] tpalfy commented on a diff in pull request #7240: NIFI-11178: Improve ListHDFS performance, incremental loading refactor.
tpalfy commented on code in PR #7240: URL: https://github.com/apache/nifi/pull/7240#discussion_r1253156101

## nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/util/FileStatusIterable.java:
@@ -0,0 +1,124 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.processors.hadoop.util;
+
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.RemoteIterator;
+import org.apache.hadoop.security.UserGroupInformation;
+import org.apache.nifi.processor.exception.ProcessException;
+
+import java.io.IOException;
+import java.security.PrivilegedExceptionAction;
+import java.util.ArrayDeque;
+import java.util.Deque;
+import java.util.Iterator;
+import java.util.NoSuchElementException;
+import java.util.concurrent.atomic.AtomicLong;
+
+public class FileStatusIterable implements Iterable<FileStatus> {
+
+    private final Path path;
+    private final boolean recursive;
+    private final FileSystem hdfs;
+    private final UserGroupInformation userGroupInformation;
+    private final AtomicLong totalFileCount = new AtomicLong();
+
+    public FileStatusIterable(final Path path, final boolean recursive, final FileSystem hdfs, final UserGroupInformation userGroupInformation) {
+        this.path = path;
+        this.recursive = recursive;
+        this.hdfs = hdfs;
+        this.userGroupInformation = userGroupInformation;
+    }
+
+    @Override
+    public Iterator<FileStatus> iterator() {
+        return new FileStatusIterator();
+    }
+
+    public long getTotalFileCount() {
+        return totalFileCount.get();
+    }
+
+    class FileStatusIterator implements Iterator<FileStatus> {
+
+        private static final String IO_ERROR_MESSAGE = "IO error occurred while iterating HDFS";
+
+        private final Deque<FileStatus> dirStatuses;
+
+        private FileStatus nextFileStatus;
+        private RemoteIterator<FileStatus> hdfsIterator;

Review Comment: Or we could call it `fileStatusIterator` (that aspect is more significant imo). Whichever the case, rename the `getHdfsIterator` method as well.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (MINIFICPP-2157) Move response node implementations to source files
Gábor Gyimesi created MINIFICPP-2157: Summary: Move response node implementations to source files Key: MINIFICPP-2157 URL: https://issues.apache.org/jira/browse/MINIFICPP-2157 Project: Apache NiFi MiNiFi C++ Issue Type: Improvement Reporter: Gábor Gyimesi Assignee: Gábor Gyimesi Most response nodes have their implementations in the header files, which should be moved to the cpp files. Additional refactoring can also be done with the layout definitions to make them more compact. -- This message was sent by Atlassian Jira (v8.20.10#820010)
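The refactoring described above is the standard header/source split: keep only declarations in the header and move method bodies into the corresponding cpp file, so implementation changes no longer ripple through every translation unit that includes the header. A minimal hypothetical sketch (class and method names are invented for illustration, not actual MiNiFi response nodes; both parts are shown in one listing):

```cpp
#include <string>

// --- response_node.h (declaration only; before the refactor, the method
// body would have lived here, forcing recompilation of every includer) ---
class DiskUsageNode {
 public:
  std::string getName() const;
};

// --- response_node.cpp (definition moved out of the header) ---
std::string DiskUsageNode::getName() const {
  return "DiskUsageNode";
}
```

Beyond compile times, the split also lets the cpp file keep heavy includes out of the public header.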
[GitHub] [nifi-minifi-cpp] szaszm commented on a diff in pull request #1601: MINIFICPP-2143 - Resolve Security/UserID attribute
szaszm commented on code in PR #1601: URL: https://github.com/apache/nifi-minifi-cpp/pull/1601#discussion_r1253085830 ## extensions/windows-event-log/wel/MetadataWalker.cpp: ## @@ -171,4 +176,17 @@ void MetadataWalker::updateText(pugi::xml_node , const std::string _n } } +void MetadataWalker::updateText(pugi::xml_attribute , const std::string _name, std::function &) { Review Comment: I think we should pass `fn` without type erasure, since it's all local to this class. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
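The suggestion above contrasts a type-erased `std::function` parameter with passing the callable's concrete type through a template. A minimal sketch of the difference (function names are illustrative, not the MetadataWalker API): `std::function` accepts any matching callable at the cost of indirection and possible allocation, while a template parameter keeps the concrete type and allows inlining, which is appropriate when all callers are local to one class.

```cpp
#include <functional>
#include <string>

// Type-erased: any callable with this signature fits, but calls go through
// std::function's indirection.
std::string applyErased(const std::string& in,
                        const std::function<std::string(const std::string&)>& fn) {
  return fn(in);
}

// No type erasure: the callable's concrete type is deduced, so the compiler
// can inline the call; suitable when usage is local to the class.
template <typename Fn>
std::string applyDirect(const std::string& in, Fn&& fn) {
  return fn(in);
}
```

Both produce identical results for the same callable; the difference is purely in dispatch cost and flexibility at the API boundary.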
[GitHub] [nifi-minifi-cpp] adamdebreceni opened a new pull request, #1601: MINIFICPP-2143 - Resolve Security/UserID attribute
adamdebreceni opened a new pull request, #1601: URL: https://github.com/apache/nifi-minifi-cpp/pull/1601 Thank you for submitting a contribution to Apache NiFi - MiNiFi C++. In order to streamline the review of the contribution we ask you to ensure the following steps have been taken: ### For all changes: - [ ] Is there a JIRA ticket associated with this PR? Is it referenced in the commit message? - [ ] Does your PR title start with MINIFICPP- where is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character. - [ ] Has your PR been rebased against the latest commit within the target branch (typically main)? - [ ] Is your initial contribution a single, squashed commit? ### For code changes: - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? - [ ] If applicable, have you updated the LICENSE file? - [ ] If applicable, have you updated the NOTICE file? ### For documentation related changes: - [ ] Have you ensured that format looks appropriate for the output in which it is rendered? ### Note: Please ensure that once the PR is submitted, you check GitHub Actions CI results for build issues and submit an update to your PR as soon as possible. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [nifi-minifi-cpp] adamdebreceni commented on a diff in pull request #1600: MINIFICPP-2133 Add TLS 1.3 support
adamdebreceni commented on code in PR #1600: URL: https://github.com/apache/nifi-minifi-cpp/pull/1600#discussion_r1253074956

## libminifi/src/controllers/SSLContextService.cpp:
@@ -196,16 +196,16 @@ bool SSLContextService::configure_ssl_context(SSL_CTX *ctx) {
   }
   // Security level set to 0 for backwards compatibility to support TLS versions below v1.2
-  SSL_CTX_set_security_level(ctx, 0);
+  if (minimum_tls_version_ < TLS1_2_VERSION || maximum_tls_version_ < TLS1_2_VERSION) {

Review Comment: I see, so `-1` is not "don't care" but the default.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [nifi] turcsanyip closed pull request #7459: NIFI-11334: fix maven version
turcsanyip closed pull request #7459: NIFI-11334: fix maven version URL: https://github.com/apache/nifi/pull/7459 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (NIFI-11334) PutIceberg processor instance interference due same class loader usage
[ https://issues.apache.org/jira/browse/NIFI-11334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17740157#comment-17740157 ]

ASF subversion and git services commented on NIFI-11334:

Commit 53a0aef4228f7b14cb8c9941a7eca158f0a42093 in nifi's branch refs/heads/support/nifi-1.x from Zoltan Kornel Torok
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=53a0aef422 ]

NIFI-11334: Fixed dependency version in nifi-iceberg-services

This closes #7459.

Signed-off-by: Peter Turcsanyi

> PutIceberg processor instance interference due same class loader usage
> ----------------------------------------------------------------------
>
>            Key: NIFI-11334
>            URL: https://issues.apache.org/jira/browse/NIFI-11334
>        Project: Apache NiFi
>     Issue Type: Improvement
>       Reporter: Mark Bathori
>       Assignee: Mark Bathori
>       Priority: Major
>        Fix For: 2.0.0, 1.23.0
>
>     Time Spent: 2h
>  Remaining Estimate: 0h
>
> If more than one PutIceberg processor instance is being used simultaneously,
> e.g. with different Kerberos services, then the underlying contexts can
> interfere with each other.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[GitHub] [nifi] turcsanyip commented on a diff in pull request #7240: NIFI-11178: Improve ListHDFS performance, incremental loading refactor.
turcsanyip commented on code in PR #7240: URL: https://github.com/apache/nifi/pull/7240#discussion_r1252713068

## nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/util/FileStatusIterable.java:
@@ -0,0 +1,124 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.processors.hadoop.util;
+
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.RemoteIterator;
+import org.apache.hadoop.security.UserGroupInformation;
+import org.apache.nifi.processor.exception.ProcessException;
+
+import java.io.IOException;
+import java.security.PrivilegedExceptionAction;
+import java.util.ArrayDeque;
+import java.util.Deque;
+import java.util.Iterator;
+import java.util.NoSuchElementException;
+import java.util.concurrent.atomic.AtomicLong;
+
+public class FileStatusIterable implements Iterable<FileStatus> {
+
+    private final Path path;
+    private final boolean recursive;
+    private final FileSystem hdfs;

Review Comment: The underlying `FileSystem` is not necessarily HDFS but can be cloud storage. I would rename it to `fs` or `fileSystem`.
## nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/util/writer/HdfsObjectWriter.java:
@@ -0,0 +1,108 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.nifi.processors.hadoop.util.writer;
+
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.PathFilter;
+import org.apache.hadoop.fs.permission.FsAction;
+import org.apache.hadoop.fs.permission.FsPermission;
+import org.apache.nifi.processor.ProcessSession;
+import org.apache.nifi.processors.hadoop.util.FileStatusIterable;
+import org.apache.nifi.processors.hadoop.util.FileStatusManager;
+
+import java.util.List;
+
+/**
+ * Interface for common management of writing to records and to flowfiles.
+ */
+public abstract class HdfsObjectWriter {
+
+    protected final ProcessSession session;
+    protected final FileStatusIterable fileStatusIterable;
+    final long minimumAge;
+    final long maximumAge;
+    final PathFilter pathFilter;
+    final FileStatusManager fileStatusManager;
+    final long latestModificationTime;
+    final List<FileStatus> latestModifiedStatuses;
+    final long currentTimeMillis;
+    long fileCount;
+
+    HdfsObjectWriter(ProcessSession session, FileStatusIterable fileStatusIterable, long minimumAge, long maximumAge, PathFilter pathFilter,
+                     FileStatusManager fileStatusManager, long latestModificationTime, List<FileStatus> latestModifiedStatuses) {
+        this.session = session;
+        this.fileStatusIterable = fileStatusIterable;
+        this.minimumAge = minimumAge;
+        this.maximumAge = maximumAge;
+        this.pathFilter = pathFilter;
+        this.fileStatusManager = fileStatusManager;
+        this.latestModificationTime = latestModificationTime;
+        this.latestModifiedStatuses = latestModifiedStatuses;
+        currentTimeMillis = System.currentTimeMillis();
+        fileCount = 0L;
+    }
+
+    public abstract void write();
+
+    public long getListedFileCount() {
+        return fileCount;
+    }
+
+    boolean determineListable(final FileStatus status, final long minimumAge, final long maximumAge, final PathFilter filter, Review
[GitHub] [nifi-minifi-cpp] lordgamez commented on a diff in pull request #1600: MINIFICPP-2133 Add TLS 1.3 support
lordgamez commented on code in PR #1600: URL: https://github.com/apache/nifi-minifi-cpp/pull/1600#discussion_r1252815701

## libminifi/src/controllers/SSLContextService.cpp:

@@ -196,16 +196,16 @@ bool SSLContextService::configure_ssl_context(SSL_CTX *ctx) {
   }
   // Security level set to 0 for backwards compatibility to support TLS versions below v1.2
-  SSL_CTX_set_security_level(ctx, 0);
+  if (minimum_tls_version_ < TLS1_2_VERSION || maximum_tls_version_ < TLS1_2_VERSION) {

Review Comment:
If we do not set the minimum (or maximum) version at all, the default is TLS 1.2 or 1.3, whichever is available. During negotiation, the highest version available to both peers is always chosen. If we only set the maximum version, for example to TLS 1.1, the security level has to be set to 0 to be able to choose any version below TLS 1.2. So even if the minimum version is not set, we have to check whether the maximum version is below TLS 1.2. This is how I imagine the current workflow:
```
minTLS  maxTLS  availableTLSVersions  needsSecurityLevel0
 -1      -1     1.2, 1.3              False
 1.2     1.2    1.2                   False
 1.1     -1     1.1, 1.2, 1.3         True
 -1      1.1    1, 1.1                True
 1       1.1    1, 1.1                True
```

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
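The table in the comment above can be modeled as a small decision function. The sketch below is a hypothetical standalone rendering of that logic, treating -1 as "unset"; the constant values are illustrative stand-ins, not the OpenSSL version macros or the MiNiFi C++ API:

```java
// Hypothetical model of the "needsSecurityLevel0" decision implied by the
// table in the review comment. -1 means the bound was not configured.
public class TlsSecurityLevel {
    public static final int UNSET = -1;
    public static final int TLS1_0 = 10, TLS1_1 = 11, TLS1_2 = 12, TLS1_3 = 13;

    // Security level 0 is only required when an explicitly configured bound
    // admits a protocol older than TLS 1.2; an unset bound defaults to the
    // TLS 1.2/1.3 range and needs no downgrade.
    public static boolean needsSecurityLevel0(int minVersion, int maxVersion) {
        final boolean minBelow12 = minVersion != UNSET && minVersion < TLS1_2;
        final boolean maxBelow12 = maxVersion != UNSET && maxVersion < TLS1_2;
        return minBelow12 || maxBelow12;
    }

    public static void main(String[] args) {
        System.out.println(needsSecurityLevel0(UNSET, UNSET));   // false
        System.out.println(needsSecurityLevel0(TLS1_2, TLS1_2)); // false
        System.out.println(needsSecurityLevel0(TLS1_1, UNSET));  // true
        System.out.println(needsSecurityLevel0(UNSET, TLS1_1));  // true
        System.out.println(needsSecurityLevel0(TLS1_0, TLS1_1)); // true
    }
}
```

Note that this reproduces all five rows of the table, including the row where only the maximum version is below TLS 1.2.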
[GitHub] [nifi-minifi-cpp] adamdebreceni commented on a diff in pull request #1600: MINIFICPP-2133 Add TLS 1.3 support
adamdebreceni commented on code in PR #1600: URL: https://github.com/apache/nifi-minifi-cpp/pull/1600#discussion_r1252788325

## libminifi/src/controllers/SSLContextService.cpp:

@@ -196,16 +196,16 @@ bool SSLContextService::configure_ssl_context(SSL_CTX *ctx) {
   }
   // Security level set to 0 for backwards compatibility to support TLS versions below v1.2
-  SSL_CTX_set_security_level(ctx, 0);
+  if (minimum_tls_version_ < TLS1_2_VERSION || maximum_tls_version_ < TLS1_2_VERSION) {

Review Comment:
I'm still having trouble wrapping my head around this. For the negotiation to be able to choose a version below 1.2, we have to set the security level to 0, but we also have to either specify a minimum version below 1.2 or not set the minimum version at all. As I understand it, a version below 1.2 can only NOT be negotiated if `minimum_tls_version_ >= 1.2`; all other cases allow versions below 1.2, but for that we also need to set the security level to 0. So the condition should only be `minimum_tls_version_ != -1 && minimum_tls_version_ < TLS1_2_VERSION`, i.e. the user explicitly set the minimum version to allow pre-1.2. It seems to me that the maximum version does not really play a role.
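The alternative condition proposed above differs from the earlier table only when the minimum version is unset but a maximum below TLS 1.2 is configured. A hedged standalone sketch of that proposed condition, with illustrative integer codes standing in for the OpenSSL version macros:

```java
// Hypothetical model of the condition proposed in this comment:
// minimum_tls_version_ != -1 && minimum_tls_version_ < TLS1_2_VERSION.
// Constants are illustrative, not the real OpenSSL values.
public class MinOnlyCondition {
    public static final int UNSET = -1;
    public static final int TLS1_1 = 11, TLS1_2 = 12;

    // Downgrade the security level only when the user explicitly set a
    // minimum version below TLS 1.2; the maximum version is ignored.
    public static boolean needsSecurityLevel0(int minVersion) {
        return minVersion != UNSET && minVersion < TLS1_2;
    }

    public static void main(String[] args) {
        // Explicit minimum below 1.2: security level 0 required.
        System.out.println(needsSecurityLevel0(TLS1_1)); // true
        // Minimum unset: default security level is kept, even if a maximum
        // below 1.2 were configured; this is exactly where the two comments
        // in this thread disagree.
        System.out.println(needsSecurityLevel0(UNSET));  // false
    }
}
```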
[GitHub] [nifi] taz1988 opened a new pull request, #7459: NIFI-11334: fix maven version
taz1988 opened a new pull request, #7459: URL: https://github.com/apache/nifi/pull/7459

# Summary

[NIFI-11334](https://issues.apache.org/jira/browse/NIFI-11334)

# Tracking

Please complete the following tracking steps prior to pull request creation.

### Issue Tracking

- [x] [Apache NiFi Jira](https://issues.apache.org/jira/browse/NIFI-11334) issue created

### Pull Request Tracking

- [x] Pull Request title starts with Apache NiFi Jira issue number, such as `NIFI-0`
- [x] Pull Request commit message starts with Apache NiFi Jira issue number, such as `NIFI-0`

### Pull Request Formatting

- [x] Pull Request based on current revision of the `main` branch
- [ ] Pull Request refers to a feature branch with one commit containing changes

# Verification

Please indicate the verification steps performed prior to pull request creation.

### Build

- [x] Build completed using `mvn clean install -P contrib-check`
  - [x] JDK 8

### Licensing

- [ ] New dependencies are compatible with the [Apache License 2.0](https://apache.org/licenses/LICENSE-2.0) according to the [License Policy](https://www.apache.org/legal/resolved.html)
- [ ] New dependencies are documented in applicable `LICENSE` and `NOTICE` files

### Documentation

- [ ] Documentation formatting appears as expected in rendered files