[GitHub] [nifi] emiliosetiadarma opened a new pull request, #7269: NIFI-11549: implemented AzureQueueStorage_v12 processors
emiliosetiadarma opened a new pull request, #7269: URL: https://github.com/apache/nifi/pull/7269 # Summary [NIFI-11549](https://issues.apache.org/jira/browse/NIFI-11549) # Tracking Please complete the following tracking steps prior to pull request creation. ### Issue Tracking - [x] [Apache NiFi Jira](https://issues.apache.org/jira/browse/NIFI) issue created ### Pull Request Tracking - [x] Pull Request title starts with Apache NiFi Jira issue number, such as `NIFI-0` - [x] Pull Request commit message starts with Apache NiFi Jira issue number, as such `NIFI-0` ### Pull Request Formatting - [x] Pull Request based on current revision of the `main` branch - [x] Pull Request refers to a feature branch with one commit containing changes # Verification Please indicate the verification steps performed prior to pull request creation. ### Build - [x] Build completed using `mvn clean install -P contrib-check` - [x] JDK 11 - [ ] JDK 17 ### Licensing - [ ] New dependencies are compatible with the [Apache License 2.0](https://apache.org/licenses/LICENSE-2.0) according to the [License Policy](https://www.apache.org/legal/resolved.html) - [ ] New dependencies are documented in applicable `LICENSE` and `NOTICE` files ### Documentation - [ ] Documentation formatting appears as expected in rendered files -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (NIFI-11552) Support FlowFile attributes in PutIceberg's Table Name property
[ https://issues.apache.org/jira/browse/NIFI-11552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Burgess updated NIFI-11552: Status: Patch Available (was: In Progress) > Support FlowFile attributes in PutIceberg's Table Name property > --- > > Key: NIFI-11552 > URL: https://issues.apache.org/jira/browse/NIFI-11552 > Project: Apache NiFi > Issue Type: Improvement > Components: Extensions >Reporter: Matt Burgess >Assignee: Matt Burgess >Priority: Major > Fix For: 1.latest, 2.latest > > > The documentation for PutIceberg's Table Name property says it doesn’t > support any Expression Language but the code calls the evaluate method on the > property without passing in a FlowFile, so at the very least it supports > Variable Registry. This Jira proposes to add EL support including the > FlowFile attributes for the Table Name property and update the documentation > to reflect the new behavior. -- This message was sent by Atlassian Jira (v8.20.10#820010)
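To illustrate the distinction the issue draws, here is a toy sketch. It is not NiFi's Expression Language implementation. The class `ElScopeSketch`, its `evaluate` method, and the sample names used with it are all hypothetical, and it handles only plain `${name}` references. It assumes FlowFile attributes take precedence over registry-style variables when both define the same name. The point is that an expression evaluated without FlowFile attributes can resolve only against variables, while passing the attributes in lets the same expression resolve per FlowFile.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical toy resolver; NOT the NiFi Expression Language engine.
final class ElScopeSketch {
    // Matches simple ${name} references only (no functions, no nesting).
    private static final Pattern REFERENCE = Pattern.compile("\\$\\{([^}]+)\\}");

    // 'attributes' may be null, which models evaluating without a FlowFile.
    static String evaluate(String value, Map<String, String> variables, Map<String, String> attributes) {
        Map<String, String> scope = new HashMap<>(variables);
        if (attributes != null) {
            // Assumption of this sketch: attributes override variables of the same name.
            scope.putAll(attributes);
        }
        Matcher matcher = REFERENCE.matcher(value);
        StringBuilder result = new StringBuilder();
        while (matcher.find()) {
            // Unknown names resolve to the empty string in this sketch.
            String replacement = scope.getOrDefault(matcher.group(1), "");
            matcher.appendReplacement(result, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(result);
        return result.toString();
    }
}
```

Calling `evaluate("${table.name}", variables, null)` can only ever return the variable's value, whereas passing a per-FlowFile attribute map lets each FlowFile route to its own table.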
[GitHub] [nifi] mattyb149 opened a new pull request, #7268: NIFI-11552: Support FlowFile Attributes in some PutIceberg properties
mattyb149 opened a new pull request, #7268: URL: https://github.com/apache/nifi/pull/7268 # Summary [NIFI-11552](https://issues.apache.org/jira/browse/NIFI-11552) This PR adds support for FlowFile attributes in Expression Language for such PutIceberg properties as Catalog Name and Table Name. # Tracking Please complete the following tracking steps prior to pull request creation. ### Issue Tracking - [x] [Apache NiFi Jira](https://issues.apache.org/jira/browse/NIFI) issue created ### Pull Request Tracking - [x] Pull Request title starts with Apache NiFi Jira issue number, such as `NIFI-11552` - [x] Pull Request commit message starts with Apache NiFi Jira issue number, as such `NIFI-11552` ### Pull Request Formatting - [x] Pull Request based on current revision of the `main` branch - [x] Pull Request refers to a feature branch with one commit containing changes # Verification Please indicate the verification steps performed prior to pull request creation. ### Build - [ ] Build completed using `mvn clean install -P contrib-check` - [x] JDK 11 - [ ] JDK 17 ### Licensing - [ ] New dependencies are compatible with the [Apache License 2.0](https://apache.org/licenses/LICENSE-2.0) according to the [License Policy](https://www.apache.org/legal/resolved.html) - [ ] New dependencies are documented in applicable `LICENSE` and `NOTICE` files ### Documentation - [ ] Documentation formatting appears as expected in rendered files
[jira] [Assigned] (NIFI-11552) Support FlowFile attributes in PutIceberg's Table Name property
[ https://issues.apache.org/jira/browse/NIFI-11552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Burgess reassigned NIFI-11552: --- Assignee: Matt Burgess > Support FlowFile attributes in PutIceberg's Table Name property > --- > > Key: NIFI-11552 > URL: https://issues.apache.org/jira/browse/NIFI-11552 > Project: Apache NiFi > Issue Type: Improvement > Components: Extensions >Reporter: Matt Burgess >Assignee: Matt Burgess >Priority: Major > Fix For: 1.latest, 2.latest > > > The documentation for PutIceberg's Table Name property says it doesn’t > support any Expression Language but the code calls the evaluate method on the > property without passing in a FlowFile, so at the very least it supports > Variable Registry. This Jira proposes to add EL support including the > FlowFile attributes for the Table Name property and update the documentation > to reflect the new behavior.
[GitHub] [nifi] exceptionfactory commented on a diff in pull request #7194: NIFI-11167 - Add Excel Record Reader
exceptionfactory commented on code in PR #7194: URL: https://github.com/apache/nifi/pull/7194#discussion_r1198284268 ## nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/excel/RowIterator.java: ## @@ -0,0 +1,155 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */package org.apache.nifi.excel; + +import com.github.pjfanning.xlsx.StreamingReader; +import org.apache.nifi.logging.ComponentLog; +import org.apache.poi.ss.usermodel.Row; +import org.apache.poi.ss.usermodel.Sheet; +import org.apache.poi.ss.usermodel.Workbook; + +import java.io.Closeable; +import java.io.IOException; +import java.io.InputStream; +import java.util.HashMap; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import java.util.stream.Collectors; + +public class RowIterator implements Iterator, Closeable { +private final Workbook workbook; +private final Iterator sheets; +private Sheet currentSheet; +private Iterator currentRows; +private final Map desiredSheets; +private final int firstRow; +private ComponentLog logger; +private boolean log; +private Row currentRow; + +public RowIterator(InputStream in, List desiredSheets, int firstRow) { +this(in, desiredSheets, firstRow, null); +} + +public RowIterator(InputStream in, List desiredSheets, int firstRow, ComponentLog logger) { +this.workbook = StreamingReader.builder() +.rowCacheSize(100) +.bufferSize(4096) +.open(in); +this.sheets = this.workbook.iterator(); +this.desiredSheets = desiredSheets != null ? 
desiredSheets.stream() +.collect(Collectors.toMap(key -> key, value -> Boolean.FALSE)) : new HashMap<>(); +this.firstRow = firstRow; +this.logger = logger; +this.log = logger != null; +} + +@Override +public boolean hasNext() { +setCurrent(); +boolean next = currentRow != null; +if(!next) { +String sheetsNotFound = getSheetsNotFound(desiredSheets); +if (!sheetsNotFound.isEmpty() && log) { +logger.warn("Excel sheet(s) not found: {}", sheetsNotFound); +} +} +return next; +} + +private void setCurrent() { +currentRow = getNextRow(); +if (currentRow != null) { +return; +} + +currentSheet = null; +currentRows = null; +while (sheets.hasNext()) { +currentSheet = sheets.next(); +if (isIterateOverAllSheets() || hasSheet(currentSheet.getSheetName())) { +currentRows = currentSheet.iterator(); +currentRow = getNextRow(); +if (currentRow != null) { +return; +} +} +} +} + +private Row getNextRow() { +while (currentRows != null && !hasExhaustedRows()) { +Row tempCurrentRow = currentRows.next(); +if (!isSkip(tempCurrentRow)) { +return tempCurrentRow; +} +} +return null; +} + +private boolean hasExhaustedRows() { +boolean exhausted = !currentRows.hasNext(); +if (log && exhausted) { +logger.info("Exhausted all rows from sheet {}", currentSheet.getSheetName()); +} +return exhausted; +} + +private boolean isSkip(Row row) { +return row.getRowNum() < firstRow; +} + +private boolean isIterateOverAllSheets() { +boolean iterateAllSheets = desiredSheets.isEmpty(); +if (iterateAllSheets && log) { +logger.info("Advanced to sheet {}", currentSheet.getSheetName()); +} +return iterateAllSheets; +} + +private boolean hasSheet(String name) { +boolean sheetByName = !desiredSheets.isEmpty() +&& desiredSheets.keySet().stream() +.anyMatch(desiredSheet -> desiredSheet.equalsIgnoreCase(name)); +if (sheetByName) { +desiredSheets.put(name, Boolean.TRUE); +} +return sheetByName; +} + +
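The quoted diff walks a workbook sheet by sheet, skips rows before a configured first row, and records which requested sheet names were never seen. The following is a minimal, self-contained sketch of that traversal using plain collections instead of Apache POI. `SheetRowIteratorSketch`, its constructor parameters, and `sheetsNotFound` are hypothetical names, rows are modeled as non-null strings, and the type parameters are assumptions because the quoted code shows raw types. It is not the code under review.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in: a "workbook" is an ordered map of sheet name -> rows.
final class SheetRowIteratorSketch implements Iterator<String> {
    private final Iterator<Map.Entry<String, List<String>>> sheets;
    private final Set<String> desired = new TreeSet<>(); // lower-cased; empty means all sheets
    private final Set<String> found = new TreeSet<>();   // lower-cased names actually visited
    private final int firstRow;
    private List<String> currentRows = List.of();
    private int nextIndex;
    private String lookahead; // buffered row, so hasNext() can be called repeatedly

    SheetRowIteratorSketch(Map<String, List<String>> workbook, List<String> desiredSheets, int firstRow) {
        this.sheets = workbook.entrySet().iterator();
        this.firstRow = firstRow;
        if (desiredSheets != null) {
            for (String name : desiredSheets) {
                desired.add(name.trim().toLowerCase(Locale.ROOT));
            }
        }
    }

    @Override
    public boolean hasNext() {
        if (lookahead == null) {
            lookahead = advance();
        }
        return lookahead != null;
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        String row = lookahead;
        lookahead = null;
        return row;
    }

    // Returns the next row at or after firstRow, moving to later sheets as needed.
    private String advance() {
        while (true) {
            if (nextIndex < currentRows.size()) {
                return currentRows.get(nextIndex++);
            }
            if (!sheets.hasNext()) {
                return null;
            }
            Map.Entry<String, List<String>> sheet = sheets.next();
            String key = sheet.getKey().toLowerCase(Locale.ROOT);
            if (desired.isEmpty() || desired.contains(key)) {
                found.add(key);
                currentRows = sheet.getValue();
                nextIndex = Math.max(firstRow, 0);
            } else {
                currentRows = List.of();
                nextIndex = 0;
            }
        }
    }

    // Only meaningful once iteration has finished.
    Set<String> sheetsNotFound() {
        Set<String> missing = new TreeSet<>(desired);
        missing.removeAll(found);
        return missing;
    }
}
```

Two design choices in this sketch: sheet names are normalized to lower case once, so matching and bookkeeping use the same key, and `hasNext()` buffers a look-ahead row so that calling it more than once does not consume rows.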
[GitHub] [nifi] dan-s1 commented on a diff in pull request #7194: NIFI-11167 - Add Excel Record Reader
dan-s1 commented on code in PR #7194: URL: https://github.com/apache/nifi/pull/7194#discussion_r1198279649 ## nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/excel/RowIterator.java: ## @@ -0,0 +1,155 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */package org.apache.nifi.excel; + +import com.github.pjfanning.xlsx.StreamingReader; +import org.apache.nifi.logging.ComponentLog; +import org.apache.poi.ss.usermodel.Row; +import org.apache.poi.ss.usermodel.Sheet; +import org.apache.poi.ss.usermodel.Workbook; + +import java.io.Closeable; +import java.io.IOException; +import java.io.InputStream; +import java.util.HashMap; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import java.util.stream.Collectors; + +public class RowIterator implements Iterator, Closeable { +private final Workbook workbook; +private final Iterator sheets; +private Sheet currentSheet; +private Iterator currentRows; +private final Map desiredSheets; +private final int firstRow; +private ComponentLog logger; +private boolean log; +private Row currentRow; + +public RowIterator(InputStream in, List desiredSheets, int firstRow) { +this(in, desiredSheets, firstRow, null); +} + +public RowIterator(InputStream in, List desiredSheets, int firstRow, ComponentLog logger) { +this.workbook = StreamingReader.builder() +.rowCacheSize(100) +.bufferSize(4096) +.open(in); +this.sheets = this.workbook.iterator(); +this.desiredSheets = desiredSheets != null ? 
desiredSheets.stream() +.collect(Collectors.toMap(key -> key, value -> Boolean.FALSE)) : new HashMap<>(); +this.firstRow = firstRow; +this.logger = logger; +this.log = logger != null; +} + +@Override +public boolean hasNext() { +setCurrent(); +boolean next = currentRow != null; +if(!next) { +String sheetsNotFound = getSheetsNotFound(desiredSheets); +if (!sheetsNotFound.isEmpty() && log) { +logger.warn("Excel sheet(s) not found: {}", sheetsNotFound); +} +} +return next; +} + +private void setCurrent() { +currentRow = getNextRow(); +if (currentRow != null) { +return; +} + +currentSheet = null; +currentRows = null; +while (sheets.hasNext()) { +currentSheet = sheets.next(); +if (isIterateOverAllSheets() || hasSheet(currentSheet.getSheetName())) { +currentRows = currentSheet.iterator(); +currentRow = getNextRow(); +if (currentRow != null) { +return; +} +} +} +} + +private Row getNextRow() { +while (currentRows != null && !hasExhaustedRows()) { +Row tempCurrentRow = currentRows.next(); +if (!isSkip(tempCurrentRow)) { +return tempCurrentRow; +} +} +return null; +} + +private boolean hasExhaustedRows() { +boolean exhausted = !currentRows.hasNext(); +if (log && exhausted) { +logger.info("Exhausted all rows from sheet {}", currentSheet.getSheetName()); +} +return exhausted; +} + +private boolean isSkip(Row row) { +return row.getRowNum() < firstRow; +} + +private boolean isIterateOverAllSheets() { +boolean iterateAllSheets = desiredSheets.isEmpty(); +if (iterateAllSheets && log) { +logger.info("Advanced to sheet {}", currentSheet.getSheetName()); +} +return iterateAllSheets; +} + +private boolean hasSheet(String name) { +boolean sheetByName = !desiredSheets.isEmpty() +&& desiredSheets.keySet().stream() +.anyMatch(desiredSheet -> desiredSheet.equalsIgnoreCase(name)); +if (sheetByName) { +desiredSheets.put(name, Boolean.TRUE); +} +return sheetByName; +} + +
[GitHub] [nifi] exceptionfactory commented on a diff in pull request #7194: NIFI-11167 - Add Excel Record Reader
exceptionfactory commented on code in PR #7194: URL: https://github.com/apache/nifi/pull/7194#discussion_r1198272978 ## nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/excel/RowIterator.java: ## @@ -0,0 +1,155 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */package org.apache.nifi.excel; + +import com.github.pjfanning.xlsx.StreamingReader; +import org.apache.nifi.logging.ComponentLog; +import org.apache.poi.ss.usermodel.Row; +import org.apache.poi.ss.usermodel.Sheet; +import org.apache.poi.ss.usermodel.Workbook; + +import java.io.Closeable; +import java.io.IOException; +import java.io.InputStream; +import java.util.HashMap; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import java.util.stream.Collectors; + +public class RowIterator implements Iterator, Closeable { +private final Workbook workbook; +private final Iterator sheets; +private Sheet currentSheet; +private Iterator currentRows; +private final Map desiredSheets; +private final int firstRow; +private ComponentLog logger; +private boolean log; +private Row currentRow; + +public RowIterator(InputStream in, List desiredSheets, int firstRow) { +this(in, desiredSheets, firstRow, null); +} + +public RowIterator(InputStream in, List desiredSheets, int firstRow, ComponentLog logger) { +this.workbook = StreamingReader.builder() +.rowCacheSize(100) +.bufferSize(4096) +.open(in); +this.sheets = this.workbook.iterator(); +this.desiredSheets = desiredSheets != null ? 
desiredSheets.stream() +.collect(Collectors.toMap(key -> key, value -> Boolean.FALSE)) : new HashMap<>(); +this.firstRow = firstRow; +this.logger = logger; +this.log = logger != null; +} + +@Override +public boolean hasNext() { +setCurrent(); +boolean next = currentRow != null; +if(!next) { +String sheetsNotFound = getSheetsNotFound(desiredSheets); +if (!sheetsNotFound.isEmpty() && log) { +logger.warn("Excel sheet(s) not found: {}", sheetsNotFound); +} +} +return next; +} + +private void setCurrent() { +currentRow = getNextRow(); +if (currentRow != null) { +return; +} + +currentSheet = null; +currentRows = null; +while (sheets.hasNext()) { +currentSheet = sheets.next(); +if (isIterateOverAllSheets() || hasSheet(currentSheet.getSheetName())) { +currentRows = currentSheet.iterator(); +currentRow = getNextRow(); +if (currentRow != null) { +return; +} +} +} +} + +private Row getNextRow() { +while (currentRows != null && !hasExhaustedRows()) { +Row tempCurrentRow = currentRows.next(); +if (!isSkip(tempCurrentRow)) { +return tempCurrentRow; +} +} +return null; +} + +private boolean hasExhaustedRows() { +boolean exhausted = !currentRows.hasNext(); +if (log && exhausted) { +logger.info("Exhausted all rows from sheet {}", currentSheet.getSheetName()); +} +return exhausted; +} + +private boolean isSkip(Row row) { +return row.getRowNum() < firstRow; +} + +private boolean isIterateOverAllSheets() { +boolean iterateAllSheets = desiredSheets.isEmpty(); +if (iterateAllSheets && log) { +logger.info("Advanced to sheet {}", currentSheet.getSheetName()); +} +return iterateAllSheets; +} + +private boolean hasSheet(String name) { +boolean sheetByName = !desiredSheets.isEmpty() +&& desiredSheets.keySet().stream() +.anyMatch(desiredSheet -> desiredSheet.equalsIgnoreCase(name)); +if (sheetByName) { +desiredSheets.put(name, Boolean.TRUE); +} +return sheetByName; +} + +
[GitHub] [nifi] exceptionfactory commented on a diff in pull request #7194: NIFI-11167 - Add Excel Record Reader
exceptionfactory commented on code in PR #7194: URL: https://github.com/apache/nifi/pull/7194#discussion_r1198269634 ## nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/excel/ExcelReader.java: ## @@ -0,0 +1,193 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.nifi.excel; + +import org.apache.commons.lang3.StringUtils; +import org.apache.nifi.annotation.documentation.CapabilityDescription; +import org.apache.nifi.annotation.documentation.Tags; +import org.apache.nifi.annotation.lifecycle.OnEnabled; +import org.apache.nifi.components.AllowableValue; +import org.apache.nifi.components.PropertyDescriptor; +import org.apache.nifi.context.PropertyContext; +import org.apache.nifi.controller.ConfigurationContext; +import org.apache.nifi.expression.ExpressionLanguageScope; +import org.apache.nifi.logging.ComponentLog; +import org.apache.nifi.processor.util.StandardValidators; +import org.apache.nifi.schema.access.SchemaAccessStrategy; +import org.apache.nifi.schema.access.SchemaNotFoundException; +import org.apache.nifi.schema.inference.InferSchemaAccessStrategy; +import org.apache.nifi.schema.inference.RecordSourceFactory; +import org.apache.nifi.schema.inference.SchemaInferenceEngine; +import org.apache.nifi.schema.inference.SchemaInferenceUtil; +import org.apache.nifi.schema.inference.TimeValueInference; +import org.apache.nifi.schemaregistry.services.SchemaRegistry; +import org.apache.nifi.serialization.DateTimeUtils; +import org.apache.nifi.serialization.MalformedRecordException; +import org.apache.nifi.serialization.RecordReader; +import org.apache.nifi.serialization.RecordReaderFactory; +import org.apache.nifi.serialization.SchemaRegistryService; +import org.apache.nifi.serialization.record.RecordSchema; +import org.apache.nifi.stream.io.NonCloseableInputStream; +import org.apache.poi.ss.usermodel.Row; + +import java.io.IOException; +import java.io.InputStream; +import java.util.ArrayList; +import java.util.List; +import java.util.Map; +import java.util.concurrent.atomic.AtomicReferenceArray; +import java.util.stream.IntStream; + +@Tags({"excel", "spreadsheet", "xlsx", "parse", "record", "row", "reader", "values", "cell"}) +@CapabilityDescription("Parses a Microsoft Excel document returning each 
row in each sheet as a separate record. " ++ "This reader allows for inferring a schema either based on the first line of an Excel sheet if a 'header line' is " ++ "present or from all the desired sheets, or providing an explicit schema " ++ "for interpreting the values. See Controller Service's Usage for further documentation. " ++ "This reader is currently only capable of processing .xlsx " ++ "(XSSF 2007 OOXML file format) Excel documents and not older .xls (HSSF '97(-2007) file format) documents.)") +public class ExcelReader extends SchemaRegistryService implements RecordReaderFactory { + +private static final AllowableValue HEADER_DERIVED = new AllowableValue("excel-header-derived", "Use fields From Header", +"The first chosen row of the Excel sheet is a header row that contains the columns representative of all the rows " + +"in the desired sheets. The schema will be derived by using those columns in the header."); +public static final PropertyDescriptor DESIRED_SHEETS = new PropertyDescriptor +.Builder().name("extract-sheets") +.displayName("Sheets to Extract") +.description("Comma separated list of Excel document sheet names whose rows should be extracted from the excel document. If this property" + +" is left blank then all the rows from all the sheets will be extracted from the Excel document. The list of names is case in-sensitive. Any sheets not" + +" specified in this value will be ignored. A bulletin will be generated if a specified sheet(s) are not found.") +.required(false) + .expressionLanguageSupported(ExpressionLanguageScope.FLOWFILE_ATTRIBUTES) +.addValidator(StandardValidators.NON_EMPTY_VALIDATOR) +
[jira] [Updated] (NIFI-5151) Patch Nifi with Upsert functions for PutDatabaseRecord processor
[ https://issues.apache.org/jira/browse/NIFI-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Burgess updated NIFI-5151: --- Fix Version/s: 2.0.0 1.22.0 (was: 1.latest) (was: 2.latest) Resolution: Fixed Status: Resolved (was: Patch Available) > Patch Nifi with Upsert functions for PutDatabaseRecord processor > > > Key: NIFI-5151 > URL: https://issues.apache.org/jira/browse/NIFI-5151 > Project: Apache NiFi > Issue Type: Improvement > Components: Extensions >Affects Versions: 1.7.0 >Reporter: Karl Amundsson >Assignee: Lehel Boér >Priority: Major > Labels: Processor > Fix For: 2.0.0, 1.22.0 > > Attachments: > 0001-NIFI-5151-Adding-support-for-UPSERT-in-PutDatabaseRe.patch, > 0001-NIFI-5151-Using-DatabaseAdapter-to-generate-INSERT-S.patch > > Original Estimate: 0h > Remaining Estimate: 0h > > Since Phoenix doesn't support the SQL statement INSERT you have to use a > process like: ConvertAttributesToJSON->ConvertJSONToSQL in Insert > mode->ReplaceText to replace "INSERT" with "UPSERT" -> PutSQL (See: > [https://community.hortonworks.com/questions/40561/nifi-phoenix-processor.html)] > With this patch you can choose to use UPSERT directly.
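As a rough illustration of what generating UPSERT directly means, compared with rewriting "INSERT" to "UPSERT" in already-generated SQL, here is a minimal sketch that builds a parameterized Phoenix-style `UPSERT INTO ... VALUES` statement. `UpsertSketch` and `buildUpsert` are hypothetical names; this is not the code from the attached patches or from PutDatabaseRecord, and it assumes table and column names are already valid, trusted identifiers (no quoting or escaping is done).

```java
import java.util.Collections;
import java.util.List;

// Hypothetical helper, for illustration only.
final class UpsertSketch {
    // Builds e.g. "UPSERT INTO T (A, B) VALUES (?, ?)" for use with a PreparedStatement.
    static String buildUpsert(String table, List<String> columns) {
        if (columns.isEmpty()) {
            throw new IllegalArgumentException("at least one column is required");
        }
        String columnList = String.join(", ", columns);
        // One "?" placeholder per column, so values are bound rather than concatenated.
        String placeholders = String.join(", ", Collections.nCopies(columns.size(), "?"));
        return "UPSERT INTO " + table + " (" + columnList + ") VALUES (" + placeholders + ")";
    }
}
```

Generating the statement in one step avoids the text-replacement hop described in the issue, where a literal "INSERT" inside data or identifiers could also be rewritten by accident.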
[jira] [Commented] (NIFI-5151) Patch Nifi with Upsert functions for PutDatabaseRecord processor
[ https://issues.apache.org/jira/browse/NIFI-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17724010#comment-17724010 ] ASF subversion and git services commented on NIFI-5151: --- Commit 6c70471cc6fefe00b9a69a6eeba8dbd9f0a5c7aa in nifi's branch refs/heads/main from Lehel Boér [ https://gitbox.apache.org/repos/asf?p=nifi.git;h=6c70471cc6 ] NIFI-5151: Add UPSERT support for Apache Phoenix Signed-off-by: Matthew Burgess This closes #7263 > Patch Nifi with Upsert functions for PutDatabaseRecord processor > > > Key: NIFI-5151 > URL: https://issues.apache.org/jira/browse/NIFI-5151 > Project: Apache NiFi > Issue Type: Improvement > Components: Extensions >Affects Versions: 1.7.0 >Reporter: Karl Amundsson >Assignee: Lehel Boér >Priority: Major > Labels: Processor > Fix For: 1.latest, 2.latest > > Attachments: > 0001-NIFI-5151-Adding-support-for-UPSERT-in-PutDatabaseRe.patch, > 0001-NIFI-5151-Using-DatabaseAdapter-to-generate-INSERT-S.patch > > Original Estimate: 0h > Remaining Estimate: 0h > > Since Phoenix doesn't support the SQL statement INSERT you have to use a > process like: ConvertAttributesToJSON->ConvertJSONToSQL in Insert > mode->ReplaceText to replace "INSERT" with "UPSERT" -> PutSQL (See: > [https://community.hortonworks.com/questions/40561/nifi-phoenix-processor.html)] > With this patch you can choose to use UPSERT directly.
[jira] [Commented] (NIFI-5151) Patch Nifi with Upsert functions for PutDatabaseRecord processor
[ https://issues.apache.org/jira/browse/NIFI-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17724009#comment-17724009 ] ASF subversion and git services commented on NIFI-5151: --- Commit 4fa47ecc2ad02fe266bcfee38b954d00e7aeddf8 in nifi's branch refs/heads/support/nifi-1.x from Lehel Boér [ https://gitbox.apache.org/repos/asf?p=nifi.git;h=4fa47ecc2a ] NIFI-5151: Add UPSERT support for Apache Phoenix Signed-off-by: Matthew Burgess > Patch Nifi with Upsert functions for PutDatabaseRecord processor > > > Key: NIFI-5151 > URL: https://issues.apache.org/jira/browse/NIFI-5151 > Project: Apache NiFi > Issue Type: Improvement > Components: Extensions >Affects Versions: 1.7.0 >Reporter: Karl Amundsson >Assignee: Lehel Boér >Priority: Major > Labels: Processor > Fix For: 1.latest, 2.latest > > Attachments: > 0001-NIFI-5151-Adding-support-for-UPSERT-in-PutDatabaseRe.patch, > 0001-NIFI-5151-Using-DatabaseAdapter-to-generate-INSERT-S.patch > > Original Estimate: 0h > Remaining Estimate: 0h > > Since Phoenix doesn't support the SQL statement INSERT you have to use a > process like: ConvertAttributesToJSON->ConvertJSONToSQL in Insert > mode->ReplaceText to replace "INSERT" with "UPSERT" -> PutSQL (See: > [https://community.hortonworks.com/questions/40561/nifi-phoenix-processor.html)] > With this patch you can choose to use UPSERT directly.
[GitHub] [nifi] dan-s1 commented on a diff in pull request #7194: NIFI-11167 - Add Excel Record Reader
dan-s1 commented on code in PR #7194: URL: https://github.com/apache/nifi/pull/7194#discussion_r1198252637 ## nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/excel/ExcelReader.java: ## @@ -0,0 +1,193 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.nifi.excel; + +import org.apache.commons.lang3.StringUtils; +import org.apache.nifi.annotation.documentation.CapabilityDescription; +import org.apache.nifi.annotation.documentation.Tags; +import org.apache.nifi.annotation.lifecycle.OnEnabled; +import org.apache.nifi.components.AllowableValue; +import org.apache.nifi.components.PropertyDescriptor; +import org.apache.nifi.context.PropertyContext; +import org.apache.nifi.controller.ConfigurationContext; +import org.apache.nifi.expression.ExpressionLanguageScope; +import org.apache.nifi.logging.ComponentLog; +import org.apache.nifi.processor.util.StandardValidators; +import org.apache.nifi.schema.access.SchemaAccessStrategy; +import org.apache.nifi.schema.access.SchemaNotFoundException; +import org.apache.nifi.schema.inference.InferSchemaAccessStrategy; +import org.apache.nifi.schema.inference.RecordSourceFactory; +import org.apache.nifi.schema.inference.SchemaInferenceEngine; +import org.apache.nifi.schema.inference.SchemaInferenceUtil; +import org.apache.nifi.schema.inference.TimeValueInference; +import org.apache.nifi.schemaregistry.services.SchemaRegistry; +import org.apache.nifi.serialization.DateTimeUtils; +import org.apache.nifi.serialization.MalformedRecordException; +import org.apache.nifi.serialization.RecordReader; +import org.apache.nifi.serialization.RecordReaderFactory; +import org.apache.nifi.serialization.SchemaRegistryService; +import org.apache.nifi.serialization.record.RecordSchema; +import org.apache.nifi.stream.io.NonCloseableInputStream; +import org.apache.poi.ss.usermodel.Row; + +import java.io.IOException; +import java.io.InputStream; +import java.util.ArrayList; +import java.util.List; +import java.util.Map; +import java.util.concurrent.atomic.AtomicReferenceArray; +import java.util.stream.IntStream; + +@Tags({"excel", "spreadsheet", "xlsx", "parse", "record", "row", "reader", "values", "cell"}) +@CapabilityDescription("Parses a Microsoft Excel document returning each 
row in each sheet as a separate record. " ++ "This reader allows for inferring a schema either based on the first line of an Excel sheet if a 'header line' is " ++ "present or from all the desired sheets, or providing an explicit schema " ++ "for interpreting the values. See Controller Service's Usage for further documentation. " ++ "This reader is currently only capable of processing .xlsx " ++ "(XSSF 2007 OOXML file format) Excel documents and not older .xls (HSSF '97(-2007) file format) documents.)") +public class ExcelReader extends SchemaRegistryService implements RecordReaderFactory { + +private static final AllowableValue HEADER_DERIVED = new AllowableValue("excel-header-derived", "Use fields From Header", +"The first chosen row of the Excel sheet is a header row that contains the columns representative of all the rows " + +"in the desired sheets. The schema will be derived by using those columns in the header."); +public static final PropertyDescriptor DESIRED_SHEETS = new PropertyDescriptor +.Builder().name("extract-sheets") +.displayName("Sheets to Extract") +.description("Comma separated list of Excel document sheet names whose rows should be extracted from the excel document. If this property" + +" is left blank then all the rows from all the sheets will be extracted from the Excel document. The list of names is case in-sensitive. Any sheets not" + +" specified in this value will be ignored. A bulletin will be generated if a specified sheet(s) are not found.") +.required(false) + .expressionLanguageSupported(ExpressionLanguageScope.FLOWFILE_ATTRIBUTES) +.addValidator(StandardValidators.NON_EMPTY_VALIDATOR) +
[GitHub] [nifi] mattyb149 closed pull request #7263: NIFI-5151: Add UPSERT support for Apache Phoenix
mattyb149 closed pull request #7263: NIFI-5151: Add UPSERT support for Apache Phoenix URL: https://github.com/apache/nifi/pull/7263
[jira] [Updated] (NIFI-5151) Patch Nifi with Upsert functions for PutDatabaseRecord processor
[ https://issues.apache.org/jira/browse/NIFI-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Burgess updated NIFI-5151: --- Fix Version/s: 1.latest 2.latest > Patch Nifi with Upsert functions for PutDatabaseRecord processor > > > Key: NIFI-5151 > URL: https://issues.apache.org/jira/browse/NIFI-5151 > Project: Apache NiFi > Issue Type: Improvement > Components: Extensions >Affects Versions: 1.7.0 >Reporter: Karl Amundsson >Assignee: Lehel Boér >Priority: Major > Labels: Processor > Fix For: 1.latest, 2.latest > > Attachments: > 0001-NIFI-5151-Adding-support-for-UPSERT-in-PutDatabaseRe.patch, > 0001-NIFI-5151-Using-DatabaseAdapter-to-generate-INSERT-S.patch > > Original Estimate: 0h > Remaining Estimate: 0h > > Since Phoenix doesn't support the SQL statement INSERT, you have to use a > process like: ConvertAttributesToJSON -> ConvertJSONToSQL in Insert > mode -> ReplaceText to replace "INSERT" with "UPSERT" -> PutSQL (See: > https://community.hortonworks.com/questions/40561/nifi-phoenix-processor.html) > With this patch you can choose to use UPSERT directly. -- This message was sent by Atlassian Jira (v8.20.10#820010)
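For readers unfamiliar with the statement discussed in this issue: Phoenix writes rows with `UPSERT INTO ... VALUES` rather than `INSERT`. The following is a minimal sketch of generating a parameterized UPSERT for a list of columns. The class and method names (`UpsertStatementSketch`, `buildUpsert`) are hypothetical and are not taken from the attached patches or from PutDatabaseRecord.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, for illustration only; not the code in the NIFI-5151 patches.
class UpsertStatementSketch {

    // Builds "UPSERT INTO <table> (<c1>, <c2>, ...) VALUES (?, ?, ...)".
    // Assumes table and column names are already validated/quoted by the caller.
    public static String buildUpsert(String table, List<String> columns) {
        if (columns == null || columns.isEmpty()) {
            throw new IllegalArgumentException("At least one column is required");
        }
        String columnList = String.join(", ", columns);
        // One "?" placeholder per column, so values can be bound on a PreparedStatement.
        String placeholders = String.join(", ", Collections.nCopies(columns.size(), "?"));
        return "UPSERT INTO " + table + " (" + columnList + ") VALUES (" + placeholders + ")";
    }

    public static void main(String[] args) {
        System.out.println(buildUpsert("USERS", Arrays.asList("ID", "NAME")));
    }
}
```

The resulting string can be passed to `Connection.prepareStatement` in the same way as a generated INSERT, which is what removes the need for the ReplaceText step described in the issue.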
[GitHub] [nifi] mattyb149 commented on pull request #7263: NIFI-5151: Add UPSERT support for Apache Phoenix
mattyb149 commented on PR #7263: URL: https://github.com/apache/nifi/pull/7263#issuecomment-1553571399 +1 LGTM, tried on a live Phoenix/HBase instance, was able to add new rows as well as updating existing rows using UPSERT Thanks for the improvement! Merging to main
[GitHub] [nifi] dan-s1 commented on a diff in pull request #7194: NIFI-11167 - Add Excel Record Reader
dan-s1 commented on code in PR #7194: URL: https://github.com/apache/nifi/pull/7194#discussion_r1198212598 ## nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/excel/RowIterator.java: ## @@ -0,0 +1,155 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */package org.apache.nifi.excel; + +import com.github.pjfanning.xlsx.StreamingReader; +import org.apache.nifi.logging.ComponentLog; +import org.apache.poi.ss.usermodel.Row; +import org.apache.poi.ss.usermodel.Sheet; +import org.apache.poi.ss.usermodel.Workbook; + +import java.io.Closeable; +import java.io.IOException; +import java.io.InputStream; +import java.util.HashMap; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import java.util.stream.Collectors; + +public class RowIterator implements Iterator, Closeable { +private final Workbook workbook; +private final Iterator sheets; +private Sheet currentSheet; +private Iterator currentRows; +private final Map desiredSheets; +private final int firstRow; +private ComponentLog logger; +private boolean log; +private Row currentRow; + +public RowIterator(InputStream in, List desiredSheets, int firstRow) { +this(in, desiredSheets, firstRow, null); +} + +public RowIterator(InputStream in, List desiredSheets, int firstRow, ComponentLog logger) { +this.workbook = StreamingReader.builder() +.rowCacheSize(100) +.bufferSize(4096) +.open(in); +this.sheets = this.workbook.iterator(); +this.desiredSheets = desiredSheets != null ? 
desiredSheets.stream() +.collect(Collectors.toMap(key -> key, value -> Boolean.FALSE)) : new HashMap<>(); +this.firstRow = firstRow; +this.logger = logger; +this.log = logger != null; +} + +@Override +public boolean hasNext() { +setCurrent(); +boolean next = currentRow != null; +if(!next) { +String sheetsNotFound = getSheetsNotFound(desiredSheets); +if (!sheetsNotFound.isEmpty() && log) { +logger.warn("Excel sheet(s) not found: {}", sheetsNotFound); +} +} +return next; +} + +private void setCurrent() { +currentRow = getNextRow(); +if (currentRow != null) { +return; +} + +currentSheet = null; +currentRows = null; +while (sheets.hasNext()) { +currentSheet = sheets.next(); +if (isIterateOverAllSheets() || hasSheet(currentSheet.getSheetName())) { +currentRows = currentSheet.iterator(); +currentRow = getNextRow(); +if (currentRow != null) { +return; +} +} +} +} + +private Row getNextRow() { +while (currentRows != null && !hasExhaustedRows()) { +Row tempCurrentRow = currentRows.next(); +if (!isSkip(tempCurrentRow)) { +return tempCurrentRow; +} +} +return null; +} + +private boolean hasExhaustedRows() { +boolean exhausted = !currentRows.hasNext(); +if (log && exhausted) { +logger.info("Exhausted all rows from sheet {}", currentSheet.getSheetName()); +} +return exhausted; +} + +private boolean isSkip(Row row) { +return row.getRowNum() < firstRow; +} + +private boolean isIterateOverAllSheets() { +boolean iterateAllSheets = desiredSheets.isEmpty(); +if (iterateAllSheets && log) { +logger.info("Advanced to sheet {}", currentSheet.getSheetName()); +} +return iterateAllSheets; +} + +private boolean hasSheet(String name) { +boolean sheetByName = !desiredSheets.isEmpty() +&& desiredSheets.keySet().stream() +.anyMatch(desiredSheet -> desiredSheet.equalsIgnoreCase(name)); +if (sheetByName) { +desiredSheets.put(name, Boolean.TRUE); +} +return sheetByName; +} + +
[GitHub] [nifi] exceptionfactory commented on a diff in pull request #7194: NIFI-11167 - Add Excel Record Reader
exceptionfactory commented on code in PR #7194: URL: https://github.com/apache/nifi/pull/7194#discussion_r1198210056 ## nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/excel/ExcelUtils.java: ## @@ -0,0 +1,28 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.nifi.excel; + +import org.apache.poi.ss.usermodel.Row; + +public class ExcelUtils { Review Comment: Thanks, it is fine for now, leaving it as is sounds good. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [nifi] dan-s1 commented on a diff in pull request #7194: NIFI-11167 - Add Excel Record Reader
dan-s1 commented on code in PR #7194: URL: https://github.com/apache/nifi/pull/7194#discussion_r1198208433 ## nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/excel/ExcelUtils.java: ## @@ -0,0 +1,28 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.nifi.excel; + +import org.apache.poi.ss.usermodel.Row; + +public class ExcelUtils { Review Comment: @exceptionfactory What would you like me to do for this? Leave as is, declare the method static in one class and have the others use it or duplicate it for all 3 classes that use it? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [nifi] exceptionfactory commented on a diff in pull request #7194: NIFI-11167 - Add Excel Record Reader
exceptionfactory commented on code in PR #7194: URL: https://github.com/apache/nifi/pull/7194#discussion_r1198205592 ## nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/excel/RowIterator.java: ## @@ -0,0 +1,155 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */package org.apache.nifi.excel; + +import com.github.pjfanning.xlsx.StreamingReader; +import org.apache.nifi.logging.ComponentLog; +import org.apache.poi.ss.usermodel.Row; +import org.apache.poi.ss.usermodel.Sheet; +import org.apache.poi.ss.usermodel.Workbook; + +import java.io.Closeable; +import java.io.IOException; +import java.io.InputStream; +import java.util.HashMap; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import java.util.stream.Collectors; + +public class RowIterator implements Iterator, Closeable { +private final Workbook workbook; +private final Iterator sheets; +private Sheet currentSheet; +private Iterator currentRows; +private final Map desiredSheets; +private final int firstRow; +private ComponentLog logger; +private boolean log; +private Row currentRow; + +public RowIterator(InputStream in, List desiredSheets, int firstRow) { +this(in, desiredSheets, firstRow, null); +} + +public RowIterator(InputStream in, List desiredSheets, int firstRow, ComponentLog logger) { +this.workbook = StreamingReader.builder() +.rowCacheSize(100) +.bufferSize(4096) +.open(in); +this.sheets = this.workbook.iterator(); +this.desiredSheets = desiredSheets != null ? 
desiredSheets.stream() +.collect(Collectors.toMap(key -> key, value -> Boolean.FALSE)) : new HashMap<>(); +this.firstRow = firstRow; +this.logger = logger; +this.log = logger != null; +} + +@Override +public boolean hasNext() { +setCurrent(); +boolean next = currentRow != null; +if(!next) { +String sheetsNotFound = getSheetsNotFound(desiredSheets); +if (!sheetsNotFound.isEmpty() && log) { +logger.warn("Excel sheet(s) not found: {}", sheetsNotFound); +} +} +return next; +} + +private void setCurrent() { +currentRow = getNextRow(); +if (currentRow != null) { +return; +} + +currentSheet = null; +currentRows = null; +while (sheets.hasNext()) { +currentSheet = sheets.next(); +if (isIterateOverAllSheets() || hasSheet(currentSheet.getSheetName())) { +currentRows = currentSheet.iterator(); +currentRow = getNextRow(); +if (currentRow != null) { +return; +} +} +} +} + +private Row getNextRow() { +while (currentRows != null && !hasExhaustedRows()) { +Row tempCurrentRow = currentRows.next(); +if (!isSkip(tempCurrentRow)) { +return tempCurrentRow; +} +} +return null; +} + +private boolean hasExhaustedRows() { +boolean exhausted = !currentRows.hasNext(); +if (log && exhausted) { +logger.info("Exhausted all rows from sheet {}", currentSheet.getSheetName()); +} +return exhausted; +} + +private boolean isSkip(Row row) { +return row.getRowNum() < firstRow; +} + +private boolean isIterateOverAllSheets() { +boolean iterateAllSheets = desiredSheets.isEmpty(); +if (iterateAllSheets && log) { +logger.info("Advanced to sheet {}", currentSheet.getSheetName()); +} +return iterateAllSheets; +} + +private boolean hasSheet(String name) { +boolean sheetByName = !desiredSheets.isEmpty() +&& desiredSheets.keySet().stream() +.anyMatch(desiredSheet -> desiredSheet.equalsIgnoreCase(name)); +if (sheetByName) { +desiredSheets.put(name, Boolean.TRUE); +} +return sheetByName; +} + +
[GitHub] [nifi] exceptionfactory commented on a diff in pull request #7194: NIFI-11167 - Add Excel Record Reader
exceptionfactory commented on code in PR #7194: URL: https://github.com/apache/nifi/pull/7194#discussion_r1198205061 ## nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/excel/RowIterator.java: ## @@ -0,0 +1,155 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */package org.apache.nifi.excel; + +import com.github.pjfanning.xlsx.StreamingReader; +import org.apache.nifi.logging.ComponentLog; +import org.apache.poi.ss.usermodel.Row; +import org.apache.poi.ss.usermodel.Sheet; +import org.apache.poi.ss.usermodel.Workbook; + +import java.io.Closeable; +import java.io.IOException; +import java.io.InputStream; +import java.util.HashMap; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import java.util.stream.Collectors; + +public class RowIterator implements Iterator, Closeable { +private final Workbook workbook; +private final Iterator sheets; +private Sheet currentSheet; +private Iterator currentRows; +private final Map desiredSheets; +private final int firstRow; +private ComponentLog logger; +private boolean log; +private Row currentRow; + +public RowIterator(InputStream in, List desiredSheets, int firstRow) { +this(in, desiredSheets, firstRow, null); +} + +public RowIterator(InputStream in, List desiredSheets, int firstRow, ComponentLog logger) { +this.workbook = StreamingReader.builder() +.rowCacheSize(100) +.bufferSize(4096) +.open(in); +this.sheets = this.workbook.iterator(); +this.desiredSheets = desiredSheets != null ? desiredSheets.stream() +.collect(Collectors.toMap(key -> key, value -> Boolean.FALSE)) : new HashMap<>(); +this.firstRow = firstRow; +this.logger = logger; +this.log = logger != null; +} + +@Override +public boolean hasNext() { +setCurrent(); +boolean next = currentRow != null; +if(!next) { +String sheetsNotFound = getSheetsNotFound(desiredSheets); +if (!sheetsNotFound.isEmpty() && log) { +logger.warn("Excel sheet(s) not found: {}", sheetsNotFound); +} Review Comment: The problem with the warning is that it is not actionable in terms of flow handling. Using the `record.count` attribute provides the opportunity to warn if a flow configuration expects to see records in all cases. -- This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (NIFI-11567) GeoEnrichIP processors should auto-reload the database file
[ https://issues.apache.org/jira/browse/NIFI-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Burgess updated NIFI-11567: Status: Patch Available (was: In Progress) > GeoEnrichIP processors should auto-reload the database file > --- > > Key: NIFI-11567 > URL: https://issues.apache.org/jira/browse/NIFI-11567 > Project: Apache NiFi > Issue Type: Improvement > Components: Extensions >Reporter: Matt Burgess >Assignee: Matt Burgess >Priority: Major > Fix For: 1.latest, 2.latest > > > Currently the GeoEnrichIP processors only load the database when the > processor is scheduled. This requires a processor restart if the database > file changes. Instead, the processors should auto-reload the database file > when it detects a change. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [nifi] dan-s1 commented on a diff in pull request #7194: NIFI-11167 - Add Excel Record Reader
dan-s1 commented on code in PR #7194: URL: https://github.com/apache/nifi/pull/7194#discussion_r1198175910 ## nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/excel/RowIterator.java: ## @@ -0,0 +1,155 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */package org.apache.nifi.excel; + +import com.github.pjfanning.xlsx.StreamingReader; +import org.apache.nifi.logging.ComponentLog; +import org.apache.poi.ss.usermodel.Row; +import org.apache.poi.ss.usermodel.Sheet; +import org.apache.poi.ss.usermodel.Workbook; + +import java.io.Closeable; +import java.io.IOException; +import java.io.InputStream; +import java.util.HashMap; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import java.util.stream.Collectors; + +public class RowIterator implements Iterator, Closeable { +private final Workbook workbook; +private final Iterator sheets; +private Sheet currentSheet; +private Iterator currentRows; +private final Map desiredSheets; +private final int firstRow; +private ComponentLog logger; +private boolean log; +private Row currentRow; + +public RowIterator(InputStream in, List desiredSheets, int firstRow) { +this(in, desiredSheets, firstRow, null); +} + +public RowIterator(InputStream in, List desiredSheets, int firstRow, ComponentLog logger) { +this.workbook = StreamingReader.builder() +.rowCacheSize(100) +.bufferSize(4096) +.open(in); +this.sheets = this.workbook.iterator(); +this.desiredSheets = desiredSheets != null ? 
desiredSheets.stream() +.collect(Collectors.toMap(key -> key, value -> Boolean.FALSE)) : new HashMap<>(); +this.firstRow = firstRow; +this.logger = logger; +this.log = logger != null; +} + +@Override +public boolean hasNext() { +setCurrent(); +boolean next = currentRow != null; +if(!next) { +String sheetsNotFound = getSheetsNotFound(desiredSheets); +if (!sheetsNotFound.isEmpty() && log) { +logger.warn("Excel sheet(s) not found: {}", sheetsNotFound); +} +} +return next; +} + +private void setCurrent() { +currentRow = getNextRow(); +if (currentRow != null) { +return; +} + +currentSheet = null; +currentRows = null; +while (sheets.hasNext()) { +currentSheet = sheets.next(); +if (isIterateOverAllSheets() || hasSheet(currentSheet.getSheetName())) { +currentRows = currentSheet.iterator(); +currentRow = getNextRow(); +if (currentRow != null) { +return; +} +} +} +} + +private Row getNextRow() { +while (currentRows != null && !hasExhaustedRows()) { +Row tempCurrentRow = currentRows.next(); +if (!isSkip(tempCurrentRow)) { +return tempCurrentRow; +} +} +return null; +} + +private boolean hasExhaustedRows() { +boolean exhausted = !currentRows.hasNext(); +if (log && exhausted) { +logger.info("Exhausted all rows from sheet {}", currentSheet.getSheetName()); +} +return exhausted; +} + +private boolean isSkip(Row row) { +return row.getRowNum() < firstRow; +} + +private boolean isIterateOverAllSheets() { +boolean iterateAllSheets = desiredSheets.isEmpty(); +if (iterateAllSheets && log) { +logger.info("Advanced to sheet {}", currentSheet.getSheetName()); +} +return iterateAllSheets; +} + +private boolean hasSheet(String name) { +boolean sheetByName = !desiredSheets.isEmpty() +&& desiredSheets.keySet().stream() +.anyMatch(desiredSheet -> desiredSheet.equalsIgnoreCase(name)); +if (sheetByName) { +desiredSheets.put(name, Boolean.TRUE); +} +return sheetByName; +} + +
[GitHub] [nifi] dan-s1 commented on a diff in pull request #7194: NIFI-11167 - Add Excel Record Reader
dan-s1 commented on code in PR #7194: URL: https://github.com/apache/nifi/pull/7194#discussion_r1198172969 ## nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/excel/RowIterator.java: ## @@ -0,0 +1,155 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */package org.apache.nifi.excel; + +import com.github.pjfanning.xlsx.StreamingReader; +import org.apache.nifi.logging.ComponentLog; +import org.apache.poi.ss.usermodel.Row; +import org.apache.poi.ss.usermodel.Sheet; +import org.apache.poi.ss.usermodel.Workbook; + +import java.io.Closeable; +import java.io.IOException; +import java.io.InputStream; +import java.util.HashMap; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import java.util.stream.Collectors; + +public class RowIterator implements Iterator, Closeable { +private final Workbook workbook; +private final Iterator sheets; +private Sheet currentSheet; +private Iterator currentRows; +private final Map desiredSheets; +private final int firstRow; +private ComponentLog logger; +private boolean log; +private Row currentRow; + +public RowIterator(InputStream in, List desiredSheets, int firstRow) { +this(in, desiredSheets, firstRow, null); +} + +public RowIterator(InputStream in, List desiredSheets, int firstRow, ComponentLog logger) { +this.workbook = StreamingReader.builder() +.rowCacheSize(100) +.bufferSize(4096) +.open(in); +this.sheets = this.workbook.iterator(); +this.desiredSheets = desiredSheets != null ? desiredSheets.stream() +.collect(Collectors.toMap(key -> key, value -> Boolean.FALSE)) : new HashMap<>(); +this.firstRow = firstRow; +this.logger = logger; +this.log = logger != null; +} + +@Override +public boolean hasNext() { +setCurrent(); +boolean next = currentRow != null; +if(!next) { +String sheetsNotFound = getSheetsNotFound(desiredSheets); +if (!sheetsNotFound.isEmpty() && log) { +logger.warn("Excel sheet(s) not found: {}", sheetsNotFound); +} Review Comment: I thought it may be useful for operators to know explicitly when a sheet did not exist. I personally think that is clearer than a record count which could be misleading. -- This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
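A note on the sheet-not-found tracking debated in this thread. In the quoted `hasSheet` method, the match is case-insensitive but the found flag is stored under the workbook's spelling of the name, so a configured name that differs only in case may remain marked as not found (the `getSheetsNotFound` helper is not shown in the diff, so this is an inference). The sketch below shows one way to keep the flag on the configured entry. `DesiredSheetTracker` and its methods are hypothetical names, and lower-casing with `Locale.ROOT` is an assumption standing in for `equalsIgnoreCase`.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; not the RowIterator code from PR #7194.
class DesiredSheetTracker {
    // Normalized (lower-cased) name -> name exactly as configured by the user.
    private final Map<String, String> configured = new LinkedHashMap<>();
    // Normalized names that have been seen in the workbook so far.
    private final Set<String> seen = new HashSet<>();

    public DesiredSheetTracker(List<String> desiredSheets) {
        if (desiredSheets != null) {
            for (String name : desiredSheets) {
                configured.put(name.toLowerCase(Locale.ROOT), name);
            }
        }
    }

    // Returns true if the sheet should be read; records the match on the configured entry.
    public boolean matches(String actualSheetName) {
        if (configured.isEmpty()) {
            return true; // no filter configured: read every sheet
        }
        String key = actualSheetName.toLowerCase(Locale.ROOT);
        if (configured.containsKey(key)) {
            seen.add(key);
            return true;
        }
        return false;
    }

    // Configured names that never matched a sheet, in the user's original spelling.
    public List<String> notFound() {
        List<String> missing = new ArrayList<>();
        for (Map.Entry<String, String> entry : configured.entrySet()) {
            if (!seen.contains(entry.getKey())) {
                missing.add(entry.getValue());
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        DesiredSheetTracker tracker = new DesiredSheetTracker(List.of("sheet1", "Totals"));
        tracker.matches("Sheet1");
        System.out.println(tracker.notFound());
    }
}
```

Whether the missing names are then logged as a warning or surfaced some other way is the open question in the review comments above; the sketch only covers the bookkeeping.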
[GitHub] [nifi] gresockj opened a new pull request, #7267: NIFI-11566: Adding updateTimeout argument to parameter commands in CLI
gresockj opened a new pull request, #7267: URL: https://github.com/apache/nifi/pull/7267 # Summary [NIFI-11566](https://issues.apache.org/jira/browse/NIFI-11566) # Tracking Please complete the following tracking steps prior to pull request creation. ### Issue Tracking - [ ] [Apache NiFi Jira](https://issues.apache.org/jira/browse/NIFI) issue created ### Pull Request Tracking - [ ] Pull Request title starts with Apache NiFi Jira issue number, such as `NIFI-0` - [ ] Pull Request commit message starts with Apache NiFi Jira issue number, as such `NIFI-0` ### Pull Request Formatting - [ ] Pull Request based on current revision of the `main` branch - [ ] Pull Request refers to a feature branch with one commit containing changes # Verification Please indicate the verification steps performed prior to pull request creation. ### Build - [ ] Build completed using `mvn clean install -P contrib-check` - [ ] JDK 11 - [ ] JDK 17 ### Licensing - [ ] New dependencies are compatible with the [Apache License 2.0](https://apache.org/licenses/LICENSE-2.0) according to the [License Policy](https://www.apache.org/legal/resolved.html) - [ ] New dependencies are documented in applicable `LICENSE` and `NOTICE` files ### Documentation - [ ] Documentation formatting appears as expected in rendered files -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [nifi] mattyb149 commented on pull request #7263: NIFI-5151: Add UPSERT support for Apache Phoenix
mattyb149 commented on PR #7263: URL: https://github.com/apache/nifi/pull/7263#issuecomment-1553405287 Reviewing...
[jira] [Updated] (NIFI-11567) GeoEnrichIP processors should auto-reload the database file
[ https://issues.apache.org/jira/browse/NIFI-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Burgess updated NIFI-11567: Fix Version/s: 1.latest 2.latest > GeoEnrichIP processors should auto-reload the database file > --- > > Key: NIFI-11567 > URL: https://issues.apache.org/jira/browse/NIFI-11567 > Project: Apache NiFi > Issue Type: Improvement > Components: Extensions >Reporter: Matt Burgess >Assignee: Matt Burgess >Priority: Major > Fix For: 1.latest, 2.latest > > > Currently the GeoEnrichIP processors only load the database when the > processor is scheduled. This requires a processor restart if the database > file changes. Instead, the processors should auto-reload the database file > when it detects a change. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [nifi] mattyb149 opened a new pull request, #7266: NIFI-11567: Auto-reload database file in GeoEnrichIP processors
mattyb149 opened a new pull request, #7266: URL: https://github.com/apache/nifi/pull/7266 # Summary [NIFI-11567](https://issues.apache.org/jira/browse/NIFI-11567) This PR adds a SynchronousFileWatcher and retry logic to reload the specified database file if it has changed. # Tracking Please complete the following tracking steps prior to pull request creation. ### Issue Tracking - [x] [Apache NiFi Jira](https://issues.apache.org/jira/browse/NIFI) issue created ### Pull Request Tracking - [x] Pull Request title starts with Apache NiFi Jira issue number, such as `NIFI-0` - [x] Pull Request commit message starts with Apache NiFi Jira issue number, as such `NIFI-0` ### Pull Request Formatting - [x] Pull Request based on current revision of the `main` branch - [x] Pull Request refers to a feature branch with one commit containing changes # Verification Please indicate the verification steps performed prior to pull request creation. ### Build - [ ] Build completed using `mvn clean install -P contrib-check` - [x] JDK 11 - [ ] JDK 17 ### Licensing - [ ] New dependencies are compatible with the [Apache License 2.0](https://apache.org/licenses/LICENSE-2.0) according to the [License Policy](https://www.apache.org/legal/resolved.html) - [ ] New dependencies are documented in applicable `LICENSE` and `NOTICE` files ### Documentation - [ ] Documentation formatting appears as expected in rendered files -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
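To illustrate the reload-on-change idea described in this PR, here is a minimal sketch that detects a changed file by comparing its last-modified timestamp. It does not use NiFi's SynchronousFileWatcher and makes no claim about that class's API; `ModificationCheckSketch`, `checkAndReset`, and the `.mmdb` temp file are hypothetical, and the retry logic mentioned in the PR is not covered.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a file watcher; plain java.io only.
class ModificationCheckSketch {
    private final File file;
    private long lastSeenModified;

    public ModificationCheckSketch(File file) {
        this.file = file;
        this.lastSeenModified = file.lastModified();
    }

    // Returns true once per observed change, then resets until the next change.
    public synchronized boolean checkAndReset() {
        long current = file.lastModified();
        if (current != lastSeenModified) {
            lastSeenModified = current;
            return true;
        }
        return false;
    }

    // Small self-contained demonstration using a temporary file.
    public static boolean[] demo() {
        try {
            File database = File.createTempFile("geo", ".mmdb");
            database.deleteOnExit();
            database.setLastModified(1_000_000_000_000L);
            ModificationCheckSketch check = new ModificationCheckSketch(database);
            boolean beforeChange = check.checkAndReset();   // nothing changed yet
            database.setLastModified(1_000_000_060_000L);   // simulate a replaced file
            boolean afterChange = check.checkAndReset();    // change is reported
            boolean secondLook = check.checkAndReset();     // and reported only once
            return new boolean[] {beforeChange, afterChange, secondLook};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] results = demo();
        System.out.println(results[0] + " " + results[1] + " " + results[2]);
    }
}
```

A processor could call such a check at the start of each trigger and reopen the database reader when it returns true, which avoids the restart described in NIFI-11567.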
[jira] [Updated] (NIFI-11557) Eliminate use of Files.walkFileTree for any performance-critical parts of application
[ https://issues.apache.org/jira/browse/NIFI-11557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne updated NIFI-11557: -- Status: Patch Available (was: Open) > Eliminate use of Files.walkFileTree for any performance-critical parts of > application > - > > Key: NIFI-11557 > URL: https://issues.apache.org/jira/browse/NIFI-11557 > Project: Apache NiFi > Issue Type: Improvement > Components: Core Framework, Extensions >Reporter: Mark Payne >Assignee: Mark Payne >Priority: Major > Labels: content-repo, content-repository, performance, slowness, > startup > Fix For: 1.latest, 2.latest > > > The FileSystemRepository (content repo implementation) as well as ListFile > both make use of the {{Files.walkFileTree}} method. Recently, I worked with a > user who had horribly long startup times. Thread dumps show that the time was > almost entirely in the FileSystemRepository's {{initializeRepository}} method > as it is walking the file tree in order to determine which archive files can > be cleaned up next. This is done during startup and again periodically in > background threads. > I made a small modification locally to instead use the standard synchronous > IO methods ( {{File.listFiles}} method. I used GenerateFlowFile to generate > 1-byte FlowFiles and set {{nifi.content.claim.max.appendable.size=1 B}} in > nifi.properties in order to generate a huge number of files - about 1.2 > million files in the content repository and restarted a few times. > Additionally, added some log lines to show how long this part of the startup > process took. > With the existing code, startup took 210 seconds (3.5 mins). With the new > implementation, it took 6.7 seconds. The appears to be due to the fact that > when using NIO.2 for every file, it does an individual disk access to obtain > File attributes, while when using the {{File.listFiles}} method the File > objects that are returned already have the necessary attributes. 
As a result, > the NIO.2 approach makes millions of disk accesses that are unnecessary. As > the number of files in the repository grows, the discrepancy also grows. > We need to eliminate any use of {{Files.walkFileTree}} for any > performance-critical parts of the codebase. -- This message was sent by Atlassian Jira (v8.20.10#820010)
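The two ways of counting files that the issue compares can be sketched as follows. This is a minimal illustration, not the FileSystemRepository code: the class name `ArchiveCountSketch`, both method names, and the assumption that an archive directory is flat (contains only files) are hypothetical. Only `Files.walkFileTree` and `File.listFiles` come from the issue text.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical sketch; names are not taken from the NiFi codebase.
final class ArchiveCountSketch {

    // NIO.2 approach: the visitor is handed BasicFileAttributes for every file,
    // which is the per-file attribute lookup the issue describes.
    static long countWithWalkFileTree(Path root) throws IOException {
        final long[] count = {0};
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                count[0]++;
                return FileVisitResult.CONTINUE;
            }
        });
        return count[0];
    }

    // java.io approach: one directory listing, no per-file attribute request.
    // Assumes the directory is flat, so every entry counts as one file.
    static long countWithListFiles(File dir) {
        File[] children = dir.listFiles();
        // listFiles returns null if dir is not a directory or an I/O error occurs.
        return children == null ? 0 : children.length;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("archive-sketch");
        for (int i = 0; i < 5; i++) {
            Files.createFile(dir.resolve("claim-" + i));
        }
        System.out.println("walkFileTree count: " + countWithWalkFileTree(dir));
        System.out.println("listFiles count:    " + countWithListFiles(dir.toFile()));
    }
}
```

Both methods return the same count for a flat directory; the timing difference reported in the issue would only show up with a very large number of files and is not reproduced here.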
[jira] [Updated] (NIFI-11557) Eliminate use of Files.walkFileTree for any performance-critical parts of application
[ https://issues.apache.org/jira/browse/NIFI-11557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Payne updated NIFI-11557: -- Labels: content-repo content-repository performance slowness startup (was: ) > Eliminate use of Files.walkFileTree for any performance-critical parts of > application > - > > Key: NIFI-11557 > URL: https://issues.apache.org/jira/browse/NIFI-11557 > Project: Apache NiFi > Issue Type: Improvement > Components: Core Framework, Extensions >Reporter: Mark Payne >Assignee: Mark Payne >Priority: Major > Labels: content-repo, content-repository, performance, slowness, > startup > Fix For: 1.latest, 2.latest > > > The FileSystemRepository (content repo implementation) as well as ListFile > both make use of the {{Files.walkFileTree}} method. Recently, I worked with a > user who had horribly long startup times. Thread dumps show that the time was > almost entirely in the FileSystemRepository's {{initializeRepository}} method > as it is walking the file tree in order to determine which archive files can > be cleaned up next. This is done during startup and again periodically in > background threads. > I made a small modification locally to instead use the standard synchronous > IO methods ({{File.listFiles}}). I used GenerateFlowFile to generate > 1-byte FlowFiles and set {{nifi.content.claim.max.appendable.size=1 B}} in > nifi.properties in order to generate a huge number of files - about 1.2 > million files in the content repository and restarted a few times. > Additionally, I added some log lines to show how long this part of the startup > process took. > With the existing code, startup took 210 seconds (3.5 mins). With the new > implementation, it took 6.7 seconds. 
This appears to be due to the fact that > when using NIO.2 for every file, it does an individual disk access to obtain > File attributes, while when using the {{File.listFiles}} method the File > objects that are returned already have the necessary attributes. As a result, > the NIO.2 approach makes millions of disk accesses that are unnecessary. As > the number of files in the repository grows, the discrepancy also grows. > We need to eliminate any use of {{Files.walkFileTree}} for any > performance-critical parts of the codebase. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [nifi] markap14 opened a new pull request, #7265: NIFI-11557: Avoid using the expensive and unnecessary Files.walkFileT…
markap14 opened a new pull request, #7265: URL: https://github.com/apache/nifi/pull/7265 …ree on startup and initialization of Content Repository. Also performed some code cleanup: IntelliJ flagged many warnings in the class, mostly around methods that are no longer used and potential NullPointerExceptions, so those were cleaned up. Additionally, removed the nifi property for max flowfiles per claim - this property was never implemented. It was referenced, but the way in which it was used curiously had nothing to do with what the property was intended for or how it was documented. Instead, it was used to limit the max number of claims that could remain writable. As a result, it was removed. # Summary [NIFI-0](https://issues.apache.org/jira/browse/NIFI-0) # Tracking Please complete the following tracking steps prior to pull request creation. ### Issue Tracking - [ ] [Apache NiFi Jira](https://issues.apache.org/jira/browse/NIFI) issue created ### Pull Request Tracking - [ ] Pull Request title starts with Apache NiFi Jira issue number, such as `NIFI-0` - [ ] Pull Request commit message starts with Apache NiFi Jira issue number, as such `NIFI-0` ### Pull Request Formatting - [ ] Pull Request based on current revision of the `main` branch - [ ] Pull Request refers to a feature branch with one commit containing changes # Verification Please indicate the verification steps performed prior to pull request creation. ### Build - [ ] Build completed using `mvn clean install -P contrib-check` - [ ] JDK 11 - [ ] JDK 17 ### Licensing - [ ] New dependencies are compatible with the [Apache License 2.0](https://apache.org/licenses/LICENSE-2.0) according to the [License Policy](https://www.apache.org/legal/resolved.html) - [ ] New dependencies are documented in applicable `LICENSE` and `NOTICE` files ### Documentation - [ ] Documentation formatting appears as expected in rendered files -- This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (NIFI-4298) NiFi allows users to remove critical Attributes that are needed by processors.
[ https://issues.apache.org/jira/browse/NIFI-4298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723953#comment-17723953 ] Michael W Moser commented on NIFI-4298: --- NIFI-8971 partially resolves this, by fixing the specific MergeContent problem in the description. > NiFi allows users to remove critical Attributes that are needed by processors. > -- > > Key: NIFI-4298 > URL: https://issues.apache.org/jira/browse/NIFI-4298 > Project: Apache NiFi > Issue Type: Bug > Components: Core Framework >Affects Versions: 1.2.0 >Reporter: Matthew Clarke >Priority: Major > > The UpdateAttribute processor provides users with the ability to provide a > "Delete Attributes Expression". > While FlowFile properties entryDate, lineageDate, fileSize, and uuid are > protected and cannot be removed, FlowFile attributes path and filename can > be removed. > Removal of these attributes has adverse effects on many other processors. > Any processor that will write a FlowFile out requires the filename attribute. > In addition, I have found that the MergeContent processor (configured to use > FlowFileStreams as the merge strategy) also, for whatever reason, requires > that the path attribute exists on the FlowFile. > If this attribute is missing, an NPE is thrown and the session is rolled back. > 2017-08-14 19:27:00,156 ERROR [Timer-Driven Process Thread-7] > o.a.n.processors.standard.MergeContent > MergeContent[id=d7213e1f-0c03-1715-93cc-b1be9228ec36] Failed to process > bundle of 1 files due to java.lang.NullPointerException; rolling back > sessions: {} > A stack trace is not produced even if DEBUG is enabled for this processor. > NiFi needs to prevent users from being able to remove attributes which may be > "required" by other processors. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (NIFI-11557) Eliminate use of Files.walkFileTree for any performance-critical parts of application
[ https://issues.apache.org/jira/browse/NIFI-11557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723950#comment-17723950 ] Mark Payne commented on NIFI-11557: --- Looking further into this, I found that the logic that we have currently that scans through the content repo serves two purposes: 1. To count how many files are archived 2. To determine the timestamp of the oldest archived file. The timestamp of the oldest archived file was to be used for performance gains, in order to determine that there are no files that need to be cleaned up due to time constraints and as a result don't bother scanning in the background. Interestingly, this code was buggy - while it checked the last modified time of each file, it then compared it to the 'oldestTimestamp' but 'oldestTimestamp' was initialized to 0, which means that it would always remain 0. As a result, this code was very expensive and unneeded. We only really need to count the number of files archived. This can be achieved MUCH more efficiently by simply performing a {{File.listFiles}} call on each archive directory. This will drastically improve startup performance in cases where there are millions of files archived. > Eliminate use of Files.walkFileTree for any performance-critical parts of > application > - > > Key: NIFI-11557 > URL: https://issues.apache.org/jira/browse/NIFI-11557 > Project: Apache NiFi > Issue Type: Improvement > Components: Core Framework, Extensions >Reporter: Mark Payne >Assignee: Mark Payne >Priority: Major > Fix For: 1.latest, 2.latest > > > The FileSystemRepository (content repo implementation) as well as ListFile > both make use of the {{Files.walkFileTree}} method. Recently, I worked with a > user who had horribly long startup times. Thread dumps show that the time was > almost entirely in the FileSystemRepository's {{initializeRepository}} method > as it is walking the file tree in order to determine which archive files can > be cleaned up next. 
This is done during startup and again periodically in > background threads. > I made a small modification locally to instead use the standard synchronous > IO methods ({{File.listFiles}}). I used GenerateFlowFile to generate > 1-byte FlowFiles and set {{nifi.content.claim.max.appendable.size=1 B}} in > nifi.properties in order to generate a huge number of files - about 1.2 > million files in the content repository and restarted a few times. > Additionally, I added some log lines to show how long this part of the startup > process took. > With the existing code, startup took 210 seconds (3.5 mins). With the new > implementation, it took 6.7 seconds. This appears to be due to the fact that > when using NIO.2 for every file, it does an individual disk access to obtain > File attributes, while when using the {{File.listFiles}} method the File > objects that are returned already have the necessary attributes. As a result, > the NIO.2 approach makes millions of disk accesses that are unnecessary. As > the number of files in the repository grows, the discrepancy also grows. > We need to eliminate any use of {{Files.walkFileTree}} for any > performance-critical parts of the codebase. -- This message was sent by Atlassian Jira (v8.20.10#820010)
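The 'oldestTimestamp' bug described in the comment above can be reproduced in isolation. The following is a hypothetical reconstruction of the pattern, not the actual repository code: the class and method names are invented, and the "fixed" variant is shown only to make the bug visible. The comment's own conclusion is that the timestamp scan is unneeded and that only the count of archived files matters.

```java
// Hypothetical reconstruction of the pattern described in the comment.
final class OldestTimestampSketch {

    // Buggy pattern: a running minimum seeded with 0 can never be replaced
    // by a real (positive) last-modified time, so the result is always 0.
    static long oldestBuggy(long[] lastModifiedTimes) {
        long oldestTimestamp = 0L;
        for (long lastModified : lastModifiedTimes) {
            if (lastModified < oldestTimestamp) {
                oldestTimestamp = lastModified;
            }
        }
        return oldestTimestamp;
    }

    // For contrast only: seeding with Long.MAX_VALUE lets the minimum converge.
    // Returns Long.MAX_VALUE when the input is empty.
    static long oldestSeededCorrectly(long[] lastModifiedTimes) {
        long oldestTimestamp = Long.MAX_VALUE;
        for (long lastModified : lastModifiedTimes) {
            oldestTimestamp = Math.min(oldestTimestamp, lastModified);
        }
        return oldestTimestamp;
    }

    public static void main(String[] args) {
        long[] times = {1_684_000_000_000L, 1_683_000_000_000L};
        System.out.println("buggy:   " + oldestBuggy(times));
        System.out.println("correct: " + oldestSeededCorrectly(times));
    }
}
```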
[jira] [Updated] (NIFI-11568) Remove Apache DS Test Dependency
[ https://issues.apache.org/jira/browse/NIFI-11568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Handermann updated NIFI-11568: Status: Patch Available (was: Open) > Remove Apache DS Test Dependency > > > Key: NIFI-11568 > URL: https://issues.apache.org/jira/browse/NIFI-11568 > Project: Apache NiFi > Issue Type: Improvement > Components: Extensions, NiFi Registry >Reporter: David Handermann >Assignee: David Handermann >Priority: Minor > Fix For: 1.latest, 2.latest > > > With recent refactoring of LDAP test cases, the Apache DS dependency is no > longer used and should be removed. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [nifi] exceptionfactory commented on pull request #7257: NIFI-11555-Upgrade-apacheds-all-to-2.0.0.AM26
exceptionfactory commented on PR #7257: URL: https://github.com/apache/nifi/pull/7257#issuecomment-1553242827 Thanks for your work on this @jbalchan. It appears that more recent versions of the Apache DS All JAR introduced issues with manifest signatures. On further investigation, the library is no longer used, so I created a new Jira issue and a separate pull request to remove the references. This was helpful for highlighting the opportunity to remove the dependency. Closing in favor of #7264. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Resolved] (NIFI-11555) Upgrade apacheds-all to 2.0.0.AM26
[ https://issues.apache.org/jira/browse/NIFI-11555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Handermann resolved NIFI-11555. - Assignee: David Handermann Resolution: Workaround Recent LDAP test refactoring removed the need for the Apache DS dependency, so this issue is superseded by NIFI-11568. > Upgrade apacheds-all to 2.0.0.AM26 > -- > > Key: NIFI-11555 > URL: https://issues.apache.org/jira/browse/NIFI-11555 > Project: Apache NiFi > Issue Type: Improvement >Reporter: Mike R >Assignee: David Handermann >Priority: Major > > Upgrade apacheds-all to 2.0.0.AM26 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (NIFI-11555) Upgrade apacheds-all to 2.0.0.AM26
[ https://issues.apache.org/jira/browse/NIFI-11555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Handermann updated NIFI-11555: Affects Version/s: (was: 1.21.0) > Upgrade apacheds-all to 2.0.0.AM26 > -- > > Key: NIFI-11555 > URL: https://issues.apache.org/jira/browse/NIFI-11555 > Project: Apache NiFi > Issue Type: Improvement >Reporter: Mike R >Priority: Major > > Upgrade apacheds-all to 2.0.0.AM26 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [nifi] exceptionfactory opened a new pull request, #7264: NIFI-11568 Remove Apache DS Test Dependency
exceptionfactory opened a new pull request, #7264: URL: https://github.com/apache/nifi/pull/7264 # Summary [NIFI-11568](https://issues.apache.org/jira/browse/NIFI-11568) Removes the Apache Directory Server test dependency from Registry and NiFi LDAP Provider modules. Recent test refactoring using the Unbound library eliminated the need for Apache DS. # Tracking Please complete the following tracking steps prior to pull request creation. ### Issue Tracking - [X] [Apache NiFi Jira](https://issues.apache.org/jira/browse/NIFI) issue created ### Pull Request Tracking - [X] Pull Request title starts with Apache NiFi Jira issue number, such as `NIFI-0` - [X] Pull Request commit message starts with Apache NiFi Jira issue number, as such `NIFI-0` ### Pull Request Formatting - [X] Pull Request based on current revision of the `main` branch - [X] Pull Request refers to a feature branch with one commit containing changes # Verification Please indicate the verification steps performed prior to pull request creation. ### Build - [ ] Build completed using `mvn clean install -P contrib-check` - [ ] JDK 11 - [ ] JDK 17 ### Licensing - [ ] New dependencies are compatible with the [Apache License 2.0](https://apache.org/licenses/LICENSE-2.0) according to the [License Policy](https://www.apache.org/legal/resolved.html) - [ ] New dependencies are documented in applicable `LICENSE` and `NOTICE` files ### Documentation - [ ] Documentation formatting appears as expected in rendered files -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (NIFI-11568) Remove Apache DS Test Dependency
David Handermann created NIFI-11568: --- Summary: Remove Apache DS Test Dependency Key: NIFI-11568 URL: https://issues.apache.org/jira/browse/NIFI-11568 Project: Apache NiFi Issue Type: Improvement Components: Extensions, NiFi Registry Reporter: David Handermann Assignee: David Handermann Fix For: 1.latest, 2.latest With recent refactoring of LDAP test cases, the Apache DS dependency is no longer used and should be removed. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [nifi] exceptionfactory commented on pull request #7231: [NIFI-2964] Added ability for AttributeToJSON to handle nested JSON when either outputting to a flow file or an attribute.
exceptionfactory commented on PR #7231: URL: https://github.com/apache/nifi/pull/7231#issuecomment-1553212554 > @exceptionfactory After much deliberation we realized you were right that NIFI has other processors to handle the JSON transformations needed. We would still like though the ability to handle attributes as nested objects as you suggested: > > > If we go forward with this change, it seems better to have a simple property like JSON Handling Strategy with values of Escaped String or Nested Object. That way, anything detected as JSON would be treated the same way, without the potential complexity of pattern-matching on FlowFile attribute names. > > This would clearly highlight the fact that the processor can handle nested objects. Is that still okay? Thanks for evaluating the options. Yes, I think the Handling Strategy approach with the two options provides clarity on both the implementation side and the usability side. If you can make those adjustments to the PR, that would be great. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
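The two-valued "JSON Handling Strategy" discussed in this exchange can be pictured with a small sketch. Everything here is an assumption made for illustration: the enum and method names, the hand-rolled serializer, and the crude `looksLikeJson` check, which stands in for real JSON parsing. It is not the AttributesToJSON implementation, which would use a JSON library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the two strategies named in the thread.
final class JsonHandlingSketch {

    enum JsonHandlingStrategy { ESCAPED_STRING, NESTED_OBJECT }

    static String toJson(Map<String, String> attributes, JsonHandlingStrategy strategy) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> entry : attributes.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            first = false;
            sb.append(quote(entry.getKey())).append(':');
            String value = entry.getValue();
            if (strategy == JsonHandlingStrategy.NESTED_OBJECT && looksLikeJson(value)) {
                sb.append(value.trim());   // embed the value as a nested object or array
            } else {
                sb.append(quote(value));   // emit the value as an escaped JSON string
            }
        }
        return sb.append('}').toString();
    }

    // Crude stand-in for JSON detection; a real implementation would parse the value.
    static boolean looksLikeJson(String value) {
        String t = value.trim();
        return (t.startsWith("{") && t.endsWith("}")) || (t.startsWith("[") && t.endsWith("]"));
    }

    // Escapes only the most common characters; not a complete JSON string encoder.
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"': sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default: sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    public static void main(String[] args) {
        Map<String, String> attributes = new LinkedHashMap<>();
        attributes.put("filename", "data.csv");
        attributes.put("meta", "{\"source\":\"sensor\"}");
        System.out.println(toJson(attributes, JsonHandlingStrategy.ESCAPED_STRING));
        System.out.println(toJson(attributes, JsonHandlingStrategy.NESTED_OBJECT));
    }
}
```

With a single property choosing the strategy, every attribute that is detected as JSON is treated the same way, which is the point made in the quoted suggestion about avoiding pattern-matching on attribute names.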
[GitHub] [nifi] mosermw commented on pull request #6560: NIFI-10676 USE_SPECIFIED_OR_COMPATIBLE_OR_GHOST on flow load from bytes
mosermw commented on PR #6560: URL: https://github.com/apache/nifi/pull/6560#issuecomment-1553196696 @genehynson @mh013370 If there is still interest in this PR, then the Admin Guide should be updated to describe this new property and its options, while stressing that changing the default is an experimental and risky feature. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (NIFI-5151) Patch Nifi with Upsert functions for PutDatabaseRecord processor
[ https://issues.apache.org/jira/browse/NIFI-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723931#comment-17723931 ] Lehel Boér commented on NIFI-5151: -- new PR created: https://github.com/apache/nifi/pull/7263 > Patch Nifi with Upsert functions for PutDatabaseRecord processor > > > Key: NIFI-5151 > URL: https://issues.apache.org/jira/browse/NIFI-5151 > Project: Apache NiFi > Issue Type: Improvement > Components: Extensions >Affects Versions: 1.7.0 >Reporter: Karl Amundsson >Assignee: Lehel Boér >Priority: Major > Labels: Processor > Attachments: > 0001-NIFI-5151-Adding-support-for-UPSERT-in-PutDatabaseRe.patch, > 0001-NIFI-5151-Using-DatabaseAdapter-to-generate-INSERT-S.patch > > Original Estimate: 0h > Remaining Estimate: 0h > > Since Phoenix doesn't support the SQL statement INSERT you have to use a > process like: ConvertAttributesToJSON->ConvertJSONToSQL in Insert > mode->ReplaceText to replace "INSERT" with "UPSERT" -> PutSQL (See: > https://community.hortonworks.com/questions/40561/nifi-phoenix-processor.html) > With this patch you can choose to use UPSERT directly. -- This message was sent by Atlassian Jira (v8.20.10#820010)
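Generating the UPSERT statement directly, instead of rewriting "INSERT" with ReplaceText, might look roughly like the sketch below. The class and method names are hypothetical and this is not the code from the attached patches or from PR #7263. It only shows the shape of a parameterized Phoenix `UPSERT INTO ... VALUES` statement, and it does no identifier quoting or validation.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; not the PutDatabaseRecord or DatabaseAdapter implementation.
final class UpsertStatementSketch {

    // Builds a parameterized Phoenix-style UPSERT with one '?' per column.
    static String buildUpsert(String table, List<String> columns) {
        if (columns.isEmpty()) {
            throw new IllegalArgumentException("at least one column is required");
        }
        String columnList = String.join(", ", columns);
        String placeholders = String.join(", ", Collections.nCopies(columns.size(), "?"));
        return "UPSERT INTO " + table + " (" + columnList + ") VALUES (" + placeholders + ")";
    }

    public static void main(String[] args) {
        System.out.println(buildUpsert("EMPLOYEE", Arrays.asList("ID", "NAME", "AGE")));
    }
}
```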
[GitHub] [nifi] Lehel44 opened a new pull request, #7263: NIFI-5151: Add UPSERT support for Apache Phoenix
Lehel44 opened a new pull request, #7263: URL: https://github.com/apache/nifi/pull/7263 # Summary [NIFI-5151](https://issues.apache.org/jira/browse/NIFI-5151) # Tracking Please complete the following tracking steps prior to pull request creation. ### Issue Tracking - [ ] [Apache NiFi Jira](https://issues.apache.org/jira/browse/NIFI) issue created ### Pull Request Tracking - [ ] Pull Request title starts with Apache NiFi Jira issue number, such as `NIFI-0` - [ ] Pull Request commit message starts with Apache NiFi Jira issue number, as such `NIFI-0` ### Pull Request Formatting - [ ] Pull Request based on current revision of the `main` branch - [ ] Pull Request refers to a feature branch with one commit containing changes # Verification Please indicate the verification steps performed prior to pull request creation. ### Build - [ ] Build completed using `mvn clean install -P contrib-check` - [ ] JDK 11 - [ ] JDK 17 ### Licensing - [ ] New dependencies are compatible with the [Apache License 2.0](https://apache.org/licenses/LICENSE-2.0) according to the [License Policy](https://www.apache.org/legal/resolved.html) - [ ] New dependencies are documented in applicable `LICENSE` and `NOTICE` files ### Documentation - [ ] Documentation formatting appears as expected in rendered files -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Assigned] (NIFI-5151) Patch Nifi with Upsert functions for PutDatabaseRecord processor
[ https://issues.apache.org/jira/browse/NIFI-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lehel Boér reassigned NIFI-5151: Assignee: Lehel Boér > Patch Nifi with Upsert functions for PutDatabaseRecord processor > > > Key: NIFI-5151 > URL: https://issues.apache.org/jira/browse/NIFI-5151 > Project: Apache NiFi > Issue Type: Improvement > Components: Extensions >Affects Versions: 1.7.0 >Reporter: Karl Amundsson >Assignee: Lehel Boér >Priority: Major > Labels: Processor > Attachments: > 0001-NIFI-5151-Adding-support-for-UPSERT-in-PutDatabaseRe.patch, > 0001-NIFI-5151-Using-DatabaseAdapter-to-generate-INSERT-S.patch > > Original Estimate: 0h > Remaining Estimate: 0h > > Since Phoenix doesn't support the SQL statement INSERT you have to use a > process like: ConvertAttributesToJSON->ConvertJSONToSQL in Insert > mode->ReplaceText to replace "INSERT" with "UPSERT" -> PutSQL (See: > https://community.hortonworks.com/questions/40561/nifi-phoenix-processor.html) > With this patch you can choose to use UPSERT directly. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [nifi] dan-s1 commented on pull request #7231: [NIFI-2964] Added ability for AttributeToJSON to handle nested JSON when either outputting to a flow file or an attribute.
dan-s1 commented on PR #7231: URL: https://github.com/apache/nifi/pull/7231#issuecomment-1553177346 @exceptionfactory After much deliberation we realized you were right that NIFI has other processors to handle the JSON transformations needed. We would still like though the ability to handle attributes as nested objects as you suggested: > If we go forward with this change, it seems better to have a simple property like JSON Handling Strategy with values of Escaped String or Nested Object. That way, anything detected as JSON would be treated the same way, without the potential complexity of pattern-matching on FlowFile attribute names. This would clearly highlight the fact that the processor can handle nested objects. Is that still okay? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Assigned] (NIFI-11567) GeoEnrichIP processors should auto-reload the database file
[ https://issues.apache.org/jira/browse/NIFI-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Burgess reassigned NIFI-11567: --- Assignee: Matt Burgess > GeoEnrichIP processors should auto-reload the database file > --- > > Key: NIFI-11567 > URL: https://issues.apache.org/jira/browse/NIFI-11567 > Project: Apache NiFi > Issue Type: Improvement > Components: Extensions >Reporter: Matt Burgess >Assignee: Matt Burgess >Priority: Major > > Currently the GeoEnrichIP processors only load the database when the > processor is scheduled. This requires a processor restart if the database > file changes. Instead, the processors should auto-reload the database file > when they detect a change. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (NIFI-11567) GeoEnrichIP processors should auto-reload the database file
Matt Burgess created NIFI-11567: --- Summary: GeoEnrichIP processors should auto-reload the database file Key: NIFI-11567 URL: https://issues.apache.org/jira/browse/NIFI-11567 Project: Apache NiFi Issue Type: Improvement Components: Extensions Reporter: Matt Burgess Currently the GeoEnrichIP processors only load the database when the processor is scheduled. This requires a processor restart if the database file changes. Instead, the processors should auto-reload the database file when they detect a change. -- This message was sent by Atlassian Jira (v8.20.10#820010)
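One simple way to detect a changed database file is to compare its last-modified time before each use. The sketch below assumes that approach. The issue does not say how the change should be detected, so the class `ReloadingDatabaseHolder`, its loader callback, and the use of `File.lastModified` are all hypothetical and are not the GeoEnrichIP implementation.

```java
import java.io.File;
import java.io.IOException;
import java.util.function.Function;

// Hypothetical holder that reloads a resource when the backing file's
// last-modified time differs from the one seen at the previous load.
final class ReloadingDatabaseHolder<T> {

    private final File file;
    private final Function<File, T> loader;
    private long loadedLastModified;
    private T current;
    private boolean loaded;

    ReloadingDatabaseHolder(File file, Function<File, T> loader) {
        this.file = file;
        this.loader = loader;
    }

    // Returns the loaded resource, reloading first if the file has changed.
    synchronized T get() {
        long lastModified = file.lastModified();
        if (!loaded || lastModified != loadedLastModified) {
            current = loader.apply(file);
            loadedLastModified = lastModified;
            loaded = true;
        }
        return current;
    }

    public static void main(String[] args) throws IOException {
        File db = File.createTempFile("geo-sketch", ".mmdb");
        ReloadingDatabaseHolder<String> holder =
                new ReloadingDatabaseHolder<>(db, f -> "loaded at mtime " + f.lastModified());
        System.out.println(holder.get());
    }
}
```

Checking the timestamp on every call keeps the processor running while the file is swapped, at the cost of one metadata lookup per call. A real processor might instead rate-limit the check or use a file watcher.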
[jira] [Created] (NIFI-11566) CLI set-param command can timeout in some cases
Joe Gresock created NIFI-11566: -- Summary: CLI set-param command can timeout in some cases Key: NIFI-11566 URL: https://issues.apache.org/jira/browse/NIFI-11566 Project: Apache NiFi Issue Type: Bug Components: Tools and Build Affects Versions: 1.21.0 Reporter: Joe Gresock Assignee: Joe Gresock In cases where controller services take a long time to disable/enable, the NiFi CLI command 'set-param' can time out, causing problems for client code. This command currently has a hard-coded 60-second timeout, which needs to be configurable. An update timeout should be added as an argument to the command. -- This message was sent by Atlassian Jira (v8.20.10#820010)
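Replacing a hard-coded wait with a caller-supplied timeout could be sketched as a polling loop like the one below. The names `UpdateTimeoutSketch` and `waitForCompletion` are hypothetical, as is the polling design itself. The issue only says that an update timeout should become a command argument and does not name that argument, so none is shown here.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical sketch of polling for completion with a configurable timeout.
final class UpdateTimeoutSketch {

    // Polls isComplete until it returns true or the timeout elapses.
    // Returns true on completion, false on timeout.
    static boolean waitForCompletion(BooleanSupplier isComplete, Duration timeout, Duration pollInterval)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (isComplete.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            Thread.sleep(pollInterval.toMillis());
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // A caller would pass the timeout taken from a command argument,
        // falling back to the previous 60-second behavior when none is given.
        Duration timeout = Duration.ofSeconds(args.length > 0 ? Long.parseLong(args[0]) : 60);
        long start = System.currentTimeMillis();
        boolean done = waitForCompletion(
                () -> System.currentTimeMillis() - start > 200, timeout, Duration.ofMillis(50));
        System.out.println(done ? "update completed" : "timed out");
    }
}
```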
[jira] [Resolved] (NIFI-11562) ValidateRecord doesn't work correctly; routes all to 'valid'
[ https://issues.apache.org/jira/browse/NIFI-11562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] crissaegrim resolved NIFI-11562. Resolution: Not A Bug Oops, was using 'reader's schema' and not writer's schema. Closing. > ValidateRecord doesn't work correctly; routes all to 'valid' > > > Key: NIFI-11562 > URL: https://issues.apache.org/jira/browse/NIFI-11562 > Project: Apache NiFi > Issue Type: Bug >Affects Versions: 1.20.0 > Environment: linux, docker >Reporter: crissaegrim >Priority: Major > Attachments: image-2023-05-17-17-32-04-404.png > > > Replicate this with. Expecting `age` field (int in schema) to cause failure. > But it's not failing. > # Generate Flow File > id,name,age,employer > 1,joe,30,mlp > 2,bob,thirty,google > 3,linda,32.123,yahoo > 4,anne,31, > # ValidateRecord > # Connect `invalid` and `valid` to downstream to visualize > Validate against this avro schema > > protocol Test { > record TestEmployer { > int id; > string name; > int age; > string? employer = null; > } > } > { > "type" : "record", > "name" : "TestEmployer", > "fields" : [ { > "name" : "id", > "type" : "int" > }, { > "name" : "name", > "type" : "string" > }, { > "name" : "age", > "type" : "int" > }, { > "name" : "employer", > "type" : [ "null", "string" ], > "default" : null > } ] > } > See that all records are routed to `valid`. Why? -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (NIFI-11565) Upgrade Jackson-databind to 2.15.1
[ https://issues.apache.org/jira/browse/NIFI-11565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike R updated NIFI-11565: -- Affects Version/s: 1.21.0 > Upgrade Jackson-databind to 2.15.1 > -- > > Key: NIFI-11565 > URL: https://issues.apache.org/jira/browse/NIFI-11565 > Project: Apache NiFi > Issue Type: Improvement >Affects Versions: 1.21.0 >Reporter: Mike R >Assignee: Mike R >Priority: Minor > > Upgrade Jackson-databind to 2.15.1 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (NIFI-11565) Upgrade Jackson-databind to 2.15.1
[ https://issues.apache.org/jira/browse/NIFI-11565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike R updated NIFI-11565: -- Priority: Minor (was: Major) > Upgrade Jackson-databind to 2.15.1 > -- > > Key: NIFI-11565 > URL: https://issues.apache.org/jira/browse/NIFI-11565 > Project: Apache NiFi > Issue Type: Improvement >Reporter: Mike R >Assignee: Mike R >Priority: Minor > > Upgrade Jackson-databind to 2.15.1 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (NIFI-11565) Upgrade Jackson-databind to 2.15.1
[ https://issues.apache.org/jira/browse/NIFI-11565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike R reassigned NIFI-11565: - Assignee: Mike R > Upgrade Jackson-databind to 2.15.1 > -- > > Key: NIFI-11565 > URL: https://issues.apache.org/jira/browse/NIFI-11565 > Project: Apache NiFi > Issue Type: Improvement >Reporter: Mike R >Assignee: Mike R >Priority: Major > > Upgrade Jackson-databind to 2.15.1 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (NIFI-11565) Upgrade Jackson-databind to 2.15.1
Mike R created NIFI-11565: - Summary: Upgrade Jackson-databind to 2.15.1 Key: NIFI-11565 URL: https://issues.apache.org/jira/browse/NIFI-11565 Project: Apache NiFi Issue Type: Improvement Reporter: Mike R Upgrade Jackson-databind to 2.15.1 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [nifi] mr1716 opened a new pull request, #7262: NIFI-11564 Update msal4j to 1.13.8
mr1716 opened a new pull request, #7262: URL: https://github.com/apache/nifi/pull/7262 # Summary [NIFI-11564](https://issues.apache.org/jira/browse/NIFI-11564) # Tracking Please complete the following tracking steps prior to pull request creation. ### Issue Tracking - [X] [Apache NiFi Jira](https://issues.apache.org/jira/browse/NIFI-11564) issue created ### Pull Request Tracking - [X] Pull Request title starts with Apache NiFi Jira issue number, such as `NIFI-0` - [X] Pull Request commit message starts with Apache NiFi Jira issue number, as such `NIFI-0` ### Pull Request Formatting - [X] Pull Request based on current revision of the `main` branch - [X] Pull Request refers to a feature branch with one commit containing changes # Verification Please indicate the verification steps performed prior to pull request creation. ### Build - [ ] Build completed using `mvn clean install -P contrib-check` - [ ] JDK 11 - [ ] JDK 17 ### Licensing - [ ] New dependencies are compatible with the [Apache License 2.0](https://apache.org/licenses/LICENSE-2.0) according to the [License Policy](https://www.apache.org/legal/resolved.html) - [ ] New dependencies are documented in applicable `LICENSE` and `NOTICE` files ### Documentation - [ ] Documentation formatting appears as expected in rendered files -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Assigned] (NIFI-11564) Update msal4j to 1.13.8
[ https://issues.apache.org/jira/browse/NIFI-11564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike R reassigned NIFI-11564: - Assignee: Mike R > Update msal4j to 1.13.8 > --- > > Key: NIFI-11564 > URL: https://issues.apache.org/jira/browse/NIFI-11564 > Project: Apache NiFi > Issue Type: Improvement >Reporter: Mike R >Assignee: Mike R >Priority: Minor > > Update msal4j to 1.13.8 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (NIFI-11564) Update msal4j to 1.13.8
Mike R created NIFI-11564: - Summary: Update msal4j to 1.13.8 Key: NIFI-11564 URL: https://issues.apache.org/jira/browse/NIFI-11564 Project: Apache NiFi Issue Type: Improvement Reporter: Mike R Update msal4j to 1.13.8 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [nifi] gresockj opened a new pull request, #7261: NIFI-11563: Allowing source connectables to be restarted on new conne…
gresockj opened a new pull request, #7261:
URL: https://github.com/apache/nifi/pull/7261

…ctions in the StandardVersionedComponentSynchronizer

# Summary
[NIFI-11563](https://issues.apache.org/jira/browse/NIFI-11563)

# Tracking
Please complete the following tracking steps prior to pull request creation.

### Issue Tracking
- [ ] [Apache NiFi Jira](https://issues.apache.org/jira/browse/NIFI) issue created

### Pull Request Tracking
- [ ] Pull Request title starts with Apache NiFi Jira issue number, such as `NIFI-0`
- [ ] Pull Request commit message starts with Apache NiFi Jira issue number, as such `NIFI-0`

### Pull Request Formatting
- [ ] Pull Request based on current revision of the `main` branch
- [ ] Pull Request refers to a feature branch with one commit containing changes

# Verification
Please indicate the verification steps performed prior to pull request creation.

### Build
- [ ] Build completed using `mvn clean install -P contrib-check`
  - [ ] JDK 11
  - [ ] JDK 17

### Licensing
- [ ] New dependencies are compatible with the [Apache License 2.0](https://apache.org/licenses/LICENSE-2.0) according to the [License Policy](https://www.apache.org/legal/resolved.html)
- [ ] New dependencies are documented in applicable `LICENSE` and `NOTICE` files

### Documentation
- [ ] Documentation formatting appears as expected in rendered files
[jira] [Updated] (NIFI-11562) ValidateRecord doesn't work correctly; routes all to 'valid'
[ https://issues.apache.org/jira/browse/NIFI-11562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

crissaegrim updated NIFI-11562:
-------------------------------
    Summary: ValidateRecord doesn't work correctly; routes all to 'valid' (was: ValidateRecord doesn't work correctly; routes all to 'valid' for invalid records)

> ValidateRecord doesn't work correctly; routes all to 'valid'
> ------------------------------------------------------------
>
> Key: NIFI-11562
> URL: https://issues.apache.org/jira/browse/NIFI-11562
> Project: Apache NiFi
> Issue Type: Bug
> Affects Versions: 1.20.0
> Environment: linux, docker
> Reporter: crissaegrim
> Priority: Major
> Attachments: image-2023-05-17-17-32-04-404.png
>
> Replicate this with the steps below. Expecting the `age` field (int in schema) to cause a failure, but it's not failing.
> # Generate Flow File
>     id,name,age,employer
>     1,joe,30,mlp
>     2,bob,thirty,google
>     3,linda,32.123,yahoo
>     4,anne,31,
> # ValidateRecord
> # Connect `invalid` and `valid` to downstream to visualize
>
> Validate against this avro schema
>
>     protocol Test {
>       record TestEmployer {
>         int id;
>         string name;
>         int age;
>         string? employer = null;
>       }
>     }
>
>     {
>       "type" : "record",
>       "name" : "TestEmployer",
>       "fields" : [ {
>         "name" : "id",
>         "type" : "int"
>       }, {
>         "name" : "name",
>         "type" : "string"
>       }, {
>         "name" : "age",
>         "type" : "int"
>       }, {
>         "name" : "employer",
>         "type" : [ "null", "string" ],
>         "default" : null
>       } ]
>     }
>
> See that all records are routed to `valid`. Why?
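For reference, the outcome the reporter appears to expect can be sketched outside NiFi. The snippet below is a minimal, standalone Python check, not NiFi's ValidateRecord logic: it reads the sample CSV from the report and tests each field against the types in the TestEmployer schema. The helper names (`is_strict_int`, `validate_row`, `partition`) are made up for this illustration, and treating an empty CSV cell as null is an assumption.

```python
import csv
import io
import re

# Sample data copied from the report (NIFI-11562).
SAMPLE_CSV = """id,name,age,employer
1,joe,30,mlp
2,bob,thirty,google
3,linda,32.123,yahoo
4,anne,31,
"""

# Field name -> (type, nullable), mirroring the TestEmployer schema in the report.
SCHEMA = {
    "id": ("int", False),
    "name": ("string", False),
    "age": ("int", False),
    "employer": ("string", True),
}


def is_strict_int(text):
    """True only for plain base-10 integers within the 32-bit range of an Avro int."""
    if not re.fullmatch(r"-?[0-9]+", text):
        return False
    return -2**31 <= int(text) <= 2**31 - 1


def validate_row(row):
    """Return a list of problems for one CSV row; an empty list means valid."""
    problems = []
    for field, (kind, nullable) in SCHEMA.items():
        raw = row.get(field)
        # Assumption: an empty CSV cell stands for null.
        if raw is None or raw == "":
            if not nullable:
                problems.append(f"{field}: missing value")
            continue
        if kind == "int" and not is_strict_int(raw):
            problems.append(f"{field}: {raw!r} is not an int")
    return problems


def partition(csv_text):
    """Split record ids into (valid, invalid) lists using validate_row."""
    valid, invalid = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        (invalid if validate_row(row) else valid).append(row["id"])
    return valid, invalid


valid_ids, invalid_ids = partition(SAMPLE_CSV)
print("valid:", valid_ids)      # valid: ['1', '4']
print("invalid:", invalid_ids)  # invalid: ['2', '3']
```

Under these assumptions, records 2 (`thirty`) and 3 (`32.123`) fail the `int` check on `age`, while record 4 passes because `employer` is nullable.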
[jira] [Created] (NIFI-11563) Versioned component synchronizer does not stop source connectable for new connection
Joe Gresock created NIFI-11563:
-------------------------------
    Summary: Versioned component synchronizer does not stop source connectable for new connection
    Key: NIFI-11563
    URL: https://issues.apache.org/jira/browse/NIFI-11563
    Project: Apache NiFi
    Issue Type: Bug
    Components: Core Framework
    Affects Versions: 1.21.0
    Reporter: Joe Gresock
    Assignee: Joe Gresock

Although this cannot be reproduced directly in NiFi, if any external tools use the StandardVersionedComponentSynchronizer to synchronize a flow, then adding a connection with an already-running source throws an exception. This is because although upstream connections are restarted for existing connections, they are not restarted for new connections.
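The failure mode described in this issue can be illustrated with a toy model. The sketch below is purely hypothetical Python and does not reflect NiFi's classes or the actual StandardVersionedComponentSynchronizer code: `Connectable`, `add_connection_naive`, and `add_connection_with_restart` are invented names, and the rule that a running source rejects new connections is an assumption taken from the description above.

```python
class Connectable:
    """Hypothetical stand-in for a flow component that can be running or stopped."""

    def __init__(self, name, running=False):
        self.name = name
        self.running = running
        self.outgoing = []

    def add_outgoing(self, destination):
        # Assumed rule for this sketch: a running source cannot gain a connection.
        if self.running:
            raise RuntimeError(f"{self.name} is running; stop it before adding a connection")
        self.outgoing.append(destination)


def add_connection_naive(source, destination):
    """Adds the connection without touching the source's run state."""
    source.add_outgoing(destination)


def add_connection_with_restart(source, destination):
    """Stops the source, adds the connection, then restores the previous run state."""
    was_running = source.running
    source.running = False
    try:
        source.add_outgoing(destination)
    finally:
        source.running = was_running


src = Connectable("source", running=True)
dst = Connectable("sink")

try:
    add_connection_naive(src, dst)
    naive_failed = False
except RuntimeError:
    naive_failed = True

add_connection_with_restart(src, dst)
print(naive_failed, src.running, [d.name for d in src.outgoing])  # True True ['sink']
```

In this model the naive path raises because the source is still running, while the stop-then-restart path adds the connection and leaves the source running afterwards.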
[GitHub] [nifi-minifi-cpp] adamdebreceni opened a new pull request, #1576: MINIFICPP-2121 - Use std::atomic_flag instead of semaphore
adamdebreceni opened a new pull request, #1576:
URL: https://github.com/apache/nifi-minifi-cpp/pull/1576

Thank you for submitting a contribution to Apache NiFi - MiNiFi C++.

In order to streamline the review of the contribution we ask you to ensure the following steps have been taken:

### For all changes:
- [ ] Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?
- [ ] Does your PR title start with MINIFICPP-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
- [ ] Has your PR been rebased against the latest commit within the target branch (typically main)?
- [ ] Is your initial contribution a single, squashed commit?

### For code changes:
- [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] If applicable, have you updated the LICENSE file?
- [ ] If applicable, have you updated the NOTICE file?

### For documentation related changes:
- [ ] Have you ensured that format looks appropriate for the output in which it is rendered?

### Note:
Please ensure that once the PR is submitted, you check GitHub Actions CI results for build issues and submit an update to your PR as soon as possible.
[GitHub] [nifi] simonbence opened a new pull request, #7260: NIFI-11559 Increase poll time in test to avoid breaking test on slowe…
simonbence opened a new pull request, #7260:
URL: https://github.com/apache/nifi/pull/7260

…r environments

# Summary
[NIFI-0](https://issues.apache.org/jira/browse/NIFI-0)

# Tracking
Please complete the following tracking steps prior to pull request creation.

### Issue Tracking
- [ ] [Apache NiFi Jira](https://issues.apache.org/jira/browse/NIFI) issue created

### Pull Request Tracking
- [ ] Pull Request title starts with Apache NiFi Jira issue number, such as `NIFI-0`
- [ ] Pull Request commit message starts with Apache NiFi Jira issue number, as such `NIFI-0`

### Pull Request Formatting
- [ ] Pull Request based on current revision of the `main` branch
- [ ] Pull Request refers to a feature branch with one commit containing changes

# Verification
Please indicate the verification steps performed prior to pull request creation.

### Build
- [ ] Build completed using `mvn clean install -P contrib-check`
  - [ ] JDK 11
  - [ ] JDK 17

### Licensing
- [ ] New dependencies are compatible with the [Apache License 2.0](https://apache.org/licenses/LICENSE-2.0) according to the [License Policy](https://www.apache.org/legal/resolved.html)
- [ ] New dependencies are documented in applicable `LICENSE` and `NOTICE` files

### Documentation
- [ ] Documentation formatting appears as expected in rendered files