Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]
exceptionfactory commented on PR #7665: URL: https://github.com/apache/nifi/pull/7665#issuecomment-1775953704 You're welcome @dan-s1. I am verifying the changes for the support branch, I updated the Jira to resolved for 2.0.0, and should be able to include 1.24.0 shortly. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]
dan-s1 commented on PR #7665: URL: https://github.com/apache/nifi/pull/7665#issuecomment-1775941691 @exceptionfactory Thank you! I am not sure if there is a delay or not but in Jira I still see the status as Patch Available. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]
exceptionfactory closed pull request #7665: NIFI-11197 Initial check in for Yaml record reader URL: https://github.com/apache/nifi/pull/7665 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]
exceptionfactory commented on PR #7665: URL: https://github.com/apache/nifi/pull/7665#issuecomment-1775094328 n@dan-s1 In the course of runtime testing, I found an issue with the Max String Length property value being `null` in `storePropertyValues` in `JsonTreeReader`. This is understandable since the YamlTreeReader removes that property. I rebased the branch and added one more commit with a `buildStreamReadContraints()` method that `YamlTreeReader` overrides. This approach will be clearer than null-checking the property value. I plan to run a few more tests soon. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]
exceptionfactory commented on code in PR #7665: URL: https://github.com/apache/nifi/pull/7665#discussion_r1367234754 ## nifi-nar-bundles/nifi-extension-utils/nifi-record-utils/nifi-json-record-utils/src/main/java/org/apache/nifi/json/JsonParserFactory.java: ## @@ -23,27 +23,20 @@ import java.io.IOException; import java.io.InputStream; +import java.util.Objects; -public class JsonParserFactory implements TokenParserFactory{ -private static final JsonFactory JSON_FACTORY = new JsonFactory(); -private static final ObjectMapper JSON_MAPPER = new ObjectMapper(); - +public class JsonParserFactory implements TokenParserFactory { @Override -public JsonParser getJsonParser(InputStream in) throws IOException { -JsonParser jsonParser = JSON_FACTORY.createParser(in); -jsonParser.setCodec(JSON_MAPPER); - -return jsonParser; -} +public JsonParser getJsonParser(final InputStream in, final StreamReadConstraints streamReadConstraints, final boolean allowComments) throws IOException { +Objects.requireNonNull(in, "Input Stream required"); +Objects.requireNonNull(streamReadConstraints, "Stream Read Constraints required"); -@Override -public ObjectMapper createCodec(boolean allowComments, StreamReadConstraints streamReadConstraints) { -ObjectMapper codec = new ObjectMapper(); -if(allowComments) { -codec.enable(JsonParser.Feature.ALLOW_COMMENTS); +final ObjectMapper objectMapper = new ObjectMapper(); + objectMapper.getFactory().setStreamReadConstraints(streamReadConstraints); +if (allowComments) { +objectMapper.enable(JsonParser.Feature.ALLOW_COMMENTS); } -codec.getFactory().setStreamReadConstraints(streamReadConstraints); - -return codec; +final JsonFactory jsonFactory = objectMapper.getFactory(); +return jsonFactory.createParser(in); Review Comment: Thanks for noting the previous approach. The line calling `setCodec()` was unchanged in the most recent updates for NIFI-12153, but it actually should have been removed as part of the changes to incorporate the constraint settings. So in this case, it is an opportunity to correct the implementation. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]
dan-s1 commented on code in PR #7665: URL: https://github.com/apache/nifi/pull/7665#discussion_r1367185418 ## nifi-nar-bundles/nifi-extension-utils/nifi-record-utils/nifi-json-record-utils/src/main/java/org/apache/nifi/json/JsonParserFactory.java: ## @@ -23,27 +23,20 @@ import java.io.IOException; import java.io.InputStream; +import java.util.Objects; -public class JsonParserFactory implements TokenParserFactory{ -private static final JsonFactory JSON_FACTORY = new JsonFactory(); -private static final ObjectMapper JSON_MAPPER = new ObjectMapper(); - +public class JsonParserFactory implements TokenParserFactory { @Override -public JsonParser getJsonParser(InputStream in) throws IOException { -JsonParser jsonParser = JSON_FACTORY.createParser(in); -jsonParser.setCodec(JSON_MAPPER); - -return jsonParser; -} +public JsonParser getJsonParser(final InputStream in, final StreamReadConstraints streamReadConstraints, final boolean allowComments) throws IOException { +Objects.requireNonNull(in, "Input Stream required"); +Objects.requireNonNull(streamReadConstraints, "Stream Read Constraints required"); -@Override -public ObjectMapper createCodec(boolean allowComments, StreamReadConstraints streamReadConstraints) { -ObjectMapper codec = new ObjectMapper(); -if(allowComments) { -codec.enable(JsonParser.Feature.ALLOW_COMMENTS); +final ObjectMapper objectMapper = new ObjectMapper(); + objectMapper.getFactory().setStreamReadConstraints(streamReadConstraints); +if (allowComments) { +objectMapper.enable(JsonParser.Feature.ALLOW_COMMENTS); } -codec.getFactory().setStreamReadConstraints(streamReadConstraints); - -return codec; +final JsonFactory jsonFactory = objectMapper.getFactory(); +return jsonFactory.createParser(in); Review Comment: In the previous code also the `JsonFactory` was coming from the `ObjectMapper`and yet `setCodec` was still called. See the previous line 145: `jsonParser = codec.getFactory().createParser(in);` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]
exceptionfactory commented on code in PR #7665: URL: https://github.com/apache/nifi/pull/7665#discussion_r1367180160 ## nifi-nar-bundles/nifi-extension-utils/nifi-record-utils/nifi-json-record-utils/src/main/java/org/apache/nifi/json/JsonRecordSource.java: ## @@ -29,6 +30,11 @@ public class JsonRecordSource implements RecordSource { private static final Logger logger = LoggerFactory.getLogger(JsonRecordSource.class); Review Comment: There have been discussions in the past around casing for the `Logger` in particular. The general trend is having the `logger` lowercase, even when static, but of course actual constants should be uppercase. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]
exceptionfactory commented on code in PR #7665: URL: https://github.com/apache/nifi/pull/7665#discussion_r1367179198 ## nifi-nar-bundles/nifi-extension-utils/nifi-record-utils/nifi-json-record-utils/src/main/java/org/apache/nifi/json/JsonParserFactory.java: ## @@ -23,27 +23,20 @@ import java.io.IOException; import java.io.InputStream; +import java.util.Objects; -public class JsonParserFactory implements TokenParserFactory{ -private static final JsonFactory JSON_FACTORY = new JsonFactory(); -private static final ObjectMapper JSON_MAPPER = new ObjectMapper(); - +public class JsonParserFactory implements TokenParserFactory { @Override -public JsonParser getJsonParser(InputStream in) throws IOException { -JsonParser jsonParser = JSON_FACTORY.createParser(in); -jsonParser.setCodec(JSON_MAPPER); - -return jsonParser; -} +public JsonParser getJsonParser(final InputStream in, final StreamReadConstraints streamReadConstraints, final boolean allowComments) throws IOException { +Objects.requireNonNull(in, "Input Stream required"); +Objects.requireNonNull(streamReadConstraints, "Stream Read Constraints required"); -@Override -public ObjectMapper createCodec(boolean allowComments, StreamReadConstraints streamReadConstraints) { -ObjectMapper codec = new ObjectMapper(); -if(allowComments) { -codec.enable(JsonParser.Feature.ALLOW_COMMENTS); +final ObjectMapper objectMapper = new ObjectMapper(); + objectMapper.getFactory().setStreamReadConstraints(streamReadConstraints); +if (allowComments) { +objectMapper.enable(JsonParser.Feature.ALLOW_COMMENTS); } -codec.getFactory().setStreamReadConstraints(streamReadConstraints); - -return codec; +final JsonFactory jsonFactory = objectMapper.getFactory(); +return jsonFactory.createParser(in); Review Comment: Setting the codec is not necessary because the `JsonFactory` is coming from the `ObjectMapper`, as opposed to the other way around. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]
dan-s1 commented on code in PR #7665: URL: https://github.com/apache/nifi/pull/7665#discussion_r1367170067 ## nifi-nar-bundles/nifi-extension-utils/nifi-record-utils/nifi-json-record-utils/src/main/java/org/apache/nifi/json/JsonRecordSource.java: ## @@ -29,6 +30,11 @@ public class JsonRecordSource implements RecordSource { private static final Logger logger = LoggerFactory.getLogger(JsonRecordSource.class); Review Comment: If the added static final private variables below are uppercase, this should also be for consistency. ## nifi-nar-bundles/nifi-extension-utils/nifi-record-utils/nifi-json-record-utils/src/main/java/org/apache/nifi/json/JsonParserFactory.java: ## @@ -23,27 +23,20 @@ import java.io.IOException; import java.io.InputStream; +import java.util.Objects; -public class JsonParserFactory implements TokenParserFactory{ -private static final JsonFactory JSON_FACTORY = new JsonFactory(); -private static final ObjectMapper JSON_MAPPER = new ObjectMapper(); - +public class JsonParserFactory implements TokenParserFactory { @Override -public JsonParser getJsonParser(InputStream in) throws IOException { -JsonParser jsonParser = JSON_FACTORY.createParser(in); -jsonParser.setCodec(JSON_MAPPER); - -return jsonParser; -} +public JsonParser getJsonParser(final InputStream in, final StreamReadConstraints streamReadConstraints, final boolean allowComments) throws IOException { +Objects.requireNonNull(in, "Input Stream required"); +Objects.requireNonNull(streamReadConstraints, "Stream Read Constraints required"); -@Override -public ObjectMapper createCodec(boolean allowComments, StreamReadConstraints streamReadConstraints) { -ObjectMapper codec = new ObjectMapper(); -if(allowComments) { -codec.enable(JsonParser.Feature.ALLOW_COMMENTS); +final ObjectMapper objectMapper = new ObjectMapper(); + objectMapper.getFactory().setStreamReadConstraints(streamReadConstraints); +if (allowComments) { +objectMapper.enable(JsonParser.Feature.ALLOW_COMMENTS); } -codec.getFactory().setStreamReadConstraints(streamReadConstraints); - -return codec; +final JsonFactory jsonFactory = objectMapper.getFactory(); +return jsonFactory.createParser(in); Review Comment: The previous code in `AbstractJsonRowRecordReader.java` formally on line 146 when creating the jsonParser also called `setCodec`. Shouldn't you have something similar here? Instead of ` return jsonFactory.createParser(in);` change to ``` final JsonParser jsonParser = jsonFactory.createParser(in); jsonParser.setCodec(objectMapper); return jsonParser; ``` ## nifi-nar-bundles/nifi-extension-utils/nifi-record-utils/nifi-json-record-utils/src/main/java/org/apache/nifi/json/TokenParserFactory.java: ## @@ -18,13 +18,19 @@ import com.fasterxml.jackson.core.JsonParser; import com.fasterxml.jackson.core.StreamReadConstraints; -import com.fasterxml.jackson.databind.ObjectMapper; import java.io.IOException; import java.io.InputStream; public interface TokenParserFactory { -JsonParser getJsonParser(InputStream in) throws IOException; - -ObjectMapper createCodec(boolean allowComments, StreamReadConstraints streamReadConstraints); +/** + * Get JSON Parser implementation for provided Input Stream with configured settings + * + * @param in Input Stream to be parsed + * @param streamReadConstraints Stream Read Constraints applied + * @param allowComments Whether to allow comments when parsing + * @return JSON Parser + * @throws IOException Thrown on failures ot read the Input Stream Review Comment: Instead of `@throws IOException Thrown on failures ot read the Input Stream` this should be `@throws IOException Thrown on failures to read the Input Stream` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]
dan-s1 commented on PR #7665: URL: https://github.com/apache/nifi/pull/7665#issuecomment-1772781514 @exceptionfactory Any chance this can make 1.24? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]
dan-s1 commented on PR #7665: URL: https://github.com/apache/nifi/pull/7665#issuecomment-1765429075 @exceptionfactory The failed build seems to not have to do with any of my changes. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]
exceptionfactory commented on PR #7665: URL: https://github.com/apache/nifi/pull/7665#issuecomment-1765209620 Thanks for the updates @dan-s1, I will take a closer look soon. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]
dan-s1 commented on PR #7665: URL: https://github.com/apache/nifi/pull/7665#issuecomment-1765207530 @exceptionfactory I resolved the conflicts but mistakenly rebased and squashed the commits. I believe all the changes you had requested are all in there. Is that okay? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]
exceptionfactory commented on PR #7665: URL: https://github.com/apache/nifi/pull/7665#issuecomment-1764582630 > @exceptionfactory The problem with static is I need many of the properties which are defined in instance methods in `SchemaRegistryService`. I basically want to mimic `getSupportedPropertyDescriptors` from `JsonTreeReader` which is not using any static methods minus the 2 properties which are not applicable for Yaml. Thanks for noting that detail @dan-s1, in that case, the approach you outlined seems like the best option. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]
dan-s1 commented on PR #7665: URL: https://github.com/apache/nifi/pull/7665#issuecomment-1764574517 @exceptionfactory The problem with static is I need many of the properties which are defined in instance methods in `SchemaRegistryService`. I basically want to mimic `getSupportedPropertyDescriptors` from `JsonTreeReader` which is not using any static methods minus the 2 properties which are not applicable for Yaml. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]
exceptionfactory commented on PR #7665: URL: https://github.com/apache/nifi/pull/7665#issuecomment-1764530205 > @exceptionfactory Then would it be okay to have in `YamlTreeReader` code like: > > ``` > @Override > protected List getSupportedPropertyDescriptors() { > final List properties = new ArrayList<>(super.getSupportedPropertyDescriptors()); > //NOTE: Remove those properties which are not applicable for Yaml. > properties.remove(AbstractJsonRowRecordReader.MAX_STRING_LENGTH); > properties.remove(AbstractJsonRowRecordReader.ALLOW_COMMENTS); > > return properties; > } > ``` Thanks for the example @dan-s1, yes, if those properties do not appear to be honored with the YAML implementation, then that is one approach. However, given the number of times that this method will be called, it would be better to define a local static list of supported properties and return that reference in `getSupportedPropertyDescriptors()`. That does require manual maintenance of the list for the YAML Reader, but that is probably better in this case as it should force a review of whether any new property would apply to both the JSON and YAML implementations. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]
dan-s1 commented on PR #7665: URL: https://github.com/apache/nifi/pull/7665#issuecomment-1764514871 > @dan-s1 Just out of (genuine) curiosity, what prompted this feature? I have never seen anyone use YAML as anything other than a configuration file format. What horror story/interesting use case prompted it? :-D @MikeThomsen Please refer to the ticket for more details but the submitter's use case was to have ability to convert from Yaml to Json. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]
dan-s1 commented on PR #7665: URL: https://github.com/apache/nifi/pull/7665#issuecomment-1764511710 > > @exceptionfactory I am trying to rework my code to handle the changes made in #7823 for NIFI-12153. What I am discovering is that YAML supports comments and there is no way to turn off parsing of comments. In addition I am trying to mimic `TestJsonTreeRowRecordReader#testReadJSONStringTooLong` and I am trying to use `StreamReadConstraints.builder().maxStringLength(1).build()` yet no exception is thrown even though I have strings longer than one character long. It would seem that aspect only works for Json parsing and not Yaml parsing. I did find the following article [How to parse large YAML file in Java or Kotlin](https://stackoverflow.com/questions/74805240/how-to-parse-large-yaml-file-in-java-or-kotlin) which would seem to handle larger Yaml files but I am not sure if that is the same setting as `StreamReadConstraints.builder().maxStringLength(1).build()`. Please advise. > > Thanks for the details @dan-s1. If the YAML implementation does not support the same constraints, then I recommend not including those property descriptors on the YAML Record Reader implementation. @exceptionfactory Then would it be okay to have in `YamlTreeReader` code like: ``` @Override protected List getSupportedPropertyDescriptors() { final List properties = new ArrayList<>(super.getSupportedPropertyDescriptors()); //NOTE: Remove those properties which are not applicable for Yaml. properties.remove(AbstractJsonRowRecordReader.MAX_STRING_LENGTH); properties.remove(AbstractJsonRowRecordReader.ALLOW_COMMENTS); return properties; } ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]
MikeThomsen commented on PR #7665: URL: https://github.com/apache/nifi/pull/7665#issuecomment-1762850896 @dan-s1 Just out of (genuine) curiosity, what prompted this feature? I have never seen anyone use YAML as anything other than a configuration file format. What horror story/interesting use case prompted it? :-D -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]
exceptionfactory commented on PR #7665: URL: https://github.com/apache/nifi/pull/7665#issuecomment-1762354391 > @exceptionfactory I am trying to rework my code to handle the changes made in #7823 for NIFI-12153. What I am discovering is that YAML supports comments and there is no way to turn off parsing of comments. In addition I am trying to mimic `TestJsonTreeRowRecordReader#testReadJSONStringTooLong` and I am trying to use `StreamReadConstraints.builder().maxStringLength(1).build()` yet no exception is thrown even though I have strings longer than one character long. It would seem that aspect only works for Json parsing and not Yaml parsing. I did find the following article [How to parse large YAML file in Java or Kotlin](https://stackoverflow.com/questions/74805240/how-to-parse-large-yaml-file-in-java-or-kotlin) which would seem to handle larger Yaml files but I am not sure if that is the same setting as `StreamReadConstraints.builder().maxStringLength(1).build()`. Please advise. Thanks for the details @dan-s1. If the YAML implementation does not support the same constraints, then I recommend not including those property descriptors on the YAML Record Reader implementation. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]
dan-s1 commented on PR #7665: URL: https://github.com/apache/nifi/pull/7665#issuecomment-1762022382 @exceptionfactory I am trying to rework my code to handle the changes made in #7823 for NIFI-12153. What I am discovering is that YAML supports comments and there is no way to turn off parsing of comments. In addition I am trying to mimic `TestJsonTreeRowRecordReader#testReadJSONStringTooLong` and I am trying to use `StreamReadConstraints.builder().maxStringLength(1).build()` yet no exception is thrown even though I have strings longer than one character long. It would seem that aspect only works for Json parsing and not Yaml parsing. I did find the following article [How to parse large YAML file in Java or Kotlin](https://stackoverflow.com/questions/74805240/how-to-parse-large-yaml-file-in-java-or-kotlin) which would seem to handle larger Yaml files but I am not sure if that is the same setting as `StreamReadConstraints.builder().maxStringLength(1).build()`. Please advise. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]
dan-s1 commented on PR #7665: URL: https://github.com/apache/nifi/pull/7665#issuecomment-1753340884 @exceptionfactory There are conflicts (I guess obviously) I will try to resolve and then push. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]
exceptionfactory commented on PR #7665: URL: https://github.com/apache/nifi/pull/7665#issuecomment-175951 @dan-s1 The failure appears to be an intermittent issue based on heap space for the build. We may need to address that separately, but it is not a concern for this PR. However, I just merged a separate PR adding `StreamConstraints` handling to the JSON Readers, resulting in merge conflicts. Can you rebase on the latest main branch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]
dan-s1 commented on PR #7665: URL: https://github.com/apache/nifi/pull/7665#issuecomment-1753319713 @exceptionfactory I see three out of the four builds succeeded but I do not see a reason for the one that failed. Also I am not sure why are there conflicts. Please advise. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]
dan-s1 commented on code in PR #7665: URL: https://github.com/apache/nifi/pull/7665#discussion_r1350266069 ## nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/test/java/org/apache/nifi/yaml/TestYamlTreeRowRecordReader.java: ## @@ -0,0 +1,1347 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.nifi.yaml; + +import org.apache.avro.Schema; +import org.apache.commons.io.FileUtils; +import org.apache.nifi.avro.AvroTypeUtil; +import org.apache.nifi.json.JsonSchemaInference; +import org.apache.nifi.json.JsonTreeRowRecordReader; +import org.apache.nifi.json.SchemaApplicationStrategy; +import org.apache.nifi.json.StartingFieldStrategy; +import org.apache.nifi.logging.ComponentLog; +import org.apache.nifi.schema.inference.InferSchemaAccessStrategy; +import org.apache.nifi.schema.inference.TimeValueInference; +import org.apache.nifi.serialization.MalformedRecordException; +import org.apache.nifi.serialization.SimpleRecordSchema; +import org.apache.nifi.serialization.record.DataType; +import org.apache.nifi.serialization.record.MapRecord; +import org.apache.nifi.serialization.record.Record; +import org.apache.nifi.serialization.record.RecordField; +import org.apache.nifi.serialization.record.RecordFieldType; +import org.apache.nifi.serialization.record.RecordSchema; +import org.apache.nifi.serialization.record.type.ChoiceDataType; +import org.apache.nifi.util.EqualsWrapper; +import org.apache.nifi.util.MockComponentLog; +import org.junit.jupiter.api.Disabled; +import org.junit.jupiter.api.Test; + +import java.io.ByteArrayInputStream; +import java.io.File; +import java.io.FileInputStream; +import java.io.IOException; +import java.io.InputStream; +import java.math.BigDecimal; +import java.nio.charset.StandardCharsets; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.Collections; +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.Set; +import java.util.function.BiPredicate; +import java.util.function.Function; +import java.util.stream.Collectors; + +import static org.junit.jupiter.api.Assertions.assertArrayEquals; +import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertNotNull; +import static org.junit.jupiter.api.Assertions.assertNull; +import static org.junit.jupiter.api.Assertions.assertSame; +import static org.junit.jupiter.api.Assertions.assertThrows; +import static org.junit.jupiter.api.Assertions.assertTrue; +import static org.mockito.Mockito.mock; + +class TestYamlTreeRowRecordReader { +private final String dateFormat = RecordFieldType.DATE.getDefaultFormat(); +private final String timeFormat = RecordFieldType.TIME.getDefaultFormat(); +private final String timestampFormat = RecordFieldType.TIMESTAMP.getDefaultFormat(); + +private List getDefaultFields() { +return getFields(RecordFieldType.DOUBLE.getDataType()); +} + +private List getFields(final DataType balanceDataType) { +final List fields = new ArrayList<>(); +fields.add(new RecordField("id", RecordFieldType.INT.getDataType())); +fields.add(new RecordField("name", RecordFieldType.STRING.getDataType())); +fields.add(new RecordField("balance", balanceDataType)); +fields.add(new RecordField("address", RecordFieldType.STRING.getDataType())); +fields.add(new RecordField("city", RecordFieldType.STRING.getDataType())); +fields.add(new RecordField("state", RecordFieldType.STRING.getDataType())); +fields.add(new RecordField("zipCode", RecordFieldType.STRING.getDataType())); +fields.add(new RecordField("country", RecordFieldType.STRING.getDataType())); +return fields; +} + +private RecordSchema getAccountSchema() { +final List accountFields = new ArrayList<>(); +accountFields.add(new RecordField("id", RecordFieldType.INT.getDataType())); +accountFields.add(new RecordField("balance", RecordFieldType.DOUBLE.getDataType())); + +return new SimpleRecordSchema(accountFields); +} + +@Test +void testReadChoiceOfStringO
Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]
exceptionfactory commented on code in PR #7665: URL: https://github.com/apache/nifi/pull/7665#discussion_r1348154405 ## nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/test/java/org/apache/nifi/yaml/TestYamlTreeRowRecordReader.java: ## @@ -0,0 +1,1347 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.nifi.yaml; + +import org.apache.avro.Schema; +import org.apache.commons.io.FileUtils; +import org.apache.nifi.avro.AvroTypeUtil; +import org.apache.nifi.json.JsonSchemaInference; +import org.apache.nifi.json.JsonTreeRowRecordReader; +import org.apache.nifi.json.SchemaApplicationStrategy; +import org.apache.nifi.json.StartingFieldStrategy; +import org.apache.nifi.logging.ComponentLog; +import org.apache.nifi.schema.inference.InferSchemaAccessStrategy; +import org.apache.nifi.schema.inference.TimeValueInference; +import org.apache.nifi.serialization.MalformedRecordException; +import org.apache.nifi.serialization.SimpleRecordSchema; +import org.apache.nifi.serialization.record.DataType; +import org.apache.nifi.serialization.record.MapRecord; +import org.apache.nifi.serialization.record.Record; +import org.apache.nifi.serialization.record.RecordField; +import org.apache.nifi.serialization.record.RecordFieldType; +import org.apache.nifi.serialization.record.RecordSchema; +import org.apache.nifi.serialization.record.type.ChoiceDataType; +import org.apache.nifi.util.EqualsWrapper; +import org.apache.nifi.util.MockComponentLog; +import org.junit.jupiter.api.Disabled; +import org.junit.jupiter.api.Test; + +import java.io.ByteArrayInputStream; +import java.io.File; +import java.io.FileInputStream; +import java.io.IOException; +import java.io.InputStream; +import java.math.BigDecimal; +import java.nio.charset.StandardCharsets; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.Collections; +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.Set; +import java.util.function.BiPredicate; +import java.util.function.Function; +import java.util.stream.Collectors; + +import static org.junit.jupiter.api.Assertions.assertArrayEquals; +import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertNotNull; +import static org.junit.jupiter.api.Assertions.assertNull; +import static org.junit.jupiter.api.Assertions.assertSame; +import static org.junit.jupiter.api.Assertions.assertThrows; +import static org.junit.jupiter.api.Assertions.assertTrue; +import static org.mockito.Mockito.mock; + +class TestYamlTreeRowRecordReader { +private final String dateFormat = RecordFieldType.DATE.getDefaultFormat(); +private final String timeFormat = RecordFieldType.TIME.getDefaultFormat(); +private final String timestampFormat = RecordFieldType.TIMESTAMP.getDefaultFormat(); + +private List getDefaultFields() { +return getFields(RecordFieldType.DOUBLE.getDataType()); +} + +private List getFields(final DataType balanceDataType) { +final List fields = new ArrayList<>(); +fields.add(new RecordField("id", RecordFieldType.INT.getDataType())); +fields.add(new RecordField("name", RecordFieldType.STRING.getDataType())); +fields.add(new RecordField("balance", balanceDataType)); +fields.add(new RecordField("address", RecordFieldType.STRING.getDataType())); +fields.add(new RecordField("city", RecordFieldType.STRING.getDataType())); +fields.add(new RecordField("state", RecordFieldType.STRING.getDataType())); +fields.add(new RecordField("zipCode", RecordFieldType.STRING.getDataType())); +fields.add(new RecordField("country", RecordFieldType.STRING.getDataType())); +return fields; +} + +private RecordSchema getAccountSchema() { +final List accountFields = new ArrayList<>(); +accountFields.add(new RecordField("id", RecordFieldType.INT.getDataType())); +accountFields.add(new RecordField("balance", RecordFieldType.DOUBLE.getDataType())); + +return new SimpleRecordSchema(accountFields); +} + +@Test +void testReadChoic