Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]

2023-10-23 Thread via GitHub


exceptionfactory commented on PR #7665:
URL: https://github.com/apache/nifi/pull/7665#issuecomment-1775953704

   You're welcome @dan-s1. I am verifying the changes for the support branch, I 
updated the Jira to resolved for 2.0.0, and should be able to include 1.24.0 
shortly.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]

2023-10-23 Thread via GitHub


dan-s1 commented on PR #7665:
URL: https://github.com/apache/nifi/pull/7665#issuecomment-1775941691

   @exceptionfactory Thank you! I am not sure if there is a delay or not but in 
Jira I still see the status as Patch Available.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]

2023-10-23 Thread via GitHub


exceptionfactory closed pull request #7665: NIFI-11197 Initial check in for 
Yaml record reader
URL: https://github.com/apache/nifi/pull/7665


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]

2023-10-23 Thread via GitHub


exceptionfactory commented on PR #7665:
URL: https://github.com/apache/nifi/pull/7665#issuecomment-1775094328

   n@dan-s1 In the course of runtime testing, I found an issue with the Max 
String Length property value being `null` in `storePropertyValues` in 
`JsonTreeReader`. This is understandable since the YamlTreeReader removes that 
property. I rebased the branch and added one more commit with a 
`buildStreamReadContraints()` method that `YamlTreeReader` overrides. This 
approach will be clearer than null-checking the property value. I plan to run a 
few more tests soon.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]

2023-10-20 Thread via GitHub


exceptionfactory commented on code in PR #7665:
URL: https://github.com/apache/nifi/pull/7665#discussion_r1367234754


##
nifi-nar-bundles/nifi-extension-utils/nifi-record-utils/nifi-json-record-utils/src/main/java/org/apache/nifi/json/JsonParserFactory.java:
##
@@ -23,27 +23,20 @@
 
 import java.io.IOException;
 import java.io.InputStream;
+import java.util.Objects;
 
-public class JsonParserFactory implements TokenParserFactory{
-private static final JsonFactory JSON_FACTORY = new JsonFactory();
-private static final ObjectMapper JSON_MAPPER = new ObjectMapper();
-
+public class JsonParserFactory implements TokenParserFactory {
 @Override
-public JsonParser getJsonParser(InputStream in) throws IOException {
-JsonParser jsonParser = JSON_FACTORY.createParser(in);
-jsonParser.setCodec(JSON_MAPPER);
-
-return jsonParser;
-}
+public JsonParser getJsonParser(final InputStream in, final 
StreamReadConstraints streamReadConstraints, final boolean allowComments) 
throws IOException {
+Objects.requireNonNull(in, "Input Stream required");
+Objects.requireNonNull(streamReadConstraints, "Stream Read Constraints 
required");
 
-@Override
-public ObjectMapper createCodec(boolean allowComments, 
StreamReadConstraints streamReadConstraints) {
-ObjectMapper codec = new ObjectMapper();
-if(allowComments) {
-codec.enable(JsonParser.Feature.ALLOW_COMMENTS);
+final ObjectMapper objectMapper = new ObjectMapper();
+
objectMapper.getFactory().setStreamReadConstraints(streamReadConstraints);
+if (allowComments) {
+objectMapper.enable(JsonParser.Feature.ALLOW_COMMENTS);
 }
-codec.getFactory().setStreamReadConstraints(streamReadConstraints);
-
-return codec;
+final JsonFactory jsonFactory = objectMapper.getFactory();
+return jsonFactory.createParser(in);

Review Comment:
   Thanks for noting the previous approach. The line calling `setCodec()` was 
unchanged in the most recent updates for NIFI-12153, but it actually should 
have been removed as part of the changes to incorporate the constraint 
settings. So in this case, it is an opportunity to correct the implementation.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]

2023-10-20 Thread via GitHub


dan-s1 commented on code in PR #7665:
URL: https://github.com/apache/nifi/pull/7665#discussion_r1367185418


##
nifi-nar-bundles/nifi-extension-utils/nifi-record-utils/nifi-json-record-utils/src/main/java/org/apache/nifi/json/JsonParserFactory.java:
##
@@ -23,27 +23,20 @@
 
 import java.io.IOException;
 import java.io.InputStream;
+import java.util.Objects;
 
-public class JsonParserFactory implements TokenParserFactory{
-private static final JsonFactory JSON_FACTORY = new JsonFactory();
-private static final ObjectMapper JSON_MAPPER = new ObjectMapper();
-
+public class JsonParserFactory implements TokenParserFactory {
 @Override
-public JsonParser getJsonParser(InputStream in) throws IOException {
-JsonParser jsonParser = JSON_FACTORY.createParser(in);
-jsonParser.setCodec(JSON_MAPPER);
-
-return jsonParser;
-}
+public JsonParser getJsonParser(final InputStream in, final 
StreamReadConstraints streamReadConstraints, final boolean allowComments) 
throws IOException {
+Objects.requireNonNull(in, "Input Stream required");
+Objects.requireNonNull(streamReadConstraints, "Stream Read Constraints 
required");
 
-@Override
-public ObjectMapper createCodec(boolean allowComments, 
StreamReadConstraints streamReadConstraints) {
-ObjectMapper codec = new ObjectMapper();
-if(allowComments) {
-codec.enable(JsonParser.Feature.ALLOW_COMMENTS);
+final ObjectMapper objectMapper = new ObjectMapper();
+
objectMapper.getFactory().setStreamReadConstraints(streamReadConstraints);
+if (allowComments) {
+objectMapper.enable(JsonParser.Feature.ALLOW_COMMENTS);
 }
-codec.getFactory().setStreamReadConstraints(streamReadConstraints);
-
-return codec;
+final JsonFactory jsonFactory = objectMapper.getFactory();
+return jsonFactory.createParser(in);

Review Comment:
   In the previous code also the `JsonFactory` was coming from the 
`ObjectMapper`and yet `setCodec` was still called. See the previous line 145:
`jsonParser = codec.getFactory().createParser(in);`
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]

2023-10-20 Thread via GitHub


exceptionfactory commented on code in PR #7665:
URL: https://github.com/apache/nifi/pull/7665#discussion_r1367180160


##
nifi-nar-bundles/nifi-extension-utils/nifi-record-utils/nifi-json-record-utils/src/main/java/org/apache/nifi/json/JsonRecordSource.java:
##
@@ -29,6 +30,11 @@
 
 public class JsonRecordSource implements RecordSource {
 private static final Logger logger = 
LoggerFactory.getLogger(JsonRecordSource.class);

Review Comment:
   There have been discussions in the past around casing for the `Logger` in 
particular. The general trend is having the `logger` lowercase, even when 
static, but of course actual constants should be uppercase.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]

2023-10-20 Thread via GitHub


exceptionfactory commented on code in PR #7665:
URL: https://github.com/apache/nifi/pull/7665#discussion_r1367179198


##
nifi-nar-bundles/nifi-extension-utils/nifi-record-utils/nifi-json-record-utils/src/main/java/org/apache/nifi/json/JsonParserFactory.java:
##
@@ -23,27 +23,20 @@
 
 import java.io.IOException;
 import java.io.InputStream;
+import java.util.Objects;
 
-public class JsonParserFactory implements TokenParserFactory{
-private static final JsonFactory JSON_FACTORY = new JsonFactory();
-private static final ObjectMapper JSON_MAPPER = new ObjectMapper();
-
+public class JsonParserFactory implements TokenParserFactory {
 @Override
-public JsonParser getJsonParser(InputStream in) throws IOException {
-JsonParser jsonParser = JSON_FACTORY.createParser(in);
-jsonParser.setCodec(JSON_MAPPER);
-
-return jsonParser;
-}
+public JsonParser getJsonParser(final InputStream in, final 
StreamReadConstraints streamReadConstraints, final boolean allowComments) 
throws IOException {
+Objects.requireNonNull(in, "Input Stream required");
+Objects.requireNonNull(streamReadConstraints, "Stream Read Constraints 
required");
 
-@Override
-public ObjectMapper createCodec(boolean allowComments, 
StreamReadConstraints streamReadConstraints) {
-ObjectMapper codec = new ObjectMapper();
-if(allowComments) {
-codec.enable(JsonParser.Feature.ALLOW_COMMENTS);
+final ObjectMapper objectMapper = new ObjectMapper();
+
objectMapper.getFactory().setStreamReadConstraints(streamReadConstraints);
+if (allowComments) {
+objectMapper.enable(JsonParser.Feature.ALLOW_COMMENTS);
 }
-codec.getFactory().setStreamReadConstraints(streamReadConstraints);
-
-return codec;
+final JsonFactory jsonFactory = objectMapper.getFactory();
+return jsonFactory.createParser(in);

Review Comment:
   Setting the codec is not necessary because the `JsonFactory` is coming from 
the `ObjectMapper`, as opposed to the other way around.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]

2023-10-20 Thread via GitHub


dan-s1 commented on code in PR #7665:
URL: https://github.com/apache/nifi/pull/7665#discussion_r1367170067


##
nifi-nar-bundles/nifi-extension-utils/nifi-record-utils/nifi-json-record-utils/src/main/java/org/apache/nifi/json/JsonRecordSource.java:
##
@@ -29,6 +30,11 @@
 
 public class JsonRecordSource implements RecordSource {
 private static final Logger logger = 
LoggerFactory.getLogger(JsonRecordSource.class);

Review Comment:
   If the added static final private variables below are uppercase, this should 
also be for consistency.



##
nifi-nar-bundles/nifi-extension-utils/nifi-record-utils/nifi-json-record-utils/src/main/java/org/apache/nifi/json/JsonParserFactory.java:
##
@@ -23,27 +23,20 @@
 
 import java.io.IOException;
 import java.io.InputStream;
+import java.util.Objects;
 
-public class JsonParserFactory implements TokenParserFactory{
-private static final JsonFactory JSON_FACTORY = new JsonFactory();
-private static final ObjectMapper JSON_MAPPER = new ObjectMapper();
-
+public class JsonParserFactory implements TokenParserFactory {
 @Override
-public JsonParser getJsonParser(InputStream in) throws IOException {
-JsonParser jsonParser = JSON_FACTORY.createParser(in);
-jsonParser.setCodec(JSON_MAPPER);
-
-return jsonParser;
-}
+public JsonParser getJsonParser(final InputStream in, final 
StreamReadConstraints streamReadConstraints, final boolean allowComments) 
throws IOException {
+Objects.requireNonNull(in, "Input Stream required");
+Objects.requireNonNull(streamReadConstraints, "Stream Read Constraints 
required");
 
-@Override
-public ObjectMapper createCodec(boolean allowComments, 
StreamReadConstraints streamReadConstraints) {
-ObjectMapper codec = new ObjectMapper();
-if(allowComments) {
-codec.enable(JsonParser.Feature.ALLOW_COMMENTS);
+final ObjectMapper objectMapper = new ObjectMapper();
+
objectMapper.getFactory().setStreamReadConstraints(streamReadConstraints);
+if (allowComments) {
+objectMapper.enable(JsonParser.Feature.ALLOW_COMMENTS);
 }
-codec.getFactory().setStreamReadConstraints(streamReadConstraints);
-
-return codec;
+final JsonFactory jsonFactory = objectMapper.getFactory();
+return jsonFactory.createParser(in);

Review Comment:
   The previous code in `AbstractJsonRowRecordReader.java` formally on line 146 
when creating the jsonParser also called `setCodec`.  Shouldn't you have 
something similar here?
   Instead of
   ` return jsonFactory.createParser(in);`
   
   change to
   ```
   final JsonParser jsonParser = jsonFactory.createParser(in);
   jsonParser.setCodec(objectMapper);
   return jsonParser;
   ```



##
nifi-nar-bundles/nifi-extension-utils/nifi-record-utils/nifi-json-record-utils/src/main/java/org/apache/nifi/json/TokenParserFactory.java:
##
@@ -18,13 +18,19 @@
 
 import com.fasterxml.jackson.core.JsonParser;
 import com.fasterxml.jackson.core.StreamReadConstraints;
-import com.fasterxml.jackson.databind.ObjectMapper;
 
 import java.io.IOException;
 import java.io.InputStream;
 
 public interface TokenParserFactory {
-JsonParser getJsonParser(InputStream in) throws IOException;
-
-ObjectMapper createCodec(boolean allowComments, StreamReadConstraints 
streamReadConstraints);
+/**
+ * Get JSON Parser implementation for provided Input Stream with 
configured settings
+ *
+ * @param in Input Stream to be parsed
+ * @param streamReadConstraints Stream Read Constraints applied
+ * @param allowComments Whether to allow comments when parsing
+ * @return JSON Parser
+ * @throws IOException Thrown on failures ot read the Input Stream

Review Comment:
   Instead of
   `@throws IOException Thrown on failures ot read the Input Stream`
   this should be
   `@throws IOException Thrown on failures to read the Input Stream`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]

2023-10-20 Thread via GitHub


dan-s1 commented on PR #7665:
URL: https://github.com/apache/nifi/pull/7665#issuecomment-1772781514

   @exceptionfactory Any chance this can make 1.24?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]

2023-10-16 Thread via GitHub


dan-s1 commented on PR #7665:
URL: https://github.com/apache/nifi/pull/7665#issuecomment-1765429075

   @exceptionfactory The failed build seems to not have to do with any of my 
changes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]

2023-10-16 Thread via GitHub


exceptionfactory commented on PR #7665:
URL: https://github.com/apache/nifi/pull/7665#issuecomment-1765209620

   Thanks for the updates @dan-s1, I will take a closer look soon.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]

2023-10-16 Thread via GitHub


dan-s1 commented on PR #7665:
URL: https://github.com/apache/nifi/pull/7665#issuecomment-1765207530

   @exceptionfactory I resolved the conflicts but mistakenly rebased and 
squashed the commits. I believe all the changes you had requested are all in 
there. Is that okay?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]

2023-10-16 Thread via GitHub


exceptionfactory commented on PR #7665:
URL: https://github.com/apache/nifi/pull/7665#issuecomment-1764582630

   > @exceptionfactory The problem with static is I need many of the properties 
which are defined in instance methods in `SchemaRegistryService`. I basically 
want to mimic `getSupportedPropertyDescriptors` from `JsonTreeReader` which is 
not using any static methods minus the 2 properties which are not applicable 
for Yaml.
   
   Thanks for noting that detail @dan-s1, in that case, the approach you 
outlined seems like the best option.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]

2023-10-16 Thread via GitHub


dan-s1 commented on PR #7665:
URL: https://github.com/apache/nifi/pull/7665#issuecomment-1764574517

   @exceptionfactory The problem with static is I need many of the properties 
which are defined in instance methods in `SchemaRegistryService`. I basically 
want to mimic `getSupportedPropertyDescriptors` from `JsonTreeReader` which is 
not using any static methods minus the 2 properties which are not applicable 
for Yaml.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]

2023-10-16 Thread via GitHub


exceptionfactory commented on PR #7665:
URL: https://github.com/apache/nifi/pull/7665#issuecomment-1764530205

   > @exceptionfactory Then would it be okay to have in `YamlTreeReader` code 
like:
   > 
   > ```
   > @Override
   > protected List getSupportedPropertyDescriptors() {
   > final List properties = new 
ArrayList<>(super.getSupportedPropertyDescriptors());
   > //NOTE: Remove those properties which are not applicable for Yaml.
   > properties.remove(AbstractJsonRowRecordReader.MAX_STRING_LENGTH);
   > properties.remove(AbstractJsonRowRecordReader.ALLOW_COMMENTS);
   > 
   > return properties;
   > }
   > ```
   
   Thanks for the example @dan-s1, yes, if those properties do not appear to be 
honored with the YAML implementation, then that is one approach. However, given 
the number of times that this method will be called, it would be better to 
define a local static list of supported properties and return that reference in 
`getSupportedPropertyDescriptors()`. That does require manual maintenance of 
the list for the YAML Reader, but that is probably better in this case as it 
should force a review of whether any new property would apply to both the JSON 
and YAML implementations.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]

2023-10-16 Thread via GitHub


dan-s1 commented on PR #7665:
URL: https://github.com/apache/nifi/pull/7665#issuecomment-1764514871

   > @dan-s1 Just out of (genuine) curiosity, what prompted this feature? I 
have never seen anyone use YAML as anything other than a configuration file 
format. What horror story/interesting use case prompted it? :-D
   
   @MikeThomsen Please refer to the ticket for more details but the submitter's 
use case was to have ability to convert from Yaml to Json.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]

2023-10-16 Thread via GitHub


dan-s1 commented on PR #7665:
URL: https://github.com/apache/nifi/pull/7665#issuecomment-1764511710

   > > @exceptionfactory I am trying to rework my code to handle the changes 
made in #7823 for NIFI-12153. What I am discovering is that YAML supports 
comments and there is no way to turn off parsing of comments. In addition I am 
trying to mimic `TestJsonTreeRowRecordReader#testReadJSONStringTooLong` and I 
am trying to use `StreamReadConstraints.builder().maxStringLength(1).build()` 
yet no exception is thrown even though I have strings longer than one character 
long. It would seem that aspect only works for Json parsing and not Yaml 
parsing. I did find the following article [How to parse large YAML file in Java 
or 
Kotlin](https://stackoverflow.com/questions/74805240/how-to-parse-large-yaml-file-in-java-or-kotlin)
 which would seem to handle larger Yaml files but I am not sure if that is the 
same setting as `StreamReadConstraints.builder().maxStringLength(1).build()`. 
Please advise.
   > 
   > Thanks for the details @dan-s1. If the YAML implementation does not 
support the same constraints, then I recommend not including those property 
descriptors on the YAML Record Reader implementation.
   
   @exceptionfactory Then would it be okay to have in `YamlTreeReader` code 
like:
   
   ```
   @Override
   protected List getSupportedPropertyDescriptors() {
   final List properties = new 
ArrayList<>(super.getSupportedPropertyDescriptors());
   //NOTE: Remove those properties which are not applicable for Yaml.
   properties.remove(AbstractJsonRowRecordReader.MAX_STRING_LENGTH);
   properties.remove(AbstractJsonRowRecordReader.ALLOW_COMMENTS);
   
   return properties;
   }
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]

2023-10-14 Thread via GitHub


MikeThomsen commented on PR #7665:
URL: https://github.com/apache/nifi/pull/7665#issuecomment-1762850896

   @dan-s1 Just out of (genuine) curiosity, what prompted this feature? I have 
never seen anyone use YAML as anything other than a configuration file format. 
What horror story/interesting use case prompted it? :-D


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]

2023-10-13 Thread via GitHub


exceptionfactory commented on PR #7665:
URL: https://github.com/apache/nifi/pull/7665#issuecomment-1762354391

   > @exceptionfactory I am trying to rework my code to handle the changes made 
in #7823 for NIFI-12153. What I am discovering is that YAML supports comments 
and there is no way to turn off parsing of comments. In addition I am trying to 
mimic `TestJsonTreeRowRecordReader#testReadJSONStringTooLong` and I am trying 
to use `StreamReadConstraints.builder().maxStringLength(1).build()` yet no 
exception is thrown even though I have strings longer than one character long. 
It would seem that aspect only works for Json parsing and not Yaml parsing. I 
did find the following article [How to parse large YAML file in Java or 
Kotlin](https://stackoverflow.com/questions/74805240/how-to-parse-large-yaml-file-in-java-or-kotlin)
 which would seem to handle larger Yaml files but I am not sure if that is the 
same setting as `StreamReadConstraints.builder().maxStringLength(1).build()`. 
Please advise.
   
   Thanks for the details @dan-s1. If the YAML implementation does not support 
the same constraints, then I recommend not including those property descriptors 
on the YAML Record Reader implementation.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]

2023-10-13 Thread via GitHub


dan-s1 commented on PR #7665:
URL: https://github.com/apache/nifi/pull/7665#issuecomment-1762022382

   @exceptionfactory I am trying to rework my code to handle the changes made 
in #7823 for NIFI-12153. What I am discovering is that YAML supports comments 
and there is no way to turn off parsing of comments.  In addition I am trying 
to mimic `TestJsonTreeRowRecordReader#testReadJSONStringTooLong` and I am 
trying to use `StreamReadConstraints.builder().maxStringLength(1).build()` yet 
no exception is thrown even though I have strings longer than one character 
long. It would seem that aspect only works for Json parsing and not Yaml 
parsing. I did find the following article [How to parse large YAML file in Java 
or 
Kotlin](https://stackoverflow.com/questions/74805240/how-to-parse-large-yaml-file-in-java-or-kotlin)
 which would seem to handle larger Yaml files but I am not sure if that is the 
same setting as `StreamReadConstraints.builder().maxStringLength(1).build()`. 
Please advise.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]

2023-10-09 Thread via GitHub


dan-s1 commented on PR #7665:
URL: https://github.com/apache/nifi/pull/7665#issuecomment-1753340884

   @exceptionfactory There are conflicts (I guess obviously) I will try to 
resolve and then push.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]

2023-10-09 Thread via GitHub


exceptionfactory commented on PR #7665:
URL: https://github.com/apache/nifi/pull/7665#issuecomment-175951

   @dan-s1 The failure appears to be an intermittent issue based on heap space 
for the build. We may need to address that separately, but it is not a concern 
for this PR.
   
   However, I just merged a separate PR adding `StreamConstraints` handling to 
the JSON Readers, resulting in merge conflicts. Can you rebase on the latest 
main branch?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]

2023-10-09 Thread via GitHub


dan-s1 commented on PR #7665:
URL: https://github.com/apache/nifi/pull/7665#issuecomment-1753319713

   @exceptionfactory I see three out of the four builds succeeded but I do not 
see a reason for the one that failed. Also I am not sure why are there 
conflicts. Please advise.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]

2023-10-09 Thread via GitHub


dan-s1 commented on code in PR #7665:
URL: https://github.com/apache/nifi/pull/7665#discussion_r1350266069


##
nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/test/java/org/apache/nifi/yaml/TestYamlTreeRowRecordReader.java:
##
@@ -0,0 +1,1347 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.yaml;
+
+import org.apache.avro.Schema;
+import org.apache.commons.io.FileUtils;
+import org.apache.nifi.avro.AvroTypeUtil;
+import org.apache.nifi.json.JsonSchemaInference;
+import org.apache.nifi.json.JsonTreeRowRecordReader;
+import org.apache.nifi.json.SchemaApplicationStrategy;
+import org.apache.nifi.json.StartingFieldStrategy;
+import org.apache.nifi.logging.ComponentLog;
+import org.apache.nifi.schema.inference.InferSchemaAccessStrategy;
+import org.apache.nifi.schema.inference.TimeValueInference;
+import org.apache.nifi.serialization.MalformedRecordException;
+import org.apache.nifi.serialization.SimpleRecordSchema;
+import org.apache.nifi.serialization.record.DataType;
+import org.apache.nifi.serialization.record.MapRecord;
+import org.apache.nifi.serialization.record.Record;
+import org.apache.nifi.serialization.record.RecordField;
+import org.apache.nifi.serialization.record.RecordFieldType;
+import org.apache.nifi.serialization.record.RecordSchema;
+import org.apache.nifi.serialization.record.type.ChoiceDataType;
+import org.apache.nifi.util.EqualsWrapper;
+import org.apache.nifi.util.MockComponentLog;
+import org.junit.jupiter.api.Disabled;
+import org.junit.jupiter.api.Test;
+
+import java.io.ByteArrayInputStream;
+import java.io.File;
+import java.io.FileInputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.math.BigDecimal;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.function.BiPredicate;
+import java.util.function.Function;
+import java.util.stream.Collectors;
+
+import static org.junit.jupiter.api.Assertions.assertArrayEquals;
+import static org.junit.jupiter.api.Assertions.assertEquals;
+import static org.junit.jupiter.api.Assertions.assertNotNull;
+import static org.junit.jupiter.api.Assertions.assertNull;
+import static org.junit.jupiter.api.Assertions.assertSame;
+import static org.junit.jupiter.api.Assertions.assertThrows;
+import static org.junit.jupiter.api.Assertions.assertTrue;
+import static org.mockito.Mockito.mock;
+
+class TestYamlTreeRowRecordReader {
+private final String dateFormat = RecordFieldType.DATE.getDefaultFormat();
+private final String timeFormat = RecordFieldType.TIME.getDefaultFormat();
+private final String timestampFormat = 
RecordFieldType.TIMESTAMP.getDefaultFormat();
+
+private List getDefaultFields() {
+return getFields(RecordFieldType.DOUBLE.getDataType());
+}
+
+private List getFields(final DataType balanceDataType) {
+final List fields = new ArrayList<>();
+fields.add(new RecordField("id", RecordFieldType.INT.getDataType()));
+fields.add(new RecordField("name", 
RecordFieldType.STRING.getDataType()));
+fields.add(new RecordField("balance", balanceDataType));
+fields.add(new RecordField("address", 
RecordFieldType.STRING.getDataType()));
+fields.add(new RecordField("city", 
RecordFieldType.STRING.getDataType()));
+fields.add(new RecordField("state", 
RecordFieldType.STRING.getDataType()));
+fields.add(new RecordField("zipCode", 
RecordFieldType.STRING.getDataType()));
+fields.add(new RecordField("country", 
RecordFieldType.STRING.getDataType()));
+return fields;
+}
+
+private RecordSchema getAccountSchema() {
+final List accountFields = new ArrayList<>();
+accountFields.add(new RecordField("id", 
RecordFieldType.INT.getDataType()));
+accountFields.add(new RecordField("balance", 
RecordFieldType.DOUBLE.getDataType()));
+
+return new SimpleRecordSchema(accountFields);
+}
+
+@Test
+void 

Re: [PR] NIFI-11197 Initial check in for Yaml record reader [nifi]

2023-10-05 Thread via GitHub


exceptionfactory commented on code in PR #7665:
URL: https://github.com/apache/nifi/pull/7665#discussion_r1348154405


##
nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/test/java/org/apache/nifi/yaml/TestYamlTreeRowRecordReader.java:
##
@@ -0,0 +1,1347 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.nifi.yaml;
+
+import org.apache.avro.Schema;
+import org.apache.commons.io.FileUtils;
+import org.apache.nifi.avro.AvroTypeUtil;
+import org.apache.nifi.json.JsonSchemaInference;
+import org.apache.nifi.json.JsonTreeRowRecordReader;
+import org.apache.nifi.json.SchemaApplicationStrategy;
+import org.apache.nifi.json.StartingFieldStrategy;
+import org.apache.nifi.logging.ComponentLog;
+import org.apache.nifi.schema.inference.InferSchemaAccessStrategy;
+import org.apache.nifi.schema.inference.TimeValueInference;
+import org.apache.nifi.serialization.MalformedRecordException;
+import org.apache.nifi.serialization.SimpleRecordSchema;
+import org.apache.nifi.serialization.record.DataType;
+import org.apache.nifi.serialization.record.MapRecord;
+import org.apache.nifi.serialization.record.Record;
+import org.apache.nifi.serialization.record.RecordField;
+import org.apache.nifi.serialization.record.RecordFieldType;
+import org.apache.nifi.serialization.record.RecordSchema;
+import org.apache.nifi.serialization.record.type.ChoiceDataType;
+import org.apache.nifi.util.EqualsWrapper;
+import org.apache.nifi.util.MockComponentLog;
+import org.junit.jupiter.api.Disabled;
+import org.junit.jupiter.api.Test;
+
+import java.io.ByteArrayInputStream;
+import java.io.File;
+import java.io.FileInputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.math.BigDecimal;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.function.BiPredicate;
+import java.util.function.Function;
+import java.util.stream.Collectors;
+
+import static org.junit.jupiter.api.Assertions.assertArrayEquals;
+import static org.junit.jupiter.api.Assertions.assertEquals;
+import static org.junit.jupiter.api.Assertions.assertNotNull;
+import static org.junit.jupiter.api.Assertions.assertNull;
+import static org.junit.jupiter.api.Assertions.assertSame;
+import static org.junit.jupiter.api.Assertions.assertThrows;
+import static org.junit.jupiter.api.Assertions.assertTrue;
+import static org.mockito.Mockito.mock;
+
+class TestYamlTreeRowRecordReader {
+private final String dateFormat = RecordFieldType.DATE.getDefaultFormat();
+private final String timeFormat = RecordFieldType.TIME.getDefaultFormat();
+private final String timestampFormat = 
RecordFieldType.TIMESTAMP.getDefaultFormat();
+
+private List getDefaultFields() {
+return getFields(RecordFieldType.DOUBLE.getDataType());
+}
+
+private List getFields(final DataType balanceDataType) {
+final List fields = new ArrayList<>();
+fields.add(new RecordField("id", RecordFieldType.INT.getDataType()));
+fields.add(new RecordField("name", 
RecordFieldType.STRING.getDataType()));
+fields.add(new RecordField("balance", balanceDataType));
+fields.add(new RecordField("address", 
RecordFieldType.STRING.getDataType()));
+fields.add(new RecordField("city", 
RecordFieldType.STRING.getDataType()));
+fields.add(new RecordField("state", 
RecordFieldType.STRING.getDataType()));
+fields.add(new RecordField("zipCode", 
RecordFieldType.STRING.getDataType()));
+fields.add(new RecordField("country", 
RecordFieldType.STRING.getDataType()));
+return fields;
+}
+
+private RecordSchema getAccountSchema() {
+final List accountFields = new ArrayList<>();
+accountFields.add(new RecordField("id", 
RecordFieldType.INT.getDataType()));
+accountFields.add(new RecordField("balance", 
RecordFieldType.DOUBLE.getDataType()));
+
+return new SimpleRecordSchema(accountFields);
+}
+
+@Test
+void