[jira] [Commented] (NIFI-11627) Add Dynamic Schema References to ValidateJSON Processor
[ https://issues.apache.org/jira/browse/NIFI-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17800641#comment-17800641 ]

ASF subversion and git services commented on NIFI-11627:

Commit 6bca79cb3720395352bfa587449422e09bec0233 in nifi's branch refs/heads/main from dan-s1
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=6bca79cb37 ]

NIFI-11627 Added JsonSchemaRegistry for ValidateJson

- Added nifi-json-schema-api to nifi-commons
- Added StandardJsonSchemaRegistry implementation of JsonSchemaRegistry
- Added strategy configuration properties to ValidateJson

This closes #8005

Signed-off-by: David Handermann

> Add Dynamic Schema References to ValidateJSON Processor
> ---
>
> Key: NIFI-11627
> URL: https://issues.apache.org/jira/browse/NIFI-11627
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Extensions
> Affects Versions: 1.19.1
> Reporter: Chuck Tilly
> Assignee: Daniel Stieglitz
> Priority: Major
> Labels: backport-needed
> Time Spent: 9h
> Remaining Estimate: 0h
>
> For the ValidateJSON processor, add support for flowfile attribute references
> that will allow for a JSON schema located in the Parameter Contexts, to be
> referenced dynamically based on a flowfile attribute. e.g.
> {code:java}
> #{${schema.name}} {code}
>
> The benefits of adding support for attribute references are significant.
> Adding this capability will allow a single processor to be used for all JSON
> schema validation. Unfortunately, the current version of this processor
> requires a dedicated processor for every schema, i.e. 12 schemas requires 12
> ValidateJSON processors. This is very laborious to construct and maintain,
> and resource expensive.
> ValidateJSON processor (https://issues.apache.org/jira/browse/NIFI-7392)

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (NIFI-11627) Add Dynamic Schema References to ValidateJSON Processor
[ https://issues.apache.org/jira/browse/NIFI-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17782356#comment-17782356 ]

David Handermann commented on NIFI-11627:

Thanks for the update [~dstiegli1]. One important thing to note for a new Controller Service interface is that the interface design should be decoupled from the networknt JsonSchema class itself. Although that library seems to be one of the best available implementations, the ideal interface definition would avoid coupling the design to that particular class.
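As a rough illustration of the decoupling described in this comment, the sketch below keeps the service contract free of any validator-library type: the contract hands back plain schema text, and the consumer decides how to parse it. Every name here (`SchemaDefinition`, `SchemaLookup`, `SchemaConsumer`) is hypothetical and chosen for illustration only; none of it is taken from the NiFi codebase or from the networknt library.

```java
import java.util.Objects;
import java.util.function.Function;

/** Hypothetical library-neutral holder: schema name and raw text only. */
final class SchemaDefinition {
    private final String name;
    private final String text;

    SchemaDefinition(String name, String text) {
        this.name = Objects.requireNonNull(name, "name");
        this.text = Objects.requireNonNull(text, "text");
    }

    String name() {
        return name;
    }

    String text() {
        return text;
    }
}

/** Hypothetical service contract; no validator-library class appears in it. */
interface SchemaLookup {
    SchemaDefinition retrieve(String name);
}

/**
 * Hypothetical consumer. The parser function is the only place where a
 * concrete library (networknt or any other) would be referenced.
 */
final class SchemaConsumer<P> {
    private final SchemaLookup lookup;
    private final Function<String, P> parser;

    SchemaConsumer(SchemaLookup lookup, Function<String, P> parser) {
        this.lookup = Objects.requireNonNull(lookup, "lookup");
        this.parser = Objects.requireNonNull(parser, "parser");
    }

    /** Looks up the schema text by name and parses it on the consumer side. */
    P load(String name) {
        return parser.apply(lookup.retrieve(name).text());
    }
}
```

With this shape, swapping the validation library would change only the function passed to `SchemaConsumer`, not the lookup contract.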
[jira] [Commented] (NIFI-11627) Add Dynamic Schema References to ValidateJSON Processor
[ https://issues.apache.org/jira/browse/NIFI-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17781509#comment-17781509 ]

Daniel Stieglitz commented on NIFI-11627:

[~exceptionfactory] I tried to see whether I could map com.networknt.schema.JsonSchema to the RecordSchema interface, but nothing stood out that could make that happen. I am also not sure what the benefit would be, as the JSON schema would have to be reconstituted in order to instantiate a com.networknt.schema.JsonSchema for validation. In the interim, I have chosen the second route and created a new Controller Service interface. I have tried to start small and have developed an in-memory registry much like the AvroSchemaRegistry, except that it stores JSON schemas. I have some more testing to do, but I am hoping to push a PR soon.
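For readers unfamiliar with the pattern, an in-memory registry of this kind can be pictured as a name-to-text map. The sketch below is a minimal stand-alone model under that assumption; `InMemoryJsonSchemaStore` and its methods are hypothetical names and do not reflect the code in the eventual PR.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory store mapping a schema name to its JSON Schema text. */
final class InMemoryJsonSchemaStore {
    private final Map<String, String> schemaTextByName = new ConcurrentHashMap<>();

    /** Registers schema text under a name, replacing any earlier entry. */
    void register(String name, String schemaText) {
        if (name == null || name.isBlank()) {
            throw new IllegalArgumentException("Schema name is required");
        }
        if (schemaText == null || schemaText.isBlank()) {
            throw new IllegalArgumentException("Schema text is required");
        }
        schemaTextByName.put(name, schemaText);
    }

    /** Returns the schema text for the name, or empty when none is registered. */
    Optional<String> find(String name) {
        return Optional.ofNullable(name).map(schemaTextByName::get);
    }
}
```

A processor could then read the schema name from a flowfile attribute such as `schema.name` and pass it to `find`, routing the flowfile to a failure relationship when the result is empty.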
[jira] [Commented] (NIFI-11627) Add Dynamic Schema References to ValidateJSON Processor
[ https://issues.apache.org/jira/browse/NIFI-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17781175#comment-17781175 ]

David Handermann commented on NIFI-11627:

[~dstiegli1] I think the next step is to take a step back and evaluate the {{RecordSchema}} interface definition in the nifi-record module. Some aspects should map to JSON Schema constructs, but there are some JSON Schema capabilities that do not map to the current interface. That's where a new interface will likely be necessary. That may also require a new Controller Service interface, but I would start with evaluating the Record Schema interface first.
[jira] [Commented] (NIFI-11627) Add Dynamic Schema References to ValidateJSON Processor
[ https://issues.apache.org/jira/browse/NIFI-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17780049#comment-17780049 ]

Daniel Stieglitz commented on NIFI-11627:

[~exceptionfactory] Digging a little deeper, I realize I cannot even use the ConfluentSchemaRegistry and the AmazonGlueSchemaRegistry, since they parse the text retrieved from the registry as an Avro schema. I found:
* ConfluentSchemaRegistry uses RestSchemaRegistryClient, which in method createRecordSchema on line 216 parses the retrieved text as an Avro schema
* AmazonGlueSchemaRegistry uses GlueSchemaRegistryClient, which in method createRecordSchema on line 98 parses the retrieved text as an Avro schema

That is a real shame, as it would seem most of the code for retrieving a schema of any kind is there. The only problem is how the retrieved text is parsed. Where should I go from here?
[jira] [Commented] (NIFI-11627) Add Dynamic Schema References to ValidateJSON Processor
[ https://issues.apache.org/jira/browse/NIFI-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17780036#comment-17780036 ]

David Handermann commented on NIFI-11627:

[~dstiegli1] Yes, that is a limitation of the current design. This seems like a case where a new interface may be necessary. Spending some additional time to evaluate which elements of JSON Schema map to the existing RecordSchema, and which do not, would be a good starting point. One approach might involve a way to capture additional aspects of JSON Schema in a different interface. I am familiar with JSON Schema capabilities, but I have not gone through a careful comparison against NiFi Record Schema to recommend a particular approach thus far.
[jira] [Commented] (NIFI-11627) Add Dynamic Schema References to ValidateJSON Processor
[ https://issues.apache.org/jira/browse/NIFI-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17780017#comment-17780017 ]

Daniel Stieglitz commented on NIFI-11627:

Currently, though, SchemaAccessStrategy is the only way I can get schemas from providers such as AmazonGlueSchemaRegistry and ConfluentSchemaRegistry, which also support storing JSON schemas.
[jira] [Commented] (NIFI-11627) Add Dynamic Schema References to ValidateJSON Processor
[ https://issues.apache.org/jira/browse/NIFI-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17780006#comment-17780006 ]

David Handermann commented on NIFI-11627:

In the case of only populating the {{schemaText}} and not populating the fields, I don't think reusing the SchemaAccessStrategy is the right approach. The purpose of the SchemaAccessStrategy and RecordSchema is to provide a basic definition with field information, where the schema text provides supporting information.
[jira] [Commented] (NIFI-11627) Add Dynamic Schema References to ValidateJSON Processor
[ https://issues.apache.org/jira/browse/NIFI-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17779980#comment-17779980 ]

Daniel Stieglitz commented on NIFI-11627:

For the anonymous SchemaAccessStrategy, I am only populating the {{schemaText}} property of the {{SimpleRecordSchema}} to hold the JSON Schema definition.
[jira] [Commented] (NIFI-11627) Add Dynamic Schema References to ValidateJSON Processor
[ https://issues.apache.org/jira/browse/NIFI-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17779971#comment-17779971 ]

David Handermann commented on NIFI-11627:

Thanks for the additional background [~dstiegli1]. I may not be following correctly, so perhaps breaking it down a bit more would be helpful. For clarification, you are evaluating using the {{schemaText}} property of the {{SimpleRecordSchema}} to hold the JSON Schema definition? Are you also populating the {{fields}} property of {{SimpleRecordSchema}}, or just populating the {{schemaText}}?
[jira] [Commented] (NIFI-11627) Add Dynamic Schema References to ValidateJSON Processor
[ https://issues.apache.org/jira/browse/NIFI-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17779929#comment-17779929 ]

Daniel Stieglitz commented on NIFI-11627:

[~exceptionfactory] I would like to clarify that I am not actually creating a SchemaRegistry. I am trying to integrate the use of one. As a result, I need to use SchemaAccessStrategy, which involves returning the schema in a RecordSchema object. Since the schema is now accessed via a SchemaAccessStrategy, I have had to make an anonymous SchemaAccessStrategy to encapsulate the current functionality of obtaining the JSON schema from a resource. That is where I stored the contents of the JSON schema in the text section of SimpleRecordSchema. I assume (but have not tested yet) that the various Schema Registry implementations also return the actual text of the schema. Hence I would like to take advantage of that and build the necessary JSON Schema object for validation. I have been able to reuse much of the code found in SchemaAccessUtils to accomplish the integration of a SchemaRegistry. Please let me know if there is still an issue with this design. Thanks!
[jira] [Commented] (NIFI-11627) Add Dynamic Schema References to ValidateJSON Processor
[ https://issues.apache.org/jira/browse/NIFI-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17779657#comment-17779657 ]

David Handermann commented on NIFI-11627:

Thanks for the reply [~dstiegli1]. I do not recommend that approach unless the implementation is fully compatible with the SchemaRegistry interface.
[jira] [Commented] (NIFI-11627) Add Dynamic Schema References to ValidateJSON Processor
[ https://issues.apache.org/jira/browse/NIFI-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17779656#comment-17779656 ]

Daniel Stieglitz commented on NIFI-11627:

[~exceptionfactory] I wasn't planning to actually use the RecordSchema for validating. I was going to use it more as a means to obtain the JSON schema. From what I see, the SimpleRecordSchema can hold the text of the original schema, which is what I could use to build the necessary JSON schema object.
[jira] [Commented] (NIFI-11627) Add Dynamic Schema References to ValidateJSON Processor
[ https://issues.apache.org/jira/browse/NIFI-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17779640#comment-17779640 ]

David Handermann commented on NIFI-11627:

[~dstiegli1] implementing the {{SchemaRegistry}} interface would be useful, but the NiFi RecordSchema lacks certain features that JSON Schema provides. So for this particular Jira issue, that would not address the use case. Supporting more JSON Schema features would definitely be useful, and that is a related question.
[jira] [Commented] (NIFI-11627) Add Dynamic Schema References to ValidateJSON Processor
[ https://issues.apache.org/jira/browse/NIFI-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1825#comment-1825 ]

Daniel Stieglitz commented on NIFI-11627:

[~markap14] [~exceptionfactory] As for the initial implementation of this Controller Service interface, would an in-memory map similar to the AvroSchemaRegistry be okay?
[jira] [Commented] (NIFI-11627) Add Dynamic Schema References to ValidateJSON Processor
[ https://issues.apache.org/jira/browse/NIFI-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17774947#comment-17774947 ]

Mark Payne commented on NIFI-11627:

[~nwchuckster] yes, the approach that you mentioned there, which uses {{evaluateELString}}, was an oversight that has since been patched. Anywhere that you're using that, you'll want to fix it, because in more recent versions it will fail.
[jira] [Commented] (NIFI-11627) Add Dynamic Schema References to ValidateJSON Processor
[ https://issues.apache.org/jira/browse/NIFI-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17771313#comment-17771313 ]

Chuck Tilly commented on NIFI-11627:

Hi [~markap14]

{quote}Making use of a Controller Service would work well. We could have a Controller Service that allows user-added properties where the values are JSON Schemas, and then allow ValidateJson to be configured with a Controller Service and take in the name of the schema, which would allow for Expression Language to be used. So I believe that would give you exactly what you're looking for, [~nwchuckster], no? {quote}

Absolutely!!! That would be fantastic. Of course that is my preference, so I look forward to that.

{quote}...you cannot use Expression Language within a Parameter Context because parameters' values are resolved before the processor ever even has access to the property value. So, if you were to enter #{${schema.name}} what would happen is that NiFi would resolve that to a parameter named ${schema.name} and the processor would be invalid, before it ever had any chance to even evaluate Expression Language. {quote}

My experience is that you can use them together with the UpdateAttribute processor, where I use this technique in several places. For example, I use the following expression in an UpdateAttribute rule to dynamically generate the correct URL string as an attribute for each flowfile.

{{${#{'Toadol End Point Service'}:evaluateELString()}}}

{quote}There are security policies that guard who is allowed to reference parameters, etc. and allowing dynamic creation of parameter names would violate the security constraints. {quote}

I was not aware of this. Is this true even if this process is contained in its own process group with its own parameter context specific to only this context?
[jira] [Commented] (NIFI-11627) Add Dynamic Schema References to ValidateJSON Processor
[ https://issues.apache.org/jira/browse/NIFI-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17770634#comment-17770634 ]

David Handermann commented on NIFI-11627:

Thanks for the reply [~nwchuckster], and thanks to [~markap14] for summarizing the optimal approach. As Mark highlighted, the general concept for selecting a schema makes sense, but using the Parameter Context will not work for the reasons mentioned, and Parameter Contexts are not intended to provide a general purpose registry of schemas. I agree with Mark's suggestion that providing a new Controller Service interface for accessing JSON Schemas would meet the use case and provide a configurable way forward. As JSON Schema has grown as a common way to express format validation requirements, this would be an excellent improvement.
[jira] [Commented] (NIFI-11627) Add Dynamic Schema References to ValidateJSON Processor
[ https://issues.apache.org/jira/browse/NIFI-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17770614#comment-17770614 ] Mark Payne commented on NIFI-11627: --- It does make sense to allow for a reference to be stored in a FlowFile attribute and then reference it in much the same way that you would when looking up a schema in a SchemaRegistry. But Parameter Contexts are not the right approach. Making use of a Controller Service would work well. We could have a Controller Service that allows user-added properties where the values are JSON Schemas, and then allow ValidateJson to be configured with a Controller Service and take in the name of the schema, which would allow for Expression Language to be used. So I believe that would give you exactly what you're looking for, [~nwchuckster], no? Parameter Contexts won't work here for two reasons. Firstly, you cannot use Expression Language within a Parameter Context because parameters' values are resolved before the processor ever even has access to the property value. So, if you were to enter #{${schema.name}} what would happen is that NiFi would resolve that to a parameter named ${schema.name} and the processor would be invalid, before it ever had any chance to even evaluate Expression Language. Secondly, we never allow creating a String value and then evaluating it to get a parameter due to security concerns. There are security policies that guard who is allowed to reference parameters, etc. and allowing dynamic creation of parameter names would violate the security constraints. 
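The first reason given above, that parameter references are resolved before Expression Language is evaluated, can be illustrated with a minimal sketch. This is not NiFi code: the class name, the method name, and the simplified whole-value parsing of #{...} are assumptions made for illustration only, and "blue-cars" is just an example parameter name.

```java
import java.util.Map;

// Hypothetical illustration only; NiFi's real parameter parsing is more involved.
final class ResolutionOrderSketch {

    // Configuration-time step: treat the whole property value as a parameter
    // reference of the form #{name} and look the name up literally.
    // No FlowFile exists yet, so ${...} inside the name is never evaluated.
    static String resolveParameter(String propertyValue, Map<String, String> parameters) {
        if (propertyValue.startsWith("#{") && propertyValue.endsWith("}")) {
            String parameterName = propertyValue.substring(2, propertyValue.length() - 1);
            String value = parameters.get(parameterName);
            if (value == null) {
                // Corresponds to the processor being invalid in the comment above.
                throw new IllegalStateException("No parameter named " + parameterName);
            }
            return value;
        }
        return propertyValue;
    }

    public static void main(String[] args) {
        Map<String, String> parameters = Map.of("blue-cars", "{\"type\":\"object\"}");
        try {
            resolveParameter("#{${schema.name}}", parameters);
        } catch (IllegalStateException e) {
            // Prints: No parameter named ${schema.name}
            System.out.println(e.getMessage());
        }
    }
}
```

In this sketch the lookup fails because the literal text ${schema.name} is taken as the parameter name, which matches the behavior described in the comment.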
[jira] [Commented] (NIFI-11627) Add Dynamic Schema References to ValidateJSON Processor
[ https://issues.apache.org/jira/browse/NIFI-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17770593#comment-17770593 ] Chuck Tilly commented on NIFI-11627: [~exceptionfactory] A specific use-case would be every single time you want to validate your data, which is always. Hence the need for this processor. But storing the schema in the attributes is not the solution; you are right about that. Rather, it would be better if the flow file contained a +reference+ to its schema. The schema itself is stored as a value in the Parameter Contexts, and the ValidateJSON Processor uses this schema to perform the validation. This is exactly the same paradigm that the AvroSchemaRegistry uses, except the schemas are stored in a Controller Service instead of a Parameter Context (the principle is the same though). So assuming you have a flowfile with an attribute named "schema.name" (e.g. schema.name = blue-cars), then the syntax for referencing a schema stored as a Parameter Context value would be: JSON Schema = #{${schema.name}} Within the Parameter Contexts there would be a value "blue-cars", and it would contain the JSON schema for blue-cars. The benefits are: 1) Only a single ValidateJSON processor is needed to perform all the validations. This allows for clean and simple flows that are easy to manage, and consume fewer resources. 2) This approach is consistent with how schema validation is done in NiFi using schema registries. It is a well established pattern in NiFi. 
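The reference pattern described in the comment above (the FlowFile carries only a schema name, and the schema text lives elsewhere) might be sketched as follows. This is an assumption-laden illustration, not NiFi's API: the class and method names are hypothetical, and a plain in-memory map stands in for whatever actually stores the schemas.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for a schema store; not a NiFi class.
final class JsonSchemaLookupSketch {

    // Schema name -> JSON Schema text.
    private final Map<String, String> schemasByName = new LinkedHashMap<>();

    void register(String name, String schemaText) {
        schemasByName.put(name, schemaText);
    }

    // Finds the schema named by the FlowFile's "schema.name" attribute, if any.
    Optional<String> findSchema(Map<String, String> flowFileAttributes) {
        String name = flowFileAttributes.get("schema.name");
        if (name == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(schemasByName.get(name));
    }

    public static void main(String[] args) {
        JsonSchemaLookupSketch registry = new JsonSchemaLookupSketch();
        registry.register("blue-cars", "{\"type\":\"object\",\"required\":[\"color\"]}");

        // The FlowFile holds only the reference, never the schema itself.
        Map<String, String> attributes = Map.of("schema.name", "blue-cars");
        System.out.println(registry.findSchema(attributes).orElse("no schema registered"));
    }
}
```

Because the attribute holds only a short name, one validator can serve many schemas without large schema text travelling in FlowFile attributes.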
[jira] [Commented] (NIFI-11627) Add Dynamic Schema References to ValidateJSON Processor
[ https://issues.apache.org/jira/browse/NIFI-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17770584#comment-17770584 ] David Handermann commented on NIFI-11627: - Thanks for the reply [~dstiegli1]. Supporting FlowFile attributes works well for JoltTransformJSON as it allows for minor adjustments in transformation. However, JSON schema validation is a different use case where the schema should not change for each FlowFile. For this reason, supporting FlowFile attributes in ValidateJson does not appear to be the best way forward.
[jira] [Commented] (NIFI-11627) Add Dynamic Schema References to ValidateJSON Processor
[ https://issues.apache.org/jira/browse/NIFI-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17770581#comment-17770581 ] Daniel Stieglitz commented on NIFI-11627: - [~exceptionfactory] I do not have a specific use case for a schema in an attribute but I thought to align ValidateJson with JoltTransformJSON.
[jira] [Commented] (NIFI-11627) Add Dynamic Schema References to ValidateJSON Processor
[ https://issues.apache.org/jira/browse/NIFI-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17770567#comment-17770567 ] David Handermann commented on NIFI-11627: - [~dstiegli1] Can you describe the use case for providing the Schema in an attribute? That would require the FlowFile to include its own Schema, which has some utility, but could be problematic: a JSON Schema can be large, resulting in potential memory issues related to FlowFile attribute sizing.
[jira] [Commented] (NIFI-11627) Add Dynamic Schema References to ValidateJSON Processor
[ https://issues.apache.org/jira/browse/NIFI-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17766371#comment-17766371 ] Daniel Stieglitz commented on NIFI-11627: - [~exceptionfactory] I understand a FlowFile attribute cannot be used as a dynamic reference to a Parameter Context value, but could we still allow for specifying the schema in an attribute, much like JoltTransformJSON allows for specifying the Jolt spec either in an attribute or from a resource? This would allow for the use of one instance of ValidateJson.
[jira] [Commented] (NIFI-11627) Add Dynamic Schema References to ValidateJSON Processor
[ https://issues.apache.org/jira/browse/NIFI-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17765864#comment-17765864 ] David Handermann commented on NIFI-11627: - The Jolt JSON Processors support Expression Language in the Jolt Specification, but using a FlowFile attribute as a dynamic reference to a Parameter Context value is not supported. Schema validation should also be fairly stable, so another solution would be to have several ValidateJson Processors and route to the appropriate one based on a FlowFile attribute value.
[jira] [Commented] (NIFI-11627) Add Dynamic Schema References to ValidateJSON Processor
[ https://issues.apache.org/jira/browse/NIFI-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17765250#comment-17765250 ] Daniel Stieglitz commented on NIFI-11627: - [~exceptionfactory] Can you please weigh in on what the NiFi best practice is? It seems JoltTransformJSON allows for specifying a Jolt spec inside an attribute, while [~nwchuckster] argues this seems to be a NiFi anti-pattern. Thanks!
[jira] [Commented] (NIFI-11627) Add Dynamic Schema References to ValidateJSON Processor
[ https://issues.apache.org/jira/browse/NIFI-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763836#comment-17763836 ] Chuck Tilly commented on NIFI-11627: This seems like an anti-pattern since the prescribed way of performing validation in NiFi is through the use of a schema registry. Using a hack like the Jolt spec is undocumented, and inconsistent with best practices for data validation in NiFi.
[jira] [Commented] (NIFI-11627) Add Dynamic Schema References to ValidateJSON Processor
[ https://issues.apache.org/jira/browse/NIFI-11627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763804#comment-17763804 ] Daniel Stieglitz commented on NIFI-11627: - JoltTransformJSON with the changes in NIFI-4957 is a good example of a processor which supports both the use of flowfile attributes and resources for JSON (in that case the Jolt spec).