Hi Mark, I forgot to share the NiFi version we are using:
1.8.0
10/22/2018 23:48:30 EDT
Tagged nifi-1.8.0-RC3


Thanks//Regards,
Emanuel Oliveira
Senior Oracle/Data Engineer | CTG | Galway
TEL ext: 353 – (0)91-74  4971 | int: 8-737 4971 |  who's 
who<http://fidelitycentral.fmr.com/ww/a639704> 

From: Emanuel Oliveira <emanu...@gmail.com>
Sent: Thursday 5 December 2019 22:42
To: users@nifi.apache.org
Subject: Re: NiFi ValidateRecord - unable to handle missing mandatory ARRAY ?

Hi Mark, be sure to copy and paste the "NOK - payload BAD 1" payload into 
GenerateFlowFile, as this is the one that shows the problem.

Cheers,
Emanuel

On Thu 5 Dec 2019, 22:03 Mark Payne, <marka...@hotmail.com> wrote:
Emanuel,

What version of NiFi are you using?

I just tested the attached template against the latest, and the FlowFile was 
routed to 'invalid' with the explanation:

Records in this FlowFile were invalid for the following reasons: The following 
1 fields were missing: [[0]/Records/eventVersion]




Thanks
-Mark



On Dec 5, 2019, at 7:06 AM, Oliveira, Emanuel <emanuel.olive...@fmr.com> wrote:

Hi all,

I have been struggling to find a way to make ValidateRecord, using an Avro 
schema, enforce the presence of an array in a JSON payload. The problem is 
that if the array “Records” is missing, ValidateRecord still considers the 
FlowFile valid ☹.
--objective - Mandatory to have the "Records" array with at least "eventVersion"
- using ValidateRecord > Allow Extra Fields
- the problem I'm facing is that NiFi doesn't flag payload BAD 1 as invalid!!

How can I make the Records array mandatory? Is it possible?

I know I could use SplitJson with JsonPath Expression=$.Records to get rid 
of the array, and also to fail if the array "Records" is not present. But I 
would like a clean solution using just the Avro schema. Is this possible?
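
[Editor's note] The SplitJson fallback described above can be pictured with a small plain-Python sketch. It is not NiFi code: `split_records` is a hypothetical helper, and the "split"/"failure" labels only mirror the behaviour the author describes (split the "Records" array, fail when it is absent).

```python
import json


def split_records(flowfile_text):
    """Hypothetical stand-in for the SplitJson step with JsonPath $.Records."""
    payload = json.loads(flowfile_text)
    records = payload.get("Records") if isinstance(payload, dict) else None
    if not isinstance(records, list):
        # "Records" is absent or is not an array: treat it like a failure route.
        return "failure", [flowfile_text]
    # One output per array element, serialized back to JSON text.
    return "split", [json.dumps(item) for item in records]


route, outputs = split_records('{"RecordsXXX": [{"eventVersion": "aaa"}]}')
print(route)  # failure
```

This only checks that the array exists; it does not check that each element carries "eventVersion", which is why a schema-based check is still attractive.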



--OK - payload GOOD
{
   "Service": "sssssss",
   "Event": "eeeee",
   "Time": "2019-11-25T16:21:53.280Z",
   "Bucket": "bbb-bbbbb-bbb-bbbbb-bbbbbb",
   "RequestId": "RRRRRRRRRRRRRRRRRR",
   "HostId": "hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh",
   "Records": [{
         "eventVersion": "aaa"
      }
   ]
}

--NOK - payload BAD 1 - missing "Records" array --> BUT 
VALIDATERECORD/AVRO SCHEMA SENDS THE FF TO “valid”!! I want it to be sent to 
“invalid” since it is not compliant with my Avro schema, which requires both 
the array “Records” and its element “eventVersion”.
{
   "Service": "sssssss",
   "Event": "eeeee",
   "Time": "2019-11-25T16:21:53.280Z",
   "Bucket": "bbb-bbbbb-bbb-bbbbb-bbbbbb",
   "RequestId": "RRRRRRRRRRRRRRRRRR",
   "HostId": "hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh",
   "RecordsXXX": [{
         "eventVersion": "aaa"
      }
   ]
}

--OK - payload BAD 2 - "Records" array present but missing "eventVersion"
{
   "Service": "sssssss",
   "Event": "eeeee",
   "Time": "2019-11-25T16:21:53.280Z",
   "Bucket": "bbb-bbbbb-bbb-bbbbb-bbbbbb",
   "RequestId": "RRRRRRRRRRRRRRRRRR",
   "HostId": "hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh",
   "Records": [{
         "eventVersionXX": "aaa"
      }
   ]
}

It's a very simple test flow (I attached the XML template 
ValidateRecord_missing_mandatory_ARRAY_is_VALID_problem.xml) just using 
ValidateRecord with JsonTreeReader/JsonRecordSetWriter:
<image001.png>


Here are the ValidateRecord processor and its reader/writer controllers:

  *   Avro schema with just the array “Records”, and “eventVersion” as the 
minimum field in each array element.
  *   Using Allow Extra Fields = true:

     *   So I'm OK with having other fields at the root, side by side with the 
array “Records”, and also OK with extra fields inside each array element.
     *   FYI: in the real use case I'm trying to validate an AWS SQS message (S3 
trigger) where I will be interested in several fields, but I crafted this 
simpler example just to ask whether it's possible to force the array to be 
mandatory and to contain at least 1 element.
==========================================================

--ValidateRecord 1.8.0
Record Reader                           JsonTreeReader
Record Writer                           JsonRecordSetWriter
Record Writer for Invalid Records
Schema Access Strategy                  Use Reader's Schema
Schema Registry                         No value set
Schema Name                             ${schema.name}
Schema Text                             ${avro.schema}
Allow Extra Fields                      true
Strict Type Checking                    true

--JsonTreeReader 1.8.0 - MANDATORY TO HAVE "Records" ARRAY + "eventVersion" on 
each ARRAY element
Schema Access Strategy                  Use 'Schema Text' Property
Schema Registry
Schema Name                             ${schema.name}
Schema Version
Schema Branch
Schema Text
                                        {
                                           "name": "MyName",
                                           "type": "record",
                                           "namespace": "aa.bb.cc",
                                           "fields": [{
                                                 "name": "Records",
                                                 "type": {
                                                    "type": "array",
                                                    "items": {
                                                       "name": "Records_record",
                                                       "type": "record",
                                                       "fields": [{
                                                             "name": "eventVersion",
                                                             "type": "string"
                                                          }
                                                       ]
                                                    }
                                                 }
                                              }
                                           ]
                                        }
Date Format
Time Format
Timestamp Format

--JsonRecordSetWriter 1.8.0
Schema Write Strategy                   Do Not Write Schema
Schema Access Strategy                  Inherit Record Schema
Schema Registry
Schema Name                             ${schema.name}
Schema Version
Schema Branch
Schema Text                             { "name": "eventVersion", "type": "string" }
Date Format
Time Format
Timestamp Format
Pretty Print JSON                       true
Suppress Null Values                    Never Suppress
Output Grouping                         Array
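
[Editor's note] To make the expected outcome concrete, here is a minimal plain-Python sketch of the rule the schema above is meant to express. It is not how NiFi's ValidateRecord is implemented. The helpers `is_optional` and `missing_fields` are hypothetical, the path strings are illustrative, and the sketch assumes that a field is required unless it declares a default or its type is a union containing "null". Extra fields are ignored, mirroring Allow Extra Fields = true.

```python
import json

# The Avro schema from the JsonTreeReader "Schema Text" property above.
SCHEMA = json.loads("""
{
  "name": "MyName", "type": "record", "namespace": "aa.bb.cc",
  "fields": [{
    "name": "Records",
    "type": {
      "type": "array",
      "items": {
        "name": "Records_record", "type": "record",
        "fields": [{"name": "eventVersion", "type": "string"}]
      }
    }
  }]
}
""")


def is_optional(field):
    """Assumption: optional means a default is declared or the type is a union with "null"."""
    ftype = field["type"]
    return "default" in field or (isinstance(ftype, list) and "null" in ftype)


def missing_fields(schema, value, path=""):
    """Return paths of required fields that are absent (or not the expected container).

    Only record and array schemas are walked; primitive types are not checked.
    """
    if isinstance(schema, dict) and schema.get("type") == "record":
        if not isinstance(value, dict):
            return [path or "/"]
        missing = []
        for field in schema["fields"]:
            child = f"{path}/{field['name']}"
            if field["name"] not in value:
                if not is_optional(field):
                    missing.append(child)
            else:
                missing += missing_fields(field["type"], value[field["name"]], child)
        return missing
    if isinstance(schema, dict) and schema.get("type") == "array":
        if not isinstance(value, list):
            return [path]
        missing = []
        for i, item in enumerate(value):
            missing += missing_fields(schema["items"], item, f"{path}[{i}]")
        return missing
    return []


# Shortened versions of the three payloads from this message.
good = {"Service": "sssssss", "Records": [{"eventVersion": "aaa"}]}
bad_1 = {"Service": "sssssss", "RecordsXXX": [{"eventVersion": "aaa"}]}
bad_2 = {"Service": "sssssss", "Records": [{"eventVersionXX": "aaa"}]}

print(missing_fields(SCHEMA, good))   # []
print(missing_fields(SCHEMA, bad_1))  # ['/Records']
print(missing_fields(SCHEMA, bad_2))  # ['/Records[0]/eventVersion']
```

Under this reading, payload BAD 1 should be reported as invalid. Note that an empty "Records": [] passes this sketch, since the Avro array type as used here declares only its item type and no minimum length, so the "at least 1 element" requirement would likely need a separate check.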

Thanks in advance,
Emanuel Oliveira

<ValidateRecord_missing_mandatory_ARRAY_is_VALID_problem.xml>
