Hi Mark, I forgot to share the NiFi version we are using: 1.8.0 (10/22/2018 23:48:30 EDT, tagged nifi-1.8.0-RC3).
Thanks//Regards,
Emanuel Oliveira
Senior Oracle/Data Engineer | CTG | Galway
TEL ext: 353 – (0)91-74 4971 | int: 8-737 4971 | who's who <http://fidelitycentral.fmr.com/ww/a639704>

From: Emanuel Oliveira <emanu...@gmail.com>
Sent: Thursday 5 December 2019 22:42
To: users@nifi.apache.org
Subject: Re: NiFi ValidateRecord - unable to handle missing mandatory ARRAY ?

Hi Mark, be sure you copy and paste "NOK - payload BAD 1 - " into GenerateFlowFile, as this is the payload that shows the problem.

Cheers,
Emanuel

On Thu 5 Dec 2019, 22:03 Mark Payne, <marka...@hotmail.com> wrote:

Emanuel,

What version of NiFi are you using? I just tested the attached template against the latest, and the FlowFile was routed to 'invalid' with the explanation:

Records in this FlowFile were invalid for the following reasons: The following 1 fields were missing: [[0]/Records/eventVersion]

Thanks
-Mark

On Dec 5, 2019, at 7:06 AM, Oliveira, Emanuel <emanuel.olive...@fmr.com> wrote:

Hi all,

I have been struggling to find a way for ValidateRecord, using an Avro schema, to make the presence of an array in the JSON payload mandatory. The problem is that if the array “Records” is missing, ValidateRecord still considers the FlowFile valid ☹.

--objective
- Mandatory to have the "Records" array with at least "eventVersion"
- Using ValidateRecord > Allow Extra Fields
- The problem I'm facing is that NiFi doesn't flag payload BAD 1 as invalid!!

How can I make the Records array mandatory? Is it possible? I know I can eventually use SplitJson with JsonPath Expression=$.Records to get rid of the array, and also to fail if the array "Records" is not present. But I would like a clean solution using just the Avro schema. Is this possible?

--OK - payload GOOD
{
  "Service": "sssssss",
  "Event": "eeeee",
  "Time": "2019-11-25T16:21:53.280Z",
  "Bucket": "bbb-bbbbb-bbb-bbbbb-bbbbbb",
  "RequestId": "RRRRRRRRRRRRRRRRRR",
  "HostId": "hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh",
  "Records": [{ "eventVersion": "aaa" }]
}

--NOK - payload BAD 1 - missing "Records" array --> BUT VALIDATERECORD/AVROSCHEMA SENDS FF TO “valid”!!
I want it to be sent to “invalid”, since it is not compliant with my Avro schema, which requires two mandatory things: the array “Records” and the element “eventVersion”.
{
  "Service": "sssssss",
  "Event": "eeeee",
  "Time": "2019-11-25T16:21:53.280Z",
  "Bucket": "bbb-bbbbb-bbb-bbbbb-bbbbbb",
  "RequestId": "RRRRRRRRRRRRRRRRRR",
  "HostId": "hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh",
  "RecordsXXX": [{ "eventVersion": "aaa" }]
}

--OK - payload BAD 2 - "Records" array present but missing "eventVersion"
{
  "Service": "sssssss",
  "Event": "eeeee",
  "Time": "2019-11-25T16:21:53.280Z",
  "Bucket": "bbb-bbbbb-bbb-bbbbb-bbbbbb",
  "RequestId": "RRRRRRRRRRRRRRRRRR",
  "HostId": "hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh",
  "Records": [{ "eventVersionXX": "aaa" }]
}

It's a very simple test flow (I attached the XML template ValidateRecord_missing_mandatory_ARRAY_is_VALID_problem.xml) just using ValidateRecord with JsonTreeReader/JsonRecordSetWriter:

<image001.png>

Here are the ValidateRecord processor and the reader/writer controller services:

* Avro schema with just the array “Records” and “eventVersion” as the minimum field on each array element.
* Using Allow Extra Fields true:
  * So I'm OK with having other fields at the root, side by side with the array “Records”, and also OK with extra fields inside each array element.
* FYI: in the real use case I'm trying to validate an AWS SQS message (S3 trigger) where I will be interested in several fields, but I crafted this simpler example just to ask whether it's possible to force the array to be mandatory and to have at least 1 element.

==========================================================
--ValidateRecord 1.8.0
Record Reader: JsonTreeReader
Record Writer: JsonRecordSetWriter
Record Writer for Invalid Records:
Schema Access Strategy: Use Reader's Schema
Schema Registry: No value set
Schema Name: ${schema.name}
Schema Text: ${avro.schema}
Allow Extra Fields: true
Strict Type Checking: true

--JsonTreeReader 1.8.0 - MANDATORY TO HAVE "Records" ARRAY + "eventVersion" on each ARRAY element
Schema Access Strategy: Use 'Schema Text' Property
Schema Registry:
Schema Name: ${schema.name}
Schema Version:
Schema Branch:
Schema Text:
{
  "name": "MyName",
  "type": "record",
  "namespace": "aa.bb.cc",
  "fields": [{
    "name": "Records",
    "type": {
      "type": "array",
      "items": {
        "name": "Records_record",
        "type": "record",
        "fields": [{ "name": "eventVersion", "type": "string" }]
      }
    }
  }]
}
Date Format:
Time Format:
Timestamp Format:

--JsonRecordSetWriter 1.8.0
Schema Write Strategy: Do Not Write Schema
Schema Access Strategy: Inherit Record Schema
Schema Registry:
Schema Name: ${schema.name}
Schema Version:
Schema Branch:
Schema Text: { "name": "eventVersion", "type": "string" }
Date Format:
Time Format:
Timestamp Format:
Pretty Print JSON: true
Suppress Null Values: Never Suppress
Output Grouping: Array

Thanks in advance,
Emanuel Oliveira

<ValidateRecord_missing_mandatory_ARRAY_is_VALID_problem.xml>
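
For reference, the rule this thread is asking ValidateRecord to enforce can be written down outside NiFi as a small standalone check. The Python sketch below is only an illustration of the intended outcome for the three payloads above (shortened to the relevant fields); it does not use or reproduce any NiFi or Avro API, and the function name `find_problems` and its message wording are hypothetical.

```python
import json


def find_problems(payload):
    """Return a list of problems; an empty list means the payload passes.

    Hypothetical helper, not NiFi behaviour. Extra fields at the root or
    inside array elements are ignored, mirroring "Allow Extra Fields = true".
    """
    problems = []
    records = payload.get("Records")
    # Rule 1: the "Records" array itself is mandatory.
    if not isinstance(records, list):
        problems.append("missing mandatory array: Records")
        return problems
    # Rule 2: the array must contain at least one element.
    if not records:
        problems.append("Records array is empty")
    # Rule 3: every element must carry a string "eventVersion".
    for i, element in enumerate(records):
        if not isinstance(element, dict) or not isinstance(element.get("eventVersion"), str):
            problems.append(f"Records[{i}] is missing string field: eventVersion")
    return problems


# Shortened versions of the three payloads from this thread.
good = json.loads('{"Service": "sssssss", "Records": [{"eventVersion": "aaa"}]}')
bad_1 = json.loads('{"Service": "sssssss", "RecordsXXX": [{"eventVersion": "aaa"}]}')
bad_2 = json.loads('{"Service": "sssssss", "Records": [{"eventVersionXX": "aaa"}]}')

for name, payload in [("GOOD", good), ("BAD 1", bad_1), ("BAD 2", bad_2)]:
    print(name, find_problems(payload) or "valid")
```

Under this sketch, payload GOOD passes, while BAD 1 and BAD 2 are both rejected, which is the routing the original question expects from ValidateRecord.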