[ https://issues.apache.org/jira/browse/NIFI-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Julien G. updated NIFI-11402:
-----------------------------
    Affects Version/s: 1.18.0

> PutBigQuery processor case sensitive and Append Record Count issues
> -------------------------------------------------------------------
>
>                 Key: NIFI-11402
>                 URL: https://issues.apache.org/jira/browse/NIFI-11402
>             Project: Apache NiFi
>          Issue Type: Bug
>    Affects Versions: 1.18.0
>            Reporter: Julien G.
>            Priority: Major
>
> The {{PutBigQuery}} processor seems to have some issues. I found two issues that can be quite blocking.
> For the first one, if you set a high value for the {{Append Record Count}} property (500,000 in my case) and you have a large FlowFile (in both record count and size; in my case 54,000 records for a size of 74 MB), you will get an error because the message to send is too big. That part is expected:
> {code:java}
> PutBigQuery[id=16da3694-c886-3b31-929e-0dc81be51bf7] Stream processing failed: java.lang.RuntimeException: io.grpc.StatusRuntimeException: INVALID_ARGUMENT: MessageSize is too large. Max allow: 10000000 Actual: 13593340
> - Caused by: io.grpc.StatusRuntimeException: INVALID_ARGUMENT: MessageSize is too large. Max allow: 10000000 Actual: 13593340
> {code}
> So you replace the value with a smaller one, but the error message remains the same. Even if you reduce your FlowFile to a single record, you will still get the error. The only way to fix this is to delete the processor, re-add it, and reduce the value of the property before running it. There seems to be a bug here.
> It would also be helpful to document the limit on the size of the message sent, because the limit in the previous {{PutBigQueryStreaming}} and {{PutBigQueryBatch}} implementations was quite straightforward and tied to the size of the file sent. Now the limit applies to the {{Message}}, which doesn't directly correspond to the size of the FlowFile or the number of records in it.
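For reference, the 10,000,000-byte figure in the error above is the per-request limit the Storage Write API enforces on an append. A minimal sketch of how a safe {{Append Record Count}} could be derived from that limit and the average serialized record size of the FlowFile (the class and method names here are hypothetical, not NiFi code):

```java
// Hypothetical helper: derive a batch size that keeps each AppendRows
// request under the ~10 MB limit reported in the error message.
public class AppendBatchSizer {
    // Per-request size limit, taken from the error above ("Max allow: 10000000").
    static final long MAX_APPEND_BYTES = 10_000_000L;

    // Estimate how many records fit in one append request, given the
    // total serialized size and record count of the FlowFile.
    static int safeRecordCount(long flowFileBytes, int recordCount) {
        if (recordCount <= 0) {
            return 0;
        }
        long avgRecordBytes = Math.max(1, flowFileBytes / recordCount);
        long fit = MAX_APPEND_BYTES / avgRecordBytes;
        return (int) Math.min(fit, recordCount);
    }

    public static void main(String[] args) {
        // The FlowFile from the report: 54,000 records totalling ~74 MB.
        // Each record averages ~1,370 bytes, so about 7,299 fit per append.
        System.out.println(safeRecordCount(74_000_000L, 54_000));
    }
}
```

This is only an estimate (record sizes vary and the proto encoding adds overhead), but it shows why 500,000 records per append cannot work for records of this size.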
> The second issue occurs if you use upper case in your field names. For example, you have a table with the following schema:
> {code:java}
> timestamp        | TIMESTAMP | REQUIRED
> original_payload | STRING    | NULLABLE
> error_message    | STRING    | REQUIRED
> error_type       | STRING    | REQUIRED
> error_subType    | STRING    | REQUIRED
> {code}
> and try to put the following event in it:
> {code:java}
> {
>   "original_payload" : "XXXXXXXX",
>   "error_message" : "XXXXXX",
>   "error_type" : "XXXXXXXXXX",
>   "error_subType" : "XXXXXXXXXXX",
>   "timestamp" : "2023-04-07T10:31:45Z"
> }
> {code}
> (in my case this event was in Avro)
> You will get the following error telling you that the required field {{error_subtype}} is missing:
> {code:java}
> Cannot convert record to message: com.google.protobuf.UninitializedMessageException: Message missing required fields: error_subtype
> {code}
> So to fix it, you need to change your Avro schema and put {{error_subtype}} instead of {{error_subType}} in it.
> BigQuery column names aren't case sensitive, so it should be fine to use upper case in a field name, but it isn't. In the previous {{PutBigQueryStreaming}} and {{PutBigQueryBatch}} implementations, we were able to use upper case in the schema fields, so it should still be possible.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
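One possible shape for a fix to the case-sensitivity issue, sketched as a hypothetical standalone helper (the class and method names are illustrative, not actual NiFi code): match incoming record field names against the table's column names case-insensitively before the proto message is built, so {{error_subType}} resolves to the column {{error_subtype}}.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: since BigQuery column names are case-insensitive,
// map each record field to the exact column name from the table schema,
// comparing names without regard to case.
public class FieldNameMapper {
    static Map<String, Object> remap(Map<String, Object> record,
                                     Iterable<String> schemaColumns) {
        // Index schema columns by their lower-cased name.
        Map<String, String> lowerToColumn = new HashMap<>();
        for (String col : schemaColumns) {
            lowerToColumn.put(col.toLowerCase(), col);
        }
        // Rewrite each field key to the schema's spelling when a
        // case-insensitive match exists; otherwise keep the key as-is.
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : record.entrySet()) {
            String col = lowerToColumn.get(e.getKey().toLowerCase());
            out.put(col != null ? col : e.getKey(), e.getValue());
        }
        return out;
    }
}
```

With such a mapping applied before message construction, the Avro schema from the report could keep {{error_subType}} and still populate the required {{error_subtype}} column.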