Marcio Sugar created NIFI-11791:
-----------------------------------

             Summary: PutBigQuery processor lacks functionality found in 
PutBigQueryBatch
                 Key: NIFI-11791
                 URL: https://issues.apache.org/jira/browse/NIFI-11791
             Project: Apache NiFi
          Issue Type: Improvement
          Components: Extensions
    Affects Versions: 1.22.0, 2.0.0
            Reporter: Marcio Sugar


Before PutBigQuery, we had PutBigQueryBatch and PutBigQueryStream, both now 
deprecated. Not sure if PutBigQuery was designed to completely replace its 
older brothers, but it cannot do that yet because of some missing features. For 
example, we can't use PutBigQuery alone to create snapshot tables, something 
that was easy to do with PutBigQueryBatch. 

A snapshot table is a recent copy of a table from a database or a subset of 
rows/columns of a table. It is used to dynamically replicate data between 
distributed databases. Using PutBigQueryBatch, we can achieve that by setting 
the following properties:
 * Create Disposition = CREATE_IF_NEEDED
 * Write Disposition = WRITE_TRUNCATE
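
For illustration, the two dispositions above correspond to fields of a load job 
in the BigQuery REST API (jobs.insert). The following is a minimal sketch of 
such a request body, not the processor's actual implementation. The function 
name, the project, dataset and table names, and the source URI are hypothetical 
placeholders.

```python
import json


def build_load_job_body(project_id, dataset_id, table_id, source_uris):
    """Build a request body for a BigQuery load job (REST jobs.insert).

    All argument values are supplied by the caller; nothing here talks to
    BigQuery, it only assembles the JSON structure.
    """
    return {
        "configuration": {
            "load": {
                "sourceUris": list(source_uris),
                "sourceFormat": "NEWLINE_DELIMITED_JSON",
                # A new table needs a schema; autodetect is one way to get it.
                "autodetect": True,
                "destinationTable": {
                    "projectId": project_id,
                    "datasetId": dataset_id,
                    "tableId": table_id,
                },
                # Create the destination table if it does not exist yet.
                "createDisposition": "CREATE_IF_NEEDED",
                # Replace any existing rows, which yields a fresh snapshot.
                "writeDisposition": "WRITE_TRUNCATE",
            }
        }
    }


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    body = build_load_job_body(
        "my-project", "my_dataset", "my_snapshot", ["gs://my-bucket/export.json"]
    )
    print(json.dumps(body, indent=2))
```

Each run with WRITE_TRUNCATE overwrites the previous contents of the table, 
which is what makes the snapshot pattern work.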

I understand that PutBigQuery uses the newer [BigQuery Storage Write 
API|https://cloud.google.com/bigquery/docs/write-api], so adding the missing 
functionality might not be possible. 

But please note the older BigQuery (core) API (the one I believe 
PutBigQueryBatch uses) allows the user to submit jobs to load data into 
BigQuery in a very convenient way. That is something I'd like to see preserved 
in future versions of NiFi.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
