Hi Community!

Some users may want to protect their sensitive data using tokenization.
We propose to create a Beam example template that will demonstrate Beam 
transform to protect sensitive data using tokenization. In our example, we will 
use an external service for the data tokenization.

At a high level, the pipeline will:

  *   support batch (GCS) and streaming (Pub/Sub) input sources
  *   tokenize sensitive data via an external REST service (we plan to use 
Protegrity)
  *   output tokenized data to BigQuery or Bigtable
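The per-record tokenization step could look roughly like the following Python sketch. This is only an illustration of the idea, not the proposed implementation: the endpoint URL, request/response shape, and field names are invented assumptions, not the actual Protegrity API, and in the real template this logic would live inside a Beam DoFn.

```python
import json
import urllib.request


def tokenize_record(record, sensitive_fields, post=None):
    """Replace sensitive fields in a record with tokens obtained from an
    external REST tokenization service.

    `post` is injectable so the HTTP call can be stubbed in tests; the
    default POSTs JSON to a hypothetical endpoint (not a real service).
    """
    def default_post(values):
        # Hypothetical request/response shape -- the real service's API
        # will differ and should be taken from its documentation.
        req = urllib.request.Request(
            "https://tokenization.example.com/tokenize",
            data=json.dumps({"values": values}).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.loads(resp.read())["tokens"]

    post = post or default_post
    values = [record[field] for field in sensitive_fields]
    tokens = post(values)
    # Return a new record with sensitive values swapped for tokens;
    # non-sensitive fields pass through unchanged.
    return {**record, **dict(zip(sensitive_fields, tokens))}
```

In the actual template this function would be wrapped in a Beam DoFn, ideally batching several records per request to amortize the per-element HTTP overhead.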


I created JIRA ticket 
BEAM-11322<https://issues.apache.org/jira/browse/BEAM-11322> to describe this 
proposal and capture feedback. More details and the proposed design are 
available in the design 
doc<https://docs.google.com/document/d/1fnsUfGpCx8A_MBchPRvlm4gU0Ai5EQNSiZS1mg_A_zg/edit?usp=sharing>.

I welcome community feedback and comments on this Beam data tokenization 
template proposal.

Thanks,
Artur Khanin
Akvelon, Inc
