Hi Community! Some users may want to protect their sensitive data using tokenization. We propose creating a Beam example template that demonstrates a Beam transform for protecting sensitive data via tokenization. In our example, we will use an external service for the data tokenization.
At a high level, the pipeline will:
* support batch (GCS) and streaming (Pub/Sub) input sources
* tokenize sensitive data via an external REST service (we plan to use Protegrity)
* output tokenized data to BigQuery or Bigtable

I created JIRA ticket BEAM-11322<https://issues.apache.org/jira/browse/BEAM-11322> to describe this proposal and capture feedback. More details and the proposed design are available in the design doc<https://docs.google.com/document/d/1fnsUfGpCx8A_MBchPRvlm4gU0Ai5EQNSiZS1mg_A_zg/edit?usp=sharing>.

I welcome community feedback and comments on this Beam data tokenization template proposal.

Thanks,
Artur Khanin
Akvelon, Inc.
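To illustrate the per-record tokenization step at the heart of the pipeline, here is a minimal, hedged sketch in plain Python. The field names and the `tokenize_value` stub are hypothetical: a real Beam DoFn would POST each sensitive value to the external REST tokenization service (e.g. Protegrity) instead of hashing locally, and the record would flow from the GCS/Pub/Sub source toward the BigQuery or Bigtable sink.

```python
import hashlib

# Hypothetical set of sensitive field names; the real template would
# take these from pipeline configuration.
SENSITIVE_FIELDS = {"ssn", "email"}

def tokenize_value(value: str) -> str:
    # Stub standing in for the external REST tokenization call.
    # A real implementation would send the value to the tokenization
    # service and return the token from the HTTP response.
    return "tok_" + hashlib.sha256(value.encode("utf-8")).hexdigest()[:16]

def tokenize_record(record: dict) -> dict:
    # Replace only the sensitive fields; pass everything else through.
    return {
        key: tokenize_value(val) if key in SENSITIVE_FIELDS else val
        for key, val in record.items()
    }

record = {"name": "Alice", "ssn": "123-45-6789"}
tokenized = tokenize_record(record)
print(tokenized["name"])              # non-sensitive field unchanged
print(tokenized["ssn"].startswith("tok_"))
```

In the actual template this logic would live inside a DoFn applied with `beam.ParDo`, so the same transform works unchanged for both the batch and streaming branches of the pipeline.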
