zzlbuaa commented on issue #5539: [AIRFLOW-4811] Implement GCP DLP Hook and 
Operators
URL: https://github.com/apache/airflow/pull/5539#issuecomment-510246198
 
 
   @ryanyuan @mik-laj create_dlp_job is an asynchronous call: it returns while 
the job is still pending or running, and the job can take seconds, minutes, or 
hours to finish depending on the size of the inspected data. One feature we 
want is a RunDlpJobOperator that creates a dlpJob and keeps polling its status 
via get_dlp_job until the job is done, canceled, or failed. Do you have any 
suggestions on where the polling loop should live: in a hook function or in 
the operator?
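   
   To make the question concrete, here is a minimal sketch of the kind of 
polling loop being discussed. It is an illustration only, not the code in the 
commits linked in this comment. The names wait_for_dlp_job and get_job_state, 
the poll_interval and timeout parameters, and the example job name are all 
hypothetical; get_job_state stands in for a thin wrapper around get_dlp_job 
that returns the job's state as a string. The state names are assumed to 
follow the DLP API's JobState enum.

```python
import time

# Terminal states of a DlpJob, assumed to match the DLP API's JobState names.
SUCCESS_STATE = "DONE"
FAILURE_STATES = {"CANCELED", "FAILED"}


def wait_for_dlp_job(get_job_state, job_name, poll_interval=5.0,
                     timeout=None, sleep=time.sleep):
    """Poll a job until it reaches a terminal state and return that state.

    get_job_state is a hypothetical callable (job_name -> state string)
    standing in for a wrapper around the hook's get_dlp_job call.
    """
    waited = 0.0
    while True:
        state = get_job_state(job_name)
        if state == SUCCESS_STATE:
            return state
        if state in FAILURE_STATES:
            raise RuntimeError(
                "DLP job {} ended in state {}".format(job_name, state))
        # Still PENDING or RUNNING: give up only if a timeout was requested.
        if timeout is not None and waited >= timeout:
            raise TimeoutError(
                "DLP job {} still {} after {}s".format(job_name, state, waited))
        sleep(poll_interval)
        waited += poll_interval


# Usage with a fake state source instead of a real API call.
fake_states = iter(["PENDING", "RUNNING", "RUNNING", "DONE"])
result = wait_for_dlp_job(
    lambda name: next(fake_states),
    "projects/example-project/dlpJobs/example-job",  # hypothetical job name
    poll_interval=0,
    sleep=lambda seconds: None,  # skip real sleeping in the example
)
print(result)
```

   Because the loop only needs a way to read the job state, it can sit in a 
hook function and be called from the operator's execute method, or be inlined 
in the operator; the sketch does not settle that question.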
   
   I have that operator in my implementation, and it does the actual looping in 
a hook function:
   The hook function: 
https://github.com/apache/airflow/pull/5531/commits/e0a43280c6208eee526839a06a7d3b0dc44f8489#diff-86a3d285c23e0963485636db3bc28fe1R298
   The RunDlpJobOperator: 
https://github.com/apache/airflow/pull/5531/commits/e0a43280c6208eee526839a06a7d3b0dc44f8489#diff-bedc38985a98c3b915ead387ecc351e2
   Does it look good to you?
   
   For more info about dlpJob: 
https://cloud.google.com/dlp/docs/inspecting-storage
   
   cc @criccomini @whynick1 
