Collecting operators' real output cardinalities as JSON files

Francesco Ventura Sat, 23 May 2020 02:31:47 -0700

Hi everybody, 

I would like to collect the statistics and the real output cardinalities of 
many job executions as JSON files. I know that a REST interface exists for 
this, but I was looking for something simpler. In practice, I would like to 
take the information shown in the WebUI at runtime for a job and store it as a 
file. I am currently using env.getExecutionPlan() to get the execution plan of 
a job, but it includes only the estimated cardinalities for each operator, and 
it can only be called before env.execute().


Is there a similar way to extract the real output cardinalities of each 
pipeline after the execution? 
Is there a place where the Flink cluster stores the history of executed jobs? 
Is developing a REST client the only way to extract such information? 
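
For comparison, the REST-client route could be as small as the following 
Python sketch, which runs outside the job and so adds no counters to it. It 
assumes that the JobManager REST endpoint is reachable (by default on port 
8081), that GET /jobs/<jobid> returns a "vertices" array, and that each vertex 
carries a "metrics" object with a "write-records" field. The function names 
are hypothetical and chosen only for illustration.

```python
import json
import urllib.request


def fetch_job_details(base_url, job_id):
    """GET <base_url>/jobs/<job_id> and return the decoded JSON body.

    Assumption: base_url points at the Flink REST endpoint,
    e.g. "http://localhost:8081".
    """
    with urllib.request.urlopen(f"{base_url}/jobs/{job_id}") as resp:
        return json.load(resp)


def output_cardinalities(job_details):
    """Map each vertex name to its 'write-records' metric.

    Assumption: the payload has a 'vertices' list whose items carry a
    'metrics' dict. Missing metrics are reported as None.
    """
    return {
        vertex.get("name"): vertex.get("metrics", {}).get("write-records")
        for vertex in job_details.get("vertices", [])
    }


def save_cardinalities(job_details, path):
    """Write the per-vertex output cardinalities to a JSON file."""
    with open(path, "w") as f:
        json.dump(output_cardinalities(job_details), f, indent=2)
```

Calling fetch_job_details() after the job has finished and passing the result 
to save_cardinalities() would produce one JSON file per job. Whether finished 
jobs are still served by the endpoint depends on the cluster setup.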

I would also like to avoid adding counters to the job source code, since I am 
monitoring the runtime execution and should avoid anything that could 
interfere with it.

Maybe this is a trivial problem, but I had a quick look around and could not 
find the solution.

Thank you very much,

Francesco
