Hi everybody, 

I would like to collect the statistics and the actual output cardinalities from 
the execution of many jobs as JSON files. I know that a REST interface exists 
and could be used, but I was looking for something simpler. In practice, I would 
like to take the information shown in the WebUI at runtime about a job and 
store it as a file. I am currently using env.getExecutionPlan() to get the 
execution plan of a job with the estimated cardinalities for each operator. 
However, it includes only estimated cardinalities, and it can be called only 
before env.execute(). 

Is there a similar way to extract the actual output cardinalities of each 
pipeline after the execution? 
Is there a place where the Flink cluster stores the history of executed jobs? 
Is developing a REST client to extract this information the only option? 
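
For reference, the REST route I am hoping to avoid would look roughly like the 
sketch below. It is only a sketch under assumptions: I assume the job details 
are served at /jobs/<jobid> on the JobManager web port (8081 in my setup) and 
that the job ID is already known. The class and method names (JobStatsDump, 
jobUrl, dumpJob) are made up for illustration.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Files;
import java.nio.file.Path;

public class JobStatsDump {

    // Builds the job-details URL. The "/jobs/<jobid>" path is an assumption
    // about the monitoring REST interface; adjust it to your Flink version.
    static String jobUrl(String baseUrl, String jobId) {
        String trimmed = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        return trimmed + "/jobs/" + jobId;
    }

    // Fetches the JSON description of one job and stores it as <jobId>.json
    // inside outDir. Returns the path of the written file.
    static Path dumpJob(String baseUrl, String jobId, Path outDir)
            throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest
                .newBuilder(URI.create(jobUrl(baseUrl, jobId)))
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status " + response.statusCode());
        }
        Files.createDirectories(outDir);
        return Files.writeString(outDir.resolve(jobId + ".json"), response.body());
    }

    // Usage: java JobStatsDump <baseUrl> <jobId> <outputDir>
    public static void main(String[] args) throws Exception {
        Path written = dumpJob(args[0], args[1], Path.of(args[2]));
        System.out.println("Wrote " + written);
    }
}
```

This runs outside the job itself, so it would not interfere with the 
execution, but it still means writing and maintaining a separate client. 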

I would also like to avoid adding counters to the job source code, since I am 
monitoring the runtime execution and should avoid anything that could 
interfere with it.

Maybe this is a trivial problem, but I had a quick look around and could not 
find a solution.

Thank you very much,

Francesco
