Hi Flink Community,
We are deploying Flink on Kubernetes (standalone) with Istio service mesh and 
encountered an issue where the Flink UI shows "Loading..." indefinitely instead 
of displaying metrics data.
Root Cause: After investigation, we found that Istio was blocking connections 
because Flink allocates the metrics.internal.query-service.port dynamically by 
default.
Our Solution: We resolved this by:

  1. Setting a static port: metrics.internal.query-service.port: 50009
  2. Configuring Istio to exclude/bypass port 50009 from the service mesh

This fixed the issue, and the Flink UI now displays metrics correctly.
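
For reference, a minimal sketch of the two changes. The Flink key is the one 
named above. The Istio part assumes the exclusion is done with the sidecar 
traffic annotations on the JobManager and TaskManager pods; the surrounding 
pod metadata layout is illustrative and will differ per deployment.

```yaml
# Flink configuration (flink-conf.yaml): pin the query service port
# instead of letting Flink pick one dynamically.
metrics.internal.query-service.port: 50009
---
# Pod template metadata for the JobManager and TaskManager pods (assumed layout).
# These annotations tell the Istio sidecar not to intercept traffic on 50009.
metadata:
  annotations:
    traffic.sidecar.istio.io/excludeInboundPorts: "50009"
    traffic.sidecar.istio.io/excludeOutboundPorts: "50009"
```

With this in place, traffic on port 50009 bypasses the sidecar entirely, so it 
is not covered by mesh mTLS or authorization policies.
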
Security Question: From our understanding, metrics.internal.query-service.port 
is used to expose metrics internally from TaskManagers to the JobManager, which 
then serves them through the REST API that powers the Flink UI.
Before we deploy this to production, we need confirmation from a security 
perspective:
Does this port expose ONLY metrics (JVM stats, checkpoint info, counters, 
etc.), or could it potentially expose actual processing data (the 
records/events being processed by tasks) either directly or indirectly?
We want to ensure that excluding this port from Istio doesn't create a security 
risk by inadvertently exposing business data flowing through Flink tasks.
Other Information:

  * Flink version: 1.19.1
  * Deployment: standalone
  * Our understanding is that this port serves the internal metrics query 
    service for monitoring purposes only

Any clarification or documentation references would be greatly appreciated!
Thank you!
Harsh
