Re: How does Flink plugin system work?

2023-01-02 Thread Matthias Pohl via user
Yes, Ruibin confirmed in a private message that using the factory class
works. But thanks for digging into it once more Yanfei. I missed to
consider in my previous message that the plugin classes are loaded using
their own class loaders which, indeed, can result in a
ClassNotFoundException being thrown.

Best,
Matthias

On Tue, Jan 3, 2023 at 4:45 AM Yanfei Lei  wrote:

> Hi Ruibin,
>
> "metrics.reporter.prom.class" is deprecated in 1.16, maybe "
> metrics.reporter.prom.factory.class"[1] can solve your problem.
> After reading the related code[2], I think the root cause is that  "
> metrics.reporter.prom.class" would load the code via flink's classpath
> instead of MetricReporterFactory, due to "Plugins cannot access classes
> from other plugins or from Flink that have not been specifically
> whitelisted"[3], so ClassNotFoundException is thrown.
>
> [1]
> https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/deployment/metric_reporters/#prometheus
> [2]
> https://github.com/apache/flink/blob/release-1.16/flink-runtime/src/main/java/org/apache/flink/runtime/metrics/ReporterSetup.java#L457
> [3]
> https://nightlies.apache.org/flink/flink-docs-stable/docs/deployment/filesystems/plugins/
>
> Matthias Pohl via user  于2023年1月2日周一 20:27写道:
>
>> Hi Ruibin,
>> could you switch to using the currently supported way for instantiating
>> reporters using the factory configuration parameter [1][2]?
>>
>> Based on the ClassNotFoundException, your suspicion might be right that
>> the plugin didn't make it onto the classpath. Could you share the
>> startup logs of the JM and TMs. That might help getting a bit more context
>> on what's going on. Your approach on integrating the reporter through the
>> plugin system [3] sounds about right as far as I can see.
>>
>> Matthias
>>
>> [1]
>> https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/metric_reporters/#factory-class
>> [2]
>> https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/metric_reporters/#prometheus
>> [3]
>> https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/filesystems/plugins/
>>
>> On Fri, Dec 30, 2022 at 11:42 AM Ruibin Xing  wrote:
>>
>>> Hi community,
>>>
>>> I am having difficulty understanding the Flink plugin system. I am
>>> attempting to enable the Prometheus exporter with the official Flink image
>>> 1.16.0, but I am experiencing issues with library dependencies. According
>>> to the plugin documentation (
>>> https://nightlies.apache.org/flink/flink-docs-stable/docs/deployment/filesystems/plugins/),
>>> as long as the library is located in the /opt/flink/plugins/
>>> directory, Flink should automatically load it, similar to how it loads
>>> libraries in the /opt/flink/lib directory. However, Flink does not seem to
>>> detect the plugin.
>>>
>>> Here is the directory structure for /opt/flink:
>>> > tree /opt/flink
>>> .
>>> 
>>> ├── plugins
>>> │   ├── metrics-prometheus
>>> │   │   └── flink-metrics-prometheus-1.16.0.jar
>>> ...
>>>
>>> And here is the related Flink configuration:
>>> > metrics.reporter.prom.class:
>>> org.apache.flink.metrics.prometheus.PrometheusReporter
>>>
>>> The error logs in the task manager show the following:
>>> 2022-12-30 10:03:55,840 WARN
>>>  org.apache.flink.runtime.metrics.ReporterSetup   [] - The
>>> reporter configuration of 'prom' configures the reporter class, which is a
>>> deprecated approach to configure reporters. Please configure a factory
>>> class instead: 'metrics.reporter.prom.factory.class: ' to
>>> ensure that the configuration continues to work with future versions.
>>> 2022-12-30 10:03:55,841 ERROR
>>> org.apache.flink.runtime.metrics.ReporterSetup   [] - Could not
>>> instantiate metrics reporter prom. Metrics might not be exposed/reported.
>>> java.lang.ClassNotFoundException:
>>> org.apache.flink.metrics.prometheus.PrometheusReporter
>>> at jdk.internal.loader.BuiltinClassLoader.loadClass(Unknown Source)
>>> ~[?:?]
>>> at jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(Unknown
>>> Source) ~[?:?]
>>> at java.lang.ClassLoader.loadClass(Unknown Source) ~[?:?]
>>> at java.lang.Class.forName0(Native Method) ~[?:?]
>>> at java.lang.Class.forName(Unknown Source) ~[?:?]
>>> at
>>> org.apache.flink.runtime.metrics.ReporterSetup.loadViaReflection(ReporterSetup.java:456)
>>> ~[flink-dist-1.16.0.jar:1.16.0]
>>> at
>>> org.apache.flink.runtime.metrics.ReporterSetup.loadReporter(ReporterSetup.java:409)
>>> ~[flink-dist-1.16.0.jar:1.16.0]
>>> at
>>> org.apache.flink.runtime.metrics.ReporterSetup.setupReporters(ReporterSetup.java:328)
>>> ~[flink-dist-1.16.0.jar:1.16.0]
>>> at
>>> org.apache.flink.runtime.metrics.ReporterSetup.fromConfiguration(ReporterSetup.java:209)
>>> ~[flink-dist-1.16.0.jar:1.16.0]
>>>
>>> The Java commands for Flink process:
>>> flink  1  3.0  4.6 2168308 765936 ?  Ssl  10:03   1:08
>>> /opt/java/openjdk/bin/java -XX:+UseG1GC -Xmx697932173 -Xms697932173
>>> -XX:MaxDirectMemorySize=3006

Re: How does Flink plugin system work?

2023-01-02 Thread Yanfei Lei
Hi Ruibin,

"metrics.reporter.prom.class" is deprecated in 1.16, maybe "
metrics.reporter.prom.factory.class"[1] can solve your problem.
After reading the related code[2], I think the root cause is that  "
metrics.reporter.prom.class" would load the code via flink's classpath
instead of MetricReporterFactory, due to "Plugins cannot access classes
from other plugins or from Flink that have not been specifically
whitelisted"[3], so ClassNotFoundException is thrown.

[1]
https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/deployment/metric_reporters/#prometheus
[2]
https://github.com/apache/flink/blob/release-1.16/flink-runtime/src/main/java/org/apache/flink/runtime/metrics/ReporterSetup.java#L457
[3]
https://nightlies.apache.org/flink/flink-docs-stable/docs/deployment/filesystems/plugins/

Matthias Pohl via user  于2023年1月2日周一 20:27写道:

> Hi Ruibin,
> could you switch to using the currently supported way for instantiating
> reporters using the factory configuration parameter [1][2]?
>
> Based on the ClassNotFoundException, your suspicion might be right that
> the plugin didn't make it onto the classpath. Could you share the
> startup logs of the JM and TMs. That might help getting a bit more context
> on what's going on. Your approach on integrating the reporter through the
> plugin system [3] sounds about right as far as I can see.
>
> Matthias
>
> [1]
> https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/metric_reporters/#factory-class
> [2]
> https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/metric_reporters/#prometheus
> [3]
> https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/filesystems/plugins/
>
> On Fri, Dec 30, 2022 at 11:42 AM Ruibin Xing  wrote:
>
>> Hi community,
>>
>> I am having difficulty understanding the Flink plugin system. I am
>> attempting to enable the Prometheus exporter with the official Flink image
>> 1.16.0, but I am experiencing issues with library dependencies. According
>> to the plugin documentation (
>> https://nightlies.apache.org/flink/flink-docs-stable/docs/deployment/filesystems/plugins/),
>> as long as the library is located in the /opt/flink/plugins/
>> directory, Flink should automatically load it, similar to how it loads
>> libraries in the /opt/flink/lib directory. However, Flink does not seem to
>> detect the plugin.
>>
>> Here is the directory structure for /opt/flink:
>> > tree /opt/flink
>> .
>> 
>> ├── plugins
>> │   ├── metrics-prometheus
>> │   │   └── flink-metrics-prometheus-1.16.0.jar
>> ...
>>
>> And here is the related Flink configuration:
>> > metrics.reporter.prom.class:
>> org.apache.flink.metrics.prometheus.PrometheusReporter
>>
>> The error logs in the task manager show the following:
>> 2022-12-30 10:03:55,840 WARN
>>  org.apache.flink.runtime.metrics.ReporterSetup   [] - The
>> reporter configuration of 'prom' configures the reporter class, which is a
>> deprecated approach to configure reporters. Please configure a factory
>> class instead: 'metrics.reporter.prom.factory.class: ' to
>> ensure that the configuration continues to work with future versions.
>> 2022-12-30 10:03:55,841 ERROR
>> org.apache.flink.runtime.metrics.ReporterSetup   [] - Could not
>> instantiate metrics reporter prom. Metrics might not be exposed/reported.
>> java.lang.ClassNotFoundException:
>> org.apache.flink.metrics.prometheus.PrometheusReporter
>> at jdk.internal.loader.BuiltinClassLoader.loadClass(Unknown Source) ~[?:?]
>> at jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(Unknown
>> Source) ~[?:?]
>> at java.lang.ClassLoader.loadClass(Unknown Source) ~[?:?]
>> at java.lang.Class.forName0(Native Method) ~[?:?]
>> at java.lang.Class.forName(Unknown Source) ~[?:?]
>> at
>> org.apache.flink.runtime.metrics.ReporterSetup.loadViaReflection(ReporterSetup.java:456)
>> ~[flink-dist-1.16.0.jar:1.16.0]
>> at
>> org.apache.flink.runtime.metrics.ReporterSetup.loadReporter(ReporterSetup.java:409)
>> ~[flink-dist-1.16.0.jar:1.16.0]
>> at
>> org.apache.flink.runtime.metrics.ReporterSetup.setupReporters(ReporterSetup.java:328)
>> ~[flink-dist-1.16.0.jar:1.16.0]
>> at
>> org.apache.flink.runtime.metrics.ReporterSetup.fromConfiguration(ReporterSetup.java:209)
>> ~[flink-dist-1.16.0.jar:1.16.0]
>>
>> The Java commands for Flink process:
>> flink  1  3.0  4.6 2168308 765936 ?  Ssl  10:03   1:08
>> /opt/java/openjdk/bin/java -XX:+UseG1GC -Xmx697932173 -Xms697932173
>> -XX:MaxDirectMemorySize=300647712 -XX:MaxMetaspaceSize=268435456
>> -Dlog.file=/opt/flink/log/flink--kubernetes-taskmanager-0-checkpoint-ha-example-taskmanager-1-1.log
>> -Dlog4j.configuration=file:/opt/flink/conf/log4j-console.properties
>> -Dlog4j.configurationFile=file:/opt/flink/conf/log4j-console.properties
>> -Dlogback.configurationFile=file:/opt/flink/conf/logback-console.xml
>> -classpath
>> /opt/flink/lib/flink-cep-1.16.0.jar:/opt/flink/lib/flink-connector-files-1.16.0.jar:/opt/flink/lib/flink-csv-1.16.0.j

Re: How does Flink plugin system work?

2023-01-02 Thread Matthias Pohl via user
Hi Ruibin,
could you switch to using the currently supported way for instantiating
reporters using the factory configuration parameter [1][2]?

Based on the ClassNotFoundException, your suspicion might be right that the
plugin didn't make it onto the classpath. Could you share the startup logs
of the JM and TMs. That might help getting a bit more context on what's
going on. Your approach on integrating the reporter through the plugin
system [3] sounds about right as far as I can see.

Matthias

[1]
https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/metric_reporters/#factory-class
[2]
https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/metric_reporters/#prometheus
[3]
https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/filesystems/plugins/

On Fri, Dec 30, 2022 at 11:42 AM Ruibin Xing  wrote:

> Hi community,
>
> I am having difficulty understanding the Flink plugin system. I am
> attempting to enable the Prometheus exporter with the official Flink image
> 1.16.0, but I am experiencing issues with library dependencies. According
> to the plugin documentation (
> https://nightlies.apache.org/flink/flink-docs-stable/docs/deployment/filesystems/plugins/),
> as long as the library is located in the /opt/flink/plugins/
> directory, Flink should automatically load it, similar to how it loads
> libraries in the /opt/flink/lib directory. However, Flink does not seem to
> detect the plugin.
>
> Here is the directory structure for /opt/flink:
> > tree /opt/flink
> .
> 
> ├── plugins
> │   ├── metrics-prometheus
> │   │   └── flink-metrics-prometheus-1.16.0.jar
> ...
>
> And here is the related Flink configuration:
> > metrics.reporter.prom.class:
> org.apache.flink.metrics.prometheus.PrometheusReporter
>
> The error logs in the task manager show the following:
> 2022-12-30 10:03:55,840 WARN
>  org.apache.flink.runtime.metrics.ReporterSetup   [] - The
> reporter configuration of 'prom' configures the reporter class, which is a
> deprecated approach to configure reporters. Please configure a factory
> class instead: 'metrics.reporter.prom.factory.class: ' to
> ensure that the configuration continues to work with future versions.
> 2022-12-30 10:03:55,841 ERROR
> org.apache.flink.runtime.metrics.ReporterSetup   [] - Could not
> instantiate metrics reporter prom. Metrics might not be exposed/reported.
> java.lang.ClassNotFoundException:
> org.apache.flink.metrics.prometheus.PrometheusReporter
> at jdk.internal.loader.BuiltinClassLoader.loadClass(Unknown Source) ~[?:?]
> at jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(Unknown
> Source) ~[?:?]
> at java.lang.ClassLoader.loadClass(Unknown Source) ~[?:?]
> at java.lang.Class.forName0(Native Method) ~[?:?]
> at java.lang.Class.forName(Unknown Source) ~[?:?]
> at
> org.apache.flink.runtime.metrics.ReporterSetup.loadViaReflection(ReporterSetup.java:456)
> ~[flink-dist-1.16.0.jar:1.16.0]
> at
> org.apache.flink.runtime.metrics.ReporterSetup.loadReporter(ReporterSetup.java:409)
> ~[flink-dist-1.16.0.jar:1.16.0]
> at
> org.apache.flink.runtime.metrics.ReporterSetup.setupReporters(ReporterSetup.java:328)
> ~[flink-dist-1.16.0.jar:1.16.0]
> at
> org.apache.flink.runtime.metrics.ReporterSetup.fromConfiguration(ReporterSetup.java:209)
> ~[flink-dist-1.16.0.jar:1.16.0]
>
> The Java commands for Flink process:
> flink  1  3.0  4.6 2168308 765936 ?  Ssl  10:03   1:08
> /opt/java/openjdk/bin/java -XX:+UseG1GC -Xmx697932173 -Xms697932173
> -XX:MaxDirectMemorySize=300647712 -XX:MaxMetaspaceSize=268435456
> -Dlog.file=/opt/flink/log/flink--kubernetes-taskmanager-0-checkpoint-ha-example-taskmanager-1-1.log
> -Dlog4j.configuration=file:/opt/flink/conf/log4j-console.properties
> -Dlog4j.configurationFile=file:/opt/flink/conf/log4j-console.properties
> -Dlogback.configurationFile=file:/opt/flink/conf/logback-console.xml
> -classpath
> /opt/flink/lib/flink-cep-1.16.0.jar:/opt/flink/lib/flink-connector-files-1.16.0.jar:/opt/flink/lib/flink-csv-1.16.0.jar:/opt/flink/lib/flink-json-1.16.0.jar:/opt/flink/lib/flink-scala_2.12-1.16.0.jar:/opt/flink/lib/flink-shaded-hadoop-2-uber-2.4.1-10.0.jar:/opt/flink/lib/flink-shaded-zookeeper-3.5.9.jar:/opt/flink/lib/flink-table-api-java-uber-1.16.0.jar:/opt/flink/lib/flink-table-planner-loader-1.16.0.jar:/opt/flink/lib/flink-table-runtime-1.16.0.jar:/opt/flink/lib/log4j-1.2-api-2.17.1.jar:/opt/flink/lib/log4j-api-2.17.1.jar:/opt/flink/lib/log4j-core-2.17.1.jar:/opt/flink/lib/log4j-slf4j-impl-2.17.1.jar:/opt/flink/lib/flink-dist-1.16.0.jar
> org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner
> --configDir /opt/flink/conf -Djobmanager.rpc.address=172.17.0.7
> -Dpipeline.classpaths= -Djobmanager.memory.off-heap.size=134217728b
> -Dweb.tmpdir=/tmp/flink-web-57b9e638-f313-4389-a75b-988509697edb
> -Djobmanager.rpc.port=6123
> -D.pipeline.job-id=a6f1c9fb
> -Drest.address=172.17.0.7 -Djobmanager.memory.jvm-overhead.max=214

How does Flink plugin system work?

2022-12-30 Thread Ruibin Xing
Hi community,

I am having difficulty understanding the Flink plugin system. I am
attempting to enable the Prometheus exporter with the official Flink image
1.16.0, but I am experiencing issues with library dependencies. According
to the plugin documentation (
https://nightlies.apache.org/flink/flink-docs-stable/docs/deployment/filesystems/plugins/),
as long as the library is located in the /opt/flink/plugins/
directory, Flink should automatically load it, similar to how it loads
libraries in the /opt/flink/lib directory. However, Flink does not seem to
detect the plugin.

Here is the directory structure for /opt/flink:
> tree /opt/flink
.

├── plugins
│   ├── metrics-prometheus
│   │   └── flink-metrics-prometheus-1.16.0.jar
...

And here is the related Flink configuration:
> metrics.reporter.prom.class:
org.apache.flink.metrics.prometheus.PrometheusReporter

The error logs in the task manager show the following:
2022-12-30 10:03:55,840 WARN
 org.apache.flink.runtime.metrics.ReporterSetup   [] - The
reporter configuration of 'prom' configures the reporter class, which is a
deprecated approach to configure reporters. Please configure a factory
class instead: 'metrics.reporter.prom.factory.class: ' to
ensure that the configuration continues to work with future versions.
2022-12-30 10:03:55,841 ERROR
org.apache.flink.runtime.metrics.ReporterSetup   [] - Could not
instantiate metrics reporter prom. Metrics might not be exposed/reported.
java.lang.ClassNotFoundException:
org.apache.flink.metrics.prometheus.PrometheusReporter
at jdk.internal.loader.BuiltinClassLoader.loadClass(Unknown Source) ~[?:?]
at jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(Unknown
Source) ~[?:?]
at java.lang.ClassLoader.loadClass(Unknown Source) ~[?:?]
at java.lang.Class.forName0(Native Method) ~[?:?]
at java.lang.Class.forName(Unknown Source) ~[?:?]
at
org.apache.flink.runtime.metrics.ReporterSetup.loadViaReflection(ReporterSetup.java:456)
~[flink-dist-1.16.0.jar:1.16.0]
at
org.apache.flink.runtime.metrics.ReporterSetup.loadReporter(ReporterSetup.java:409)
~[flink-dist-1.16.0.jar:1.16.0]
at
org.apache.flink.runtime.metrics.ReporterSetup.setupReporters(ReporterSetup.java:328)
~[flink-dist-1.16.0.jar:1.16.0]
at
org.apache.flink.runtime.metrics.ReporterSetup.fromConfiguration(ReporterSetup.java:209)
~[flink-dist-1.16.0.jar:1.16.0]

The Java commands for Flink process:
flink  1  3.0  4.6 2168308 765936 ?  Ssl  10:03   1:08
/opt/java/openjdk/bin/java -XX:+UseG1GC -Xmx697932173 -Xms697932173
-XX:MaxDirectMemorySize=300647712 -XX:MaxMetaspaceSize=268435456
-Dlog.file=/opt/flink/log/flink--kubernetes-taskmanager-0-checkpoint-ha-example-taskmanager-1-1.log
-Dlog4j.configuration=file:/opt/flink/conf/log4j-console.properties
-Dlog4j.configurationFile=file:/opt/flink/conf/log4j-console.properties
-Dlogback.configurationFile=file:/opt/flink/conf/logback-console.xml
-classpath
/opt/flink/lib/flink-cep-1.16.0.jar:/opt/flink/lib/flink-connector-files-1.16.0.jar:/opt/flink/lib/flink-csv-1.16.0.jar:/opt/flink/lib/flink-json-1.16.0.jar:/opt/flink/lib/flink-scala_2.12-1.16.0.jar:/opt/flink/lib/flink-shaded-hadoop-2-uber-2.4.1-10.0.jar:/opt/flink/lib/flink-shaded-zookeeper-3.5.9.jar:/opt/flink/lib/flink-table-api-java-uber-1.16.0.jar:/opt/flink/lib/flink-table-planner-loader-1.16.0.jar:/opt/flink/lib/flink-table-runtime-1.16.0.jar:/opt/flink/lib/log4j-1.2-api-2.17.1.jar:/opt/flink/lib/log4j-api-2.17.1.jar:/opt/flink/lib/log4j-core-2.17.1.jar:/opt/flink/lib/log4j-slf4j-impl-2.17.1.jar:/opt/flink/lib/flink-dist-1.16.0.jar
org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner
--configDir /opt/flink/conf -Djobmanager.rpc.address=172.17.0.7
-Dpipeline.classpaths= -Djobmanager.memory.off-heap.size=134217728b
-Dweb.tmpdir=/tmp/flink-web-57b9e638-f313-4389-a75b-988509697edb
-Djobmanager.rpc.port=6123
-D.pipeline.job-id=a6f1c9fb
-Drest.address=172.17.0.7 -Djobmanager.memory.jvm-overhead.max=214748368b
-Djobmanager.memory.jvm-overhead.min=214748368b
-Dtaskmanager.resource-id=checkpoint-ha-example-taskmanager-1-1
-Dexecution.target=embedded
-Dpipeline.jars=file:/opt/flink/examples/streaming/StateMachineExample.jar
-Djobmanager.memory.jvm-metaspace.size=268435456b
-Djobmanager.memory.heap.size=1530082096b -D
taskmanager.memory.network.min=166429984b -D taskmanager.cpu.cores=1.0 -D
taskmanager.memory.task.off-heap.size=0b -D
taskmanager.memory.jvm-metaspace.size=268435456b -D external-resources=none
-D taskmanager.memory.jvm-overhead.min=214748368b -D
taskmanager.memory.framework.off-heap.size=134217728b -D
taskmanager.memory.network.max=166429984b -D
taskmanager.memory.framework.heap.size=134217728b -D
taskmanager.memory.managed.size=665719939b -D
taskmanager.memory.task.heap.size=563714445b -D
taskmanager.numberOfTaskSlots=1 -D
taskmanager.memory.jvm-overhead.max=214748368b

It seems that I must be missing something. Could someone please help me
clarify this issue?

Thank y