Thanks Robert for the pointers. It is some issue with mockito which I was using to mock getMetricStoreProgramHelper method in my unit test. For now I have modified my unit test to not use mockito.
I will try to provide a reproducible example. On Fri, Feb 12, 2021 at 8:56 PM Robert Metzger <rmetz...@apache.org> wrote: > Thanks for reaching out to the Flink ML. > > It reports getMetricStoreProgramHelper as a non-serializable field, even > though it looks a lot like a method. The only recommendation I have for you > is carefully reading the full error message + stack trace. > > Your approach of using tagging fields as "transient" is absolutely correct. > There's also this message: NotSerializableException: > java.lang.reflect.Method, but I can not find a field of type Method. > > Can you provide a minimal reproducible example of this issue? > > On Fri, Feb 12, 2021 at 7:06 AM Debraj Manna <subharaj.ma...@gmail.com> > wrote: > >> HI >> >> I am having a ProcessFunction like below which is throwing an error like >> below whenever I am trying to use it in a opeator . My understanding when >> flink initializes the operator dag, it serializes things and sends over to >> the taskmanagers. >> So I have marked the operator state transient, since the operator state >> will be populated within the open() call that gets invoked in each >> taskmanager. But I am still getting the serialization exception like below. >> Can suggest some ways where I can debug this type of serialization error in >> Flink 1.12? >> >> org.apache.flink.api.common.InvalidProgramException: public >> com.vnera.programs.metrics.MetricStoreProgramHelper >> com.vnera.analytics.engine.MetricStoreMapper.getMetricStoreProgramHelper(com.vnera.resourcemanager.ResourceManager,com.vnera.storage.metrics.TsdbMetricStore$Writer,java.lang.String) >> is not serializable. The object probably contains or references non >> serializable fields. >> >> at org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:164) >> at org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:132) >> ... >> at org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:69) >> at >> org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.clean(StreamExecutionEnvironment.java:2000) >> at >> org.apache.flink.streaming.api.datastream.DataStream.clean(DataStream.java:203) >> at >> org.apache.flink.streaming.api.datastream.DataStream.process(DataStream.java:681) >> at >> org.apache.flink.streaming.api.datastream.DataStream.process(DataStream.java:661) >> at >> com.vnera.analytics.engine.MetricStoreOperator.linkFrom(MetricStoreOperator.java:27) >> at >> com.vnera.analytics.engine.AnaPipelineStage.link(AnaPipelineStage.java:12) >> at >> com.vnera.analytics.engine.AnalyticsEngine.createPipeline(AnalyticsEngine.java:106) >> at >> com.vnera.analytics.engine.source.DerivedMetricCreatorTest.testPipeline(DerivedMetricCreatorTest.java:98) >> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >> at >> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) >> at >> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >> at java.lang.reflect.Method.invoke(Method.java:498) >> at >> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) >> ... >> at >> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) >> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) >> ... >> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) >> ... >> at >> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) >> at >> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) >> at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) >> at org.junit.rules.RunRules.evaluate(RunRules.java:20) >> at org.junit.runners.ParentRunner.run(ParentRunner.java:363) >> at org.junit.runner.JUnitCore.run(JUnitCore.java:137) >> at >> com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69) >> ... >> at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:53) >> Caused by: java.io.NotSerializableException: java.lang.reflect.Method >> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184) >> at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348) >> at >> org.apache.flink.util.InstantiationUtil.serializeObject(InstantiationUtil.java:624) >> at org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:143) >> ... 45 more >> >> My ProcessFunction looks like below >> public class MetricStoreMapper extends >> ProcessFunction<SelfDescribingMessageDO, GenericMetricV2> { >> private static final String MSG_STALENESS_METRIC_NAME = >> VneraMetrics.createMetricName(MetricStoreMapper.class, >> "metric_sdm_staleness"); >> private transient Histogram stalenessHisto = >> VneraMetrics.histogram(MSG_STALENESS_METRIC_NAME, VneraMetrics.DD_REPORTER); >> private transient MetricStoreProgramHelper metricStoreHelper; >> >> @Override >> public void open(Configuration parameters) throws Exception { >> ExecutionConfig.GlobalJobParameters jobParams = >> getRuntimeContext().getExecutionConfig().getGlobalJobParameters(); >> Configuration conf = >> ParameterTool.fromMap(jobParams.toMap()).getConfiguration(); >> String taskInstanceId = >> getRuntimeContext().getTaskNameWithSubtasks(); >> MetricStoreFactory.StoreType metStoreType = >> conf.getEnum(MetricStoreFactory.StoreType.class, >> StoreOptions.METRIC_STORE_TYPE); >> TaskManagerState state = >> TaskManagerState.getTaskManagerState(ConfigStoreFactory.StoreType.MEMORY, >> MetricStoreFactory.StoreType.MEMORY); >> ResourceManager rm = state.getResourceManager(); >> metricStoreHelper = getMetricStoreProgramHelper(rm, >> state.getTsdbMetricStore().writer(taskInstanceId), >> taskInstanceId); >> } >> >> @Override >> public void processElement(SelfDescribingMessageDO value, Context >> ctx, Collector<GenericMetricV2> out) throws Exception { >> MetricStoreProgramHelper.MetricStoreOutput output = >> metricStoreHelper.execute(value); >> for (SelfDescribingMessageDO sdm : output.outputSdms) { >> ctx.output(SideOutputs.metricStoreEvents, sdm); >> } >> } >> >> @VisibleForTesting >> public MetricStoreProgramHelper getMetricStoreProgramHelper(final >> ResourceManager rm, >> final >> TsdbMetricStore.Writer tsdbWriter, >> final >> String taskInstanceId) { >> return new MetricStoreProgramHelper(rm.getConfigStore(), >> rm.getDataModel(), >> tsdbWriter, >> taskInstanceId, >> rm.getPolicyManger(), >> stalenessHisto); >> } >> >> private static ModelKey modelKeyFromConfigKey(ConfigKey ck) { >> return ModelKey.create(ck.customerId, ck.objectType, ck.objectId); >> } >> } >> >> My Operator is like below. >> >> public class MetricStoreOperator implements >> AnaPipelineStage<GenericMetricV2> { >> private transient SingleOutputStreamOperator<GenericMetricV2> >> metricStream; >> private transient final MetricStoreMapper metricStoreMapper; >> >> public MetricStoreOperator(final Configuration jobParams, final >> MetricStoreMapper metricStoreMapper) { >> this.metricStoreMapper = metricStoreMapper; >> } >> >> @Override >> public AnaPipelineStage<GenericMetricV2> linkFrom(AnaPipelineStage<?>... >> operators) { >> AnaPipelineStage<SelfDescribingMessageDO> source = >> (AnaPipelineStage<SelfDescribingMessageDO>) >> Stream.of(operators).findFirst().get(); >> DataStream<SelfDescribingMessageDO> sdmStream = >> source.getOutputStream(); >> metricStream = sdmStream.process(metricStoreMapper); >> return this; >> } >> >> @Override >> public DataStream<GenericMetricV2> getOutputStream() { >> return metricStream; >> } >> >> @Override >> public DataStream getSideOutput(OutputTag outTag) { >> return metricStream.getSideOutput(outTag); >> } >> } >> >>