Please always include the mailing list in the reply list, so that others can participate in the discussion and learn from the findings.

We are aware of multiple issues where web submission can result in classloader / thread-local leaks, which could potentially result in the behavior you're describing. We're working on addressing them.

FLINK-25022 [1]: The most critical one; it leaks thread locals.
FLINK-25027 [2]: Only a memory improvement for a particular situation (a lot of small batch jobs); it could be fixed by accounting for it when setting the Metaspace size.
FLINK-25023 [3]: Can leak the classloader of the first job submitted via the REST API (a constant overhead for Metaspace).

In general, web submission differs from a normal submission in that the "main method" of the uploaded jar is executed on the JobManager, and it's really hard to isolate its execution from possible side effects.

Could you by any chance try to submit jobs with the Flink CLI instead? That should be more robust when it comes to the class-loading issues.

Which endpoint are you using for submitting the job? "/jars/:jarid/run"?

[1] https://issues.apache.org/jira/browse/FLINK-25022
[2] https://issues.apache.org/jira/browse/FLINK-25027
[3] https://issues.apache.org/jira/browse/FLINK-25023

On Tue, Dec 21, 2021 at 4:49 PM Lior Liviev <[email protected]> wrote:

> Yes, I use the REST API. I'm running into OOM Metaspace, and I think it's
> a class-loading problem, so that's why I'm thinking of putting the jar in
> flink/lib
> ------------------------------
> *From:* David Morávek <[email protected]>
> *Sent:* Tuesday, December 21, 2021 5:43 PM
> *To:* Lior Liviev <[email protected]>
> *Cc:* [email protected] <[email protected]>
> *Subject:* Re: Avoiding Dynamic Classloading for User Code
>
> Hi Lior,
>
> can you please provide details about the steps (I'm not sure what load jar
> / execute with the API means)? are you submitting the job using the REST
> API or Flink CLI? I assume you're using a session cluster.
>
> also what is the concern here? do you run into any class-loading related
> issues?
>
> D.
>
> On Tue, Dec 21, 2021 at 3:48 PM Lior Liviev <[email protected]>
> wrote:
>
> Hello, I have existing fixed cluster (*not* a new one with every job
> execution) and a single Jar + multiple executions with different params.
>
> Currently my procedure is:
> 1. Download Jar
> 2. Load Jar with API
> 3. Execute with API.
>
> I plan to avoid dynamic class loading by applying method described in:
> https://nightlies.apache.org/flink/flink-docs-master/docs/ops/debugging/debugging_classloading/#avoiding-dynamic-classloading-for-user-code
>
> My question is:
>
> After putting the Jar in $FLINK/lib, do I need to load Jar and execute it
> the old way, or what?
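
P.S. In case it helps with trying the CLI route: below is a rough sketch, not a tested recipe, of how the "single jar, multiple executions with different params" setup from this thread might be driven through `flink run` instead of the REST API. It only relies on the standard `flink run` options `-d` (detached), `-c` (entry class) and `-p` (parallelism). Everything else is an assumption made for illustration: that the `flink` binary is on the PATH, and the hypothetical jar path, class name and program arguments.

```python
import subprocess


def build_flink_run_command(jar_path, program_args, entry_class=None,
                            parallelism=None, detached=True, flink_bin="flink"):
    """Build the argument list for one `flink run` invocation.

    The jar path, entry class and program arguments are supplied by the
    caller; nothing here is specific to a particular job.
    """
    cmd = [flink_bin, "run"]
    if detached:
        cmd.append("-d")                      # return once the job is submitted
    if entry_class is not None:
        cmd += ["-c", entry_class]            # class containing the main method
    if parallelism is not None:
        cmd += ["-p", str(parallelism)]
    cmd.append(jar_path)                      # options go before the jar ...
    cmd += [str(arg) for arg in program_args]  # ... program arguments after it
    return cmd


def submit_all(jar_path, param_sets, **options):
    """Submit the same jar once per parameter set, one CLI call per job."""
    for params in param_sets:
        cmd = build_flink_run_command(jar_path, params, **options)
        # check=True raises CalledProcessError if the CLI exits non-zero.
        subprocess.run(cmd, check=True)
```

For example, `submit_all("/path/to/job.jar", [["--date", "2021-12-20"], ["--date", "2021-12-21"]], entry_class="com.example.Main")` would issue two separate `flink run` calls (path, class and arguments are placeholders). With this route the main method runs in the CLI client process rather than on the JobManager, which is why it sidesteps the web-submission side effects described above.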
