[ https://issues.apache.org/jira/browse/FLINK-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248280#comment-14248280 ]
ASF GitHub Bot commented on FLINK-1325:
---------------------------------------
Github user StephanEwen commented on a diff in the pull request:
https://github.com/apache/incubator-flink/pull/269#discussion_r21900264
--- Diff: flink-java/src/main/java/org/apache/flink/api/java/ClosureCleaner.java ---
@@ -0,0 +1,131 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.java;
+
+
+import org.apache.flink.api.common.InvalidProgramException;
+import org.apache.flink.util.InstantiationUtil;
+import org.objectweb.asm.ClassReader;
+import org.objectweb.asm.ClassVisitor;
+import org.objectweb.asm.MethodVisitor;
+import org.objectweb.asm.Opcodes;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.lang.reflect.Field;
+
+public class ClosureCleaner {
+ private static Logger LOG = LoggerFactory.getLogger(ClosureCleaner.class);
+
+ private static ClassReader getClassReader(Class<?> cls) {
+ String className = cls.getName().replaceFirst("^.*\\.", "") + ".class";
+ try {
+ return new ClassReader(cls.getResourceAsStream(className));
+ } catch (IOException e) {
+ throw new RuntimeException("Could not create ClassReader: " + e);
+ }
+ }
+
+ public static void clean(Object func, boolean checkSerializable) {
+ Class<?> cls = func.getClass();
+
+ String this0Name = null;
+
+ // First find the name of the "this$0" field; it can
+ // be "this$x" depending on the nesting
+ for (Field f: cls.getDeclaredFields()) {
+ if (f.getName().startsWith("this$")) {
+ // found our field:
+ this0Name = f.getName();
+ }
+ }
+
+ if (this0Name == null) {
+ // no this$0 field, just return
+ return;
+ }
+
+ This0AccessFinder this0Finder = new This0AccessFinder(this0Name);
+
+ getClassReader(cls).accept(this0Finder, 0);
+
+
+ if (LOG.isDebugEnabled()) {
+ LOG.debug(this0Name + " is accessed: " + this0Finder.isThis0Accessed());
+ }
+
+ if (!this0Finder.isThis0Accessed()) {
+ Field this0;
+ try {
+ this0 = func.getClass().getDeclaredField(this0Name);
+ } catch (NoSuchFieldException e) {
+ // has no this$0, just return
+ throw new RuntimeException("Could not set this$0: " + e);
+ }
+ this0.setAccessible(true);
+ try {
+ this0.set(func, null);
+ } catch (IllegalAccessException e) {
+ // should not happen, since we use setAccessible
+ throw new RuntimeException("Could not set this$0: " + e);
+ }
+ }
+
+ if (checkSerializable) {
+ ensureSerializable(func);
+ }
+ }
+
+ public static void ensureSerializable(Object func) {
+ try {
+ InstantiationUtil.serializeObject(func);
--- End diff --
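That is probably okay for now, but for objects with large closures we are
doing significant double work: the object is serialized here only to check
that it can be, and then serialized again when it is actually needed.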
> Add a closure cleaner for Java
> ------------------------------
>
> Key: FLINK-1325
> URL: https://issues.apache.org/jira/browse/FLINK-1325
> Project: Flink
> Issue Type: Improvement
> Components: Java API
> Reporter: Stephan Ewen
> Assignee: Aljoscha Krettek
>
> The Java API could really use a simple closure cleaner.
> All functions that are implemented as anonymous subclasses hold a reference
> to the enclosing instance, unless they are implemented as part of a static
> method.
> That reference (called {{this$0}}) causes serialization to fail, as it draws
> non-serializable classes into the function, even in cases where the function
> never accesses the enclosing instance's data.
> It is possible to manually set this reference to {{null}}, using reflection
> or a debugger. Then the serialization succeeds.
> I suggest adding a closure cleaner that uses an ASM visitor over the
> function's code to see if there is any access to the {{this$0}} field. If
> there is none, the field should be set to {{null}}.
> The problem can be reproduced with the simple program below:
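> {code}
> import org.apache.flink.api.common.functions.MapFunction;
> import org.apache.flink.api.java.ExecutionEnvironment;
>
> public class Test {
>     public void runProgram() throws Exception {
>         ExecutionEnvironment env = ExecutionEnvironment
>                 .getExecutionEnvironment();
>         // The anonymous MapFunction is created in an instance method, so it
>         // captures a reference to the non-serializable Test instance.
>         env.generateSequence(1, 10)
>                 .map(new MapFunction<Long, Long>() {
>                     public Long map(Long value) {
>                         return value * 2;
>                     }
>                 })
>                 .print();
>         env.execute();
>     }
>
>     public static void main(String[] args) throws Exception {
>         new Test().runProgram();
>     }
> }
> {code}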
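
The reflection workaround mentioned in the description can be illustrated with
the JDK alone. The following is a minimal sketch, not the code from the pull
request: the class This0NullingSketch, the LongMapper interface (a stand-in for
a serializable function type such as MapFunction), and both helper methods are
hypothetical names chosen for this example. Unlike the proposed cleaner, the
sketch does not check whether the function uses the enclosing instance; it
clears the field unconditionally, which is only safe here because map() never
touches it. Some compilers may not emit the field at all, in which case the
helper simply reports that nothing was cleared.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.reflect.Field;

// Hypothetical sketch; none of these names come from the pull request.
class This0NullingSketch {

    /** Stand-in for a serializable user function type. */
    interface LongMapper extends Serializable {
        Long map(Long value);
    }

    /** Deliberately not Serializable, like the Test class in the issue. */
    static class Enclosing {
        LongMapper doubler() {
            // Anonymous class created in an instance method: it may capture
            // a reference to the Enclosing instance in a synthetic field.
            return new LongMapper() {
                @Override
                public Long map(Long value) {
                    return value * 2;
                }
            };
        }
    }

    /** Returns true if Java serialization of the object succeeds. */
    static boolean isSerializable(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (IOException e) {
            // NotSerializableException is a subclass of IOException.
            return false;
        }
    }

    /** Sets the first declared "this$..." field to null; returns true if one was found. */
    static boolean nullOutEnclosingReference(Object func) {
        for (Field f : func.getClass().getDeclaredFields()) {
            if (f.getName().startsWith("this$")) {
                try {
                    f.setAccessible(true);
                    f.set(func, null);
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException("Could not clear " + f.getName(), e);
                }
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        LongMapper f = new Enclosing().doubler();
        System.out.println("serializable before: " + isSerializable(f));
        System.out.println("field cleared: " + nullOutEnclosingReference(f));
        System.out.println("serializable after: " + isSerializable(f));
        System.out.println("map(21) = " + f.map(21L));
    }
}
```

Clearing the field blindly would break any function that does read from the
enclosing instance, which is why the issue proposes inspecting the bytecode
for accesses to the field first.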
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)