[jira] [Commented] (FLINK-12113) User code passing to fromCollection(Iterator, Class) not cleaned

2019-04-12 Thread yankai zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-12113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16816008#comment-16816008
 ] 

yankai zhang commented on FLINK-12113:
--

I'm not very familiar with Flink project development; maybe you can help fix 
this. Thanks.

> User code passing to fromCollection(Iterator, Class) not cleaned
>
> Key: FLINK-12113
> URL: https://issues.apache.org/jira/browse/FLINK-12113
> Project: Flink
> Issue Type: Bug
> Components: API / DataStream
> Affects Versions: 1.7.2
> Reporter: yankai zhang
> Priority: Major
> Attachments: image-2019-04-07-21-52-37-264.png, image-2019-04-08-23-19-27-359.png
>
> {code:java}
> interface IS extends Iterator, Serializable { }
>
> StreamExecutionEnvironment env =
>     StreamExecutionEnvironment.getExecutionEnvironment();
> env.fromCollection(new IS() {
>     @Override
>     public boolean hasNext() {
>         return false;
>     }
>
>     @Override
>     public Object next() {
>         return null;
>     }
> }, Object.class);
> {code}
> The code above throws an exception:
> {code:java}
> org.apache.flink.api.common.InvalidProgramException: The implementation of 
> the SourceFunction is not serializable. The object probably contains or 
> references non serializable fields.
>   at org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:99)
> {code}
> My workaround is to wrap the iterator instance in a call to clean, like this:
> {code:java}
> StreamExecutionEnvironment env =
>     StreamExecutionEnvironment.getExecutionEnvironment();
> env.fromCollection(env.clean(new IS() {
>     @Override
>     public boolean hasNext() {
>         return false;
>     }
>
>     @Override
>     public Object next() {
>         return null;
>     }
> }), Object.class);
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (FLINK-12113) User code passing to fromCollection(Iterator, Class) not cleaned

2019-04-08 Thread yankai zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-12113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16812967#comment-16812967
 ] 

yankai zhang edited comment on FLINK-12113 at 4/9/19 3:20 AM:
--

Yes, _fromCollection(Iterator, Class)_ works as expected without an anonymous 
class.

The problem is that an anonymous class object created in an instance method 
implicitly references the outer _this_ (even though it is never used), and the 
outer _this_ is not serializable. Removing such a reference is exactly what 
_StreamExecutionEnvironment#clean_ is supposed to do.

In fact, the iterator passed by the user is wrapped in a 
_FromIteratorFunction_, and _StreamExecutionEnvironment#clean_ is then called 
on that wrapper instance, not on the iterator itself. However, the current 
implementation of _StreamExecutionEnvironment#clean_ is not recursive, so it 
cannot find and clean a _this_ reference nested deeper inside the closure.

Here is my fully reproducible code:
{code:java}
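import java.io.Serializable;
import java.util.Iterator;

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.junit.Test; // assuming JUnit 4

public class MainTest {

    interface IS extends Iterator, Serializable {
    }

    // Instance method: the anonymous IS below implicitly captures MainTest.this.
    @Test
    public void cleanTest() {
        StreamExecutionEnvironment env =
            StreamExecutionEnvironment.getExecutionEnvironment();
        env.fromCollection(new IS() {
            @Override
            public boolean hasNext() {
                return false;
            }

            @Override
            public Object next() {
                return null;
            }
        }, Object.class);
    }
}
{code}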
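
The non-recursive cleaning described above can be illustrated without Flink. 
The following is a minimal sketch, not Flink's ClosureCleaner: JobWithIterator, 
IteratorWrapper (a stand-in for the role played by _FromIteratorFunction_) and 
ShallowCleaner are hypothetical names, and the cleaner only nulls synthetic 
this$N fields on the single object it is given. It assumes the compiler stores 
the enclosing instance in a field whose name starts with this$, as javac does.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.reflect.Field;
import java.util.Iterator;

/** Hypothetical stand-in for the user's job class; deliberately NOT Serializable. */
class JobWithIterator {
    interface IS extends Iterator<Object>, Serializable {}

    /** Created in an instance method, so the anonymous class captures JobWithIterator.this. */
    IS newIterator() {
        return new IS() {
            @Override public boolean hasNext() { return false; }
            @Override public Object next() { return null; }
        };
    }
}

/** Hypothetical serializable wrapper that holds the user's iterator in a field. */
class IteratorWrapper implements Serializable {
    final Iterator<Object> iterator;

    IteratorWrapper(Iterator<Object> iterator) {
        this.iterator = iterator;
    }
}

/** Shallow cleaner (an assumption for illustration): it does not descend into fields. */
class ShallowCleaner {
    /** Nulls synthetic this$N fields declared directly on obj's class. */
    static void clean(Object obj) {
        for (Field f : obj.getClass().getDeclaredFields()) {
            if (f.getName().startsWith("this$")) {
                try {
                    f.setAccessible(true);
                    f.set(obj, null);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
    }

    /** Returns true if Java serialization accepts the whole object graph. */
    static boolean canSerialize(Object obj) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return true;
        } catch (IOException e) { // includes NotSerializableException
            return false;
        }
    }

    public static void main(String[] args) {
        JobWithIterator.IS iterator = new JobWithIterator().newIterator();
        IteratorWrapper wrapper = new IteratorWrapper(iterator);

        clean(wrapper);  // only looks at the wrapper's own fields
        System.out.println("after cleaning wrapper:  " + canSerialize(wrapper));

        clean(iterator); // corresponds to the env.clean(iterator) workaround
        System.out.println("after cleaning iterator: " + canSerialize(wrapper));
    }
}
```

In this sketch, cleaning only the wrapper leaves it unserializable because the 
captured outer instance sits one level deeper. Cleaning the iterator itself 
first, which is what the env.clean(...) workaround does, makes the wrapper 
serializable.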




[jira] [Commented] (FLINK-12113) User code passing to fromCollection(Iterator, Class) not cleaned

2019-04-08 Thread yankai zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-12113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16812967#comment-16812967
 ] 

yankai zhang commented on FLINK-12113:
--

Yes, _fromCollection(Iterator, Class)_ works as expected without an anonymous 
class.

The problem is that an anonymous class object created in an instance method 
implicitly references the outer _this_ (even though it is never used), and the 
outer _this_ is not serializable. Removing such a reference is exactly what 
_StreamExecutionEnvironment#clean_ is supposed to do.

In fact, the iterator passed by the user is wrapped in a 
_FromIteratorFunction_, and _StreamExecutionEnvironment#clean_ is then called 
on that wrapper instance, not on the iterator itself. However, the current 
implementation of _StreamExecutionEnvironment#clean_ is not recursive, so it 
cannot find and clean a _this_ reference nested deeper inside the closure.

Here is my fully reproducible code:
{code:java}
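import java.io.Serializable;
import java.util.Iterator;

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.junit.Test; // assuming JUnit 4

public class MainTest {

    interface IS extends Iterator, Serializable {
    }

    // Instance method: the anonymous IS below implicitly captures MainTest.this.
    @Test
    public void cleanTest() {
        StreamExecutionEnvironment env =
            StreamExecutionEnvironment.getExecutionEnvironment();
        env.fromCollection(new IS() {
            @Override
            public boolean hasNext() {
                return false;
            }

            @Override
            public Object next() {
                return null;
            }
        }, Object.class);
    }
}
{code}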



[jira] [Comment Edited] (FLINK-12113) User code passing to fromCollection(Iterator, Class) not cleaned

2019-04-07 Thread yankai zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-12113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16812049#comment-16812049
 ] 

yankai zhang edited comment on FLINK-12113 at 4/8/19 2:43 AM:
--

Interesting. My guess is that, because your anonymous class is created inside 
a static method, Java gives it no reference to an outer _this_. I found an 
explanation on Stack Overflow: https://stackoverflow.com/a/758616/4281058. 
In your case there is no outer _this_ at all, so please try moving your code 
into an instance method instead of static main.
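
The difference between the two contexts can be checked with plain Java. This 
is a minimal sketch, assuming javac's convention of storing the enclosing 
instance in a synthetic this$N field; OuterThisDemo and its methods are 
hypothetical names used only for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.reflect.Field;
import java.util.Iterator;

/** Hypothetical demo class; deliberately NOT Serializable. */
class OuterThisDemo {
    interface IS extends Iterator<Object>, Serializable {}

    /** Instance context: the anonymous class has an enclosing instance. */
    IS fromInstanceMethod() {
        return new IS() {
            @Override public boolean hasNext() { return false; }
            @Override public Object next() { return null; }
        };
    }

    /** Static context: there is no enclosing instance to capture. */
    static IS fromStaticMethod() {
        return new IS() {
            @Override public boolean hasNext() { return false; }
            @Override public Object next() { return null; }
        };
    }

    /** Returns true if obj's class declares a synthetic this$N field. */
    static boolean capturesOuterThis(Object obj) {
        for (Field f : obj.getClass().getDeclaredFields()) {
            if (f.getName().startsWith("this$")) {
                return true;
            }
        }
        return false;
    }

    /** Returns true if Java serialization accepts the whole object graph. */
    static boolean canSerialize(Object obj) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return true;
        } catch (IOException e) { // includes NotSerializableException
            return false;
        }
    }

    public static void main(String[] args) {
        IS fromInstance = new OuterThisDemo().fromInstanceMethod();
        IS fromStatic = fromStaticMethod();
        System.out.println("instance method: captures=" + capturesOuterThis(fromInstance)
                + ", serializable=" + canSerialize(fromInstance));
        System.out.println("static method:   captures=" + capturesOuterThis(fromStatic)
                + ", serializable=" + canSerialize(fromStatic));
    }
}
```

The iterator created in the instance method is expected to carry a reference 
to the non-serializable outer object and fail to serialize, while the one 
created in the static method carries no such reference.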




[jira] [Comment Edited] (FLINK-12113) User code passing to fromCollection(Iterator, Class) not cleaned

2019-04-07 Thread yankai zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-12113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16812049#comment-16812049
 ] 

yankai zhang edited comment on FLINK-12113 at 4/8/19 2:40 AM:
--

Interesting. My guess is that, because your anonymous class is created inside 
a static method, Java gives it no reference to an outer _this_. I found 
[an explanation on Stack Overflow|https://stackoverflow.com/a/758616/4281058]. 
In your case there is no outer _this_ at all, so please try moving your code 
into an instance method instead of static main.




[jira] [Commented] (FLINK-12113) User code passing to fromCollection(Iterator, Class) not cleaned

2019-04-07 Thread yankai zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-12113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16812049#comment-16812049
 ] 

yankai zhang commented on FLINK-12113:
--

Interesting. My guess is that, because your anonymous class is created inside 
a static method, Java gives it no reference to an outer _this_. I found 
[an explanation on Stack Overflow|https://stackoverflow.com/a/758616/4281058]. 
In your case there is no outer _this_ at all, so please try moving your code 
into an instance method instead of static main.



[jira] [Updated] (FLINK-12113) User code passing to fromCollection(Iterator, Class) not cleaned

2019-04-04 Thread yankai zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-12113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yankai zhang updated FLINK-12113:
-
Description: 
 
{code:java}
interface IS extends Iterator, Serializable { }

StreamExecutionEnvironment env =
    StreamExecutionEnvironment.getExecutionEnvironment();
env.fromCollection(new IS() {
    @Override
    public boolean hasNext() {
        return false;
    }

    @Override
    public Object next() {
        return null;
    }
}, Object.class);
{code}
The code above throws an exception:
{code:java}
org.apache.flink.api.common.InvalidProgramException: The implementation of the 
SourceFunction is not serializable. The object probably contains or references 
non serializable fields.

  at org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:99)
{code}
My workaround is to wrap the iterator instance in a call to clean, like this:
{code:java}
StreamExecutionEnvironment env =
    StreamExecutionEnvironment.getExecutionEnvironment();
env.fromCollection(env.clean(new IS() {
    @Override
    public boolean hasNext() {
        return false;
    }

    @Override
    public Object next() {
        return null;
    }
}), Object.class);
{code}

  was:
 
{code:java}
interface IS extends Iterator, Serializable { }

StreamExecutionEnvironment
    .getExecutionEnvironment()
    .fromCollection(new IS() {
        @Override
        public boolean hasNext() {
            return false;
        }

        @Override
        public Object next() {
            return null;
        }
    }, Object.class);
{code}
The code above throws an exception:
{code:java}
org.apache.flink.api.common.InvalidProgramException: The implementation of the 
SourceFunction is not serializable. The object probably contains or references 
non serializable fields.

  at org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:99)
{code}




[jira] [Created] (FLINK-12113) User code passing to fromCollection(Iterator, Class) not cleaned

2019-04-04 Thread yankai zhang (JIRA)
yankai zhang created FLINK-12113:


 Summary: User code passing to fromCollection(Iterator, Class) not 
cleaned
 Key: FLINK-12113
 URL: https://issues.apache.org/jira/browse/FLINK-12113
 Project: Flink
  Issue Type: Bug
  Components: API / DataStream
Affects Versions: 1.7.2
Reporter: yankai zhang


 
{code:java}
interface IS extends Iterator, Serializable { }

StreamExecutionEnvironment
    .getExecutionEnvironment()
    .fromCollection(new IS() {
        @Override
        public boolean hasNext() {
            return false;
        }

        @Override
        public Object next() {
            return null;
        }
    }, Object.class);
{code}
The code above throws an exception:
{code:java}
org.apache.flink.api.common.InvalidProgramException: The implementation of the 
SourceFunction is not serializable. The object probably contains or references 
non serializable fields.

  at org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:99)
{code}


