Github user StephanEwen commented on the pull request:

    https://github.com/apache/flink/pull/581#issuecomment-90987769
  
    It is an interesting idea to collect a data stream back to the client. This 
solution, however, has quite a few limitations and implications (I assume it 
was only tested locally?):
    
      - It supports only `java.io.Serializable` types. This is inconsistent 
with the type handling and serialization used elsewhere in Flink: some types 
that work in all other parts do not work here.
    
      - It does not work in a cluster. It sends "localhost" as the host name to 
the worker that should send the data back. In any non-local setup, this cannot 
work.
    
      - It requires the worker to be able to connect to the client. This may be 
tricky when the client and the workers do not both run inside the cluster.
    
      - Selecting the proper interface on which to open the port for data 
communication is actually quite tricky. The TaskManagers put quite a bit of 
work into selecting that interface. Without it, many installations do not work, 
since in most cases certain interfaces or hostnames are only accessible from 
certain networks (for example, cloud-internal versus external network 
interfaces).
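
    To illustrate the last point, here is a minimal sketch of one common way to 
find a usable local address: connect to a known remote endpoint and ask the 
socket which local address the operating system chose. The names 
`ReachableAddressSketch` and `localAddressFor` are hypothetical and are not 
Flink APIs, and this is not necessarily how the TaskManagers do it.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper, for illustration only; not part of Flink. */
public class ReachableAddressSketch {

    /**
     * Connects to the given target and returns the local address that the
     * operating system picked for that connection. An address found this way
     * is at least routable towards the target, unlike a hard-coded "localhost".
     */
    public static InetAddress localAddressFor(InetSocketAddress target, int timeoutMillis)
            throws IOException {
        try (Socket socket = new Socket()) {
            // Fails with an IOException if the target is unreachable in time.
            socket.connect(target, timeoutMillis);
            // Read the address before the socket is closed by try-with-resources.
            return socket.getLocalAddress();
        }
    }
}
```

    This only shows that the local side can reach the target. It says nothing 
about whether the target can connect back to that address, which is the harder 
requirement raised in the points above.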
    
    I think this is a very tricky thing to implement. It has implications for 
the distributed process and communication model. It starts extending streaming 
to mixed local/remote runtimes. It affects all the assumptions we make for 
fault tolerance. What happens to the stream in case of a failure? There is no 
notion of restarting the driver.
    
    That is something that needs a bit more consideration and design, for the 
sake of building something consistent where the concepts and implications play 
together well. I hope you do not take it the wrong way, but without clarifying 
these points, this addition is a bit premature. 
    


