Luka Jurukovski created FLINK-11895:
---------------------------------------

             Summary: Allow FileSystem Configs to be altered at Runtime
                 Key: FLINK-11895
                 URL: https://issues.apache.org/jira/browse/FLINK-11895
             Project: Flink
          Issue Type: Improvement
            Reporter: Luka Jurukovski


This stems from a need to pass in S3 auth keys at runtime so that users can 
specify the keys they want to use. Based on the documentation, it seems that 
S3 keys currently need to be part of the Flink cluster configuration, in a 
Hadoop configuration file (which the cluster needs to be pointed to), or in 
JVM args.


This only seems to apply to the streaming API. Also, feel free to correct the 
following if I am wrong, as there may be pieces I have not run across, or parts 
of the code I have misunderstood.


Currently it seems that FileSystems are inferred from the scheme of the path 
(for example s3://) and looked up in a set of cached FileSystems that are 
created in the background. These seem to use the config as defined at the time 
they are stood up. Unfortunately, there is no way to tap into this control 
mechanism or override this behavior, as many places in the code pull from this 
cache. This is particularly painful in the sink implementation, where some of 
the places that use the cache are not accessible outside the package in which 
they are implemented.
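
To make the behavior I am describing concrete, here is a toy model of it. This 
is only a sketch of my understanding, not Flink code: SchemeCacheDemo, 
ToyFileSystem, ToyRegistry and the "s3.access-key" config key are all made-up 
names, and I am assuming the cache is keyed by scheme and that the config is 
captured once at startup.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Toy model only: none of these types are Flink classes. */
final class SchemeCacheDemo {

    /** Hypothetical stand-in for a file system bound to one access key. */
    static final class ToyFileSystem {
        final String scheme;
        final String accessKey;

        ToyFileSystem(String scheme, String accessKey) {
            this.scheme = scheme;
            this.accessKey = accessKey;
        }
    }

    /** Hypothetical registry: one cached instance per scheme, config copied once. */
    static final class ToyRegistry {
        private final Map<String, String> config;
        private final Map<String, ToyFileSystem> cache = new ConcurrentHashMap<>();

        ToyRegistry(Map<String, String> clusterConfig) {
            // Snapshot of the config at "cluster start"; later changes are not seen.
            this.config = new HashMap<>(clusterConfig);
        }

        ToyFileSystem get(URI uri) {
            // The first lookup for a scheme creates the instance; later lookups reuse it.
            return cache.computeIfAbsent(
                    uri.getScheme(),
                    scheme -> new ToyFileSystem(scheme, config.get(scheme + ".access-key")));
        }
    }

    public static void main(String[] args) {
        Map<String, String> clusterConfig = new HashMap<>();
        clusterConfig.put("s3.access-key", "cluster-key");
        ToyRegistry registry = new ToyRegistry(clusterConfig);

        ToyFileSystem first = registry.get(URI.create("s3://bucket-a/out"));

        // A user tries to supply a different key at runtime.
        clusterConfig.put("s3.access-key", "tenant-key");
        ToyFileSystem second = registry.get(URI.create("s3://bucket-b/out"));

        System.out.println(first == second);   // prints "true"
        System.out.println(second.accessKey);  // prints "cluster-key"
    }
}
```

In this model every caller that goes through the registry gets the instance 
built from the startup config, which is the limitation described above.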

Through a pretty hacky mechanism I have proved out that this is a self-imposed 
limitation: I was able to change the code to pass in a FileSystem from the 
top level and have it read from and write to S3 using keys I set at runtime.
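
Roughly, the shape of that change is the following. Again this is a toy sketch 
and not my actual patch or Flink code: InjectedFileSystemDemo, ToyFileSystem, 
RecordingFileSystem and ToySink are hypothetical names, and the "file system" 
here only records what it is asked to write.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: none of these types are Flink classes. */
final class InjectedFileSystemDemo {

    /** Hypothetical minimal file system interface. */
    interface ToyFileSystem {
        void write(String path, String record);
    }

    /** Hypothetical implementation that records writes together with its key. */
    static final class RecordingFileSystem implements ToyFileSystem {
        final String accessKey;
        final List<String> writes = new ArrayList<>();

        RecordingFileSystem(String accessKey) {
            this.accessKey = accessKey;
        }

        @Override
        public void write(String path, String record) {
            writes.add(accessKey + " -> " + path + ": " + record);
        }
    }

    /** Hypothetical sink that uses the file system it is given, not a global cache. */
    static final class ToySink {
        private final ToyFileSystem fileSystem;
        private final String basePath;

        ToySink(ToyFileSystem fileSystem, String basePath) {
            this.fileSystem = fileSystem;
            this.basePath = basePath;
        }

        void invoke(String record) {
            fileSystem.write(basePath, record);
        }
    }

    public static void main(String[] args) {
        // Two tenants in one job or cluster, each with keys chosen at runtime.
        RecordingFileSystem tenantA = new RecordingFileSystem("key-A");
        RecordingFileSystem tenantB = new RecordingFileSystem("key-B");

        new ToySink(tenantA, "s3://bucket-a/out").invoke("r1");
        new ToySink(tenantB, "s3://bucket-b/out").invoke("r2");

        System.out.println(tenantA.writes); // prints "[key-A -> s3://bucket-a/out: r1]"
        System.out.println(tenantB.writes); // prints "[key-B -> s3://bucket-b/out: r2]"
    }
}
```

The point is only that the sink receives its file system from the caller, so 
two sinks in the same cluster can use different keys.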

The current approach is convenient; however, there should be finer-grained 
controls for cases where the cluster runs in a multi-tenant environment.

As a final note, it seems that neither the FileSystem nor the FileSystemFactory 
class is Serializable. I can see why this would be the case for the former, but 
I am not clear as to why a factory class would not be Serializable (as in the 
case of BucketFactory). If the factory can be made Serializable, this should 
make for a much cleaner process.
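
What I have in mind is something like the following. It is a sketch under my 
own assumptions, not a proposal for the exact API: SerializableFactoryDemo, 
ToyFileSystem and ToyFileSystemFactory are made-up names. The factory carries 
only the keys, can be shipped with the job, and builds the non-serializable 
file system where it is needed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Toy model only: none of these types are Flink classes. */
final class SerializableFactoryDemo {

    /** Hypothetical file system; deliberately NOT Serializable (think open connections). */
    static final class ToyFileSystem {
        final String accessKey;

        ToyFileSystem(String accessKey) {
            this.accessKey = accessKey;
        }
    }

    /** Hypothetical factory that holds only plain settings, so it can be serialized. */
    static final class ToyFileSystemFactory implements Serializable {
        private static final long serialVersionUID = 1L;

        private final String accessKey;

        ToyFileSystemFactory(String accessKey) {
            this.accessKey = accessKey;
        }

        /** Creates the file system on whichever side deserialized the factory. */
        ToyFileSystem create() {
            return new ToyFileSystem(accessKey);
        }
    }

    /** Serializes and deserializes a value, imitating shipping it to a worker. */
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        ToyFileSystemFactory factory = new ToyFileSystemFactory("tenant-key");
        ToyFileSystemFactory shipped = roundTrip(factory);
        System.out.println(shipped.create().accessKey); // prints "tenant-key"
    }
}
```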



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
