Re: Passing arbitrary Hadoop s3a properties from FileSystem SQL Connector options

2021-12-15 Thread Arvid Heise
Hi Timothy,

The issue would require a refactor of the FileSystems abstraction to allow
multiple FileSystem objects of the same FileSystem type to be configured
differently. While this would improve the code quality and enable such use
cases, I currently have no capacity to work on it or to guide it.
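
To sketch the problem outside of Flink (illustrative Python with hypothetical names, not Flink code): if file system instances are cached per scheme, the first configuration wins and every later user of that scheme silently reuses it; keying the cache by the (scheme, configuration) pair is the kind of change the refactor would need.

```python
# Illustrative sketch (hypothetical names, not Flink APIs): why a per-scheme
# FileSystem cache blocks per-table configuration.

class FileSystem:
    """Stand-in for a configured file system instance."""
    def __init__(self, scheme: str, config: dict):
        self.scheme = scheme
        self.config = config

# Current-style cache: at most one instance per scheme.
_by_scheme: dict = {}

def get_fs_by_scheme(scheme: str, config: dict) -> FileSystem:
    # Later callers with a different config silently get the first instance.
    if scheme not in _by_scheme:
        _by_scheme[scheme] = FileSystem(scheme, config)
    return _by_scheme[scheme]

# Refactored-style cache: one instance per (scheme, config) combination.
_by_scheme_and_config: dict = {}

def get_fs(scheme: str, config: dict) -> FileSystem:
    key = (scheme, tuple(sorted(config.items())))
    if key not in _by_scheme_and_config:
        _by_scheme_and_config[key] = FileSystem(scheme, config)
    return _by_scheme_and_config[key]
```

With the first cache, two tables that need different fs.s3a settings cannot coexist; with the second, each gets its own instance.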

If you are interested in working on it, I can try to find someone to
shepherd it.

On Mon, Dec 13, 2021 at 7:13 PM Timothy James  wrote:

> Thank you Timo.  Hi Arvid!
>
> I note that the ticket proposes two alternative solutions.  The first
>
> > Either we allow a properties map similar to Kafka or Kinesis properties
> to our connectors.
>
> seems to solve our problem.  The second, much more detailed, appears
> unrelated to our needs:
>
> > Or something like:
> > Management of two properties related S3 Object management:
> > ...
>
> The ticket is unassigned and has been open for more than a year.  It looks
> like you increased the ticket priority, thank you.
>
> Tim
>
> On Mon, Dec 13, 2021 at 6:52 AM Timo Walther  wrote:
>
>> Hi Timothy,
>>
>> Unfortunately, this is not supported yet. However, the effort will be
>> tracked under the following ticket:
>>
>> https://issues.apache.org/jira/browse/FLINK-19589
>>
>> I will loop in Arvid (in CC), who might help you contribute the
>> missing functionality.
>>
>> Regards,
>> Timo
>>
>>
>> On 10.12.21 23:48, Timothy James wrote:
>> > Hi,
>> >
>> > The Hadoop s3a library itself supports some properties we need, but the
>> > "FileSystem SQL Connector" (via FileSystemTableFactory) does not pass
>> > connector options for these to the "Hadoop/Presto S3 File Systems
>> > plugins" (via S3FileSystemFactory).
>> >
>> > Instead, only job-global Flink config values are passed to Hadoop s3a.
>> > That won't work for us: we need to vary these values per Flink SQL
>> > table without overriding our config for other uses of S3 (such as Flink
>> > checkpointing).
>> >
>> > Contrast this with the Kafka connector, which supports an analogous
>> > "properties.*" prefixed pass-through mechanism, and the Kinesis
>> > connector, which supports all the specific properties we would need out
>> > of the box.
>> >
>> > Our current intent is to alter FileSystemTableFactory to follow the
>> > "properties.*" approach used by the Kafka connector.
>> >
>> > *** ➡️ Our questions for you: ⬅️
>> > - Know of anything like this? Anybody solved this?
>> > - Know of anything that's going to break this approach?
>> > - What are we missing?
>> >
>> > For context, our particular use case requires options like:
>> > - fs.s3a.assumed.role.arn
>> > - fs.s3a.aws.credentials.provider (or some other mechanism to pass
>> > externalId)
>> >
>> > We imagine there would be other use cases for this, and if we build it
>> > ourselves there's the possibility of contributing it to the Flink repo
>> > for everybody.
>> >
>> > Relevant documentation:
>> > - https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/connectors/table/filesystem/
>> > - https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/filesystems/s3/#hadooppresto-s3-file-systems-plugins
>> > - https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html
>> > - https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/connectors/table/kafka/#properties
>> > - https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/connectors/table/kinesis/#aws-credentials-role-externalid
>> >
>> > Thank you!
>> >
>> > Tim James
>> > Decodable.co
>> >
>>
>>


Re: Passing arbitrary Hadoop s3a properties from FileSystem SQL Connector options

2021-12-13 Thread Timothy James
Thank you Timo.  Hi Arvid!

I note that the ticket proposes two alternative solutions.  The first

> Either we allow a properties map similar to Kafka or Kinesis properties
to our connectors.

seems to solve our problem.  The second, much more detailed, appears
unrelated to our needs:

> Or something like:
> Management of two properties related S3 Object management:
> ...

The ticket is unassigned and has been open for more than a year.  It looks
like you increased the ticket priority, thank you.

Tim

On Mon, Dec 13, 2021 at 6:52 AM Timo Walther  wrote:

> Hi Timothy,
>
> Unfortunately, this is not supported yet. However, the effort will be
> tracked under the following ticket:
>
> https://issues.apache.org/jira/browse/FLINK-19589
>
> I will loop in Arvid (in CC), who might help you contribute the
> missing functionality.
>
> Regards,
> Timo
>
>
> On 10.12.21 23:48, Timothy James wrote:
> > Hi,
> >
> > The Hadoop s3a library itself supports some properties we need, but the
> > "FileSystem SQL Connector" (via FileSystemTableFactory) does not pass
> > connector options for these to the "Hadoop/Presto S3 File Systems
> > plugins" (via S3FileSystemFactory).
> >
> > Instead, only job-global Flink config values are passed to Hadoop s3a.
> > That won't work for us: we need to vary these values per Flink SQL
> > table without overriding our config for other uses of S3 (such as Flink
> > checkpointing).
> >
> > Contrast this with the Kafka connector, which supports an analogous
> > "properties.*" prefixed pass-through mechanism, and the Kinesis
> > connector, which supports all the specific properties we would need out
> > of the box.
> >
> > Our current intent is to alter FileSystemTableFactory to follow the
> > "properties.*" approach used by the Kafka connector.
> >
> > *** ➡️ Our questions for you: ⬅️
> > - Know of anything like this? Anybody solved this?
> > - Know of anything that's going to break this approach?
> > - What are we missing?
> >
> > For context, our particular use case requires options like:
> > - fs.s3a.assumed.role.arn
> > - fs.s3a.aws.credentials.provider (or some other mechanism to pass
> > externalId)
> >
> > We imagine there would be other use cases for this, and if we build it
> > ourselves there's the possibility of contributing it to the Flink repo
> > for everybody.
> >
> > Relevant documentation:
> > - https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/connectors/table/filesystem/
> > - https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/filesystems/s3/#hadooppresto-s3-file-systems-plugins
> > - https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html
> > - https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/connectors/table/kafka/#properties
> > - https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/connectors/table/kinesis/#aws-credentials-role-externalid
> >
> > Thank you!
> >
> > Tim James
> > Decodable.co
> >
>
>


Re: Passing arbitrary Hadoop s3a properties from FileSystem SQL Connector options

2021-12-13 Thread Timo Walther

Hi Timothy,

Unfortunately, this is not supported yet. However, the effort will be
tracked under the following ticket:


https://issues.apache.org/jira/browse/FLINK-19589

I will loop in Arvid (in CC), who might help you contribute the
missing functionality.


Regards,
Timo


On 10.12.21 23:48, Timothy James wrote:

Hi,

The Hadoop s3a library itself supports some properties we need, but the 
"FileSystem SQL Connector" (via FileSystemTableFactory) does not pass 
connector options for these to the "Hadoop/Presto S3 File Systems 
plugins" (via S3FileSystemFactory).


Instead, only job-global Flink config values are passed to Hadoop s3a.
That won't work for us: we need to vary these values per Flink SQL
table without overriding our config for other uses of S3 (such as Flink
checkpointing).


Contrast this with the Kafka connector, which supports an analogous 
"properties.*" prefixed pass-through mechanism, and the Kinesis 
connector, which supports all the specific properties we would need out 
of the box.


Our current intent is to alter FileSystemTableFactory to follow the 
"properties.*" approach used by the Kafka connector.
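
To make that concrete, here is a minimal sketch of the pass-through we have in mind. This is illustrative Python rather than actual Flink code, and the function name is hypothetical; it mirrors how the Kafka connector's "properties." prefix is stripped before the remaining keys are handed to the client:

```python
# Hypothetical sketch of a "properties.*" pass-through for the filesystem
# connector: strip the prefix and forward the rest to the per-table
# Hadoop s3a configuration.

PROPERTIES_PREFIX = "properties."

def extract_hadoop_properties(connector_options: dict) -> dict:
    """Return the prefixed options with the prefix removed, leaving
    keys suitable for a per-table Hadoop configuration."""
    return {
        key[len(PROPERTIES_PREFIX):]: value
        for key, value in connector_options.items()
        if key.startswith(PROPERTIES_PREFIX)
    }

# Options as they might appear in a CREATE TABLE ... WITH (...) clause.
options = {
    "connector": "filesystem",
    "path": "s3a://my-bucket/data",
    "format": "json",
    "properties.fs.s3a.assumed.role.arn": "arn:aws:iam::123456789012:role/example",
    "properties.fs.s3a.aws.credentials.provider":
        "org.apache.hadoop.fs.s3a.auth.AssumedRoleCredentialProvider",
}

hadoop_props = extract_hadoop_properties(options)
# hadoop_props holds only the unprefixed fs.s3a.* keys.
```

Non-prefixed options ("connector", "path", "format") stay with the connector; everything under "properties." flows through untouched, so new s3a settings need no connector changes.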


*** ➡️ Our questions for you: ⬅️
- Know of anything like this? Anybody solved this?
- Know of anything that's going to break this approach?
- What are we missing?

For context, our particular use case requires options like:
- fs.s3a.assumed.role.arn
- fs.s3a.aws.credentials.provider (or some other mechanism to pass
externalId)


We imagine there would be other use cases for this, and if we build it 
ourselves there's the possibility of contributing it to the Flink repo 
for everybody.


Relevant documentation:
- https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/connectors/table/filesystem/
- https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/filesystems/s3/#hadooppresto-s3-file-systems-plugins
- https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html
- https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/connectors/table/kafka/#properties
- https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/connectors/table/kinesis/#aws-credentials-role-externalid



Thank you!

Tim James
Decodable.co