gaborgsomogyi opened a new pull request, #27923:
URL: https://github.com/apache/flink/pull/27923

   ## What is the purpose of the change
   
   Problem
   
   Flink 2.0 switched from flink-conf.yaml to config.yaml (standard YAML 
format). Standard YAML cannot represent a key that is simultaneously a scalar 
value and a parent node. This causes a structural conflict for several S3A 
config key pairs where one key is a dot-prefix of another:
   
   - `fs.s3a.endpoint` vs `fs.s3a.endpoint.region`
   - `fs.s3a.assumed.role.sts.endpoint` vs 
`fs.s3a.assumed.role.sts.endpoint.region`
   - `fs.s3a.fast.upload` vs `fs.s3a.fast.upload.buffer` / 
`fs.s3a.fast.upload.active.blocks`
   - `fs.s3a.multipart.purge` vs `fs.s3a.multipart.purge.age`
   - `fs.s3a.s3guard.ddb.table` vs `fs.s3a.s3guard.ddb.table.create` / 
`.capacity.read` / `.capacity.write` / etc.
   
   When both keys are present in flinkConfiguration, the standard YAML 
serializer silently drops the longer key. For example, fs.s3a.endpoint.region 
disappears from config.yaml without any error, causing the S3 client to use the 
wrong signing region.
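   
   To make the conflict concrete, here is a minimal sketch (not part of this 
PR) that scans a set of flat keys and reports every pair where one key is a 
dot-prefix of another. The class and method names are hypothetical and exist 
only for illustration.
   
   ```java
   import java.util.ArrayList;
   import java.util.List;
   import java.util.Set;
   import java.util.TreeSet;

   // Hypothetical helper, not Flink code: finds keys that would have to be
   // both a scalar and a parent node in a nested YAML document.
   public class PrefixCollisionSketch {

       /** Returns "parent -> child" for every key that is a dot-prefix of another key. */
       public static List<String> findPrefixCollisions(Set<String> keys) {
           List<String> collisions = new ArrayList<>();
           // Sort the keys so the result order is deterministic.
           TreeSet<String> sorted = new TreeSet<>(keys);
           for (String parent : sorted) {
               for (String child : sorted) {
                   // The trailing dot ensures "a.b" matches "a.b.c" but not "a.bc".
                   if (child.startsWith(parent + ".")) {
                       collisions.add(parent + " -> " + child);
                   }
               }
           }
           return collisions;
       }

       public static void main(String[] args) {
           Set<String> keys = Set.of(
                   "fs.s3a.endpoint",
                   "fs.s3a.endpoint.region",
                   "fs.s3a.fast.upload.buffer");
           // Prints: [fs.s3a.endpoint -> fs.s3a.endpoint.region]
           System.out.println(findPrefixCollisions(keys));
       }
   }
   ```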
   
   Solution
   
   Add .value suffix aliases for the affected scalar keys to 
MIRRORED_CONFIG_KEYS in S3FileSystemFactory. The existing 
mirrorCertainHadoopConfig() mechanism in HadoopConfigLoader already supports 
this pattern — it copies the value of the alias key to the actual Hadoop key 
before the S3A filesystem is initialized.
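   
   The mirroring step can be pictured roughly as follows. This is a 
simplified sketch under stated assumptions, not the actual HadoopConfigLoader 
code: the class name, the plain `Map` standing in for the Hadoop 
configuration, and the two-entry alias table are all illustrative. It also 
assumes the alias overwrites any value already set on the real key.
   
   ```java
   import java.util.HashMap;
   import java.util.Map;

   // Hypothetical stand-in for the alias mirroring described above.
   public class AliasMirrorSketch {

       // {alias key written in config.yaml, actual Hadoop key}; illustrative subset.
       static final String[][] ALIASES = {
           {"fs.s3a.endpoint.value", "fs.s3a.endpoint"},
           {"fs.s3a.fast.upload.value", "fs.s3a.fast.upload"},
       };

       /** Copies each alias value onto its real key; all other entries pass through. */
       public static Map<String, String> mirror(Map<String, String> conf) {
           Map<String, String> result = new HashMap<>(conf);
           for (String[] pair : ALIASES) {
               String value = conf.get(pair[0]);
               if (value != null) {
                   // Assumption in this sketch: the alias value wins over an existing one.
                   result.put(pair[1], value);
               }
           }
           return result;
       }

       public static void main(String[] args) {
           Map<String, String> conf = new HashMap<>();
           conf.put("fs.s3a.endpoint.value", "https://s3.eu-west-1.amazonaws.com");
           conf.put("fs.s3a.endpoint.region", "eu-west-1");
           Map<String, String> mirrored = mirror(conf);
           // The real key now carries the alias value; the region key is untouched.
           System.out.println(mirrored.get("fs.s3a.endpoint"));
           System.out.println(mirrored.get("fs.s3a.endpoint.region"));
       }
   }
   ```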
   
   Users on Flink 2.x can now write:
   
   ```
   # config.yaml - no collision, both are children of the same parent node
   fs:
     s3a:
       endpoint:
         value: https://s3.eu-west-1.amazonaws.com   # remapped to fs.s3a.endpoint
         region: eu-west-1                            # passed through as-is
       fast:
         upload:
           value: "true"                              # remapped to fs.s3a.fast.upload
           buffer: disk
   ```
   
   The .value convention is intentionally uniform across all aliases so users 
can immediately understand the pattern without consulting documentation.
   
   This follows the same approach already used for `fs.s3a.access-key` → 
`fs.s3a.access.key` and `fs.s3a.path-style-access` → `fs.s3a.path.style.access`.
   
   ## Brief change log
   
   Add `.value` suffix key aliases in `S3FileSystemFactory` to support Flink v2 
standard YAML.
   
   ## Verifying this change
   
   Additional unit test.
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): no
     - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
     - The serializers: no
     - The runtime per-record code paths (performance sensitive): no
     - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
     - The S3 file system connector: yes
   
   ## Documentation
   
     - Does this pull request introduce a new feature? no
     - If yes, how is the feature documented? not applicable
   

