ahmedabu98 opened a new issue, #25669:
URL: https://github.com/apache/beam/issues/25669

   ### What happened?
   
   While testing a SchemaTransform Python wrapper (#25521), I found that the 
kwargs had to be passed in a specific order for the input arguments to reach 
the right fields of the Java transform. This is unexpected, because the order 
of kwargs should have no effect.
   
   For example, where `self._table="my_project:my_dataset.xlang_table"`,
   
   the following works fine:
   ```
   external_storage_write = SchemaAwareExternalTransform(
       identifier=self.schematransform_config.identifier,
       expansion_service=self._expansion_service,
       createDisposition=self._create_disposition,
       writeDisposition=self._write_disposition,     #<---
       triggeringFrequencySeconds=self._triggering_frequency,
       useAtLeastOnceSemantics=self._use_at_least_once,
       table=self._table)                            #<---
   ```
   and I get a configuration object in the Java transform that looks like this:
   ```
   BigQueryStorageWriteApiSchemaTransformConfiguration{
     table=my_project:my_dataset.xlang_table, 
     createDisposition=, 
     writeDisposition=, 
     triggeringFrequencySeconds=0, 
     useAtLeastOnceSemantics=false}
   ```
   
   However, if I swap the positions of `table` and `writeDisposition` in the 
kwargs:
   ```
   external_storage_write = SchemaAwareExternalTransform(
       identifier=self.schematransform_config.identifier,
       expansion_service=self._expansion_service,
       createDisposition=self._create_disposition,
       table=self._table,                            #<---
       triggeringFrequencySeconds=self._triggering_frequency,
       useAtLeastOnceSemantics=self._use_at_least_once,
       writeDisposition=self._write_disposition)     #<---
   ```
   I get the following configuration object. Notice the value intended for 
`table` is now in the `writeDisposition` field.
   ```
   BigQueryStorageWriteApiSchemaTransformConfiguration{
     table=, 
     createDisposition=, 
     writeDisposition=my_project:my_dataset.xlang_table, 
     triggeringFrequencySeconds=0, 
     useAtLeastOnceSemantics=false}
   ```
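
For illustration only: the two outputs above are consistent with values being matched to fields by position rather than by name. The sketch below is not Beam code and not a confirmed diagnosis. `EXPECTED_ORDER`, `map_by_position`, and `map_by_name` are hypothetical names, the assumed field order is inferred from the kwargs order that worked, and the empty disposition values are assumed from the printed configuration.

```python
# Hypothetical illustration; none of these names come from Beam.

# Field order the receiving side is ASSUMED to expect (inferred from the
# kwargs order that produced the correct configuration).
EXPECTED_ORDER = [
    "createDisposition",
    "writeDisposition",
    "triggeringFrequencySeconds",
    "useAtLeastOnceSemantics",
    "table",
]


def map_by_position(**kwargs):
    """Pair values with the expected field names purely by position."""
    return dict(zip(EXPECTED_ORDER, kwargs.values()))


def map_by_name(**kwargs):
    """Pair values with the expected field names by key (order-independent)."""
    return {name: kwargs[name] for name in EXPECTED_ORDER}


# Same kwargs order as the failing call (table and writeDisposition swapped).
swapped = dict(
    createDisposition="",  # assumed empty, as in the printed configuration
    table="my_project:my_dataset.xlang_table",
    triggeringFrequencySeconds=0,
    useAtLeastOnceSemantics=False,
    writeDisposition="",  # assumed empty, as in the printed configuration
)

by_position = map_by_position(**swapped)  # table value lands in writeDisposition
by_name = map_by_name(**swapped)          # table value stays in table
```

With positional matching the `table` value ends up under `writeDisposition`, which is the same symptom as in the second configuration object; matching by name is unaffected by the kwargs order.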
   
   
   ### Issue Priority
   
   Priority: 1 (data loss / total loss of function)
   
   ### Issue Components
   
   - [X] Component: Python SDK
   - [ ] Component: Java SDK
   - [ ] Component: Go SDK
   - [ ] Component: Typescript SDK
   - [ ] Component: IO connector
   - [ ] Component: Beam examples
   - [ ] Component: Beam playground
   - [ ] Component: Beam katas
   - [ ] Component: Website
   - [ ] Component: Spark Runner
   - [ ] Component: Flink Runner
   - [ ] Component: Samza Runner
   - [ ] Component: Twister2 Runner
   - [ ] Component: Hazelcast Jet Runner
   - [ ] Component: Google Cloud Dataflow Runner

