[ https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=296816&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296816 ]
ASF GitHub Bot logged work on BEAM-7738: ---------------------------------------- Author: ASF GitHub Bot Created on: 17/Aug/19 18:11 Start Date: 17/Aug/19 18:11 Worklog Time Spent: 10m Work Description: chadrik commented on issue #9268: [BEAM-7738] Add external transform support to PubsubIO URL: https://github.com/apache/beam/pull/9268#issuecomment-522258539 Ok, I now know how the `PubsubMessageWithAttributesCoder` is getting swapped with a `BytesCoder`, but it turns out it's intentional and I don't know why. In `FlinkStreamingPortablePipelineTranslator.translateRead` when the coder is initiated it is passed through `LengthPrefixUnknownCoders.addLengthPrefixedCoder` which silently replaces all non-model coders with `LengthPrefixCoder(ByteArrayCoder)`. This _seems_ like the sort of thing that should print a warning, since I assume that a broken pipeline is a likely outcome. I'm unclear why this coder swap is necessary. The java part of this pipeline is a `Read -> ParDo -> ParDo`, shouldn't this segment be able to utilize java-only coders (i.e. non-model coder)? What's the proper solution here? The last java ParDo is the one that's ensuring we have a byte array for sending to python, but evidently this needs to be happening in the Read itself? ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking ------------------- Worklog Id: (was: 296816) Time Spent: 2h 20m (was: 2h 10m) > Support PubSubIO to be configured externally for use with other SDKs > -------------------------------------------------------------------- > > Key: BEAM-7738 > URL: https://issues.apache.org/jira/browse/BEAM-7738 > Project: Beam > Issue Type: New Feature > Components: io-java-gcp, runner-flink, sdk-py-core > Reporter: Chad Dombrova > Assignee: Chad Dombrova > Priority: Major > Labels: portability > Time Spent: 2h 20m > Remaining Estimate: 0h > > Now that KafkaIO is supported via the external transform API (BEAM-7029) we > should add support for PubSub. -- This message was sent by Atlassian JIRA (v7.6.14#76016)