[ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=296816&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296816
 ]

ASF GitHub Bot logged work on BEAM-7738:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 17/Aug/19 18:11
            Start Date: 17/Aug/19 18:11
    Worklog Time Spent: 10m 
      Work Description: chadrik commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-522258539
 
 
   Ok, I now know how the `PubsubMessageWithAttributesCoder` is getting swapped 
with a `BytesCoder`, but it turns out it's intentional and I don't know why.
   
   In `FlinkStreamingPortablePipelineTranslator.translateRead` when the coder 
is initiated it is passed through 
`LengthPrefixUnknownCoders.addLengthPrefixedCoder` which silently replaces all 
non-model coders with `LengthPrefixCoder(ByteArrayCoder)`.  This _seems_ like 
the sort of thing that should print a warning, since I assume that a broken 
pipeline is a likely outcome.
   
   I'm unclear why this coder swap is necessary.  The java part of this 
pipeline is a `Read -> ParDo -> ParDo`, shouldn't this segment be able to 
utilize java-only coders (i.e. non-model coder)?
   
   What's the proper solution here?  The last java ParDo is the one that's 
ensuring we have a byte array for sending to python, but evidently this needs 
to be happening in the Read itself?
   
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 296816)
    Time Spent: 2h 20m  (was: 2h 10m)

> Support PubSubIO to be configured externally for use with other SDKs
> --------------------------------------------------------------------
>
>                 Key: BEAM-7738
>                 URL: https://issues.apache.org/jira/browse/BEAM-7738
>             Project: Beam
>          Issue Type: New Feature
>          Components: io-java-gcp, runner-flink, sdk-py-core
>            Reporter: Chad Dombrova
>            Assignee: Chad Dombrova
>            Priority: Major
>              Labels: portability
>          Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Now that KafkaIO is supported via the external transform API (BEAM-7029) we 
> should add support for PubSub.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

Reply via email to