Cham is absolutely correct. The google-cloud-pubsub higher-level library
didn't exist when the Beam connector was written, and nobody has gotten
around to rewriting that connector.

On Wed, Jan 2, 2019 at 11:29 PM Chamikara Jayalath <chamik...@google.com>
wrote:

> Thanks Jeff for the interest in this.
>
> I think most of the existing GCP IO connectors use Google API client
> libraries [1] due to historical reasons (these were the libraries that were
> available when these connectors were originally built).
> We should upgrade to the latest Google Cloud client libraries [2] at some
> point but I don't have an exact ETA for this.
>
> Thanks,
> Cham
>
> [1] https://developers.google.com/api-client-library/
> [2] https://cloud.google.com/apis/docs/cloud-client-libraries
>
> On Wed, Jan 2, 2019 at 10:38 AM Jeff Klukas <jklu...@mozilla.com> wrote:
>
>> My apologies. I got the terminology entirely wrong.
>>
>> As you say, PubsubIO and other Beam components _do_ use the official
>> Google API client library (google-api-client). They do not, however, use
>> the higher-level Google Cloud libraries such as google-cloud-pubsub which
>> provide abstractions on top of the API client library.
>>
>> I am wondering whether there are technical reasons not to use the
>> higher-level service-specific libraries, or whether this is simply
>> historical.
>>
>> On Wed, Jan 2, 2019 at 12:38 PM Anton Kedin <ke...@google.com> wrote:
>>
>>> I don't have enough context to answer all of the questions, but looking
>>> at PubsubIO it seems to use the official libraries, e.g. see Pubsub doc
>>> [1] vs Pubsub IO GRPC client [2]. Correct me if I misunderstood
>>> your question.
>>>
>>> [1]
>>> https://cloud.google.com/pubsub/docs/publisher#pubsub-publish-message-java
>>> [2]
>>> https://github.com/apache/beam/blob/2e759fecf63d62d110f29265f9438128e3bdc8ab/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubGrpcClient.java#L189
>>>
>>> Pubsub IO JSON client seems to use a slightly different approach but
>>> still relies on a somewhat official path, e.g. Pubsub doc [3] (javadoc [4]) vs
>>> Pubsub IO JSON client [5].
>>>
>>> [3] https://developers.google.com/api-client-library/java/apis/pubsub/v1
>>> [4]
>>> https://developers.google.com/resources/api-libraries/documentation/pubsub/v1/java/latest/com/google/api/services/pubsub/Pubsub.html
>>> [5]
>>> https://github.com/apache/beam/blob/2e759fecf63d62d110f29265f9438128e3bdc8ab/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubJsonClient.java#L130
>>>
>>> The latter seems to be the older library, so I would assume it's for
>>> legacy reasons.
>>>
>>> Regards,
>>> Anton
>>>
>>>
>>> On Wed, Jan 2, 2019 at 9:03 AM Jeff Klukas <jklu...@mozilla.com> wrote:
>>>
>>>> I'm building a high-volume Beam pipeline using PubsubIO and running
>>>> into some concerns over performance and delivery semantics, prompting me to
>>>> want to better understand the implementation. Reading through the library,
>>>> PubsubIO appears to be a completely separate implementation of Pubsub
>>>> client behavior from Google's own Java client. As a developer trying to
>>>> read and understand the implementation, this is a significant hurdle, since
>>>> any previous knowledge of the Google library is not applicable and is
>>>> potentially at odds with what's in PubsubIO.
>>>>
>>>> Why doesn't Beam use the Google clients for PubsubIO, BigQueryIO, etc.?
>>>> Is it for historical reasons? Is there difficulty in packaging and
>>>> integration of the Google clients? Or are the needs for Beam just
>>>> substantially different from what the Google libraries provide?
>>>>
>>>
