Hi Matthew,

Welcome to Beam!

Looking at the Python PubSub IO API, you should be able to access the
message id and timestamp by setting `with_attributes=True` on the
`ReadFromPubSub` PTransform; see [1, 2].

[1]
https://github.com/apache/beam/blob/0fce2b88660f52dae638697e1472aa108c982ae6/sdks/python/apache_beam/io/gcp/pubsub.py#L61
[2]
https://github.com/apache/beam/blob/0fce2b88660f52dae638697e1472aa108c982ae6/sdks/python/apache_beam/io/gcp/pubsub.py#L138
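
To make the suggestion above concrete, here is a rough sketch of what
reading with `with_attributes=True` might look like. It is untested
against a real subscription. The helper names `message_to_row` and
`run` are made up for illustration, and I am assuming the payload is
UTF-8 JSON, as in your sample. I am also assuming that the message
object carries `message_id` and `publish_time` fields, which may only
be the case on newer SDK versions (the 2.13.0 docs list only `data`
and `attributes`), so the code falls back to None when they are missing.

```python
import json


def message_to_row(msg):
    """Flatten a Pub/Sub message object into a plain dict (hypothetical helper)."""
    publish_time = getattr(msg, "publish_time", None)
    return {
        # Assumption: the payload is UTF-8 encoded JSON.
        "payload": json.loads(msg.data.decode("utf-8")),
        "attributes": dict(msg.attributes or {}),
        # Assumption: these fields may be absent on older SDK versions,
        # in which case None is used instead.
        "message_id": getattr(msg, "message_id", None),
        "publish_time": None if publish_time is None else str(publish_time),
    }


def run(subscription, pipeline_args=None):
    """Build and run a streaming pipeline that prints one dict per message."""
    # Imported here so message_to_row can be tried without Beam installed.
    import apache_beam as beam
    from apache_beam.io.gcp.pubsub import ReadFromPubSub
    from apache_beam.options.pipeline_options import (
        PipelineOptions,
        StandardOptions,
    )

    options = PipelineOptions(pipeline_args)
    options.view_as(StandardOptions).streaming = True  # Pub/Sub reads are unbounded

    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "Read" >> ReadFromPubSub(
                subscription=subscription, with_attributes=True)
            | "ToRow" >> beam.Map(message_to_row)
            | "Print" >> beam.Map(print)
        )
```

You would call `run("projects/<project>/subscriptions/<name>")` with
your own subscription path. The `Print` step is only a placeholder for
whatever file or BigQuery sink you end up using.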

On Fri, Jul 12, 2019 at 1:36 AM Matthew Darwin <
matthew.dar...@carfinance247.co.uk> wrote:

> Good morning,
>
> I'm very new to Beam, and pretty new to Python so please first accept my
> apologies for any obvious misconceptions/mistakes in the following.
>
> I am currently trying to develop a sample pipeline in Python to pull
> messages from Pub/Sub and then write them to either files in cloud storage
> or to BigQuery. The ultimate goal will be to utilise the pipeline for real
> time streaming of event data to BigQuery (with various transformations) but
> also to store the raw messages long term in files in cloud storage.
>
> At the moment, I'm simply trying to parse the message to get the PubSub
> messageId and publishTime in order to be able to write them into the
> output. The JSON of my PubSub message looks like this:
>
> [
>   {
>     "ackId":
> "BCEhPjA-RVNEUAYWLF1GSFE3GQhoUQ5PXiM_NSAoRRIICBQFfH1xU1t1Xl8aB1ENGXJ8Zyc_XxcIB0BTeFVaEQx6bVxXOFcMEHF8YXZpWhUIA0FTfXeq5cveluzJNksxIbvE8KxfeqqmgfhiZho9XxJLLD5-PT5FQV5AEkw2C0RJUytDCypYEU4",
>     "message": {
>       "attributes": {
>         "source": "python"
>       },
>       "data": "eyJyb3dudW1iZXIiOiAyfQ==",
>       "messageId": "619310330691403",
>       "publishTime": "2019-07-12T08:27:58.522Z"
>     }
>   }
> ]
>
> According to the documentation
> <https://beam.apache.org/releases/pydoc/2.13.0/apache_beam.io.gcp.pubsub.html>
> the PubSub message payload returns the *data* and *attributes*
> properties; is there simply no way of retrieving the messageId and
> publishTime, or are these exposed somewhere else? If not, is their
> inclusion on the roadmap, and are they available when using Java?
> (I have zero Java experience, which is why I reached for Python first.)
>
> Kind regards,
>
> Matthew
>
>
