The Dataflow service has a 10 MB request size limit, and it seems you are hitting it. See the following page for more information: https://cloud.google.com/dataflow/pipelines/troubleshooting-your-pipeline
It looks like you are hitting this because of the number of partitions: each partition adds its own steps to the job graph, and that graph is sent in the job creation request. I don't think there is currently a good solution other than executing multiple jobs. We hope to add a dynamic destinations feature to the Python BQ sink in the near future, which will allow you to write this as a more compact pipeline.

Thanks,
Cham

On Wed, Jan 10, 2018 at 10:22 PM Unais Thachuparambil <unais.thachuparam...@careem.com> wrote:

> I wrote a Python Dataflow job to read data from BigQuery, do some
> transforms, and save the result as a BQ table.
>
> I tested with 8 days of data and it works fine. When I scaled to 180 days I’m
> getting the error below:
>
> ```"message": "Request payload size exceeds the limit: 10485760 bytes.",```
>
> ```pitools.base.py.exceptions.HttpError: HttpError accessing <
> https://dataflow.googleapis.com/v1b3/projects/careem-mktg-dwh/locations/us-central1/jobs?alt=json>:
> response: <{'status': '400', 'content-length': '145', 'x-xss-protection':
> '1; mode=block', 'x-content-type-options': 'nosniff', 'transfer-encoding':
> 'chunked', 'vary': 'Origin, X-Origin, Referer', 'server': 'ESF',
> '-content-encoding': 'gzip', 'cache-control': 'private', 'date': 'Wed, 10
> Jan 2018 22:49:32 GMT', 'x-frame-options': 'SAMEORIGIN', 'alt-svc':
> 'hq=":443"; ma=2592000; quic=51303431; quic=51303339; quic=51303338;
> quic=51303337; quic=51303335,quic=":443"; ma=2592000; v="41,39,38,37,35"',
> 'content-type': 'application/json; charset=UTF-8'}>, content <{
> "error": {
> "code": 400,
> "message": "Request payload size exceeds the limit: 10485760 bytes.",
> "status": "INVALID_ARGUMENT"
> }
>
> ```
>
> In short, this is what I’m doing:
> 1 - Reading data from a BigQuery table using
> ```beam.io.BigQuerySource```
> 2 - Partitioning the data by day using
> ```beam.Partition```
> 3 - Applying transforms to each partition and combining some of the output
> PCollections.
> 4 - After the transforms, the results are saved to a BigQuery
> date-partitioned table.
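
For reference, the multiple-jobs workaround mentioned in the reply could be sketched roughly as below. This is only an illustration, assuming each day can be processed independently of the others: the 180-day range is cut into windows of 8 days (the size reported to work), and one job would be submitted per window. The helper name `date_windows` and the example dates are hypothetical and not taken from the original pipeline; how each window is passed to the pipeline (for example as a filter on the BigQuery read) is left to the existing job code.

```python
from datetime import date, timedelta


def date_windows(start, end, days_per_job):
    """Yield inclusive (window_start, window_end) pairs covering start..end.

    Hypothetical helper: every window holds at most `days_per_job` days,
    so each submitted job creates at most that many partitions.
    """
    if days_per_job < 1:
        raise ValueError("days_per_job must be at least 1")
    one_day = timedelta(days=1)
    current = start
    while current <= end:
        # Clamp the last window so it never runs past `end`.
        window_end = min(current + timedelta(days=days_per_job) - one_day, end)
        yield current, window_end
        current = window_end + one_day


# Illustrative 180-day range (assumed dates), split into 8-day windows.
windows = list(date_windows(date(2017, 7, 1), date(2017, 12, 27), 8))

for window_start, window_end in windows:
    # Placeholder: launch one Dataflow job restricted to this window here.
    print("would submit job for", window_start, "to", window_end)
```

Keeping the window size as a parameter makes it easy to try larger windows until the request size limit is reached again.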