BigQuery FILE_LOADS recover from errors

2018-03-27 Thread Carlos Alonso
Hi all!!

In my pipeline I want to dump some data into BQ using the FILE_LOADS write
method, and I can't see how I would recover from errors (i.e. detect in the
pipeline which records couldn't be inserted and store them somewhere else
for further inspection), as the WriteTables transform throws an exception
after BatchLoads.MAX_RETRY_JOBS retries...

How could I approach that?
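
For context, the write is configured roughly like this (a minimal sketch;
the table name, schema and dispositions here are placeholders, not my real
ones):

import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.CreateDisposition;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.WriteDisposition;

// rows is a PCollection<TableRow> built earlier in the pipeline,
// tableSchema its matching TableSchema.
rows.apply("WriteToBigQuery",
    BigQueryIO.writeTableRows()
        .to("my-project:my_dataset.my_table")            // placeholder table
        .withSchema(tableSchema)
        .withMethod(BigQueryIO.Write.Method.FILE_LOADS)  // batch load jobs, not streaming inserts
        .withCreateDisposition(CreateDisposition.CREATE_IF_NEEDED)
        .withWriteDisposition(WriteDisposition.WRITE_APPEND));

With streaming inserts I believe WriteResult.getFailedInserts() would give
me the rows that failed, but as far as I can tell that doesn't apply to
FILE_LOADS.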

Thanks!


Re: Dataflow throwing backend error

2018-03-27 Thread Ahmet Altay
Hi Rajesh,

This looks like a transient error from GCS. The Beam SDK retries tasks in
the face of such errors, and they typically do not make your pipeline fail.
If you have additional questions please reach out to Dataflow support (
https://cloud.google.com/dataflow/support).

Thank you,
Ahmet

On Tue, Mar 27, 2018 at 3:58 AM, Rajesh Hegde wrote:

> Hi,
> We are getting a backend error from the Google Cloud Storage service; any
> idea why it happens and how to fix it? The error traceback is pasted below.
>
> Traceback (most recent call last):
>   File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 609, in do_work
>     work_executor.execute()
>   File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/executor.py", line 170, in execute
>     op.finish()
>   File "dataflow_worker/native_operations.py", line 93, in dataflow_worker.native_operations.NativeWriteOperation.finish
>     def finish(self):
>   File "dataflow_worker/native_operations.py", line 94, in dataflow_worker.native_operations.NativeWriteOperation.finish
>     with self.scoped_finish_state:
>   File "dataflow_worker/native_operations.py", line 95, in dataflow_worker.native_operations.NativeWriteOperation.finish
>     self.writer.__exit__(None, None, None)
>   File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/nativefileio.py", line 459, in __exit__
>     self.file.close()
>   File "/usr/local/lib/python2.7/dist-packages/apache_beam/io/filesystemio.py", line 201, in close
>     self._uploader.finish()
>   File "/usr/local/lib/python2.7/dist-packages/apache_beam/io/gcp/gcsio.py", line 575, in finish
>     raise self._upload_thread.last_error  # pylint: disable=raising-bad-type
> HttpError: HttpError accessing /o?uploadType=resumable&alt=json&upload_id=AEnB2UoK85wrqW1nWJ2cN7DQ5JKdQtTyDX-LwRwgIlIHgVL0KWR8JlUcLOJYFWmv9_YfpVhlKsooB4tHL2cxXch9hVls4nxFnw&name=temp%2Fbeamapp-bzftmxc-0327102557-645475.1522146357.645609%2F11032707444842841251%2Fdax-tmp-2018-03-27_03_27_30-7388048597187286914-S05-0-adbd1398680cf5c7%2F-shard--try-0c9aa3475bd67907-endshard.json>:
> response: <{'status': '410', 'content-length': '177', 'vary': 'Origin, X-Origin', 'server': 'UploadServer', 'x-guploader-uploadid': 'AEnB2UoK85wrqW1nWJ2cN7DQ5JKdQtTyDX-LwRwgIlIHgVL0KWR8JlUcLOJYFWmv9_YfpVhlKsooB4tHL2cxXch9hVls4nxFnw', 'date': 'Tue, 27 Mar 2018 10:41:17 GMT', 'content-type': 'application/json; charset=UTF-8'}>, content <{
>  "error": {
>   "errors": [
>    {
>     "domain": "global",
>     "reason": "backendError",
>     "message": "Backend Error"
>    }
>   ],
>   "code": 500,
>   "message": "Backend Error"
>  }
> }
> >
>
> --
>
> Rajesh Hegde | Lead Product Developer | Datalicious
> e: rhe...@datalicious.com | m: +919167571827
> a: L-77, 15th Cross Rd, Sector 6, HSR Layout, Bangalore, Karnataka - 560102
> w: www.datalicious.com
>
> Contact supp...@datalicious.com anytime, we're keen to help!