[ 
https://issues.apache.org/jira/browse/BEAM-6154?focusedWorklogId=189600&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-189600
 ]

ASF GitHub Bot logged work on BEAM-6154:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 24/Jan/19 18:18
            Start Date: 24/Jan/19 18:18
    Worklog Time Spent: 10m 
      Work Description: markflyhigh commented on pull request #7617: 
[BEAM-6154] Update google-apitools to 0.5.26 and fix gcsio in python 3
URL: https://github.com/apache/beam/pull/7617
 
 
   google-apitools 0.5.26 contains a critical Python 3 fix that helps unblock 
DataflowRunner in Python 3. The problem is described in 
https://issues.apache.org/jira/browse/BEAM-6154. This PR contains a fix for the 
problem and upgrades google-apitools.
   
   **Note: this fix touches `base_image_requirements.txt`, which is used to 
build the Python SDK harness container image.**
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

            Worklog Id:     (was: 189600)
            Time Spent: 10m
    Remaining Estimate: 0h

> Gcsio batch delete broken in Python 3
> -------------------------------------
>
>                 Key: BEAM-6154
>                 URL: https://issues.apache.org/jira/browse/BEAM-6154
>             Project: Beam
>          Issue Type: Sub-task
>          Components: sdk-py-core
>            Reporter: Mark Liu
>            Assignee: Mark Liu
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> I'm running the Python SDK against GCP in Python 3.5 and got the following 
> gcsio error while deleting files:
> {code}
>   File "/usr/local/lib/python3.5/site-packages/apache_beam/io/iobase.py", line 1077, in <genexpr>
>     window.TimestampedValue(v, timestamp.MAX_TIMESTAMP) for v in outputs)
>   File "/usr/local/lib/python3.5/site-packages/apache_beam/io/filebasedsink.py", line 315, in finalize_write
>     num_threads)
>   File "/usr/local/lib/python3.5/site-packages/apache_beam/internal/util.py", line 145, in run_using_threadpool
>     return pool.map(fn_to_execute, inputs)
>   File "/usr/local/lib/python3.5/multiprocessing/pool.py", line 266, in map
>     return self._map_async(func, iterable, mapstar, chunksize).get()
>   File "/usr/local/lib/python3.5/multiprocessing/pool.py", line 644, in get
>     raise self._value
>   File "/usr/local/lib/python3.5/multiprocessing/pool.py", line 119, in worker
>     result = (True, func(*args, **kwds))
>   File "/usr/local/lib/python3.5/multiprocessing/pool.py", line 44, in mapstar
>     return list(map(*args))
>   File "/usr/local/lib/python3.5/site-packages/apache_beam/io/filebasedsink.py", line 299, in _rename_batch
>     FileSystems.rename(source_files, destination_files)
>   File "/usr/local/lib/python3.5/site-packages/apache_beam/io/filesystems.py", line 252, in rename
>     return filesystem.rename(source_file_names, destination_file_names)
>   File "/usr/local/lib/python3.5/site-packages/apache_beam/io/gcp/gcsfilesystem.py", line 229, in rename
>     copy_statuses = gcsio.GcsIO().copy_batch(batch)
>   File "/usr/local/lib/python3.5/site-packages/apache_beam/io/gcp/gcsio.py", line 322, in copy_batch
>     api_calls = batch_request.Execute(self.client._http)  # pylint: disable=protected-access
>   File "/usr/local/lib/python3.5/site-packages/apitools/base/py/batch.py", line 222, in Execute
>     batch_http_request.Execute(http)
>   File "/usr/local/lib/python3.5/site-packages/apitools/base/py/batch.py", line 480, in Execute
>     self._Execute(http)
>   File "/usr/local/lib/python3.5/site-packages/apitools/base/py/batch.py", line 450, in _Execute
>     mime_response = parser.parsestr(header + response.content)
> TypeError: Can't convert 'bytes' object to str implicitly
> {code} 
> After looking into the related code in the apitools library, I found that the 
> response.content returned by the HTTP request to GCS is bytes, and apitools 
> doesn't handle this case. This can block any pipeline that depends on gcsio, 
> and it apparently blocks all Dataflow jobs in Python 3.
> This could be another reason to move off the apitools dependency, as tracked 
> in [BEAM-4850|https://issues.apache.org/jira/browse/BEAM-4850].
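
For illustration, here is a minimal sketch of the failure mode described in the issue and one way to guard against it. It is not the actual apitools 0.5.26 patch. The helper name `parse_batch_response`, the sample header and body, and the choice of UTF-8 for decoding are all assumptions made for this example. Only the `parser.parsestr(header + response.content)` pattern comes from the traceback.

```python
from email.message import Message
from email.parser import Parser


def parse_batch_response(header: str, content) -> Message:
    """Parse a MIME batch response whose body may be bytes or str.

    Hypothetical helper: on Python 3 an HTTP response body is bytes, so
    `header + content` raises TypeError unless the body is decoded first.
    """
    if isinstance(content, bytes):
        # Assumption: the body is UTF-8 text; a real client should use the
        # charset declared by the response.
        content = content.decode("utf-8")
    return Parser().parsestr(header + content)


# Made-up sample data standing in for a GCS batch response.
sample_header = 'content-type: multipart/mixed; boundary="batch_x"\r\n\r\n'
sample_body = (
    b"--batch_x\r\n"
    b"Content-Type: text/plain\r\n"
    b"\r\n"
    b"hello\r\n"
    b"--batch_x--\r\n"
)

message = parse_batch_response(sample_header, sample_body)
parts = message.get_payload()  # list of sub-messages for a multipart body
```

Concatenating `sample_header + sample_body` directly, without the decode step, raises a `TypeError` on Python 3, which matches the last frame of the traceback.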



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
