[jira] [Work logged] (BEAM-4742) Allow custom docker-image in portable wordcount example

2018-07-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4742?focusedWorklogId=120990&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-120990
 ]

ASF GitHub Bot logged work on BEAM-4742:


Author: ASF GitHub Bot
Created on: 09/Jul/18 20:01
Start Date: 09/Jul/18 20:01
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on issue #5903: [BEAM-4742] 
mkdirs if they don't exist in localfilesystem
URL: https://github.com/apache/beam/pull/5903#issuecomment-403602098
 
 
   OK, I'm reorienting this PR around 
[BEAM-4747](https://issues.apache.org/jira/browse/BEAM-4747) and will respond 
to your comments above shortly.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 120990)
Time Spent: 2h 10m  (was: 2h)

> Allow custom docker-image in portable wordcount example
> ---
>
> Key: BEAM-4742
> URL: https://issues.apache.org/jira/browse/BEAM-4742
> Project: Beam
>  Issue Type: Improvement
>  Components: examples-python
>Affects Versions: 2.5.0
>Reporter: Ryan Williams
>Assignee: Ryan Williams
>Priority: Minor
> Fix For: 2.5.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> I hit a couple snags [running the portable wordcount 
> example|https://github.com/apache/beam/blob/997ee3afe74483ae44e2dcb32ca0e24876129cd9/sdks/python/build.gradle#L200-L214]:
>  * -[the default docker image is hard-coded to a bintray 
> URL|https://github.com/apache/beam/blob/997ee3afe74483ae44e2dcb32ca0e24876129cd9/sdks/python/apache_beam/runners/portability/portable_runner.py#L60-L68],
>  but I published my image to Docker Hub- I missed that [there's already a 
> pipeline option for 
> this|https://github.com/apache/beam/pull/5902#discussion_r201071859]! Thanks 
> [~lcwik]
>  * the default output path is in a temporary directory that doesn't exist at 
> the time of the {{open}} call, so I got {{IOError: [Errno 2] No such file or 
> directory}} 
> I'll send a PR with fixes to each of these shortly.
> I've also not found where to observe output from successfully running the 
> example.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4742) Allow custom docker-image in portable wordcount example

2018-07-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4742?focusedWorklogId=120958&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-120958
 ]

ASF GitHub Bot logged work on BEAM-4742:


Author: ASF GitHub Bot
Created on: 09/Jul/18 18:46
Start Date: 09/Jul/18 18:46
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on issue #5903: [BEAM-4742] 
mkdirs if they don't exist in localfilesystem
URL: https://github.com/apache/beam/pull/5903#issuecomment-403581322
 
 
   Good point, probably worth fixing in `rename`/`copy` as well.
   
   Confusingly, the portable wordcount example now no longer fails without 
this change… let me try to get a definite answer there; I'll likely file a 
different JIRA to link this PR against, as well as addressing your requested changes.
   
   If I confirm that this isn't actually an issue in the portable wordcount 
example, and that means you don't think it's worth making this change at all, 
let me know, thanks!




Issue Time Tracking
---

Worklog Id: (was: 120958)
Time Spent: 2h  (was: 1h 50m)



[jira] [Work logged] (BEAM-4742) Allow custom docker-image in portable wordcount example

2018-07-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4742?focusedWorklogId=120919&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-120919
 ]

ASF GitHub Bot logged work on BEAM-4742:


Author: ASF GitHub Bot
Created on: 09/Jul/18 17:50
Start Date: 09/Jul/18 17:50
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on a change in pull request #5903: 
[BEAM-4742] mkdirs if they don't exist in localfilesystem
URL: https://github.com/apache/beam/pull/5903#discussion_r201089697
 
 

 ##
 File path: sdks/python/apache_beam/io/localfilesystem.py
 ##
 @@ -127,6 +127,9 @@ def _path_open(self, path, mode, mime_type='application/octet-stream',
     """Helper functions to open a file in the provided mode.
     """
     compression_type = FileSystem._get_compression_type(path, compression_type)
+    parent = os.path.dirname(path)
 
 Review comment:
   We should only create the directory in the `create` call and not in the 
`open` call. Otherwise, if the user mistypes the path of something being read, 
we will try to create the directory, which may fail (e.g. because of 
permissions) and raise a confusing error message.
   
   Do you want to add a test to localfilesystem_test.py so that this isn't 
regressed?
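
A minimal sketch of the split described in this comment, using plain `os` calls rather than Beam's `LocalFileSystem`. The function names `create_for_write` and `open_for_read` are hypothetical; they only illustrate that the write path creates missing parent directories while the read path leaves the filesystem untouched:

```python
import os


def create_for_write(path):
    """Open `path` for writing, creating missing parent directories first."""
    parent = os.path.dirname(path)
    if parent and not os.path.isdir(parent):
        os.makedirs(parent)
    return open(path, 'wb')


def open_for_read(path):
    """Open `path` for reading; never creates directories as a side effect."""
    return open(path, 'rb')
```

With this split, a mistyped read path surfaces as the usual `No such file or directory` error rather than as a failure from an attempted directory creation.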




Issue Time Tracking
---

Worklog Id: (was: 120919)
Time Spent: 1h 50m  (was: 1h 40m)



[jira] [Work logged] (BEAM-4742) Allow custom docker-image in portable wordcount example

2018-07-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4742?focusedWorklogId=120916&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-120916
 ]

ASF GitHub Bot logged work on BEAM-4742:


Author: ASF GitHub Bot
Created on: 09/Jul/18 17:46
Start Date: 09/Jul/18 17:46
Worklog Time Spent: 10m 
  Work Description: ryan-williams closed pull request #5902: [BEAM-4742] 
allow custom docker image in portable runner
URL: https://github.com/apache/beam/pull/5902
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:


diff --git a/sdks/python/apache_beam/examples/wordcount.py b/sdks/python/apache_beam/examples/wordcount.py
index 3ba3b334188..8d0ddc1afdd 100644
--- a/sdks/python/apache_beam/examples/wordcount.py
+++ b/sdks/python/apache_beam/examples/wordcount.py
@@ -21,11 +21,13 @@
 
 import argparse
 import logging
+import os
 import re
 
 import apache_beam as beam
 from apache_beam.io import ReadFromText
 from apache_beam.io import WriteToText
+from apache_beam.io.filesystems import FileSystems
 from apache_beam.metrics import Metrics
 from apache_beam.metrics.metric import MetricsFilter
 from apache_beam.options.pipeline_options import PipelineOptions
@@ -111,6 +113,10 @@ def format_result(word_count):
 
   output = counts | 'format' >> beam.Map(format_result)
 
+  out_dir = os.path.dirname(known_args.output)
+  if not FileSystems.exists(out_dir):
+    FileSystems.mkdirs(out_dir)
+
   # Write the output using a "Write" transform that has side effects.
   # pylint: disable=expression-not-assigned
   output | 'write' >> WriteToText(known_args.output)
diff --git a/sdks/python/apache_beam/runners/portability/portable_runner.py b/sdks/python/apache_beam/runners/portability/portable_runner.py
index fff9aa49c17..be69fdf05ff 100644
--- a/sdks/python/apache_beam/runners/portability/portable_runner.py
+++ b/sdks/python/apache_beam/runners/portability/portable_runner.py
@@ -59,7 +59,15 @@ def __init__(self, is_embedded_fnapi_runner=False):
 
   @staticmethod
   def default_docker_image():
-    if 'USER' in os.environ:
+    if 'DOCKER_IMAGE' in os.environ:
+      # Perhaps also test if this was built?
+      image = os.environ['DOCKER_IMAGE'] + ':latest'
+      logging.info(
+          'Using latest locally built Python SDK docker image: %s',
+          image
+      )
+      return image
+    elif 'USER' in os.environ:
       # Perhaps also test if this was built?
       logging.info('Using latest locally built Python SDK docker image.')
       return os.environ['USER'] + '-docker-apache.bintray.io/beam/python:latest'


 




Issue Time Tracking
---

Worklog Id: (was: 120916)
Time Spent: 1h 40m  (was: 1.5h)



[jira] [Work logged] (BEAM-4742) Allow custom docker-image in portable wordcount example

2018-07-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4742?focusedWorklogId=120915&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-120915
 ]

ASF GitHub Bot logged work on BEAM-4742:


Author: ASF GitHub Bot
Created on: 09/Jul/18 17:46
Start Date: 09/Jul/18 17:46
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on issue #5902: [BEAM-4742] 
allow custom docker image in portable runner
URL: https://github.com/apache/beam/pull/5902#issuecomment-403562531
 
 
   closing in favor of #5903, thanks @lukecwik 




Issue Time Tracking
---

Worklog Id: (was: 120915)
Time Spent: 1.5h  (was: 1h 20m)



[jira] [Work logged] (BEAM-4742) Allow custom docker-image in portable wordcount example

2018-07-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4742?focusedWorklogId=120914&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-120914
 ]

ASF GitHub Bot logged work on BEAM-4742:


Author: ASF GitHub Bot
Created on: 09/Jul/18 17:43
Start Date: 09/Jul/18 17:43
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on a change in pull request #5902: 
[BEAM-4742] allow custom docker image in portable runner
URL: https://github.com/apache/beam/pull/5902#discussion_r201089384
 
 

 ##
 File path: sdks/python/apache_beam/examples/wordcount.py
 ##
 @@ -111,6 +113,10 @@ def format_result(word_count):
 
   output = counts | 'format' >> beam.Map(format_result)
 
+  out_dir = os.path.dirname(known_args.output)
+  if not FileSystems.exists(out_dir):
 
 Review comment:
   Thanks, taking a look at #5903.




Issue Time Tracking
---

Worklog Id: (was: 120914)
Time Spent: 1h 20m  (was: 1h 10m)



[jira] [Work logged] (BEAM-4742) Allow custom docker-image in portable wordcount example

2018-07-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4742?focusedWorklogId=120910&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-120910
 ]

ASF GitHub Bot logged work on BEAM-4742:


Author: ASF GitHub Bot
Created on: 09/Jul/18 17:42
Start Date: 09/Jul/18 17:42
Worklog Time Spent: 10m 
  Work Description: ryan-williams opened a new pull request #5903: 
[BEAM-4742] mkdirs if they don't exist in localfilesystem
URL: https://github.com/apache/beam/pull/5903
 
 
   Change `LocalFileSystem` to match semantics of e.g. `GCSFileSystem`: writing 
to a path in a non-existent directory should just create the intermediate 
directories, instead of throwing `IOError: [Errno 2] No such file or directory`
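
The failure and the proposed behaviour can be reproduced with the standard library alone. The sketch below does not use Beam; `plain_write_errno` and `write_with_parents` are hypothetical helper names chosen for illustration:

```python
import os


def plain_write_errno(path, data):
    """Try a plain open-for-write; return the errno raised, or None on success."""
    try:
        with open(path, 'w') as f:
            f.write(data)
    except IOError as e:
        return e.errno
    return None


def write_with_parents(path, data):
    """Create any missing intermediate directories, then write `data`."""
    parent = os.path.dirname(path)
    if parent and not os.path.isdir(parent):
        os.makedirs(parent)
    with open(path, 'w') as f:
        f.write(data)
```

Calling `plain_write_errno` on a path whose parent directory does not exist returns `errno.ENOENT`, the `[Errno 2]` quoted above, while `write_with_parents` succeeds on the same path.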
   
   R: @lukecwik 
   




Issue Time Tracking
---

Worklog Id: (was: 120910)
Time Spent: 1h  (was: 50m)


[jira] [Work logged] (BEAM-4742) Allow custom docker-image in portable wordcount example

2018-07-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4742?focusedWorklogId=120911&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-120911
 ]

ASF GitHub Bot logged work on BEAM-4742:


Author: ASF GitHub Bot
Created on: 09/Jul/18 17:42
Start Date: 09/Jul/18 17:42
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on a change in pull request 
#5902: [BEAM-4742] allow custom docker image in portable runner
URL: https://github.com/apache/beam/pull/5902#discussion_r201089069
 
 

 ##
 File path: sdks/python/apache_beam/examples/wordcount.py
 ##
 @@ -111,6 +113,10 @@ def format_result(word_count):
 
   output = counts | 'format' >> beam.Map(format_result)
 
+  out_dir = os.path.dirname(known_args.output)
+  if not FileSystems.exists(out_dir):
 
 Review comment:
   I filed https://github.com/apache/beam/pull/5903 with that change; can close 
this in favor of that if that's what you prefer, thanks!




Issue Time Tracking
---

Worklog Id: (was: 120911)
Time Spent: 1h 10m  (was: 1h)



[jira] [Work logged] (BEAM-4742) Allow custom docker-image in portable wordcount example

2018-07-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4742?focusedWorklogId=120908&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-120908
 ]

ASF GitHub Bot logged work on BEAM-4742:


Author: ASF GitHub Bot
Created on: 09/Jul/18 17:36
Start Date: 09/Jul/18 17:36
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on a change in pull request 
#5902: [BEAM-4742] allow custom docker image in portable runner
URL: https://github.com/apache/beam/pull/5902#discussion_r201086959
 
 

 ##
 File path: sdks/python/apache_beam/runners/portability/portable_runner.py
 ##
 @@ -59,7 +59,15 @@ def __init__(self, is_embedded_fnapi_runner=False):
 
   @staticmethod
   def default_docker_image():
-    if 'USER' in os.environ:
+    if 'DOCKER_IMAGE' in os.environ:
 
 Review comment:
   (I'll revert this part of the change, I don't think it's necessary)




Issue Time Tracking
---

Worklog Id: (was: 120908)
Time Spent: 50m  (was: 40m)

> Allow custom docker-image in portable wordcount example
> ---
>
> Key: BEAM-4742
> URL: https://issues.apache.org/jira/browse/BEAM-4742
> Project: Beam
>  Issue Type: Improvement
>  Components: examples-python
>Affects Versions: 2.5.0
>Reporter: Ryan Williams
>Assignee: Ryan Williams
>Priority: Minor
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> I hit a couple snags [running the portable wordcount 
> example|https://github.com/apache/beam/blob/997ee3afe74483ae44e2dcb32ca0e24876129cd9/sdks/python/build.gradle#L200-L214]:
>  * [the default docker image is hard-coded to a bintray 
> URL|https://github.com/apache/beam/blob/997ee3afe74483ae44e2dcb32ca0e24876129cd9/sdks/python/apache_beam/runners/portability/portable_runner.py#L60-L68],
>  but I published my image to Docker Hub
>  * the default output path is in a temporary directory that doesn't exist at 
> the time of the {{open}} call, so I got {{IOError: [Errno 2] No such file or 
> directory}} 
> I'll send a PR with fixes to each of these shortly.
> I've also not found where to observe output from successfully running the 
> example.





[jira] [Work logged] (BEAM-4742) Allow custom docker-image in portable wordcount example

2018-07-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4742?focusedWorklogId=120906&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-120906
 ]

ASF GitHub Bot logged work on BEAM-4742:


Author: ASF GitHub Bot
Created on: 09/Jul/18 17:35
Start Date: 09/Jul/18 17:35
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on a change in pull request 
#5902: [BEAM-4742] allow custom docker image in portable runner
URL: https://github.com/apache/beam/pull/5902#discussion_r201086586
 
 

 ##
 File path: sdks/python/apache_beam/examples/wordcount.py
 ##
 @@ -111,6 +113,10 @@ def format_result(word_count):
 
   output = counts | 'format' >> beam.Map(format_result)
 
+  out_dir = os.path.dirname(known_args.output)
+  if not FileSystems.exists(out_dir):
 
 Review comment:
   interesting, I originally [made a change to `LocalFileSystem` to create 
directories on 
`open`](https://github.com/ryan-williams/beam/commit/25868025c2ead0b695d0dde46b6d4e3d19a4923a),
 but I wasn't sure if that was the right semantics; it sounds like you're 
saying it is?




Issue Time Tracking
---

Worklog Id: (was: 120906)
Time Spent: 0.5h  (was: 20m)



[jira] [Work logged] (BEAM-4742) Allow custom docker-image in portable wordcount example

2018-07-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4742?focusedWorklogId=120907&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-120907
 ]

ASF GitHub Bot logged work on BEAM-4742:


Author: ASF GitHub Bot
Created on: 09/Jul/18 17:35
Start Date: 09/Jul/18 17:35
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on a change in pull request 
#5902: [BEAM-4742] allow custom docker image in portable runner
URL: https://github.com/apache/beam/pull/5902#discussion_r201086647
 
 

 ##
 File path: sdks/python/apache_beam/runners/portability/portable_runner.py
 ##
 @@ -59,7 +59,15 @@ def __init__(self, is_embedded_fnapi_runner=False):
 
   @staticmethod
   def default_docker_image():
-    if 'USER' in os.environ:
+    if 'DOCKER_IMAGE' in os.environ:
 
 Review comment:
   ah, yea, I just saw the pipeline option for this as well! thanks for 
pointing it out.




Issue Time Tracking
---

Worklog Id: (was: 120907)
Time Spent: 40m  (was: 0.5h)

> Allow custom docker-image in portable wordcount example
> ---
>
> Key: BEAM-4742
> URL: https://issues.apache.org/jira/browse/BEAM-4742
> Project: Beam
>  Issue Type: Improvement
>  Components: examples-python
>Affects Versions: 2.5.0
>Reporter: Ryan Williams
>Assignee: Ryan Williams
>Priority: Minor
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> I hit a couple snags [running the portable wordcount 
> example|https://github.com/apache/beam/blob/997ee3afe74483ae44e2dcb32ca0e24876129cd9/sdks/python/build.gradle#L200-L214]:
>  * [the default docker image is hard-coded to a bintray 
> URL|https://github.com/apache/beam/blob/997ee3afe74483ae44e2dcb32ca0e24876129cd9/sdks/python/apache_beam/runners/portability/portable_runner.py#L60-L68],
>  but I published my image to Docker Hub
>  * the default output path is in a temporary directory that doesn't exist at 
> the time of the {{open}} call, so I got {{IOError: [Errno 2] No such file or 
> directory}} 
> I'll send a PR with fixes to each of these shortly.
> I've also not found where to observe output from successfully running the 
> example.





[jira] [Work logged] (BEAM-4742) Allow custom docker-image in portable wordcount example

2018-07-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4742?focusedWorklogId=120882&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-120882
 ]

ASF GitHub Bot logged work on BEAM-4742:


Author: ASF GitHub Bot
Created on: 09/Jul/18 16:48
Start Date: 09/Jul/18 16:48
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on a change in pull request #5902: 
[BEAM-4742] allow custom docker image in portable runner
URL: https://github.com/apache/beam/pull/5902#discussion_r201071859
 
 

 ##
 File path: sdks/python/apache_beam/runners/portability/portable_runner.py
 ##
 @@ -59,7 +59,15 @@ def __init__(self, is_embedded_fnapi_runner=False):
 
   @staticmethod
   def default_docker_image():
-    if 'USER' in os.environ:
+    if 'DOCKER_IMAGE' in os.environ:
 
 Review comment:
   This is already controlled by the flag `--harness_docker_image`:
   
https://github.com/apache/beam/blob/385faa713951813371dffaf654b5dc8d96e27aa1/sdks/python/apache_beam/options/pipeline_options.py#L648
   
   Do you still want to make the default container selection be based off of 
`DOCKER_IMAGE`?
   If yes, should it specify the full path and not assume the user wants the 
`:latest` suffix?
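
The precedence discussed in this comment can be sketched roughly as follows. This is a minimal illustration, not Beam's code: `resolve_harness_image` and its `fallback` parameter are hypothetical names, and only the `--harness_docker_image` flag and the `DOCKER_IMAGE` variable come from the thread. It assumes the flag should win over the environment variable, and that the variable holds a full image reference, so no `:latest` tag is appended.

```python
import argparse
import os


def resolve_harness_image(argv, environ=None, fallback=None):
    """Pick the harness image: flag first, then DOCKER_IMAGE, then fallback.

    Hypothetical helper for illustration only; it is not part of Beam.
    """
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument('--harness_docker_image', default=None)
    # Ignore any other pipeline arguments that happen to be present.
    known, _ = parser.parse_known_args(argv)
    if known.harness_docker_image:
        return known.harness_docker_image
    # Use the variable verbatim: the caller supplies the full reference,
    # tag included, so nothing like ':latest' is appended here.
    return environ.get('DOCKER_IMAGE') or fallback
```

With this ordering, a user who passes the flag never needs the environment variable, which is the point the review comment raises.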




Issue Time Tracking
---

Worklog Id: (was: 120882)
Time Spent: 20m  (was: 10m)






[jira] [Work logged] (BEAM-4742) Allow custom docker-image in portable wordcount example

2018-07-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4742?focusedWorklogId=120881&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-120881
 ]

ASF GitHub Bot logged work on BEAM-4742:


Author: ASF GitHub Bot
Created on: 09/Jul/18 16:48
Start Date: 09/Jul/18 16:48
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on a change in pull request #5902: 
[BEAM-4742] allow custom docker image in portable runner
URL: https://github.com/apache/beam/pull/5902#discussion_r201070593
 
 

 ##
 File path: sdks/python/apache_beam/examples/wordcount.py
 ##
 @@ -111,6 +113,10 @@ def format_result(word_count):
 
   output = counts | 'format' >> beam.Map(format_result)
 
+  out_dir = os.path.dirname(known_args.output)
+  if not FileSystems.exists(out_dir):
 
 Review comment:
   I believe the expectation should be that any output path should be created 
during pipeline execution and not by the driver program creating the pipeline.
   
   Please revert this change to wordcount and fix the filesystem implementation 
to create any necessary directories instead.
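
One way to read this suggestion is that the filesystem's create step makes any missing parent directories before opening the file. The sketch below illustrates that idea with the standard library only; `create_with_parents` is a hypothetical name and this is not Beam's `LocalFileSystem` implementation.

```python
import errno
import os
import tempfile


def create_with_parents(path, mode='wb'):
    """Open `path` for writing, creating missing parent directories first.

    Hypothetical helper for illustration only; it is not Beam's code.
    """
    parent = os.path.dirname(path)
    if parent:
        try:
            os.makedirs(parent)
        except OSError as e:
            # The directory may already exist, e.g. created by another
            # writer; any other error is re-raised.
            if e.errno != errno.EEXIST:
                raise
    return open(path, mode)


# Usage: the nested directories do not exist before the call.
target = os.path.join(tempfile.mkdtemp(), 'nested', 'dir', 'counts.txt')
with create_with_parents(target) as f:
    f.write(b'hello: 1\n')
```

Handling this at write time would keep the driver program (here, the wordcount example) free of directory setup, as the comment requests.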




Issue Time Tracking
---

Worklog Id: (was: 120881)
Time Spent: 20m  (was: 10m)






[jira] [Work logged] (BEAM-4742) Allow custom docker-image in portable wordcount example

2018-07-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4742?focusedWorklogId=120818&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-120818
 ]

ASF GitHub Bot logged work on BEAM-4742:


Author: ASF GitHub Bot
Created on: 09/Jul/18 15:43
Start Date: 09/Jul/18 15:43
Worklog Time Spent: 10m 
  Work Description: ryan-williams opened a new pull request #5902: 
[BEAM-4742] allow custom docker image in portable runner
URL: https://github.com/apache/beam/pull/5902
 
 
   Allow specifying a docker image for the portable runner to use, via 
`DOCKER_IMAGE` env var
   
   Also: make output directory in wordcount example, if it doesn't exist.
   
   R: @angoenka 
   




Issue Time Tracking
---

Worklog Id: (was: 120818)
Time Spent: 10m
Remaining Estimate: 0h

> Allow custom docker-image in portable wordcount example
> ---
>
> Key: BEAM-4742
> URL: https://issues.apache.org/jira/browse/BEAM-4742
> Project: Beam
>  Issue Type: Improvement
>  Components: examples-python
>Affects Versions: 2.5.0
>Reporter: Ryan Williams
>Assignee: Ryan Williams
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> I hit a couple snags [running the portable wordcount 
> example|https://github.com/apache/beam/blob/997ee3afe74483ae44e2dcb32ca0e24876129cd9/sdks/python/build.gradle#L200-L214]:
>  * [the default docker image is hard-coded to a bintray 
> URL|https://github.com/apache/beam/blob/997ee3afe74483ae44e2dcb32ca0e24876129cd9/sdks/python/apache_beam/runners/portability/portable_runner.py#L60-L68],
>  but I published my image to Docker Hub
>  * the default output path is in a temporary dire