[jira] [Work logged] (BEAM-5324) Finish Python 3 porting for unpackaged files

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5324?focusedWorklogId=145568&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145568
 ]

ASF GitHub Bot logged work on BEAM-5324:


Author: ASF GitHub Bot
Created on: 19/Sep/18 05:22
Start Date: 19/Sep/18 05:22
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on a change in pull request #6424: 
[BEAM-5324] Partially port unpackaged modules to Python 3
URL: https://github.com/apache/beam/pull/6424#discussion_r218671781
 
 

 ##
 File path: sdks/python/apache_beam/runners/worker/bundle_processor.py
 ##
 @@ -108,7 +106,7 @@ def __init__(self, operation_name, step_name, consumers, 
counter_factory,
 self.receivers = [
 operations.ConsumerSet(
 self.counter_factory, self.name_context.step_name, 0,
-next(itervalues(consumers)), self.windowed_coder)]
+next(iter(consumers.values())), self.windowed_coder)]
 
 Review comment:
   What do you think about having `next(iter(itervalues(consumers)))` here to 
avoid expansion on Python 2?
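
   For context, a minimal sketch of the difference (the dictionary contents are 
   made up, and `itervalues` is assumed to be the Python 2/3 compatibility 
   helper already imported on the original line):

   ```Python
   consumers = {'step_a': ['consumer_1'], 'step_b': ['consumer_2']}

   # Python 2: consumers.values() materializes a full list of all values before
   # next(iter(...)) pulls out the first one; itervalues(consumers) stays lazy.
   # Python 3: .values() is already a lazy view, and itervalues() delegates to
   # it, so next(iter(itervalues(consumers))) stays cheap on both versions.
   first_consumers = next(iter(consumers.values()))
   print(first_consumers)
   ```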


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145568)
Time Spent: 1h 10m  (was: 1h)

> Finish Python 3 porting for unpackaged files
> 
>
> Key: BEAM-5324
> URL: https://issues.apache.org/jira/browse/BEAM-5324
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Robbe
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5324) Finish Python 3 porting for unpackaged files

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5324?focusedWorklogId=145565&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145565
 ]

ASF GitHub Bot logged work on BEAM-5324:


Author: ASF GitHub Bot
Created on: 19/Sep/18 05:05
Start Date: 19/Sep/18 05:05
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on a change in pull request #6424: 
[BEAM-5324] Partially port unpackaged modules to Python 3
URL: https://github.com/apache/beam/pull/6424#discussion_r218669632
 
 

 ##
 File path: sdks/python/apache_beam/runners/worker/operations.py
 ##
 @@ -496,7 +496,8 @@ def __init__(self, name_context, spec, counter_factory, 
state_sampler):
 fn, args, kwargs = pickler.loads(self.spec.combine_fn)[:3]
 self.combine_fn = curry_combine_fn(fn, args, kwargs)
 if (getattr(fn.add_input, 'im_func', None)
-is core.CombineFn.add_input.__func__):
+is getattr(core.CombineFn.add_input, '__func__',
 
 Review comment:
   In other words, we always follow the `else` branch.
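
   A minimal illustration of why (using a stand-in class rather than the real 
   `core.CombineFn`): on Python 3 a bound method has no `im_func`, so the 
   left-hand side of the original comparison is always `None` and the `if` can 
   never be true:

   ```Python
   class CombineFn(object):
       def add_input(self, accumulator, element):
           raise NotImplementedError

   fn = CombineFn()
   # Bound methods expose the underlying function as __func__ on Python 3,
   # not im_func, so this getattr falls back to None.
   print(getattr(fn.add_input, 'im_func', None))                          # None
   # None is never identical to the add_input function, so the condition is
   # False and the `else` branch is always taken.
   print(getattr(fn.add_input, 'im_func', None) is CombineFn.add_input)   # False
   ```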


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145565)
Time Spent: 1h  (was: 50m)

> Finish Python 3 porting for unpackaged files
> 
>
> Key: BEAM-5324
> URL: https://issues.apache.org/jira/browse/BEAM-5324
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Robbe
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5324) Finish Python 3 porting for unpackaged files

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5324?focusedWorklogId=145552&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145552
 ]

ASF GitHub Bot logged work on BEAM-5324:


Author: ASF GitHub Bot
Created on: 19/Sep/18 02:39
Start Date: 19/Sep/18 02:39
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on a change in pull request #6424: 
[BEAM-5324] Partially port unpackaged modules to Python 3
URL: https://github.com/apache/beam/pull/6424#discussion_r218652386
 
 

 ##
 File path: sdks/python/apache_beam/runners/worker/operations.py
 ##
 @@ -496,7 +496,8 @@ def __init__(self, name_context, spec, counter_factory, 
state_sampler):
 fn, args, kwargs = pickler.loads(self.spec.combine_fn)[:3]
 self.combine_fn = curry_combine_fn(fn, args, kwargs)
 if (getattr(fn.add_input, 'im_func', None)
-is core.CombineFn.add_input.__func__):
+is getattr(core.CombineFn.add_input, '__func__',
 
 Review comment:
   Thanks for the explanation. Upon a closer look at this code, I think the 
whole `if` section is legacy code and should be removed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145552)
Time Spent: 50m  (was: 40m)

> Finish Python 3 porting for unpackaged files
> 
>
> Key: BEAM-5324
> URL: https://issues.apache.org/jira/browse/BEAM-5324
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Robbe
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4542) BigQueryIO: Make BigQueryAvroUtils public

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4542?focusedWorklogId=145551&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145551
 ]

ASF GitHub Bot logged work on BEAM-4542:


Author: ASF GitHub Bot
Created on: 19/Sep/18 02:37
Start Date: 19/Sep/18 02:37
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #5606: [BEAM-4542] Make 
BigQueryAvroUtils public.
URL: https://github.com/apache/beam/pull/5606#issuecomment-422628228
 
 
   Hi, sorry about the delay. Unfortunately this class is not part of the 
public API and is subject to change, so I'd rather not make it public.
   
   Can you copy this code to your workflow? It does not seem to be too large.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145551)
Time Spent: 50m  (was: 40m)

> BigQueryIO: Make BigQueryAvroUtils public
> -
>
> Key: BEAM-4542
> URL: https://issues.apache.org/jira/browse/BEAM-4542
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Reporter: Alexander Huras
>Assignee: Chamikara Jayalath
>Priority: Minor
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> BigQueryAvroUtils has a super useful static method for mapping GenericRecords 
> to TableRows (when their schemas conform etc.). This seems to be something 
> people are doing all over the place; and for my own project I've just 
> copy-pasted the entire class into a submodule (which feels wrong...).
> In my project I have a bunch of avro records that I'd like to write to BQ. 
> BigQueryIO.Write only accepts `TableRows` (as far as I can tell), thus my 
> dilemma.
> I'll have a one-liner PR to accompany this if people care enough.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5378) Ensure all Go SDK examples run successfully

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5378?focusedWorklogId=145546&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145546
 ]

ASF GitHub Bot logged work on BEAM-5378:


Author: ASF GitHub Bot
Created on: 19/Sep/18 01:55
Start Date: 19/Sep/18 01:55
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #6395: [BEAM-5378] Update go 
wordcap example to work on Dataflow runner
URL: https://github.com/apache/beam/pull/6395#issuecomment-422620257
 
 
   Removed wordcap example. PTAL.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145546)
Time Spent: 2h 10m  (was: 2h)

> Ensure all Go SDK examples run successfully
> ---
>
> Key: BEAM-5378
> URL: https://issues.apache.org/jira/browse/BEAM-5378
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Affects Versions: Not applicable
>Reporter: Tomas Roos
>Priority: Major
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> I've been spending a day or so running through the examples available for the 
> Go SDK in order to see what works on which runner (direct, dataflow) and 
> what doesn't, and here are the results.
> Below are all available examples for the Go SDK. For me, as a new developer on Apache 
> Beam and Dataflow, it would be of tremendous value to have all examples running, 
> because many of them have legitimate use cases behind them. 
> {code:java}
> ├── complete
> │   └── autocomplete
> │   └── autocomplete.go
> ├── contains
> │   └── contains.go
> ├── cookbook
> │   ├── combine
> │   │   └── combine.go
> │   ├── filter
> │   │   └── filter.go
> │   ├── join
> │   │   └── join.go
> │   ├── max
> │   │   └── max.go
> │   └── tornadoes
> │   └── tornadoes.go
> ├── debugging_wordcount
> │   └── debugging_wordcount.go
> ├── forest
> │   └── forest.go
> ├── grades
> │   └── grades.go
> ├── minimal_wordcount
> │   └── minimal_wordcount.go
> ├── multiout
> │   └── multiout.go
> ├── pingpong
> │   └── pingpong.go
> ├── streaming_wordcap
> │   └── wordcap.go
> ├── windowed_wordcount
> │   └── windowed_wordcount.go
> ├── wordcap
> │   └── wordcap.go
> ├── wordcount
> │   └── wordcount.go
> └── yatzy
> └── yatzy.go
> {code}
> All examples that are supposed to be runnable by the direct runner (not 
> depending on GCP platform services) are runnable.
> On the other hand, the examples below need to be updated because they are not 
> runnable on the Dataflow platform for various reasons.
> I tried to figure them out, and all I can do is pinpoint at least where each one 
> fails, since my knowledge of the Beam / Dataflow internals is still limited.
> .
> ├── complete
> │   └── autocomplete
> │   └── autocomplete.go
> Runs successfully if the input is swapped to one of the Shakespeare data files 
> from gs://
> But when running this it yields an error from the top.Largest func (discussed 
> in another issue: top.Largest needs to have a serializable combinator / 
> accumulator)
> ➜  autocomplete git:(master) ✗ ./autocomplete --project fair-app-213019 
> --runner dataflow --staging_location=gs://fair-app-213019/staging-test2 
> --worker_harness_container_image=apache-docker-beam-snapshots-docker.bintray.io/beam/go:20180515
>  
> 2018/09/11 15:35:26 Running autocomplete
> Unable to encode combiner for lifting: failed to encode custom coder: bad 
> underlying type: bad field type: bad element: unencodable type: interface 
> {}2018/09/11 15:35:26 Using running binary as worker binary: './autocomplete'
> 2018/09/11 15:35:26 Staging worker binary: ./autocomplete
> ├── contains
> │   └── contains.go
> Fails when running debug.Head for some mysterious reason; it might have to do 
> with the param passing into the x,y iterator. Frankly I don't know and could 
> not figure it out.
> But after removing the debug.Head call everything works as expected and succeeds.
> ├── cookbook
> │   ├── combine
> │   │   └── combine.go
> Fails because extractFn, which is a struct, is not registered through 
> beam.RegisterType (is this a must or not?)
> Registering it works as a workaround at least
> ➜  combine git:(master) ✗ ./combine 
> --output=fair-app-213019:combineoutput.test --project=fair-app-213019 
> --runner=dataflow --staging_location=gs://203019-staging/ 
> --worker_harness_container_image=apache-docker-beam-snapshots-docker.bintray.io/beam/go:20180515
>  
> 2018/09/11 15:40:50 Running combine
> panic: Failed to serialize 3: ParDo [In(Main): main.WordRow <- {2: 
> 

[jira] [Commented] (BEAM-5427) Fix sample code (AverageFn) in Combine.java

2018-09-18 Thread Ruoyun Huang (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16619936#comment-16619936
 ] 

Ruoyun Huang commented on BEAM-5427:


PR created: https://github.com/apache/beam/pull/6432

> Fix sample code (AverageFn) in Combine.java
> ---
>
> Key: BEAM-5427
> URL: https://issues.apache.org/jira/browse/BEAM-5427
> Project: Beam
>  Issue Type: Improvement
>  Components: examples-java
>Reporter: Ruoyun Huang
>Assignee: Reuven Lax
>Priority: Minor
>
> The sample code is missing a coder. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Jenkins build is back to normal : beam_PostCommit_Python_Verify #6019

2018-09-18 Thread Apache Jenkins Server
See 




[jira] [Created] (BEAM-5427) Fix sample code (AverageFn) in Combine.java

2018-09-18 Thread Ruoyun Huang (JIRA)
Ruoyun Huang created BEAM-5427:
--

 Summary: Fix sample code (AverageFn) in Combine.java
 Key: BEAM-5427
 URL: https://issues.apache.org/jira/browse/BEAM-5427
 Project: Beam
  Issue Type: Improvement
  Components: examples-java
Reporter: Ruoyun Huang
Assignee: Reuven Lax


The sample code is missing a coder. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PerformanceTests_Python #1454

2018-09-18 Thread Apache Jenkins Server
See 


Changes:

[ryan.blake.williams] specify localhost for portable flink VR tests

[ryan.blake.williams] enumerate primitive transforms in portable construction

[github] Update deprecation comment.

[github] Swap from READ_TRANSFORM to PAR_DO_TRANSFORM

[github] Update comment.

[lcwik] Fix tests expectations and minor code fix up.

[ryan.blake.williams] checkstyle nit

[ryan.blake.williams] use artificial subtransforms of primitives

[ryan.blake.williams] import URNs more directly in QueryablePipeline

[ryan.blake.williams] add splittable URNs as primitives

[lcwik] [BEAM-4176] Add the ability to allow for runners to register native

[chamikara] Upgrades Google API Client libraries to 1.24.1

[ryan.blake.williams] add NativeTransform test assertions

[kevinsi] Check if validation is disabled when validating BigtableSource

[boyuanz] Add big_query_query_to_table_it to python SDK

[lcwik] [BEAM-5406] Return null when a datetime data element is null. (#6410)

[lcwik] Fix errors in comments (#6394)

[lcwik] [BEAM-3194] Fail if @RequiresStableInput is used on runners that don't

[lcwik] [BEAM-4711] fix globbing in LocalFileSystem.delete (#5863)

--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam15 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision c49a97ecbf815b320926285dcddba993590e3073 (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f c49a97ecbf815b320926285dcddba993590e3073
Commit message: "[BEAM-4176] enumerate primitive transforms in portable 
construction"
 > git rev-list --no-walk 1f77b697027390225649df5dfde65012699d4f57 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_Python] $ /bin/bash -xe /tmp/jenkins124567129690025685.sh
+ rm -rf 

[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins6009064832576512244.sh
+ rm -rf 

[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins7945381587717398328.sh
+ virtualenv 

New python executable in 

Also creating executable in 

Installing setuptools, pkg_resources, pip, wheel...done.
Running virtualenv with interpreter /usr/bin/python2
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins2479550234249555677.sh
+ 

 install --upgrade setuptools pip
Requirement already up-to-date: setuptools in 
./env/.perfkit_env/lib/python2.7/site-packages (40.4.1)
Requirement already up-to-date: pip in 
./env/.perfkit_env/lib/python2.7/site-packages (18.0)
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins3188559410345756719.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git 

Cloning into 
'
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins5725290640121144810.sh
+ 

 install -r 

Collecting absl-py (from -r 

 (line 14))
Collecting jinja2>=2.7 (from -r 

 (line 15))
  Using cached 

Build failed in Jenkins: beam_PostCommit_Python_PVR_Flink_Gradle #61

2018-09-18 Thread Apache Jenkins Server
See 


--
[...truncated 554.49 KB...]
_Rendezvous: <_Rendezvous of RPC that terminated with:
status = StatusCode.UNAVAILABLE
details = "Socket closed"
debug_error_string = 
"{"created":"@1537315581.798108882","description":"Error received from 
peer","file":"src/core/lib/surface/call.cc","file_line":1099,"grpc_message":"Socket
 closed","grpc_status":14}"
>



==
ERROR: test_combine_per_key (__main__.FlinkRunnerTest)
--
Traceback (most recent call last):
  File "apache_beam/runners/portability/fn_api_runner_test.py", line 251, in 
test_combine_per_key
assert_that(res, equal_to([('a', 1.5), ('b', 3.0)]))
  File "apache_beam/pipeline.py", line 414, in __exit__
self.run().wait_until_finish()
  File "apache_beam/runners/portability/portable_runner.py", line 209, in 
wait_until_finish
'Pipeline %s failed in state %s.' % (self._job_id, self._state))
RuntimeError: Pipeline 
test_combine_per_key_1537315573.55_e75a9076-e88e-458c-b696-16de3ad6b0df failed 
in state FAILED.

==
ERROR: test_create (__main__.FlinkRunnerTest)
--
Traceback (most recent call last):
  File "apache_beam/runners/portability/fn_api_runner_test.py", line 62, in 
test_create
assert_that(p | beam.Create(['a', 'b']), equal_to(['a', 'b']))
  File "apache_beam/pipeline.py", line 414, in __exit__
self.run().wait_until_finish()
  File "apache_beam/runners/portability/portable_runner.py", line 209, in 
wait_until_finish
'Pipeline %s failed in state %s.' % (self._job_id, self._state))
RuntimeError: Pipeline 
test_create_1537315573.92_0f741043-98ca-4aa0-9251-21c0af5526c2 failed in state 
FAILED.

==
ERROR: test_flatten (__main__.FlinkRunnerTest)
--
Traceback (most recent call last):
  File "apache_beam/runners/portability/fn_api_runner_test.py", line 244, in 
test_flatten
assert_that(res, equal_to(['a', 'b', 'c', 'd']))
  File "apache_beam/pipeline.py", line 414, in __exit__
self.run().wait_until_finish()
  File "apache_beam/runners/portability/portable_runner.py", line 209, in 
wait_until_finish
'Pipeline %s failed in state %s.' % (self._job_id, self._state))
RuntimeError: Pipeline 
test_flatten_1537315574.42_8a52f78b-e772-433c-b3a6-d40a1026fca8 failed in state 
FAILED.

==
ERROR: test_flattened_side_input (__main__.FlinkRunnerTest)
--
Traceback (most recent call last):
  File "apache_beam/runners/portability/fn_api_runner_test.py", line 190, in 
test_flattened_side_input
equal_to([(None, {'a': 1, 'b': 2})]))
  File "apache_beam/pipeline.py", line 414, in __exit__
self.run().wait_until_finish()
  File "apache_beam/runners/portability/portable_runner.py", line 209, in 
wait_until_finish
'Pipeline %s failed in state %s.' % (self._job_id, self._state))
RuntimeError: Pipeline 
test_flattened_side_input_1537315574.93_7046f247-7bf1-4d1b-ac26-a61de797b999 
failed in state FAILED.

==
ERROR: test_gbk_side_input (__main__.FlinkRunnerTest)
--
Traceback (most recent call last):
  File "apache_beam/runners/portability/fn_api_runner_test.py", line 198, in 
test_gbk_side_input
equal_to([(None, {'a': [1]})]))
  File "apache_beam/pipeline.py", line 414, in __exit__
self.run().wait_until_finish()
  File "apache_beam/runners/portability/portable_runner.py", line 209, in 
wait_until_finish
'Pipeline %s failed in state %s.' % (self._job_id, self._state))
RuntimeError: Pipeline 
test_gbk_side_input_1537315575.45_93c2281d-098d-40d7-b9dd-bc406156e7ac failed 
in state FAILED.

==
ERROR: test_group_by_key (__main__.FlinkRunnerTest)
--
Traceback (most recent call last):
  File "apache_beam/runners/portability/fn_api_runner_test.py", line 237, in 
test_group_by_key
assert_that(res, equal_to([('a', [1, 2]), ('b', [3])]))
  File "apache_beam/pipeline.py", line 414, in __exit__
self.run().wait_until_finish()
  File "apache_beam/runners/portability/portable_runner.py", line 209, in 
wait_until_finish
'Pipeline %s failed in state %s.' % (self._job_id, self._state))
RuntimeError: Pipeline 
test_group_by_key_1537315575.97_f1b5ae5c-48a7-4bb6-bd9e-ef6e1ac91e49 

Build failed in Jenkins: beam_PreCommit_Website_Cron #76

2018-09-18 Thread Apache Jenkins Server
See 


Changes:

[ryan.blake.williams] specify localhost for portable flink VR tests

[ryan.blake.williams] enumerate primitive transforms in portable construction

[github] Update deprecation comment.

[github] Swap from READ_TRANSFORM to PAR_DO_TRANSFORM

[github] Update comment.

[lcwik] Fix tests expectations and minor code fix up.

[ryan.blake.williams] checkstyle nit

[ryan.blake.williams] use artificial subtransforms of primitives

[ryan.blake.williams] import URNs more directly in QueryablePipeline

[ryan.blake.williams] add splittable URNs as primitives

[lcwik] [BEAM-4176] Add the ability to allow for runners to register native

[chamikara] Upgrades Google API Client libraries to 1.24.1

[ryan.blake.williams] add NativeTransform test assertions

[kevinsi] Check if validation is disabled when validating BigtableSource

[boyuanz] Add big_query_query_to_table_it to python SDK

[lcwik] [BEAM-5406] Return null when a datetime data element is null. (#6410)

[lcwik] Fix errors in comments (#6394)

[lcwik] [BEAM-3194] Fail if @RequiresStableInput is used on runners that don't

[lcwik] [BEAM-4711] fix globbing in LocalFileSystem.delete (#5863)

--
[...truncated 24.71 KB...]
Sending build context to Docker daemon  24.51MB
Step 1/7 : FROM ruby:2.5
2.5: Pulling from library/ruby
05d1a5232b46: Already exists
5cee356eda6b: Already exists
89d3385f0fd3: Already exists
80ae6b477848: Already exists
28bdf9e584cc: Already exists
bdeb28e714e4: Pulling fs layer
5922247af93e: Pulling fs layer
a777432baaad: Pulling fs layer
bdeb28e714e4: Verifying Checksum
bdeb28e714e4: Download complete
a777432baaad: Verifying Checksum
a777432baaad: Download complete
bdeb28e714e4: Pull complete
5922247af93e: Verifying Checksum
5922247af93e: Download complete
5922247af93e: Pull complete
a777432baaad: Pull complete
Digest: sha256:5a2382c6f9910f8ab061734688b7ddfd70f0fd588f4f3f987625c14e0a648bd0
Status: Downloaded newer image for ruby:2.5
 ---> 88666731c3e1
Step 2/7 : WORKDIR /ruby
 ---> 26fdbe39e80c
Removing intermediate container f8ea91cd7a3b
Step 3/7 : RUN gem install bundler
 ---> Running in 8f5eb3d5e097
Successfully installed bundler-1.16.5
1 gem installed
 ---> d4e6b4ed08a8
Removing intermediate container 8f5eb3d5e097
Step 4/7 : ADD Gemfile Gemfile.lock /ruby/
 ---> 251ce216f70e
Removing intermediate container 4e90ab2a9e98
Step 5/7 : RUN bundle install --deployment --path $GEM_HOME
 ---> Running in 0e2c9c42b037
Fetching gem metadata from https://rubygems.org/.
Fetching rake 12.3.0
Installing rake 12.3.0
Fetching concurrent-ruby 1.0.5
Installing concurrent-ruby 1.0.5
Fetching i18n 0.9.5
Installing i18n 0.9.5
Fetching minitest 5.11.3
Installing minitest 5.11.3
Fetching thread_safe 0.3.6
Installing thread_safe 0.3.6
Fetching tzinfo 1.2.5
Installing tzinfo 1.2.5
Fetching activesupport 4.2.10
Installing activesupport 4.2.10
Fetching public_suffix 3.0.2
Installing public_suffix 3.0.2
Fetching addressable 2.5.2
Installing addressable 2.5.2
Using bundler 1.16.5
Fetching colorator 1.1.0
Installing colorator 1.1.0
Fetching colorize 0.8.1
Installing colorize 0.8.1
Fetching ffi 1.9.21
Installing ffi 1.9.21 with native extensions
Fetching ethon 0.11.0
Installing ethon 0.11.0
Fetching forwardable-extended 2.6.0
Installing forwardable-extended 2.6.0
Fetching mercenary 0.3.6
Installing mercenary 0.3.6
Fetching mini_portile2 2.3.0
Installing mini_portile2 2.3.0
Fetching nokogiri 1.8.2
Installing nokogiri 1.8.2 with native extensions
Fetching parallel 1.12.1
Installing parallel 1.12.1
Fetching typhoeus 1.3.0
Installing typhoeus 1.3.0
Fetching yell 2.0.7
Installing yell 2.0.7
Fetching html-proofer 3.8.0
Installing html-proofer 3.8.0
Fetching rb-fsevent 0.10.2
Installing rb-fsevent 0.10.2
Fetching rb-inotify 0.9.10
Installing rb-inotify 0.9.10
Fetching sass-listen 4.0.0
Installing sass-listen 4.0.0
Fetching sass 3.5.5
Installing sass 3.5.5
Fetching jekyll-sass-converter 1.5.2
Installing jekyll-sass-converter 1.5.2
Fetching ruby_dep 1.5.0
Installing ruby_dep 1.5.0
Fetching listen 3.1.5
Installing listen 3.1.5
Fetching jekyll-watch 1.5.1
Installing jekyll-watch 1.5.1
Fetching kramdown 1.16.2
Installing kramdown 1.16.2
Fetching liquid 3.0.6
Installing liquid 3.0.6
Fetching pathutil 0.16.1
Installing pathutil 0.16.1
Fetching rouge 1.11.1
Installing rouge 1.11.1
Fetching safe_yaml 1.0.4
Installing safe_yaml 1.0.4
Fetching jekyll 3.2.0
Installing jekyll 3.2.0
Fetching jekyll-redirect-from 0.11.0
Installing jekyll-redirect-from 0.11.0
Fetching jekyll_github_sample 0.3.1
Installing jekyll_github_sample 0.3.1
Bundle complete! 7 Gemfile dependencies, 38 gems now installed.
Bundled gems are installed into `/usr/local/bundle`
 ---> 8a08cc04ab00
Removing intermediate container 0e2c9c42b037
Step 6/7 : ENV LC_ALL C.UTF-8
 ---> Running in 94f76a89a16d
 ---> b1d3f4b75783
Removing intermediate container 94f76a89a16d
Step 7/7 : CMD 

Build failed in Jenkins: beam_PostCommit_Website_Merge #5

2018-09-18 Thread Apache Jenkins Server
See 


--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on websites1 (git-websites svn-websites) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision c49a97ecbf815b320926285dcddba993590e3073 (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f c49a97ecbf815b320926285dcddba993590e3073
Commit message: "[BEAM-4176] enumerate primitive transforms in portable 
construction"
 > git rev-list --no-walk c49a97ecbf815b320926285dcddba993590e3073 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[Gradle] - Launching build.
[src] $ 
 
--info --continue --max-workers=12 -Dorg.gradle.jvmargs=-Xms2g 
-Dorg.gradle.jvmargs=-Xmx4g :beam-website:mergeWebsite
Initialized native services in: /home/jenkins/.gradle/native
To honour the JVM settings for this build a new JVM will be forked. Please 
consider using the daemon: 
https://docs.gradle.org/4.8/userguide/gradle_daemon.html.
Starting process 'Gradle build daemon'. Working directory: 
/home/jenkins/.gradle/daemon/4.8 Command: 
/usr/local/asfpackages/java/jdk1.8.0_172/bin/java -Xmx4g -Dfile.encoding=UTF-8 
-Duser.country=US -Duser.language=en -Duser.variant -cp 
/home/jenkins/.gradle/wrapper/dists/gradle-4.8-bin/divx0s2uj4thofgytb7gf9fsi/gradle-4.8/lib/gradle-launcher-4.8.jar
 org.gradle.launcher.daemon.bootstrap.GradleDaemon 4.8
Successfully started process 'Gradle build daemon'
An attempt to start the daemon took 0.905 secs.
The client will now receive all logging from the daemon (pid: 327). The daemon 
log file: /home/jenkins/.gradle/daemon/4.8/daemon-327.out.log
Closing daemon's stdin at end of input.
The daemon will no longer process any standard input.
Daemon will be stopped at the end of the build stopping after processing
Using 12 worker leases.
Starting Build
Parallel execution is an incubating feature.

> Configure project :buildSrc
Evaluating project ':buildSrc' using build file 
'
file or directory 
'
 not found
Selected primary task 'build' from project :
file or directory 
'
 not found
Using local directory build cache for build ':buildSrc' (location = 
/home/jenkins/.gradle/caches/build-cache-1, removeUnusedEntriesAfter = 7 days).
:compileJava (Thread[Task worker for ':buildSrc' Thread 5,5,main]) started.

> Task :buildSrc:compileJava NO-SOURCE
file or directory 
'
 not found
Skipping task ':buildSrc:compileJava' as it has no source files and no previous 
output files.
:compileJava (Thread[Task worker for ':buildSrc' Thread 5,5,main]) completed. 
Took 0.022 secs.
:compileGroovy (Thread[Task worker for ':buildSrc' Thread 5,5,main]) started.

> Task :buildSrc:compileGroovy
Build cache key for task ':buildSrc:compileGroovy' is 
4aa1b6c9c202c4fc990c9cd958f0dce7
Task ':buildSrc:compileGroovy' is not up-to-date because:
  No history is available.
Origin for task ':buildSrc:compileGroovy': {executionTime=5846, 
hostName=jenkins-websites1.apache.org, operatingSystem=Linux, 
buildInvocationId=2q6xjk5cnjct3igdpoqkgqnpzy, creationTime=1537307937359, 
type=org.gradle.api.tasks.compile.GroovyCompile_Decorated, userName=jenkins, 
gradleVersion=4.8, 
rootPath=
 path=:compileGroovy}
Unpacked output for task ':buildSrc:compileGroovy' from cache.

> Task :buildSrc:compileGroovy FROM-CACHE
:compileGroovy (Thread[Task worker for ':buildSrc' Thread 5,5,main]) completed. 
Took 2.523 secs.
:processResources (Thread[Task worker for ':buildSrc' Thread 5,5,main]) started.

> Task 

[jira] [Updated] (BEAM-5375) KafkaIO reader should handle runtime exceptions kafka client

2018-09-18 Thread Raghu Angadi (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghu Angadi updated BEAM-5375:
---
Description: 
KafkaIO reader might stop reading from Kafka without any explicit error message 
if KafkaConsumer throws a runtime exception while polling for messages. One of 
the Dataflow customers encountered this issue (see [user@ 
thread|https://lists.apache.org/thread.html/c0cf8f45f567a0623592e2d8340f5288e3e774b59bca985aec410a81@%3Cuser.beam.apache.org%3E])

'consumerPollThread()' in KafkaIO deliberately avoided catching runtime 
exceptions. It should handle them; stuff happens at runtime. 

It should result in an 'IOException' from start()/advance(). The runners will 
then properly report the error and close the readers. 

  was:
KafkaIO reader might stop reading from Kafka without any explicit error message 
if KafkaConsumer throws a runtime exception while polling for messages. One of 
the Dataflow customers encountered this issue (see [user@ 
thread|[https://lists.apache.org/thread.html/c0cf8f45f567a0623592e2d8340f5288e3e774b59bca985aec410a81@%3Cuser.beam.apache.org%3E])]

'consumerPollThread()' in KafkaIO deliberately avoided catching runtime 
exceptions. It shoud handle it.. stuff happens at runtime. 

It should result in 'IOException' from start()/advance(). The runners will 
handle properly reporting and closing the readers. 


> KafkaIO reader should handle runtime exceptions kafka client
> 
>
> Key: BEAM-5375
> URL: https://issues.apache.org/jira/browse/BEAM-5375
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-kafka
>Affects Versions: 2.7.0
>Reporter: Raghu Angadi
>Assignee: Raghu Angadi
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> KafkaIO reader might stop reading from Kafka without any explicit error 
> message if KafkaConsumer throws a runtime exception while polling for 
> messages. One of the Dataflow customers encountered this issue (see [user@ 
> thread|https://lists.apache.org/thread.html/c0cf8f45f567a0623592e2d8340f5288e3e774b59bca985aec410a81@%3Cuser.beam.apache.org%3E])
> 'consumerPollThread()' in KafkaIO deliberately avoided catching runtime 
> exceptions. It should handle them; stuff happens at runtime. 
> It should result in an 'IOException' from start()/advance(). The runners will 
> then properly report the error and close the readers. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PostCommit_Python_Verify #6018

2018-09-18 Thread Apache Jenkins Server
See 


Changes:

[ryan.blake.williams] specify localhost for portable flink VR tests

[ryan.blake.williams] enumerate primitive transforms in portable construction

[github] Update deprecation comment.

[github] Swap from READ_TRANSFORM to PAR_DO_TRANSFORM

[github] Update comment.

[lcwik] Fix tests expectations and minor code fix up.

[ryan.blake.williams] checkstyle nit

[ryan.blake.williams] use artificial subtransforms of primitives

[ryan.blake.williams] import URNs more directly in QueryablePipeline

[ryan.blake.williams] add splittable URNs as primitives

[lcwik] [BEAM-4176] Add the ability to allow for runners to register native

[chamikara] Upgrades Google API Client libraries to 1.24.1

[ryan.blake.williams] add NativeTransform test assertions

[kevinsi] Check if validation is disabled when validating BigtableSource

--
[...truncated 1.10 MB...]
test_proto_coder (apache_beam.coders.coders_test.ProtoCoderTest) ... ok
test_base64_pickle_coder (apache_beam.coders.coders_test_common.CodersTest) ... 
ok
test_bytes_coder (apache_beam.coders.coders_test_common.CodersTest) ... ok
test_custom_coder (apache_beam.coders.coders_test_common.CodersTest) ... ok
test_deterministic_coder (apache_beam.coders.coders_test_common.CodersTest) ... 
ok
test_dill_coder (apache_beam.coders.coders_test_common.CodersTest) ... ok
test_fast_primitives_coder (apache_beam.coders.coders_test_common.CodersTest) 
... ok
test_float_coder (apache_beam.coders.coders_test_common.CodersTest) ... ok
test_global_window_coder (apache_beam.coders.coders_test_common.CodersTest) ... 
ok
test_interval_window_coder (apache_beam.coders.coders_test_common.CodersTest) 
... ok
test_iterable_coder (apache_beam.coders.coders_test_common.CodersTest) ... ok
test_iterable_coder_unknown_length 
(apache_beam.coders.coders_test_common.CodersTest) ... ok
test_length_prefix_coder (apache_beam.coders.coders_test_common.CodersTest) ... 
ok
test_nested_observables (apache_beam.coders.coders_test_common.CodersTest) ... 
ok
test_pickle_coder (apache_beam.coders.coders_test_common.CodersTest) ... ok
test_proto_coder (apache_beam.coders.coders_test_common.CodersTest) ... ok
test_singleton_coder (apache_beam.coders.coders_test_common.CodersTest) ... ok
test_timestamp_coder (apache_beam.coders.coders_test_common.CodersTest) ... ok
test_tuple_coder (apache_beam.coders.coders_test_common.CodersTest) ... ok
test_tuple_sequence_coder (apache_beam.coders.coders_test_common.CodersTest) 
... ok
test_utf8_coder (apache_beam.coders.coders_test_common.CodersTest) ... ok
test_varint_coder (apache_beam.coders.coders_test_common.CodersTest) ... ok
test_windowed_value_coder (apache_beam.coders.coders_test_common.CodersTest) 
... ok
test_windowedvalue_coder_paneinfo 
(apache_beam.coders.coders_test_common.CodersTest) ... ok
test_observable (apache_beam.coders.observable_test.ObservableMixinTest) ... 
:50:
 DeprecationWarning: Please use assertEqual instead.
  self.assertEquals(3, self.observed_count)
ok
test_base64_pickle_coder 
(apache_beam.coders.slow_coders_test.transplant_class..C) ... ok
test_bytes_coder 
(apache_beam.coders.slow_coders_test.transplant_class..C) ... ok
test_custom_coder 
(apache_beam.coders.slow_coders_test.transplant_class..C) ... ok
test_deterministic_coder 
(apache_beam.coders.slow_coders_test.transplant_class..C) ... ok
test_dill_coder 
(apache_beam.coders.slow_coders_test.transplant_class..C) ... ok
test_fast_primitives_coder 
(apache_beam.coders.slow_coders_test.transplant_class..C) ... ok
test_float_coder 
(apache_beam.coders.slow_coders_test.transplant_class..C) ... ok
test_global_window_coder 
(apache_beam.coders.slow_coders_test.transplant_class..C) ... ok
test_interval_window_coder 
(apache_beam.coders.slow_coders_test.transplant_class..C) ... ok
test_iterable_coder 
(apache_beam.coders.slow_coders_test.transplant_class..C) ... ok
test_iterable_coder_unknown_length 
(apache_beam.coders.slow_coders_test.transplant_class..C) ... ok
test_length_prefix_coder 
(apache_beam.coders.slow_coders_test.transplant_class..C) ... ok
test_nested_observables 
(apache_beam.coders.slow_coders_test.transplant_class..C) ... ok
test_pickle_coder 
(apache_beam.coders.slow_coders_test.transplant_class..C) ... ok
test_proto_coder 
(apache_beam.coders.slow_coders_test.transplant_class..C) ... ok
test_singleton_coder 
(apache_beam.coders.slow_coders_test.transplant_class..C) ... ok
test_timestamp_coder 
(apache_beam.coders.slow_coders_test.transplant_class..C) ... ok
test_tuple_coder 
(apache_beam.coders.slow_coders_test.transplant_class..C) ... ok
test_tuple_sequence_coder 
(apache_beam.coders.slow_coders_test.transplant_class..C) ... ok
test_utf8_coder 
(apache_beam.coders.slow_coders_test.transplant_class..C) ... ok

[jira] [Commented] (BEAM-4176) Java: Portable batch runner passes all ValidatesRunner tests that non-portable runner passes

2018-09-18 Thread Ryan Williams (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16619901#comment-16619901
 ] 

Ryan Williams commented on BEAM-4176:
-

As of [#6328|https://github.com/apache/beam/pull/6328] being merged, we're down 
to 117 failures:

!https://camo.githubusercontent.com/be8a669f7d4fd09b1cd83e224ab37db2a4c68fd4/68747470733a2f2f636c2e6c792f3465643734343832393236342f53637265656e25323053686f74253230323031382d30392d31382532306174253230362e30342e3232253230504d2e706e67!

I put them in [this google 
sheet|https://docs.google.com/spreadsheets/d/1aNH4iwR99s2bSjFpj3vcHhdrVg_9q7XySX_5jKjyK2E/edit?usp=sharing],
 where I am also going through and spot-checking why each case failed ([here's 
the full test output I'm working 
from|https://storage.googleapis.com/runsascoded-tmp/beam-portable-flink-vr-tests/ecd1ac085a/index.html]).

So far, all the cases I've checked have failed due to transforms of the form 
{{Combine.globally(…)/View.AsIterable/View.CreatePCollectionView}}, e.g.:
{code:java}
[flink-runner-job-server] ERROR 
org.apache.beam.runners.flink.FlinkJobInvocation - Error during job invocation 
combinetest0accumulationtests0testaccumulatingcombineempty-ryan-0918183535-7620b406_f38db722-b27f-45a6-bbf5-e0bc86261cb2.
java.lang.IllegalArgumentException: Unknown type of URN 
beam:transform:create_view:v1 for PTransform with id 
Combine.globally(MeanInts)/View.AsIterable/View.CreatePCollectionView.
at 
org.apache.beam.runners.flink.FlinkBatchPortablePipelineTranslator.urnNotFound(FlinkBatchPortablePipelineTranslator.java:578)
at 
org.apache.beam.runners.flink.FlinkBatchPortablePipelineTranslator.translate(FlinkBatchPortablePipelineTranslator.java:233)
at 
org.apache.beam.runners.flink.FlinkJobInvocation.runPipeline(FlinkJobInvocation.java:112)
at 
org.apache.beam.repackaged.beam_runners_flink_2.11.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:111)
at 
org.apache.beam.repackaged.beam_runners_flink_2.11.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:58)
{code}
This is promising; getting {{PTransformTranslation.CREATE_VIEW_TRANSFORM_URN}} 
into  [this list in 
FlinkBatchPortablePipelineTranslator|https://github.com/apache/beam/blob/c49a97ecbf815b320926285dcddba993590e3073/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkBatchPortablePipelineTranslator.java#L132-L153]
 may recover many more cases.

Should FlinkBatchPortablePipelineTranslator handle {{create_view}} URN 
directly? Or should such transforms be folded into ExecutableStages? Or 
something else?

 

> Java: Portable batch runner passes all ValidatesRunner tests that 
> non-portable runner passes
> 
>
> Key: BEAM-4176
> URL: https://issues.apache.org/jira/browse/BEAM-4176
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Ben Sidhom
>Priority: Major
> Attachments: Screen Shot 2018-08-14 at 4.18.31 PM.png, Screen Shot 
> 2018-09-03 at 11.07.38 AM.png
>
>  Time Spent: 22h 20m
>  Remaining Estimate: 0h
>
> We need this as a sanity check that runner execution is correct.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-2947) SDK harnesses should be able to indicate panic in FnAPI

2018-09-18 Thread Henning Rohde (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henning Rohde reassigned BEAM-2947:
---

Assignee: (was: Henning Rohde)

> SDK harnesses should be able to indicate panic in FnAPI
> ---
>
> Key: BEAM-2947
> URL: https://issues.apache.org/jira/browse/BEAM-2947
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Reporter: Henning Rohde
>Priority: Minor
>  Labels: portability
>
> If the SDK harness encounters a permanent problem (local file corruption or 
> bad staged files, say) it should be able to relay that information to the 
> runner as a "panic" message. Such conditions may also happen during the 
> boot stage.
> Using process/container exit codes would likely not work well, because that 
> information may not flow easily to the runner logic depending on how 
> containers are managed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-5350) Running autocomplete.go on dataflow fails

2018-09-18 Thread Henning Rohde (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henning Rohde reassigned BEAM-5350:
---

Assignee: (was: Henning Rohde)

> Running autocomplete.go on dataflow fails
> -
>
> Key: BEAM-5350
> URL: https://issues.apache.org/jira/browse/BEAM-5350
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Affects Versions: Not applicable
>Reporter: Tomas Roos
>Priority: Major
>
> I'm in the process, as an external developer, of making sure that all examples 
> are runnable on both the direct and the Dataflow runner, as it's crucial for 
> people onboarding onto this project.
>  
> I've visited the examples before; some are runnable, some probably were 
> previously, and some are definitely not runnable.
>  
> So I started top down today. In order to make autocomplete.go run on Dataflow 
> as well as the direct runner, I changed the input to make it platform 
> independent instead of pointing to a local file. Reading the source from 
> public cloud storage went fine, but running the top.Largest anonymous less 
> function (ran on id: go-job-1-1536575613531078735) failed with
>  
>  
> {code:java}
> RESP: instruction_id: "-205" error: "Invalid bundle desc: decode: bad userfn: 
> bad struct encoding: failed to decode data: decode: failed to find symbol 
> main.main.func1: main.main.func1 not found. Use runtime.RegisterFunction in 
> unit tests" register: < >
>  
> {code}
>  
> [https://github.com/apache/beam/blob/master/sdks/go/examples/complete/autocomplete/autocomplete.go#L63]
>  
> So in order to fix this I introduced a local func called lessFn and 
> registered it in the init process. But now, when running
>  
> {code:java}
>  
> go run autocomplete.go --project fair-app-213019 --runner dataflow 
> --staging_location=gs://fair-app-213019/staging-test2 
> --worker_harness_container_image=apache-docker-beam-snapshots-docker.bintray.io/beam/go:20180515
> {code}
>  
>  
> it fails with
>  
> {code:java}
> 2018/09/10 13:37:10 Running autocomplete
>  Unable to encode combiner for lifting: failed to encode custom coder: bad 
> underlying type: bad field type: bad element: unencodable type: interface 
> {}2018/09/10 13:37:10 Using running binary as worker binary: 
> '/tmp/go-build157286122/b001/exe/autocomplete'
>  2018/09/10 13:37:10 Staging worker binary: 
> /tmp/go-build157286122/b001/exe/autocomplete{code}
>  
> And I know this happens when invoking top.Largest, since after removing that 
> piece of code the job runs fine. Could you please point me in the right 
> direction as to why my local func is not encodable as an interface{}? I will 
> of course happily send a PR when this is working on both the direct and the 
> Dataflow runner, so I can move on to the other examples.
>  
> (All changes can be seen here) 
> [https://github.com/apache/beam/compare/master...ptomasroos:make-autocomplete-dataflowable?expand=1]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3286) Go SDK support for portable side input

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3286?focusedWorklogId=145525&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145525
 ]

ASF GitHub Bot logged work on BEAM-3286:


Author: ASF GitHub Bot
Created on: 18/Sep/18 23:25
Start Date: 18/Sep/18 23:25
Worklog Time Spent: 10m 
  Work Description: herohde commented on issue #6402: [BEAM-3286] Add 
Dataflow support for side input
URL: https://github.com/apache/beam/pull/6402#issuecomment-422591851
 
 
   Run Go PostCommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145525)
Time Spent: 3h 10m  (was: 3h)

> Go SDK support for portable side input
> --
>
> Key: BEAM-3286
> URL: https://issues.apache.org/jira/browse/BEAM-3286
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Henning Rohde
>Assignee: Henning Rohde
>Priority: Major
>  Labels: portability
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3194) Support annotating that a DoFn requires stable / deterministic input for replay/retry

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3194?focusedWorklogId=145524&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145524
 ]

ASF GitHub Bot logged work on BEAM-3194:


Author: ASF GitHub Bot
Created on: 18/Sep/18 23:14
Start Date: 18/Sep/18 23:14
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #6388: [BEAM-3194] Fail if 
@RequiresStableInput is used on runners that don't support it
URL: https://github.com/apache/beam/pull/6388#issuecomment-422589744
 
 
   Talk to @youngoli 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145524)
Time Spent: 3h 10m  (was: 3h)

> Support annotating that a DoFn requires stable / deterministic input for 
> replay/retry
> -
>
> Key: BEAM-3194
> URL: https://issues.apache.org/jira/browse/BEAM-3194
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model, sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Yueyang Qiu
>Priority: Major
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> See the thread: 
> https://lists.apache.org/thread.html/5fd81ce371aeaf642665348f8e6940e308e04275dd7072f380f9f945@%3Cdev.beam.apache.org%3E
> We need this in order to have truly cross-runner end-to-end exactly once via 
> replay + idempotence.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3194) Support annotating that a DoFn requires stable / deterministic input for replay/retry

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3194?focusedWorklogId=145523&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145523
 ]

ASF GitHub Bot logged work on BEAM-3194:


Author: ASF GitHub Bot
Created on: 18/Sep/18 23:13
Start Date: 18/Sep/18 23:13
Worklog Time Spent: 10m 
  Work Description: robinyqiu commented on issue #6388: [BEAM-3194] Fail if 
@RequiresStableInput is used on runners that don't support it
URL: https://github.com/apache/beam/pull/6388#issuecomment-422589533
 
 
   It's supported on Dataflow. DirectRunner doesn't have retry, so it doesn't 
matter. I am not very sure how the Reference runner is implemented in terms of 
retry. Do you know the details, or who I should talk to?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145523)
Time Spent: 3h  (was: 2h 50m)

> Support annotating that a DoFn requires stable / deterministic input for 
> replay/retry
> -
>
> Key: BEAM-3194
> URL: https://issues.apache.org/jira/browse/BEAM-3194
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model, sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Yueyang Qiu
>Priority: Major
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> See the thread: 
> https://lists.apache.org/thread.html/5fd81ce371aeaf642665348f8e6940e308e04275dd7072f380f9f945@%3Cdev.beam.apache.org%3E
> We need this in order to have truly cross-runner end-to-end exactly once via 
> replay + idempotence.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4176) Java: Portable batch runner passes all ValidatesRunner tests that non-portable runner passes

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4176?focusedWorklogId=145521&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145521
 ]

ASF GitHub Bot logged work on BEAM-4176:


Author: ASF GitHub Bot
Created on: 18/Sep/18 23:10
Start Date: 18/Sep/18 23:10
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on issue #6328: [BEAM-4176] 
enumerate primitive transforms in portable construction
URL: https://github.com/apache/beam/pull/6328#issuecomment-422588775
 
 
   Yea, to be clear, the tests themselves (apparently) weren't getting stuck, 
but attempting to run them was, due to local Docker issues, which I agree is 
preferable  


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145521)
Time Spent: 22h 20m  (was: 22h 10m)

> Java: Portable batch runner passes all ValidatesRunner tests that 
> non-portable runner passes
> 
>
> Key: BEAM-4176
> URL: https://issues.apache.org/jira/browse/BEAM-4176
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Ben Sidhom
>Priority: Major
> Attachments: Screen Shot 2018-08-14 at 4.18.31 PM.png, Screen Shot 
> 2018-09-03 at 11.07.38 AM.png
>
>  Time Spent: 22h 20m
>  Remaining Estimate: 0h
>
> We need this as a sanity check that runner execution is correct.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4711) LocalFileSystem.delete doesn't support globbing

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4711?focusedWorklogId=145519&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145519
 ]

ASF GitHub Bot logged work on BEAM-4711:


Author: ASF GitHub Bot
Created on: 18/Sep/18 22:57
Start Date: 18/Sep/18 22:57
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on a change in pull request 
#5863: [BEAM-4711] fix globbing in LocalFileSystem.delete
URL: https://github.com/apache/beam/pull/5863#discussion_r218621015
 
 

 ##
 File path: sdks/python/apache_beam/io/localfilesystem.py
 ##
 @@ -321,11 +321,22 @@ def _delete_path(path):
 raise IOError(err)
 
 exceptions = {}
-for path in paths:
+
+def try_delete(path):
   try:
 _delete_path(path)
   except Exception as e:  # pylint: disable=broad-except
 exceptions[path] = e
 
+for match_result in self.match(paths):
+  metadata_list = match_result.metadata_list
+
+  if not metadata_list:
 
 Review comment:
   sorry I dropped this; yea I only made it an error to conform to the existing 
test you mentioned
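
   For readers following along, a minimal sketch (assuming the public 
   `FileSystems` facade from `apache_beam.io.filesystems`, and a made-up glob) 
   of the same match-then-delete pattern the diff above applies inside 
   `LocalFileSystem.delete`:

   ```Python
   from apache_beam.io.filesystems import FileSystems

   # Expand the glob into concrete paths first, then delete only what matched.
   match_results = FileSystems.match(['foo/1530557644/results*'])
   paths = [metadata.path for metadata in match_results[0].metadata_list]
   if paths:
       FileSystems.delete(paths)
   ```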


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145519)
Time Spent: 2h  (was: 1h 50m)

> LocalFileSystem.delete doesn't support globbing
> ---
>
> Key: BEAM-4711
> URL: https://issues.apache.org/jira/browse/BEAM-4711
> Project: Beam
>  Issue Type: Task
>  Components: sdk-py-core
>Affects Versions: 2.5.0
>Reporter: Ryan Williams
>Assignee: Ryan Williams
>Priority: Minor
> Fix For: 2.8.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> I attempted to run {{wordcount_it_test:WordCountIT.test_wordcount_it}} 
> locally with {{DirectRunner}}:
> {code}
> python setup.py nosetests \
>   --tests 
> apache_beam.examples.wordcount_it_test:WordCountIT.test_wordcount_it \
>   --test-pipeline-options="--output=foo"
> {code}
> It failed in [the {{delete_files}} cleanup 
> command|https://github.com/apache/beam/blob/a58f1ffaafb0e2ebcc73a1c5abfb05a15ec6a84b/sdks/python/apache_beam/examples/wordcount_it_test.py#L64]:
> {code}
> root: WARNING: Retry with exponential backoff: waiting for 11.1454450937 
> seconds before retrying delete_files because we caught exception: 
> BeamIOError: Delete operation failed with exceptions 
> {'foo/1530557644/results*': IOError(OSError(2, 'No such file or directory'),)}
>  Traceback for above exception (most recent call last):
>   File "/Users/ryan/c/beam/sdks/python/apache_beam/utils/retry.py", line 184, 
> in wrapper
> return fun(*args, **kwargs)
>   File "/Users/ryan/c/beam/sdks/python/apache_beam/testing/test_utils.py", 
> line 136, in delete_files
> FileSystems.delete(file_paths)
>   File "/Users/ryan/c/beam/sdks/python/apache_beam/io/filesystems.py", line 
> 282, in delete
> return filesystem.delete(paths)
>   File "/Users/ryan/c/beam/sdks/python/apache_beam/io/localfilesystem.py", 
> line 304, in delete
> raise BeamIOError("Delete operation failed", exceptions)
> {code}
> The line:
> {code}
> self.addCleanup(delete_files, [output + '*'])
> {code}
> works as expected in GCS, and deletes a test's output-directory, but it fails 
> in on the local-filesystem, which doesn't expand globs before attempting to 
> delete paths.
> It would be good to make these consistent, presumably by adding glob-support 
> to {{LocalFileSystem}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5426) Use both destination and TableDestination for BQ load job IDs

2018-09-18 Thread Reuven Lax (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16619830#comment-16619830
 ] 

Reuven Lax commented on BEAM-5426:
--

Two issues:
 # I'm not sure how to do this easily as the destinations are sharded across 
all the workers.
 # We don't have a way of failing jobs from in the SDK. The best we can do is 
throw an exception, but that doesn't necessarily fail the job (for Dataflow 
streaming, that will simply result in a infinite exception loop and a stuck 
job).

> Use both destination and TableDestination for BQ load job IDs
> -
>
> Key: BEAM-5426
> URL: https://issues.apache.org/jira/browse/BEAM-5426
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Reporter: Chamikara Jayalath
>Priority: Major
>
> Currently we use TableDestination when creating a unique load job ID for a 
> destination: 
> [https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryHelpers.java#L359]
>  
> This can result in a data loss issue if a user returns the same 
> TableDestination for different destination IDs. I think we can prevent this 
> if we include both IDs in the BQ load job ID.
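> As a minimal sketch of the idea (pseudocode in Python for brevity; the real 
> implementation lives in the Java SDK's BigQueryHelpers, and the helper below is 
> hypothetical):
> {code}
> import hashlib
> 
> def load_job_id(job_prefix, destination_key, table_destination, partition):
>   """Derive a load job id from BOTH the dynamic-destination key and the
>   resolved table, so two keys mapping to the same table cannot collide."""
>   digest = hashlib.sha1(
>       ('%s|%s' % (destination_key, table_destination)).encode('utf-8')
>   ).hexdigest()[:16]
>   return '%s_%s_%05d' % (job_prefix, digest, partition)
> {code}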
>  
> CC: [~reuvenlax]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5324) Finish Python 3 porting for unpackaged files

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5324?focusedWorklogId=145513=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145513
 ]

ASF GitHub Bot logged work on BEAM-5324:


Author: ASF GitHub Bot
Created on: 18/Sep/18 22:43
Start Date: 18/Sep/18 22:43
Worklog Time Spent: 10m 
  Work Description: RobbeSneyders commented on a change in pull request 
#6424: [BEAM-5324] Partially port unpackaged modules to Python 3
URL: https://github.com/apache/beam/pull/6424#discussion_r218618418
 
 

 ##
 File path: sdks/python/apache_beam/runners/worker/operations.py
 ##
 @@ -496,7 +496,8 @@ def __init__(self, name_context, spec, counter_factory, 
state_sampler):
 fn, args, kwargs = pickler.loads(self.spec.combine_fn)[:3]
 self.combine_fn = curry_combine_fn(fn, args, kwargs)
 if (getattr(fn.add_input, 'im_func', None)
-is core.CombineFn.add_input.__func__):
+is getattr(core.CombineFn.add_input, '__func__',
 
 Review comment:
   On Python 2, getting a method from a class returns an unbound method, with a 
reference to the wrapped function stored in `__func__`.
   On Python 3, getting a method from a class returns the function itself.
   
   On Python 2.6+, `im_func` is also available as `__func__`, so we can change 
it to
   ```Python
   if (getattr(fn.add_input, '__func__', None)
       is getattr(core.CombineFn.add_input, '__func__',
                  core.CombineFn.add_input)):
   ```
   if that is clearer.
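   For illustration, a minimal standalone sketch of the behaviour described above 
(the class names below are hypothetical, not Beam code):
   ```Python
   class Base(object):
     def add_input(self, x):
       return x

   class Child(Base):
     pass

   # Python 2: Child.add_input is an unbound method; the underlying function is
   # stored in __func__ (legacy alias: im_func).
   # Python 3: Child.add_input is already the plain function, so __func__ is absent.
   base_fn = getattr(Base.add_input, '__func__', Base.add_input)
   child_fn = getattr(Child.add_input, '__func__', Child.add_input)
   print(child_fn is base_fn)  # True on both Python 2 and Python 3
   ```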


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145513)
Time Spent: 40m  (was: 0.5h)

> Finish Python 3 porting for unpackaged files
> 
>
> Key: BEAM-5324
> URL: https://issues.apache.org/jira/browse/BEAM-5324
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Robbe
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5339) Implement new policy on Beam dependency tooling

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5339?focusedWorklogId=145511=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145511
 ]

ASF GitHub Bot logged work on BEAM-5339:


Author: ASF GitHub Bot
Created on: 18/Sep/18 22:38
Start Date: 18/Sep/18 22:38
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on issue #554: [BEAM-5339] update 
the beam dependency guide
URL: https://github.com/apache/beam-site/pull/554#issuecomment-422581567
 
 
   +R: @chamikaramj 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145511)
Time Spent: 1h 40m  (was: 1.5h)

> Implement new policy on Beam dependency tooling
> ---
>
> Key: BEAM-5339
> URL: https://issues.apache.org/jira/browse/BEAM-5339
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: yifan zou
>Assignee: yifan zou
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> (1) Instead of a dependency "owners" list we will be maintaining an 
> "interested parties" list. When we create a JIRA for a dependency we will not 
> assign it to an owner but rather we will CC all the folks that mentioned that 
> they will be interested in receiving updates related to that dependency. The 
> hope is that some of the interested parties will also put forward the effort to 
> upgrade dependencies they are interested in, but the responsibility for 
> upgrading dependencies lies with the community as a whole.
>  (2) We will be creating JIRAs for upgrading individual dependencies, not for 
> upgrading to specific versions of those dependencies. For example, if a given 
> dependency X is three minor versions or a year behind, we will create a JIRA 
> for upgrading it. But the specific version to upgrade to has to be 
> determined by the Beam community. The Beam community might choose to close a JIRA 
> if there are known issues with available recent releases. The tool may reopen 
> such a closed JIRA in the future if new information becomes available (for 
> example, 3 new versions have been released since the JIRA was closed).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5426) Use both destination and TableDestination for BQ load job IDs

2018-09-18 Thread Chamikara Jayalath (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16619823#comment-16619823
 ] 

Chamikara Jayalath commented on BEAM-5426:
--

In that case, how about keeping track of load jobs for different destinations, 
and failing the job if we detect two load jobs for the same destination? We 
should find a way to actively fail in this case, since currently it ends up 
being silent data loss.
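A rough sketch of the detection idea, expressed in Python for brevity (the real 
code would live in the Java BigQuery sink; all names below are hypothetical):
{code}
def check_unique_table_destinations(destination_to_table):
  """Fail fast if two distinct destination keys resolve to the same table."""
  seen = {}  # table spec -> destination key that first claimed it
  for destination_key, table_spec in destination_to_table.items():
    if table_spec in seen and seen[table_spec] != destination_key:
      raise ValueError(
          'Destinations %r and %r both resolve to table %r; refusing to run '
          'parallel load jobs against the same table.'
          % (seen[table_spec], destination_key, table_spec))
    seen[table_spec] = destination_key
{code}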

> Use both destination and TableDestination for BQ load job IDs
> -
>
> Key: BEAM-5426
> URL: https://issues.apache.org/jira/browse/BEAM-5426
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Reporter: Chamikara Jayalath
>Priority: Major
>
> Currently we use TableDestination when creating a unique load job ID for a 
> destination: 
> [https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryHelpers.java#L359]
>  
> This can result in a data loss issue if a user returns the same 
> TableDestination for different destination IDs. I think we can prevent this 
> if we include both IDs in the BQ load job ID.
>  
> CC: [~reuvenlax]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4176) Java: Portable batch runner passes all ValidatesRunner tests that non-portable runner passes

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4176?focusedWorklogId=145510=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145510
 ]

ASF GitHub Bot logged work on BEAM-4176:


Author: ASF GitHub Bot
Created on: 18/Sep/18 22:36
Start Date: 18/Sep/18 22:36
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #6328: [BEAM-4176] 
enumerate primitive transforms in portable construction
URL: https://github.com/apache/beam/pull/6328#issuecomment-422581022
 
 
   Thanks Ryan! 
   Also, the good part is that the test cases are not getting stuck.
   Many times the tests used to get stuck instead of failing.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145510)
Time Spent: 22h 10m  (was: 22h)

> Java: Portable batch runner passes all ValidatesRunner tests that 
> non-portable runner passes
> 
>
> Key: BEAM-4176
> URL: https://issues.apache.org/jira/browse/BEAM-4176
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Ben Sidhom
>Priority: Major
> Attachments: Screen Shot 2018-08-14 at 4.18.31 PM.png, Screen Shot 
> 2018-09-03 at 11.07.38 AM.png
>
>  Time Spent: 22h 10m
>  Remaining Estimate: 0h
>
> We need this as a sanity check that runner execution is correct.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3878) Improve error reporting in calls.go

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3878?focusedWorklogId=145509=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145509
 ]

ASF GitHub Bot logged work on BEAM-3878:


Author: ASF GitHub Bot
Created on: 18/Sep/18 22:26
Start Date: 18/Sep/18 22:26
Worklog Time Spent: 10m 
  Work Description: lostluck commented on a change in pull request #6156: 
[BEAM-3878] Improve error reporting in calls.go
URL: https://github.com/apache/beam/pull/6156#discussion_r218615114
 
 

 ##
 File path: sdks/go/pkg/beam/core/util/reflectx/calls.tmpl
 ##
 @@ -57,7 +60,9 @@ func (c *shimFunc{{$in}}x{{$out}}) 
Call{{$in}}x{{$out}}({{mkargs $in "arg%v" "in
 
 func ToFunc{{$in}}x{{$out}}(c Func) Func{{$in}}x{{$out}} {
 if c.Type().NumIn() != {{$in}} || c.Type().NumOut() != {{$out}} {
-panic("incompatible func type")
+panic(fmt.Sprintf("incompatible func type, expected function of 
{{$in}} " +
+   "inputs & {{$out}} outputs and instead received a function of %d inputs 
and %d" +
 
 Review comment:
   +1 to this being a good change, and I like henning's suggestion.
   I agree that we should also include the type signature of the failing 
function since it'll help narrow down where something went wrong if the stack 
trace gets swallowed.
   
   %v and passing in c.Type() will accomplish that.
   
   Maybe:
   
   panic(fmt.Sprintf("incompatible func type: %v has %d inputs and %d 
outputs, " + "want function of {{$in}} inputs & {{$out}} outputs", 
c.Type(), c.Type().NumIn(), c.Type().NumOut()))


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145509)
Time Spent: 1h  (was: 50m)

> Improve error reporting in calls.go
> ---
>
> Key: BEAM-3878
> URL: https://issues.apache.org/jira/browse/BEAM-3878
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Bill Neubauer
>Priority: Minor
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> The error messages generated in calls.go are not as helpful as they could be.
> Instead of simply reporting "incompatible func type" it would be great if 
> they reported the topology of the actual function supplied versus what is 
> expected. That would make debugging a lot easier.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PostCommit_Python_PVR_Flink_Gradle #60

2018-09-18 Thread Apache Jenkins Server
See 


Changes:

[ryan.blake.williams] specify localhost for portable flink VR tests

[ryan.blake.williams] enumerate primitive transforms in portable construction

[github] Update deprecation comment.

[github] Swap from READ_TRANSFORM to PAR_DO_TRANSFORM

[github] Update comment.

[lcwik] Fix tests expectations and minor code fix up.

[ryan.blake.williams] checkstyle nit

[ryan.blake.williams] use artificial subtransforms of primitives

[ryan.blake.williams] import URNs more directly in QueryablePipeline

[ryan.blake.williams] add splittable URNs as primitives

[lcwik] [BEAM-4176] Add the ability to allow for runners to register native

[ryan.blake.williams] add NativeTransform test assertions

--
[...truncated 554.22 KB...]
_Rendezvous: <_Rendezvous of RPC that terminated with:
status = StatusCode.UNAVAILABLE
details = "Socket closed"
debug_error_string = 
"{"created":"@1537309337.807160614","description":"Error received from 
peer","file":"src/core/lib/surface/call.cc","file_line":1099,"grpc_message":"Socket
 closed","grpc_status":14}"
>



==
ERROR: test_combine_per_key (__main__.FlinkRunnerTest)
--
Traceback (most recent call last):
  File "apache_beam/runners/portability/fn_api_runner_test.py", line 251, in 
test_combine_per_key
assert_that(res, equal_to([('a', 1.5), ('b', 3.0)]))
  File "apache_beam/pipeline.py", line 414, in __exit__
self.run().wait_until_finish()
  File "apache_beam/runners/portability/portable_runner.py", line 209, in 
wait_until_finish
'Pipeline %s failed in state %s.' % (self._job_id, self._state))
RuntimeError: Pipeline 
test_combine_per_key_1537309329.94_03fd97ca-dd5d-40ef-b6b1-96af9d01b3fc failed 
in state FAILED.

==
ERROR: test_create (__main__.FlinkRunnerTest)
--
Traceback (most recent call last):
  File "apache_beam/runners/portability/fn_api_runner_test.py", line 62, in 
test_create
assert_that(p | beam.Create(['a', 'b']), equal_to(['a', 'b']))
  File "apache_beam/pipeline.py", line 414, in __exit__
self.run().wait_until_finish()
  File "apache_beam/runners/portability/portable_runner.py", line 209, in 
wait_until_finish
'Pipeline %s failed in state %s.' % (self._job_id, self._state))
RuntimeError: Pipeline 
test_create_1537309330.31_2db5da9e-488a-47b4-b65a-3332a66747a6 failed in state 
FAILED.

==
ERROR: test_flatten (__main__.FlinkRunnerTest)
--
Traceback (most recent call last):
  File "apache_beam/runners/portability/fn_api_runner_test.py", line 244, in 
test_flatten
assert_that(res, equal_to(['a', 'b', 'c', 'd']))
  File "apache_beam/pipeline.py", line 414, in __exit__
self.run().wait_until_finish()
  File "apache_beam/runners/portability/portable_runner.py", line 209, in 
wait_until_finish
'Pipeline %s failed in state %s.' % (self._job_id, self._state))
RuntimeError: Pipeline 
test_flatten_1537309330.86_18aa7a9c-f600-4b3f-b908-d06aefaced35 failed in state 
FAILED.

==
ERROR: test_flattened_side_input (__main__.FlinkRunnerTest)
--
Traceback (most recent call last):
  File "apache_beam/runners/portability/fn_api_runner_test.py", line 190, in 
test_flattened_side_input
equal_to([(None, {'a': 1, 'b': 2})]))
  File "apache_beam/pipeline.py", line 414, in __exit__
self.run().wait_until_finish()
  File "apache_beam/runners/portability/portable_runner.py", line 209, in 
wait_until_finish
'Pipeline %s failed in state %s.' % (self._job_id, self._state))
RuntimeError: Pipeline 
test_flattened_side_input_1537309331.37_ec56a120-1c3d-4bec-84cf-35e6014da4a3 
failed in state FAILED.

==
ERROR: test_gbk_side_input (__main__.FlinkRunnerTest)
--
Traceback (most recent call last):
  File "apache_beam/runners/portability/fn_api_runner_test.py", line 198, in 
test_gbk_side_input
equal_to([(None, {'a': [1]})]))
  File "apache_beam/pipeline.py", line 414, in __exit__
self.run().wait_until_finish()
  File "apache_beam/runners/portability/portable_runner.py", line 209, in 
wait_until_finish
'Pipeline %s failed in state %s.' % (self._job_id, self._state))
RuntimeError: Pipeline 
test_gbk_side_input_1537309331.93_b5634110-c0cb-4b05-8811-b6e86f8b423d failed 
in state FAILED.


Build failed in Jenkins: beam_PostCommit_Website_Merge #4

2018-09-18 Thread Apache Jenkins Server
See 


Changes:

[ryan.blake.williams] specify localhost for portable flink VR tests

[ryan.blake.williams] enumerate primitive transforms in portable construction

[github] Update deprecation comment.

[github] Swap from READ_TRANSFORM to PAR_DO_TRANSFORM

[github] Update comment.

[lcwik] Fix tests expectations and minor code fix up.

[ryan.blake.williams] checkstyle nit

[ryan.blake.williams] use artificial subtransforms of primitives

[ryan.blake.williams] import URNs more directly in QueryablePipeline

[ryan.blake.williams] add splittable URNs as primitives

[lcwik] [BEAM-4176] Add the ability to allow for runners to register native

[ryan.blake.williams] add NativeTransform test assertions

--
Started by GitHub push by lukecwik
[EnvInject] - Loading node environment variables.
Building remotely on websites1 (git-websites svn-websites) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision c49a97ecbf815b320926285dcddba993590e3073 (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f c49a97ecbf815b320926285dcddba993590e3073
Commit message: "[BEAM-4176] enumerate primitive transforms in portable 
construction"
 > git rev-list --no-walk b292235fa9cce7f0b096415f827d964c9eadcb4e # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[Gradle] - Launching build.
[src] $ 
 
--info --continue --max-workers=12 -Dorg.gradle.jvmargs=-Xms2g 
-Dorg.gradle.jvmargs=-Xmx4g :beam-website:mergeWebsite
Initialized native services in: /home/jenkins/.gradle/native
To honour the JVM settings for this build a new JVM will be forked. Please 
consider using the daemon: 
https://docs.gradle.org/4.8/userguide/gradle_daemon.html.
Starting process 'Gradle build daemon'. Working directory: 
/home/jenkins/.gradle/daemon/4.8 Command: 
/usr/local/asfpackages/java/jdk1.8.0_172/bin/java -Xmx4g -Dfile.encoding=UTF-8 
-Duser.country=US -Duser.language=en -Duser.variant -cp 
/home/jenkins/.gradle/wrapper/dists/gradle-4.8-bin/divx0s2uj4thofgytb7gf9fsi/gradle-4.8/lib/gradle-launcher-4.8.jar
 org.gradle.launcher.daemon.bootstrap.GradleDaemon 4.8
Successfully started process 'Gradle build daemon'
An attempt to start the daemon took 2.959 secs.
The client will now receive all logging from the daemon (pid: 31093). The 
daemon log file: /home/jenkins/.gradle/daemon/4.8/daemon-31093.out.log
Closing daemon's stdin at end of input.
The daemon will no longer process any standard input.
Daemon will be stopped at the end of the build stopping after processing
Using 12 worker leases.
Starting Build
Parallel execution is an incubating feature.

> Configure project :buildSrc
Evaluating project ':buildSrc' using build file 
'
file or directory 
'
 not found
Selected primary task 'build' from project :
file or directory 
'
 not found
Using local directory build cache for build ':buildSrc' (location = 
/home/jenkins/.gradle/caches/build-cache-1, removeUnusedEntriesAfter = 7 days).
:compileJava (Thread[Task worker for ':buildSrc' Thread 2,5,main]) started.

> Task :buildSrc:compileJava NO-SOURCE
file or directory 
'
 not found
Skipping task ':buildSrc:compileJava' as it has no source files and no previous 
output files.
:compileJava (Thread[Task worker for ':buildSrc' Thread 2,5,main]) completed. 
Took 0.035 secs.
:compileGroovy (Thread[Task worker for ':buildSrc' Thread 7,5,main]) started.

> Task :buildSrc:compileGroovy
Build cache key for task ':buildSrc:compileGroovy' is 
4aa1b6c9c202c4fc990c9cd958f0dce7
Task ':buildSrc:compileGroovy' is not up-to-date because:
  No history 

[jira] [Work logged] (BEAM-4176) Java: Portable batch runner passes all ValidatesRunner tests that non-portable runner passes

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4176?focusedWorklogId=145506=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145506
 ]

ASF GitHub Bot logged work on BEAM-4176:


Author: ASF GitHub Bot
Created on: 18/Sep/18 22:15
Start Date: 18/Sep/18 22:15
Worklog Time Spent: 10m 
  Work Description: lukecwik closed pull request #6328: [BEAM-4176] 
enumerate primitive transforms in portable construction
URL: https://github.com/apache/beam/pull/6328
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/NativeTransforms.java
 
b/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/NativeTransforms.java
new file mode 100644
index 000..730a5ca561f
--- /dev/null
+++ 
b/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/NativeTransforms.java
@@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.core.construction;
+
+import com.google.auto.service.AutoService;
+import java.util.Iterator;
+import java.util.ServiceLoader;
+import java.util.function.Predicate;
+import org.apache.beam.model.pipeline.v1.RunnerApi;
+
+/**
+ * An extension point for users to define their own native transforms for 
usage with specific
+ * runners. This extension point enables shared libraries within the Apache 
Beam codebase to treat
+ * the native transform as a primitive transform that the runner implicitly 
understands.
+ *
+ * Warning: Usage of native transforms within pipelines will prevent 
users from migrating
+ * between runners as there is no expectation that the transform will be 
understood by all runners.
+ * Note that for some use cases this can be a way to test out a new type of 
transform on a limited
+ * set of runners and promote its adoption as a primitive within the Apache 
Beam model.
+ *
+ * Note that users are required to ensure that translation and execution 
for the native transform
+ * is supported by their runner.
+ *
+ * Automatic registration occurs by creating a {@link ServiceLoader} entry 
and a concrete
+ * implementation of the {@link IsNativeTransform} interface. It is optional 
but recommended to use
+ * one of the many build time tools such as {@link AutoService} to generate 
the necessary META-INF
+ * files automatically.
+ */
+public class NativeTransforms {
+  /**
+   * Returns true if and only if the Runner understands this transform and can 
handle it directly.
+   */
+  public static boolean isNative(RunnerApi.PTransform pTransform) {
+Iterator<IsNativeTransform> matchers = 
ServiceLoader.load(IsNativeTransform.class).iterator();
+while (matchers.hasNext()) {
+  if (matchers.next().test(pTransform)) {
+return true;
+  }
+}
+return false;
+  }
+
+  /** A predicate which returns true if and only if the transform is a native 
transform. */
+  public interface IsNativeTransform extends Predicate<RunnerApi.PTransform> {
+@Override
+boolean test(RunnerApi.PTransform pTransform);
+  }
+}
diff --git 
a/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/PTransformTranslation.java
 
b/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/PTransformTranslation.java
index 7be45388695..0401b572dda 100644
--- 
a/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/PTransformTranslation.java
+++ 
b/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/PTransformTranslation.java
@@ -67,7 +67,14 @@
   getUrn(StandardPTransforms.Primitives.ASSIGN_WINDOWS);
   public static final String TEST_STREAM_TRANSFORM_URN =
   getUrn(StandardPTransforms.Primitives.TEST_STREAM);
+  public static final String MAP_WINDOWS_TRANSFORM_URN =
+  

[jira] [Work logged] (BEAM-4176) Java: Portable batch runner passes all ValidatesRunner tests that non-portable runner passes

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4176?focusedWorklogId=145505=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145505
 ]

ASF GitHub Bot logged work on BEAM-4176:


Author: ASF GitHub Bot
Created on: 18/Sep/18 22:15
Start Date: 18/Sep/18 22:15
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #6328: [BEAM-4176] 
enumerate primitive transforms in portable construction
URL: https://github.com/apache/beam/pull/6328#issuecomment-422575816
 
 
   Thanks Ryan.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145505)
Time Spent: 21h 50m  (was: 21h 40m)

> Java: Portable batch runner passes all ValidatesRunner tests that 
> non-portable runner passes
> 
>
> Key: BEAM-4176
> URL: https://issues.apache.org/jira/browse/BEAM-4176
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Ben Sidhom
>Priority: Major
> Attachments: Screen Shot 2018-08-14 at 4.18.31 PM.png, Screen Shot 
> 2018-09-03 at 11.07.38 AM.png
>
>  Time Spent: 21h 50m
>  Remaining Estimate: 0h
>
> We need this as a sanity check that runner execution is correct.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[beam] branch master updated (b292235 -> c49a97e)

2018-09-18 Thread lcwik
This is an automated email from the ASF dual-hosted git repository.

lcwik pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from b292235  Merge pull request #6365: [BEAM-5342] Upgrades Google API 
Client libraries to 1.24.1
 add d82a8a5  specify localhost for portable flink VR tests
 add e331952  enumerate primitive transforms in portable construction
 add d987918  Update deprecation comment.
 add 5dc6ed2  Swap from READ_TRANSFORM to PAR_DO_TRANSFORM
 add 2b94f2e  Update comment.
 add 52e21d4  Fix tests expectations and minor code fix up.
 add af58bd2  Merge pull request #1 from lukecwik/pr6328
 add 59570db  checkstyle nit
 add ef202f1  Merge remote-tracking branch 'upstream/HEAD' into vro
 add 727ae56  use artificial subtransforms of primitives
 add d3772a8  import URNs more directly in QueryablePipeline
 add 80a1102  add splittable URNs as primitives
 add f0a0e93  [BEAM-4176] Add the ability to allow for runners to register 
native transforms.
 add 408473d  Merge pull request #2 from lukecwik/pr6328
 add c75b5f4  add NativeTransform test assertions
 new c49a97e  [BEAM-4176] enumerate primitive transforms in portable 
construction

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../core/construction/NativeTransforms.java|  63 +
 .../core/construction/PTransformTranslation.java   |  13 +++
 .../core/construction/graph/QueryablePipeline.java |  46 --
 ...eFactoryTest.java => NativeTransformsTest.java} |  42 +
 .../graph/GreedyPipelineFuserTest.java |  10 +-
 .../construction/graph/GreedyStageFuserTest.java   |  10 ++
 .../construction/graph/OutputDeduplicatorTest.java | 102 -
 .../construction/graph/QueryablePipelineTest.java  |  21 -
 runners/flink/job-server/build.gradle  |   2 +-
 .../java/org/apache/beam/sdk/PipelineTest.java |   2 +-
 10 files changed, 273 insertions(+), 38 deletions(-)
 create mode 100644 
runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/NativeTransforms.java
 copy 
runners/core-construction-java/src/test/java/org/apache/beam/runners/core/construction/{UnsupportedOverrideFactoryTest.java
 => NativeTransformsTest.java} (50%)



[beam] 01/01: [BEAM-4176] enumerate primitive transforms in portable construction

2018-09-18 Thread lcwik
This is an automated email from the ASF dual-hosted git repository.

lcwik pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit c49a97ecbf815b320926285dcddba993590e3073
Merge: b292235 c75b5f4
Author: Lukasz Cwik 
AuthorDate: Tue Sep 18 15:15:47 2018 -0700

[BEAM-4176] enumerate primitive transforms in portable construction

 .../core/construction/NativeTransforms.java|  63 +
 .../core/construction/PTransformTranslation.java   |  13 +++
 .../core/construction/graph/QueryablePipeline.java |  46 --
 .../core/construction/NativeTransformsTest.java|  54 +++
 .../graph/GreedyPipelineFuserTest.java |  10 +-
 .../construction/graph/GreedyStageFuserTest.java   |  10 ++
 .../construction/graph/OutputDeduplicatorTest.java | 102 -
 .../construction/graph/QueryablePipelineTest.java  |  21 -
 runners/flink/job-server/build.gradle  |   2 +-
 .../java/org/apache/beam/sdk/PipelineTest.java |   2 +-
 10 files changed, 305 insertions(+), 18 deletions(-)




Jenkins build is back to normal : beam_PostCommit_Python_PVR_Flink_Gradle #59

2018-09-18 Thread Apache Jenkins Server
See 




[jira] [Work logged] (BEAM-4176) Java: Portable batch runner passes all ValidatesRunner tests that non-portable runner passes

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4176?focusedWorklogId=145504=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145504
 ]

ASF GitHub Bot logged work on BEAM-4176:


Author: ASF GitHub Bot
Created on: 18/Sep/18 22:09
Start Date: 18/Sep/18 22:09
Worklog Time Spent: 10m 
  Work Description: ryan-williams commented on issue #6328: [BEAM-4176] 
enumerate primitive transforms in portable construction
URL: https://github.com/apache/beam/pull/6328#issuecomment-422573991
 
 
   OK, I managed to get [a run to 
complete](https://storage.googleapis.com/runsascoded-tmp/beam-portable-flink-vr-tests/ecd1ac085a/index.html):
   
   
![](https://cl.ly/4ed744829264/Screen%20Shot%202018-09-18%20at%206.04.22%20PM.png)
   
   Notes on getting it to run:
   - I increased my local Docker image size to 256GB (from 64GB)
   - I also ran `docker container prune -f` multiple times during the run
   - I don't know if either of the above is necessary or sufficient
   - `docker system df` is also useful to determine how full the image is.
   
   Notes on the results:
   - 10 more cases are passing since I started this PR
 - possibly due to fixes I added for unittests over the course of this PR
 - or other OOB changes
   - runtime is much worse! 2h30m. seems like nothing to be done there atm, and 
is related to an increased number of containers being created (thanks for 
explaining @angoenka)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145504)
Time Spent: 21h 40m  (was: 21.5h)

> Java: Portable batch runner passes all ValidatesRunner tests that 
> non-portable runner passes
> 
>
> Key: BEAM-4176
> URL: https://issues.apache.org/jira/browse/BEAM-4176
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Ben Sidhom
>Priority: Major
> Attachments: Screen Shot 2018-08-14 at 4.18.31 PM.png, Screen Shot 
> 2018-09-03 at 11.07.38 AM.png
>
>  Time Spent: 21h 40m
>  Remaining Estimate: 0h
>
> We need this as a sanity check that runner execution is correct.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5426) Use both destination and TableDestination for BQ load job IDs

2018-09-18 Thread Reuven Lax (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16619792#comment-16619792
 ] 

Reuven Lax commented on BEAM-5426:
--

If different destinations return the same TableDestination, worse things can 
happen. In that case parallel loads to the same table might happen from 
different workers (since we distribute based on the destination), which can 
cause data corruption (e.g. if the disposition is set to WRITE_TRUNCATE).

> Use both destination and TableDestination for BQ load job IDs
> -
>
> Key: BEAM-5426
> URL: https://issues.apache.org/jira/browse/BEAM-5426
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Reporter: Chamikara Jayalath
>Priority: Major
>
> Currently we use TableDestination when creating a unique load job ID for a 
> destination: 
> [https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryHelpers.java#L359]
>  
> This can result in a data loss issue if a user returns the same 
> TableDestination for different destination IDs. I think we can prevent this 
> if we include both IDs in the BQ load job ID.
>  
> CC: [~reuvenlax]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PostCommit_Python_PVR_Flink_Gradle #58

2018-09-18 Thread Apache Jenkins Server
See 


Changes:

[lcwik] [BEAM-4711] fix globbing in LocalFileSystem.delete (#5863)

--
[...truncated 6.68 MB...]
[grpc-default-executor-1] INFO sdk_worker.__init__ - Creating insecure control 
channel.
[grpc-default-executor-1] INFO sdk_worker.__init__ - Control channel 
established.
[grpc-default-executor-1] INFO sdk_worker.__init__ - Initializing SDKHarness 
with 12 workers.
[grpc-default-executor-1] INFO 
org.apache.beam.runners.fnexecution.control.FnApiControlClientPoolService - 
Beam Fn Control client connected with id 1
[grpc-default-executor-0] INFO sdk_worker.run - Got work 1
[grpc-default-executor-1] INFO sdk_worker.run - Got work 5
[grpc-default-executor-1] INFO sdk_worker.run - Got work 4
[grpc-default-executor-1] INFO sdk_worker.run - Got work 6
[grpc-default-executor-1] INFO sdk_worker.run - Got work 3
[grpc-default-executor-0] INFO sdk_worker.run - Got work 2
[grpc-default-executor-1] INFO sdk_worker.run - Got work 8
[grpc-default-executor-1] INFO sdk_worker.create_state_handler - Creating 
channel for localhost:45473
[grpc-default-executor-1] INFO sdk_worker.run - Got work 7
[grpc-default-executor-1] INFO data_plane.create_data_channel - Creating 
channel for localhost:42463
[grpc-default-executor-1] INFO 
org.apache.beam.runners.fnexecution.data.GrpcDataService - Beam Fn Data client 
connected.
[grpc-default-executor-1] INFO bundle_processor.process_bundle - start 

[grpc-default-executor-1] INFO bundle_processor.process_bundle - start 

[grpc-default-executor-1] INFO bundle_processor.process_bundle - start 

[grpc-default-executor-1] INFO bundle_processor.process_bundle - start 

[grpc-default-executor-1] INFO bundle_processor.process_bundle - start 

[grpc-default-executor-1] INFO bundle_processor.process_bundle - start 

[grpc-default-executor-1] INFO bundle_processor.process_bundle - start 

[grpc-default-executor-1] INFO bundle_processor.process_bundle - start 

[grpc-default-executor-1] INFO bundle_processor.process_bundle - finish 

[grpc-default-executor-1] INFO bundle_processor.process_bundle - finish 

[grpc-default-executor-1] INFO bundle_processor.process_bundle - finish 

[grpc-default-executor-1] INFO bundle_processor.process_bundle - finish 

[grpc-default-executor-0] INFO bundle_processor.process_bundle - finish 

[grpc-default-executor-0] INFO bundle_processor.process_bundle - start 

[grpc-default-executor-1] INFO bundle_processor.process_bundle - start 

[grpc-default-executor-1] INFO bundle_processor.process_bundle - finish 

[grpc-default-executor-1] INFO bundle_processor.process_bundle - finish 

[grpc-default-executor-1] INFO bundle_processor.process_bundle - finish 

[grpc-default-executor-1] INFO bundle_processor.process_bundle - finish 

[grpc-default-executor-1] INFO bundle_processor.process_bundle - finish 

[Source: Collection Source -> 
19Create/Read/Impulse.None/jenkins-docker-apache.bintray.io/beam/python:latest:0
 -> ToKeyedWorkItem (1/1)] INFO org.apache.flink.runtime.taskmanager.Task - 
Source: Collection Source -> 
19Create/Read/Impulse.None/jenkins-docker-apache.bintray.io/beam/python:latest:0
 -> ToKeyedWorkItem (1/1) (a91689fb8a9f58a5df23640794c96722) switched from 
RUNNING to FINISHED.
[Source: Collection Source -> 
19Create/Read/Impulse.None/jenkins-docker-apache.bintray.io/beam/python:latest:0
 -> ToKeyedWorkItem (1/1)] INFO org.apache.flink.runtime.taskmanager.Task - 
Freeing task resources for Source: Collection Source -> 
19Create/Read/Impulse.None/jenkins-docker-apache.bintray.io/beam/python:latest:0
 -> ToKeyedWorkItem (1/1) (a91689fb8a9f58a5df23640794c96722).
[Source: Collection Source -> 
19Create/Read/Impulse.None/jenkins-docker-apache.bintray.io/beam/python:latest:0
 -> ToKeyedWorkItem (1/1)] INFO org.apache.flink.runtime.taskmanager.Task - 
Ensuring all FileSystem streams are closed for task Source: Collection Source 
-> 
19Create/Read/Impulse.None/jenkins-docker-apache.bintray.io/beam/python:latest:0
 -> ToKeyedWorkItem (1/1) (a91689fb8a9f58a5df23640794c96722) [FINISHED]
[flink-akka.actor.default-dispatcher-3] INFO 
org.apache.flink.runtime.taskexecutor.TaskExecutor - Un-registering task and 
sending final execution state FINISHED to JobManager for task Source: 
Collection Source -> 
19Create/Read/Impulse.None/jenkins-docker-apache.bintray.io/beam/python:latest:0
 -> ToKeyedWorkItem a91689fb8a9f58a5df23640794c96722.
[flink-akka.actor.default-dispatcher-3] INFO 
org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Collection 
Source -> 
19Create/Read/Impulse.None/jenkins-docker-apache.bintray.io/beam/python:latest:0
 -> ToKeyedWorkItem (1/1) (a91689fb8a9f58a5df23640794c96722) switched from 
RUNNING to FINISHED.
[grpc-default-executor-2] INFO sdk_worker.run - Got work 9
[Source: Collection Source -> 

[jira] [Work logged] (BEAM-5339) Implement new policy on Beam dependency tooling

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5339?focusedWorklogId=145500=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145500
 ]

ASF GitHub Bot logged work on BEAM-5339:


Author: ASF GitHub Bot
Created on: 18/Sep/18 21:59
Start Date: 18/Sep/18 21:59
Worklog Time Spent: 10m 
  Work Description: yifanzou opened a new pull request #554: [BEAM-5339] 
update the beam dependency guide
URL: https://github.com/apache/beam-site/pull/554
 
 
   *Please* add a meaningful description for your change here.
   
   Once your pull request has been opened and assigned a number, please edit the
   URL below, replacing `PULL_REQUEST_NUMBER` with the number of your pull 
request.
   
   
http://apache-beam-website-pull-requests.storage.googleapis.com/PULL_REQUEST_NUMBER/index.html
   
   Finally, it will help us expedite review of your Pull Request if you tag
   someone (e.g. @username) to look at it.
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145500)
Time Spent: 1.5h  (was: 1h 20m)

> Implement new policy on Beam dependency tooling
> ---
>
> Key: BEAM-5339
> URL: https://issues.apache.org/jira/browse/BEAM-5339
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: yifan zou
>Assignee: yifan zou
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> (1) Instead of a dependency "owners" list we will be maintaining an 
> "interested parties" list. When we create a JIRA for a dependency we will not 
> assign it to an owner but rather we will CC all the folks that mentioned that 
> they will be interested in receiving updates related to that dependency. The 
> hope is that some of the interested parties will also put forward the effort to 
> upgrade dependencies they are interested in, but the responsibility for 
> upgrading dependencies lies with the community as a whole.
>  (2) We will be creating JIRAs for upgrading individual dependencies, not for 
> upgrading to specific versions of those dependencies. For example, if a given 
> dependency X is three minor versions or a year behind, we will create a JIRA 
> for upgrading it. But the specific version to upgrade to has to be 
> determined by the Beam community. The Beam community might choose to close a JIRA 
> if there are known issues with available recent releases. The tool may reopen 
> such a closed JIRA in the future if new information becomes available (for 
> example, 3 new versions have been released since the JIRA was closed).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5342) Migrate google-api-client libraries to 1.24.1

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5342?focusedWorklogId=145498=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145498
 ]

ASF GitHub Bot logged work on BEAM-5342:


Author: ASF GitHub Bot
Created on: 18/Sep/18 21:58
Start Date: 18/Sep/18 21:58
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #6365: [BEAM-5342] 
Upgrades Google API Client libraries to 1.24.1
URL: https://github.com/apache/beam/pull/6365#issuecomment-422570993
 
 
   Post commit failure is unrelated: 
https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild_PR/127/testReport/
   
   Merging.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145498)
Time Spent: 1h 10m  (was: 1h)

> Migrate google-api-client libraries to 1.24.1
> -
>
> Key: BEAM-5342
> URL: https://issues.apache.org/jira/browse/BEAM-5342
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp, runner-dataflow
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> We currently use the 1.23 libraries, which are about a year old. We should migrate 
> to the more recent 1.24.1, which fixes several known issues.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PostCommit_Website_Merge #3

2018-09-18 Thread Apache Jenkins Server
See 


Changes:

[chamikara] Upgrades Google API Client libraries to 1.24.1

--
Started by GitHub push by chamikaramj
[EnvInject] - Loading node environment variables.
Building remotely on websites1 (git-websites svn-websites) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision b292235fa9cce7f0b096415f827d964c9eadcb4e (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f b292235fa9cce7f0b096415f827d964c9eadcb4e
Commit message: "Merge pull request #6365: [BEAM-5342] Upgrades Google API 
Client libraries to 1.24.1"
 > git rev-list --no-walk 25645cd5795a9b6df67c8f2e3354a79394935ae8 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[Gradle] - Launching build.
[src] $ 
 
--info --continue --max-workers=12 -Dorg.gradle.jvmargs=-Xms2g 
-Dorg.gradle.jvmargs=-Xmx4g :beam-website:mergeWebsite
Initialized native services in: /home/jenkins/.gradle/native
To honour the JVM settings for this build a new JVM will be forked. Please 
consider using the daemon: 
https://docs.gradle.org/4.8/userguide/gradle_daemon.html.
Starting process 'Gradle build daemon'. Working directory: 
/home/jenkins/.gradle/daemon/4.8 Command: 
/usr/local/asfpackages/java/jdk1.8.0_172/bin/java -Xmx4g -Dfile.encoding=UTF-8 
-Duser.country=US -Duser.language=en -Duser.variant -cp 
/home/jenkins/.gradle/wrapper/dists/gradle-4.8-bin/divx0s2uj4thofgytb7gf9fsi/gradle-4.8/lib/gradle-launcher-4.8.jar
 org.gradle.launcher.daemon.bootstrap.GradleDaemon 4.8
Successfully started process 'Gradle build daemon'
An attempt to start the daemon took 1.188 secs.
The client will now receive all logging from the daemon (pid: 30893). The 
daemon log file: /home/jenkins/.gradle/daemon/4.8/daemon-30893.out.log
Daemon will be stopped at the end of the build stopping after processing
Closing daemon's stdin at end of input.
The daemon will no longer process any standard input.
Using 12 worker leases.
Starting Build
Parallel execution is an incubating feature.

> Configure project :buildSrc
Evaluating project ':buildSrc' using build file 
'
file or directory 
'
 not found
Selected primary task 'build' from project :
file or directory 
'
 not found
Using local directory build cache for build ':buildSrc' (location = 
/home/jenkins/.gradle/caches/build-cache-1, removeUnusedEntriesAfter = 7 days).
:compileJava (Thread[Task worker for ':buildSrc',5,main]) started.

> Task :buildSrc:compileJava NO-SOURCE
file or directory 
'
 not found
Skipping task ':buildSrc:compileJava' as it has no source files and no previous 
output files.
:compileJava (Thread[Task worker for ':buildSrc',5,main]) completed. Took 0.084 
secs.
:compileGroovy (Thread[Task worker for ':buildSrc' Thread 10,5,main]) started.

> Task :buildSrc:compileGroovy
Build cache key for task ':buildSrc:compileGroovy' is 
4aa1b6c9c202c4fc990c9cd958f0dce7
Task ':buildSrc:compileGroovy' is not up-to-date because:
  No history is available.
Starting process 'Gradle Worker Daemon 1'. Working directory: 
/home/jenkins/.gradle/workers Command: 
/usr/local/asfpackages/java/jdk1.8.0_172/bin/java 
-Djava.security.manager=worker.org.gradle.process.internal.worker.child.BootstrapSecurityManager
 -Dfile.encoding=UTF-8 -Duser.country=US -Duser.language=en -Duser.variant -cp 
/home/jenkins/.gradle/caches/4.8/workerMain/gradle-worker.jar 
worker.org.gradle.process.internal.worker.GradleWorkerMain 'Gradle Worker 
Daemon 1'
Successfully started process 'Gradle Worker Daemon 1'
Started Gradle worker daemon (0.566 secs) with fork options 

[jira] [Work logged] (BEAM-5342) Migrate google-api-client libraries to 1.24.1

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5342?focusedWorklogId=145499=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145499
 ]

ASF GitHub Bot logged work on BEAM-5342:


Author: ASF GitHub Bot
Created on: 18/Sep/18 21:58
Start Date: 18/Sep/18 21:58
Worklog Time Spent: 10m 
  Work Description: chamikaramj closed pull request #6365: [BEAM-5342] 
Upgrades Google API Client libraries to 1.24.1
URL: https://github.com/apache/beam/pull/6365
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy 
b/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy
index d17f16ca5ed..5b4154d957d 100644
--- a/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy
+++ b/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy
@@ -302,7 +302,7 @@ class BeamModulePlugin implements Plugin<Project> {
 def generated_grpc_ga_version = "1.18.0"
 def google_cloud_bigdataoss_version = "1.9.0"
 def bigtable_version = "1.4.0"
-def google_clients_version = "1.23.0"
+def google_clients_version = "1.24.1"
 def google_auth_version = "0.10.0"
 def grpc_version = "1.13.1"
 def protobuf_version = "3.6.0"
@@ -357,12 +357,12 @@ class BeamModulePlugin implements Plugin {
 google_api_client_jackson2  : 
"com.google.api-client:google-api-client-jackson2:$google_clients_version",
 google_api_client_java6 : 
"com.google.api-client:google-api-client-java6:$google_clients_version",
 google_api_common   : 
"com.google.api:api-common:1.6.0",
-google_api_services_bigquery: 
"com.google.apis:google-api-services-bigquery:v2-rev374-$google_clients_version",
-google_api_services_clouddebugger   : 
"com.google.apis:google-api-services-clouddebugger:v2-rev233-$google_clients_version",
-google_api_services_cloudresourcemanager: 
"com.google.apis:google-api-services-cloudresourcemanager:v1-rev477-$google_clients_version",
-google_api_services_dataflow: 
"com.google.apis:google-api-services-dataflow:v1b3-rev221-$google_clients_version",
-google_api_services_pubsub  : 
"com.google.apis:google-api-services-pubsub:v1-rev382-$google_clients_version",
-google_api_services_storage : 
"com.google.apis:google-api-services-storage:v1-rev124-$google_clients_version",
+google_api_services_bigquery: 
"com.google.apis:google-api-services-bigquery:v2-rev402-$google_clients_version",
+google_api_services_clouddebugger   : 
"com.google.apis:google-api-services-clouddebugger:v2-rev253-$google_clients_version",
+google_api_services_cloudresourcemanager: 
"com.google.apis:google-api-services-cloudresourcemanager:v1-rev502-$google_clients_version",
+google_api_services_dataflow: 
"com.google.apis:google-api-services-dataflow:v1b3-rev257-$google_clients_version",
+google_api_services_pubsub  : 
"com.google.apis:google-api-services-pubsub:v1-rev399-$google_clients_version",
+google_api_services_storage : 
"com.google.apis:google-api-services-storage:v1-rev136-$google_clients_version",
 google_auth_library_credentials : 
"com.google.auth:google-auth-library-credentials:$google_auth_version",
 google_auth_library_oauth2_http : 
"com.google.auth:google-auth-library-oauth2-http:$google_auth_version",
 google_cloud_core   : 
"com.google.cloud:google-cloud-core:$google_cloud_core_version",


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145499)
Time Spent: 1h 20m  (was: 1h 10m)

> Migrate google-api-client libraries to 1.24.1
> -
>
> Key: BEAM-5342
> URL: https://issues.apache.org/jira/browse/BEAM-5342
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp, runner-dataflow
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> 

[beam] branch master updated (25645cd -> b292235)

2018-09-18 Thread chamikara
This is an automated email from the ASF dual-hosted git repository.

chamikara pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 25645cd  Merge pull request #6420: [BEAM-5364] Check if validation is 
disabled when validating BigtableSource
 add 2d76e24  Upgrades Google API Client libraries to 1.24.1
 add b292235  Merge pull request #6365: [BEAM-5342] Upgrades Google API 
Client libraries to 1.24.1

No new revisions were added by this update.

Summary of changes:
 .../groovy/org/apache/beam/gradle/BeamModulePlugin.groovy  | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)



[jira] [Work logged] (BEAM-5364) BigtableIO source tries to validate table ID even though validation is turned off

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5364?focusedWorklogId=145496=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145496
 ]

ASF GitHub Bot logged work on BEAM-5364:


Author: ASF GitHub Bot
Created on: 18/Sep/18 21:57
Start Date: 18/Sep/18 21:57
Worklog Time Spent: 10m 
  Work Description: chamikaramj closed pull request #6420: [BEAM-5364] 
Check if validation is disabled when validating BigtableSource
URL: https://github.com/apache/beam/pull/6420
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java
 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java
index ae8fe7d04d9..edad185323c 100644
--- 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java
+++ 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java
@@ -1080,6 +1080,11 @@ private long 
getEstimatedSizeBytesBasedOnSamples(List sam
 
 @Override
 public void validate() {
+  if (!config.getValidate()) {
+LOG.debug("Validation is disabled");
+return;
+  }
+
   ValueProvider<String> tableId = config.getTableId();
   checkArgument(
   tableId != null && tableId.isAccessible() && 
!tableId.get().isEmpty(),
diff --git 
a/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIOTest.java
 
b/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIOTest.java
index 47727e5b8a1..cadb908be5a 100644
--- 
a/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIOTest.java
+++ 
b/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIOTest.java
@@ -87,11 +87,13 @@
 import org.apache.beam.sdk.io.gcp.bigtable.BigtableIO.BigtableSource;
 import org.apache.beam.sdk.io.range.ByteKey;
 import org.apache.beam.sdk.io.range.ByteKeyRange;
+import org.apache.beam.sdk.options.Description;
 import org.apache.beam.sdk.options.PipelineOptionsFactory;
 import org.apache.beam.sdk.options.ValueProvider;
 import org.apache.beam.sdk.testing.ExpectedLogs;
 import org.apache.beam.sdk.testing.PAssert;
 import org.apache.beam.sdk.testing.TestPipeline;
+import org.apache.beam.sdk.testing.TestPipeline.PipelineRunMissingException;
 import org.apache.beam.sdk.transforms.Create;
 import org.apache.beam.sdk.transforms.SerializableFunction;
 import org.apache.beam.sdk.transforms.display.DisplayData;
@@ -115,6 +117,27 @@
   @Rule public ExpectedException thrown = ExpectedException.none();
   @Rule public ExpectedLogs logged = ExpectedLogs.none(BigtableIO.class);
 
+  /** Read Options for testing. */
+  public interface ReadOptions extends GcpOptions {
+@Description("The project that contains the table to export.")
+ValueProvider<String> getBigtableProject();
+
+@SuppressWarnings("unused")
+void setBigtableProject(ValueProvider<String> projectId);
+
+@Description("The Bigtable instance id that contains the table to export.")
+ValueProvider<String> getBigtableInstanceId();
+
+@SuppressWarnings("unused")
+void setBigtableInstanceId(ValueProvider<String> instanceId);
+
+@Description("The Bigtable table id to export.")
+ValueProvider<String> getBigtableTableId();
+
+@SuppressWarnings("unused")
+void setBigtableTableId(ValueProvider<String> tableId);
+  }
+
   static final ValueProvider<String> NOT_ACCESSIBLE_VALUE =
   new ValueProvider<String>() {
 @Override
@@ -223,6 +246,39 @@ public void testReadValidationFailsMissingInstanceIdAndProjectId() {
     read.expand(null);
   }
 
+  @Test
+  public void testReadWithRuntimeParametersValidationFailed() {
+    ReadOptions options =
+        PipelineOptionsFactory.fromArgs().withValidation().as(ReadOptions.class);
+
+    BigtableIO.Read read =
+        BigtableIO.read()
+            .withProjectId(options.getBigtableProject())
+            .withInstanceId(options.getBigtableInstanceId())
+            .withTableId(options.getBigtableTableId());
+
+    thrown.expect(IllegalArgumentException.class);
+    thrown.expectMessage("tableId was not supplied");
+
+    p.apply(read);
+  }
+
+  @Test
+  public void testReadWithRuntimeParametersValidationDisabled() {
+    ReadOptions options =
+        PipelineOptionsFactory.fromArgs().withValidation().as(ReadOptions.class);
+
+    BigtableIO.Read read =
+        BigtableIO.read()
+            .withoutValidation()
+            .withProjectId(options.getBigtableProject())
+            .withInstanceId(options.getBigtableInstanceId())
+

[beam] branch master updated (073f77d -> 25645cd)

2018-09-18 Thread chamikara
This is an automated email from the ASF dual-hosted git repository.

chamikara pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 073f77d  [BEAM-4711] fix globbing in LocalFileSystem.delete (#5863)
 add a908ad5  Check if validation is disabled when validating BigtableSource
 new 25645cd  Merge pull request #6420: [BEAM-5364] Check if validation is 
disabled when validating BigtableSource

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../beam/sdk/io/gcp/bigtable/BigtableIO.java   |  5 ++
 .../beam/sdk/io/gcp/bigtable/BigtableIOTest.java   | 56 ++
 2 files changed, 61 insertions(+)



Build failed in Jenkins: beam_PostCommit_Website_Merge #2

2018-09-18 Thread Apache Jenkins Server
See 


Changes:

[kevinsi] Check if validation is disabled when validating BigtableSource

--
Started by GitHub push by chamikaramj
[EnvInject] - Loading node environment variables.
Building remotely on websites1 (git-websites svn-websites) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 25645cd5795a9b6df67c8f2e3354a79394935ae8 (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 25645cd5795a9b6df67c8f2e3354a79394935ae8
Commit message: "Merge pull request #6420: [BEAM-5364] Check if validation is 
disabled when validating BigtableSource"
 > git rev-list --no-walk 073f77df021c452bea03ee21eb61fc5331b61c14 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[Gradle] - Launching build.
[src] $ 
 
--info --continue --max-workers=12 -Dorg.gradle.jvmargs=-Xms2g 
-Dorg.gradle.jvmargs=-Xmx4g :beam-website:mergeWebsite
Initialized native services in: /home/jenkins/.gradle/native
To honour the JVM settings for this build a new JVM will be forked. Please 
consider using the daemon: 
https://docs.gradle.org/4.8/userguide/gradle_daemon.html.
Starting process 'Gradle build daemon'. Working directory: 
/home/jenkins/.gradle/daemon/4.8 Command: 
/usr/local/asfpackages/java/jdk1.8.0_172/bin/java -Xmx4g -Dfile.encoding=UTF-8 
-Duser.country=US -Duser.language=en -Duser.variant -cp 
/home/jenkins/.gradle/wrapper/dists/gradle-4.8-bin/divx0s2uj4thofgytb7gf9fsi/gradle-4.8/lib/gradle-launcher-4.8.jar
 org.gradle.launcher.daemon.bootstrap.GradleDaemon 4.8
Successfully started process 'Gradle build daemon'
An attempt to start the daemon took 0.951 secs.
The client will now receive all logging from the daemon (pid: 20809). The 
daemon log file: /home/jenkins/.gradle/daemon/4.8/daemon-20809.out.log
Closing daemon's stdin at end of input.
The daemon will no longer process any standard input.
Daemon will be stopped at the end of the build stopping after processing
Using 12 worker leases.
Starting Build
Parallel execution is an incubating feature.

> Configure project :buildSrc
Evaluating project ':buildSrc' using build file 
'
file or directory 
'
 not found
Selected primary task 'build' from project :
file or directory 
'
 not found
Using local directory build cache for build ':buildSrc' (location = 
/home/jenkins/.gradle/caches/build-cache-1, removeUnusedEntriesAfter = 7 days).
:compileJava (Thread[Task worker for ':buildSrc',5,main]) started.

> Task :buildSrc:compileJava NO-SOURCE
file or directory 
'
 not found
Skipping task ':buildSrc:compileJava' as it has no source files and no previous 
output files.
:compileJava (Thread[Task worker for ':buildSrc',5,main]) completed. Took 0.038 
secs.
:compileGroovy (Thread[Task worker for ':buildSrc',5,main]) started.

> Task :buildSrc:compileGroovy
Build cache key for task ':buildSrc:compileGroovy' is 
8f8b31e5d9e84d48db72a071a889209f
Task ':buildSrc:compileGroovy' is not up-to-date because:
  No history is available.
Origin for task ':buildSrc:compileGroovy': {executionTime=5906, 
hostName=jenkins-websites1.apache.org, operatingSystem=Linux, 
buildInvocationId=fyduppipababfgg4v3zmwlzwrm, creationTime=1537305634099, 
type=org.gradle.api.tasks.compile.GroovyCompile_Decorated, userName=jenkins, 
gradleVersion=4.8, 
rootPath=
 path=:compileGroovy}
Unpacked output for task ':buildSrc:compileGroovy' from cache.

> Task :buildSrc:compileGroovy FROM-CACHE
:compileGroovy (Thread[Task worker for ':buildSrc',5,main]) completed. 

[jira] [Work logged] (BEAM-5364) BigtableIO source tries to validate table ID even though validation is turned off

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5364?focusedWorklogId=145495=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145495
 ]

ASF GitHub Bot logged work on BEAM-5364:


Author: ASF GitHub Bot
Created on: 18/Sep/18 21:56
Start Date: 18/Sep/18 21:56
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #6420: [BEAM-5364] Check 
if validation is disabled when validating BigtableSource
URL: https://github.com/apache/beam/pull/6420#issuecomment-422570609
 
 
   Post-commit failures are unrelated. BigTable tests passed: 
https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild_PR/126/testReport/org.apache.beam.sdk.io.gcp.bigtable/
   
   Merging.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145495)
Time Spent: 4h 50m  (was: 4h 40m)

>  BigtableIO source tries to validate table ID even though validation is 
> turned off
> --
>
> Key: BEAM-5364
> URL: https://issues.apache.org/jira/browse/BEAM-5364
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Kevin Si
>Assignee: Chamikara Jayalath
>Priority: Minor
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> [https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java#L1084|https://www.google.com/url?q=https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java%23L1084=D=AFQjCNEfHprTOvnwAwFSrXwUuLvc__JBWg]
> The validation can be turned off with the following:
> BigtableIO.read()
>             .withoutValidation() // skip validation when constructing the
> pipeline.
> A Dataflow template cannot be constructed due to this validation failure.
>  
> Error log when trying to construct a template:
> Exception in thread "main" java.lang.IllegalArgumentException: tableId was 
> not supplied
>         at 
> com.google.common.base.Preconditions.checkArgument(Preconditions.java:122)
>         at 
> org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$BigtableSource.validate(BigtableIO.java:1084)
>         at org.apache.beam.sdk.io.Read$Bounded.expand(Read.java:95)
>         at org.apache.beam.sdk.io.Read$Bounded.expand(Read.java:85)
>         at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:537)
>         at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:471)
>         at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:44)
>         at org.apache.beam.sdk.Pipeline.apply(Pipeline.java:167)
>         at 
> org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$Read.expand(BigtableIO.java:423)
>         at 
> org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$Read.expand(BigtableIO.java:179)
>         at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:537)
>         at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:488)
>         at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:56)
>         at org.apache.beam.sdk.Pipeline.apply(Pipeline.java:182)
>         at 
> com.google.cloud.teleport.bigtable.BigtableToAvro.main(BigtableToAvro.java:89)
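
For context, a minimal sketch of the template-style construction described above. The BigtableTemplateOptions interface and its option names are hypothetical, not from this PR; the point is that, with the early return added in this change, withoutValidation() lets a Dataflow template be constructed even though the ValueProvider values are not yet accessible:

{code}
// Illustrative sketch only; the options interface and names are hypothetical.
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.bigtable.BigtableIO;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.options.ValueProvider;

public class BigtableTemplateSketch {

  /** Runtime-provided IDs; values are unknown when the template is constructed. */
  public interface BigtableTemplateOptions extends PipelineOptions {
    ValueProvider<String> getBigtableProject();
    void setBigtableProject(ValueProvider<String> value);

    ValueProvider<String> getBigtableInstanceId();
    void setBigtableInstanceId(ValueProvider<String> value);

    ValueProvider<String> getBigtableTableId();
    void setBigtableTableId(ValueProvider<String> value);
  }

  public static void main(String[] args) {
    BigtableTemplateOptions options =
        PipelineOptionsFactory.fromArgs(args).withValidation().as(BigtableTemplateOptions.class);
    Pipeline p = Pipeline.create(options);

    p.apply(
        BigtableIO.read()
            // withoutValidation() is meant to skip the construction-time check;
            // before this fix, BigtableSource.validate() ignored the flag and
            // failed with "tableId was not supplied" while building the template.
            .withoutValidation()
            .withProjectId(options.getBigtableProject())
            .withInstanceId(options.getBigtableInstanceId())
            .withTableId(options.getBigtableTableId()));

    p.run();
  }
}
{code}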



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[beam] 01/01: Merge pull request #6420: [BEAM-5364] Check if validation is disabled when validating BigtableSource

2018-09-18 Thread chamikara
This is an automated email from the ASF dual-hosted git repository.

chamikara pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 25645cd5795a9b6df67c8f2e3354a79394935ae8
Merge: 073f77d a908ad5
Author: Chamikara Jayalath 
AuthorDate: Tue Sep 18 14:57:03 2018 -0700

Merge pull request #6420: [BEAM-5364] Check if validation is disabled when 
validating BigtableSource

 .../beam/sdk/io/gcp/bigtable/BigtableIO.java   |  5 ++
 .../beam/sdk/io/gcp/bigtable/BigtableIOTest.java   | 56 ++
 2 files changed, 61 insertions(+)



[jira] [Assigned] (BEAM-5426) Use both destination and TableDestination for BQ load job IDs

2018-09-18 Thread Chamikara Jayalath (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath reassigned BEAM-5426:


Assignee: (was: Chamikara Jayalath)

> Use both destination and TableDestination for BQ load job IDs
> -
>
> Key: BEAM-5426
> URL: https://issues.apache.org/jira/browse/BEAM-5426
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Reporter: Chamikara Jayalath
>Priority: Major
>
> Currently we use TableDestination when creating a unique load job ID for a 
> destination: 
> [https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryHelpers.java#L359]
>  
> This can result in a data loss issue if a user returns the same 
> TableDestination for different destination IDs. I think we can prevent this 
> if we include both IDs in the BQ load job ID.
>  
> CC: [~reuvenlax]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4711) LocalFileSystem.delete doesn't support globbing

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4711?focusedWorklogId=145494=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145494
 ]

ASF GitHub Bot logged work on BEAM-4711:


Author: ASF GitHub Bot
Created on: 18/Sep/18 21:52
Start Date: 18/Sep/18 21:52
Worklog Time Spent: 10m 
  Work Description: lukecwik closed pull request #5863: [BEAM-4711] fix 
globbing in LocalFileSystem.delete
URL: https://github.com/apache/beam/pull/5863
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/sdks/python/apache_beam/io/localfilesystem.py 
b/sdks/python/apache_beam/io/localfilesystem.py
index 73d9a4de6cd..ddd5022b2bb 100644
--- a/sdks/python/apache_beam/io/localfilesystem.py
+++ b/sdks/python/apache_beam/io/localfilesystem.py
@@ -321,11 +321,22 @@ def _delete_path(path):
         raise IOError(err)
 
     exceptions = {}
-    for path in paths:
+
+    def try_delete(path):
       try:
         _delete_path(path)
       except Exception as e:  # pylint: disable=broad-except
         exceptions[path] = e
 
+    for match_result in self.match(paths):
+      metadata_list = match_result.metadata_list
+
+      if not metadata_list:
+        exceptions[match_result.pattern] = \
+          IOError('No files found to delete under: %s' % match_result.pattern)
+
+      for metadata in match_result.metadata_list:
+        try_delete(metadata.path)
+
     if exceptions:
       raise BeamIOError("Delete operation failed", exceptions)
diff --git a/sdks/python/apache_beam/io/localfilesystem_test.py 
b/sdks/python/apache_beam/io/localfilesystem_test.py
index 5d032db59a1..bc45e4221dd 100644
--- a/sdks/python/apache_beam/io/localfilesystem_test.py
+++ b/sdks/python/apache_beam/io/localfilesystem_test.py
@@ -291,6 +291,195 @@ def test_checksum(self):
 self.assertEquals(self.fs.checksum(path1), str(5))
 self.assertEquals(self.fs.checksum(path2), str(3))
 
+  def make_tree(self, path, value, expected_leaf_count=None):
+"""Create a file+directory structure from a simple dict-based DSL
+
+:param path: root path to create directories+files under
+:param value: a specification of what ``path`` should contain: ``None`` to
+ make it an empty directory, a string literal to make it a file with those
+  contents, and a ``dict`` to make it a non-empty directory and recurse
+:param expected_leaf_count: only be set at the top of a recursive call
+ stack; after the whole tree has been created, verify the presence and
+ number of all files+directories, as a sanity check
+"""
+if value is None:
+  # empty directory
+  os.makedirs(path)
+elif isinstance(value, str):
+  # file with string-literal contents
+  dir = os.path.dirname(path)
+  if not os.path.exists(dir):
+os.makedirs(dir)
+  with open(path, 'a') as f:
+f.write(value)
+elif isinstance(value, dict):
+  # recurse to create a subdirectory tree
+  for basename, v in value.items():
+self.make_tree(
+os.path.join(path, basename),
+v
+)
+else:
+  raise Exception(
+  'Unexpected value in tempdir tree: %s' % value
+  )
+
+if expected_leaf_count != None:
+  self.assertEqual(
+  self.check_tree(path, value),
+  expected_leaf_count
+  )
+
+  def check_tree(self, path, value, expected_leaf_count=None):
+"""Verify a directory+file structure according to the rules described in
+``make_tree``
+
+:param path: path to check under
+:param value: DSL-representation of expected files+directories under
+``path``
+:return: number of leaf files/directories that were verified
+"""
+actual_leaf_count = None
+if value is None:
+  # empty directory
+  self.assertTrue(os.path.exists(path), msg=path)
+  self.assertEqual(os.listdir(path), [])
+  actual_leaf_count = 1
+elif isinstance(value, str):
+  # file with string-literal contents
+  with open(path, 'r') as f:
+self.assertEqual(f.read(), value, msg=path)
+
+  actual_leaf_count = 1
+elif isinstance(value, dict):
+  # recurse to check subdirectory tree
+  actual_leaf_count = sum(
+  [
+  self.check_tree(
+  os.path.join(path, basename),
+  v
+  )
+  for basename, v in value.items()
+  ]
+  )
+else:
+  raise Exception(
+  'Unexpected value in tempdir tree: %s' % value
+  )
+
+if expected_leaf_count != None:
+  self.assertEqual(actual_leaf_count, expected_leaf_count)
+
+return actual_leaf_count
+
+  _test_tree = {
+  'path1': 

[beam] branch master updated: [BEAM-4711] fix globbing in LocalFileSystem.delete (#5863)

2018-09-18 Thread lcwik
This is an automated email from the ASF dual-hosted git repository.

lcwik pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/master by this push:
 new 073f77d  [BEAM-4711] fix globbing in LocalFileSystem.delete (#5863)
073f77d is described below

commit 073f77df021c452bea03ee21eb61fc5331b61c14
Author: Ryan Williams 
AuthorDate: Tue Sep 18 17:52:39 2018 -0400

[BEAM-4711] fix globbing in LocalFileSystem.delete (#5863)
---
 sdks/python/apache_beam/io/localfilesystem.py  |  13 +-
 sdks/python/apache_beam/io/localfilesystem_test.py | 189 +
 2 files changed, 201 insertions(+), 1 deletion(-)

diff --git a/sdks/python/apache_beam/io/localfilesystem.py 
b/sdks/python/apache_beam/io/localfilesystem.py
index 73d9a4d..ddd5022 100644
--- a/sdks/python/apache_beam/io/localfilesystem.py
+++ b/sdks/python/apache_beam/io/localfilesystem.py
@@ -321,11 +321,22 @@ class LocalFileSystem(FileSystem):
         raise IOError(err)
 
     exceptions = {}
-    for path in paths:
+
+    def try_delete(path):
       try:
         _delete_path(path)
       except Exception as e:  # pylint: disable=broad-except
         exceptions[path] = e
 
+    for match_result in self.match(paths):
+      metadata_list = match_result.metadata_list
+
+      if not metadata_list:
+        exceptions[match_result.pattern] = \
+          IOError('No files found to delete under: %s' % match_result.pattern)
+
+      for metadata in match_result.metadata_list:
+        try_delete(metadata.path)
+
     if exceptions:
       raise BeamIOError("Delete operation failed", exceptions)
diff --git a/sdks/python/apache_beam/io/localfilesystem_test.py 
b/sdks/python/apache_beam/io/localfilesystem_test.py
index 5d032db..bc45e42 100644
--- a/sdks/python/apache_beam/io/localfilesystem_test.py
+++ b/sdks/python/apache_beam/io/localfilesystem_test.py
@@ -291,6 +291,195 @@ class LocalFileSystemTest(unittest.TestCase):
 self.assertEquals(self.fs.checksum(path1), str(5))
 self.assertEquals(self.fs.checksum(path2), str(3))
 
+  def make_tree(self, path, value, expected_leaf_count=None):
+"""Create a file+directory structure from a simple dict-based DSL
+
+:param path: root path to create directories+files under
+:param value: a specification of what ``path`` should contain: ``None`` to
+ make it an empty directory, a string literal to make it a file with those
+  contents, and a ``dict`` to make it a non-empty directory and recurse
+:param expected_leaf_count: only be set at the top of a recursive call
+ stack; after the whole tree has been created, verify the presence and
+ number of all files+directories, as a sanity check
+"""
+if value is None:
+  # empty directory
+  os.makedirs(path)
+elif isinstance(value, str):
+  # file with string-literal contents
+  dir = os.path.dirname(path)
+  if not os.path.exists(dir):
+os.makedirs(dir)
+  with open(path, 'a') as f:
+f.write(value)
+elif isinstance(value, dict):
+  # recurse to create a subdirectory tree
+  for basename, v in value.items():
+self.make_tree(
+os.path.join(path, basename),
+v
+)
+else:
+  raise Exception(
+  'Unexpected value in tempdir tree: %s' % value
+  )
+
+if expected_leaf_count != None:
+  self.assertEqual(
+  self.check_tree(path, value),
+  expected_leaf_count
+  )
+
+  def check_tree(self, path, value, expected_leaf_count=None):
+"""Verify a directory+file structure according to the rules described in
+``make_tree``
+
+:param path: path to check under
+:param value: DSL-representation of expected files+directories under
+``path``
+:return: number of leaf files/directories that were verified
+"""
+actual_leaf_count = None
+if value is None:
+  # empty directory
+  self.assertTrue(os.path.exists(path), msg=path)
+  self.assertEqual(os.listdir(path), [])
+  actual_leaf_count = 1
+elif isinstance(value, str):
+  # file with string-literal contents
+  with open(path, 'r') as f:
+self.assertEqual(f.read(), value, msg=path)
+
+  actual_leaf_count = 1
+elif isinstance(value, dict):
+  # recurse to check subdirectory tree
+  actual_leaf_count = sum(
+  [
+  self.check_tree(
+  os.path.join(path, basename),
+  v
+  )
+  for basename, v in value.items()
+  ]
+  )
+else:
+  raise Exception(
+  'Unexpected value in tempdir tree: %s' % value
+  )
+
+if expected_leaf_count != None:
+  self.assertEqual(actual_leaf_count, expected_leaf_count)
+
+return actual_leaf_count
+
+  _test_tree = {
+  'path1': '111',
+  'path2': {
+  '2': '222',
+  'emptydir': None
+  },
+  

[jira] [Resolved] (BEAM-4711) LocalFileSystem.delete doesn't support globbing

2018-09-18 Thread Luke Cwik (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-4711.
-
   Resolution: Fixed
Fix Version/s: 2.8.0

> LocalFileSystem.delete doesn't support globbing
> ---
>
> Key: BEAM-4711
> URL: https://issues.apache.org/jira/browse/BEAM-4711
> Project: Beam
>  Issue Type: Task
>  Components: sdk-py-core
>Affects Versions: 2.5.0
>Reporter: Ryan Williams
>Assignee: Ryan Williams
>Priority: Minor
> Fix For: 2.8.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> I attempted to run {{wordcount_it_test:WordCountIT.test_wordcount_it}} 
> locally with {{DirectRunner}}:
> {code}
> python setup.py nosetests \
>   --tests apache_beam.examples.wordcount_it_test:WordCountIT.test_wordcount_it \
>   --test-pipeline-options="--output=foo"
> {code}
> It failed in [the {{delete_files}} cleanup 
> command|https://github.com/apache/beam/blob/a58f1ffaafb0e2ebcc73a1c5abfb05a15ec6a84b/sdks/python/apache_beam/examples/wordcount_it_test.py#L64]:
> {code}
> root: WARNING: Retry with exponential backoff: waiting for 11.1454450937 
> seconds before retrying delete_files because we caught exception: 
> BeamIOError: Delete operation failed with exceptions 
> {'foo/1530557644/results*': IOError(OSError(2, 'No such file or directory'),)}
>  Traceback for above exception (most recent call last):
>   File "/Users/ryan/c/beam/sdks/python/apache_beam/utils/retry.py", line 184, 
> in wrapper
> return fun(*args, **kwargs)
>   File "/Users/ryan/c/beam/sdks/python/apache_beam/testing/test_utils.py", 
> line 136, in delete_files
> FileSystems.delete(file_paths)
>   File "/Users/ryan/c/beam/sdks/python/apache_beam/io/filesystems.py", line 
> 282, in delete
> return filesystem.delete(paths)
>   File "/Users/ryan/c/beam/sdks/python/apache_beam/io/localfilesystem.py", 
> line 304, in delete
> raise BeamIOError("Delete operation failed", exceptions)
> {code}
> The line:
> {code}
> self.addCleanup(delete_files, [output + '*'])
> {code}
> works as expected in GCS, and deletes a test's output-directory, but it fails 
> in on the local-filesystem, which doesn't expand globs before attempting to 
> delete paths.
> It would be good to make these consistent, presumably by adding glob-support 
> to {{LocalFileSystem}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PostCommit_Website_Merge #1

2018-09-18 Thread Apache Jenkins Server
See 


--
Started by GitHub push by lukecwik
[EnvInject] - Loading node environment variables.
Building remotely on websites1 (git-websites svn-websites) in workspace 

Cloning the remote Git repository
Cloning repository https://github.com/apache/beam.git
 > git init 
 >  # 
 > timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/*
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # 
 > timeout=10
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 073f77df021c452bea03ee21eb61fc5331b61c14 (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 073f77df021c452bea03ee21eb61fc5331b61c14
Commit message: "[BEAM-4711] fix globbing in LocalFileSystem.delete (#5863)"
First time build. Skipping changelog.
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[Gradle] - Launching build.
[src] $ 
 
--info --continue --max-workers=12 -Dorg.gradle.jvmargs=-Xms2g 
-Dorg.gradle.jvmargs=-Xmx4g :beam-website:mergeWebsite
Initialized native services in: /home/jenkins/.gradle/native
To honour the JVM settings for this build a new JVM will be forked. Please 
consider using the daemon: 
https://docs.gradle.org/4.8/userguide/gradle_daemon.html.
Starting process 'Gradle build daemon'. Working directory: 
/home/jenkins/.gradle/daemon/4.8 Command: 
/usr/local/asfpackages/java/jdk1.8.0_172/bin/java -Xmx4g -Dfile.encoding=UTF-8 
-Duser.country=US -Duser.language=en -Duser.variant -cp 
/home/jenkins/.gradle/wrapper/dists/gradle-4.8-bin/divx0s2uj4thofgytb7gf9fsi/gradle-4.8/lib/gradle-launcher-4.8.jar
 org.gradle.launcher.daemon.bootstrap.GradleDaemon 4.8
Successfully started process 'Gradle build daemon'
An attempt to start the daemon took 0.911 secs.
The client will now receive all logging from the daemon (pid: 5618). The daemon 
log file: /home/jenkins/.gradle/daemon/4.8/daemon-5618.out.log
Closing daemon's stdin at end of input.
The daemon will no longer process any standard input.
Daemon will be stopped at the end of the build stopping after processing
Using 12 worker leases.
Starting Build
Parallel execution is an incubating feature.

> Configure project :buildSrc
Evaluating project ':buildSrc' using build file 
'
file or directory 
'
 not found
Selected primary task 'build' from project :
file or directory 
'
 not found
Using local directory build cache for build ':buildSrc' (location = 
/home/jenkins/.gradle/caches/build-cache-1, removeUnusedEntriesAfter = 7 days).
:compileJava (Thread[Task worker for ':buildSrc' Thread 2,5,main]) started.

> Task :buildSrc:compileJava NO-SOURCE
file or directory 
'
 not found
Skipping task ':buildSrc:compileJava' as it has no source files and no previous 
output files.
:compileJava (Thread[Task worker for ':buildSrc' Thread 2,5,main]) completed. 
Took 0.029 secs.
:compileGroovy (Thread[Task worker for ':buildSrc' Thread 2,5,main]) started.

> Task :buildSrc:compileGroovy FROM-CACHE
Build cache key for task ':buildSrc:compileGroovy' is 
8f8b31e5d9e84d48db72a071a889209f
Task ':buildSrc:compileGroovy' is not up-to-date because:
  No history is available.
Origin for task ':buildSrc:compileGroovy': {executionTime=5906, 
hostName=jenkins-websites1.apache.org, operatingSystem=Linux, 
buildInvocationId=fyduppipababfgg4v3zmwlzwrm, creationTime=1537305634099, 
type=org.gradle.api.tasks.compile.GroovyCompile_Decorated, userName=jenkins, 
gradleVersion=4.8, 

[jira] [Created] (BEAM-5426) Use both destination and TableDestination for BQ load job IDs

2018-09-18 Thread Chamikara Jayalath (JIRA)
Chamikara Jayalath created BEAM-5426:


 Summary: Use both destination and TableDestination for BQ load job 
IDs
 Key: BEAM-5426
 URL: https://issues.apache.org/jira/browse/BEAM-5426
 Project: Beam
  Issue Type: Improvement
  Components: io-java-gcp
Reporter: Chamikara Jayalath
Assignee: Chamikara Jayalath


Currently we use TableDestination when creating a unique load job ID for a 
destination: 
[https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryHelpers.java#L359]

 

This can result in a data loss issue if a user returns the same 
TableDestination for different destination IDs. I think we can prevent this if 
we include both IDs in the BQ load job ID.

 

CC: [~reuvenlax]
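
A small, hypothetical sketch of the concern (the jobIdFromTableOnly/jobIdFromBoth helpers below are illustrative, not Beam's BigQueryHelpers code): when two logical destinations resolve to the same TableDestination, an ID derived from the table alone collides, while mixing in the destination key keeps the load job IDs distinct.

{code}
// Hypothetical sketch only; these helpers are not a Beam API.
import java.util.Objects;

public class LoadJobIdSketch {

  // Before: id derived from the resolved table alone.
  static String jobIdFromTableOnly(String prefix, String tableSpec) {
    return prefix + "_" + Math.abs(Objects.hash(tableSpec));
  }

  // After: id derived from both the logical destination and the resolved table.
  static String jobIdFromBoth(String prefix, String destinationKey, String tableSpec) {
    return prefix + "_" + Math.abs(Objects.hash(destinationKey, tableSpec));
  }

  public static void main(String[] args) {
    // Two distinct logical destinations that a user's DynamicDestinations maps
    // to the same physical table.
    String table = "project:dataset.events";

    // Collision: both loads get the same job id, so one load can be mistaken
    // for a retry/duplicate of the other and its data silently dropped.
    System.out.println(jobIdFromTableOnly("beam_load", table));
    System.out.println(jobIdFromTableOnly("beam_load", table));

    // Distinct ids once the destination key participates in the hash.
    System.out.println(jobIdFromBoth("beam_load", "destA", table));
    System.out.println(jobIdFromBoth("beam_load", "destB", table));
  }
}
{code}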



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5249) org.apache.beam.runners.flink.streaming.UnboundedSourceWrapperTest ParameterizedUnboundedSourceWrapperTest hangs

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5249?focusedWorklogId=145493=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145493
 ]

ASF GitHub Bot logged work on BEAM-5249:


Author: ASF GitHub Bot
Created on: 18/Sep/18 21:49
Start Date: 18/Sep/18 21:49
Worklog Time Spent: 10m 
  Work Description: alanmyrvold closed pull request #6288: [BEAM-5249] Fix 
timeouts in beam_Release_Gradle_NightlySnapshot by extending time from 100min 
to 150min
URL: https://github.com/apache/beam/pull/6288
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/.test-infra/jenkins/job_Release_Gradle_NightlySnapshot.groovy 
b/.test-infra/jenkins/job_Release_Gradle_NightlySnapshot.groovy
index df1e29e16ca..48968121312 100644
--- a/.test-infra/jenkins/job_Release_Gradle_NightlySnapshot.groovy
+++ b/.test-infra/jenkins/job_Release_Gradle_NightlySnapshot.groovy
@@ -27,7 +27,7 @@ job('beam_Release_Gradle_NightlySnapshot') {
   concurrentBuild()
 
   // Set common parameters.
-  commonJobProperties.setTopLevelMainJobProperties(delegate)
+  commonJobProperties.setTopLevelMainJobProperties(delegate, 'master', 150)
 
   // This is a post-commit job that runs once per day, not for every push.
   commonJobProperties.setAutoJob(


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145493)
Time Spent: 1h  (was: 50m)

> org.apache.beam.runners.flink.streaming.UnboundedSourceWrapperTest 
> ParameterizedUnboundedSourceWrapperTest hangs
> 
>
> Key: BEAM-5249
> URL: https://issues.apache.org/jira/browse/BEAM-5249
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Alan Myrvold
>Assignee: Alan Myrvold
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> beam_Release_Gradle_NightlySnapshot sometimes times out at 100 minutes
> [https://builds.apache.org/job/beam_Release_Gradle_NightlySnapshot/155/]
> [https://builds.apache.org/job/beam_Release_Gradle_NightlySnapshot/152/]
> https://builds.apache.org/job/beam_Release_Gradle_NightlySnapshot/142/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5339) Implement new policy on Beam dependency tooling

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5339?focusedWorklogId=145488=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145488
 ]

ASF GitHub Bot logged work on BEAM-5339:


Author: ASF GitHub Bot
Created on: 18/Sep/18 21:36
Start Date: 18/Sep/18 21:36
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on a change in pull request #6430: 
[BEAM-5339]/apply new policy on Beam dependency tooling
URL: https://github.com/apache/beam/pull/6430#discussion_r218603374
 
 

 ##
 File path: .test-infra/jenkins/jira_utils/jira_manager_test.py
 ##
 @@ -46,76 +54,6 @@ def setUp(self):
 print("\n\nTest : " + self._testMethodName)
 
 
 
 Review comment:
   Remove tests of find_owners. They are no longer needed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145488)
Time Spent: 1h 10m  (was: 1h)

> Implement new policy on Beam dependency tooling
> ---
>
> Key: BEAM-5339
> URL: https://issues.apache.org/jira/browse/BEAM-5339
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: yifan zou
>Assignee: yifan zou
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> (1) Instead of a dependency "owners" list we will be maintaining an 
> "interested parties" list. When we create a JIRA for a dependency we will not 
> assign it to an owner but rather we will CC all the folks that mentioned that 
> they will be interested in receiving updates related to that dependency. The
> hope is that some of the interested parties will also put forward the effort to
> upgrade dependencies they are interested in, but the responsibility of
> upgrading dependencies lies with the community as a whole.
>  (2) We will be creating JIRAs for upgrading individual dependencies, not for
> upgrading to specific versions of those dependencies. For example, if a given
> dependency X is three minor versions or a year behind, we will create a JIRA
> for upgrading it. But the specific version to upgrade to has to be
> determined by the Beam community. The Beam community might choose to close a JIRA
> if there are known issues with available recent releases. The tool may reopen
> such a closed JIRA in the future if new information becomes available (for
> example, 3 new versions have been released since the JIRA was closed).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5339) Implement new policy on Beam dependency tooling

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5339?focusedWorklogId=145490=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145490
 ]

ASF GitHub Bot logged work on BEAM-5339:


Author: ASF GitHub Bot
Created on: 18/Sep/18 21:36
Start Date: 18/Sep/18 21:36
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on a change in pull request #6430: 
[BEAM-5339]/apply new policy on Beam dependency tooling
URL: https://github.com/apache/beam/pull/6430#discussion_r218603504
 
 

 ##
 File path: .test-infra/jenkins/jira_utils/jira_manager_test.py
 ##
 @@ -219,11 +156,63 @@ def test_run_with_reopening_existing_parent_issue(self, 
*args):
 manager.jira.reopen_issue.assert_called_once()
 
 
-  def _get_experct_summary(self, dep_name, dep_latest_version):
-summary =  'Beam Dependency Update Request: ' + dep_name
-if dep_latest_version:
-  summary = summary + " " + dep_latest_version
-return summary
+  @patch('jira_utils.jira_manager.datetime', 
Mock(today=Mock(return_value=datetime.strptime('2000-01-01', '%Y-%m-%d'
+  @patch('jira_utils.jira_manager.JiraClient.create_issue', side_effect = 
[MockedJiraIssue('BEAM-2000', 'summary', 'description', 'Open')])
+  @patch.object(jira_utils.jira_manager.JiraManager,
+'_get_next_release_version', side_effect=['2.8.0.dev'])
+  def test_run_with_reopening_issue_with_fixversions(self, *args):
 
 Review comment:
   Add tests for the JIRA reopening rules.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145490)

> Implement new policy on Beam dependency tooling
> ---
>
> Key: BEAM-5339
> URL: https://issues.apache.org/jira/browse/BEAM-5339
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: yifan zou
>Assignee: yifan zou
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> (1) Instead of a dependency "owners" list we will be maintaining an 
> "interested parties" list. When we create a JIRA for a dependency we will not 
> assign it to an owner but rather we will CC all the folks that mentioned that 
> they will be interested in receiving updates related to that dependency. The
> hope is that some of the interested parties will also put forward the effort to
> upgrade dependencies they are interested in, but the responsibility of
> upgrading dependencies lies with the community as a whole.
>  (2) We will be creating JIRAs for upgrading individual dependencies, not for
> upgrading to specific versions of those dependencies. For example, if a given
> dependency X is three minor versions or a year behind, we will create a JIRA
> for upgrading it. But the specific version to upgrade to has to be
> determined by the Beam community. The Beam community might choose to close a JIRA
> if there are known issues with available recent releases. The tool may reopen
> such a closed JIRA in the future if new information becomes available (for
> example, 3 new versions have been released since the JIRA was closed).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5339) Implement new policy on Beam dependency tooling

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5339?focusedWorklogId=145487=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145487
 ]

ASF GitHub Bot logged work on BEAM-5339:


Author: ASF GitHub Bot
Created on: 18/Sep/18 21:36
Start Date: 18/Sep/18 21:36
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on a change in pull request #6430: 
[BEAM-5339]/apply new policy on Beam dependency tooling
URL: https://github.com/apache/beam/pull/6430#discussion_r218603005
 
 

 ##
 File path: 
.test-infra/jenkins/dependency_check/dependency_check_report_generator.py
 ##
 @@ -153,39 +154,6 @@ def prioritize_dependencies(deps, sdk_type):
   return high_priority_deps
 
 
-def compare_dependency_versions(curr_ver, latest_ver):
 
 Review comment:
   Remove this method from the report generator and put it into an independent
file, version_comparer.py, that can be used by both the report generator and the
JIRA manager.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145487)
Time Spent: 1h  (was: 50m)

> Implement new policy on Beam dependency tooling
> ---
>
> Key: BEAM-5339
> URL: https://issues.apache.org/jira/browse/BEAM-5339
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: yifan zou
>Assignee: yifan zou
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> (1) Instead of a dependency "owners" list we will be maintaining an 
> "interested parties" list. When we create a JIRA for a dependency we will not 
> assign it to an owner but rather we will CC all the folks that mentioned that 
> they will be interested in receiving updates related to that dependency. The
> hope is that some of the interested parties will also put forward the effort to
> upgrade dependencies they are interested in, but the responsibility of
> upgrading dependencies lies with the community as a whole.
>  (2) We will be creating JIRAs for upgrading individual dependencies, not for
> upgrading to specific versions of those dependencies. For example, if a given
> dependency X is three minor versions or a year behind, we will create a JIRA
> for upgrading it. But the specific version to upgrade to has to be
> determined by the Beam community. The Beam community might choose to close a JIRA
> if there are known issues with available recent releases. The tool may reopen
> such a closed JIRA in the future if new information becomes available (for
> example, 3 new versions have been released since the JIRA was closed).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5339) Implement new policy on Beam dependency tooling

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5339?focusedWorklogId=145489=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145489
 ]

ASF GitHub Bot logged work on BEAM-5339:


Author: ASF GitHub Bot
Created on: 18/Sep/18 21:36
Start Date: 18/Sep/18 21:36
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on a change in pull request #6430: 
[BEAM-5339]/apply new policy on Beam dependency tooling
URL: https://github.com/apache/beam/pull/6430#discussion_r218602615
 
 

 ##
 File path: 
.test-infra/jenkins/dependency_check/dependency_check_report_generator.py
 ##
 @@ -16,17 +16,18 @@
 # limitations under the License.
 #
 
 
 Review comment:
   Adjust the order of imported packages so that they obey alphabetical order.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145489)
Time Spent: 1h 20m  (was: 1h 10m)

> Implement new policy on Beam dependency tooling
> ---
>
> Key: BEAM-5339
> URL: https://issues.apache.org/jira/browse/BEAM-5339
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: yifan zou
>Assignee: yifan zou
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> (1) Instead of a dependency "owners" list we will be maintaining an 
> "interested parties" list. When we create a JIRA for a dependency we will not 
> assign it to an owner but rather we will CC all the folks that mentioned that 
> they will be interested in receiving updates related to that dependency. The
> hope is that some of the interested parties will also put forward the effort to
> upgrade dependencies they are interested in, but the responsibility of
> upgrading dependencies lies with the community as a whole.
>  (2) We will be creating JIRAs for upgrading individual dependencies, not for
> upgrading to specific versions of those dependencies. For example, if a given
> dependency X is three minor versions or a year behind, we will create a JIRA
> for upgrading it. But the specific version to upgrade to has to be
> determined by the Beam community. The Beam community might choose to close a JIRA
> if there are known issues with available recent releases. The tool may reopen
> such a closed JIRA in the future if new information becomes available (for
> example, 3 new versions have been released since the JIRA was closed).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4496) Create Jenkins job to push generated HTML to asf-site branch

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4496?focusedWorklogId=145486=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145486
 ]

ASF GitHub Bot logged work on BEAM-4496:


Author: ASF GitHub Bot
Created on: 18/Sep/18 21:34
Start Date: 18/Sep/18 21:34
Worklog Time Spent: 10m 
  Work Description: alanmyrvold commented on issue #6431: [BEAM-4496] 
Website merge
URL: https://github.com/apache/beam/pull/6431#issuecomment-422563896
 
 
   Commented out the git push, but before I uncomment it and try it, can 
@swegner PTAL?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145486)
Time Spent: 1h 20m  (was: 1h 10m)

> Create Jenkins job to push generated HTML to asf-site branch
> 
>
> Key: BEAM-4496
> URL: https://issues.apache.org/jira/browse/BEAM-4496
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, website
>Reporter: Scott Wegner
>Assignee: Scott Wegner
>Priority: Major
>  Labels: beam-site-automation-reliability
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4496) Create Jenkins job to push generated HTML to asf-site branch

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4496?focusedWorklogId=145483=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145483
 ]

ASF GitHub Bot logged work on BEAM-4496:


Author: ASF GitHub Bot
Created on: 18/Sep/18 21:30
Start Date: 18/Sep/18 21:30
Worklog Time Spent: 10m 
  Work Description: alanmyrvold removed a comment on issue #6431: 
[BEAM-4496] Website merge
URL: https://github.com/apache/beam/pull/6431#issuecomment-422559339
 
 
   Run Website Merge


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145483)
Time Spent: 1h  (was: 50m)

> Create Jenkins job to push generated HTML to asf-site branch
> 
>
> Key: BEAM-4496
> URL: https://issues.apache.org/jira/browse/BEAM-4496
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, website
>Reporter: Scott Wegner
>Assignee: Scott Wegner
>Priority: Major
>  Labels: beam-site-automation-reliability
>  Time Spent: 1h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4496) Create Jenkins job to push generated HTML to asf-site branch

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4496?focusedWorklogId=145484=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145484
 ]

ASF GitHub Bot logged work on BEAM-4496:


Author: ASF GitHub Bot
Created on: 18/Sep/18 21:30
Start Date: 18/Sep/18 21:30
Worklog Time Spent: 10m 
  Work Description: alanmyrvold removed a comment on issue #6431: 
[BEAM-4496] Website merge
URL: https://github.com/apache/beam/pull/6431#issuecomment-422555877
 
 
   Run Seed Job


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145484)
Time Spent: 1h 10m  (was: 1h)

> Create Jenkins job to push generated HTML to asf-site branch
> 
>
> Key: BEAM-4496
> URL: https://issues.apache.org/jira/browse/BEAM-4496
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, website
>Reporter: Scott Wegner
>Assignee: Scott Wegner
>Priority: Major
>  Labels: beam-site-automation-reliability
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Jenkins build is back to normal : beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle #1072

2018-09-18 Thread Apache Jenkins Server
See 




[jira] [Work logged] (BEAM-4496) Create Jenkins job to push generated HTML to asf-site branch

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4496?focusedWorklogId=145482=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145482
 ]

ASF GitHub Bot logged work on BEAM-4496:


Author: ASF GitHub Bot
Created on: 18/Sep/18 21:24
Start Date: 18/Sep/18 21:24
Worklog Time Spent: 10m 
  Work Description: alanmyrvold commented on issue #6431: [BEAM-4496] 
Website merge
URL: https://github.com/apache/beam/pull/6431#issuecomment-422561008
 
 
   Run Website Merge


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145482)
Time Spent: 50m  (was: 40m)

> Create Jenkins job to push generated HTML to asf-site branch
> 
>
> Key: BEAM-4496
> URL: https://issues.apache.org/jira/browse/BEAM-4496
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, website
>Reporter: Scott Wegner
>Assignee: Scott Wegner
>Priority: Major
>  Labels: beam-site-automation-reliability
>  Time Spent: 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4496) Create Jenkins job to push generated HTML to asf-site branch

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4496?focusedWorklogId=145481=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145481
 ]

ASF GitHub Bot logged work on BEAM-4496:


Author: ASF GitHub Bot
Created on: 18/Sep/18 21:19
Start Date: 18/Sep/18 21:19
Worklog Time Spent: 10m 
  Work Description: alanmyrvold commented on issue #6431: [BEAM-4496] 
Website merge
URL: https://github.com/apache/beam/pull/6431#issuecomment-422559339
 
 
   Run Website Merge


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145481)
Time Spent: 40m  (was: 0.5h)

> Create Jenkins job to push generated HTML to asf-site branch
> 
>
> Key: BEAM-4496
> URL: https://issues.apache.org/jira/browse/BEAM-4496
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, website
>Reporter: Scott Wegner
>Assignee: Scott Wegner
>Priority: Major
>  Labels: beam-site-automation-reliability
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4496) Create Jenkins job to push generated HTML to asf-site branch

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4496?focusedWorklogId=145480=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145480
 ]

ASF GitHub Bot logged work on BEAM-4496:


Author: ASF GitHub Bot
Created on: 18/Sep/18 21:15
Start Date: 18/Sep/18 21:15
Worklog Time Spent: 10m 
  Work Description: alanmyrvold commented on issue #6431: [BEAM-4496] 
Website merge
URL: https://github.com/apache/beam/pull/6431#issuecomment-422557902
 
 
   Run Seed Job


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145480)
Time Spent: 0.5h  (was: 20m)

> Create Jenkins job to push generated HTML to asf-site branch
> 
>
> Key: BEAM-4496
> URL: https://issues.apache.org/jira/browse/BEAM-4496
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, website
>Reporter: Scott Wegner
>Assignee: Scott Wegner
>Priority: Major
>  Labels: beam-site-automation-reliability
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4496) Create Jenkins job to push generated HTML to asf-site branch

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4496?focusedWorklogId=145478=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145478
 ]

ASF GitHub Bot logged work on BEAM-4496:


Author: ASF GitHub Bot
Created on: 18/Sep/18 21:09
Start Date: 18/Sep/18 21:09
Worklog Time Spent: 10m 
  Work Description: alanmyrvold commented on issue #6431: [BEAM-4496] 
Website merge
URL: https://github.com/apache/beam/pull/6431#issuecomment-422555877
 
 
   Run Seed Job


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145478)
Time Spent: 20m  (was: 10m)

> Create Jenkins job to push generated HTML to asf-site branch
> 
>
> Key: BEAM-4496
> URL: https://issues.apache.org/jira/browse/BEAM-4496
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, website
>Reporter: Scott Wegner
>Assignee: Scott Wegner
>Priority: Major
>  Labels: beam-site-automation-reliability
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4496) Create Jenkins job to push generated HTML to asf-site branch

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4496?focusedWorklogId=145477=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145477
 ]

ASF GitHub Bot logged work on BEAM-4496:


Author: ASF GitHub Bot
Created on: 18/Sep/18 21:08
Start Date: 18/Sep/18 21:08
Worklog Time Spent: 10m 
  Work Description: alanmyrvold opened a new pull request #6431: 
[BEAM-4496] Website merge
URL: https://github.com/apache/beam/pull/6431
 
 
   New Jenkins job to run on https://builds.apache.org/label/git-websites/ and 
build and push the beam website from master into the asf-site branch of the 
beam repository.
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_GradleBuild/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_GradleBuild/lastCompletedBuild/)
 | --- | --- | --- | --- | --- | ---
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_Gradle/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)
 | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)
  [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/)
 | --- | --- | --- | ---
   
   
   
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145477)
Time Spent: 10m
Remaining Estimate: 0h

> Create Jenkins job to push generated HTML to asf-site branch
> 
>
> Key: BEAM-4496
> URL: https://issues.apache.org/jira/browse/BEAM-4496
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, website
>Reporter: Scott Wegner
>Assignee: Scott Wegner
>Priority: Major
>  Labels: beam-site-automation-reliability
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5364) BigtableIO source tries to validate table ID even though validation is turned off

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5364?focusedWorklogId=145475=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145475
 ]

ASF GitHub Bot logged work on BEAM-5364:


Author: ASF GitHub Bot
Created on: 18/Sep/18 21:06
Start Date: 18/Sep/18 21:06
Worklog Time Spent: 10m 
  Work Description: kevinsi4508 commented on issue #6420: [BEAM-5364] Check 
if validation is disabled when validating BigtableSource
URL: https://github.com/apache/beam/pull/6420#issuecomment-422555064
 
 
   PostCommit seems flaky, any way to retry?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145475)
Time Spent: 4h 40m  (was: 4.5h)

>  BigtableIO source tries to validate table ID even though validation is 
> turned off
> --
>
> Key: BEAM-5364
> URL: https://issues.apache.org/jira/browse/BEAM-5364
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Kevin Si
>Assignee: Chamikara Jayalath
>Priority: Minor
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> [https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java#L1084|https://www.google.com/url?q=https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java%23L1084=D=AFQjCNEfHprTOvnwAwFSrXwUuLvc__JBWg]
> The validation can be turned off with the following:
> BigtableIO.read()
>             .withoutValidation() // skip validation when constructing the 
> pipeline.
> A Dataflow template cannot be constructed due to this validation failure.
>  
> Error log when trying to construct a template:
> Exception in thread "main" java.lang.IllegalArgumentException: tableId was 
> not supplied
>         at 
> com.google.common.base.Preconditions.checkArgument(Preconditions.java:122)
>         at 
> org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$BigtableSource.validate(BigtableIO.java:1084)
>         at org.apache.beam.sdk.io.Read$Bounded.expand(Read.java:95)
>         at org.apache.beam.sdk.io.Read$Bounded.expand(Read.java:85)
>         at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:537)
>         at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:471)
>         at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:44)
>         at org.apache.beam.sdk.Pipeline.apply(Pipeline.java:167)
>         at 
> org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$Read.expand(BigtableIO.java:423)
>         at 
> org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$Read.expand(BigtableIO.java:179)
>         at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:537)
>         at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:488)
>         at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:56)
>         at org.apache.beam.sdk.Pipeline.apply(Pipeline.java:182)
>         at 
> com.google.cloud.teleport.bigtable.BigtableToAvro.main(BigtableToAvro.java:89)
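
For illustration, a minimal sketch of the template-time read the reporter describes (the values and the options accessor are hypothetical, not taken from the issue); the point is that the table ID is only resolved at runtime, so construction-time validation has to be skipped:

{code:java}
// Sketch only; assumes the standard BigtableIO.read() builder. The
// "getBigtableTableId()" option is hypothetical and stands in for a
// ValueProvider-backed template parameter.
pipeline.apply("ReadFromBigtable",
    BigtableIO.read()
        .withProjectId("my-project")                  // illustrative
        .withInstanceId("my-instance")                // illustrative
        .withTableId(options.getBigtableTableId())    // resolved only at template run time
        .withoutValidation());                        // avoid the construction-time check above
{code}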



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4176) Java: Portable batch runner passes all ValidatesRunner tests that non-portable runner passes

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4176?focusedWorklogId=145473=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145473
 ]

ASF GitHub Bot logged work on BEAM-4176:


Author: ASF GitHub Bot
Created on: 18/Sep/18 20:59
Start Date: 18/Sep/18 20:59
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #6328: [BEAM-4176] 
enumerate primitive transforms in portable construction
URL: https://github.com/apache/beam/pull/6328#issuecomment-422552764
 
 
   It seems like we should be able to reuse a single docker container image for 
all the test runs instead of having each test create one.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145473)
Time Spent: 21.5h  (was: 21h 20m)

> Java: Portable batch runner passes all ValidatesRunner tests that 
> non-portable runner passes
> 
>
> Key: BEAM-4176
> URL: https://issues.apache.org/jira/browse/BEAM-4176
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Ben Sidhom
>Priority: Major
> Attachments: Screen Shot 2018-08-14 at 4.18.31 PM.png, Screen Shot 
> 2018-09-03 at 11.07.38 AM.png
>
>  Time Spent: 21.5h
>  Remaining Estimate: 0h
>
> We need this as a sanity check that runner execution is correct.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4711) LocalFileSystem.delete doesn't support globbing

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4711?focusedWorklogId=145468=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145468
 ]

ASF GitHub Bot logged work on BEAM-4711:


Author: ASF GitHub Bot
Created on: 18/Sep/18 20:51
Start Date: 18/Sep/18 20:51
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on a change in pull request #5863: 
[BEAM-4711] fix globbing in LocalFileSystem.delete
URL: https://github.com/apache/beam/pull/5863#discussion_r218590309
 
 

 ##
 File path: sdks/python/apache_beam/io/localfilesystem.py
 ##
 @@ -321,11 +321,22 @@ def _delete_path(path):
 raise IOError(err)
 
 exceptions = {}
-for path in paths:
+
+def try_delete(path):
   try:
 _delete_path(path)
   except Exception as e:  # pylint: disable=broad-except
 exceptions[path] = e
 
+for match_result in self.match(paths):
+  metadata_list = match_result.metadata_list
+
+  if not metadata_list:
 
 Review comment:
   Turns out that we test that delete throws an error for non-existent files, and 
that we have an inconsistent definition between multiple implementations (e.g. the 
Java LocalFileSystem). Filed https://issues.apache.org/jira/browse/BEAM-5425 to ask 
whether the contract should be changed to allow prior partial successes to 
continue and for errors not to be thrown for non-existent files.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145468)
Time Spent: 1h 40m  (was: 1.5h)

> LocalFileSystem.delete doesn't support globbing
> ---
>
> Key: BEAM-4711
> URL: https://issues.apache.org/jira/browse/BEAM-4711
> Project: Beam
>  Issue Type: Task
>  Components: sdk-py-core
>Affects Versions: 2.5.0
>Reporter: Ryan Williams
>Assignee: Ryan Williams
>Priority: Minor
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> I attempted to run {{wordcount_it_test:WordCountIT.test_wordcount_it}} 
> locally with {{DirectRunner}}:
> {code}
> python setup.py nosetests \
>   --tests 
> apache_beam.examples.wordcount_it_test:WordCountIT.test_wordcount_it \
>   --test-pipeline-options="--output=foo"
> {code}
> It failed in [the {{delete_files}} cleanup 
> command|https://github.com/apache/beam/blob/a58f1ffaafb0e2ebcc73a1c5abfb05a15ec6a84b/sdks/python/apache_beam/examples/wordcount_it_test.py#L64]:
> {code}
> root: WARNING: Retry with exponential backoff: waiting for 11.1454450937 
> seconds before retrying delete_files because we caught exception: 
> BeamIOError: Delete operation failed with exceptions 
> {'foo/1530557644/results*': IOError(OSError(2, 'No such file or directory'),)}
>  Traceback for above exception (most recent call last):
>   File "/Users/ryan/c/beam/sdks/python/apache_beam/utils/retry.py", line 184, 
> in wrapper
> return fun(*args, **kwargs)
>   File "/Users/ryan/c/beam/sdks/python/apache_beam/testing/test_utils.py", 
> line 136, in delete_files
> FileSystems.delete(file_paths)
>   File "/Users/ryan/c/beam/sdks/python/apache_beam/io/filesystems.py", line 
> 282, in delete
> return filesystem.delete(paths)
>   File "/Users/ryan/c/beam/sdks/python/apache_beam/io/localfilesystem.py", 
> line 304, in delete
> raise BeamIOError("Delete operation failed", exceptions)
> {code}
> The line:
> {code}
> self.addCleanup(delete_files, [output + '*'])
> {code}
> works as expected in GCS, and deletes a test's output directory, but it fails 
> on the local filesystem, which doesn't expand globs before attempting to 
> delete paths.
> It would be good to make these consistent, presumably by adding glob support 
> to {{LocalFileSystem}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-5425) FileSystems contract rename/delete to allow prior partial successes to make continued progress

2018-09-18 Thread Luke Cwik (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik reassigned BEAM-5425:
---

Assignee: Ahmet Altay  (was: Kenneth Knowles)

> FileSystems contract rename/delete to allow prior partial successes to make 
> continued progress
> --
>
> Key: BEAM-5425
> URL: https://issues.apache.org/jira/browse/BEAM-5425
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core, sdk-py-core
>Reporter: Luke Cwik
>Assignee: Ahmet Altay
>Priority: Major
>
> The filesystems contract for delete/rename says that an error should be raised 
> if the resources don't exist.
>  
> I believe the contract should be updated not to fail if the 
> resources don't exist, since we want these operations to be retried on failure 
> without the caller needing to know that a prior call may have been partially 
> successful.
>  
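
For comparison, the Java SDK already lets callers opt into this behaviour per call; a minimal sketch, assuming the current org.apache.beam.sdk.io.FileSystems API (the glob is illustrative):

{code:java}
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import org.apache.beam.sdk.io.FileSystems;
import org.apache.beam.sdk.io.fs.MatchResult;
import org.apache.beam.sdk.io.fs.MoveOptions.StandardMoveOptions;
import org.apache.beam.sdk.io.fs.ResourceId;

class DeleteExample {
  // Deletes everything matching the glob; IGNORE_MISSING_FILES keeps a retried
  // delete from failing on files a previous, partially successful attempt
  // already removed.
  static void deleteMatching(String glob) throws IOException {
    List<ResourceId> toDelete = new ArrayList<>();
    for (MatchResult match : FileSystems.match(Collections.singletonList(glob))) {
      for (MatchResult.Metadata metadata : match.metadata()) {
        toDelete.add(metadata.resourceId());
      }
    }
    FileSystems.delete(toDelete, StandardMoveOptions.IGNORE_MISSING_FILES);
  }
}
{code}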



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-5425) FileSystems contract rename/delete to allow prior partial successes to make continued progress

2018-09-18 Thread Luke Cwik (JIRA)
Luke Cwik created BEAM-5425:
---

 Summary: FileSystems contract rename/delete to allow prior partial 
successes to make continued progress
 Key: BEAM-5425
 URL: https://issues.apache.org/jira/browse/BEAM-5425
 Project: Beam
  Issue Type: Improvement
  Components: sdk-java-core, sdk-py-core
Reporter: Luke Cwik
Assignee: Kenneth Knowles


The filesystems contract for delete/rename says that an error should be raised 
if the resources don't exist.

 

I believe the contract should be updated not to fail if the resources 
don't exist, since we want these operations to be retried on failure without the 
caller needing to know that a prior call may have been partially successful.

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-4843) Incorrect docs on FileSystems.delete

2018-09-18 Thread Luke Cwik (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-4843.
-
   Resolution: Fixed
Fix Version/s: 2.7.0

> Incorrect docs on FileSystems.delete
> 
>
> Key: BEAM-4843
> URL: https://issues.apache.org/jira/browse/BEAM-4843
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.6.0
>Reporter: Ryan Williams
>Assignee: Ryan Williams
>Priority: Minor
> Fix For: 2.7.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> [The docs on {{FileSystems.delete}} 
> say|https://github.com/apache/beam/blob/b5e8335d982ee69d9f788f65f27356cddd5293d1/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileSystems.java#L332-L333]:
> bq. It is allowed but not recommended to delete directories recursively. 
> Callers depends on {@link FileSystems} and uses {@code DeleteOptions}.
> However, the function actually takes a {{MoveOptions...}} param, there's 
> never been a {{DeleteOptions}} afaict, and there is no way to recursively 
> delete a {{ResourceId}}.
> The docs should be fixed, at a minimum; actually supporting recursive delete 
> would also be nice.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5339) Implement new policy on Beam dependency tooling

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5339?focusedWorklogId=145462=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145462
 ]

ASF GitHub Bot logged work on BEAM-5339:


Author: ASF GitHub Bot
Created on: 18/Sep/18 20:34
Start Date: 18/Sep/18 20:34
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on issue #6430: [BEAM-5339]/apply 
new policy on Beam dependency tooling
URL: https://github.com/apache/beam/pull/6430#issuecomment-422543643
 
 
   +R: @chamikaramj 
   
   Changes include:
   1. Remove the versions from the JIRA summary. The issue title uses the 
dependency name only.
  (e.g. Beam Dependency Update Request: com.group1:dep0)
   2. Stop assigning issues to owners directly. Instead, cc owners in the 
descriptions.
   3. Apply the JIRA reopening rules (6 months + 3 versions).
   4. More unit test coverage.
   
   Please review. Thanks. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145462)
Time Spent: 50m  (was: 40m)

> Implement new policy on Beam dependency tooling
> ---
>
> Key: BEAM-5339
> URL: https://issues.apache.org/jira/browse/BEAM-5339
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: yifan zou
>Assignee: yifan zou
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> (1) Instead of a dependency "owners" list we will be maintaining an 
> "interested parties" list. When we create a JIRA for a dependency we will not 
> assign it to an owner; rather, we will CC all the folks that mentioned that 
> they are interested in receiving updates related to that dependency. The hope 
> is that some of the interested parties will also put forward the effort to 
> upgrade dependencies they are interested in, but the responsibility of 
> upgrading dependencies lies with the community as a whole.
>  (2) We will be creating JIRAs for upgrading individual dependencies, not for 
> upgrading to specific versions of those dependencies. For example, if a given 
> dependency X is three minor versions or a year behind, we will create a JIRA 
> for upgrading it. But the specific version to upgrade to has to be 
> determined by the Beam community. The Beam community might choose to close a JIRA 
> if there are known issues with available recent releases. The tool may reopen 
> such a closed JIRA in the future if new information becomes available (for 
> example, 3 new versions have been released since the JIRA was closed).
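
As a rough illustration of the reopening rule mentioned above (6 months + 3 versions), a hypothetical sketch; the actual tooling is a Python script in the Beam repository, and whether the two conditions are combined with AND is an assumption here:

{code:java}
// Hypothetical sketch only; names and the AND-combination of the two
// conditions are assumptions, not the real dependency-tooling code.
import java.time.Duration;
import java.time.Instant;

class ReopenPolicy {
  private static final Duration SIX_MONTHS = Duration.ofDays(183); // roughly six months

  static boolean shouldReopen(Instant closedAt, int releasesSinceClosed, Instant now) {
    boolean sixMonthsElapsed = Duration.between(closedAt, now).compareTo(SIX_MONTHS) >= 0;
    boolean threeNewReleases = releasesSinceClosed >= 3;
    return sixMonthsElapsed && threeNewReleases;
  }
}
{code}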



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-5424) Beam Dependency Update Request: fake.group1:fake-dependency1

2018-09-18 Thread yifan zou (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yifan zou resolved BEAM-5424.
-
   Resolution: Fixed
Fix Version/s: (was: 2.8.0)
   Not applicable

> Beam Dependency Update Request: fake.group1:fake-dependency1
> 
>
> Key: BEAM-5424
> URL: https://issues.apache.org/jira/browse/BEAM-5424
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: yifan zou
>Priority: Major
> Fix For: Not applicable
>
>
> 2018-09-18 13:16:25.752733
> Please review and upgrade the fake.group1:fake-dependency1 to the latest 
> version 1.1.0
> cc:
> 2018-09-18 13:20:11.867319
> Please review and upgrade the fake.group1:fake-dependency1 to the 
> latest version 1.4.0 
>  
> cc: 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5339) Implement new policy on Beam dependency tooling

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5339?focusedWorklogId=145461=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145461
 ]

ASF GitHub Bot logged work on BEAM-5339:


Author: ASF GitHub Bot
Created on: 18/Sep/18 20:27
Start Date: 18/Sep/18 20:27
Worklog Time Spent: 10m 
  Work Description: yifanzou opened a new pull request #6430: 
[BEAM-5339]/new policy dependency tooling
URL: https://github.com/apache/beam/pull/6430
 
 
   **Please** add a meaningful description for your change here
   
   
   
   Follow this checklist to help us incorporate your contribution quickly and 
easily:
   
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   It will help us expedite review of your Pull Request if you tag someone 
(e.g. `@username`) to look at it.
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_GradleBuild/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_GradleBuild/lastCompletedBuild/)
 | --- | --- | --- | --- | --- | ---
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_Gradle/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)
 | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)
  [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/)
 | --- | --- | --- | ---
   
   
   
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145461)
Time Spent: 40m  (was: 0.5h)

> Implement new policy on Beam dependency tooling
> ---
>
> Key: BEAM-5339
> URL: https://issues.apache.org/jira/browse/BEAM-5339
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: yifan zou
>Assignee: yifan zou
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> (1) Instead of a dependency "owners" list we will be maintaining an 
> "interested parties" list. When we create a 

[jira] [Updated] (BEAM-5424) Beam Dependency Update Request: fake.group1:fake-dependency1

2018-09-18 Thread Beam JIRA Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Beam JIRA Bot updated BEAM-5424:

Description: 
2018-09-18 13:16:25.752733

Please review and upgrade the fake.group1:fake-dependency1 to the latest 
version 1.1.0

cc:

2018-09-18 13:20:11.867319

Please review and upgrade the fake.group1:fake-dependency1 to the 
latest version 1.4.0 
 
cc: 

  was:
2018-09-18 13:16:25.752733

Please review and upgrade the fake.group1:fake-dependency1 to the latest 
version 1.1.0

cc:


> Beam Dependency Update Request: fake.group1:fake-dependency1
> 
>
> Key: BEAM-5424
> URL: https://issues.apache.org/jira/browse/BEAM-5424
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: yifan zou
>Priority: Major
> Fix For: 2.8.0
>
>
> 2018-09-18 13:16:25.752733
> Please review and upgrade the fake.group1:fake-dependency1 to the latest 
> version 1.1.0
> cc:
> 2018-09-18 13:20:11.867319
> Please review and upgrade the fake.group1:fake-dependency1 to the 
> latest version 1.4.0 
>  
> cc: 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (BEAM-5424) Beam Dependency Update Request: fake.group1:fake-dependency1

2018-09-18 Thread Beam JIRA Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Beam JIRA Bot reopened BEAM-5424:
-

> Beam Dependency Update Request: fake.group1:fake-dependency1
> 
>
> Key: BEAM-5424
> URL: https://issues.apache.org/jira/browse/BEAM-5424
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: yifan zou
>Priority: Major
> Fix For: 2.8.0
>
>
> 2018-09-18 13:16:25.752733
> Please review and upgrade the fake.group1:fake-dependency1 to the latest 
> version 1.1.0
> cc:



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (BEAM-5424) Beam Dependency Update Request: fake.group1:fake-dependency1

2018-09-18 Thread yifan zou (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yifan zou closed BEAM-5424.
---
   Resolution: Won't Fix
Fix Version/s: 2.8.0

> Beam Dependency Update Request: fake.group1:fake-dependency1
> 
>
> Key: BEAM-5424
> URL: https://issues.apache.org/jira/browse/BEAM-5424
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: yifan zou
>Priority: Major
> Fix For: 2.8.0
>
>
> 2018-09-18 13:16:25.752733
> Please review and upgrade the fake.group1:fake-dependency1 to the latest 
> version 1.1.0
> cc:



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-5424) Beam Dependency Update Request: fake.group1:fake-dependency1

2018-09-18 Thread yifan zou (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yifan zou reassigned BEAM-5424:
---

Assignee: yifan zou

> Beam Dependency Update Request: fake.group1:fake-dependency1
> 
>
> Key: BEAM-5424
> URL: https://issues.apache.org/jira/browse/BEAM-5424
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: yifan zou
>Priority: Major
>
> 2018-09-18 13:16:25.752733
> Please review and upgrade the fake.group1:fake-dependency1 to the 
> latest version 1.3.0 
>  
> cc: 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-5424) Beam Dependency Update Request: fake.group1:fake-dependency1

2018-09-18 Thread yifan zou (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yifan zou updated BEAM-5424:

Description: 
2018-09-18 13:16:25.752733

Please review and upgrade the fake.group1:fake-dependency1 to the latest 
version 1.1.0

cc:

  was:


2018-09-18 13:16:25.752733

Please review and upgrade the fake.group1:fake-dependency1 to the 
latest version 1.3.0 
 
cc: 


> Beam Dependency Update Request: fake.group1:fake-dependency1
> 
>
> Key: BEAM-5424
> URL: https://issues.apache.org/jira/browse/BEAM-5424
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: yifan zou
>Priority: Major
>
> 2018-09-18 13:16:25.752733
> Please review and upgrade the fake.group1:fake-dependency1 to the latest 
> version 1.1.0
> cc:



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-5424) Beam Dependency Update Request: fake.group1:fake-dependency1

2018-09-18 Thread Beam JIRA Bot (JIRA)
Beam JIRA Bot created BEAM-5424:
---

 Summary: Beam Dependency Update Request: 
fake.group1:fake-dependency1
 Key: BEAM-5424
 URL: https://issues.apache.org/jira/browse/BEAM-5424
 Project: Beam
  Issue Type: Sub-task
  Components: dependencies
Reporter: Beam JIRA Bot




2018-09-18 13:16:25.752733

Please review and upgrade the fake.group1:fake-dependency1 to the 
latest version 1.3.0 
 
cc: 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-5423) Beam Dependency Update Request: fake.group1

2018-09-18 Thread Beam JIRA Bot (JIRA)
Beam JIRA Bot created BEAM-5423:
---

 Summary: Beam Dependency Update Request: fake.group1
 Key: BEAM-5423
 URL: https://issues.apache.org/jira/browse/BEAM-5423
 Project: Beam
  Issue Type: Bug
  Components: dependencies
Reporter: Beam JIRA Bot




2018-09-18 13:16:24.434209

Please review and upgrade the fake.group1 to the latest version None 
 
cc: 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3089) Issue with setting the parallelism at client level using Flink runner

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3089?focusedWorklogId=145459=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145459
 ]

ASF GitHub Bot logged work on BEAM-3089:


Author: ASF GitHub Bot
Created on: 18/Sep/18 20:03
Start Date: 18/Sep/18 20:03
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #6426: 
[BEAM-3089] Fix default values in FlinkPipelineOptions / Add tests
URL: https://github.com/apache/beam/pull/6426#discussion_r218573541
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkPipelineOptions.java
 ##
 @@ -56,12 +56,13 @@
   "Address of the Flink Master where the Pipeline should be executed. Can"
   + " either be of the form \"host:port\" or one of the special values 
[local], "
   + "[collection] or [auto].")
+  @Default.String("[auto]")
   String getFlinkMaster();
 
   void setFlinkMaster(String value);
 
   @Description("The degree of parallelism to be used when distributing 
operations onto workers.")
-  @Default.InstanceFactory(DefaultParallelismFactory.class)
+  @Default.Integer(-1)
 
 Review comment:
   Nit: update the description to explain what values <= 0 mean


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145459)
Time Spent: 2h  (was: 1h 50m)

> Issue with setting the parallelism at client level using Flink runner
> -
>
> Key: BEAM-3089
> URL: https://issues.apache.org/jira/browse/BEAM-3089
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Affects Versions: 2.0.0
> Environment: I am using Flink 1.2.1 running on Docker, with Task 
> Managers distributed across different VMs as part of a Docker Swarm.
>Reporter: Thalita Vergilio
>Assignee: Grzegorz Kołakowski
>Priority: Major
>  Labels: docker, flink, parallel-deployment
> Fix For: 2.8.0
>
> Attachments: flink-ui-parallelism.png
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> When uploading an Apache Beam application using the Flink Web UI, the 
> parallelism set at job submission doesn't get picked up. The same happens 
> when submitting a job using the Flink CLI.
> In both cases, the parallelism ends up defaulting to 1.
> When I set the parallelism programmatically within the Apache Beam code, it 
> works: {{flinkPipelineOptions.setParallelism(4);}}
> I suspect the root of the problem may be in the 
> org.apache.beam.runners.flink.DefaultParallelismFactory class, as it checks 
> for Flink's GlobalConfiguration, which may not pick up runtime values passed 
> to Flink, then defaults to 1 if it doesn't find anything.
> Any ideas on how this could be fixed or worked around? I need to be able to 
> change the parallelism dynamically, so the programmatic approach won't really 
> work for me, nor will setting the Flink configuration at system level.
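
For reference, a minimal self-contained sketch of the programmatic workaround mentioned above (the parallelism value is illustrative):

{code:java}
// Sketch of setting parallelism on FlinkPipelineOptions in code, which works
// today, as opposed to relying on values passed at job submission.
import org.apache.beam.runners.flink.FlinkPipelineOptions;
import org.apache.beam.runners.flink.FlinkRunner;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

public class SetParallelismExample {
  public static void main(String[] args) {
    FlinkPipelineOptions options =
        PipelineOptionsFactory.fromArgs(args).withValidation().as(FlinkPipelineOptions.class);
    options.setRunner(FlinkRunner.class);
    options.setParallelism(4); // illustrative value
    Pipeline pipeline = Pipeline.create(options);
    // ... build the pipeline here ...
    pipeline.run();
  }
}
{code}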



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3089) Issue with setting the parallelism at client level using Flink runner

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3089?focusedWorklogId=145457=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145457
 ]

ASF GitHub Bot logged work on BEAM-3089:


Author: ASF GitHub Bot
Created on: 18/Sep/18 20:03
Start Date: 18/Sep/18 20:03
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #6426: 
[BEAM-3089] Fix default values in FlinkPipelineOptions / Add tests
URL: https://github.com/apache/beam/pull/6426#discussion_r218574954
 
 

 ##
 File path: runners/flink/src/test/resources/flink-conf.yaml
 ##
 @@ -0,0 +1 @@
+parallelism.default: 23
 
 Review comment:
   nit: new line


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145457)
Time Spent: 1h 50m  (was: 1h 40m)

> Issue with setting the parallelism at client level using Flink runner
> -
>
> Key: BEAM-3089
> URL: https://issues.apache.org/jira/browse/BEAM-3089
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Affects Versions: 2.0.0
> Environment: I am using Flink 1.2.1 running on Docker, with Task 
> Managers distributed across different VMs as part of a Docker Swarm.
>Reporter: Thalita Vergilio
>Assignee: Grzegorz Kołakowski
>Priority: Major
>  Labels: docker, flink, parallel-deployment
> Fix For: 2.8.0
>
> Attachments: flink-ui-parallelism.png
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> When uploading an Apache Beam application using the Flink Web UI, the 
> parallelism set at job submission doesn't get picked up. The same happens 
> when submitting a job using the Flink CLI.
> In both cases, the parallelism ends up defaulting to 1.
> When I set the parallelism programmatically within the Apache Beam code, it 
> works: {{flinkPipelineOptions.setParallelism(4);}}
> I suspect the root of the problem may be in the 
> org.apache.beam.runners.flink.DefaultParallelismFactory class, as it checks 
> for Flink's GlobalConfiguration, which may not pick up runtime values passed 
> to Flink, then defaults to 1 if it doesn't find anything.
> Any ideas on how this could be fixed or worked around? I need to be able to 
> change the parallelism dynamically, so the programmatic approach won't really 
> work for me, nor will setting the Flink configuration at system level.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3089) Issue with setting the parallelism at client level using Flink runner

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3089?focusedWorklogId=145458=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145458
 ]

ASF GitHub Bot logged work on BEAM-3089:


Author: ASF GitHub Bot
Created on: 18/Sep/18 20:03
Start Date: 18/Sep/18 20:03
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #6426: 
[BEAM-3089] Fix default values in FlinkPipelineOptions / Add tests
URL: https://github.com/apache/beam/pull/6426#discussion_r218574267
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkExecutionEnvironments.java
 ##
 @@ -156,9 +181,11 @@ public static StreamExecutionEnvironment 
createStreamExecutionEnvironment(
 throw new IllegalArgumentException("The checkpoint interval must be 
positive");
   }
   flinkStreamEnv.enableCheckpointing(checkpointInterval, 
options.getCheckpointingMode());
-  flinkStreamEnv
-  .getCheckpointConfig()
-  .setCheckpointTimeout(options.getCheckpointTimeoutMillis());
+  if (options.getCheckpointTimeoutMillis() != -1) {
 
 Review comment:
   nit: < 0


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145458)

> Issue with setting the parallelism at client level using Flink runner
> -
>
> Key: BEAM-3089
> URL: https://issues.apache.org/jira/browse/BEAM-3089
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Affects Versions: 2.0.0
> Environment: I am using Flink 1.2.1 running on Docker, with Task 
> Managers distributed across different VMs as part of a Docker Swarm.
>Reporter: Thalita Vergilio
>Assignee: Grzegorz Kołakowski
>Priority: Major
>  Labels: docker, flink, parallel-deployment
> Fix For: 2.8.0
>
> Attachments: flink-ui-parallelism.png
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> When uploading an Apache Beam application using the Flink Web UI, the 
> parallelism set at job submission doesn't get picked up. The same happens 
> when submitting a job using the Flink CLI.
> In both cases, the parallelism ends up defaulting to 1.
> When I set the parallelism programmatically within the Apache Beam code, it 
> works: {{flinkPipelineOptions.setParallelism(4);}}
> I suspect the root of the problem may be in the 
> org.apache.beam.runners.flink.DefaultParallelismFactory class, as it checks 
> for Flink's GlobalConfiguration, which may not pick up runtime values passed 
> to Flink, then defaults to 1 if it doesn't find anything.
> Any ideas on how this could be fixed or worked around? I need to be able to 
> change the parallelism dynamically, so the programmatic approach won't really 
> work for me, nor will setting the Flink configuration at system level.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-5409) Beam Java SDK 2.4/2.5 PAssert with CoGroupByKey

2018-09-18 Thread haden lee (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haden lee updated BEAM-5409:

Description: 
I may be missing something obvious, but for some reason I can't make 
{{PAssert}} & {{TestPipeline}} work with {{CoGroupByKey}} -- but without it, it 
works fine.
 Here is a reference test file that can reproduce the issue I'm facing. I 
tested with both Beam SDK 2.4 and 2.5.

([For the record this was posted on StackOverflow 
before|https://stackoverflow.com/questions/51334429/beam-java-sdk-2-4-2-5-passert-with-cogroupbykey].)

For comparison, {{testWorking}} works as intended, and {{testBroken}} has an 
additional step like this:
{code:java}
// The following four lines cause an issue.
PCollectionTuple tuple =
    KeyedPCollectionTuple.of(inTag1, pc1.apply("String to KV", ParDo.of(new String2KV())))
        .and(inTag2, pc2)
        .apply("CoGroupByKey", CoGroupByKey.create())
        .apply("Some Merge DoFn",
            ParDo.of(new MergeDoFn(inTag1, inTag2, outTag2))
                .withOutputTags(outTag1, TupleTagList.of(outTag2)));
{code}
The error I get can be found after the code below.

Has anyone had a similar issue with test pipeline before? I haven't tested it 
yet extensively, but I couldn't find relevant information on {{CoGroupByKey}} & 
{{TestPipeline}} together. In production, the same code works fine for my team, 
and we wanted to add a few unit tests using {{TestPipeline}} and {{PAssert}}. 
That's how we ended up with this issue.

Any help will be appreciated!

*NOTE: Resolved, after adding 'implements Serializable' to the main Test class 
as shown below. Without it, it will throw an exception. I'll leave the original 
contents for reference.*
{code:java}
// code placeholder
public class ReferenceTest implements Serializable {
  @Rule
  public final transient TestPipeline pipe1 = TestPipeline.create();
  @Rule
  public final transient TestPipeline pipe2 = TestPipeline.create();

  public static class String2KV extends DoFn<String, KV<String, String>> {
    @ProcessElement
    public void processElement(ProcessContext c) {
      // "key1:value1" -> ["key1", "value1"]
      String[] tokens = c.element().split(":");
      c.output(KV.of(tokens[0], tokens[1]));
    }
  }

  public static class MergeDoFn extends DoFn<KV<String, CoGbkResult>, String> {
    final TupleTag<String> inTag1;
    final TupleTag<String> inTag2;
    final TupleTag<String> outTag2;

    public MergeDoFn(TupleTag<String> inTag1, TupleTag<String> inTag2,
        TupleTag<String> outTag2) {
      this.inTag1 = inTag1;
      this.inTag2 = inTag2;
      this.outTag2 = outTag2;
    }

    @ProcessElement
    public void processElement(ProcessContext c) {
      String val1 = c.element().getValue().getOnly(inTag1);
      String val2 = c.element().getValue().getOnly(inTag2);

      // outTag1 = main output
      // outTag2 = side output
      c.output(outTag2, val1 + "," + val2);
    }
  }

  @Test
  public void testWorking() {
    // Create two PCs for test.
    PCollection<String> pc1 =
        pipe1.apply("create pc1", Create.of("key1:value1").withCoder(StringUtf8Coder.of()));
    PCollection<KV<String, String>> pc2 =
        pipe1.apply("create pc2", Create.<KV<String, String>>of(KV.of("key1", "key1:value2"))
            .withCoder(KvCoder.of(StringUtf8Coder.of(), StringUtf8Coder.of())));

    // Sanity check.
    PAssert.that(pc1).containsInAnyOrder("key1:value1");
    PAssert.that(pc2).containsInAnyOrder(KV.of("key1", "key1:value2"));

    pipe1.run();
  }

  // Disabled as of 2018-07-13.
  // https://stackoverflow.com/questions/51334429/beam-java-sdk-2-4-2-5-passert-with-cogroupbykey
  @Test
  public void testBroken() {
    // Create two PCs for test.
    PCollection<String> pc1 =
        pipe2.apply("create pc1", Create.of("key1:value1").withCoder(StringUtf8Coder.of()));
    PCollection<KV<String, String>> pc2 =
        pipe2.apply("create pc2", Create.<KV<String, String>>of(KV.of("key1", "value2"))
            .withCoder(KvCoder.of(StringUtf8Coder.of(), StringUtf8Coder.of())));

    // Sanity check.
    PAssert.that(pc1).containsInAnyOrder("key1:value1");
    PAssert.that(pc2).containsInAnyOrder(KV.of("key1", "value2"));

    TupleTag<String> inTag1 = new TupleTag<String>() {
      private static final long serialVersionUID = 1L;
    };
    TupleTag<String> inTag2 = new TupleTag<String>() {
      private static final long serialVersionUID = 1L;
    };
    TupleTag<String> outTag1 = new TupleTag<String>() {
      private static final long serialVersionUID = 1L;
    };
    TupleTag<String> outTag2 = new TupleTag<String>() {
      private static final long serialVersionUID = 1L;
    };

    // The following four lines cause an issue.
    PCollectionTuple tuple =
        KeyedPCollectionTuple.of(inTag1, pc1.apply("String to KV", ParDo.of(new String2KV())))
            .and(inTag2, pc2)
            .apply("CoGroupByKey", CoGroupByKey.create())
            .apply("Some Merge DoFn",
                ParDo.of(new MergeDoFn(inTag1, inTag2, outTag2))
                    .withOutputTags(outTag1, TupleTagList.of(outTag2)));

    PAssert.that(tuple.get(outTag1)).empty();

[jira] [Resolved] (BEAM-5409) Beam Java SDK 2.4/2.5 PAssert with CoGroupByKey

2018-09-18 Thread haden lee (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haden lee resolved BEAM-5409.
-
   Resolution: Fixed
Fix Version/s: 2.4.0
   2.5.0

Found out that adding the following to the Test class is necessary to make 
PAssert work with CoGroupByKey.
{code:java}
implements Serializable{code}
{code:java}
// code placeholder
import java.io.Serializable;

import org.apache.beam.sdk.coders.KvCoder;
import org.apache.beam.sdk.coders.StringUtf8Coder;
import org.apache.beam.sdk.testing.PAssert;
import org.apache.beam.sdk.testing.TestPipeline;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.transforms.join.CoGbkResult;
import org.apache.beam.sdk.transforms.join.CoGroupByKey;
import org.apache.beam.sdk.transforms.join.KeyedPCollectionTuple;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.PCollectionTuple;
import org.apache.beam.sdk.values.TupleTag;
import org.apache.beam.sdk.values.TupleTagList;
import org.junit.Rule;
import org.junit.Test;

public class ReferenceTest implements Serializable {
  @Rule
  public final transient TestPipeline pipe1 = TestPipeline.create();
  @Rule
  public final transient TestPipeline pipe2 = TestPipeline.create();

  public static class String2KV extends DoFn<String, KV<String, String>> {
    @ProcessElement
    public void processElement(ProcessContext c) {
      // "key1:value1" -> ["key1", "value1"]
      String[] tokens = c.element().split(":");
      c.output(KV.of(tokens[0], tokens[1]));
    }
  }

  public static class MergeDoFn extends DoFn<KV<String, CoGbkResult>, String> {
    final TupleTag<String> inTag1;
    final TupleTag<String> inTag2;
    final TupleTag<String> outTag2;

    public MergeDoFn(TupleTag<String> inTag1, TupleTag<String> inTag2,
        TupleTag<String> outTag2) {
      this.inTag1 = inTag1;
      this.inTag2 = inTag2;
      this.outTag2 = outTag2;
    }

    @ProcessElement
    public void processElement(ProcessContext c) {
      String val1 = c.element().getValue().getOnly(inTag1);
      String val2 = c.element().getValue().getOnly(inTag2);

      // outTag1 = main output
      // outTag2 = side output
      c.output(outTag2, val1 + "," + val2);
    }
  }

  @Test
  public void testWorking() {
    // Create two PCs for test.
    PCollection<String> pc1 =
        pipe1.apply("create pc1", Create.of("key1:value1").withCoder(StringUtf8Coder.of()));
    PCollection<KV<String, String>> pc2 =
        pipe1.apply("create pc2", Create.<KV<String, String>>of(KV.of("key1", "key1:value2"))
            .withCoder(KvCoder.of(StringUtf8Coder.of(), StringUtf8Coder.of())));

    // Sanity check.
    PAssert.that(pc1).containsInAnyOrder("key1:value1");
    PAssert.that(pc2).containsInAnyOrder(KV.of("key1", "key1:value2"));

    pipe1.run();
  }

  // Disabled as of 2018-07-13.
  // https://stackoverflow.com/questions/51334429/beam-java-sdk-2-4-2-5-passert-with-cogroupbykey
  @Test
  public void testBroken() {
    // Create two PCs for test.
    PCollection<String> pc1 =
        pipe2.apply("create pc1", Create.of("key1:value1").withCoder(StringUtf8Coder.of()));
    PCollection<KV<String, String>> pc2 =
        pipe2.apply("create pc2", Create.<KV<String, String>>of(KV.of("key1", "value2"))
            .withCoder(KvCoder.of(StringUtf8Coder.of(), StringUtf8Coder.of())));

    // Sanity check.
    PAssert.that(pc1).containsInAnyOrder("key1:value1");
    PAssert.that(pc2).containsInAnyOrder(KV.of("key1", "value2"));

    TupleTag<String> inTag1 = new TupleTag<String>() {
      private static final long serialVersionUID = 1L;
    };
    TupleTag<String> inTag2 = new TupleTag<String>() {
      private static final long serialVersionUID = 1L;
    };
    TupleTag<String> outTag1 = new TupleTag<String>() {
      private static final long serialVersionUID = 1L;
    };
    TupleTag<String> outTag2 = new TupleTag<String>() {
      private static final long serialVersionUID = 1L;
    };

    // The following four lines cause an issue.
    PCollectionTuple tuple =
        KeyedPCollectionTuple.of(inTag1, pc1.apply("String to KV", ParDo.of(new String2KV())))
            .and(inTag2, pc2)
            .apply("CoGroupByKey", CoGroupByKey.create())
            .apply("Some Merge DoFn",
                ParDo.of(new MergeDoFn(inTag1, inTag2, outTag2))
                    .withOutputTags(outTag1, TupleTagList.of(outTag2)));

    PAssert.that(tuple.get(outTag1)).empty();
    PAssert.that(tuple.get(outTag2)).containsInAnyOrder("value1,value2");

    pipe2.run();
  }
}

{code}

> Beam Java SDK 2.4/2.5 PAssert with CoGroupByKey
> ---
>
> Key: BEAM-5409
> URL: https://issues.apache.org/jira/browse/BEAM-5409
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Affects Versions: 2.4.0, 2.5.0
>Reporter: haden lee
>Assignee: Jason Kuster
>Priority: Major
> Fix For: 2.5.0, 2.4.0
>
>
> I may be missing something obvious, but for some 

[jira] [Work logged] (BEAM-4824) Get BigQueryIO batch loads to return something actionable

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4824?focusedWorklogId=145454=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145454
 ]

ASF GitHub Bot logged work on BEAM-4824:


Author: ASF GitHub Bot
Created on: 18/Sep/18 19:51
Start Date: 18/Sep/18 19:51
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #6055: [BEAM-4824] Batch 
BigQueryIO returns job results
URL: https://github.com/apache/beam/pull/6055#issuecomment-422527344
 
 
   Oh, @calonso this was intended to make Wait.on work, right? Let me see if I 
can think of something.
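
For readers unfamiliar with Wait.on, a small self-contained sketch of the pattern being discussed; the "signal" collection here is only a stand-in for the job-result PCollection this PR wants batch BigQueryIO to return:

{code:java}
// Illustrates org.apache.beam.sdk.transforms.Wait: elements of "main" are
// emitted only once "signal" is complete for the corresponding window.
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.Wait;
import org.apache.beam.sdk.values.PCollection;

public class WaitOnExample {
  public static void main(String[] args) {
    Pipeline pipeline = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());
    // Stand-in for the load-job results the issue asks BigQueryIO to expose.
    PCollection<String> signal = pipeline.apply("Signal", Create.of("load-job-finished"));
    PCollection<String> main = pipeline.apply("Main", Create.of("a", "b", "c"));
    main.apply("WaitForSignal", Wait.on(signal));
    pipeline.run().waitUntilFinish();
  }
}
{code}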


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145454)
Time Spent: 1h 20m  (was: 1h 10m)

> Get BigQueryIO batch loads to return something actionable
> -
>
> Key: BEAM-4824
> URL: https://issues.apache.org/jira/browse/BEAM-4824
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Reporter: Carlos Alonso
>Assignee: Carlos Alonso
>Priority: Minor
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> ATM BigQueryIO batch loads returns an empty collection that has no information 
> related to how the load job finished. It is even returned before the job 
> finishes.
>  
> Change it so that:
>  # The returned PCollection only appears when the job has actually finished
>  # The returned PCollection contains information about the job result



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4824) Get BigQueryIO batch loads to return something actionable

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4824?focusedWorklogId=145452=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145452
 ]

ASF GitHub Bot logged work on BEAM-4824:


Author: ASF GitHub Bot
Created on: 18/Sep/18 19:43
Start Date: 18/Sep/18 19:43
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #6055: [BEAM-4824] Batch 
BigQueryIO returns job results
URL: https://github.com/apache/beam/pull/6055#issuecomment-422524409
 
 
   BTW my second comment still stands I think. BigQueryIO currently uses load 
jobs as an implementation detail. It might end up creating one load job per 
table, or it might end up creating multiple load jobs per table (if the table 
is very large). Collapsing the multiple jobs together might be very confusing. 
I think making information about these jobs part of the public API is also 
confusing, since the actual logical model is per record.
   
   Another thing: there will be upcoming changes to the BigQuery API, and we 
plan on getting rid of load jobs entirely from BigQueryIO. If we make 
information about load jobs part of the public API, it might be problematic 
when we remove the load jobs.
   
   Is this something that could be accomplished with better logging, or are 
there concrete use cases for wanting the output in a PCollection?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145452)
Time Spent: 1h 10m  (was: 1h)

> Get BigQueryIO batch loads to return something actionable
> -
>
> Key: BEAM-4824
> URL: https://issues.apache.org/jira/browse/BEAM-4824
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Reporter: Carlos Alonso
>Assignee: Carlos Alonso
>Priority: Minor
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> ATM BigQueryIO batch loads returns an empty collection that has no information 
> related to how the load job finished. It is even returned before the job 
> finishes.
>  
> Change it so that:
>  # The returned PCollection only appears when the job has actually finished
>  # The returned PCollection contains information about the job result



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4824) Get BigQueryIO batch loads to return something actionable

2018-09-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4824?focusedWorklogId=145450=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-145450
 ]

ASF GitHub Bot logged work on BEAM-4824:


Author: ASF GitHub Bot
Created on: 18/Sep/18 19:39
Start Date: 18/Sep/18 19:39
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #6055: [BEAM-4824] Batch 
BigQueryIO returns job results
URL: https://github.com/apache/beam/pull/6055#issuecomment-422522755
 
 
   Sorry for the long delay! Looking again


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 145450)
Time Spent: 1h  (was: 50m)

> Get BigQueryIO batch loads to return something actionable
> -
>
> Key: BEAM-4824
> URL: https://issues.apache.org/jira/browse/BEAM-4824
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Reporter: Carlos Alonso
>Assignee: Carlos Alonso
>Priority: Minor
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> ATM BigQueryIO batch loads returns an empty collection that has no information 
> related to how the load job finished. It is even returned before the job 
> finishes.
>  
> Change it so that:
>  # The returned PCollection only appears when the job has actually finished
>  # The returned PCollection contains information about the job result



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[beam] 01/01: Merge pull request #6401 from boyuanzz/python_bq_it

2018-09-18 Thread pabloem
This is an automated email from the ASF dual-hosted git repository.

pabloem pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 5d2e2f63fb692d545362db5cb8c1066d10e11946
Merge: 46f0a02 d6b456d
Author: Pablo 
AuthorDate: Tue Sep 18 12:00:55 2018 -0700

Merge pull request #6401 from boyuanzz/python_bq_it

[BEAM-5377] Add big_query_query_to_table_it to python SDK

 .../io/gcp/big_query_query_to_table_it_test.py | 172 +
 .../io/gcp/big_query_query_to_table_pipeline.py|  68 
 sdks/python/scripts/run_postcommit.sh  |   9 +-
 3 files changed, 248 insertions(+), 1 deletion(-)



[beam] branch master updated (46f0a02 -> 5d2e2f6)

2018-09-18 Thread pabloem
This is an automated email from the ASF dual-hosted git repository.

pabloem pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 46f0a02  [BEAM-3194] Fail if @RequiresStableInput is used on runners 
that don't supoort it
 add d6b456d  Add big_query_query_to_table_it to python SDK
 new 5d2e2f6  Merge pull request #6401 from boyuanzz/python_bq_it

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../io/gcp/big_query_query_to_table_it_test.py | 172 +
 .../io/gcp/big_query_query_to_table_pipeline.py|  68 
 sdks/python/scripts/run_postcommit.sh  |   9 +-
 3 files changed, 248 insertions(+), 1 deletion(-)
 create mode 100644 
sdks/python/apache_beam/io/gcp/big_query_query_to_table_it_test.py
 create mode 100644 
sdks/python/apache_beam/io/gcp/big_query_query_to_table_pipeline.py



[jira] [Resolved] (BEAM-4417) BigqueryIO Numeric datatype Support

2018-09-18 Thread Pablo Estrada (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pablo Estrada resolved BEAM-4417.
-
   Resolution: Fixed
Fix Version/s: (was: 2.8.0)
   2.7.0

This will be available in 2.7.0

> BigqueryIO Numeric datatype Support
> ---
>
> Key: BEAM-4417
> URL: https://issues.apache.org/jira/browse/BEAM-4417
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Affects Versions: 2.4.0
>Reporter: Kishan Kumar
>Assignee: Pablo Estrada
>Priority: Critical
>  Labels: newbie, patch
> Fix For: 2.7.0
>
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> The BigQueryIO.read fails while parsing the data from the avro file generated 
> while reading the data from the table which has columns with *Numeric* 
> datatypes. 
> We have gone through the source code at Git-Hub and noticed that *Numeric 
> data type is not yet supported.* 
>  
> Caused by: com.google.common.base.VerifyException: Unsupported BigQuery type: 
> NUMERIC
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle #1071

2018-09-18 Thread Apache Jenkins Server
See 


Changes:

[ccy] [BEAM-5264] Reference DirectRunner implementation of Python User State

[ccy] Address review comments

[ccy] Address additional comments

[ccy] Address more comments

--
[...truncated 18.74 MB...]
INFO: Adding 
View.AsSingleton/Combine.GloballyAsSingletonView/CreateDataflowView as step s9
Sep 18, 2018 6:56:51 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding Create123/Read(CreateSource) as step s10
Sep 18, 2018 6:56:51 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding OutputSideInputs as step s11
Sep 18, 2018 6:56:51 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/Window.Into()/Window.Assign as step 
s12
Sep 18, 2018 6:56:51 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding 
PAssert$33/GroupGlobally/GatherAllOutputs/Reify.Window/ParDo(Anonymous) as step 
s13
Sep 18, 2018 6:56:51 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/GatherAllOutputs/WithKeys/AddKeys/Map 
as step s14
Sep 18, 2018 6:56:51 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding 
PAssert$33/GroupGlobally/GatherAllOutputs/Window.Into()/Window.Assign as step 
s15
Sep 18, 2018 6:56:51 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/GatherAllOutputs/GroupByKey as step 
s16
Sep 18, 2018 6:56:51 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/GatherAllOutputs/Values/Values/Map as 
step s17
Sep 18, 2018 6:56:51 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/RewindowActuals/Window.Assign as step 
s18
Sep 18, 2018 6:56:51 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/KeyForDummy/AddKeys/Map as step s19
Sep 18, 2018 6:56:51 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding 
PAssert$33/GroupGlobally/RemoveActualsTriggering/Flatten.PCollections as step 
s20
Sep 18, 2018 6:56:51 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/Create.Values/Read(CreateSource) as 
step s21
Sep 18, 2018 6:56:51 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/WindowIntoDummy/Window.Assign as step 
s22
Sep 18, 2018 6:56:51 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding 
PAssert$33/GroupGlobally/RemoveDummyTriggering/Flatten.PCollections as step s23
Sep 18, 2018 6:56:51 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/FlattenDummyAndContents as step s24
Sep 18, 2018 6:56:51 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/NeverTrigger/Flatten.PCollections as 
step s25
Sep 18, 2018 6:56:51 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/GroupDummyAndContents as step s26
Sep 18, 2018 6:56:51 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/Values/Values/Map as step s27
Sep 18, 2018 6:56:51 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/ParDo(Concat) as step s28
Sep 18, 2018 6:56:51 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GetPane/Map as step s29
Sep 18, 2018 6:56:51 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/RunChecks as step s30
Sep 18, 2018 6:56:51 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/VerifyAssertions/ParDo(DefaultConclude) as step s31
Sep 18, 2018 6:56:51 PM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: Staging pipeline description to 
gs://temp-storage-for-validates-runner-tests//viewtest0testsingletonsideinput-jenkins-0918185646-197165f3/output/results/staging/
Sep 18, 2018 6:56:52 PM org.apache.beam.runners.dataflow.util.PackageUtil 
tryStagePackage
INFO: Uploading <71190 bytes, hash 

[jira] [Resolved] (BEAM-5418) Add Flink version compatibility table to Runner page

2018-09-18 Thread Maximilian Michels (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels resolved BEAM-5418.
--
   Resolution: Fixed
Fix Version/s: 2.8.0

> Add Flink version compatibility table to Runner page 
> -
>
> Key: BEAM-5418
> URL: https://issues.apache.org/jira/browse/BEAM-5418
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink, website
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
> Fix For: 2.8.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Following up on a discussion on the mailing list: there has been confusion 
> about which version of Beam is compatible with which Flink version. At the 
> moment the only way for users to find out is to look into the source code.
> A table like this will be helpful:
> || Beam || Flink ||
> |2.5.0 | 1.4.0 |
> |2.6.0 |  1.5.0 |
> |2.7.0 |  1.5.2 |



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-5385) Flink jobserver does not honor --flink-master-url

2018-09-18 Thread Maximilian Michels (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-5385:
-
Fix Version/s: (was: 2.8.0)
   2.7.0

> Flink jobserver does not honor --flink-master-url
> -
>
> Key: BEAM-5385
> URL: https://issues.apache.org/jira/browse/BEAM-5385
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Thomas Weise
>Assignee: Thomas Weise
>Priority: Major
> Fix For: 2.7.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> It will use the external Flink cluster when specified, but only with the default 
> port number 8081, because the actual port is not propagated in 
> FlinkExecutionEnvironments (the RestOptions.PORT setting).
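A minimal sketch of the direction a fix could take (illustrative only, not the actual Beam change): parse the configured master URL and propagate its port into the Flink client Configuration via RestOptions.PORT, rather than letting the REST client fall back to 8081. The helper name and URL handling below are assumptions; the host part would still go through the existing host handling.

import java.net.URI;

import org.apache.flink.configuration.Configuration;
import org.apache.flink.configuration.RestOptions;

public class FlinkMasterPortConfig {

  // Hypothetical helper: extract the REST port from a value such as
  // "flink-master:8181" or "http://flink-master:8181" and set it on the
  // client Configuration so it is not silently replaced by the default 8081.
  static Configuration withRestPort(String flinkMasterUrl) {
    URI uri = URI.create(
        flinkMasterUrl.contains("://") ? flinkMasterUrl : "http://" + flinkMasterUrl);
    Configuration config = new Configuration();
    if (uri.getPort() != -1) {
      config.setInteger(RestOptions.PORT, uri.getPort());
    }
    return config;
  }

  public static void main(String[] args) {
    Configuration config = withRestPort("flink-master:8181");
    System.out.println(config.getInteger(RestOptions.PORT, 8081)); // prints 8181
  }
}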



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-5422) Update BigQueryIO DynamicDestinations documentation to clarify usage of getDestination() and getTable()

2018-09-18 Thread Chamikara Jayalath (JIRA)
Chamikara Jayalath created BEAM-5422:


 Summary: Update BigQueryIO DynamicDestinations documentation to 
clarify usage of getDestination() and getTable()
 Key: BEAM-5422
 URL: https://issues.apache.org/jira/browse/BEAM-5422
 Project: Beam
  Issue Type: Improvement
  Components: io-java-gcp
Reporter: Chamikara Jayalath
Assignee: Chamikara Jayalath


Currently, there are some details related to these methods that should be 
further clarified. For example, getTable() is expected to return a unique value 
for each destination.
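As a starting point for that documentation, a small illustrative DynamicDestinations implementation (table and field names are made up) showing the intended division of labour: getDestination() extracts a routing key from each element, and getTable() must map each distinct key to a distinct table.

import java.util.Arrays;

import com.google.api.services.bigquery.model.TableFieldSchema;
import com.google.api.services.bigquery.model.TableRow;
import com.google.api.services.bigquery.model.TableSchema;
import org.apache.beam.sdk.io.gcp.bigquery.DynamicDestinations;
import org.apache.beam.sdk.io.gcp.bigquery.TableDestination;
import org.apache.beam.sdk.values.ValueInSingleWindow;

// Illustrative sketch only: route each row to a per-country table.
class PerCountryDestinations extends DynamicDestinations<TableRow, String> {

  @Override
  public String getDestination(ValueInSingleWindow<TableRow> element) {
    // The routing key; rows with the same key share one destination.
    return (String) element.getValue().get("country");
  }

  @Override
  public TableDestination getTable(String country) {
    // Must return a unique table per destination value from getDestination().
    return new TableDestination(
        "my-project:my_dataset.events_" + country, "Events for " + country);
  }

  @Override
  public TableSchema getSchema(String country) {
    return new TableSchema().setFields(Arrays.asList(
        new TableFieldSchema().setName("country").setType("STRING"),
        new TableFieldSchema().setName("payload").setType("STRING")));
  }
}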



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-5239) Allow configure latencyTrackingInterval

2018-09-18 Thread Maximilian Michels (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-5239:
-
Fix Version/s: 2.7.0

> Allow configure latencyTrackingInterval
> ---
>
> Key: BEAM-5239
> URL: https://issues.apache.org/jira/browse/BEAM-5239
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Affects Versions: 2.6.0
>Reporter: Jozef Vilcek
>Assignee: Jozef Vilcek
>Priority: Major
> Fix For: 2.7.0
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Because of FLINK-10226, we need to be able to set the latency tracking 
> configuration for Flink via FlinkPipelineOptions.
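For reference, a minimal sketch of what honoring such an option could look like on the Flink side. The value is hard-coded here, whereas the proposal is to read it from a FlinkPipelineOptions getter whose exact name is an assumption at this point; Flink's own API call is ExecutionConfig#setLatencyTrackingInterval(long).

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class LatencyTrackingExample {
  public static void main(String[] args) {
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    // Hypothetical: this value would come from FlinkPipelineOptions.
    long latencyTrackingIntervalMillis = 5_000L;
    env.getConfig().setLatencyTrackingInterval(latencyTrackingIntervalMillis);
    // ... translate and execute the Beam pipeline on this environment ...
  }
}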



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

