[jira] [Commented] (BEAM-1082) Use use_standard_sql flag everywhere instead of use_legacy_sql
[ https://issues.apache.org/jira/browse/BEAM-1082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15717582#comment-15717582 ] ASF GitHub Bot commented on BEAM-1082: -- Github user sb2nov closed the pull request at: https://github.com/apache/incubator-beam/pull/1497 > Use use_standard_sql flag everywhere instead of use_legacy_sql > -- > > Key: BEAM-1082 > URL: https://issues.apache.org/jira/browse/BEAM-1082 > Project: Beam > Issue Type: Improvement > Components: sdk-py >Reporter: Sourabh Bajaj >Assignee: Sourabh Bajaj > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] incubator-beam pull request #1497: [BEAM-1082] Make the legacy SQL flag cons...
Github user sb2nov closed the pull request at: https://github.com/apache/incubator-beam/pull/1497 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[2/2] incubator-beam git commit: Removing a bug in .travis.yml that makes the build fail.
Removing a bug in .travis.yml that makes the build fail. Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/90db7908 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/90db7908 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/90db7908 Branch: refs/heads/python-sdk Commit: 90db7908b807cb752c23c445b220b3d5dd08b36b Parents: 9a175a5 Author: Pablo Authored: Tue Nov 29 13:58:55 2016 -0800 Committer: Robert Bradshaw Committed: Fri Dec 2 22:15:41 2016 -0800 -- .travis.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/90db7908/.travis.yml -- diff --git a/.travis.yml b/.travis.yml index 3080341..470d2fc 100644 --- a/.travis.yml +++ b/.travis.yml @@ -59,7 +59,7 @@ before_install: install: - if [ ! "$TEST_PYTHON" ]; then travis_retry mvn -B install clean -U -DskipTests=true; fi - if [ "$TEST_PYTHON" ] && pip list | grep tox; then TOX_FILE=`which tox` ; export TOX_HOME=`dirname $TOX_FILE`; fi - - if [ "$TEST_PYTHON" ] && ! pip list | grep tox; then travis_retry pip install tox --user `whoami`; fi + - if [ "$TEST_PYTHON" ] && ! pip list | grep tox; then travis_retry pip install tox --user; fi # Removing this here protects from inadvertent caching - rm -rf "$HOME/.m2/repository/org/apache/beam"
[1/2] incubator-beam git commit: Closes #1456
Repository: incubator-beam Updated Branches: refs/heads/python-sdk 9a175a5fe -> 7c5e4aa66 Closes #1456 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/7c5e4aa6 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/7c5e4aa6 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/7c5e4aa6 Branch: refs/heads/python-sdk Commit: 7c5e4aa66c3916b98cf7ecf932f11c2b057e1858 Parents: 9a175a5 90db790 Author: Robert Bradshaw Authored: Fri Dec 2 22:15:41 2016 -0800 Committer: Robert Bradshaw Committed: Fri Dec 2 22:15:41 2016 -0800 -- .travis.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --
[1/2] incubator-beam git commit: Call from_p12_keyfile() with the correct arguments.
Repository: incubator-beam Updated Branches: refs/heads/python-sdk 7c0bf25fa -> 9a175a5fe Call from_p12_keyfile() with the correct arguments. This code path is failing, because a wrong list of arguments is passed. Fixing that uncovered that oauth2client depends on pyOpenSSL for this call to work. I did not add this dependency to setup.py because, it does not install cleanly in all environments. As a workaround, users who would like to use this authentication method could first do a 'pip install pyOpenSSL'. I added a test, that skips if 'pyOpenSSL' is not installed. Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/f23b717e Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/f23b717e Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/f23b717e Branch: refs/heads/python-sdk Commit: f23b717e7e433c30c0acee0fea8d179e6343b8b8 Parents: 7c0bf25 Author: Ahmet Altay Authored: Fri Dec 2 13:38:31 2016 -0800 Committer: Robert Bradshaw Committed: Fri Dec 2 22:14:32 2016 -0800 -- sdks/python/apache_beam/internal/auth.py| 6 +-- sdks/python/apache_beam/internal/auth_test.py | 44 +++ sdks/python/apache_beam/tests/data/README.md| 20 + .../apache_beam/tests/data/privatekey.p12 | Bin 0 -> 2452 bytes sdks/python/setup.py| 2 +- 5 files changed, 68 insertions(+), 4 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/f23b717e/sdks/python/apache_beam/internal/auth.py -- diff --git a/sdks/python/apache_beam/internal/auth.py b/sdks/python/apache_beam/internal/auth.py index a043fcf..056f40c 100644 --- a/sdks/python/apache_beam/internal/auth.py +++ b/sdks/python/apache_beam/internal/auth.py @@ -133,6 +133,7 @@ def get_service_credentials(): 'https://www.googleapis.com/auth/datastore' ] +# TODO(BEAM-1068): Do not recreate options from sys.argv. # We are currently being run from the command line. google_cloud_options = PipelineOptions( sys.argv).view_as(GoogleCloudOptions) @@ -151,8 +152,8 @@ def get_service_credentials(): return ServiceAccountCredentials.from_p12_keyfile( google_cloud_options.service_account_name, google_cloud_options.service_account_key_file, -client_scopes, -user_agent=user_agent) +private_key_password=None, +scopes=client_scopes) except ImportError: with file(google_cloud_options.service_account_key_file) as f: service_account_key = f.read() @@ -162,7 +163,6 @@ def get_service_credentials(): service_account_key, client_scopes, user_agent=user_agent) - else: try: credentials = _GCloudWrapperCredentials(user_agent) http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/f23b717e/sdks/python/apache_beam/internal/auth_test.py -- diff --git a/sdks/python/apache_beam/internal/auth_test.py b/sdks/python/apache_beam/internal/auth_test.py new file mode 100644 index 000..dfd408e --- /dev/null +++ b/sdks/python/apache_beam/internal/auth_test.py @@ -0,0 +1,44 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +#http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# +"""Unit tests for the auth module.""" + +import os +import sys +import unittest + +import mock + +from apache_beam.internal import auth + + +class AuthTest(unittest.TestCase): + + def test_create_application_client(self): +try: + test_args = [ + 'test', '--service_account_name', 'abc', '--service_account_key_file', + os.path.join( + os.path.dirname(__file__), '..', 'tests/data/privatekey.p12')] + with mock.patch.object(sys, 'argv', test_args): +credentials = auth.get_service_credentials() +self.assertIsNotNone(credentials) +except NotImplementedError: + self.skipTest('service account tests require pyOpe
[2/2] incubator-beam git commit: Closes #1491
Closes #1491 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/9a175a5f Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/9a175a5f Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/9a175a5f Branch: refs/heads/python-sdk Commit: 9a175a5fe17d087f2c3ca0ff8e6d53faad6beab4 Parents: 7c0bf25 f23b717 Author: Robert Bradshaw Authored: Fri Dec 2 22:14:33 2016 -0800 Committer: Robert Bradshaw Committed: Fri Dec 2 22:14:33 2016 -0800 -- sdks/python/apache_beam/internal/auth.py| 6 +-- sdks/python/apache_beam/internal/auth_test.py | 44 +++ sdks/python/apache_beam/tests/data/README.md| 20 + .../apache_beam/tests/data/privatekey.p12 | Bin 0 -> 2452 bytes sdks/python/setup.py| 2 +- 5 files changed, 68 insertions(+), 4 deletions(-) --
[1/2] incubator-beam git commit: Add labels to lambdas in write finalization
Repository: incubator-beam Updated Branches: refs/heads/python-sdk 72fa21f98 -> 7c0bf25fa Add labels to lambdas in write finalization Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/51e97d4a Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/51e97d4a Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/51e97d4a Branch: refs/heads/python-sdk Commit: 51e97d4a8d3f25608b6ee80f57c973186798d54f Parents: 72fa21f Author: Charles Chen Authored: Fri Dec 2 15:17:55 2016 -0800 Committer: Robert Bradshaw Committed: Fri Dec 2 22:12:35 2016 -0800 -- sdks/python/apache_beam/io/iobase.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/51e97d4a/sdks/python/apache_beam/io/iobase.py -- diff --git a/sdks/python/apache_beam/io/iobase.py b/sdks/python/apache_beam/io/iobase.py index b7cac3e..fd6ae57 100644 --- a/sdks/python/apache_beam/io/iobase.py +++ b/sdks/python/apache_beam/io/iobase.py @@ -767,10 +767,10 @@ class WriteImpl(ptransform.PTransform): write_result_coll = (pcoll | core.ParDo('write_bundles', _WriteBundleDoFn(), self.sink, AsSingleton(init_result_coll)) - | core.Map(lambda x: (None, x)) + | core.Map('pair', lambda x: (None, x)) | core.WindowInto(window.GlobalWindows()) | core.GroupByKey() - | core.FlatMap(lambda x: x[1])) + | core.FlatMap('extract', lambda x: x[1])) return do_once | core.FlatMap( 'finalize_write', _finalize_write,
[2/2] incubator-beam git commit: Closes #1496
Closes #1496 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/7c0bf25f Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/7c0bf25f Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/7c0bf25f Branch: refs/heads/python-sdk Commit: 7c0bf25fadbe2e74ab62c87c90111b8b7c34e297 Parents: 72fa21f 51e97d4 Author: Robert Bradshaw Authored: Fri Dec 2 22:12:36 2016 -0800 Committer: Robert Bradshaw Committed: Fri Dec 2 22:12:36 2016 -0800 -- sdks/python/apache_beam/io/iobase.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --
[1/2] incubator-beam git commit: Make the legacy SQL flag consistent between Java and Python
Repository: incubator-beam Updated Branches: refs/heads/python-sdk 8365b6838 -> 72fa21f98 Make the legacy SQL flag consistent between Java and Python Renamed the BigQuery use_legacy_sql parameter to use_standard_sql. Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/72721031 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/72721031 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/72721031 Branch: refs/heads/python-sdk Commit: 727210318404a585bb7742591ade0a09ccc20444 Parents: 8365b68 Author: Sourabh Bajaj Authored: Fri Dec 2 16:45:19 2016 -0800 Committer: Robert Bradshaw Committed: Fri Dec 2 22:10:21 2016 -0800 -- sdks/python/apache_beam/examples/snippets/snippets.py | 2 +- sdks/python/apache_beam/io/bigquery.py| 9 + sdks/python/apache_beam/io/bigquery_test.py | 4 ++-- 3 files changed, 8 insertions(+), 7 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/72721031/sdks/python/apache_beam/examples/snippets/snippets.py -- diff --git a/sdks/python/apache_beam/examples/snippets/snippets.py b/sdks/python/apache_beam/examples/snippets/snippets.py index 580a3d7..6dcf05e 100644 --- a/sdks/python/apache_beam/examples/snippets/snippets.py +++ b/sdks/python/apache_beam/examples/snippets/snippets.py @@ -962,7 +962,7 @@ def model_bigqueryio(): 'ReadYearAndTemp', beam.io.BigQuerySource( query='SELECT year, mean_temp FROM `samples.weather_stations`', - use_legacy_sql=False)) + use_standard_sql=True)) # [END model_bigqueryio_query_standard_sql] # [START model_bigqueryio_schema] http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/72721031/sdks/python/apache_beam/io/bigquery.py -- diff --git a/sdks/python/apache_beam/io/bigquery.py b/sdks/python/apache_beam/io/bigquery.py index 0885e3a..ce75e10 100644 --- a/sdks/python/apache_beam/io/bigquery.py +++ b/sdks/python/apache_beam/io/bigquery.py @@ -323,7 +323,7 @@ class BigQuerySource(dataflow_io.NativeSource): """A source based on a BigQuery table.""" def __init__(self, table=None, dataset=None, project=None, query=None, - validate=False, coder=None, use_legacy_sql=True): + validate=False, coder=None, use_standard_sql=False): """Initialize a BigQuerySource. Args: @@ -351,8 +351,8 @@ class BigQuerySource(dataflow_io.NativeSource): in a file as a JSON serialized dictionary. This argument needs a value only in special cases when returning table rows as dictionaries is not desirable. - useLegacySql: Specifies whether to use BigQuery's legacy -SQL dialect for this query. The default value is true. If set to false, + use_standard_sql: Specifies whether to use BigQuery's standard +SQL dialect for this query. The default value is False. If set to True, the query will use BigQuery's updated SQL dialect with improved standards compliance. This parameter is ignored for table inputs. @@ -374,7 +374,8 @@ class BigQuerySource(dataflow_io.NativeSource): self.use_legacy_sql = True else: self.query = query - self.use_legacy_sql = use_legacy_sql + # TODO(BEAM-1082): Change the internal flag to be standard_sql + self.use_legacy_sql = not use_standard_sql self.table_reference = None self.validate = validate http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/72721031/sdks/python/apache_beam/io/bigquery_test.py -- diff --git a/sdks/python/apache_beam/io/bigquery_test.py b/sdks/python/apache_beam/io/bigquery_test.py index e263e13..a2cf947 100644 --- a/sdks/python/apache_beam/io/bigquery_test.py +++ b/sdks/python/apache_beam/io/bigquery_test.py @@ -199,7 +199,7 @@ class TestBigQuerySource(unittest.TestCase): hc.assert_that(dd.items, hc.contains_inanyorder(*expected_items)) def test_specify_query_sql_format(self): -source = beam.io.BigQuerySource(query='my_query', use_legacy_sql=False) +source = beam.io.BigQuerySource(query='my_query', use_standard_sql=True) self.assertEqual(source.query, 'my_query') self.assertFalse(source.use_legacy_sql) @@ -371,7 +371,7 @@ class TestBigQueryReader(unittest.TestCase): jobComplete=True, rows=table_rows, schema=schema) actual_rows = [] with beam.io.BigQuerySource( -query='query', use_legacy_sql=False).reader(client) as reader: +query='query', use_standard_sql=True).reader(client) as reader: for row in reader: actual_rows.append(row)
[2/2] incubator-beam git commit: Closes #1497
Closes #1497 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/72fa21f9 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/72fa21f9 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/72fa21f9 Branch: refs/heads/python-sdk Commit: 72fa21f98527e57cd5c7fad3977c95d7c994325e Parents: 8365b68 7272103 Author: Robert Bradshaw Authored: Fri Dec 2 22:10:22 2016 -0800 Committer: Robert Bradshaw Committed: Fri Dec 2 22:10:22 2016 -0800 -- sdks/python/apache_beam/examples/snippets/snippets.py | 2 +- sdks/python/apache_beam/io/bigquery.py| 9 + sdks/python/apache_beam/io/bigquery_test.py | 4 ++-- 3 files changed, 8 insertions(+), 7 deletions(-) --
[2/2] incubator-beam git commit: Closes #1494
Closes #1494 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/8365b683 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/8365b683 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/8365b683 Branch: refs/heads/python-sdk Commit: 8365b6838eda6dcef39097ab19b85b9af270914f Parents: fd6a52c 16ffdb2 Author: Robert Bradshaw Authored: Fri Dec 2 22:06:42 2016 -0800 Committer: Robert Bradshaw Committed: Fri Dec 2 22:06:42 2016 -0800 -- sdks/python/apache_beam/examples/snippets/snippets.py | 2 +- sdks/python/apache_beam/internal/apiclient_test.py | 1 + sdks/python/apache_beam/io/datastore/v1/datastoreio.py | 10 +++--- sdks/python/apache_beam/io/datastore/v1/helper.py | 4 +++- 4 files changed, 8 insertions(+), 9 deletions(-) --
[1/2] incubator-beam git commit: Fix auth related unit test failures
Repository: incubator-beam Updated Branches: refs/heads/python-sdk fd6a52c15 -> 8365b6838 Fix auth related unit test failures Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/16ffdb25 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/16ffdb25 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/16ffdb25 Branch: refs/heads/python-sdk Commit: 16ffdb25f6029c4bee71f035d8d9747f6330ec9f Parents: fd6a52c Author: Vikas Kedigehalli Authored: Fri Dec 2 14:13:31 2016 -0800 Committer: Robert Bradshaw Committed: Fri Dec 2 22:06:41 2016 -0800 -- sdks/python/apache_beam/examples/snippets/snippets.py | 2 +- sdks/python/apache_beam/internal/apiclient_test.py | 1 + sdks/python/apache_beam/io/datastore/v1/datastoreio.py | 10 +++--- sdks/python/apache_beam/io/datastore/v1/helper.py | 4 +++- 4 files changed, 8 insertions(+), 9 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/16ffdb25/sdks/python/apache_beam/examples/snippets/snippets.py -- diff --git a/sdks/python/apache_beam/examples/snippets/snippets.py b/sdks/python/apache_beam/examples/snippets/snippets.py index c2a047f..580a3d7 100644 --- a/sdks/python/apache_beam/examples/snippets/snippets.py +++ b/sdks/python/apache_beam/examples/snippets/snippets.py @@ -919,7 +919,7 @@ def model_datastoreio(): # [START model_datastoreio_write] p = beam.Pipeline(options=PipelineOptions()) musicians = p | 'Musicians' >> beam.Create( - ['Mozart', 'Chopin', 'Beethoven', 'Bach']) + ['Mozart', 'Chopin', 'Beethoven', 'Vivaldi']) def to_entity(content): entity = entity_pb2.Entity() http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/16ffdb25/sdks/python/apache_beam/internal/apiclient_test.py -- diff --git a/sdks/python/apache_beam/internal/apiclient_test.py b/sdks/python/apache_beam/internal/apiclient_test.py index 66cc8db..31b2dad 100644 --- a/sdks/python/apache_beam/internal/apiclient_test.py +++ b/sdks/python/apache_beam/internal/apiclient_test.py @@ -25,6 +25,7 @@ from apache_beam.internal import apiclient class UtilTest(unittest.TestCase): + @unittest.skip("Enable once BEAM-1080 is fixed.") def test_create_application_client(self): pipeline_options = PipelineOptions() apiclient.DataflowApplicationClient( http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/16ffdb25/sdks/python/apache_beam/io/datastore/v1/datastoreio.py -- diff --git a/sdks/python/apache_beam/io/datastore/v1/datastoreio.py b/sdks/python/apache_beam/io/datastore/v1/datastoreio.py index 054002f..20466b9 100644 --- a/sdks/python/apache_beam/io/datastore/v1/datastoreio.py +++ b/sdks/python/apache_beam/io/datastore/v1/datastoreio.py @@ -22,7 +22,6 @@ import logging from google.datastore.v1 import datastore_pb2 from googledatastore import helper as datastore_helper -from apache_beam.internal import auth from apache_beam.io.datastore.v1 import helper from apache_beam.io.datastore.v1 import query_splitter from apache_beam.transforms import Create @@ -154,8 +153,7 @@ class ReadFromDatastore(PTransform): self._num_splits = num_splits def start_bundle(self, context): - self._datastore = helper.get_datastore(self._project, - auth.get_service_credentials()) + self._datastore = helper.get_datastore(self._project) def process(self, p_context, *args, **kwargs): # distinct key to be used to group query splits. @@ -210,8 +208,7 @@ class ReadFromDatastore(PTransform): self._datastore = None def start_bundle(self, context): - self._datastore = helper.get_datastore(self._project, - auth.get_service_credentials()) + self._datastore = helper.get_datastore(self._project) def process(self, p_context, *args, **kwargs): query = p_context.element @@ -341,8 +338,7 @@ class _Mutate(PTransform): def start_bundle(self, context): self._mutations = [] - self._datastore = helper.get_datastore(self._project, - auth.get_service_credentials()) + self._datastore = helper.get_datastore(self._project) def process(self, context): self._mutations.append(context.element) http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/16ffdb25/sdks/python/apache_beam/io/datastore/v1/helper.py -- diff --git a/sdks/python/apache_beam/io/datastore/v1/helper.py b/sdks/python/apache_beam/io
[jira] [Commented] (BEAM-498) Make DoFnWithContext the new DoFn
[ https://issues.apache.org/jira/browse/BEAM-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15717483#comment-15717483 ] Daniel Halperin commented on BEAM-498: -- I think we have simply deleted the DatastoreWordCount example, so should just delete that link from the README.md. > Make DoFnWithContext the new DoFn > - > > Key: BEAM-498 > URL: https://issues.apache.org/jira/browse/BEAM-498 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Kenneth Knowles >Assignee: Kenneth Knowles > Labels: backward-incompatible > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (BEAM-1078) Modifying the links in JavaDocs to point to the Beam github repo rather than GCP
[ https://issues.apache.org/jira/browse/BEAM-1078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Halperin updated BEAM-1078: -- Issue Type: Improvement (was: Bug) > Modifying the links in JavaDocs to point to the Beam github repo rather than > GCP > - > > Key: BEAM-1078 > URL: https://issues.apache.org/jira/browse/BEAM-1078 > Project: Beam > Issue Type: Improvement >Reporter: Neelesh Srinivas Salian >Assignee: Neelesh Srinivas Salian > Fix For: Not applicable > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (BEAM-1078) Modifying the links in JavaDocs to point to the Beam github repo rather than GCP
[ https://issues.apache.org/jira/browse/BEAM-1078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Halperin updated BEAM-1078: -- Fix Version/s: Not applicable > Modifying the links in JavaDocs to point to the Beam github repo rather than > GCP > - > > Key: BEAM-1078 > URL: https://issues.apache.org/jira/browse/BEAM-1078 > Project: Beam > Issue Type: Improvement >Reporter: Neelesh Srinivas Salian >Assignee: Neelesh Srinivas Salian > Fix For: Not applicable > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[1/2] incubator-beam git commit: BEAM-1078: Changing the links from GCP to incubator-beam in the project
Repository: incubator-beam Updated Branches: refs/heads/master 8a7919b5a -> a13024c40 BEAM-1078: Changing the links from GCP to incubator-beam in the project Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/5a997a1a Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/5a997a1a Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/5a997a1a Branch: refs/heads/master Commit: 5a997a1a5d5d977bb84af1737db1128df916de7a Parents: 8a7919b Author: Neelesh Srinivas Salian Authored: Fri Dec 2 17:43:34 2016 -0800 Committer: Dan Halperin Committed: Fri Dec 2 21:44:33 2016 -0800 -- .travis/README.md | 2 +- .../java/org/apache/beam/examples/complete/README.md | 14 +++--- .../java/org/apache/beam/examples/cookbook/README.md | 14 +++--- 3 files changed, 15 insertions(+), 15 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/5a997a1a/.travis/README.md -- diff --git a/.travis/README.md b/.travis/README.md index e0c13f2..536692d 100644 --- a/.travis/README.md +++ b/.travis/README.md @@ -19,5 +19,5 @@ # Travis Scripts -This directory contains scripts used for [Travis CI](https://travis-ci.org/GoogleCloudPlatform/DataflowJavaSDK) +This directory contains scripts used for [Travis CI](https://travis-ci.org/apache/incubator-beam/) testing. http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/5a997a1a/examples/java/src/main/java/org/apache/beam/examples/complete/README.md -- diff --git a/examples/java/src/main/java/org/apache/beam/examples/complete/README.md b/examples/java/src/main/java/org/apache/beam/examples/complete/README.md index b98be7a..b0b6f9d 100644 --- a/examples/java/src/main/java/org/apache/beam/examples/complete/README.md +++ b/examples/java/src/main/java/org/apache/beam/examples/complete/README.md @@ -22,34 +22,34 @@ This directory contains end-to-end example pipelines that perform complex data processing tasks. They include: - https://github.com/GoogleCloudPlatform/DataflowJavaSDK/blob/master/examples/src/main/java/com/google/cloud/dataflow/examples/complete/AutoComplete.java";>AutoComplete + https://github.com/apache/incubator-beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/complete/AutoComplete.java";>AutoComplete — An example that computes the most popular hash tags for every prefix, which can be used for auto-completion. Demonstrates how to use the same pipeline in both streaming and batch, combiners, and composite transforms. - https://github.com/GoogleCloudPlatform/DataflowJavaSDK/blob/master/examples/src/main/java/com/google/cloud/dataflow/examples/complete/StreamingWordExtract.java";>StreamingWordExtract + https://github.com/apache/incubator-beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/complete/StreamingWordExtract.java";>StreamingWordExtract — A streaming pipeline example that inputs lines of text from a Cloud Pub/Sub topic, splits each line into individual words, capitalizes those words, and writes the output to a BigQuery table. - https://github.com/GoogleCloudPlatform/DataflowJavaSDK/blob/master/examples/src/main/java/com/google/cloud/dataflow/examples/complete/TfIdf.java";>TfIdf + https://github.com/apache/incubator-beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/complete/TfIdf.java";>TfIdf — An example that computes a basic TF-IDF search table for a directory or Cloud Storage prefix. Demonstrates joining data, side inputs, and logging. - https://github.com/GoogleCloudPlatform/DataflowJavaSDK/blob/master/examples/src/main/java/com/google/cloud/dataflow/examples/complete/TopWikipediaSessions.java";>TopWikipediaSessions + https://github.com/apache/incubator-beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/complete/TopWikipediaSessions.java";>TopWikipediaSessions — An example that reads Wikipedia edit data from Cloud Storage and computes the user with the longest string of edits separated by no more than an hour within each month. Demonstrates using Cloud Dataflow Windowing to perform time-based aggregations of data. - https://github.com/GoogleCloudPlatform/DataflowJavaSDK/blob/master/examples/src/main/java/com/google/cloud/dataflow/examples/complete/TrafficMaxLaneFlow.java";>TrafficMaxLaneFlow + https://github.com/apache/incubator-beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/complete/TrafficMaxLaneFlow.java";>TrafficMaxLaneFlow — A streaming Beam Example using BigQuery output in the traffic sensor domain. Demonstrates the Cloud Dataflo
[jira] [Commented] (BEAM-1078) Modifying the links in JavaDocs to point to the Beam github repo rather than GCP
[ https://issues.apache.org/jira/browse/BEAM-1078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15717478#comment-15717478 ] ASF GitHub Bot commented on BEAM-1078: -- Github user asfgit closed the pull request at: https://github.com/apache/incubator-beam/pull/1498 > Modifying the links in JavaDocs to point to the Beam github repo rather than > GCP > - > > Key: BEAM-1078 > URL: https://issues.apache.org/jira/browse/BEAM-1078 > Project: Beam > Issue Type: Bug >Reporter: Neelesh Srinivas Salian >Assignee: Neelesh Srinivas Salian > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] incubator-beam pull request #1498: [BEAM-1078] [Docs] - Modifying the links ...
Github user asfgit closed the pull request at: https://github.com/apache/incubator-beam/pull/1498 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[2/2] incubator-beam git commit: [BEAM-1078] Closes #1498
[BEAM-1078] Closes #1498 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/a13024c4 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/a13024c4 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/a13024c4 Branch: refs/heads/master Commit: a13024c40f73b6065ea4094d6e750b50c5027bb2 Parents: 8a7919b 5a997a1 Author: Dan Halperin Authored: Fri Dec 2 21:45:31 2016 -0800 Committer: Dan Halperin Committed: Fri Dec 2 21:45:31 2016 -0800 -- .travis/README.md | 2 +- .../java/org/apache/beam/examples/complete/README.md | 14 +++--- .../java/org/apache/beam/examples/cookbook/README.md | 14 +++--- 3 files changed, 15 insertions(+), 15 deletions(-) --
[jira] [Commented] (BEAM-1078) Modifying the links in JavaDocs to point to the Beam github repo rather than GCP
[ https://issues.apache.org/jira/browse/BEAM-1078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15717402#comment-15717402 ] ASF GitHub Bot commented on BEAM-1078: -- GitHub user nssalian opened a pull request: https://github.com/apache/incubator-beam/pull/1498 [BEAM-1078] [Docs] - Modifying the links in JavaDocs to point to the Beam github repo rather than GCP Be sure to do all of the following to help us incorporate your contribution quickly and easily: - [ ] Make sure the PR title is formatted like: https://issues.apache.org/jira/browse/BEAM-1078 Description: 1) Changed the links to point to the master of the incubator-beam github repository. 2) Haven't changed one of the links since the pull request is still pending (https://github.com/GoogleCloudPlatform/DataflowJavaSDK/pull/41) 3) Haven't changed the DatastoreWordCount.java link in the cookbook/README.md since BEAM-498 is still pending and it has replaced DatastoreWordCount with DoFn. Awaiting @kennknowles ' response on that one. - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable Travis-CI on your fork and ensure the whole test matrix passes). - [ ] Replace `` in the title with the actual Jira issue number, if there is one. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt). --- You can merge this pull request into a Git repository by running: $ git pull https://github.com/nssalian/incubator-beam BEAM-1078 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/1498.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1498 commit b26bbe331df7497c7857cb412a597a75ac828222 Author: Neelesh Srinivas Salian Date: 2016-12-03T01:43:34Z BEAM-1078: Changing the links from GCP to incubator-beam in the project commit 6573c30888c00186b511ff33de79cf370be111d6 Author: Neelesh Srinivas Salian Date: 2016-12-03T01:46:29Z BEAM-1078: Changed links to master branch > Modifying the links in JavaDocs to point to the Beam github repo rather than > GCP > - > > Key: BEAM-1078 > URL: https://issues.apache.org/jira/browse/BEAM-1078 > Project: Beam > Issue Type: Bug >Reporter: Neelesh Srinivas Salian >Assignee: Neelesh Srinivas Salian > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] incubator-beam pull request #1498: [BEAM-1078] [Docs] - Modifying the links ...
GitHub user nssalian opened a pull request: https://github.com/apache/incubator-beam/pull/1498 [BEAM-1078] [Docs] - Modifying the links in JavaDocs to point to the Beam github repo rather than GCP Be sure to do all of the following to help us incorporate your contribution quickly and easily: - [ ] Make sure the PR title is formatted like: https://issues.apache.org/jira/browse/BEAM-1078 Description: 1) Changed the links to point to the master of the incubator-beam github repository. 2) Haven't changed one of the links since the pull request is still pending (https://github.com/GoogleCloudPlatform/DataflowJavaSDK/pull/41) 3) Haven't changed the DatastoreWordCount.java link in the cookbook/README.md since BEAM-498 is still pending and it has replaced DatastoreWordCount with DoFn. Awaiting @kennknowles ' response on that one. - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable Travis-CI on your fork and ensure the whole test matrix passes). - [ ] Replace `` in the title with the actual Jira issue number, if there is one. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt). --- You can merge this pull request into a Git repository by running: $ git pull https://github.com/nssalian/incubator-beam BEAM-1078 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/1498.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1498 commit b26bbe331df7497c7857cb412a597a75ac828222 Author: Neelesh Srinivas Salian Date: 2016-12-03T01:43:34Z BEAM-1078: Changing the links from GCP to incubator-beam in the project commit 6573c30888c00186b511ff33de79cf370be111d6 Author: Neelesh Srinivas Salian Date: 2016-12-03T01:46:29Z BEAM-1078: Changed links to master branch --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (BEAM-682) Invoker Class should be created in Thread Context Classloader
[ https://issues.apache.org/jira/browse/BEAM-682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15717358#comment-15717358 ] Luke Cwik commented on BEAM-682: ReflectHelpers exposes a method which figures out the correct class loader to use: https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/util/common/ReflectHelpers.java#L224 You can't assume the current threads context class loader is always available since it can be null. http://stackoverflow.com/questions/3459216/can-the-thread-context-class-loader-be-null > Invoker Class should be created in Thread Context Classloader > - > > Key: BEAM-682 > URL: https://issues.apache.org/jira/browse/BEAM-682 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Affects Versions: 0.3.0-incubating >Reporter: Sumit Chawla >Assignee: Sumit Chawla > > As of now the InvokerClass is being loaded in wrong classloader. It should be > loaded into Thread.currentThread.getContextClassLoader() > https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/DoFnInvokers.java#L167 > {code} > Class> res = > (Class>) > unloaded > .load(DoFnInvokers.class.getClassLoader(), > ClassLoadingStrategy.Default.INJECTION) > .getLoaded(); > {code} > Fix > {code} > Class> res = > (Class>) > unloaded > .load(Thread.currentThread().getContextClassLoader(), > ClassLoadingStrategy.Default.INJECTION) > .getLoaded(); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (BEAM-1065) FileBasedSource: replace SeekableByteChannel with open(spec, startingPosition)
[ https://issues.apache.org/jira/browse/BEAM-1065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15717184#comment-15717184 ] Daniel Halperin commented on BEAM-1065: --- More impressions based on your list, and in reverse order :). 3. Yes, giving the source implementation the ability to control the starting office is a clear win, and can save a seek -- love it! However, this can (and should) be done independent of any changes to seekability. 2. Two concerns: A) I am not certain that a file system that cannot provide a seek can provide an open-at-a-nonzero-offset. So I'm not so convinced this is a trivial change. B) Just because the stream is opened at a specific place does not mean the user would not want to seek. For example, consider a very efficient reader for PDF files. They have an index at the beginning, so you know exactly where every page starts. Maybe the "open offset" would be the start of the file, and then we would immediate seek to the first page in range. So I think seekability is useful. Considering the combination of A/B, I would actually be supportive of the other direction -- just change the return value of {{open}} to {{SeekableByteChannel}} -- requiring that seek be supported. I'm not sure we have any examples of filesystems that don't support seeking in practice. 1. This is true, but (see below) I think that {{SeekableByteChannel}} is still important. > FileBasedSource: replace SeekableByteChannel with open(spec, startingPosition) > -- > > Key: BEAM-1065 > URL: https://issues.apache.org/jira/browse/BEAM-1065 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Pei He >Assignee: Pei He > > FileBasedReader should be able to open the file with the > Source.getStartOffset(), and then read forward to find the first input > element. > The benefits are: > 1. It is easier to implement a ReadableByteChannel. > 2. Dynamically splitting won't require file systems to support seeking. > 3. Doesn't need to seek to position twice, which is what current API does. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (BEAM-1064) Convert Jenkins jobs to DSL
[ https://issues.apache.org/jira/browse/BEAM-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15717116#comment-15717116 ] ASF GitHub Bot commented on BEAM-1064: -- Github user asfgit closed the pull request at: https://github.com/apache/incubator-beam/pull/1390 > Convert Jenkins jobs to DSL > --- > > Key: BEAM-1064 > URL: https://issues.apache.org/jira/browse/BEAM-1064 > Project: Beam > Issue Type: Improvement > Components: testing >Reporter: Jason Kuster >Assignee: Jason Kuster > Labels: jenkins > > Move Jenkins jobs to DSL. PR is here: > https://github.com/apache/incubator-beam/pull/1390 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] incubator-beam pull request #1390: [BEAM-1064] Jenkins DSL Config
Github user asfgit closed the pull request at: https://github.com/apache/incubator-beam/pull/1390 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[2/2] incubator-beam git commit: This closes #1390
This closes #1390 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/8a7919b5 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/8a7919b5 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/8a7919b5 Branch: refs/heads/master Commit: 8a7919b5a5faa23f85397be33495924130bdbca0 Parents: c840455 ad9ca45 Author: Davor Bonaci Authored: Fri Dec 2 17:40:20 2016 -0800 Committer: Davor Bonaci Committed: Fri Dec 2 17:40:20 2016 -0800 -- .jenkins/common_job_properties.groovy | 166 +++ ...job_beam_PostCommit_Java_MavenInstall.groovy | 42 + ...ommit_Java_RunnableOnService_Dataflow.groovy | 39 + ...stCommit_Java_RunnableOnService_Flink.groovy | 38 + ...ommit_Java_RunnableOnService_Gearpump.groovy | 41 + ...stCommit_Java_RunnableOnService_Spark.groovy | 38 + .../job_beam_PostCommit_Python_Verify.groovy| 37 + .../job_beam_PreCommit_Java_MavenInstall.groovy | 42 + .../job_beam_Release_NightlySnapshot.groovy | 46 + .jenkins/job_seed.groovy| 47 ++ 10 files changed, 536 insertions(+) --
[1/2] incubator-beam git commit: Initial commit of jobs
Repository: incubator-beam Updated Branches: refs/heads/master c84045573 -> 8a7919b5a Initial commit of jobs Signed-off-by: Jason Kuster Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/ad9ca455 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/ad9ca455 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/ad9ca455 Branch: refs/heads/master Commit: ad9ca455218f6dda32e31ee97fe721e8b4ad6c2a Parents: c840455 Author: Jason Kuster Authored: Mon Nov 14 15:35:40 2016 -0800 Committer: Davor Bonaci Committed: Fri Dec 2 17:40:12 2016 -0800 -- .jenkins/common_job_properties.groovy | 166 +++ ...job_beam_PostCommit_Java_MavenInstall.groovy | 42 + ...ommit_Java_RunnableOnService_Dataflow.groovy | 39 + ...stCommit_Java_RunnableOnService_Flink.groovy | 38 + ...ommit_Java_RunnableOnService_Gearpump.groovy | 41 + ...stCommit_Java_RunnableOnService_Spark.groovy | 38 + .../job_beam_PostCommit_Python_Verify.groovy| 37 + .../job_beam_PreCommit_Java_MavenInstall.groovy | 42 + .../job_beam_Release_NightlySnapshot.groovy | 46 + .jenkins/job_seed.groovy| 47 ++ 10 files changed, 536 insertions(+) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/ad9ca455/.jenkins/common_job_properties.groovy -- diff --git a/.jenkins/common_job_properties.groovy b/.jenkins/common_job_properties.groovy new file mode 100644 index 000..f3a8a07 --- /dev/null +++ b/.jenkins/common_job_properties.groovy @@ -0,0 +1,166 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +// Contains functions that help build Jenkins projects. Functions typically set +// common properties that are shared among all Jenkins projects. +class common_job_properties { + + // Sets common top-level job properties. + static def setTopLevelJobProperties(def context, + def default_branch = 'master', + def default_timeout = 100) { +// GitHub project. +context.properties { + githubProjectUrl('https://github.com/apache/incubator-beam/') +} + +// Set JDK version. +context.jdk('JDK 1.8 (latest)') + +// Restrict this project to run only on Jenkins executors dedicated to the +// Apache Beam project. +context.label('beam') + +// Discard old builds. Build records are only kept up to this number of days. +context.logRotator { + daysToKeep(14) +} + +// Source code management. +context.scm { + git { +remote { + url('https://github.com/apache/incubator-beam.git') + refspec('+refs/heads/*:refs/remotes/origin/* ' + + '+refs/pull/*:refs/remotes/origin/pr/*') +} +branch('${sha1}') +extensions { + cleanAfterCheckout() +} + } +} + +context.parameters { + // This is a recommended setup if you want to run the job manually. The + // ${sha1} parameter needs to be provided, and defaults to the main branch. + stringParam( + 'sha1', + default_branch, + 'Commit id or refname (eg: origin/pr/9/head) you want to build.') +} + +context.wrappers { + // Abort the build if it's stuck for more minutes than specified. + timeout { +absolute(default_timeout) +abortBuild() + } + + // Set SPARK_LOCAL_IP for spark tests. + environmentVariables { +env('SPARK_LOCAL_IP', '127.0.0.1') + } +} + } + + // Sets the pull request build trigger. + static def setPullRequestBuildTrigger(def context, +def commitStatusContext, +def successComment = '--none--') { +context.triggers { + githubPullRequest { +admins(['asfbot']) +useGitHubHooks() +orgWhitelist(['apache']) +
[jira] [Commented] (BEAM-551) Support Dynamic PipelineOptions
[ https://issues.apache.org/jira/browse/BEAM-551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15717092#comment-15717092 ] ASF GitHub Bot commented on BEAM-551: - Github user asfgit closed the pull request at: https://github.com/apache/incubator-beam/pull/1475 > Support Dynamic PipelineOptions > --- > > Key: BEAM-551 > URL: https://issues.apache.org/jira/browse/BEAM-551 > Project: Beam > Issue Type: New Feature > Components: beam-model >Reporter: Sam McVeety >Assignee: Sam McVeety >Priority: Minor > > During the graph construction phase, the given SDK generates an initial > execution graph for the program. At execution time, this graph is > executed, either locally or by a service. Currently, Beam only supports > parameterization at graph construction time. Both Flink and Spark supply > functionality that allows a pre-compiled job to be run without SDK > interaction with updated runtime parameters. > In its current incarnation, Dataflow can read values of PipelineOptions at > job submission time, but this requires the presence of an SDK to properly > encode these values into the job. We would like to build a common layer > into the Beam model so that these dynamic options can be properly provided > to jobs. > Please see > https://docs.google.com/document/d/1I-iIgWDYasb7ZmXbGBHdok_IK1r1YAJ90JG5Fz0_28o/edit > for the high-level model, and > https://docs.google.com/document/d/17I7HeNQmiIfOJi0aI70tgGMMkOSgGi8ZUH-MOnFatZ8/edit > for > the specific API proposal. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[2/2] incubator-beam git commit: Closes #1475
Closes #1475 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/c8404557 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/c8404557 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/c8404557 Branch: refs/heads/master Commit: c84045573948a7cba72e37e5e562c7f63375e9ea Parents: 26eb435 9a038c4 Author: Dan Halperin Authored: Fri Dec 2 17:25:36 2016 -0800 Committer: Dan Halperin Committed: Fri Dec 2 17:25:36 2016 -0800 -- .../org/apache/beam/sdk/io/FileBasedSink.java | 22 +-- .../java/org/apache/beam/sdk/io/TextIO.java | 28 .../java/org/apache/beam/sdk/io/XmlSink.java| 4 +-- .../org/apache/beam/sdk/io/XmlSinkTest.java | 6 ++--- 4 files changed, 42 insertions(+), 18 deletions(-) --
[GitHub] incubator-beam pull request #1475: [BEAM-551] Add TextIO.Write support for V...
Github user asfgit closed the pull request at: https://github.com/apache/incubator-beam/pull/1475 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[1/2] incubator-beam git commit: Add TextIO.Write support for runtime-valued output prefix
Repository: incubator-beam Updated Branches: refs/heads/master 26eb4354c -> c84045573 Add TextIO.Write support for runtime-valued output prefix * Updates to TextIO * Updates for FileBasedSink to support this change * Updates to other FileBasedSinks that do not yet support runtime values but need to be aware that values are now ValueProvider instead of String Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/9a038c4f Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/9a038c4f Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/9a038c4f Branch: refs/heads/master Commit: 9a038c4f3404a3707eca29c5e898014df7fafbf4 Parents: 26eb435 Author: Sam McVeety Authored: Wed Nov 30 14:06:59 2016 -0800 Committer: Dan Halperin Committed: Fri Dec 2 17:24:12 2016 -0800 -- .../org/apache/beam/sdk/io/FileBasedSink.java | 22 +-- .../java/org/apache/beam/sdk/io/TextIO.java | 28 .../java/org/apache/beam/sdk/io/XmlSink.java| 4 +-- .../org/apache/beam/sdk/io/XmlSinkTest.java | 6 ++--- 4 files changed, 42 insertions(+), 18 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/9a038c4f/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSink.java -- diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSink.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSink.java index 5375b90..1396ab6 100644 --- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSink.java +++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSink.java @@ -41,6 +41,8 @@ import javax.annotation.Nullable; import org.apache.beam.sdk.coders.Coder; import org.apache.beam.sdk.coders.SerializableCoder; import org.apache.beam.sdk.options.PipelineOptions; +import org.apache.beam.sdk.options.ValueProvider; +import org.apache.beam.sdk.options.ValueProvider.StaticValueProvider; import org.apache.beam.sdk.transforms.display.DisplayData; import org.apache.beam.sdk.util.IOChannelFactory; import org.apache.beam.sdk.util.IOChannelUtils; @@ -135,7 +137,7 @@ public abstract class FileBasedSink extends Sink { /** * Base filename for final output files. */ - protected final String baseOutputFilename; + protected final ValueProvider baseOutputFilename; /** * The extension to be used for the final output files. @@ -162,7 +164,8 @@ public abstract class FileBasedSink extends Sink { */ public FileBasedSink(String baseOutputFilename, String extension, WritableByteChannelFactory writableByteChannelFactory) { -this(baseOutputFilename, extension, ShardNameTemplate.INDEX_OF_MAX, writableByteChannelFactory); +this(StaticValueProvider.of(baseOutputFilename), extension, +ShardNameTemplate.INDEX_OF_MAX, writableByteChannelFactory); } /** @@ -173,7 +176,8 @@ public abstract class FileBasedSink extends Sink { * See {@link ShardNameTemplate} for a description of file naming templates. */ public FileBasedSink(String baseOutputFilename, String extension, String fileNamingTemplate) { -this(baseOutputFilename, extension, fileNamingTemplate, CompressionType.UNCOMPRESSED); +this(StaticValueProvider.of(baseOutputFilename), extension, fileNamingTemplate, +CompressionType.UNCOMPRESSED); } /** @@ -182,8 +186,8 @@ public abstract class FileBasedSink extends Sink { * * See {@link ShardNameTemplate} for a description of file naming templates. */ - public FileBasedSink(String baseOutputFilename, String extension, String fileNamingTemplate, - WritableByteChannelFactory writableByteChannelFactory) { + public FileBasedSink(ValueProvider baseOutputFilename, String extension, + String fileNamingTemplate, WritableByteChannelFactory writableByteChannelFactory) { this.writableByteChannelFactory = writableByteChannelFactory; this.baseOutputFilename = baseOutputFilename; if (!isNullOrEmpty(writableByteChannelFactory.getFilenameSuffix())) { @@ -198,7 +202,7 @@ public abstract class FileBasedSink extends Sink { * Returns the base output filename for this file based sink. */ public String getBaseOutputFilename() { -return baseOutputFilename; +return baseOutputFilename.get(); } @Override @@ -216,7 +220,9 @@ public abstract class FileBasedSink extends Sink { super.populateDisplayData(builder); String fileNamePattern = String.format("%s%s%s", -baseOutputFilename, fileNamingTemplate, getFileExtension(extension)); +baseOutputFilename.isAccessible() +? baseOutputFilename.get() : baseOutputFilename.toString(), +fileNamingTemplate, getFileExtensi
[jira] [Resolved] (BEAM-511) Fill in the contribute/technical-vision section of the website
[ https://issues.apache.org/jira/browse/BEAM-511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Halperin resolved BEAM-511. -- Resolution: Fixed Fix Version/s: Not applicable Page has been deleted, does not need populating. > Fill in the contribute/technical-vision section of the website > -- > > Key: BEAM-511 > URL: https://issues.apache.org/jira/browse/BEAM-511 > Project: Beam > Issue Type: Bug > Components: website >Reporter: Frances Perry >Assignee: Hadar Hod > Fix For: Not applicable > > > As per > https://docs.google.com/document/d/1-0jMv7NnYp0Ttt4voulUMwVe_qjBYeNMLm2LusYF3gQ/edit -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (BEAM-511) Fill in the contribute/technical-vision section of the website
[ https://issues.apache.org/jira/browse/BEAM-511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Halperin closed BEAM-511. > Fill in the contribute/technical-vision section of the website > -- > > Key: BEAM-511 > URL: https://issues.apache.org/jira/browse/BEAM-511 > Project: Beam > Issue Type: Bug > Components: website >Reporter: Frances Perry >Assignee: Hadar Hod > Fix For: Not applicable > > > As per > https://docs.google.com/document/d/1-0jMv7NnYp0Ttt4voulUMwVe_qjBYeNMLm2LusYF3gQ/edit -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (BEAM-1073) Staged websites have extra whitespace around links
[ https://issues.apache.org/jira/browse/BEAM-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Halperin resolved BEAM-1073. --- Resolution: Won't Fix Basically, Jason has looked into this and it's probably not worth fixing. > Staged websites have extra whitespace around links > -- > > Key: BEAM-1073 > URL: https://issues.apache.org/jira/browse/BEAM-1073 > Project: Beam > Issue Type: Bug > Components: website >Affects Versions: Not applicable >Reporter: Daniel Halperin >Assignee: Jason Kuster >Priority: Minor > Fix For: Not applicable > > > cc [~davor] [~frances] > e.g., > http://apache-beam-website-pull-requests.storage.googleapis.com/97/documentation/runners/flink/index.html > has this source when staged: > {code} > > > Programming Guide > > > {code} > but this source: > {code} > Programming Guide > {code} > when live. I assume this comes from the rewriting tool we use to make > directories work. > The former (space between end of {{Guide}} and {{}}) is what I assume > causes the visual effects. > NBD, but I spent a while figuring out why someone's PR caused this to happen > and then saw it disappear after merging and going live. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (BEAM-511) Fill in the contribute/technical-vision section of the website
[ https://issues.apache.org/jira/browse/BEAM-511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hadar Hod reassigned BEAM-511: -- Assignee: Hadar Hod > Fill in the contribute/technical-vision section of the website > -- > > Key: BEAM-511 > URL: https://issues.apache.org/jira/browse/BEAM-511 > Project: Beam > Issue Type: Bug > Components: website >Reporter: Frances Perry >Assignee: Hadar Hod > > As per > https://docs.google.com/document/d/1-0jMv7NnYp0Ttt4voulUMwVe_qjBYeNMLm2LusYF3gQ/edit -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (BEAM-194) Create a walkthrough of Beam examples in mobile gaming domain
[ https://issues.apache.org/jira/browse/BEAM-194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Halperin resolved BEAM-194. -- Resolution: Fixed Fix Version/s: Not applicable Closing per request from Hadar. Looks like open PR is an SDK change, not a website change, so this seems to make sense. > Create a walkthrough of Beam examples in mobile gaming domain > - > > Key: BEAM-194 > URL: https://issues.apache.org/jira/browse/BEAM-194 > Project: Beam > Issue Type: Task > Components: website >Reporter: Devin Donnelly >Assignee: Hadar Hod > Fix For: Not applicable > > > The Beam SDKs provide a series of example pipelines in the mobile gaming > domain. The Dataflow documentation contains an detailed walkthrough of these > examples, explaining the use case, pipeline design, and some of the code. > Port these examples to the Beam website for Beam users. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (BEAM-977) Fill in Python SDK page
[ https://issues.apache.org/jira/browse/BEAM-977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hadar Hod reassigned BEAM-977: -- Assignee: Hadar Hod > Fill in Python SDK page > --- > > Key: BEAM-977 > URL: https://issues.apache.org/jira/browse/BEAM-977 > Project: Beam > Issue Type: Task > Components: website >Reporter: Hadar Hod >Assignee: Hadar Hod > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (BEAM-498) Make DoFnWithContext the new DoFn
[ https://issues.apache.org/jira/browse/BEAM-498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15717038#comment-15717038 ] Neelesh Srinivas Salian commented on BEAM-498: -- Hi [~kenn], Since this JIRA ports DatastoreWordCount example to new DoFn, do you think we should update the README.md to DoFn: https://github.com/apache/incubator-beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/cookbook/README.md I noticed this while working on BEAM-1078. > Make DoFnWithContext the new DoFn > - > > Key: BEAM-498 > URL: https://issues.apache.org/jira/browse/BEAM-498 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Kenneth Knowles >Assignee: Kenneth Knowles > Labels: backward-incompatible > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (BEAM-509) Fill in the Additonal Resources page of the website
[ https://issues.apache.org/jira/browse/BEAM-509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hadar Hod reassigned BEAM-509: -- Assignee: Hadar Hod > Fill in the Additonal Resources page of the website > --- > > Key: BEAM-509 > URL: https://issues.apache.org/jira/browse/BEAM-509 > Project: Beam > Issue Type: Task > Components: website >Reporter: Frances Perry >Assignee: Hadar Hod > > As per > https://docs.google.com/document/d/1-0jMv7NnYp0Ttt4voulUMwVe_qjBYeNMLm2LusYF3gQ/edit > Do a nicer curation of great Beam articles, videos, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (BEAM-665) Copy prose (translate html to md, remove Dataflow references, etc.)
[ https://issues.apache.org/jira/browse/BEAM-665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hadar Hod closed BEAM-665. -- Resolution: Fixed Fix Version/s: Not applicable > Copy prose (translate html to md, remove Dataflow references, etc.) > --- > > Key: BEAM-665 > URL: https://issues.apache.org/jira/browse/BEAM-665 > Project: Beam > Issue Type: Sub-task > Components: website >Reporter: Hadar Hod >Assignee: Hadar Hod > Fix For: Not applicable > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (BEAM-1082) Use use_standard_sql flag everywhere instead of use_legacy_sql
Sourabh Bajaj created BEAM-1082: --- Summary: Use use_standard_sql flag everywhere instead of use_legacy_sql Key: BEAM-1082 URL: https://issues.apache.org/jira/browse/BEAM-1082 Project: Beam Issue Type: Improvement Components: sdk-py Reporter: Sourabh Bajaj Assignee: Sourabh Bajaj -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (BEAM-906) Fill in the get-started/downloads portion of the website
[ https://issues.apache.org/jira/browse/BEAM-906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hadar Hod closed BEAM-906. -- > Fill in the get-started/downloads portion of the website > > > Key: BEAM-906 > URL: https://issues.apache.org/jira/browse/BEAM-906 > Project: Beam > Issue Type: Task > Components: website >Reporter: Hadar Hod >Assignee: Hadar Hod > Fix For: Not applicable > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] incubator-beam-site pull request #101: Additional Resources page
GitHub user hadarhg opened a pull request: https://github.com/apache/incubator-beam-site/pull/101 Additional Resources page - [ ] Update descriptions for each section - [ ] Add subtitle/description for each resource You can merge this pull request into a Git repository by running: $ git pull https://github.com/hadarhg/incubator-beam-site additional-resources Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam-site/pull/101.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #101 commit 32a3fafc7dc2e3c6e0872e5a2b82fa7fbb54cd3f Author: Hadar Hod Date: 2016-12-01T22:58:23Z Update Additional Resources page commit 96751819a717992fd76bb95bd06116ccd59c5879 Author: Hadar Hod Date: 2016-12-03T00:37:24Z Update Additional Resources page --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-beam pull request #1497: Make the legacy SQL flag consistent betwe...
GitHub user sb2nov opened a pull request: https://github.com/apache/incubator-beam/pull/1497 Make the legacy SQL flag consistent between Java and Python Be sure to do all of the following to help us incorporate your contribution quickly and easily: - [ ] Make sure the PR title is formatted like: `[BEAM-] Description of pull request` - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable Travis-CI on your fork and ensure the whole test matrix passes). - [ ] Replace `` in the title with the actual Jira issue number, if there is one. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt). --- R: @chamikaramj @aaltay PTAL You can merge this pull request into a Git repository by running: $ git pull https://github.com/sb2nov/incubator-beam BEAM-consistent-flag-bw-java-py Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/1497.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1497 commit cfc0ba009316671c883b1f383395d362b2dee796 Author: Sourabh Bajaj Date: 2016-12-03T00:45:19Z Make the legacy SQL flag consistent between Java and Python --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Resolved] (BEAM-906) Fill in the get-started/downloads portion of the website
[ https://issues.apache.org/jira/browse/BEAM-906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Halperin resolved BEAM-906. -- Resolution: Fixed Fix Version/s: Not applicable Done as part of https://github.com/apache/incubator-beam-site/pull/85 > Fill in the get-started/downloads portion of the website > > > Key: BEAM-906 > URL: https://issues.apache.org/jira/browse/BEAM-906 > Project: Beam > Issue Type: Task > Components: website >Reporter: Hadar Hod >Assignee: Hadar Hod > Fix For: Not applicable > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] incubator-beam pull request #1229: [Ignore] Add a template for PubSub -> Big...
Github user sammcveety closed the pull request at: https://github.com/apache/incubator-beam/pull/1229 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Resolved] (BEAM-1079) Validation speed for really large number of files
[ https://issues.apache.org/jira/browse/BEAM-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Halperin resolved BEAM-1079. --- Resolution: Fixed Fix Version/s: Not applicable > Validation speed for really large number of files > - > > Key: BEAM-1079 > URL: https://issues.apache.org/jira/browse/BEAM-1079 > Project: Beam > Issue Type: Improvement > Components: sdk-py >Reporter: Sourabh Bajaj >Assignee: Sourabh Bajaj >Priority: Trivial > Fix For: Not applicable > > > Filebased source during validation at pipeline creation does a full glob when > it only needs to validate atleast one file. So create a limit parameter to > make this faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] incubator-beam pull request #1431: [BEAM-1079] Improve validation speed for ...
Github user sb2nov closed the pull request at: https://github.com/apache/incubator-beam/pull/1431 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (BEAM-1079) Validation speed for really large number of files
[ https://issues.apache.org/jira/browse/BEAM-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15716937#comment-15716937 ] ASF GitHub Bot commented on BEAM-1079: -- Github user sb2nov closed the pull request at: https://github.com/apache/incubator-beam/pull/1431 > Validation speed for really large number of files > - > > Key: BEAM-1079 > URL: https://issues.apache.org/jira/browse/BEAM-1079 > Project: Beam > Issue Type: Improvement > Components: sdk-py >Reporter: Sourabh Bajaj >Assignee: Sourabh Bajaj >Priority: Trivial > > Filebased source during validation at pipeline creation does a full glob when > it only needs to validate atleast one file. So create a limit parameter to > make this faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[2/2] incubator-beam git commit: Closes #1431
Closes #1431 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/fd6a52c1 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/fd6a52c1 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/fd6a52c1 Branch: refs/heads/python-sdk Commit: fd6a52c15df5741d6b6661ea98c680a94775f7f9 Parents: 2363ee5 1688690 Author: Dan Halperin Authored: Fri Dec 2 16:13:28 2016 -0800 Committer: Dan Halperin Committed: Fri Dec 2 16:13:28 2016 -0800 -- sdks/python/apache_beam/io/filebasedsource.py | 3 ++- sdks/python/apache_beam/io/fileio.py | 7 --- sdks/python/apache_beam/io/gcsio.py | 6 -- sdks/python/apache_beam/io/gcsio_test.py | 7 +++ 4 files changed, 17 insertions(+), 6 deletions(-) --
[1/2] incubator-beam git commit: Do not need to list all files in GCS for validation. Add limit field to fileIO
Repository: incubator-beam Updated Branches: refs/heads/python-sdk 2363ee510 -> fd6a52c15 Do not need to list all files in GCS for validation. Add limit field to fileIO Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/16886904 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/16886904 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/16886904 Branch: refs/heads/python-sdk Commit: 16886904df9fd1d3f92e1f7aabd134a28d6c1c00 Parents: 2363ee5 Author: Sourabh Bajaj Authored: Fri Dec 2 13:56:42 2016 -0800 Committer: Sourabh Bajaj Committed: Fri Dec 2 13:56:42 2016 -0800 -- sdks/python/apache_beam/io/filebasedsource.py | 3 ++- sdks/python/apache_beam/io/fileio.py | 7 --- sdks/python/apache_beam/io/gcsio.py | 6 -- sdks/python/apache_beam/io/gcsio_test.py | 7 +++ 4 files changed, 17 insertions(+), 6 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/16886904/sdks/python/apache_beam/io/filebasedsource.py -- diff --git a/sdks/python/apache_beam/io/filebasedsource.py b/sdks/python/apache_beam/io/filebasedsource.py index 14c2b06..8921801 100644 --- a/sdks/python/apache_beam/io/filebasedsource.py +++ b/sdks/python/apache_beam/io/filebasedsource.py @@ -175,7 +175,8 @@ class FileBasedSource(iobase.BoundedSource): def _validate(self): """Validate if there are actual files in the specified glob pattern """ -if len(fileio.ChannelFactory.glob(self._pattern)) <= 0: +# Limit the responses as we only want to check if something exists +if len(fileio.ChannelFactory.glob(self._pattern, limit=1)) <= 0: raise IOError( 'No files found based on the file pattern %s' % self._pattern) http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/16886904/sdks/python/apache_beam/io/fileio.py -- diff --git a/sdks/python/apache_beam/io/fileio.py b/sdks/python/apache_beam/io/fileio.py index c71a730..82e7813 100644 --- a/sdks/python/apache_beam/io/fileio.py +++ b/sdks/python/apache_beam/io/fileio.py @@ -588,11 +588,12 @@ class ChannelFactory(object): raise IOError(err) @staticmethod - def glob(path): + def glob(path, limit=None): if path.startswith('gs://'): - return gcsio.GcsIO().glob(path) + return gcsio.GcsIO().glob(path, limit) else: - return glob.glob(path) + files = glob.glob(path) + return files[:limit] @staticmethod def size_in_bytes(path): http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/16886904/sdks/python/apache_beam/io/gcsio.py -- diff --git a/sdks/python/apache_beam/io/gcsio.py b/sdks/python/apache_beam/io/gcsio.py index 9adb946..748465f 100644 --- a/sdks/python/apache_beam/io/gcsio.py +++ b/sdks/python/apache_beam/io/gcsio.py @@ -142,7 +142,7 @@ class GcsIO(object): @retry.with_exponential_backoff( retry_filter=retry.retry_on_server_errors_and_timeout_filter) - def glob(self, pattern): + def glob(self, pattern, limit=None): """Return the GCS path names matching a given path name pattern. Path name patterns are those recognized by fnmatch.fnmatch(). The path @@ -166,9 +166,11 @@ class GcsIO(object): object_paths.append('gs://%s/%s' % (item.bucket, item.name)) if response.nextPageToken: request.pageToken = response.nextPageToken +if limit is not None and len(object_paths) >= limit: + break else: break -return object_paths +return object_paths[:limit] @retry.with_exponential_backoff( retry_filter=retry.retry_on_server_errors_and_timeout_filter) http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/16886904/sdks/python/apache_beam/io/gcsio_test.py -- diff --git a/sdks/python/apache_beam/io/gcsio_test.py b/sdks/python/apache_beam/io/gcsio_test.py index 9d44e17..5af13c6 100644 --- a/sdks/python/apache_beam/io/gcsio_test.py +++ b/sdks/python/apache_beam/io/gcsio_test.py @@ -652,6 +652,13 @@ class TestGCSIO(unittest.TestCase): self.assertEqual( set(self.gcs.glob(file_pattern)), set(expected_file_names)) +# Check if limits are followed correctly +limit = 3 +for file_pattern, expected_object_names in test_cases: + expected_num_items = min(len(expected_object_names), limit) + self.assertEqual( + len(self.gcs.glob(file_pattern, limit)), expected_num_items) + def test_size_of_files_in_glob(self): bucket_name = 'gcsio-test' object_names = [
[jira] [Resolved] (BEAM-914) Update Beam Overview doc (landing page)
[ https://issues.apache.org/jira/browse/BEAM-914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Halperin resolved BEAM-914. -- Resolution: Fixed Fix Version/s: Not applicable https://github.com/apache/incubator-beam-site/pull/93 > Update Beam Overview doc (landing page) > --- > > Key: BEAM-914 > URL: https://issues.apache.org/jira/browse/BEAM-914 > Project: Beam > Issue Type: Task > Components: website >Reporter: Hadar Hod >Assignee: Hadar Hod > Fix For: Not applicable > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (BEAM-1081) annotations should support custom messages and classes
[ https://issues.apache.org/jira/browse/BEAM-1081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Altay updated BEAM-1081: -- Assignee: (was: Frances Perry) > annotations should support custom messages and classes > -- > > Key: BEAM-1081 > URL: https://issues.apache.org/jira/browse/BEAM-1081 > Project: Beam > Issue Type: Improvement > Components: sdk-py >Reporter: Ahmet Altay >Priority: Minor > Labels: starter > > Update > https://github.com/apache/incubator-beam/blob/python-sdk/sdks/python/apache_beam/utils/annotations.py > to add 2 new features: > 1. ability to customize message > 2. ability to tag classes (not only functions) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (BEAM-1081) annotations should support custom messages and classes
Ahmet Altay created BEAM-1081: - Summary: annotations should support custom messages and classes Key: BEAM-1081 URL: https://issues.apache.org/jira/browse/BEAM-1081 Project: Beam Issue Type: Improvement Components: sdk-py Reporter: Ahmet Altay Assignee: Frances Perry Priority: Minor Update https://github.com/apache/incubator-beam/blob/python-sdk/sdks/python/apache_beam/utils/annotations.py to add 2 new features: 1. ability to customize message 2. ability to tag classes (not only functions) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (BEAM-1060) Make DoFnTester use new DoFn
[ https://issues.apache.org/jira/browse/BEAM-1060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Halperin resolved BEAM-1060. --- Resolution: Fixed Fix Version/s: 0.4.0-incubating > Make DoFnTester use new DoFn > > > Key: BEAM-1060 > URL: https://issues.apache.org/jira/browse/BEAM-1060 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Eugene Kirpichov >Assignee: Eugene Kirpichov > Fix For: 0.4.0-incubating > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (BEAM-505) Fill in the documentation/runners/direct portion of the website
[ https://issues.apache.org/jira/browse/BEAM-505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Halperin resolved BEAM-505. -- Resolution: Fixed Fix Version/s: Not applicable > Fill in the documentation/runners/direct portion of the website > --- > > Key: BEAM-505 > URL: https://issues.apache.org/jira/browse/BEAM-505 > Project: Beam > Issue Type: Task > Components: website >Reporter: Frances Perry >Assignee: Melissa Pashniak > Fix For: Not applicable > > > As per > https://docs.google.com/document/d/1-0jMv7NnYp0Ttt4voulUMwVe_qjBYeNMLm2LusYF3gQ/edit. > Should be a landing page for the Direct runner -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (BEAM-904) Dataflow setup instructions
[ https://issues.apache.org/jira/browse/BEAM-904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Halperin resolved BEAM-904. -- Resolution: Fixed Fix Version/s: Not applicable > Dataflow setup instructions > --- > > Key: BEAM-904 > URL: https://issues.apache.org/jira/browse/BEAM-904 > Project: Beam > Issue Type: Sub-task > Components: website >Reporter: Frances Perry >Assignee: Melissa Pashniak > Fix For: Not applicable > > > As you are working on the Dataflow Runner page, please include the getting > started instructions, as I'm linking there from the quickstart. Thanks! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (BEAM-508) Fill in the documentation/runners/dataflow portion of the website
[ https://issues.apache.org/jira/browse/BEAM-508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Halperin resolved BEAM-508. -- Resolution: Fixed Fix Version/s: Not applicable > Fill in the documentation/runners/dataflow portion of the website > - > > Key: BEAM-508 > URL: https://issues.apache.org/jira/browse/BEAM-508 > Project: Beam > Issue Type: Bug > Components: website >Reporter: Frances Perry >Assignee: Melissa Pashniak > Fix For: Not applicable > > > As per > https://docs.google.com/document/d/1-0jMv7NnYp0Ttt4voulUMwVe_qjBYeNMLm2LusYF3gQ/edit. > Should be a landing page for Dataflow-runner-specific content -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (BEAM-505) Fill in the documentation/runners/direct portion of the website
[ https://issues.apache.org/jira/browse/BEAM-505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Halperin updated BEAM-505: - Issue Type: Task (was: Bug) > Fill in the documentation/runners/direct portion of the website > --- > > Key: BEAM-505 > URL: https://issues.apache.org/jira/browse/BEAM-505 > Project: Beam > Issue Type: Task > Components: website >Reporter: Frances Perry >Assignee: Melissa Pashniak > Fix For: Not applicable > > > As per > https://docs.google.com/document/d/1-0jMv7NnYp0Ttt4voulUMwVe_qjBYeNMLm2LusYF3gQ/edit. > Should be a landing page for the Direct runner -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (BEAM-277) Add Transforms Section
[ https://issues.apache.org/jira/browse/BEAM-277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Halperin resolved BEAM-277. -- Resolution: Fixed Fix Version/s: Not applicable > Add Transforms Section > -- > > Key: BEAM-277 > URL: https://issues.apache.org/jira/browse/BEAM-277 > Project: Beam > Issue Type: Sub-task > Components: website >Reporter: Devin Donnelly >Assignee: Melissa Pashniak > Fix For: Not applicable > > > Document general transforms usage and ParDo usage. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] incubator-beam pull request #1461: [BEAM-1060] Makes DoFnTester use new DoFn...
Github user asfgit closed the pull request at: https://github.com/apache/incubator-beam/pull/1461 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (BEAM-1060) Make DoFnTester use new DoFn
[ https://issues.apache.org/jira/browse/BEAM-1060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15716878#comment-15716878 ] ASF GitHub Bot commented on BEAM-1060: -- Github user asfgit closed the pull request at: https://github.com/apache/incubator-beam/pull/1461 > Make DoFnTester use new DoFn > > > Key: BEAM-1060 > URL: https://issues.apache.org/jira/browse/BEAM-1060 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Eugene Kirpichov >Assignee: Eugene Kirpichov > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[1/4] incubator-beam git commit: This closes #1461
Repository: incubator-beam Updated Branches: refs/heads/master e04cd47dd -> 26eb4354c This closes #1461 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/26eb4354 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/26eb4354 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/26eb4354 Branch: refs/heads/master Commit: 26eb4354cab72b7c482d8868c99eeb5933fd797e Parents: e04cd47 86173a8 Author: Kenneth Knowles Authored: Fri Dec 2 15:42:33 2016 -0800 Committer: Kenneth Knowles Committed: Fri Dec 2 15:42:33 2016 -0800 -- .../GroupAlsoByWindowsViaOutputBufferDoFn.java | 6 +- .../core/GroupByKeyViaGroupByKeyOnly.java | 22 +- .../core/GroupAlsoByWindowsProperties.java | 590 +++ .../beam/sdk/transforms/DoFnAdapters.java | 2 + .../apache/beam/sdk/transforms/DoFnTester.java | 279 + .../sdk/transforms/reflect/DoFnInvokers.java| 11 - .../beam/sdk/transforms/DoFnTesterTest.java | 38 +- 7 files changed, 536 insertions(+), 412 deletions(-) --
[2/4] incubator-beam git commit: Supports window parameter in DoFnTester
Supports window parameter in DoFnTester Also prohibits other parameters, and prohibits output from bundle methods (whereas previously it was silently dropped). Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/78ac009b Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/78ac009b Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/78ac009b Branch: refs/heads/master Commit: 78ac009be743a2e053580e9966f841174b636e88 Parents: 9645576 Author: Eugene Kirpichov Authored: Fri Dec 2 11:39:48 2016 -0800 Committer: Kenneth Knowles Committed: Fri Dec 2 15:42:33 2016 -0800 -- .../apache/beam/sdk/transforms/DoFnTester.java | 166 ++- .../beam/sdk/transforms/DoFnTesterTest.java | 34 2 files changed, 158 insertions(+), 42 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/78ac009b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFnTester.java -- diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFnTester.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFnTester.java index a9f93dd..7c1abef 100644 --- a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFnTester.java +++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFnTester.java @@ -38,13 +38,18 @@ import org.apache.beam.sdk.testing.ValueInSingleWindow; import org.apache.beam.sdk.transforms.Combine.CombineFn; import org.apache.beam.sdk.transforms.reflect.DoFnInvoker; import org.apache.beam.sdk.transforms.reflect.DoFnInvokers; +import org.apache.beam.sdk.transforms.reflect.DoFnSignature; +import org.apache.beam.sdk.transforms.reflect.DoFnSignatures; +import org.apache.beam.sdk.transforms.splittabledofn.RestrictionTracker; import org.apache.beam.sdk.transforms.windowing.BoundedWindow; import org.apache.beam.sdk.transforms.windowing.GlobalWindow; import org.apache.beam.sdk.transforms.windowing.PaneInfo; import org.apache.beam.sdk.util.SerializableUtils; +import org.apache.beam.sdk.util.Timer; import org.apache.beam.sdk.util.TimerInternals; import org.apache.beam.sdk.util.UserCodeException; import org.apache.beam.sdk.util.WindowedValue; +import org.apache.beam.sdk.util.WindowingInternals; import org.apache.beam.sdk.util.state.InMemoryStateInternals; import org.apache.beam.sdk.util.state.InMemoryTimerInternals; import org.apache.beam.sdk.util.state.StateInternals; @@ -84,6 +89,9 @@ public class DoFnTester implements AutoCloseable { /** * Returns a {@code DoFnTester} supporting unit-testing of the given * {@link DoFn}. By default, uses {@link CloningBehavior#CLONE_ONCE}. + * + * The only supported extra parameter of the {@link DoFn.ProcessElement} method is + * {@link BoundedWindow}. */ @SuppressWarnings("unchecked") public static DoFnTester of(DoFn fn) { @@ -236,7 +244,7 @@ public class DoFnTester implements AutoCloseable { if (state == State.UNINITIALIZED) { initializeState(); } -TestContext context = createContext(fn); +TestContext context = new TestContext(); context.setupDelegateAggregators(); // State and timer internals are per-bundle. stateInternals = InMemoryStateInternals.forKey(new Object()); @@ -262,7 +270,7 @@ public class DoFnTester implements AutoCloseable { /** * Calls the {@link DoFn.ProcessElement} method on the {@link DoFn} under test, in a * context where {@link DoFn.ProcessContext#element} returns the - * given element. + * given element and the element is in the global window. * * Will call {@link #startBundle} automatically, if it hasn't * already been called. @@ -277,26 +285,86 @@ public class DoFnTester implements AutoCloseable { /** * Calls {@link DoFn.ProcessElement} on the {@code DoFn} under test, in a * context where {@link DoFn.ProcessContext#element} returns the - * given element and timestamp. + * given element and timestamp and the element is in the global window. * * Will call {@link #startBundle} automatically, if it hasn't * already been called. - * - * If the input timestamp is {@literal null}, the minimum timestamp will be used. */ public void processTimestampedElement(TimestampedValue element) throws Exception { checkNotNull(element, "Timestamped element cannot be null"); +processWindowedElement( +element.getValue(), element.getTimestamp(), GlobalWindow.INSTANCE); + } + + /** + * Calls {@link DoFn.ProcessElement} on the {@code DoFn} under test, in a + * context where {@link DoFn.ProcessContext#element} returns the + * given element and timestamp and the element is in the given window. + * + * Will call
[4/4] incubator-beam git commit: Makes DoFnTester use new DoFn internally.
Makes DoFnTester use new DoFn internally. There were 2 remaining users of DoFnTester.of(OldDoFn): - SplittableParDo.ProcessElements: this is fixed in https://github.com/apache/incubator-beam/pull/1261 - GroupAlsoByWindowsProperties: this one is harder. Various GABWDoFn's use OldDoFn.windowingInternals, and we can't pass that through a new DoFn. So instead I removed usage of DoFnTester from GroupAlsoByWindowsProperties in favor of a tiny hand-coded solution. So after #1261 DoFnTester.of(OldDoFn) can be deleted. Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/96455768 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/96455768 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/96455768 Branch: refs/heads/master Commit: 96455768568616141a95833380f37c478a989397 Parents: e04cd47 Author: Eugene Kirpichov Authored: Fri Nov 18 13:10:22 2016 -0800 Committer: Kenneth Knowles Committed: Fri Dec 2 15:42:33 2016 -0800 -- .../GroupAlsoByWindowsViaOutputBufferDoFn.java | 6 +- .../core/GroupByKeyViaGroupByKeyOnly.java | 22 +- .../core/GroupAlsoByWindowsProperties.java | 590 +++ .../beam/sdk/transforms/DoFnAdapters.java | 2 + .../apache/beam/sdk/transforms/DoFnTester.java | 130 ++-- .../sdk/transforms/reflect/DoFnInvokers.java| 11 - .../beam/sdk/transforms/DoFnTesterTest.java | 4 +- 7 files changed, 394 insertions(+), 371 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/96455768/runners/core-java/src/main/java/org/apache/beam/runners/core/GroupAlsoByWindowsViaOutputBufferDoFn.java -- diff --git a/runners/core-java/src/main/java/org/apache/beam/runners/core/GroupAlsoByWindowsViaOutputBufferDoFn.java b/runners/core-java/src/main/java/org/apache/beam/runners/core/GroupAlsoByWindowsViaOutputBufferDoFn.java index f8f6207..b4b366c 100644 --- a/runners/core-java/src/main/java/org/apache/beam/runners/core/GroupAlsoByWindowsViaOutputBufferDoFn.java +++ b/runners/core-java/src/main/java/org/apache/beam/runners/core/GroupAlsoByWindowsViaOutputBufferDoFn.java @@ -21,7 +21,6 @@ import com.google.common.collect.Iterables; import java.util.List; import org.apache.beam.runners.core.triggers.ExecutableTriggerStateMachine; import org.apache.beam.runners.core.triggers.TriggerStateMachines; -import org.apache.beam.sdk.transforms.OldDoFn; import org.apache.beam.sdk.transforms.windowing.BoundedWindow; import org.apache.beam.sdk.util.SystemDoFnInternal; import org.apache.beam.sdk.util.WindowedValue; @@ -30,7 +29,6 @@ import org.apache.beam.sdk.util.state.InMemoryTimerInternals; import org.apache.beam.sdk.util.state.StateInternals; import org.apache.beam.sdk.util.state.StateInternalsFactory; import org.apache.beam.sdk.util.state.TimerCallback; -import org.apache.beam.sdk.values.KV; import org.joda.time.Instant; /** @@ -55,9 +53,7 @@ public class GroupAlsoByWindowsViaOutputBufferDoFn>>, KV>.ProcessContext c) - throws Exception { + public void processElement(ProcessContext c) throws Exception { K key = c.element().getKey(); // Used with Batch, we know that all the data is available for this key. We can't use the // timer manager from the context because it doesn't exist. So we create one and emulate the http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/96455768/runners/core-java/src/main/java/org/apache/beam/runners/core/GroupByKeyViaGroupByKeyOnly.java -- diff --git a/runners/core-java/src/main/java/org/apache/beam/runners/core/GroupByKeyViaGroupByKeyOnly.java b/runners/core-java/src/main/java/org/apache/beam/runners/core/GroupByKeyViaGroupByKeyOnly.java index 79d2252..43047ca 100644 --- a/runners/core-java/src/main/java/org/apache/beam/runners/core/GroupByKeyViaGroupByKeyOnly.java +++ b/runners/core-java/src/main/java/org/apache/beam/runners/core/GroupByKeyViaGroupByKeyOnly.java @@ -26,15 +26,13 @@ import java.util.List; import org.apache.beam.sdk.coders.Coder; import org.apache.beam.sdk.coders.IterableCoder; import org.apache.beam.sdk.coders.KvCoder; +import org.apache.beam.sdk.transforms.DoFn; import org.apache.beam.sdk.transforms.GroupByKey; -import org.apache.beam.sdk.transforms.OldDoFn; import org.apache.beam.sdk.transforms.PTransform; import org.apache.beam.sdk.transforms.ParDo; -import org.apache.beam.sdk.transforms.windowing.BoundedWindow; import org.apache.beam.sdk.util.WindowedValue; import org.apache.beam.sdk.util.WindowedValue.WindowedValueCoder; import org.apache.beam.sdk.util.WindowingStrategy; -import org.apache.beam.sdk.util.state.StateInternalsFactory; import org.apache.beam
[3/4] incubator-beam git commit: Removes DoFnTester.of(OldDoFn)
Removes DoFnTester.of(OldDoFn) Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/86173a83 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/86173a83 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/86173a83 Branch: refs/heads/master Commit: 86173a839f57cf7ed45566b380e557cf1defcba9 Parents: 78ac009 Author: Eugene Kirpichov Authored: Fri Dec 2 11:44:02 2016 -0800 Committer: Kenneth Knowles Committed: Fri Dec 2 15:42:33 2016 -0800 -- .../org/apache/beam/sdk/transforms/DoFnTester.java | 15 --- 1 file changed, 15 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/86173a83/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFnTester.java -- diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFnTester.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFnTester.java index 7c1abef..9f32aec 100644 --- a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFnTester.java +++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFnTester.java @@ -23,7 +23,6 @@ import static com.google.common.base.Preconditions.checkState; import com.google.common.base.Function; import com.google.common.base.MoreObjects; import com.google.common.collect.ImmutableList; -import com.google.common.collect.Iterables; import com.google.common.collect.Lists; import java.util.ArrayList; import java.util.Arrays; @@ -100,20 +99,6 @@ public class DoFnTester implements AutoCloseable { } /** - * Returns a {@code DoFnTester} supporting unit-testing of the given - * {@link OldDoFn}. - * - * @see #of(DoFn) - */ - @SuppressWarnings("unchecked") - @Deprecated - public static DoFnTester - of(OldDoFn fn) { -checkNotNull(fn, "fn can't be null"); -return new DoFnTester<>(fn.toDoFn()); - } - - /** * Registers the tuple of values of the side input {@link PCollectionView}s to * pass to the {@link DoFn} under test. *
[jira] [Resolved] (BEAM-239) Rename RemoveDuplicates to Distinct
[ https://issues.apache.org/jira/browse/BEAM-239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Halperin resolved BEAM-239. -- Resolution: Fixed Fix Version/s: 0.4.0-incubating > Rename RemoveDuplicates to Distinct > --- > > Key: BEAM-239 > URL: https://issues.apache.org/jira/browse/BEAM-239 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Jesse Anderson >Assignee: Neelesh Srinivas Salian >Priority: Minor > Labels: backward-incompatible, newbie, starter > Fix For: 0.4.0-incubating > > > I had a really tough time finding this transform in the docs. I suggest > changing this class' name to Distinct instead of RemoveDuplicates. At the > very least, the JavaDoc for RemoveDuplicates should have the word distinct in > it to make this more findable/searchable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (BEAM-879) Renaming DeDupExample to DistinctExample
[ https://issues.apache.org/jira/browse/BEAM-879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Halperin resolved BEAM-879. -- Resolution: Fixed Fix Version/s: 0.4.0-incubating > Renaming DeDupExample to DistinctExample > > > Key: BEAM-879 > URL: https://issues.apache.org/jira/browse/BEAM-879 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Neelesh Srinivas Salian >Assignee: Neelesh Srinivas Salian >Priority: Trivial > Fix For: 0.4.0-incubating > > > In BEAM-239, we renamed DeDupExampleTest to DistinctExampleTest. > Need to modify DeDupExample to DistinctExample as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (BEAM-879) Renaming DeDupExample to DistinctExample
[ https://issues.apache.org/jira/browse/BEAM-879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15716840#comment-15716840 ] ASF GitHub Bot commented on BEAM-879: - Github user asfgit closed the pull request at: https://github.com/apache/incubator-beam/pull/1489 > Renaming DeDupExample to DistinctExample > > > Key: BEAM-879 > URL: https://issues.apache.org/jira/browse/BEAM-879 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Neelesh Srinivas Salian >Assignee: Neelesh Srinivas Salian >Priority: Trivial > > In BEAM-239, we renamed DeDupExampleTest to DistinctExampleTest. > Need to modify DeDupExample to DistinctExample as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[1/2] incubator-beam git commit: BEAM-879: Changing DeDupExample to DistinctExample
Repository: incubator-beam Updated Branches: refs/heads/master 1abbb9007 -> e04cd47dd BEAM-879: Changing DeDupExample to DistinctExample Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/e3dca4ca Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/e3dca4ca Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/e3dca4ca Branch: refs/heads/master Commit: e3dca4cab6914166465c70f5f0b4be4f06ddd088 Parents: 1abbb90 Author: Neelesh Srinivas Salian Authored: Thu Dec 1 20:28:43 2016 -0800 Committer: Dan Halperin Committed: Fri Dec 2 15:20:16 2016 -0800 -- .../beam/examples/cookbook/DeDupExample.java| 96 .../beam/examples/cookbook/DistinctExample.java | 96 2 files changed, 96 insertions(+), 96 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/e3dca4ca/examples/java/src/main/java/org/apache/beam/examples/cookbook/DeDupExample.java -- diff --git a/examples/java/src/main/java/org/apache/beam/examples/cookbook/DeDupExample.java b/examples/java/src/main/java/org/apache/beam/examples/cookbook/DeDupExample.java deleted file mode 100644 index 34fb901..000 --- a/examples/java/src/main/java/org/apache/beam/examples/cookbook/DeDupExample.java +++ /dev/null @@ -1,96 +0,0 @@ -/* - * Licensed to the Apache Software Foundation (ASF) under one - * or more contributor license agreements. See the NOTICE file - * distributed with this work for additional information - * regarding copyright ownership. The ASF licenses this file - * to you under the Apache License, Version 2.0 (the - * "License"); you may not use this file except in compliance - * with the License. You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. - */ -package org.apache.beam.examples.cookbook; - -import org.apache.beam.sdk.Pipeline; -import org.apache.beam.sdk.io.TextIO; -import org.apache.beam.sdk.options.Default; -import org.apache.beam.sdk.options.DefaultValueFactory; -import org.apache.beam.sdk.options.Description; -import org.apache.beam.sdk.options.PipelineOptions; -import org.apache.beam.sdk.options.PipelineOptionsFactory; -import org.apache.beam.sdk.transforms.Distinct; -import org.apache.beam.sdk.util.gcsfs.GcsPath; - -/** - * This example uses as input Shakespeare's plays as plaintext files, and will remove any - * duplicate lines across all the files. (The output does not preserve any input order). - * - * Concepts: the Distinct transform, and how to wire transforms together. - * Demonstrates {@link org.apache.beam.sdk.io.TextIO.Read}/ - * {@link Distinct}/{@link org.apache.beam.sdk.io.TextIO.Write}. - * - * To execute this pipeline locally, specify a local output file or output prefix on GCS: - * --output=[YOUR_LOCAL_FILE | gs://YOUR_OUTPUT_PREFIX] - * - * To change the runner, specify: - * {@code - * --runner=YOUR_SELECTED_RUNNER - * } - * - * See examples/java/README.md for instructions about how to configure different runners. - * - * The input defaults to {@code gs://apache-beam-samples/shakespeare/*} and can be - * overridden with {@code --input}. - */ -public class DeDupExample { - - /** - * Options supported by {@link DeDupExample}. - * - * Inherits standard configuration options. - */ - private interface Options extends PipelineOptions { -@Description("Path to the directory or GCS prefix containing files to read from") -@Default.String("gs://apache-beam-samples/shakespeare/*") -String getInput(); -void setInput(String value); - -@Description("Path of the file to write to") -@Default.InstanceFactory(OutputFactory.class) -String getOutput(); -void setOutput(String value); - -/** Returns gs://${TEMP_LOCATION}/"deduped.txt". */ -class OutputFactory implements DefaultValueFactory { - @Override - public String create(PipelineOptions options) { -if (options.getTempLocation() != null) { - return GcsPath.fromUri(options.getTempLocation()) - .resolve("deduped.txt").toString(); -} else { - throw new IllegalArgumentException("Must specify --output or --tempLocation"); -} - } -} - } - - - public static void main(String[] args) - throws Exception { - -Options options = PipelineOptionsFactory.fromArgs(args).withValidation().as(Options.class); -
[GitHub] incubator-beam pull request #1489: [BEAM-879]: Renaming DeDupExample to Dist...
Github user asfgit closed the pull request at: https://github.com/apache/incubator-beam/pull/1489 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[2/2] incubator-beam git commit: Closes #1489
Closes #1489 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/e04cd47d Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/e04cd47d Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/e04cd47d Branch: refs/heads/master Commit: e04cd47ddf8fb5f04f1f684219724031179a55ec Parents: 1abbb90 e3dca4c Author: Dan Halperin Authored: Fri Dec 2 15:20:17 2016 -0800 Committer: Dan Halperin Committed: Fri Dec 2 15:20:17 2016 -0800 -- .../beam/examples/cookbook/DeDupExample.java| 96 .../beam/examples/cookbook/DistinctExample.java | 96 2 files changed, 96 insertions(+), 96 deletions(-) --
[GitHub] incubator-beam pull request #1496: Add labels to lambdas in write finalizati...
GitHub user charlesccychen opened a pull request: https://github.com/apache/incubator-beam/pull/1496 Add labels to lambdas in write finalization You can merge this pull request into a Git repository by running: $ git pull https://github.com/charlesccychen/incubator-beam iobase-labels Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/1496.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1496 commit 7d980e70312457556ed2ef3fd87b24f774804be6 Author: Charles Chen Date: 2016-12-02T23:17:55Z Add labels to lambdas in write finalization --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (BEAM-1067) apex.examples.WordCountTest.testWordCountExample may be flaky
[ https://issues.apache.org/jira/browse/BEAM-1067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15716750#comment-15716750 ] Stas Levin commented on BEAM-1067: -- Not sure, but {{org.apache.beam.runners.apex.translation.ParDoBoundTranslatorTest.testMultiOutputParDoWithSideInputs}} seems to fail [here and there as well|https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/5451/org.apache.beam$beam-runners-apex/testReport/org.apache.beam.runners.apex.translation/ParDoBoundTranslatorTest/testMultiOutputParDoWithSideInputs/]. > apex.examples.WordCountTest.testWordCountExample may be flaky > - > > Key: BEAM-1067 > URL: https://issues.apache.org/jira/browse/BEAM-1067 > Project: Beam > Issue Type: Bug > Components: runner-apex >Reporter: Stas Levin >Assignee: Thomas Weise > > Seems that > {{org.apache.beam.runners.apex.examples.WordCountTest.testWordCountExample}} > is flaky. > For example, > [this|https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/5408/org.apache.beam$beam-runners-apex/testReport/org.apache.beam.runners.apex.examples/WordCountTest/testWordCountExample/ > ] run failed although no changes were made in {{runner-apex}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (BEAM-646) Get runners out of the apply()
[ https://issues.apache.org/jira/browse/BEAM-646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15716726#comment-15716726 ] ASF GitHub Bot commented on BEAM-646: - GitHub user tgroh opened a pull request: https://github.com/apache/incubator-beam/pull/1495 [BEAM-646] Add DirectTestUtils Be sure to do all of the following to help us incorporate your contribution quickly and easily: - [ ] Make sure the PR title is formatted like: `[BEAM-] Description of pull request` - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable Travis-CI on your fork and ensure the whole test matrix passes). - [ ] Replace `` in the title with the actual Jira issue number, if there is one. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt). --- Add getGraph(Pipeline) and getProducer(PValue), which use the DirectGraphVisitor and DirectGraph methods to provide access to the producing AppliedPTransform. Remove getProducingTransformInternal from everywhere except DirectGraphVisitorTest This removes all remaining uses of `PValue.getProducingTransformInternal` from the `DirectRunner` You can merge this pull request into a Git repository by running: $ git pull https://github.com/tgroh/incubator-beam producers_consumers_as_datastructure Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/1495.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1495 commit 5dc2386be460e004108aeaf89e0b5a5e3c3e50bd Author: Thomas Groh Date: 2016-12-02T18:56:36Z Add DirectTestUtils Add getGraph(Pipeline) and getProducer(PValue), which use the DirectGraphVisitor and DirectGraph methods to provide access to the producing AppliedPTransform. Remove getProducingTransformInternal from everywhere except DirectGraphVisitorTest commit a7863cfbdfc8d2289fccdf5499fd245928a11a7d Author: Thomas Groh Date: 2016-12-02T22:26:04Z Remove getProducingTransformInternal from DirectGraphVisitorTest > Get runners out of the apply() > -- > > Key: BEAM-646 > URL: https://issues.apache.org/jira/browse/BEAM-646 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Kenneth Knowles >Assignee: Thomas Groh > > Right now, the runner intercepts calls to apply() and replaces transforms as > we go. This means that there is no "original" user graph. For portability and > misc architectural benefits, we would like to build the original graph first, > and have the runner override later. > Some runners already work in this manner, but we could integrate it more > smoothly, with more validation, via some handy APIs on e.g. the Pipeline > object. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] incubator-beam pull request #1495: [BEAM-646] Add DirectTestUtils
GitHub user tgroh opened a pull request: https://github.com/apache/incubator-beam/pull/1495 [BEAM-646] Add DirectTestUtils Be sure to do all of the following to help us incorporate your contribution quickly and easily: - [ ] Make sure the PR title is formatted like: `[BEAM-] Description of pull request` - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable Travis-CI on your fork and ensure the whole test matrix passes). - [ ] Replace `` in the title with the actual Jira issue number, if there is one. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt). --- Add getGraph(Pipeline) and getProducer(PValue), which use the DirectGraphVisitor and DirectGraph methods to provide access to the producing AppliedPTransform. Remove getProducingTransformInternal from everywhere except DirectGraphVisitorTest This removes all remaining uses of `PValue.getProducingTransformInternal` from the `DirectRunner` You can merge this pull request into a Git repository by running: $ git pull https://github.com/tgroh/incubator-beam producers_consumers_as_datastructure Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/1495.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1495 commit 5dc2386be460e004108aeaf89e0b5a5e3c3e50bd Author: Thomas Groh Date: 2016-12-02T18:56:36Z Add DirectTestUtils Add getGraph(Pipeline) and getProducer(PValue), which use the DirectGraphVisitor and DirectGraph methods to provide access to the producing AppliedPTransform. Remove getProducingTransformInternal from everywhere except DirectGraphVisitorTest commit a7863cfbdfc8d2289fccdf5499fd245928a11a7d Author: Thomas Groh Date: 2016-12-02T22:26:04Z Remove getProducingTransformInternal from DirectGraphVisitorTest --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (BEAM-485) Can't set Flink runner in code
[ https://issues.apache.org/jira/browse/BEAM-485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15716711#comment-15716711 ] Emanuele Cesena commented on BEAM-485: -- I didn’t really had the chance to try, but if it’s resolved, I trust that. -- Emanuele Cesena http://www.theneeds.com Il corpo non ha ideali > Can't set Flink runner in code > -- > > Key: BEAM-485 > URL: https://issues.apache.org/jira/browse/BEAM-485 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Emanuele Cesena > Fix For: Not applicable > > > Calling: > options.setRunner(FlinkRunner.class); > doesn't seem to properly set the runner. > Running --runner=FlinkRunner from the command line works. > Both approaches were working on 0.1.0, but options.setRunner doesn't seem to > work on master anymore. > I noticed there are tests that only cover the command line case: > https://github.com/apache/incubator-beam/blob/master/runners/flink/runner/src/test/java/org/apache/beam/runners/flink/FlinkRunnerRegistrarTest.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] incubator-beam pull request #1494: Fix auth related unit test failures
GitHub user vikkyrk opened a pull request: https://github.com/apache/incubator-beam/pull/1494 Fix auth related unit test failures Be sure to do all of the following to help us incorporate your contribution quickly and easily: - [ ] Make sure the PR title is formatted like: `[BEAM-] Description of pull request` - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable Travis-CI on your fork and ensure the whole test matrix passes). - [ ] Replace `` in the title with the actual Jira issue number, if there is one. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt). --- - Ensure that actual credentials are not fetched in unit testing. This fails when on machines where credentials are not available. You can merge this pull request into a Git repository by running: $ git pull https://github.com/vikkyrk/incubator-beam py_ds_fix Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/1494.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1494 commit c231217609aae29c82413eb36e93440dbfc4aeda Author: Vikas Kedigehalli Date: 2016-12-02T22:13:31Z Fix auth related unit test failures --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Created] (BEAM-1080) python sdk apiclient needs proper unit tests
Vikas Kedigehalli created BEAM-1080: --- Summary: python sdk apiclient needs proper unit tests Key: BEAM-1080 URL: https://issues.apache.org/jira/browse/BEAM-1080 Project: Beam Issue Type: New Feature Components: sdk-py Reporter: Vikas Kedigehalli Assignee: Frances Perry There is only one unit test right now that tries to fetch actual gcp credentials instead of mocking. This test fails when the credentials are not available on the machine in which it is running. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (BEAM-1079) Validation speed for really large number of files
[ https://issues.apache.org/jira/browse/BEAM-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sourabh Bajaj updated BEAM-1079: Description: Filebased source during validation at pipeline creation does a full glob when it only needs to validate atleast one file. So create a limit parameter to make this faster. > Validation speed for really large number of files > - > > Key: BEAM-1079 > URL: https://issues.apache.org/jira/browse/BEAM-1079 > Project: Beam > Issue Type: Improvement > Components: sdk-py >Reporter: Sourabh Bajaj >Assignee: Sourabh Bajaj >Priority: Trivial > > Filebased source during validation at pipeline creation does a full glob when > it only needs to validate atleast one file. So create a limit parameter to > make this faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (BEAM-986) ReduceFnRunner doesn't batch prefetching pane firings
[ https://issues.apache.org/jira/browse/BEAM-986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15716669#comment-15716669 ] ASF GitHub Bot commented on BEAM-986: - GitHub user scwhittle opened a pull request: https://github.com/apache/incubator-beam/pull/1493 [BEAM-986] Deprecate TimerCallback and InMemoryTimerInternals methods using it. Deprecate TimerCallback and InMemoryTimerInternals methods using it. Instead separate advancing watermarks and removing eligible timers. SDK changes necessary for improving ReduceFnRunner prefetching. Hi @kennknowles , can you please take a look? This is the sdk changes separated out from PR adding prefetching. You can merge this pull request into a Git repository by running: $ git pull https://github.com/scwhittle/incubator-beam rm_timer_callback Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/1493.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1493 commit 550c6b05b426714f291174323366f82d34afeab3 Author: Sam Whittle Date: 2016-12-02T06:51:37Z Deprecate TimerCallback and InMemoryTimerInternals methods using it. Instead separate advancing watermarks and removing eligible timers. > ReduceFnRunner doesn't batch prefetching pane firings > - > > Key: BEAM-986 > URL: https://issues.apache.org/jira/browse/BEAM-986 > Project: Beam > Issue Type: Bug > Components: runner-core >Reporter: Sam Whittle >Assignee: Sam Whittle > Original Estimate: 24h > Remaining Estimate: 24h > > Specifically > - in ProcessElements, if there are multiple windows to consider each is > processed sequentially with sequential state fetches instead of a bulk > prefetch > - onTimer method doesn't evaluate multiple timers at a time meaning that if > multiple timers are fired at once each is processed sequentially without > batched prefetching -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] incubator-beam pull request #1493: [BEAM-986] Deprecate TimerCallback and In...
GitHub user scwhittle opened a pull request: https://github.com/apache/incubator-beam/pull/1493 [BEAM-986] Deprecate TimerCallback and InMemoryTimerInternals methods using it. Deprecate TimerCallback and InMemoryTimerInternals methods using it. Instead separate advancing watermarks and removing eligible timers. SDK changes necessary for improving ReduceFnRunner prefetching. Hi @kennknowles , can you please take a look? This is the sdk changes separated out from PR adding prefetching. You can merge this pull request into a Git repository by running: $ git pull https://github.com/scwhittle/incubator-beam rm_timer_callback Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/1493.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1493 commit 550c6b05b426714f291174323366f82d34afeab3 Author: Sam Whittle Date: 2016-12-02T06:51:37Z Deprecate TimerCallback and InMemoryTimerInternals methods using it. Instead separate advancing watermarks and removing eligible timers. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (BEAM-646) Get runners out of the apply()
[ https://issues.apache.org/jira/browse/BEAM-646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15716663#comment-15716663 ] ASF GitHub Bot commented on BEAM-646: - Github user asfgit closed the pull request at: https://github.com/apache/incubator-beam/pull/1487 > Get runners out of the apply() > -- > > Key: BEAM-646 > URL: https://issues.apache.org/jira/browse/BEAM-646 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Kenneth Knowles >Assignee: Thomas Groh > > Right now, the runner intercepts calls to apply() and replaces transforms as > we go. This means that there is no "original" user graph. For portability and > misc architectural benefits, we would like to build the original graph first, > and have the runner override later. > Some runners already work in this manner, but we could integrate it more > smoothly, with more validation, via some handy APIs on e.g. the Pipeline > object. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] incubator-beam pull request #1487: [BEAM-646] Stop using Maps of Transforms ...
Github user asfgit closed the pull request at: https://github.com/apache/incubator-beam/pull/1487 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[2/3] incubator-beam git commit: Rename ConsumerTrackingPipelineVisitor to DirectGraphVisitor
Rename ConsumerTrackingPipelineVisitor to DirectGraphVisitor Reduce visibility of Visitor. Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/662416a4 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/662416a4 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/662416a4 Branch: refs/heads/master Commit: 662416a4e176cca252c0d6fde1bf4252aeaa56c0 Parents: 8162cd2 Author: Thomas Groh Authored: Fri Dec 2 10:07:05 2016 -0800 Committer: Thomas Groh Committed: Fri Dec 2 14:02:25 2016 -0800 -- .../direct/ConsumerTrackingPipelineVisitor.java | 145 --- .../beam/runners/direct/DirectGraphVisitor.java | 145 +++ .../beam/runners/direct/DirectRunner.java | 8 +- .../ConsumerTrackingPipelineVisitorTest.java| 239 --- .../runners/direct/DirectGraphVisitorTest.java | 239 +++ .../runners/direct/EvaluationContextTest.java | 6 +- .../ImmutabilityCheckingBundleFactoryTest.java | 2 +- .../runners/direct/WatermarkManagerTest.java| 8 +- 8 files changed, 396 insertions(+), 396 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/662416a4/runners/direct-java/src/main/java/org/apache/beam/runners/direct/ConsumerTrackingPipelineVisitor.java -- diff --git a/runners/direct-java/src/main/java/org/apache/beam/runners/direct/ConsumerTrackingPipelineVisitor.java b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/ConsumerTrackingPipelineVisitor.java deleted file mode 100644 index b9e77c5..000 --- a/runners/direct-java/src/main/java/org/apache/beam/runners/direct/ConsumerTrackingPipelineVisitor.java +++ /dev/null @@ -1,145 +0,0 @@ -/* - * Licensed to the Apache Software Foundation (ASF) under one - * or more contributor license agreements. See the NOTICE file - * distributed with this work for additional information - * regarding copyright ownership. The ASF licenses this file - * to you under the Apache License, Version 2.0 (the - * "License"); you may not use this file except in compliance - * with the License. You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. - */ -package org.apache.beam.runners.direct; - -import static com.google.common.base.Preconditions.checkState; - -import com.google.common.collect.ArrayListMultimap; -import com.google.common.collect.ListMultimap; -import java.util.HashMap; -import java.util.HashSet; -import java.util.Map; -import java.util.Set; -import org.apache.beam.sdk.Pipeline; -import org.apache.beam.sdk.Pipeline.PipelineVisitor; -import org.apache.beam.sdk.runners.PipelineRunner; -import org.apache.beam.sdk.runners.TransformHierarchy; -import org.apache.beam.sdk.transforms.AppliedPTransform; -import org.apache.beam.sdk.transforms.PTransform; -import org.apache.beam.sdk.values.PCollectionView; -import org.apache.beam.sdk.values.PInput; -import org.apache.beam.sdk.values.POutput; -import org.apache.beam.sdk.values.PValue; - -/** - * Tracks the {@link AppliedPTransform AppliedPTransforms} that consume each {@link PValue} in the - * {@link Pipeline}. This is used to schedule consuming {@link PTransform PTransforms} to consume - * input after the upstream transform has produced and committed output. - */ -public class ConsumerTrackingPipelineVisitor extends PipelineVisitor.Defaults { - private Map> producers = new HashMap<>(); - - private ListMultimap> primitiveConsumers = - ArrayListMultimap.create(); - - private Set> views = new HashSet<>(); - private Set> rootTransforms = new HashSet<>(); - private Map, String> stepNames = new HashMap<>(); - private Set toFinalize = new HashSet<>(); - private int numTransforms = 0; - private boolean finalized = false; - - @Override - public CompositeBehavior enterCompositeTransform(TransformHierarchy.Node node) { -checkState( -!finalized, -"Attempting to traverse a pipeline (node %s) with a %s " -+ "which has already visited a Pipeline and is finalized", -node.getFullName(), -ConsumerTrackingPipelineVisitor.class.getSimpleName()); -return CompositeBehavior.ENTER_TRANSFORM; - } - - @Override - public void leaveCompositeTransform(TransformHierarchy.Node node) { -checkState( -!finalized, -"Attempting to traverse a pipeline (node %s) with a %s which is already finalized", -
[3/3] incubator-beam git commit: This closes #1487
This closes #1487 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/1abbb900 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/1abbb900 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/1abbb900 Branch: refs/heads/master Commit: 1abbb9007e83fc64f1bb61ff4593f37c6c386545 Parents: 8cb2689 662416a Author: Thomas Groh Authored: Fri Dec 2 14:02:25 2016 -0800 Committer: Thomas Groh Committed: Fri Dec 2 14:02:25 2016 -0800 -- .../direct/ConsumerTrackingPipelineVisitor.java | 173 --- .../apache/beam/runners/direct/DirectGraph.java | 89 ++ .../beam/runners/direct/DirectGraphVisitor.java | 145 ++ .../beam/runners/direct/DirectRunner.java | 35 +-- .../beam/runners/direct/EvaluationContext.java | 76 ++--- .../direct/ExecutorServiceParallelExecutor.java | 15 +- .../ImmutabilityCheckingBundleFactory.java | 21 +- .../beam/runners/direct/WatermarkManager.java | 50 ++-- .../ConsumerTrackingPipelineVisitorTest.java| 287 --- .../runners/direct/DirectGraphVisitorTest.java | 239 +++ .../runners/direct/EvaluationContextTest.java | 29 +- .../ImmutabilityCheckingBundleFactoryTest.java | 6 +- .../runners/direct/WatermarkManagerTest.java| 23 +- 13 files changed, 575 insertions(+), 613 deletions(-) --
[1/3] incubator-beam git commit: Stop using Maps of Transforms in the DirectRunner
Repository: incubator-beam Updated Branches: refs/heads/master 8cb2689f8 -> 1abbb9007 Stop using Maps of Transforms in the DirectRunner Instead, add a "DirectGraph" class, which adds a layer of indirection to all lookup methods. Remove all remaining uses of getProducingTransformInternal, and instead use DirectGraph methods to obtain the producing transform. Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/8162cd29 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/8162cd29 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/8162cd29 Branch: refs/heads/master Commit: 8162cd29d97ef307b6fac588f453e4e39d70fca7 Parents: 8cb2689 Author: Thomas Groh Authored: Thu Dec 1 15:39:30 2016 -0800 Committer: Thomas Groh Committed: Fri Dec 2 14:02:24 2016 -0800 -- .../direct/ConsumerTrackingPipelineVisitor.java | 108 +++ .../apache/beam/runners/direct/DirectGraph.java | 89 +++ .../beam/runners/direct/DirectRunner.java | 31 +++--- .../beam/runners/direct/EvaluationContext.java | 76 - .../direct/ExecutorServiceParallelExecutor.java | 15 +-- .../ImmutabilityCheckingBundleFactory.java | 21 ++-- .../beam/runners/direct/WatermarkManager.java | 50 - .../ConsumerTrackingPipelineVisitorTest.java| 98 + .../runners/direct/EvaluationContextTest.java | 25 ++--- .../ImmutabilityCheckingBundleFactoryTest.java | 6 +- .../runners/direct/WatermarkManagerTest.java| 23 ++-- 11 files changed, 252 insertions(+), 290 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/8162cd29/runners/direct-java/src/main/java/org/apache/beam/runners/direct/ConsumerTrackingPipelineVisitor.java -- diff --git a/runners/direct-java/src/main/java/org/apache/beam/runners/direct/ConsumerTrackingPipelineVisitor.java b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/ConsumerTrackingPipelineVisitor.java index acfad16..b9e77c5 100644 --- a/runners/direct-java/src/main/java/org/apache/beam/runners/direct/ConsumerTrackingPipelineVisitor.java +++ b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/ConsumerTrackingPipelineVisitor.java @@ -19,8 +19,8 @@ package org.apache.beam.runners.direct; import static com.google.common.base.Preconditions.checkState; -import java.util.ArrayList; -import java.util.Collection; +import com.google.common.collect.ArrayListMultimap; +import com.google.common.collect.ListMultimap; import java.util.HashMap; import java.util.HashSet; import java.util.Map; @@ -33,6 +33,7 @@ import org.apache.beam.sdk.transforms.AppliedPTransform; import org.apache.beam.sdk.transforms.PTransform; import org.apache.beam.sdk.values.PCollectionView; import org.apache.beam.sdk.values.PInput; +import org.apache.beam.sdk.values.POutput; import org.apache.beam.sdk.values.PValue; /** @@ -41,9 +42,13 @@ import org.apache.beam.sdk.values.PValue; * input after the upstream transform has produced and committed output. */ public class ConsumerTrackingPipelineVisitor extends PipelineVisitor.Defaults { - private Map>> valueToConsumers = new HashMap<>(); - private Collection> rootTransforms = new ArrayList<>(); - private Collection> views = new ArrayList<>(); + private Map> producers = new HashMap<>(); + + private ListMultimap> primitiveConsumers = + ArrayListMultimap.create(); + + private Set> views = new HashSet<>(); + private Set> rootTransforms = new HashSet<>(); private Map, String> stepNames = new HashMap<>(); private Set toFinalize = new HashSet<>(); private int numTransforms = 0; @@ -81,81 +86,38 @@ public class ConsumerTrackingPipelineVisitor extends PipelineVisitor.Defaults { rootTransforms.add(appliedTransform); } else { for (PValue value : node.getInput().expand()) { -valueToConsumers.get(value).add(appliedTransform); +primitiveConsumers.put(value, appliedTransform); } } } - private AppliedPTransform getAppliedTransform(TransformHierarchy.Node node) { -@SuppressWarnings({"rawtypes", "unchecked"}) -AppliedPTransform application = AppliedPTransform.of( -node.getFullName(), node.getInput(), node.getOutput(), (PTransform) node.getTransform()); -return application; - } - - @Override + @Override public void visitValue(PValue value, TransformHierarchy.Node producer) { toFinalize.add(value); + +AppliedPTransform appliedTransform = getAppliedTransform(producer); +if (!producers.containsKey(value)) { + producers.put(value, appliedTransform); +} for (PValue expandedValue : value.expand()) { - valueToConsumers.put(expandedValue, new
[jira] [Commented] (BEAM-1077) @ValidatesRunner test in Python postcommit
[ https://issues.apache.org/jira/browse/BEAM-1077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15716656#comment-15716656 ] ASF GitHub Bot commented on BEAM-1077: -- GitHub user markflyhigh opened a pull request: https://github.com/apache/incubator-beam/pull/1492 [BEAM-1077] @ValidatesRunner Test in Python Postcommit Be sure to do all of the following to help us incorporate your contribution quickly and easily: - [ ] Make sure the PR title is formatted like: `[BEAM-] Description of pull request` - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable Travis-CI on your fork and ensure the whole test matrix passes). - [ ] Replace `` in the title with the actual Jira issue number, if there is one. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt). --- Add ValidatesRunner tests to postcommit. Currently running tests on dataflow service. You can merge this pull request into a Git repository by running: $ git pull https://github.com/markflyhigh/incubator-beam validatesrunner-test-in-postcommit Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/1492.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1492 commit 82fb441da2ff9a8e5a3c87d571618cbb77710f87 Author: Mark Liu Date: 2016-12-02T21:58:39Z [BEAM-1077] @ValidatesRunner Test in Python Postcommit > @ValidatesRunner test in Python postcommit > -- > > Key: BEAM-1077 > URL: https://issues.apache.org/jira/browse/BEAM-1077 > Project: Beam > Issue Type: Test > Components: sdk-py, testing >Reporter: Mark Liu >Assignee: Mark Liu > > Modify run_postcommit.sh to have @ValidatesRunner tests running on service. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] incubator-beam pull request #1492: [BEAM-1077] @ValidatesRunner Test in Pyth...
GitHub user markflyhigh opened a pull request: https://github.com/apache/incubator-beam/pull/1492 [BEAM-1077] @ValidatesRunner Test in Python Postcommit Be sure to do all of the following to help us incorporate your contribution quickly and easily: - [ ] Make sure the PR title is formatted like: `[BEAM-] Description of pull request` - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable Travis-CI on your fork and ensure the whole test matrix passes). - [ ] Replace `` in the title with the actual Jira issue number, if there is one. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt). --- Add ValidatesRunner tests to postcommit. Currently running tests on dataflow service. You can merge this pull request into a Git repository by running: $ git pull https://github.com/markflyhigh/incubator-beam validatesrunner-test-in-postcommit Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/1492.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1492 commit 82fb441da2ff9a8e5a3c87d571618cbb77710f87 Author: Mark Liu Date: 2016-12-02T21:58:39Z [BEAM-1077] @ValidatesRunner Test in Python Postcommit --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (BEAM-646) Get runners out of the apply()
[ https://issues.apache.org/jira/browse/BEAM-646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15716648#comment-15716648 ] ASF GitHub Bot commented on BEAM-646: - Github user asfgit closed the pull request at: https://github.com/apache/incubator-beam/pull/1490 > Get runners out of the apply() > -- > > Key: BEAM-646 > URL: https://issues.apache.org/jira/browse/BEAM-646 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Kenneth Knowles >Assignee: Thomas Groh > > Right now, the runner intercepts calls to apply() and replaces transforms as > we go. This means that there is no "original" user graph. For portability and > misc architectural benefits, we would like to build the original graph first, > and have the runner override later. > Some runners already work in this manner, but we could integrate it more > smoothly, with more validation, via some handy APIs on e.g. the Pipeline > object. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] incubator-beam pull request #1490: [BEAM-646] Explicitly Throw in TransformE...
Github user asfgit closed the pull request at: https://github.com/apache/incubator-beam/pull/1490 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[1/2] incubator-beam git commit: Explicitly Throw in TransformExecutorTest
Repository: incubator-beam Updated Branches: refs/heads/master 37e891fe9 -> 8cb2689f8 Explicitly Throw in TransformExecutorTest Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/b4ee8b73 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/b4ee8b73 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/b4ee8b73 Branch: refs/heads/master Commit: b4ee8b730bffb31ee1178303f1dbd5058eb22a11 Parents: 37e891f Author: Thomas Groh Authored: Fri Dec 2 10:56:15 2016 -0800 Committer: Thomas Groh Committed: Fri Dec 2 13:58:38 2016 -0800 -- .../runners/direct/TransformExecutorTest.java | 184 ++- 1 file changed, 97 insertions(+), 87 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/b4ee8b73/runners/direct-java/src/test/java/org/apache/beam/runners/direct/TransformExecutorTest.java -- diff --git a/runners/direct-java/src/test/java/org/apache/beam/runners/direct/TransformExecutorTest.java b/runners/direct-java/src/test/java/org/apache/beam/runners/direct/TransformExecutorTest.java index 85eff65..08b1e18 100644 --- a/runners/direct-java/src/test/java/org/apache/beam/runners/direct/TransformExecutorTest.java +++ b/runners/direct-java/src/test/java/org/apache/beam/runners/direct/TransformExecutorTest.java @@ -37,13 +37,10 @@ import java.util.concurrent.Future; import java.util.concurrent.atomic.AtomicBoolean; import org.apache.beam.runners.direct.CommittedResult.OutputType; import org.apache.beam.runners.direct.DirectRunner.CommittedBundle; -import org.apache.beam.sdk.coders.ByteArrayCoder; import org.apache.beam.sdk.testing.TestPipeline; import org.apache.beam.sdk.transforms.AppliedPTransform; import org.apache.beam.sdk.transforms.Create; -import org.apache.beam.sdk.transforms.PTransform; import org.apache.beam.sdk.transforms.WithKeys; -import org.apache.beam.sdk.util.IllegalMutationException; import org.apache.beam.sdk.util.WindowedValue; import org.apache.beam.sdk.values.KV; import org.apache.beam.sdk.values.PCollection; @@ -63,7 +60,9 @@ import org.mockito.MockitoAnnotations; public class TransformExecutorTest { @Rule public ExpectedException thrown = ExpectedException.none(); private PCollection created; - private PCollection> downstream; + + private AppliedPTransform createdProducer; + private AppliedPTransform downstreamProducer; private CountDownLatch evaluatorCompleted; @@ -88,15 +87,17 @@ public class TransformExecutorTest { TestPipeline p = TestPipeline.create(); created = p.apply(Create.of("foo", "spam", "third")); -downstream = created.apply(WithKeys.of(3)); +PCollection> downstream = created.apply(WithKeys.of(3)); + +createdProducer = created.getProducingTransformInternal(); +downstreamProducer = downstream.getProducingTransformInternal(); when(evaluationContext.getMetrics()).thenReturn(metrics); } @Test public void callWithNullInputBundleFinishesBundleAndCompletes() throws Exception { -final TransformResult result = - StepTransformResult.withoutHold(created.getProducingTransformInternal()).build(); +final TransformResult result = StepTransformResult.withoutHold(createdProducer).build(); final AtomicBoolean finishCalled = new AtomicBoolean(false); TransformEvaluator evaluator = new TransformEvaluator() { @@ -112,8 +113,7 @@ public class TransformExecutorTest { } }; -when(registry.forApplication(created.getProducingTransformInternal(), null)) -.thenReturn(evaluator); +when(registry.forApplication(createdProducer, null)).thenReturn(evaluator); TransformExecutor executor = TransformExecutor.create( @@ -121,7 +121,7 @@ public class TransformExecutorTest { registry, Collections.emptyList(), null, -created.getProducingTransformInternal(), +createdProducer, completionCallback, transformEvaluationState); executor.run(); @@ -133,7 +133,7 @@ public class TransformExecutorTest { @Test public void nullTransformEvaluatorTerminates() throws Exception { -when(registry.forApplication(created.getProducingTransformInternal(), null)).thenReturn(null); +when(registry.forApplication(createdProducer, null)).thenReturn(null); TransformExecutor executor = TransformExecutor.create( @@ -141,7 +141,7 @@ public class TransformExecutorTest { registry, Collections.emptyList(), null, -created.getProducingTransformInternal(), +createdProducer, completionCallback, transformEvaluationState); executo
[2/2] incubator-beam git commit: This closes #1490
This closes #1490 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/8cb2689f Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/8cb2689f Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/8cb2689f Branch: refs/heads/master Commit: 8cb2689f8952a73a4e855a03f98c1d5bec8181fb Parents: 37e891f b4ee8b7 Author: Thomas Groh Authored: Fri Dec 2 13:58:39 2016 -0800 Committer: Thomas Groh Committed: Fri Dec 2 13:58:39 2016 -0800 -- .../runners/direct/TransformExecutorTest.java | 184 ++- 1 file changed, 97 insertions(+), 87 deletions(-) --
[GitHub] incubator-beam pull request #1491: [BEAM-1070] Call from_p12_keyfile() with ...
GitHub user aaltay opened a pull request: https://github.com/apache/incubator-beam/pull/1491 [BEAM-1070] Call from_p12_keyfile() with the correct arguments This code path is failing, because a wrong list of arguments is passed. Fixing that uncovered that oauth2client depends on pyOpenSSL for this call to work. I did not add this dependency to setup.py because, it does not install cleanly in all environments. As a workaround, users who would like to use this authentication method could first do a 'pip install pyOpenSSL'. I added a test, that skips if 'pyOpenSSL' is not installed. You can merge this pull request into a Git repository by running: $ git pull https://github.com/aaltay/incubator-beam serkey Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/1491.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1491 commit db9e3c7b1ec515789b9531e013669ae621646aaf Author: Ahmet Altay Date: 2016-12-02T21:38:31Z Call from_p12_keyfile() with the correct arguments. This code path is failing, because a wrong list of arguments is passed. Fixing that uncovered that oauth2client depends on pyOpenSSL for this call to work. I did not add this dependency to setup.py because, it does not install cleanly in all environments. As a workaround, users who would like to use this authentication method could first do a 'pip install pyOpenSSL'. I added a test, that skips if 'pyOpenSSL' is not installed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (BEAM-1070) Service Account Based Authentication Broken
[ https://issues.apache.org/jira/browse/BEAM-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15716627#comment-15716627 ] ASF GitHub Bot commented on BEAM-1070: -- GitHub user aaltay opened a pull request: https://github.com/apache/incubator-beam/pull/1491 [BEAM-1070] Call from_p12_keyfile() with the correct arguments This code path is failing, because a wrong list of arguments is passed. Fixing that uncovered that oauth2client depends on pyOpenSSL for this call to work. I did not add this dependency to setup.py because, it does not install cleanly in all environments. As a workaround, users who would like to use this authentication method could first do a 'pip install pyOpenSSL'. I added a test, that skips if 'pyOpenSSL' is not installed. You can merge this pull request into a Git repository by running: $ git pull https://github.com/aaltay/incubator-beam serkey Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/1491.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1491 commit db9e3c7b1ec515789b9531e013669ae621646aaf Author: Ahmet Altay Date: 2016-12-02T21:38:31Z Call from_p12_keyfile() with the correct arguments. This code path is failing, because a wrong list of arguments is passed. Fixing that uncovered that oauth2client depends on pyOpenSSL for this call to work. I did not add this dependency to setup.py because, it does not install cleanly in all environments. As a workaround, users who would like to use this authentication method could first do a 'pip install pyOpenSSL'. I added a test, that skips if 'pyOpenSSL' is not installed. > Service Account Based Authentication Broken > --- > > Key: BEAM-1070 > URL: https://issues.apache.org/jira/browse/BEAM-1070 > Project: Beam > Issue Type: Bug > Components: sdk-py > Environment: CentOS Linux release 7.1.1503 (Core) > Python 2.7.5 >Reporter: Stephen Reichling >Assignee: Ahmet Altay >Priority: Critical > > {{sdks/python/apache_beam/internal/auth.py}} calls into the > {{oauth2client.service_account.ServiceAccountCredentials.from_p12_keyfile}} > method with invalid and incorrectly-ordered parameters. Compare the [function > signature of > ServiceAccountCredentials.from_p12_keyfile|https://github.com/google/oauth2client/blob/ae73312942d3cf0e98f097dfbb40f136c2a7c463/oauth2client/service_account.py#L300-L303] > with [how it is > invoked|https://github.com/apache/incubator-beam/blob/9ded359daefc6040d61a1f33c77563474fcb09b6/sdks/python/apache_beam/internal/auth.py#L150-L154]. > This causes a runtime error when one attempts to use a service account to > authenticate with the Google Dataflow APIs. > The specific problems are: > - the {{client_scopes}} variable (a list) is passed as a positional > parameter where the function signature expects the {{private_key_password}} > parameter (a string). > - a keyed parameter, {{user_agent}}, is passed but no such parameter is > defined in the function signature. > - no value is provided for {{private_key_password}}. All p12 key files for > service accounts issued by Google Cloud have the password {{notasecret}} as > documented > [here|https://support.google.com/cloud/answer/6158849?hl=en#serviceaccounts], > so it's currently not possible to use a Google-issued p12 key file with this > implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[1/3] incubator-beam-site git commit: Simplify Flink Runner instructions for running on cluster
Repository: incubator-beam-site Updated Branches: refs/heads/asf-site 33a2bb45a -> b6423e5f4 Simplify Flink Runner instructions for running on cluster Project: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/commit/f92e07c5 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/tree/f92e07c5 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/diff/f92e07c5 Branch: refs/heads/asf-site Commit: f92e07c53cfab50888e4cc8fb7aa46c85b3524b9 Parents: 33a2bb4 Author: Aljoscha Krettek Authored: Thu Dec 1 21:51:54 2016 +0100 Committer: Davor Bonaci Committed: Fri Dec 2 13:43:54 2016 -0800 -- src/get-started/quickstart.md | 9 +++-- 1 file changed, 3 insertions(+), 6 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/f92e07c5/src/get-started/quickstart.md -- diff --git a/src/get-started/quickstart.md b/src/get-started/quickstart.md index 73f3064..3bd4b98 100644 --- a/src/get-started/quickstart.md +++ b/src/get-started/quickstart.md @@ -88,12 +88,9 @@ $ mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount \ {:.runner-flink-cluster} ``` -$ mvn package -Pflink-runner -$ cp target/word-count-beam-bundled-0.1.jar /path/to/flink/lib/ -$ bin/flink run -c org.apache.beam.examples.WordCount lib/word-count-beam-0.1.jar \ ---inputFile=/path/to/quickstart/pom.xml \ ---output=/tmp/counts \ ---runner=org.apache.beam.runners.flink.FlinkRunner +$ mvn package exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount \ + -Dexec.args="--runner=FlinkRunner --flinkMaster= --filesToStage=target/word-count-beam-bundled-0.1.jar \ + --inputFile=/path/to/quickstart/pom.xml --output=/tmp/counts" -Pflink-runner You can monitor the running job by visiting the Flink dashboard at http://:8081 ```
[2/3] incubator-beam-site git commit: Regenerate website
Regenerate website Project: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/commit/18d5db7e Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/tree/18d5db7e Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/diff/18d5db7e Branch: refs/heads/asf-site Commit: 18d5db7e2e344e4a1da3405a91e675b1f6cd13a2 Parents: f92e07c Author: Davor Bonaci Authored: Fri Dec 2 13:44:07 2016 -0800 Committer: Davor Bonaci Committed: Fri Dec 2 13:44:07 2016 -0800 -- content/get-started/quickstart/index.html | 9 +++-- 1 file changed, 3 insertions(+), 6 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/18d5db7e/content/get-started/quickstart/index.html -- diff --git a/content/get-started/quickstart/index.html b/content/get-started/quickstart/index.html index 218863a..b1b33ab 100644 --- a/content/get-started/quickstart/index.html +++ b/content/get-started/quickstart/index.html @@ -232,12 +232,9 @@ MinimalWordCount.java WordCount.java -$ mvn package -Pflink-runner -$ cp target/word-count-beam-bundled-0.1.jar /path/to/flink/lib/ -$ bin/flink run -c org.apache.beam.examples.WordCount lib/word-count-beam-0.1.jar \ ---inputFile=/path/to/quickstart/pom.xml \ ---output=/tmp/counts \ ---runner=org.apache.beam.runners.flink.FlinkRunner +$ mvn package exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount \ + -Dexec.args="--runner=FlinkRunner --flinkMaster=--filesToStage=target/word-count-beam-bundled-0.1.jar \ + --inputFile=/path/to/quickstart/pom.xml --output=/tmp/counts" -Pflink-runner You can monitor the running job by visiting the Flink dashboard at http:// :8081
[GitHub] incubator-beam-site pull request #99: Simplify Flink Runner instructions for...
Github user asfgit closed the pull request at: https://github.com/apache/incubator-beam-site/pull/99 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[3/3] incubator-beam-site git commit: This closes #99
This closes #99 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/commit/b6423e5f Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/tree/b6423e5f Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/diff/b6423e5f Branch: refs/heads/asf-site Commit: b6423e5f42c67df0e5b1d2d6c6c758755de01cfa Parents: 33a2bb4 18d5db7 Author: Davor Bonaci Authored: Fri Dec 2 13:44:07 2016 -0800 Committer: Davor Bonaci Committed: Fri Dec 2 13:44:07 2016 -0800 -- content/get-started/quickstart/index.html | 9 +++-- src/get-started/quickstart.md | 9 +++-- 2 files changed, 6 insertions(+), 12 deletions(-) --
[jira] [Created] (BEAM-1079) Validation speed for really large number of files
Sourabh Bajaj created BEAM-1079: --- Summary: Validation speed for really large number of files Key: BEAM-1079 URL: https://issues.apache.org/jira/browse/BEAM-1079 Project: Beam Issue Type: Improvement Components: sdk-py Reporter: Sourabh Bajaj Assignee: Sourabh Bajaj Priority: Trivial -- This message was sent by Atlassian JIRA (v6.3.4#6332)