[beam] branch asf-site updated: Publishing website 2019/12/11 00:59:41 at commit 11c60b8

2019-12-10 Thread git-site-role
This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new dcf3676  Publishing website 2019/12/11 00:59:41 at commit 11c60b8
dcf3676 is described below

commit dcf3676a00857826169f08fe153b223ffad65b0e
Author: jenkins 
AuthorDate: Wed Dec 11 00:59:42 2019 +

Publishing website 2019/12/11 00:59:41 at commit 11c60b8
---
 .../extensions/create-external-table/index.html| 31 +++---
 1 file changed, 27 insertions(+), 4 deletions(-)

diff --git 
a/website/generated-content/documentation/dsls/sql/extensions/create-external-table/index.html
 
b/website/generated-content/documentation/dsls/sql/extensions/create-external-table/index.html
index 2fc6503..c1a1eee 100644
--- 
a/website/generated-content/documentation/dsls/sql/extensions/create-external-table/index.html
+++ 
b/website/generated-content/documentation/dsls/sql/extensions/create-external-table/index.html
@@ -431,14 +431,26 @@ See the I/O specific sections for tblProperties
 CREATE EXTERNAL TABLE [ IF NOT EXISTS ] tableName (tableElement [, tableElement ]*)
 TYPE bigquery
 LOCATION '[PROJECT_ID]:[DATASET].[TABLE]'
+TBLPROPERTIES '{"method": "DEFAULT"}'
 
-  LOCATION:Location of the table in the BigQuery CLI format.
+  LOCATION: Location of the table in the BigQuery CLI format.
-  PROJECT_ID: ID of the Google Cloud Project
-  DATASET: BigQuery Dataset ID
-  TABLE: BigQuery Table ID within the Dataset
+  PROJECT_ID: ID of the Google Cloud Project.
+  DATASET: BigQuery Dataset ID.
+  TABLE: BigQuery Table ID within the Dataset.
+  TBLPROPERTIES:
+  method: Optional. Read method to use. The following options are available:
+  DEFAULT: Used if no method property is set; currently uses EXPORT.
+  DIRECT_READ: Use the BigQuery Storage API.
+  EXPORT: Export data to Google Cloud Storage in Avro format and read data files from that location.
 
@@ -448,6 +460,17 @@ LOCATION '[PROJECT_ID]:[DATASET].[TABLE]'
 Beam SQL supports reading columns with simple types (simpleType) and arrays of simple
 types (ARRAY).
 
+When reading using the EXPORT method, the following pipeline options should be set:
+
+  project: ID of the Google Cloud Project.
+  tempLocation: Bucket to store intermediate data in. Ex: gs://temp-storage/temp.
+
+When reading using the DIRECT_READ method, the optimizer will attempt to perform
+project and predicate push-down, potentially reducing the time required to read the data from BigQuery.
+
+More information about the BigQuery Storage API can be found at
+https://beam.apache.org/documentation/io/built-in/google-bigquery/#storage-api.
+
 Write Mode
 
 if the table does not exist, Beam creates the table specified in location 
when
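
A concrete instance of the documented syntax can be easier to read than the placeholder form. The table, column, project, and dataset names below are hypothetical; only the keywords, the LOCATION format, and the TBLPROPERTIES JSON follow the documentation added in this diff:

```
CREATE EXTERNAL TABLE users (id INTEGER, name VARCHAR)
TYPE bigquery
LOCATION 'my-project:my_dataset.users'
TBLPROPERTIES '{"method": "DIRECT_READ"}'
```

With "method" set to DIRECT_READ, reads go through the BigQuery Storage API and can benefit from the project and predicate push-down described in the added text.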



[beam] branch master updated: Update SQL BigQuery doc

2019-12-10 Thread amaliujia
This is an automated email from the ASF dual-hosted git repository.

amaliujia pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/master by this push:
 new 92e92bc  Update SQL BigQuery doc
 new 11c60b8  Merge pull request #10260 from 11moon11/UpdateBigQueryDoc
92e92bc is described below

commit 92e92bc0b8fb01b9395e6480480a81832a86111f
Author: kirillkozlov 
AuthorDate: Mon Dec 2 16:11:16 2019 -0800

Update SQL BigQuery doc
---
 .../dsls/sql/extensions/create-external-table.md   | 23 ++
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git 
a/website/src/documentation/dsls/sql/extensions/create-external-table.md 
b/website/src/documentation/dsls/sql/extensions/create-external-table.md
index 81d7dae..2489bb3 100644
--- a/website/src/documentation/dsls/sql/extensions/create-external-table.md
+++ b/website/src/documentation/dsls/sql/extensions/create-external-table.md
@@ -89,18 +89,33 @@ tableElement: columnName fieldType [ NOT NULL ]
 CREATE EXTERNAL TABLE [ IF NOT EXISTS ] tableName (tableElement [, tableElement ]*)
 TYPE bigquery
 LOCATION '[PROJECT_ID]:[DATASET].[TABLE]'
+TBLPROPERTIES '{"method": "DEFAULT"}'
 ```
 
-*   `LOCATION:`Location of the table in the BigQuery CLI format.
-*   `PROJECT_ID`: ID of the Google Cloud Project
-*   `DATASET`: BigQuery Dataset ID
-*   `TABLE`: BigQuery Table ID within the Dataset
+*   `LOCATION`: Location of the table in the BigQuery CLI format.
+*   `PROJECT_ID`: ID of the Google Cloud Project.
+*   `DATASET`: BigQuery Dataset ID.
+*   `TABLE`: BigQuery Table ID within the Dataset.
+*   `TBLPROPERTIES`:
+*   `method`: Optional. Read method to use. The following options are available:
+*   `DEFAULT`: Used if no method property is set; currently uses `EXPORT`.
+*   `DIRECT_READ`: Use the BigQuery Storage API.
+*   `EXPORT`: Export data to Google Cloud Storage in Avro format and read data files from that location.
 
 ### Read Mode
 
 Beam SQL supports reading columns with simple types (`simpleType`) and arrays 
of simple
 types (`ARRAY`).
 
+When reading using the `EXPORT` method, the following pipeline options should be set:
+*   `project`: ID of the Google Cloud Project.
+*   `tempLocation`: Bucket to store intermediate data in. Ex: `gs://temp-storage/temp`.
+
+When reading using the `DIRECT_READ` method, the optimizer will attempt to perform
+project and predicate push-down, potentially reducing the time required to read the data from BigQuery.
+
+More information about the BigQuery Storage API can be found
+[here](https://beam.apache.org/documentation/io/built-in/google-bigquery/#storage-api).
+
 ### Write Mode
 
 if the table does not exist, Beam creates the table specified in location when
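
As with the generated page above, a filled-in example may help. This one relies on the `EXPORT` read method, so the `project` and `tempLocation` pipeline options described under Read Mode must also be set when the pipeline runs; all identifiers are hypothetical:

```
CREATE EXTERNAL TABLE orders (order_id INTEGER, amount DOUBLE)
TYPE bigquery
LOCATION 'my-project:sales.orders'
TBLPROPERTIES '{"method": "EXPORT"}'
```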



[beam] branch master updated (98ad0a6 -> 4b92c34)

2019-12-10 Thread apilloud
This is an automated email from the ASF dual-hosted git repository.

apilloud pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 98ad0a6  Add an ML section to python SDK overview (#10233)
 add c43ca65  Updated the cost model to favor IO with push-down
 add c498c21  BigQueryFilter numSupported method
 add 4b92c34  Merge pull request #10060: [BEAM-8343] [SQL] Updated the cost 
model to favor IO with push-down.

No new revisions were added by this update.

Summary of changes:
 .../sdk/extensions/sql/impl/rel/BeamCalcRel.java   | 25 +-
 .../sql/impl/rel/BeamPushDownIOSourceRel.java  | 19 ++--
 .../extensions/sql/meta/BeamSqlTableFilter.java| 24 +
 .../extensions/sql/meta/DefaultTableFilter.java|  5 +
 .../sql/meta/provider/bigquery/BigQueryFilter.java |  5 +
 .../sql/meta/provider/test/TestTableFilter.java|  5 +
 .../sql/meta/CustomTableResolverTest.java  | 18 +++-
 .../provider/bigquery/BigQueryReadWriteIT.java |  6 +++---
 8 files changed, 77 insertions(+), 30 deletions(-)



[beam] branch asf-site updated: Publishing website 2019/12/10 23:41:39 at commit 98ad0a6

2019-12-10 Thread git-site-role
This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new 85194d2  Publishing website 2019/12/10 23:41:39 at commit 98ad0a6
85194d2 is described below

commit 85194d2863dffdba07c1c586d2e06cd00ceb6a51
Author: jenkins 
AuthorDate: Tue Dec 10 23:41:39 2019 +

Publishing website 2019/12/10 23:41:39 at commit 98ad0a6
---
 website/generated-content/documentation/sdks/python/index.html | 5 +
 1 file changed, 5 insertions(+)

diff --git a/website/generated-content/documentation/sdks/python/index.html 
b/website/generated-content/documentation/sdks/python/index.html
index 4eaa12c..07e88fe 100644
--- a/website/generated-content/documentation/sdks/python/index.html
+++ b/website/generated-content/documentation/sdks/python/index.html
@@ -292,6 +292,7 @@
   Python type safety
   Managing Python 
pipeline dependencies
   Developing new I/O 
connectors for Python
+  Using Beam Python 
SDK in your ML pipelines
 
 
 
@@ -342,6 +343,10 @@ new I/O connectors. See the D
 for information about developing new I/O connectors and links to
 language-specific implementation guidance.
 
+Using Beam Python SDK in your ML pipelines
+
+TensorFlow Extended (TFX) (https://www.tensorflow.org/tfx) is an end-to-end platform for deploying production ML pipelines. TFX is integrated with Beam. For more information, see the TFX user guide (https://www.tensorflow.org/tfx/guide).
+
   
 
 

[beam] branch master updated (d032994 -> 98ad0a6)

2019-12-10 Thread altay
This is an automated email from the ASF dual-hosted git repository.

altay pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from d032994  Merge pull request #9926 from davidcavazos/groupbykey-code
 add 98ad0a6  Add an ML section to python SDK overview (#10233)

No new revisions were added by this update.

Summary of changes:
 website/src/documentation/sdks/python.md | 4 
 1 file changed, 4 insertions(+)



[beam] branch aaltay-patch-2 updated (03fd14f -> 46acfb3)

2019-12-10 Thread altay
This is an automated email from the ASF dual-hosted git repository.

altay pushed a change to branch aaltay-patch-2
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 03fd14f  fixup
 add 46acfb3  Reviewer comments.

No new revisions were added by this update.

Summary of changes:
 website/src/documentation/sdks/python.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)



[beam] branch master updated: [BEAM-7390] Add code snippet for GroupByKey

2019-12-10 Thread altay
This is an automated email from the ASF dual-hosted git repository.

altay pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/master by this push:
 new f51edc1  [BEAM-7390] Add code snippet for GroupByKey
 new d032994  Merge pull request #9926 from davidcavazos/groupbykey-code
f51edc1 is described below

commit f51edc10e1c724bdf113f84ab3b7283b9fabe19c
Author: David Cavazos 
AuthorDate: Wed Oct 16 18:36:39 2019 -0700

[BEAM-7390] Add code snippet for GroupByKey
---
 .../snippets/transforms/aggregation/groupbykey.py  | 47 +++
 .../transforms/aggregation/groupbykey_test.py  | 54 ++
 2 files changed, 101 insertions(+)

diff --git 
a/sdks/python/apache_beam/examples/snippets/transforms/aggregation/groupbykey.py
 
b/sdks/python/apache_beam/examples/snippets/transforms/aggregation/groupbykey.py
new file mode 100644
index 000..83e4f87
--- /dev/null
+++ 
b/sdks/python/apache_beam/examples/snippets/transforms/aggregation/groupbykey.py
@@ -0,0 +1,47 @@
+# coding=utf-8
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+from __future__ import absolute_import
+from __future__ import print_function
+
+
+def groupbykey(test=None):
+  # [START groupbykey]
+  import apache_beam as beam
+
+  with beam.Pipeline() as pipeline:
+    produce_counts = (
+        pipeline
+        | 'Create produce counts' >> beam.Create([
+            ('spring', '🍓'),
+            ('spring', '🥕'),
+            ('spring', '🍆'),
+            ('spring', '🍅'),
+            ('summer', '🥕'),
+            ('summer', '🍅'),
+            ('summer', '🌽'),
+            ('fall', '🥕'),
+            ('fall', '🍅'),
+            ('winter', '🍆'),
+        ])
+        | 'Group counts per produce' >> beam.GroupByKey()
+        | beam.Map(print)
+    )
+    # [END groupbykey]
+    if test:
+      test(produce_counts)
diff --git 
a/sdks/python/apache_beam/examples/snippets/transforms/aggregation/groupbykey_test.py
 
b/sdks/python/apache_beam/examples/snippets/transforms/aggregation/groupbykey_test.py
new file mode 100644
index 000..4d8283a
--- /dev/null
+++ 
b/sdks/python/apache_beam/examples/snippets/transforms/aggregation/groupbykey_test.py
@@ -0,0 +1,54 @@
+# coding=utf-8
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+from __future__ import absolute_import
+from __future__ import print_function
+
+import unittest
+
+import mock
+
+from apache_beam.examples.snippets.util import assert_matches_stdout
+from apache_beam.testing.test_pipeline import TestPipeline
+
+from . import groupbykey
+
+
+def check_produce_counts(actual):
+  expected = '''[START produce_counts]
+('spring', ['🍓', '🥕', '🍆', '🍅'])
+('summer', ['🥕', '🍅', '🌽'])
+('fall', ['🥕', '🍅'])
+('winter', ['🍆'])
+[END produce_counts]'''.splitlines()[1:-1]
+  # The order of the elements is non-deterministic, so sort them first.
+  assert_matches_stdout(
+      actual, expected, lambda pair: (pair[0], sorted(pair[1])))
+
+
+@mock.patch('apache_beam.Pipeline', TestPipeline)
+@mock.patch(
+'apache_beam.examples.snippets.transforms.aggregation.groupbykey.print',
+str)
+class GroupByKeyTest(unittest.TestCase):
+  def test_groupbykey(self):
+    groupbykey.groupbykey(check_produce_counts)
+
+
+if __name__ == '__main__':
+  unittest.main()
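
For readers who want to try the snippet outside the test harness, a minimal sketch follows; it assumes `apache_beam` is installed and the module added above is importable, and it simply calls the function without a `test` callback:

```
# Minimal usage sketch for the new snippet (assumes apache_beam is installed
# and the module path added in this commit is on the Python path).
from apache_beam.examples.snippets.transforms.aggregation import groupbykey

# With no `test` callback, the pipeline inside groupbykey() runs on the
# default local runner and prints each (season, [produce, ...]) pair.
groupbykey.groupbykey()
```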



[beam] branch master updated (bdd70ab -> 095ac4d)

2019-12-10 Thread mikhail
This is an automated email from the ASF dual-hosted git repository.

mikhail pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from bdd70ab  [BEAM-8575] Test DoFn context params (#10130)
 new e58cafa  Strict equality comparison for the version of tensorflow dependency
 new 3d7f7d2  Extract installChicagoTaxiExampleRequirements step
 new 3ea8077  Look up log level values using `getattr`
 new 095ac4d  Merge pull request #10269 from kamilwu/chicago-taxi-dependencies-fix

The 4 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 ...ommit_Python_Chicago_Taxi_Example_Dataflow.groovy | 20 +---
 .../testing/benchmarks/chicago_taxi/preprocess.py|  2 +-
 .../testing/benchmarks/chicago_taxi/process_tfma.py  |  2 +-
 .../testing/benchmarks/chicago_taxi/requirements.txt |  3 +--
 .../testing/benchmarks/chicago_taxi/run_chicago.sh   |  6 +++---
 .../testing/benchmarks/chicago_taxi/setup.py |  6 ++
 .../chicago_taxi/tfdv_analyze_and_validate.py|  2 +-
 .../testing/benchmarks/chicago_taxi/trainer/task.py  |  4 ++--
 sdks/python/test-suites/dataflow/py2/build.gradle| 18 --
 9 files changed, 36 insertions(+), 27 deletions(-)
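
Commit 3ea8077 above names a small technique worth spelling out. The snippet below is a generic sketch of looking up a log level by name with `getattr`, not the actual Chicago Taxi code; the helper name and default are invented for illustration:

```
import logging

# Generic sketch of "Look up log level values using `getattr`": resolve a
# level name such as "INFO" or "debug" to its numeric value instead of
# hard-coding a name-to-level mapping. (Hypothetical helper, not Beam code.)
def resolve_log_level(level_name, default=logging.INFO):
  return getattr(logging, level_name.upper(), default)

logging.basicConfig(level=resolve_log_level("debug"))
```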



[beam] branch master updated (659039e -> bdd70ab)

2019-12-10 Thread chamikara
This is an automated email from the ASF dual-hosted git repository.

chamikara pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 659039e  Merge pull request #10332: [BEAM-8858] 
sdks/java/extensions/sql to declare used-but-undeclared dependencies
 add bdd70ab  [BEAM-8575] Test DoFn context params (#10130)

No new revisions were added by this update.

Summary of changes:
 sdks/python/apache_beam/pipeline_test.py | 20 
 1 file changed, 20 insertions(+)
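
The test above exercises "DoFn context params". For readers unfamiliar with the term, the following is an illustrative sketch (not the test added in #10130) of a DoFn that requests per-element context, here the element's timestamp and window, through DoFn parameter defaults:

```
import apache_beam as beam

# Illustrative sketch of DoFn context params (not the code from #10130):
# process() receives the element's timestamp and window by declaring
# beam.DoFn.TimestampParam and beam.DoFn.WindowParam as parameter defaults.
class AnnotateWithContext(beam.DoFn):
  def process(self, element,
              timestamp=beam.DoFn.TimestampParam,
              window=beam.DoFn.WindowParam):
    yield element, timestamp, window

with beam.Pipeline() as pipeline:
  (pipeline
   | beam.Create([1, 2, 3])
   | beam.ParDo(AnnotateWithContext())
   | beam.Map(print))
```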



[beam] branch master updated (dfa2cf5 -> 659039e)

2019-12-10 Thread iemejia
This is an automated email from the ASF dual-hosted git repository.

iemejia pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


 from dfa2cf5  Merge pull request #10326: [BEAM-8929] Remove obsolete InterruptedException handling in FnApiControlClient
 add 2fc88ca  Commons-codec dependency in extensions/sql
 add 131b18d  Adding joda_time dependency
 add b01ac35  Commons_lang3 and jackson_databind dependency
 add 659039e  Merge pull request #10332: [BEAM-8858] 
sdks/java/extensions/sql to declare used-but-undeclared dependencies

No new revisions were added by this update.

Summary of changes:
 sdks/java/extensions/sql/build.gradle | 4 
 1 file changed, 4 insertions(+)



[beam] branch master updated (44d4568 -> dfa2cf5)

2019-12-10 Thread mxm
This is an automated email from the ASF dual-hosted git repository.

mxm pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 44d4568  Merge pull request #10312 from apache/aaltay-patch-1
 add 2a3a7f7  [BEAM-8929] Remove unnecessary exception handling in 
FnApiControlClientPoolService.
 add dfa2cf5  Merge pull request #10326: [BEAM-8929] Remove obsolete InterruptedException handling in FnApiControlClient

No new revisions were added by this update.

Summary of changes:
 .../runners/fnexecution/control/FnApiControlClientPoolService.java | 3 ---
 1 file changed, 3 deletions(-)