[beam] tag nightly-master updated (aca9099 -> 84db719)
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a change to tag nightly-master
in repository https://gitbox.apache.org/repos/asf/beam.git.

*** WARNING: tag nightly-master was modified! ***

    from aca9099 (commit)
      to 84db719 (commit)
    from aca9099 Merge pull request #13192: [BEAM-10402] Move @Nullable annotations adjacent to the type they annotate
     add d151352 Fix URL to BEAM-9615 (#13190)
     add cad2c7c [BEAM-11058] Enable HadoopFormatIOElasticIT on Java PostCommit
     add b13a305 Merge pull request #13085: [BEAM-11058] Enable HadoopFormatIOElasticIT on Java PostCommit
     add 7673f29 Move Beam 2.25.0 release blog to the correct directory
     add ba24b08 Merge pull request #13197 from robinyqiu/release-blog
     add 9d76ee5 Update beam-2.24.0.md
     add 3f9f1aa Merge pull request #13182 from y1chi/patch-2
     add b73fcc6 [BEAM-11113] Switch default pickler compressor back to zlib for Coders in python sdk
     add 38feb03 Merge pull request #13183 from y1chi/BEAM-3
     add bfb243c Enable JSON_EXTRACT and TO_JSON_STRING in ZetaSQL
     add 3973137 Merge pull request #13195: Enable JSON_EXTRACT and TO_JSON_STRING in ZetaSQL
     add ac4b068 [BEAM-11130] Exclude OrderedListState VR tests.
     add 84db719 Merge pull request #13196: [BEAM-11130] Exclude OrderedListState VR tests from Dataflow V2

No new revisions were added by this update.

Summary of changes:
 build.gradle                                       |  1 +
 runners/google-cloud-dataflow-java/build.gradle    |  3 +
 .../zetasql/SupportedZetaSqlBuiltinFunctions.java  | 15 ++--
 sdks/java/io/hadoop-format/build.gradle            |  9 ++
 .../io/hadoop/format/HadoopFormatIOElasticIT.java  | 96 --
 sdks/python/apache_beam/coders/coders.py           |  5 +-
 sdks/python/apache_beam/internal/pickler.py        | 19 -
 website/www/site/content/en/blog/beam-2.24.0.md    |  3 +
 .../{static => content/en/blog}/beam-2.25.0.md     |  0
 website/www/site/content/en/roadmap/go-sdk.md      |  2 +-
 10 files changed, 133 insertions(+), 20 deletions(-)
 rename website/www/site/{static => content/en/blog}/beam-2.25.0.md (100%)
[beam] 01/01: Merge pull request #13196: [BEAM-11130] Exclude OrderedListState VR tests from Dataflow V2
This is an automated email from the ASF dual-hosted git repository.

kenn pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 84db7198f1fc325b649e37bf3803f47967a52d09
Merge: 3973137 ac4b068
Author: Kenn Knowles
AuthorDate: Mon Oct 26 18:52:21 2020 -0700

    Merge pull request #13196: [BEAM-11130] Exclude OrderedListState VR tests from Dataflow V2

 runners/google-cloud-dataflow-java/build.gradle | 3 +++
 1 file changed, 3 insertions(+)
[beam] branch master updated (3973137 -> 84db719)
This is an automated email from the ASF dual-hosted git repository.

kenn pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.

    from 3973137 Merge pull request #13195: Enable JSON_EXTRACT and TO_JSON_STRING in ZetaSQL
     add ac4b068 [BEAM-11130] Exclude OrderedListState VR tests.
     new 84db719 Merge pull request #13196: [BEAM-11130] Exclude OrderedListState VR tests from Dataflow V2

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.

Summary of changes:
 runners/google-cloud-dataflow-java/build.gradle | 3 +++
 1 file changed, 3 insertions(+)
[beam] branch master updated (38feb03 -> 3973137)
This is an automated email from the ASF dual-hosted git repository.

kenn pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.

    from 38feb03 Merge pull request #13183 from y1chi/BEAM-3
     add bfb243c Enable JSON_EXTRACT and TO_JSON_STRING in ZetaSQL
     add 3973137 Merge pull request #13195: Enable JSON_EXTRACT and TO_JSON_STRING in ZetaSQL

No new revisions were added by this update.

Summary of changes:
 .../sql/zetasql/SupportedZetaSqlBuiltinFunctions.java | 15 +++
 1 file changed, 7 insertions(+), 8 deletions(-)
[beam] branch master updated: [BEAM-11113] Switch default pickler compressor back to zlib for Coders in python sdk
This is an automated email from the ASF dual-hosted git repository.

goenka pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

The following commit(s) were added to refs/heads/master by this push:
     new b73fcc6 [BEAM-11113] Switch default pickler compressor back to zlib for Coders in python sdk
     new 38feb03 Merge pull request #13183 from y1chi/BEAM-3
b73fcc6 is described below

commit b73fcc6502624a1b19db28ec108db4a1424df2d4
Author: Yichi Zhang
AuthorDate: Fri Oct 23 13:41:37 2020 -0700

    [BEAM-11113] Switch default pickler compressor back to zlib for Coders in python sdk
---
 sdks/python/apache_beam/coders/coders.py    |  5 +++--
 sdks/python/apache_beam/internal/pickler.py | 19 +++
 2 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/sdks/python/apache_beam/coders/coders.py b/sdks/python/apache_beam/coders/coders.py
index bc35779..1641ae1 100644
--- a/sdks/python/apache_beam/coders/coders.py
+++ b/sdks/python/apache_beam/coders/coders.py
@@ -107,12 +107,13 @@ ConstructorFn = Callable[[Optional[Any], List['Coder'], 'PipelineContext'], Any]
 def serialize_coder(coder):
   from apache_beam.internal import pickler
   return b'%s$%s' % (
-      coder.__class__.__name__.encode('utf-8'), pickler.dumps(coder))
+      coder.__class__.__name__.encode('utf-8'),
+      pickler.dumps(coder, use_zlib=True))
 
 
 def deserialize_coder(serialized):
   from apache_beam.internal import pickler
-  return pickler.loads(serialized.split(b'$', 1)[1])
+  return pickler.loads(serialized.split(b'$', 1)[1], use_zlib=True)
 
 
 # pylint: enable=wrong-import-order, wrong-import-position
diff --git a/sdks/python/apache_beam/internal/pickler.py b/sdks/python/apache_beam/internal/pickler.py
index c4bfb44..395d511 100644
--- a/sdks/python/apache_beam/internal/pickler.py
+++ b/sdks/python/apache_beam/internal/pickler.py
@@ -39,6 +39,7 @@ import sys
 import threading
 import traceback
 import types
+import zlib
 from typing import Any
 from typing import Dict
 from typing import Tuple
@@ -241,7 +242,7 @@ if 'save_module' in dir(dill.dill):
 logging.getLogger('dill').setLevel(logging.WARN)
 
 
-def dumps(o, enable_trace=True):
+def dumps(o, enable_trace=True, use_zlib=False):
   # type: (...) -> bytes
 
   """For internal use only; no backwards-compatibility guarantees."""
@@ -260,18 +261,28 @@ def dumps(o, enable_trace=True):
   # Compress as compactly as possible (compresslevel=9) to decrease peak memory
   # usage (of multiple in-memory copies) and to avoid hitting protocol buffer
   # limits.
-  c = bz2.compress(s, compresslevel=9)
+  # WARNING: Be cautious about compressor change since it can lead to pipeline
+  # representation change, and can break streaming job update compatibility on
+  # runners such as Dataflow.
+  if use_zlib:
+    c = zlib.compress(s, 9)
+  else:
+    c = bz2.compress(s, compresslevel=9)
   del s  # Free up some possibly large and no-longer-needed memory.
 
   return base64.b64encode(c)
 
 
-def loads(encoded, enable_trace=True):
+def loads(encoded, enable_trace=True, use_zlib=False):
   """For internal use only; no backwards-compatibility guarantees."""
 
   c = base64.b64decode(encoded)
-  s = bz2.decompress(c)
+  if use_zlib:
+    s = zlib.decompress(c)
+  else:
+    s = bz2.decompress(c)
+
   del c  # Free up some possibly large and no-longer-needed memory.
 
   with _pickle_lock_unless_py2:
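The effect of the `use_zlib` flag in the diff above can be illustrated with a small standalone sketch. This is not Beam's `pickler` module: it assumes the standard-library `pickle` in place of dill, omits the locking and tracing logic, and uses a made-up `payload` dictionary. It only shows that the two compressors produce different encoded bytes for the same object, and that a payload must be decoded with the same flag it was encoded with.

```python
import base64
import bz2
import pickle
import zlib


def dumps(obj, use_zlib=False):
    """Pickle, compress, and base64-encode obj (simplified stand-in for the diff)."""
    s = pickle.dumps(obj)
    # Same branch as in the diff: zlib when requested, bz2 otherwise.
    c = zlib.compress(s, 9) if use_zlib else bz2.compress(s, compresslevel=9)
    return base64.b64encode(c)


def loads(encoded, use_zlib=False):
    """Inverse of dumps; the flag must match the one used for encoding."""
    c = base64.b64decode(encoded)
    s = zlib.decompress(c) if use_zlib else bz2.decompress(c)
    return pickle.loads(s)


# Hypothetical payload, used only for illustration.
payload = {"coder": "VarIntCoder", "args": [1, 2, 3]}

z = dumps(payload, use_zlib=True)
b = dumps(payload, use_zlib=False)

roundtrip_zlib = loads(z, use_zlib=True)
roundtrip_bz2 = loads(b, use_zlib=False)

# The serialized representations differ even though the object is identical.
encodings_differ = z != b


def mismatched_flags_fail():
    """Decoding zlib data with the bz2 branch raises instead of returning data."""
    try:
        loads(z, use_zlib=False)
    except (OSError, ValueError):
        return True
    return False
```

Because the encoded bytes depend on the compressor, a change of default alters the serialized pipeline representation, which is the compatibility concern the WARNING comment in the diff describes.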
[beam] branch master updated: Update beam-2.24.0.md
This is an automated email from the ASF dual-hosted git repository.

goenka pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

The following commit(s) were added to refs/heads/master by this push:
     new 9d76ee5 Update beam-2.24.0.md
     new 3f9f1aa Merge pull request #13182 from y1chi/patch-2
9d76ee5 is described below

commit 9d76ee555f50549ee7f50214083985ce49721bed
Author: Yichi Zhang
AuthorDate: Fri Oct 23 13:39:37 2020 -0700

    Update beam-2.24.0.md
---
 website/www/site/content/en/blog/beam-2.24.0.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/website/www/site/content/en/blog/beam-2.24.0.md b/website/www/site/content/en/blog/beam-2.24.0.md
index 6ec264f..461514c 100644
--- a/website/www/site/content/en/blog/beam-2.24.0.md
+++ b/website/www/site/content/en/blog/beam-2.24.0.md
@@ -55,6 +55,9 @@ For more information on changes in 2.24.0, check out the
   --temp_location, or pass method="STREAMING_INSERTS" to WriteToBigQuery ([BEAM-6928](https://issues.apache.org/jira/browse/BEAM-6928)).
 * Python SDK now understands `typing.FrozenSet` type hints, which are not interchangeable with `typing.Set`. You may need to update your pipelines if type checking fails. ([BEAM-10197](https://issues.apache.org/jira/browse/BEAM-10197))
 
+## Known Issues
+
+* ([BEAM-3](https://issues.apache.org/jira/browse/BEAM-3)) Default compressor change breaks dataflow python streaming job update compatibility.
 
 ## List of Contributors
[beam] branch asf-site updated: Publishing website 2020/10/26 18:04:02 at commit ba24b08
This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam.git

The following commit(s) were added to refs/heads/asf-site by this push:
     new 23686b8 Publishing website 2020/10/26 18:04:02 at commit ba24b08
23686b8 is described below

commit 23686b81c22cf43204da2e2beb8d2becf9511068
Author: jenkins
AuthorDate: Mon Oct 26 18:04:03 2020 +0000

    Publishing website 2020/10/26 18:04:02 at commit ba24b08
---
 website/generated-content/beam-2.25.0.md           |  85 ---
 .../generated-content/blog/beam-2.25.0/index.html  |  31 ++
 website/generated-content/blog/index.html          |   4 +-
 website/generated-content/blog/index.xml           | 117 +
 .../generated-content/categories/blog/index.xml    | 100 +-
 website/generated-content/categories/index.xml     |   2 +-
 website/generated-content/feed.xml                 | 107 +--
 website/generated-content/index.html               |   2 +-
 .../generated-content/roadmap/go-sdk/index.html    |   2 +-
 website/generated-content/roadmap/index.xml        |   2 +-
 website/generated-content/sitemap.xml              |   2 +-
 11 files changed, 264 insertions(+), 190 deletions(-)

diff --git a/website/generated-content/beam-2.25.0.md b/website/generated-content/beam-2.25.0.md
deleted file mode 100644
index d757147..0000000
--- a/website/generated-content/beam-2.25.0.md
+++ /dev/null
@@ -1,85 +0,0 @@
-title: "Apache Beam 2.25.0"
-date: 2020-10-23 14:00:00 -0800
-categories:
-  - blog
-authors:
-  - Robin Qiu
-
-We are happy to present the new 2.25.0 release of Apache Beam. This release includes both improvements and new functionality.
-See the [download page](/get-started/downloads/#2250-2020-10-23) for this release.
-For more information on changes in 2.25.0, check out the
-[detailed release notes](https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12347147).
-
-## Highlights
-
-* Splittable DoFn is now the default for executing the Read transform for Java based runners (Direct, Flink, Jet, Samza, Twister2). The expected output of the Read transform is unchanged. Users can opt-out using `--experiments=use_deprecated_read`. The Apache Beam community is looking for feedback for this change as the community is planning to make this change permanent with no opt-out. If you run into an issue requiring the opt-out, please send an e-mail to [u...@beam.apache.org](mailt [...]
-
-## I/Os
-
-* Added cross-language support to Java's KinesisIO, now available in the Python module `apache_beam.io.kinesis` ([BEAM-10138](https://issues.apache.org/jira/browse/BEAM-10138), [BEAM-10137](https://issues.apache.org/jira/browse/BEAM-10137)).
-* Update Snowflake JDBC dependency for SnowflakeIO ([BEAM-10864](https://issues.apache.org/jira/browse/BEAM-10864))
-* Added cross-language support to Java's SnowflakeIO.Write, now available in the Python module `apache_beam.io.snowflake` ([BEAM-9898](https://issues.apache.org/jira/browse/BEAM-9898)).
-* Added delete function to Java's `ElasticsearchIO#Write`. Now, Java's ElasticsearchIO can be used to selectively delete documents using `withIsDeleteFn` function ([BEAM-5757](https://issues.apache.org/jira/browse/BEAM-5757)).
-* Java SDK: Added new IO connector for InfluxDB - InfluxDbIO ([BEAM-2546](https://issues.apache.org/jira/browse/BEAM-2546)).
-
-## New Features / Improvements
-
-* Support for repeatable fields in JSON decoder for `ReadFromBigQuery` added. (Python) ([BEAM-10524](https://issues.apache.org/jira/browse/BEAM-10524))
-* Added an opt-in, performance-driven runtime type checking system for the Python SDK ([BEAM-10549](https://issues.apache.org/jira/browse/BEAM-10549)).
-More details will be in an upcoming [blog post](https://beam.apache.org/blog/python-performance-runtime-type-checking/index.html).
-* Added support for Python 3 type annotations on PTransforms using typed PCollections ([BEAM-10258](https://issues.apache.org/jira/browse/BEAM-10258)).
-More details will be in an upcoming [blog post](https://beam.apache.org/blog/python-improved-annotations/index.html).
-* Improved the Interactive Beam API where recording streaming jobs now start a long running background recording job. Running ib.show() or ib.collect() samples from the recording ([BEAM-10603](https://issues.apache.org/jira/browse/BEAM-10603)).
-* In Interactive Beam, ib.show() and ib.collect() now have "n" and "duration" as parameters. These mean read only up to "n" elements and up to "duration" seconds of data read from the recording ([BEAM-10603](https://issues.apache.org/jira/browse/BEAM-10603)).
-* Initial preview of [Dataframes](https://s.apache.org/simpler-python-pipelines-2020#slide=id.g905ac9257b_1_21) support.
-See also example at apache_beam/examples/wordcount_dataframe.py
-* Fixed support for
[beam] branch master updated: Move Beam 2.25.0 release blog to the correct directory
This is an automated email from the ASF dual-hosted git repository.

robinyqiu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

The following commit(s) were added to refs/heads/master by this push:
     new 7673f29 Move Beam 2.25.0 release blog to the correct directory
     new ba24b08 Merge pull request #13197 from robinyqiu/release-blog
7673f29 is described below

commit 7673f293812b12155c701c19b896cc9d69233438
Author: Yueyang Qiu
AuthorDate: Mon Oct 26 10:46:37 2020 -0700

    Move Beam 2.25.0 release blog to the correct directory
---
 website/www/site/{static => content/en/blog}/beam-2.25.0.md | 0
 1 file changed, 0 insertions(+), 0 deletions(-)

diff --git a/website/www/site/static/beam-2.25.0.md b/website/www/site/content/en/blog/beam-2.25.0.md
similarity index 100%
rename from website/www/site/static/beam-2.25.0.md
rename to website/www/site/content/en/blog/beam-2.25.0.md
[beam] branch master updated (d151352 -> b13a305)
This is an automated email from the ASF dual-hosted git repository.

aromanenko pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.

    from d151352 Fix URL to BEAM-9615 (#13190)
     add cad2c7c [BEAM-11058] Enable HadoopFormatIOElasticIT on Java PostCommit
     add b13a305 Merge pull request #13085: [BEAM-11058] Enable HadoopFormatIOElasticIT on Java PostCommit

No new revisions were added by this update.

Summary of changes:
 build.gradle                                       |  1 +
 sdks/java/io/hadoop-format/build.gradle            |  9 ++
 .../io/hadoop/format/HadoopFormatIOElasticIT.java  | 96 --
 3 files changed, 101 insertions(+), 5 deletions(-)
[beam] branch master updated (aca9099 -> d151352)
This is an automated email from the ASF dual-hosted git repository.

lostluck pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.

    from aca9099 Merge pull request #13192: [BEAM-10402] Move @Nullable annotations adjacent to the type they annotate
     add d151352 Fix URL to BEAM-9615 (#13190)

No new revisions were added by this update.

Summary of changes:
 website/www/site/content/en/roadmap/go-sdk.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)