October 2018 | Newsletter


What’s been done

New Beam Chair!

   -

   The Board has appointed Kenneth Knowles as the second chair of the
   Apache Beam project
   -

   You can find information about the Apache Beam Committee here
   <https://projects.apache.org/committee.html?beam>


Beam Dashboards (by: Mikhail Gryzykhin, Huygaa Batsaikhan)

   -

   Ongoing work to create Grafana dashboards for Beam.
   -

   See the current POC dashboards for pre-commit test duration
   <http://104.154.241.245/d/_TNndF2iz/pre-commit-tests?orgId=1> and
   post-commit greenness
   <http://104.154.241.245/d/D81lW0pmk/post-commit-tests?orgId=1> for more
   details.


Beam Website Sources Migrated to apache/beam (by: Scott Wegner)

   -

   Source code for the Beam website has been successfully migrated to the
   apache/beam repository
   -

   See Beam-Site Automation Reliability
   <https://s.apache.org/beam-site-automation> for more background
   information
   -

   Thanks to Jason Kuster and Melissa Pashniak for their mentorship, Thomas
   Weise and Robert Bradshaw for design feedback, and Alan Myrvold and Udi
   Meiri for implementation and documentation

Python SDK Support for User State and Timers (by: Charles Chen, Robert
Bradshaw)

   -

   Timely, stateful processing
   <https://beam.apache.org/blog/2017/08/28/timely-processing.html> has
   been ported to the Python SDK.
   -

   Supported in the direct runner and the portability harness. Support in
   the Flink and Dataflow runners is in progress.
   -

   Full design doc at http://s.apache.org/beam-python-user-state-and-timers


Graphite Metrics Sink (by: Etienne Chauchot)

   -

   Metrics Pusher can now push Beam metrics to Graphite


Euphoria Java 8 DSL (by: David Moravek, Vaclav Plajt, Marek Simunek)

   -

   A higher-level Java 8 DSL based on the Euphoria API project
   -

   https://beam.apache.org/documentation/sdks/java/euphoria/


Hosted the first EU Beam Summit! (by: Matthias Baetens, Alex Van Boxel,
Gris Cuevas, Victor Kotai)

   -

   You can take the feedback survey here!
   <https://docs.google.com/forms/d/e/1FAIpQLSdN0Qp_HYsrhV2hOmJEHWRf4BfAP9sFWWrglNyb_dZf_Tj49Q/viewform>


What we’re working on...

Donating the Dataflow Worker (by: Lukasz Cwik)

   -

   Donating the Dataflow worker code to the Apache Beam master branch


Python 3 Support (by, in alphabetical order: Ahmet Altay, Robert Bradshaw,
Charles Chen, Matthias Feys, Ruoyun Huang, Juta Staes, Simon Plovyt, Robbe
Sneyders, Valentyn Tymofieiev, Manu Zhang)

   -

   Active work is ongoing to support Python 3 in Beam.
   -

   Currently at least 10 active contributors, 4 of whom joined the
   community effort last month.
   -

   Contributions are welcome!
   <https://beam.apache.org/contribute/#python-3-support> This is an
   excellent opportunity to learn more about the Beam Python SDK!


Beam User Survey (by: Rose Nguyen, David Cavazos)

   -

   We need input from the community on your Beam experience and use cases!
   -

   Your survey results will be used to shape the Beam Cookbook
   -

   Find the survey here: Beam User Tasks
   <https://docs.google.com/forms/d/1H58NCAOqUxW1lBGrYQzhK4rb6UArXX-csp6QY6EVSRw>


Organizing Meetups in Bay Area (by: Austin Bennett)

   -

   Looking to start organizing events for Beam around San Francisco, CA
   -

   Contact Austin at ‘whatwouldausti...@gmail.com’ if you are interested in
   speaking and sharing about what you are doing with Beam!
   -

   General info found on Meetup page:
   https://www.meetup.com/San-Francisco-Apache-Beam/


Flink Portable Runner (by: Ankur Goenka, Maximilian Michels, Thomas Weise,
Ryan Williams, Robert Bradshaw)

   -

   All Java and Python ValidatesRunner tests pass for supported features.
   -

   Option to configure process-based execution in addition to Docker
   -

   Support for accessing local filesystem using process-based execution
   -

   Option to cache execution environment
   -

   Checkpointing for portable Pipelines
   -

   Fixed an SDK harness memory leak
   -

   Improved testing of components
   -

   Ability to pass custom pipeline options to the Runner
   -

   Integration of user state (state accessed within SDK user code)


RabbitMQ IO (by: Jean-Baptiste Onofré)

   -

   New IO to support RabbitMQ


HadoopFormatIO (by: Alexey Romanenko, David Moravek, David Hrbacek)

   -

   New IO to support Hadoop input and output formats.
   -

   https://s.apache.org/beam-streaming-hofio



New Members

New Contributors

   -

   Sam Rohde, Seattle, WA
   -

      Working on Cloud Dataflow and starting work on Beam soon


Talks & Meetups

Meet Apache Beam @ Prague

   -

   Czech & Slovak Hadoop User Group (CS HUG).
   -

   https://www.meetup.com/CS-HUG/events/255361277/
   -

   Vaclav Plajt, David Moravek



Resources

Indexing Documents into Elasticsearch using Cloud Dataflow (by: Sameer
Abhyankar)

   -

   This post demonstrates using Apache Beam to read JSON documents from
   Cloud Pub/Sub, enrich them with metadata stored in Cloud Bigtable, and
   index them into Elasticsearch. The pipeline also validates the documents
   for correctness and availability of metadata, and publishes any documents
   that fail validation to another Cloud Pub/Sub topic for debugging and
   eventual reprocessing. Medium Post
   <https://medium.com/google-cloud/using-cloud-dataflow-to-index-documents-into-elasticsearch-b3a31e999dfc>


*Until Next Time!*

*This edition was curated by our community of contributors, committers, and
PMC members. It contains work done in September 2018 and ongoing efforts. We hope
to provide visibility to what's going on in the community, so if you have
questions, feel free to ask in this thread.*
-- 
Rose Thị Nguyễn
