Different Beam project launched

2023-09-13 Thread Kerry Donny-Clark via dev
https://github.com/slai-labs/get-beam

This seems to overlap with our branding/messaging on ML.
Kerry


Re: Proposal for pyproject.toml Support in Apache Beam Python

2023-08-28 Thread Kerry Donny-Clark via dev
+1
Hi Anand,
I appreciate this effort. Managing python dependencies has been a major
pain point for me, and I think this approach would help.
Kerry

On Mon, Aug 28, 2023 at 10:14 AM Anand Inguva via dev 
wrote:

> Hello Beam Dev Team,
>
> I've compiled a design document
> [1]
> proposing the integration of pyproject.toml into Apache Beam's Python build
> process. Your insights and feedback would be invaluable.
>
> What is pyproject.toml?
> pyproject.toml is a configuration file that specifies a project's build
> dependencies and other project-related metadata in a standardized
> format. Before pyproject.toml, Python projects often had multiple
> configuration files (like setup.py, setup.cfg, and requirements.txt).
> pyproject.toml aims to centralize these configurations into one place,
> making project setups more organized and straightforward. One of the
> significant features enabled by pyproject.toml is the ability to perform
> isolated builds. This ensures that build dependencies are separated from
> the project's runtime dependencies, leading to more consistent and
> reproducible builds.
>
> [1]
> https://docs.google.com/document/d/17-y48WW25-VGBWZNyTdoN0WUN03k9ZhJjLp9wtyG1Wc/edit#heading=h.wskna8eurvjv
>
> Thanks,
> Anand
>


Re: [ANNOUNCE] New committer: Ahmed Abualsaud

2023-08-25 Thread Kerry Donny-Clark via dev
Well done Ahmed!

On Fri, Aug 25, 2023 at 9:17 AM Danny McCormick via dev 
wrote:

> Congrats Ahmed!
>
> On Fri, Aug 25, 2023 at 3:16 AM Jan Lukavský  wrote:
>
>> Congrats Ahmed!
>> On 8/25/23 07:56, Anand Inguva via dev wrote:
>>
>> Congratulations Ahmed :)
>>
>> On Fri, Aug 25, 2023 at 1:17 AM Damon Douglas 
>> wrote:
>>
>>> Well deserved! Congratulations, Ahmed! I'm so happy for you.
>>>
>>> On Thu, Aug 24, 2023, 5:46 PM Byron Ellis via dev 
>>> wrote:
>>>
>>>> Congratulations!
>>>>
>>>> On Thu, Aug 24, 2023 at 5:34 PM Robert Burke
>>>> wrote:
>>>>
>>>>> Congratulations Ahmed!!
>>>>>
>>>>> On Thu, Aug 24, 2023, 4:08 PM Chamikara Jayalath via dev <
>>>>> dev@beam.apache.org> wrote:
>>>>>
>>>>>> Congrats Ahmed!!
>>>>>>
>>>>>> On Thu, Aug 24, 2023 at 4:06 PM Bruno Volpato via dev <
>>>>>> dev@beam.apache.org> wrote:
>>>>>>
>>>>>>> Congratulations, Ahmed!
>>>>>>>
>>>>>>> Very well deserved!
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Aug 24, 2023 at 6:09 PM XQ Hu via dev
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Congratulations, Ahmed!
>>>>>>>>
>>>>>>>> On Thu, Aug 24, 2023, 5:49 PM Ahmet Altay via dev <
>>>>>>>> dev@beam.apache.org> wrote:
>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> Please join me and the rest of the Beam PMC in welcoming a new
>>>>>>>>> committer: Ahmed Abualsaud (ahmedabuals...@apache.org).
>>>>>>>>>
>>>>>>>>> Ahmed has been part of the Beam community since January 2022,
>>>>>>>>> working mostly on IO connectors, made a large amount of contributions
>>>>>>>>> to make Beam IOs more usable, performant, and reliable. And at the same
>>>>>>>>> time Ahmed was active in the user list and at the Beam summit helping
>>>>>>>>> users by sharing his knowledge.
>>>>>>>>>
>>>>>>>>> Considering their contributions to the project over this
>>>>>>>>> timeframe, the Beam PMC trusts Ahmed with the responsibilities of a
>>>>>>>>> Beam committer. [1]
>>>>>>>>>
>>>>>>>>> Thank you Ahmed! And we are looking to see more of your
>>>>>>>>> contributions!
>>>>>>>>>
>>>>>>>>> Ahmet, on behalf of the Apache Beam PMC
>>>>>>>>>
>>>>>>>>> [1]
>>>>>>>>>
>>>>>>>>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>>>>>>>>
>>>>>>>>>


Re: [PROPOSAL] Design Doc template for PTransforms

2023-08-24 Thread Kerry Donny-Clark via dev
Thanks Kenn! I think this would be a great community resource. While we
don't want to enforce usage, perhaps we could introduce tooling to check
basic compliance and raise a warning, similar to a linter or test suite.
Kerry

On Thu, Aug 24, 2023 at 10:25 AM Kenneth Knowles  wrote:

> Hi all,
>
> Based on some work I've been doing internally, I put together a public
> version of a design doc template for PTransforms.
>
> https://s.apache.org/ptransform-design-doc
>
> A major goal is to be explicit about important questions that make a
> transform robust:
>
>  - what are "all" the parameters to a transform?
>  - how could a transform fail?
>  - how could we monitor or measure the transform?
>  - how could we use a transform in a new context like YAML or a new SDK?
>
> All of these together add up to a PTransform being a more self-contained
> piece of software that can be understood and used in novel ways, instead of
> just defined by the code and behavior that may accrete over time tightly
> coupled to the SDK it was written with.
>
> LMK what you think. Of course, I can't force anyone to use it or not use
> it, except for my team internal to my employer :-)
>
> Kenn
>


Re: Beam Website Feedback

2023-08-24 Thread Kerry Donny-Clark via dev
Thanks Jonas. Can you please submit a quick PR to update the text? I'm
happy to review.
Kerry

On Wed, Aug 23, 2023, 10:03 PM Jonas Eyob  wrote:

> Hi on this page:
> https://beam.apache.org/documentation/io/built-in/google-bigquery/#storage-api
>
>
>
> Under “Using Storage Read API” there is a paragraph and example showing
> how the Beam SDK for Python can use the BigQuery Storage API.
>
> But the code snippet box just below it says *“The SDK for Python does not
> support the BigQuery Storage API”*
>
>
>
> As it seems supported perhaps this needs updating.
>
>
>
> Cheers,
>
> Jonas
>


Re: [RFC] Bootloader Buffered Logging

2023-08-17 Thread Kerry Donny-Clark via dev
Much appreciated, reviewing the doc now.

On Wed, Aug 16, 2023 at 9:08 PM Valentyn Tymofieiev via dev <
dev@beam.apache.org> wrote:

> Thanks, Jack! left some comments, looking forward to this work!
>
> On Wed, Aug 16, 2023 at 10:31 AM Robert Burke  wrote:
>
>> I've added some comments but generally +1 on this.
>>
>> A later change might be able to build from this to ensure the various
>> STDErr and STDOut logs from the SDK harness executions are always plumbed
>> as described.
>>
>> But that would take more thought since other incidental logs from the
>> users worker binary (sic) might be misconstrued as serious when they were
>> largely benign noise previously ignored (since they were invisible).
>>
>> On Wed, Aug 16, 2023, 9:57 AM Jack McCluskey via dev 
>> wrote:
>>
>>> Hey everyone,
>>>
>>> I've written a small design doc around implementing some buffered
>>> logging for the Beam boot.go scripts that is available at
>>> https://s.apache.org/beam-buffered-logging. This should help surface
>>> errors that occur during worker set-up (like issues with dependency
>>> installation) that tend to be logged improperly at INFO.
>>>
>>> Thanks,
>>>
>>> Jack McCluskey
>>>
>>> --
>>>
>>>
>>> Jack McCluskey
>>> SWE - DataPLS PLAT/ Dataflow ML
>>> RDU
>>> jrmcclus...@google.com
>>>
>>>
>>>
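
The general idea in the proposal above (hold on to set-up output and surface it at an appropriate severity once the outcome is known) can be sketched as follows. This is only an illustration under stated assumptions: the real work targets Beam's Go boot.go scripts, while this sketch is in Python, and run_buffered, ListHandler, and the "boot-sketch" logger name are hypothetical names that do not come from the design doc.

```python
import logging
import subprocess
import sys


class ListHandler(logging.Handler):
    """Collects (level name, message) pairs so the result is easy to inspect."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append((record.levelname, record.getMessage()))


def run_buffered(cmd, logger):
    """Run cmd and buffer its output instead of logging every line at INFO.

    If the command succeeds, the buffered lines are logged at DEBUG.
    If it fails, they are replayed at ERROR so the cause is visible.
    """
    result = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    level = logging.DEBUG if result.returncode == 0 else logging.ERROR
    for line in result.stdout.splitlines():
        logger.log(level, line)
    return result.returncode


logger = logging.getLogger("boot-sketch")
logger.setLevel(logging.DEBUG)
logger.propagate = False
handler = ListHandler()
logger.addHandler(handler)

# Stand-in for a failing dependency installation step.
failing_cmd = [
    sys.executable,
    "-c",
    "import sys; print('pip install failed'); sys.exit(3)",
]
code = run_buffered(failing_cmd, logger)
print(code, handler.records)
```

Deciding the log level after the command exits is the design choice that matters here: the same output is noise when the step succeeds and essential when it fails.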


Re: [Discuss] Get rid of OWNERS files

2023-08-08 Thread Kerry Donny-Clark via dev
Thanks Danny!
I agree. OWNERS causes unnecessary friction, and doesn't provide value.
Kerry

On Tue, Aug 8, 2023 at 10:55 AM Danny McCormick via dev 
wrote:

> Hey everyone, I'd like to propose getting rid of OWNERS files from the
> Beam repo. Right now, I don't think they are serving a meaningful purpose:
>
> - Many OWNERS files are outdated and point to people who are no longer
> actively involved in the project (examples: 1, 2, 3; there are many more)
> - Many dependencies don't have owners assigned
> - Many major directories function fine without OWNERS files
> - We lack sufficient documentation of what OWNERS files mean (
> https://s.apache.org/beam-owners is not helpful and I couldn't find other
> resources)
> - We now have the review bot to automatically assign reviewers based on
> areas of ownership. That has proven more likely to stay up to date.
>
> Given all of these, I don't see any obvious usefulness for OWNERS files.
> Please chime in if you disagree (or agree). If there are no objections I'll
> assume silent consensus and remove them next week.
>
> Thanks,
> Danny
>


Re: Asgarde: Error Handling for Beam?

2023-06-15 Thread Kerry Donny-Clark via dev
This looks like an excellent contribution. I can easily understand the
motivation, and I think Beam would benefit from a higher level abstraction
for error handling.
Kerry

On Wed, Jun 14, 2023, 6:31 PM Austin Bennett  wrote:

> Hi Beam Devs,
>
> @Mazlum was encouraged to consider donating Asgarde to Beam for
> Java/Kotlin error handling [ see:
> https://2022.beamsummit.org/sessions/error-handling-asgarde/ for last
> year's Beam Summit talk ]. He is also the author of Pasgarde [ for
> Python ] and Milgard [ for a simplified Kotlin API ].
>
> Would Asgarde be a good contribution, something the Beam community would
> be willing to accept?  I imagine we might want it to live at
> github.com/apache/beam-asgarde ?  Or perhaps there is a good place in
> github.com/apache/beam ??
>
> Especially once/if officially part of Beam, I imagine we'd add follow-up
> items like getting onto the website/docs, and related.
>
> Cheers,
> Austin
>
>
> P.S.  This might warrant separate/additional conversations for his other
> libraries, but let's focus any discussion on Asgarde for now?
>


Re: [beam-starter-typescript]: Missing place to create issue

2023-06-14 Thread Kerry Donny-Clark via dev
Jack may also be able to help you create an issue.
Kerry

On Wed, Jun 14, 2023, 1:09 PM XQ Hu via dev  wrote:

> I believe Robert is the owner for that project.
>
> On Mon, Jun 12, 2023 at 11:30 PM david-kh...@hotmail.com <
> david-kh...@hotmail.com> wrote:
>
>> Hi Beam community,
>>
>>
>>
>> I am David, and I am new to the community. After trying to tweak some code
>> from beam-starter-ts, I found some issues that I want to raise. But there
>> is no way for me to create a GitHub issue in the same project
>>
>> apache/beam-starter-typescript: Apache beam (github.com).
>>
>>
>>
>> I also double-checked Contribute.md and still have no idea.
>>
>>
>>
>> Would you mind guiding me to the right path?
>>
>>
>>
>> Regards,
>>
>> David L.
>>
>


[Proposal] DNS name for Tour of Beam site

2023-05-26 Thread Kerry Donny-Clark via dev
Hi all,
I would like to update everyone on a small DNS change for the Tour Of Beam,
a new Beam interactive learning app we've been working on. We are excited
to get it live and share it with the community, and one of the remaining
steps for us is to make a DNS record for 'tour.beam.apache.org' and point
it to our production environment hosted in Firebase. This does not require
any action from you, but rather it's an update to the community about a
change to our infra. Feel free to ask questions if there are any, and look
out for a live Tour of Beam site as soon as we get this completed.

Kerry


Re: [ANNOUNCE] New committer: Damon Douglas

2023-04-24 Thread Kerry Donny-Clark via dev
Damon, you have done outstanding work to grow and improve Beam and the Beam
community. Well done, well deserved!

On Mon, Apr 24, 2023 at 4:39 PM XQ Hu via dev  wrote:

> Congrats Damon!!!
>
> On Mon, Apr 24, 2023 at 4:34 PM Danny McCormick via dev <
> dev@beam.apache.org> wrote:
>
>> Congrats Damon!
>>
>> On Mon, Apr 24, 2023 at 4:03 PM Ahmet Altay via dev 
>> wrote:
>>
>>> Congratulations Damon!
>>>
>>> On Mon, Apr 24, 2023 at 1:00 PM Robert Burke  wrote:
>>>
>>>> Congratulations Damon!!!
>>>>
>>>> On Mon, Apr 24, 2023, 12:52 PM Kenneth Knowles wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> Please join me and the rest of the Beam PMC in welcoming a new
>>>>> committer: Damon Douglas (damondoug...@apache.org)
>>>>>
>>>>> Damon has contributed widely: Beam Katas, playground, infrastructure,
>>>>> and many IO connectors. Damon does lots of code review in addition to
>>>>> code. (yes, you can review code as a non-committer!)
>>>>>
>>>>> Considering their contributions to the project over this timeframe,
>>>>> the Beam PMC trusts Damon with the responsibilities of a Beam committer.
>>>>> [1]
>>>>>
>>>>> Thank you Damon! And we are looking to see more of your contributions!
>>>>>
>>>>> Kenn, on behalf of the Apache Beam PMC
>>>>>
>>>>> [1]
>>>>>
>>>>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>>>>



Re: [ANNOUNCE] New committer: Anand Inguva

2023-04-24 Thread Kerry Donny-Clark via dev
Great work Anand, this is well deserved.


On Mon, Apr 24, 2023 at 10:35 AM Yi Hu via dev  wrote:

> Congrats Anand!
>
> On Fri, Apr 21, 2023 at 3:54 PM Danielle Syse via dev 
> wrote:
>
>> Congratulations!
>>
>> On Fri, Apr 21, 2023 at 3:53 PM Damon Douglas via dev <
>> dev@beam.apache.org> wrote:
>>
>>> Congratulations Anand!
>>>
>>> On Fri, Apr 21, 2023 at 12:28 PM Ritesh Ghorse via dev <
>>> dev@beam.apache.org> wrote:
>>>
>>>> Congratulations Anand!
>>>>
>>>> On Fri, Apr 21, 2023 at 3:24 PM Ahmed Abualsaud via dev <
>>>> dev@beam.apache.org> wrote:
>>>>
>>>>> Congrats Anand!
>>>>>
>>>>> On Fri, Apr 21, 2023 at 3:18 PM Anand Inguva via dev <
>>>>> dev@beam.apache.org> wrote:
>>>>>
>>>>>> Thanks everyone. Really excited to be a part of Beam Committers.
>>>>>>
>>>>>> On Fri, Apr 21, 2023 at 3:07 PM XQ Hu via dev
>>>>>> wrote:
>>>>>>
>>>>>>> Congratulations, Anand!!!
>>>>>>>
>>>>>>> On Fri, Apr 21, 2023 at 2:31 PM Jack McCluskey via dev <
>>>>>>> dev@beam.apache.org> wrote:
>>>>>>>
>>>>>>>> Congratulations, Anand!
>>>>>>>>
>>>>>>>> On Fri, Apr 21, 2023 at 2:28 PM Valentyn Tymofieiev via dev <
>>>>>>>> dev@beam.apache.org> wrote:
>>>>>>>>
>>>>>>>>> Congratulations!
>>>>>>>>>
>>>>>>>>> On Fri, Apr 21, 2023 at 8:19 PM Jan Lukavský
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Congrats Anand!
>>>>>>>>>> On 4/21/23 20:05, Robert Burke wrote:
>>>>>>>>>>
>>>>>>>>>> Congratulations Anand!
>>>>>>>>>>
>>>>>>>>>> On Fri, Apr 21, 2023, 10:55 AM Danny McCormick via dev <
>>>>>>>>>> dev@beam.apache.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> Woohoo, congrats Anand! This is very well deserved!
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Apr 21, 2023 at 1:54 PM Chamikara Jayalath <
>>>>>>>>>>> chamik...@apache.org> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>
>>>>>>>>>>>> Please join me and the rest of the Beam PMC in welcoming a new
>>>>>>>>>>>> committer: Anand Inguva (ananding...@apache.org)
>>>>>>>>>>>>
>>>>>>>>>>>> Anand has been contributing to Apache Beam for more than a year
>>>>>>>>>>>> and authored and reviewed more than 100 PRs. Anand has been a core
>>>>>>>>>>>> contributor to Beam Python SDK and drove the efforts to support
>>>>>>>>>>>> Python 3.10 and Python 3.11.
>>>>>>>>>>>>
>>>>>>>>>>>> Considering their contributions to the project over this
>>>>>>>>>>>> timeframe, the Beam PMC trusts Anand with the responsibilities of
>>>>>>>>>>>> a Beam committer. [1]
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you Anand! And we are looking to see more of your
>>>>>>>>>>>> contributions!
>>>>>>>>>>>>
>>>>>>>>>>>> Cham, on behalf of the Apache Beam PMC
>>>>>>>>>>>>
>>>>>>>>>>>> [1]
>>>>>>>>>>>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>>>>>>>>>>>
>>>>>>>>>>>


Re: [DISCUSS] @Experimental, @Internal, @Stable, etc annotations

2023-04-17 Thread Kerry Donny-Clark via dev
+1 to eliminating @Experimental as a Beam level annotation.
I think the main point is that if no one pays attention to such
annotations, then they are only noise and deliver negative value.
Kerry

PS- Kenn says " the point about the culture of stagnation came from my
recent experiences as code reviewer where there was some idea that we
couldn't change things even when they were plainly wrong and the change was
plainly a fix." This seems like a major point that deserves a more focused
discussion.

On Fri, Apr 14, 2023 at 5:47 PM Chamikara Jayalath via dev <
dev@beam.apache.org> wrote:

> I think we've been using the Java Experimental tags in two ways.
>
> * New APIs
> * Any APIs that use specific features identified by pre-defined
> experimental Kind types defined in [1] (for example, I/O connectors APIs
> that use Beam Schemas).
>
> Removing the experimental tag has the effect of finalizing a number of
> APIs we've been reluctant to call stable (for example, Beam Schemas,
> portability, metrics related APIs). These APIs have been around for a long
> time and I don't see them changing so probably this is the right thing to
> do. But I just wanted to call it out.
>
> Thanks,
> Cham
>
> [1]
> https://github.com/apache/beam/blob/b9f27f9da2e63b564feecaeb593d7b12783192b0/sdks/java/core/src/main/java/org/apache/beam/sdk/annotations/Experimental.java#L48
>
> On Fri, Apr 14, 2023 at 1:26 PM Ahmet Altay via dev 
> wrote:
>
>>
>>
>> On Fri, Apr 14, 2023 at 1:15 PM Kenneth Knowles  wrote:
>>
>>>
>>> Thanks for the discussion. Many good points. Probably just removing all
>>> the annotations is a noop to users, and will solve the "afraid to use
>>> experimental features" problem.
>>>
>>> Regarding stability, the capabilities of Java (and Python is much much
>>> worse) make it infeasible to produce quality software with the rule "once
>>> it is public it is frozen forever". But on the other hand, there isn't much
>>> of a practical alternative. Most projects just make breaking changes at
>>> minor releases quite often, in my experience. I don't want to follow that
>>> pattern, for sure.
>>>
>>> Regarding Danny's comment of not seeing this culture - check out any of
>>> our more mature IOs, which all have very high cyclomatic complexity due to
>>> never being significantly refactored. Adhering to in-place state
>>> compatibility for update instead of focusing on blue/green deployment is
>>> also a culprit here. I don't have examples to mind, but the point about the
>>> culture of stagnation came from my recent experiences as code
>>> reviewer where there was some idea that we couldn't change things even when
>>> they were plainly wrong and the change was plainly a fix.
>>>
>>> Often, it comes from corners like triggered side inputs where we simply
>>> never had a clear concept and so bringing things into alignment with a spec
>>> will break someone, by necessity. To be clear: I have not received pushback
>>> on that one (yet). Some other examples are
>>> https://s.apache.org/finishing-triggers-drop-data (breaking change
>>> necessary to eliminate data loss risk)
>>> https://github.com/apache/beam/issues/20528 (fix was too slow because
>>> we were hesitant to commit a breaking fix)
>>> https://github.com/apache/beam/pull/8134#pullrequestreview-218592801
>>> (left unsafe API in place, applied doc-only fix).
>>>
>>> But indeed, of all the issues I raised, the customer concern with
>>> `@Experimental` was the most important. We have had a few threads about it
>>> in the past, too, and it hasn't gotten better.
>>>
>>>  1. It does not have the intended effect (making users OK with evolving
>>> APIs and behavior to allow us to reach a high level of quality)
>>>  2. It has an unintended effect (making users afraid to use things which
>>> they should be happy to use)
>>>  3. We don't use it consistently (many less-safe things are not
>>> experimental, many totally stable things are experimental)
>>>
>>> Because of 3, if we don't have a feasible way to move to
>>> "evolving/unstable by default" in a way that users know and are OK with,
>>> then 1 is impossible. And so the only way to fix 2 is to just eliminate the
>>> annotation approach entirely and go with language conventions.
>>>
>>
>> +1 to eliminating @Experimental as a Beam level annotation. That is the
>> simplest approach that will get us to a consistent state, and it will align
>> the goals and intentions of us with users'.
>>
>>
>>>
>>> Kenn
>>>
>>> On Wed, Apr 12, 2023 at 5:10 PM Ahmet Altay via dev 
>>> wrote:
>>>
>>>> I agree with Alexey and Byron.
>>>> 1. We do not have any concrete evidence of our users paying attention
>>>> to any of those annotations. Experimental API that were in that state for a
>>>> long while are good examples. A possible exception is a deprecated
>>>> annotation. My preference would be to simplify annotations to nothing
>>>> (stable enough for use and will evolve backward compatibility), and maybe
>>>> deprecated annotations.
>>>> 2. If you 

Re: Beam Release DockerHub Group

2023-04-17 Thread Kerry Donny-Clark via dev
+1, should there also be an update to remove folks who are not active on
the project?
Kerry

On Mon, Apr 17, 2023 at 11:40 AM Jack McCluskey via dev 
wrote:

> +1 to simplifying the infra side, especially with an aim towards
> automating the processes we can. The more we can streamline and simplify
> the better.
>
> On Mon, Apr 17, 2023 at 11:18 AM Danny McCormick via dev <
> dev@beam.apache.org> wrote:
>
>> Hey everyone, in an effort to reduce the burden of running a Beam
>> release, a few committers (self included) have volunteered to try to take a
>> larger role in releases (including both running them and contributing to
>> making them better going forward). To aid in that process, I would like to
>> request that they all be added to our Beam DockerHub group. Those
>> committers are:
>>
>> - damccorm
>> - jrmccluskey
>> - kennknowles
>> - lostluck
>> - abacn
>>
>> At the same time, Infra would like us to reduce the number of people with
>> DockerHub seats to 5 because they have a limited number of seats for all
>> of Apache. Currently, we have 2 groups taking up 10 seats: Beam admin and
>> Beam maintainers.
>>
>> Beam admin has admin privileges over most (though not quite all) of our
>> DockerHub repos and includes:
>>
>> - aaltay
>> - hannahjiang
>> - kileysok
>> - pabloem
>> - robertwb
>>
>> Beam maintainers has write privileges and some additional admin
>> privileges and includes:
>>
>> - aaltay
>> - chamikaramj
>> - kennknowles
>> - kileysok
>> - robertwb
>>
>> To get down to 5 seats, I propose we consolidate to a single group with
>> admin privileges and add just the committers I mentioned since they will
>> likely be the most actively involved in the release process in the short
>> term. A future goal of mine is to automate the DockerHub release steps so
>> that we just need 2 dockerhub seats: 1 for the automation and 1 for an easy
>> manual fallback (probably for the PMC chair).
>>
>> If you have any concerns (or would like to help with this effort), please
>> respond here. Otherwise I will follow up with infra to make this change in
>> a day.
>>
>> Thanks,
>> Danny
>>
>


Re: 2.47.0 Release Update and Brief Code Freeze

2023-04-06 Thread Kerry Donny-Clark via dev
Thanks for the transparent and quick explanation! Identifying an issue,
taking steps to correct it, and documenting what went wrong is exactly how
we should address issues like this.
Kerry

On Wed, Apr 5, 2023, 4:53 PM Jack McCluskey via dev 
wrote:

> Hey everyone,
>
> While I was working on cutting the release branch this afternoon I hit a
> snag with the release script, and in trying to revert the halfway-completed
> work on that the entire Beam repo was (briefly)
> deleted. This has been fixed, but as a consequence of this and the
> permissions on the repo the history for the repo is messed up (AKA every
> file has a removal and restoration in its history now.) While we work with
> infra to gain permission to amend the history (tracking for that will be at
> https://issues.apache.org/jira/browse/INFRA-24433) we will have a brief
> code freeze to ensure that everything is restored properly.
>
> I take full responsibility for the incident and will be working on
> diagnosing how this happened + improving the release process so this can't
> happen again.
>
> Thanks,
>
> Jack McCluskey


Re: Launch Dataflow Flex Templates from Go

2023-02-15 Thread Kerry Donny-Clark via dev
Jack added the Go templates capabilities, he should be able to help you
out.

On Wed, Feb 15, 2023, 12:37 AM Ashok KS  wrote:

> Hi Shivam,
>
> Thanks a lot for your response. I did check the http request. But I wanted
> to see if I can use the Google API client Library.
> The docs show a Python example for it shown below. I wanted to know if
> there is something similar with Go.
>
> from googleapiclient.discovery import build
>
> # project = 'your-gcp-project'
> # job = 'unique-job-name'
> # template = 'gs://dataflow-templates/latest/Word_Count'
> # parameters = {
> #     'inputFile': 'gs://dataflow-samples/shakespeare/kinglear.txt',
> #     'output': 'gs:///wordcount/outputs',
> # }
>
> dataflow = build('dataflow', 'v1b3')
> request = dataflow.projects().templates().launch(
>     projectId=project,
>     gcsPath=template,
>     body={
>         'jobName': job,
>         'parameters': parameters,
>     }
> )
>
> response = request.execute()
>
>


Re: Exploring existing Beam features to prevent web service API overuse

2023-02-09 Thread Kerry Donny-Clark via dev
Thanks Damon, I appreciate the data-driven effort you are making to test
different approaches to rate limiting in Beam.

On Thu, Feb 9, 2023 at 12:54 AM Damon Douglas 
wrote:

> Hello Everyone,
>
> The following exploratory study proposal aims to evaluate existing
> features of Beam to prevent web service API overusage.  API providers
> typically design for application workloads smaller than parallelized data
> processing.  This presents a challenge when using these resources to read
> from and write to in the context of Beam.
>
> The study defines primary and secondary measures, identifying experimental
> groups based on a key Beam feature applied to its pipeline design, such as
> windowing or the State API.  The data will foster evidence based approaches
> to designing a solution to this problem.
>
>
> https://docs.google.com/document/d/1VZ9YphDO7kewBSz5oMXVPHWaib3S03Z6aZ66BhciB3E/edit?usp=sharing=0-ItxMSG72EzfSwVedSz-Zeg
>
> Best,
>
> Damon
>


Re: Achievement unlocked: fully triaged

2022-12-06 Thread Kerry Donny-Clark via dev
I really like the idea of multi-select and automatic "awaiting triage".
Kenn, I think the list you have looks good to me.

On Tue, Dec 6, 2022 at 1:55 PM Kenneth Knowles  wrote:

> Noting that what you've listed are the options in the issue template,
> which are then expanded to multiple labels. So focusing on the issue
> template, I like the general idea, but maybe we can simplify it even more:
>
> When a user is filing a bug, I think a good outcome is for it to get into
> the right person's saved search (like Go, Python, etc) while still having
> the "awaiting triage" label on it.
>
> What if we just went all the way simple and had checkboxes for just the
> highest level. Something like the following:
>
> Which language SDK or feature is related to your report? (check all that
> apply)
> [ ] Python
> [ ] Java
> [ ] Go
> [ ] Typescript
> [ ] IO connector
> [ ] Beam examples
> [ ] Beam playground
> [ ] Beam katas
> [ ] Website
> [ ] Spark Runner
> [ ] Flink Runner
> [ ] Samza Runner
> [ ] Twister2 Runner
> [ ] Hazelcast Jet Runner
> [ ] Google Cloud Dataflow Runner
>
> We could even trim it even further to just language, and let the person
> doing triage handle the rest.
>
> Kenn
>
> On Tue, Dec 6, 2022 at 9:11 AM Danny McCormick via dev <
> dev@beam.apache.org> wrote:
>
>> > Is it possible to not have a default option?
>>
>> Sadly, no AFAIK. I agree this would help. We could try things like making
>> the default " " and auto-closing issues that don't pick something other
>> than the default, that's a pretty rough experience though and not worth it
>> IMO.
>>
>> > I definitely think reducing the label zoo could help.
>>
>> What's our desired end state here? I put together a doc with my suggested
>> labels -
>> https://docs.google.com/document/d/1FpaFr_Sdg217ogd5oMDRX4uLIMSatKLF_if9CzLg9tM/edit?usp=sharing
>>  -
>> listed below as well for convenience. Please comment in the doc if you have
>> thoughts/labels you care about, or continue the email thread if you have
>> bigger ideas (e.g. getting rid of labels, changing our templates entirely
>> instead, etc...).
>>
>> *Danny's Proposed Labels:*
>>
>>
>> - beam-community
>> - beam-playground
>> - community-metrics
>> - cross-language
>> - examples-java
>> - examples-python
>> - extensions
>> - infrastructure
>> - io-go
>> - io-ideas
>> - io-java
>> - io-py
>> - katas
>> - release
>> - run-inference
>> - runner
>> - runner-dataflow
>> - runner-direct
>> - runner-flink
>> - runner-samza
>> - runner-spark
>> - runner-universal
>> - sdk-go
>> - sdk-ideas
>> - sdk-java
>> - sdk-py
>> - sdk-typescript
>> - test-failures
>> - website
>>
>>
>> On Tue, Dec 6, 2022 at 11:17 AM Bjorn Pedersen 
>> wrote:
>>
>>> As someone still newer to Beam, I can attest that the number of labels
>>> can be overwhelming.
>>>
>>> Is it possible to not have a default option? Even just getting people to
>>> interact with the dropdown might go a long way, especially if the labels
>>> were fewer and clearer.
>>>
>>> Bjorn
>>>
>>> On Mon, Dec 5, 2022 at 6:46 PM Kenneth Knowles  wrote:
>>>
>>>> I definitely think reducing the label zoo could help. We have a lot of
>>>> labels that are decompositions of what used to be Jira components.
>>>>
>>>> Kenn
>>>>
>>>> On Mon, Dec 5, 2022 at 12:17 PM Danny McCormick via dev <
>>>> dev@beam.apache.org> wrote:
>>>>
>>>>> > Previously, we had automation that would automatically mark
>>>>> self-assigned self-reported issues as triaged. That is probably a third of
>>>>> issues or more.
>>>>>
>>>>> I believe that automatio

Re: Gradle Task Configuration Avoidance

2022-12-05 Thread Kerry Donny-Clark via dev
Thanks Damon! I really appreciate how clear your emails are here. Instead
of my usual feeling of "I don't quite understand, and don't have time to
get context" I can read all the context in the mail.
This error message had confused me, so I really appreciate the cleanup and
explanation.

On Fri, Dec 2, 2022, 7:28 PM Damon Douglas via dev 
wrote:

> Hello Everyone,
>
> *If you are new to Beam and coming from non-Java language conventions, it
> is likely you are new to gradle.  At the end of this email is a list of
> definitions and references to help understand this email.*
>
> *Short Version (For those who know gradle)*:
> A pull request [1] may fix the continual error message "Error: Backend
> initialization required, please run "terraform init"".  The PR applies Task
> Configuration Avoidance [2] by changing a few tasks from
> task(String) to tasks.register(String).
>
> *Long Version (For those who are not as familiar with gradle)*:
>
> I write this not as an expert but as someone still learning.  Gradle [3]
> is the software we use in the Beam repository to automate many needed tasks
> associated with building and testing code.  It is typically used in Java
> projects but can be extended for other purposes.  We store code related to
> our Beam Playground [4] that also uses gradle though it is not mainly a
> Java project.  The unit of work for Gradle is what is called a task.  To
> run a task you open a terminal and type "./gradlew nameOfMyTask".  There
> are two main ways to create a custom task in our build.gradle files.  One
> is writing task("doSomething") and the other is
> tasks.register("doSomethingElse").  According to [2], the recommendation is
> to use the tasks.register("doSomething").  This avoids executing other work
> (configuration but don't worry about it for now) until one runs the
> doSomething task or another task we are running depends on it.
>
> So why were we seeing this "Error: Backend initialization required"
> message all the time?  The reason is that tasks were configured as
> task("doSomething").  All I had to do was change this to
> tasks.register("doSomething") and it removed the message.
>
> *Definitions/References*
>
> 1. https://github.com/apache/beam/pull/24509
> 2.
> https://docs.gradle.org/current/userguide/task_configuration_avoidance.html
> 3. https://docs.gradle.org/current/userguide/what_is_gradle.html
> 4. https://play.beam.apache.org/
>
> *Suggested Learning Path To Understand This Email*
> 1.
> https://docs.gradle.org/current/samples/sample_building_java_libraries.html
> 2. https://docs.gradle.org/current/userguide/build_lifecycle.html
> 3. https://docs.gradle.org/current/userguide/tutorial_using_tasks.html
> 4.
> https://docs.gradle.org/current/userguide/task_configuration_avoidance.html
>
> Best,
>
> Damon
>
>
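
The difference between eager and lazy task creation described above can be
sketched with a toy model. This is not Gradle's implementation or API: the
TaskContainer class, its create/register/run methods, and the task names are
all hypothetical, and the sketch assumes the error message came from work
done while a task was being configured.

```python
from typing import Callable, Dict, List


class TaskContainer:
    """Toy stand-in for a build tool's task container (not Gradle's real API)."""

    def __init__(self) -> None:
        # Names of tasks whose configuration block has actually run.
        self.configured: List[str] = []
        # Configuration blocks that were registered but not yet run.
        self._pending: Dict[str, Callable[[], None]] = {}

    def create(self, name: str, configure: Callable[[], None]) -> None:
        # Eager style, like task("doSomething"): configure immediately,
        # even if the task is never requested.
        configure()
        self.configured.append(name)

    def register(self, name: str, configure: Callable[[], None]) -> None:
        # Lazy style, like tasks.register("doSomething"): only remember the block.
        self._pending[name] = configure

    def run(self, name: str) -> None:
        # A lazily registered task is configured only when it is requested.
        configure = self._pending.pop(name, None)
        if configure is not None:
            configure()
            self.configured.append(name)


def demo() -> Dict[str, List[str]]:
    messages: List[str] = []

    def noisy_config() -> None:
        # Assumption: configuring this task is what emits the message.
        messages.append("Error: Backend initialization required")

    eager = TaskContainer()
    eager.create("terraformPlan", noisy_config)   # message appears right away
    eager_messages = list(messages)

    messages.clear()
    lazy = TaskContainer()
    lazy.register("terraformPlan", noisy_config)  # nothing runs yet
    lazy.run("unrelatedTask")                     # still nothing: task not needed
    return {"eager": eager_messages, "lazy": list(messages)}
```

In this model the eager container emits the message even though only an
unrelated task was requested, while the lazy container stays silent until
terraformPlan itself is run.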


Re: Achievement unlocked: fully triaged

2022-12-05 Thread Kerry Donny-Clark via dev
This is a glorious achievement, Kenn! To keep things clean going forward,
are there any improvements we can make in our issue creation flow?

On Fri, Dec 2, 2022, 6:44 PM Kenneth Knowles  wrote:

> Hi all,
>
> I've finally done it! I've emptied the label "awaiting triage". Help me
> keep it that way! This ensures that we actually at least *look* at each
> issue once, preferably soon after it is filed. The idea is that you make
> sure the priority and other labels are right, since users are not expected
> to know how we use labels.
>
>
> https://github.com/apache/beam/issues?q=is%3Aissue+is%3Aopen+label%3A%22awaiting+triage%22
>
> Kenn
>


Re: Open Pull Request to improve documentation

2022-11-16 Thread Kerry Donny-Clark via dev
Hi Laksh,
I see that you have found a reviewer. In the future, or for anyone else, we
have GitHub Actions that should automatically assign a reviewer and guide
the review and approval flows. If you wait for a day and don't see a
reviewer automatically assigned, please email the dev list and we will try
to fix it.
Kerry

On Wed, Nov 16, 2022 at 1:24 PM Laksh  wrote:

> Hello,
>
> I would like to join the Apache Beam Slack channel as an observer. I
> recently opened a PR (https://github.com/apache/beam/pull/24199) and I am
> looking for a reviewer. I came across this
> https://infra.apache.org/slack.html, so thought it would be helpful to
> join the community.
>
> Please do the needful!
>
> Best,
> Laksh
>


Re: [ANNOUNCE] New committer: Yi Hu

2022-11-10 Thread Kerry Donny-Clark via dev
Great job Yi! I am happy to see your contributions recognized.

On Thu, Nov 10, 2022 at 11:52 AM Yi Hu via dev  wrote:

> Thank you all for your help over time. I am glad to contribute and to
> help the community.
>
> Best,
> Yi
>
> On Thu, Nov 10, 2022 at 11:29 AM Alexey Romanenko <
> aromanenko@gmail.com> wrote:
>
>> Congratulations! Well deserved!
>>
>> —
>> Alexey
>>
>> On 9 Nov 2022, at 21:01, Tomo Suzuki via dev  wrote:
>>
>> Congratulations!
>>
>> On Wed, Nov 9, 2022 at 3:00 PM John Casey via dev 
>> wrote:
>>
>>> Congrats! This is well deserved, Yi
>>>
>>> On Wed, Nov 9, 2022 at 2:58 PM Austin Bennett <
>>> whatwouldausti...@gmail.com> wrote:
>>>
 Congrats, and Thanks, Yi!

 On Wed, Nov 9, 2022 at 11:24 AM Valentyn Tymofieiev via dev <
 dev@beam.apache.org> wrote:

> I am with the Beam PMC on this, congratulations and very well
> deserved, Yi!
>
> On Wed, Nov 9, 2022 at 11:08 AM Byron Ellis via dev <
> dev@beam.apache.org> wrote:
>
>> Congratulations!
>>
>> On Wed, Nov 9, 2022 at 11:00 AM Pablo Estrada via dev <
>> dev@beam.apache.org> wrote:
>>
>>> +1 thanks Yi : D
>>>
>>> On Wed, Nov 9, 2022 at 10:47 AM Danny McCormick via dev <
>>> dev@beam.apache.org> wrote:
>>>
 Congrats Yi! I've really appreciated the ways you've consistently
 taken responsibility for improving our team's infra and working through
 sharp edges in the codebase that others have ignored. This is definitely
 well deserved!

 Thanks,
 Danny

 On Wed, Nov 9, 2022 at 1:37 PM Anand Inguva via dev <
 dev@beam.apache.org> wrote:

> Congratulations Yi!
>
> On Wed, Nov 9, 2022 at 1:35 PM Ritesh Ghorse via dev <
> dev@beam.apache.org> wrote:
>
>> Congratulations Yi!
>>
>> On Wed, Nov 9, 2022 at 1:34 PM Ahmed Abualsaud via dev <
>> dev@beam.apache.org> wrote:
>>
>>> Congrats Yi!
>>>
>>> On Wed, Nov 9, 2022 at 1:33 PM Sachin Agarwal via dev <
>>> dev@beam.apache.org> wrote:
>>>
 Congratulations Yi!

 On Wed, Nov 9, 2022 at 10:32 AM Kenneth Knowles <
 k...@apache.org> wrote:

> Hi all,
>
> Please join me and the rest of the Beam PMC in welcoming a new
> committer: Yi Hu (y...@apache.org)
>
> Yi started contributing to Beam in early 2022. Yi's contributions are
> very diverse! I/Os, performance tests, Jenkins, support for Schema
> logical types. Not only code but a very large amount of code review. Yi
> is also noted for picking up smaller issues that normally would be left
> on the backburner and filing issues that he finds rather than ignoring
> them.
>
> Considering their contributions to the project over this timeframe, the
> Beam PMC trusts Yi with the responsibilities of a Beam committer. [1]
>
> Thank you Yi! And we are looking to see more of your
> contributions!
>
> Kenn, on behalf of the Apache Beam PMC
>
> [1]
>
> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>

>>
>> --
>> Regards,
>> Tomo
>>
>>
>>


Re: [ANNOUNCE] New committer: Ritesh Ghorse

2022-11-04 Thread Kerry Donny-Clark via dev
Congratulations Ritesh, I'm happy to see your hard work and community
spirit recognized!

On Fri, Nov 4, 2022 at 10:16 AM Jack McCluskey via dev 
wrote:

> Congrats Ritesh!
>
> On Thu, Nov 3, 2022 at 10:12 PM Danny McCormick via dev <
> dev@beam.apache.org> wrote:
>
>> Congrats Ritesh! This is definitely well deserved!
>>
>> On Thu, Nov 3, 2022 at 8:08 PM Robert Burke  wrote:
>>
>>> Woohoo! Well done Ritesh! :D
>>>
>>> On Thu, Nov 3, 2022, 5:04 PM Anand Inguva via dev 
>>> wrote:
>>>
 Congratulations Ritesh.

 On Thu, Nov 3, 2022 at 7:51 PM Yi Hu via dev 
 wrote:

> Congratulations Ritesh!
>
> On Thu, Nov 3, 2022 at 7:23 PM Byron Ellis via dev <
> dev@beam.apache.org> wrote:
>
>> Congratulations!
>>
>> On Thu, Nov 3, 2022 at 4:21 PM Austin Bennett <
>> whatwouldausti...@gmail.com> wrote:
>>
>>> Congratulations, and Thanks @riteshgho...@apache.org!
>>>
>>> On Thu, Nov 3, 2022 at 4:17 PM Sachin Agarwal via dev <
>>> dev@beam.apache.org> wrote:
>>>
 Congrats Ritesh!

 On Thu, Nov 3, 2022 at 4:16 PM Kenneth Knowles 
 wrote:

> Hi all,
>
> Please join me and the rest of the Beam PMC in welcoming a new
> committer: Ritesh Ghorse (riteshgho...@apache.org)
>
> Ritesh started contributing to Beam in mid-2021 and has contributed
> immensely to bringing the Go SDK to fruition, in addition to
> contributions to Java and Python and release validation.
>
> Considering their contributions to the project over this timeframe, the
> Beam PMC trusts Ritesh with the responsibilities of a Beam committer. [1]
>
> Thank you Ritesh! And we are looking to see more of your
> contributions!
>
> Kenn, on behalf of the Apache Beam PMC
>
> [1]
>
> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>



Re: [DISCUSS] Dependency management in Apache Beam Python SDK

2022-08-26 Thread Kerry Donny-Clark via dev
Jarek, I really appreciate you sharing your experience and expertise here.
I think Beam would benefit from adopting some of these practices.
Kerry

On Fri, Aug 26, 2022, 7:35 AM Jarek Potiuk  wrote:

>
>> I'm curious Jarek, does Airflow take any dependencies on popular
>> libraries like pandas, numpy, pyarrow, scipy, etc... which users are likely
>> to have their own dependency on? I think these dependencies are challenging
>> in a different way than the client libraries - ideally we would support a
>> wide version range so as not to require users to upgrade those libraries in
>> lockstep with Beam. However in some cases our dependency is pretty tight
>> (e.g. the DataFrame API's dependency on pandas), so we need to make sure to
>> explicitly test with multiple different versions. Does Airflow have any
>> similar issues?
>>
>
> Yes we do (all of those I think :) ). Complete set of all our deps can be
> found here
> https://github.com/apache/airflow/blob/constraints-main/constraints-3.9.txt
> (continuously updated and we have different sets for different python
> versions).
>
> We took a rather interesting and unusual approach (more details in my
> talk) - mainly because Airflow is both an application to install (for
> users) and library to use (for DAG authors) and both have contradicting
> expectations (installation stability versus flexibility in
> upgrading/downgrading dependencies). Our approach is really smart in making
> sure water and fire play well with each other.
>
> Most of those dependencies are coming from optional extras (list of all
> extras here:
> https://airflow.apache.org/docs/apache-airflow/stable/extra-packages-ref.html).
> More often than not the "problematic" dependencies you mention are
> transitive dependencies through some client libraries we use (for example
> Apache Beam SDK is a big contributor to those :).
>
> Airflow "core" itself has far fewer dependencies
> https://github.com/apache/airflow/blob/constraints-main/constraints-no-providers-3.9.txt
> (175 currently) and we actively made sure that all "pandas" of this world
> are only optional extra deps.
>
> Now - the interesting thing is that we use "constraints" (the links with
> dependencies that I posted are those constraints) to pin versions of
> the dependencies that are "golden" - i.e. we test those continuously in our
> CI and we automatically upgrade the constraints when all the unit and
> integration tests pass.
> There is a little bit of complexity and sometimes conflicts to handle (as
> `pip` has to find the right set of deps that will work for all our optional
> extras), but eventually we have really one "golden" set of constraints at
> any moment in time for main (or a v2-x branch - we have a separate set for
> each branch) that we are dealing with. And this is the only "set" of
> dependency versions that Airflow gets tested with. Note - these are
> *constraints*, not *requirements* - that makes a whole world of difference.
>
> Then when we release airflow, we "freeze" the constraints with the version
> tag. We know they work because all our tests pass with them in CI.
>
> Then we communicate to our users (and we use it in our Docker image) that
> the only "supported" way of installing airflow is using `pip` with
> constraints
> https://airflow.apache.org/docs/apache-airflow/stable/installation/installing-from-pypi.html.
> And we do not support poetry, pipenv - we leave it up to users to handle
> them (until poetry/pipenv support constraints - which we are waiting
> for and there is an issue where I explained why it is useful). It looks
> like this: `pip install "apache-airflow==2.3.4" --constraint "
> https://raw.githubusercontent.com/apache/airflow/constraints-2.3.4/constraints-3.9.txt"`
> (with different constraints for each Airflow version and Python version
> you have).
>
> Constraints have this nice feature that they are only used during the "pip
> install" phase and thrown out immediately after the install is complete.
> They do not create "hard" requirements for airflow. Airflow still has
> "lower-bound" limits for a number of dependencies but we try to
> avoid putting upper-bounds at all (only in specific cases and documenting
> them) and our bounds are rather relaxed. This way we achieve three things:
>
> 1) when someone does not use constraints and has a problem with a broken
> dependency - we tell them to use constraints - this is what we as a
> community commit to and support
> 2) but by using the constraints mechanism we do not limit our users if they
> want to upgrade or downgrade any dependencies. They are free to do it (as
> long as it fits the rather relaxed lower/upper bounds of Airflow). But
> "with great powers come great responsibilities" - if they want to do that,
> THEY have to make sure that airflow will work. We make no guarantees there.
> 3) we are not limited by the 3rd-party libraries that come as extras - if
> you do not use those, the limits do not apply
>
> I think this 
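
The constrained install described above can be sketched as a small helper
that assembles the pip command. The URL pattern and the --constraint flag
are taken from the message; the function names constraints_url and
pip_install_command are hypothetical, and the sketch only builds the
command rather than running it.

```python
import sys
from typing import List


def constraints_url(airflow_version: str, python_version: str) -> str:
    # One constraints file per Airflow release tag and Python version,
    # following the URL pattern quoted in the message above.
    return (
        "https://raw.githubusercontent.com/apache/airflow/"
        f"constraints-{airflow_version}/constraints-{python_version}.txt"
    )


def pip_install_command(airflow_version: str, python_version: str) -> List[str]:
    # Constraints apply only while pip resolves this install; they do not
    # become requirements of the installed package.
    return [
        sys.executable, "-m", "pip", "install",
        f"apache-airflow=={airflow_version}",
        "--constraint", constraints_url(airflow_version, python_version),
    ]
```

For example, pip_install_command("2.3.4", "3.9") yields an argument list
that could be passed to subprocess.run to reproduce the command quoted in
the message.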

Re: [ANNOUNCE] New committer: John Casey

2022-07-29 Thread Kerry Donny-Clark via dev
John, you have made a huge impact on the many, many users of Kafka and
other IOs. This is great recognition of your commitment to Beam.
Kerry

On Fri, Jul 29, 2022 at 4:46 PM Byron Ellis via dev 
wrote:

> Congratulations John!
>
> On Fri, Jul 29, 2022 at 1:09 PM Danny McCormick via dev <
> dev@beam.apache.org> wrote:
>
>> Congrats John and welcome! This is well deserved!
>>
>> On Fri, Jul 29, 2022 at 4:07 PM Kenneth Knowles  wrote:
>>
>>> Hi all,
>>>
>>> Please join me and the rest of the Beam PMC in welcoming
>>> a new committer: John Casey (johnca...@apache.org)
>>>
>>> John started contributing to Beam in late 2021. John has quickly become
>>> our resident expert on KafkaIO - identifying bugs, making enhancements,
>>> helping users - in addition to a variety of other contributions.
>>>
>>> Considering his contributions to the project over this timeframe, the
>>> Beam PMC trusts John with the responsibilities of a Beam committer. [1]
>>>
>>> Thank you John! And we are looking to see more of your contributions!
>>>
>>> Kenn, on behalf of the Apache Beam PMC
>>>
>>> [1]
>>> https://beam.apache.org/contribute/become-a-committer/#an-apache-beam-committer
>>>
>>


Re: [RFC] State & Timers API Design for Go SDK

2022-07-28 Thread Kerry Donny-Clark via dev
I think this is a perfect example of a clear design doc. Great, deeply
detailed alternatives considered and why they were rejected. This makes
review easy, and lets us follow your thought process.
I think this is a good implementation, and I support the chosen approach.
Kerry

On Thu, Jul 28, 2022 at 1:41 PM Kenneth Knowles  wrote:

> Really thorough. Love it!
>
> On Thu, Jul 28, 2022 at 9:02 AM Ritesh Ghorse via dev 
> wrote:
>
>> Hey everyone,
>>
>> Danny and I have been working on
>> designing the state and timers for the Go SDK. We wrote a design doc with
>> user-facing API, execution details, and different alternatives considered.
>> It would be really helpful if we could get your
>> suggestions/feedback/comments on the design.
>>
>> Design Doc:
>> https://docs.google.com/document/d/1rcKa1Z6orDDFr1l8t6NA1eLl6zanQbYAEiAqk39NQUU/edit?usp=sharing
>>
>> Thanks!
>> Ritesh Ghorse
>>
>