Re: Beam Java starter project template

2022-02-15 Thread Reza Ardeshir Rokni
Hi,

This is great!

What do folks think about also having a less minimal set of starters? For
Java I am thinking about protobuf / autovalue. For Python maybe an
opinionated setup with tox etc... Again this would just contain 'hello
world' samples to get folks going.

Regards
Reza

On Wed, 9 Feb 2022 at 13:56, Robert Burke wrote:

> SGTM.
>
> On Wed, Feb 9, 2022 at 1:09 PM Kenneth Knowles wrote:
>
>> Based on discussion on https://issues.apache.org/jira/browse/LEGAL-601 I
>> think it will be simplest to license it under ASL2 and include a NOTICE
>> file. The user will be free to "clone and go".
>>
>> I would bring these points back to the dev list:
>>
>>  - ASL2 is what people expect from an ASF project, so it is "least
>> surprise"
>>  - Dual-licensing is possible (but I think not worthwhile due to its
>> impact on contributor license agreements)
>>  - ASL2 says "You must cause any modified files to carry prominent
>> notices stating that You changed the files" which won't apply to the user's
>> code and I would guess they simply won't bother with for files in the
>> template. Or maybe there is a clever way to phrase the header so it is
>> already good to go.
>>  - ASL2 says if the work includes a NOTICE file, you have to include the
>> attributions from it. The NOTICE file is required by ASF policy. We can
>> easily set it up to be a noop for the user.
>>
>> So my overall take is that we should go ahead with ASL2 and a simple
>> NOTICE file. Check the Jira for details.
>>
>> Kenn
>>
>> On Mon, Feb 7, 2022 at 10:47 AM Kenneth Knowles wrote:
>>
>>> And I've created the repos just now.
>>>
>>> Kenn
>>>
>>> On Mon, Feb 7, 2022 at 10:39 AM Kenneth Knowles wrote:
>>>
 Legal question asked at https://issues.apache.org/jira/browse/LEGAL-601

 Kenn

 On Fri, Feb 4, 2022 at 7:58 AM Danny McCormick <
 dannymccorm...@google.com> wrote:

> Sure - I'm happy to help out with the Actions setup (and/or with the
> Go template). I will say though, the Actions config should be pretty darn
> simple for these examples -
> https://github.com/davidcavazos/beam-java/blob/main/.github/workflows/test.yaml
> seems right, for each language configuration we're targeting we basically
> just want a job with:
>
>- checkout
>- setup-
>- inlined script to run tests
>
> Always happy to help with or consult on any actions issues 
>
> Thanks,
> Danny
>
> On Fri, Feb 4, 2022 at 10:21 AM Kerry Donny-Clark 
> wrote:
>
>> Danny has extensive experience with GitHub actions, and may be able
>> to help out.
>> Kerry
>>
>> On Thu, Feb 3, 2022, 11:47 PM Kenneth Knowles 
>> wrote:
>>
>>> I'm convinced on all points. My main motivation was to keep it
>>> simple. But of course we should keep it simple for users, not us :-)
>>>
>>> I can take on the task of asking about MIT license and requesting
>>> the repos be created. Not sure if it needs my level of privileges but I'm
>>> happy to do it anyhow.
>>>
>>> Kenn
>>>
>>> On Wed, Feb 2, 2022 at 10:30 AM Robert Bradshaw 
>>> wrote:
>>>
 On Wed, Feb 2, 2022 at 10:12 AM David Cavazos 
 wrote:
 >
 > MIT is much more permissive, but I also don't have any problems
 changing it to Apache license. In any case, how about we create the
 following repos?

 For these starter projects, we don't want to encumber any users of
 these templates with any particular licensing requirements (right?)
 and we don't even care about attribution. We want these to be pretty
 much as close to public domain as possible. That's not what the Apache
 licence does. (If it's even relevant, a good argument could likely be
 made for de minimis or fair use, but I think it's best to be explicit
 about this.) Perhaps this'd be a good question for apache legal?

 > apache/beam-starter-java
 > apache/beam-starter-python
 > apache/beam-starter-go
 > apache/beam-starter-kotlin
 > apache/beam-starter-scala
 >
 > We'll start by populating the Java one which is the most pressing
 one and the one that is ready, but the rest should be simpler.
 >
 > +David Huntsperger, tl;dr: these are minimal starter projects for
 every language. Once we have Java, Python and Go, it might be a good idea
 to change the quickstarts to use these instead of the word count. There is
 already a dedicated word count walkthrough so I think that is already
 covered.
 >
 > If we all agree on the repo names, who can help us create them?
 >
 > On Thu, Jan 27, 2022 at 12:58 PM Robert Bradshaw <
 rober...@google.com> wrote:
 >>
 >> On Tue, Jan 18, 2022 at 6:17 AM 

Re: [DISCUSS] Migrate Jira to GitHub Issues?

2022-02-15 Thread Alexey Romanenko
First of all, many thanks for putting the details into this design doc, and 
sorry for the delay in my response.

I’m still quite neutral about this migration because of several concerns:

- Imho, GitHub Issues is still not mature enough as an issue tracker and 
doesn’t cover all the needs that, for example, Jira and other trackers do 
(though it seems that many features are upcoming). For example, many things 
in GH Issues can still be resolved only with “labels”, and we could 
potentially end up with a huge number of them with different naming policies, 
mixed purposes and so on.

- If we don’t transfer the issues/users/filters/etc. from Jira to GH Issues, 
then it looks like we will live with two trackers for some (unknown) amount 
of time, which is not very convenient (I believe we would need to specify our 
workflows for that case).

- If we do a transfer, then what kind of tools are going to be used and how 
much time will it take? We’d need a detailed plan for this.

On the positive side, GH Issues has, by design, a solid integration with other 
GitHub services, which is obviously a huge advantage for the long term as well.

In any case, adding (or substituting) a new tool should help us make the 
development process, in general, easier and faster. So I hope we can achieve 
this with GitHub Issues.

—
Alexey

> On 15 Feb 2022, at 06:52, Aizhamal Nurmamat kyzy wrote:
> 
> Very humbly, I think the benefits of moving to GitHub Issues outweigh the 
> shortcomings.
> 
> Jan, Kenn, Alexey, JB: adding you directly as you had some concerns. Please, 
> let us know if they were addressed by the options that we described in the 
> doc [1]?
> 
> If no one objects, I can start working with some of you on Migration TODOs 
> outlined in the doc I am referencing. 
> 
> 
> [1] 
> https://docs.google.com/document/d/1_n7gboVbSKPs-CVcHzADgg8qpNL9igiHqUPCmiOslf0/edit#bookmark=id.izn35w5gsjft
>  
> 
> On Thu, Feb 10, 2022 at 1:12 PM Danny McCormick wrote:
> I'm definitely +1 on moving to help make the bar for entry lower for new 
> contributors (like myself!)
> 
> Thanks,
> Danny
> 
> On Thu, Feb 10, 2022 at 2:32 PM Aizhamal Nurmamat kyzy wrote:
> Hi all,
> 
> I think we've had a chance to discuss shortcomings and advantages. I think 
> each person may have a different bias / preference. My bias is to move to 
> Github, to have a more inclusive, approachable project despite the 
> differences in workflow. So I'm +1 on moving.
> 
> Could others share their bias? Don't think of this as a vote, but I'd like to 
> get a sense of people's preferences, to see if there's a strong/slight 
> feeling either way.
> 
> Again, the sticky points are summarized here [1], feel free to add to the doc.
> 
> [1] 
> https://docs.google.com/document/d/1_n7gboVbSKPs-CVcHzADgg8qpNL9igiHqUPCmiOslf0/edit#
>  
> 
> 
> 
> On Mon, Jan 31, 2022 at 7:23 PM Aizhamal Nurmamat kyzy wrote:
> Welcome to the Beam community, Danny!
> 
> We would love your help if/when we end up migrating. 
> 
> Please add your comments to the doc I shared[1], in case we missed some cool 
> GH features that could be helpful. Thanks!
> 
> [1] 
> https://docs.google.com/document/d/1_n7gboVbSKPs-CVcHzADgg8qpNL9igiHqUPCmiOslf0/edit#
>  
> 
> 
> On Mon, Jan 31, 2022, 10:06 AM Danny McCormick wrote:
> > Then (this is something you'd have to code) you could easily write or use 
> > an existing GithubAction or bot that will assign the labels based on the 
> > initial selection done by the user at entry. We have not done it yet but we 
> > might.
> 
> Hey, new contributor here - wanted to chime in with a shameless plug because 
> I happen to have written an action that does pretty much exactly what you're 
> describing[1] and could be extensible to the use case discussed here - it 
> should basically just require writing some config (example in action[2]). In 
> general, automated management of labels based on the initial issue 
> description + content isn't too hard; it does get significantly trickier (but 
> definitely still possible) if you try to automate labels based on responses 
> or edits.
> 
> Also, big +1 that the easy integration with Actions is a significant 
> advantage of using issues since it helps keep your automations in one place 
> (or at least fewer places) and gives you a lot of tools out of the box both 
> from the community and from the Actions org. Disclaimer: I am definitely 
> biased. Until 3 weeks ago I was working on the Actions team at GitHub.
> 
> I'd be happy to help with some of the issue 

RE: Re: Re: Re: [Question][Contribution] Python SDK ByteKeyRange

2022-02-15 Thread Sami Niemi
That tracker is not a restriction tracker, which is what I need for my Bigtable 
reader SDF. When I started working on this tracker I noticed that it was 
implemented in Java, and I figured it would be best to make a functionally 
similar implementation in Python. LexicographicKeyRangeTracker is not that 
different, except it can also handle strings as keys. I did not need the tracker 
to do this, so I left it out to keep it simpler and closer to the Java 
implementation.

I’m open to changes in the implementation, but I would like to keep it simple 
and not too far from the Java implementation.
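
For readers following along, here is a minimal sketch of the general idea: a half-open byte-key range plus a tracker that claims keys in increasing order. It is not the code discussed in this thread and it does not use any Beam API; the class and method names simply echo the ones mentioned above, and the choices that an empty end key means "no upper bound" and that out-of-range keys are rejected with False are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ByteKeyRange:
    """Half-open range [start, end) of byte-string keys (hypothetical sketch)."""
    start: bytes = b""
    end: bytes = b""  # assumption: an empty end key means "no upper bound"

    def contains(self, key: bytes) -> bool:
        # Python compares bytes objects lexicographically, byte by byte.
        if key < self.start:
            return False
        return self.end == b"" or key < self.end


class ByteKeyRestrictionTracker:
    """Claims keys in strictly increasing order (illustrative only)."""

    def __init__(self, key_range: ByteKeyRange) -> None:
        self._range = key_range
        self._last_claimed: Optional[bytes] = None

    def try_claim(self, key: bytes) -> bool:
        # Reject keys that do not move forward from the last claimed key.
        if self._last_claimed is not None and key <= self._last_claimed:
            raise ValueError("keys must be claimed in increasing order")
        # Simplification: any key outside the range is refused with False.
        if not self._range.contains(key):
            return False
        self._last_claimed = key
        return True
```

A reader loop would call `try_claim(row_key)` before emitting each row and stop at the first False. Checkpointing, splitting and progress reporting, which a real restriction tracker also needs, are left out of this sketch.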

On 2022/02/15 16:42:35 Robert Bradshaw wrote:
> On Tue, Feb 15, 2022 at 2:03 AM Sami Niemi <sa...@solita.fi> wrote:
> >
> > Hi Ismaël,
> >
> >
> >
> > What I’ve currently been working on locally is almost 100% based on that 
> > Java implementation.
>
> Did the existing LexicographicKeyRangeTracker not meet your needs?
>
> > I suppose I need to create Jira issue and make the contribution.
> >
> >
> >
> > On 2022/02/15 09:19:33 Ismaël Mejía wrote:
> > > Oh, forgot to add also the link to the tests that cover most of those
> > > unexpected cases:
> > > [2]
> > > https://github.com/apache/beam/blob/master/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/splittabledofn/ByteKeyRangeTrackerTest.java
> > >
> > > On Tue, Feb 15, 2022 at 10:17 AM Ismaël Mejía <ie...@gmail.com> wrote:
> > >
> > > > Great idea, please take a look at the Java ByteKeyRestrictionTracker
> > > > implementation for consistency [1]
> > > > I remember we had to deal with lots of corner cases so probably worth a
> > > > look.
> > > >
> > > > [1]
> > > > https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/splittabledofn/ByteKeyRangeTracker.java
> > > >
> > > > On Mon, Feb 14, 2022 at 6:39 PM Robert Bradshaw <ro...@google.com> wrote:
> > > >
> > > >> +1 to being forward looking and making restriction trackers.
> > > >> Hopefully the restriction tracker and existing range tracker could share
> > > >> 90% of their code.
> > > >>
> > > >> On Mon, Feb 14, 2022 at 9:36 AM Sami Niemi <sa...@solita.fi> wrote:
> > > >>
> > > >>> Hello Robert,
> > > >>>
> > > >>> Beam has documented only OffsetRangeTracker [1] for new SDF API. Since
> > > >>> Beam is moving away from Source API, I thought it would be nice to develop
> > > >>> IO connectors by using new SDFs. For this I need to create restriction
> > > >>> tracker that follows new SDF API.
> > > >>>
> > > >>> So I propose adding ByteKeyRange as new restriction class and
> > > >>> ByteKeyRestrictionTracker as new restriction tracker class. In my
> > > >>> implementation I’ve also used ByteKey class which are given to restriction.
> > > >>>
> > > >>> 1. https://github.com/apache/beam/blob/7eb7fd017a43353204eb8037603409dda7e0414a/sdks/python/apache_beam/io/restriction_trackers.py#L76
> > > >>>
> > > >>> On 2022/02/11 18:27:23 Robert Bradshaw wrote:
> > > >>>
> > > >>> > Hi Sam! Glad to hear you're willing to contribute.
> > > >>> >
> > > >>> > Though the name is a bit different, I'm wondering if this is already
> > > >>> > present as LexicographicKeyRangeTracker.
> > > >>> >
> > > >>> > https://github.com/apache/beam/blob/release-2.35.0/sdks/python/apache_beam/io/range_trackers.py#L349
> > > >>> >
> > > >>> > On Fri, Feb 11, 2022 at 9:54 AM Ahmet Altay <al...@google.com> wrote:
> > > >>> > >
> > > >>> > > Hi Sami. Thank you for your interest.
> > > >>> > >
> > > >>> > > Adding people who might be able to comment: @Chamikara Jayalath
> > > >>> > > @Lukasz Cwik
> > > >>> > >
> > > >>> > > On Thu, Feb 10, 2022 at 8:38 AM Sami Niemi <sa...@solita.fi> wrote:
> > > >>> > >>
> > > >>> > >> Hello,
> > > >>> > >>
> > > >>> > >> I noticed that Python SDK only has implementation for
> > > >>> > >> OffsetRangeTracker and OffsetRange while Java also has ByteKeyRange and
> > > >>> > >> -Tracker.
> > > >>> > >>
> > > >>> > >> I have currently created simple implementations of following Python
> > > >>> > >> classes:
> > > >>> > >>
> > > 

RE: Re: Contributor permission for Jira tickets

2022-02-15 Thread Sami Niemi
My username is samnisol.

On 2022/02/15 18:52:33 Ahmet Altay wrote:
> What is your jira username?
>
> On Tue, Feb 15, 2022 at 2:12 AM Sami Niemi <sa...@solita.fi> wrote:
>
> > Hello,
> >
> >
> >
> > This is Sami from Solita. I’m working on ByteKeyRange and
> > ByteKeyRestrictionTracker for Python SDK and I would need contributor
> > permissions so I could create/assign tickets in Jira.
> >
> >
> >
> > Thank you,
> >
> > Sami Niemi
> >
>


Beam Summit 2022 is here, and you can register now!

2022-02-15 Thread Mara Ruvalcaba
*The premier conference for the worldwide community of Apache Beam users 
and contributors is here!*


By participating in Beam Summit you will:

 * Learn how leading organizations use Apache Beam.
 * Find out what is upcoming for the project.
 * Improve your Beam skills through in-depth workshops.
 * Interact with key developers of Apache Beam as well as with other
   users and contributors from all around the world.

Beam Summit 2022 will be a hybrid event that you can attend in-person in 
Austin, TX or online.


 * On July 18-19 we will have talks which will also be live streamed
   for the online audience.
 * On July 20 we will have workshops, which will only be available for
   in-person participants.



*Online participation has no cost. If you are interested in attending Beam 
Summit 2022 in person and are not able to afford a ticket, please fill out 
this form to apply for a scholarship.*

Get your tickets now!: https://2022.beamsummit.org/

We would love to have you as a speaker

If you would like to share with the community and be a speaker at Beam 
Summit, the CFP is open. Please submit your proposal here: 
https://sessionize.com/beam-summit-2022




Invite your organization to support the event

We are looking for organizations who would like to support the community 
by sponsoring the event. If you think your organization might be 
interested, please share the Prospectus with them.

--
Mara Ruvalcaba
COO, SG Software Guru & Nearshore Link
USA: 512 296 2884
MX: 55 5239 5502


P1 issues report (73)

2022-02-15 Thread Beam Jira Bot
This is your daily summary of Beam's current P1 issues, not including flaky 
tests 
(https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20statusCategory%20!%3D%20Done%20AND%20priority%20%3D%20P1%20AND%20(labels%20is%20EMPTY%20OR%20labels%20!%3D%20flake)).
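
The search link above carries its filter as a percent-encoded JQL query, which is hard to read at a glance. As a small sketch (the constant and function names are made up for illustration; only the standard library is used), the query can be decoded like this:

```python
from urllib.parse import unquote

# Query string copied from the search link in this report.
ENCODED_JQL = (
    "project%20%3D%20BEAM%20AND%20statusCategory%20!%3D%20Done"
    "%20AND%20priority%20%3D%20P1"
    "%20AND%20(labels%20is%20EMPTY%20OR%20labels%20!%3D%20flake)"
)


def decode_jql(encoded: str) -> str:
    """Turn percent-escapes such as %20 and %3D back into plain characters."""
    return unquote(encoded)


print(decode_jql(ENCODED_JQL))
```

Decoded, the filter reads: project = BEAM AND statusCategory != Done AND priority = P1 AND (labels is EMPTY OR labels != flake).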

See https://beam.apache.org/contribute/jira-priorities/#p1-critical for the 
meaning and expectations around P1 issues.

https://issues.apache.org/jira/browse/BEAM-13952: Dataflow streaming tests 
failing new AfterSynchronizedProcessingTime test (created 2022-02-15)
https://issues.apache.org/jira/browse/BEAM-13950: PVR_Spark2_Streaming 
perma-red (created 2022-02-15)
https://issues.apache.org/jira/browse/BEAM-13931: BigQueryIO is sending 
rows that are too large to Deadletter Queue even on RETRY_ALWAYS (created 
2022-02-11)
https://issues.apache.org/jira/browse/BEAM-13920: Beam x-lang Dataflow 
tests failing due to _InactiveRpcError (created 2022-02-10)
https://issues.apache.org/jira/browse/BEAM-13850: 
beam_PostCommit_Python_Examples_Dataflow failing (created 2022-02-08)
https://issues.apache.org/jira/browse/BEAM-13830: XVR Direct/Spark/Flink 
tests are timing out (created 2022-02-04)
https://issues.apache.org/jira/browse/BEAM-13822: GBK and CoGBK streaming 
Java load tests failing (created 2022-02-03)
https://issues.apache.org/jira/browse/BEAM-13809: beam_PostCommit_XVR_Flink 
flaky: Connection refused (created 2022-02-03)
https://issues.apache.org/jira/browse/BEAM-13805: Simplify version override 
for Dev versions of the Go SDK. (created 2022-02-02)
https://issues.apache.org/jira/browse/BEAM-13798: Upgrade Kubernetes 
Clusters (created 2022-02-01)
https://issues.apache.org/jira/browse/BEAM-13769: 
beam_PreCommit_Python_Cron failing on test_create_uses_coder_for_pickling 
(created 2022-01-28)
https://issues.apache.org/jira/browse/BEAM-13763: Rotate credentials for 
'io-datastores' Kubernetes cluster (created 2022-01-28)
https://issues.apache.org/jira/browse/BEAM-13741: 
:sdks:java:extensions:sql:hcatalog:compileJava failing in 
beam_Release_NightlySnapshot  (created 2022-01-25)
https://issues.apache.org/jira/browse/BEAM-13715: Kafka commit offset drop 
data on failure for runners that have non-checkpointing shuffle (created 
2022-01-21)
https://issues.apache.org/jira/browse/BEAM-13694: 
beam_PostCommit_Java_Hadoop_Versions failing with ClassDefNotFoundError 
(created 2022-01-19)
https://issues.apache.org/jira/browse/BEAM-13693: 
beam_PostCommit_Java_ValidatesRunner_Dataflow_Streaming timing out at 9 hours 
(created 2022-01-19)
https://issues.apache.org/jira/browse/BEAM-13668: Java Spanner IO Request 
Count metrics broke backwards compatibility (created 2022-01-15)
https://issues.apache.org/jira/browse/BEAM-13582: Beam website precommit 
mentions broken links, but passes. (created 2021-12-30)
https://issues.apache.org/jira/browse/BEAM-13579: Cannot run 
python_xlang_kafka_taxi_dataflow validation script on 2.35.0 (created 
2021-12-29)
https://issues.apache.org/jira/browse/BEAM-13487: WriteToBigQuery Dynamic 
table destinations returns wrong tableId (created 2021-12-17)
https://issues.apache.org/jira/browse/BEAM-13393: GroupIntoBatchesTest is 
failing (created 2021-12-07)
https://issues.apache.org/jira/browse/BEAM-13376: Missing error for 
nonexistent column family BigTable (created 2021-12-03)
https://issues.apache.org/jira/browse/BEAM-13237: 
org.apache.beam.sdk.transforms.CombineTest$WindowingTests.testWindowedCombineGloballyAsSingletonView
 flaky on Dataflow Runner V2 (created 2021-11-12)
https://issues.apache.org/jira/browse/BEAM-13164: Race between member 
variable being accessed due to leaking uninitialized state via 
OutboundObserverFactory (created 2021-11-01)
https://issues.apache.org/jira/browse/BEAM-13132: WriteToBigQuery submits a 
duplicate BQ load job if a 503 error code is returned from googleapi (created 
2021-10-27)
https://issues.apache.org/jira/browse/BEAM-13087: 
apache_beam.runners.portability.fn_api_runner.translations_test.TranslationsTest.test_run_packable_combine_globally
 'apache_beam.coders.coder_impl._AbstractIterable' object is not reversible 
(created 2021-10-20)
https://issues.apache.org/jira/browse/BEAM-13078: Python DirectRunner does 
not emit data at GC time (created 2021-10-18)
https://issues.apache.org/jira/browse/BEAM-13076: Python AfterAny, AfterAll 
do not follow spec (created 2021-10-18)
https://issues.apache.org/jira/browse/BEAM-13010: Delete orphaned files 
(created 2021-10-06)
https://issues.apache.org/jira/browse/BEAM-12995: Consumer group with 
random prefix (created 2021-10-04)
https://issues.apache.org/jira/browse/BEAM-12959: Dataflow error in 
CombinePerKey operation (created 2021-09-26)
https://issues.apache.org/jira/browse/BEAM-12867: Either Create or 
DirectRunner fails to produce all elements to the following transform (created 
2021-09-09)

Flaky test issue report (50)

2022-02-15 Thread Beam Jira Bot
This is your daily summary of Beam's current flaky tests 
(https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20statusCategory%20!%3D%20Done%20AND%20labels%20%3D%20flake)

These are P1 issues because they have a major negative impact on the community 
and make it hard to determine the quality of the software.

https://issues.apache.org/jira/browse/BEAM-13952: Dataflow streaming tests 
failing new AfterSynchronizedProcessingTime test (created 2022-02-15)
https://issues.apache.org/jira/browse/BEAM-13859: Test flake: 
test_split_half_sdf (created 2022-02-09)
https://issues.apache.org/jira/browse/BEAM-13850: 
beam_PostCommit_Python_Examples_Dataflow failing (created 2022-02-08)
https://issues.apache.org/jira/browse/BEAM-13822: GBK and CoGBK streaming 
Java load tests failing (created 2022-02-03)
https://issues.apache.org/jira/browse/BEAM-13810: Flaky tests: Gradle build 
daemon disappeared unexpectedly (created 2022-02-03)
https://issues.apache.org/jira/browse/BEAM-13797: Flakes: Failed to load 
cache entry (created 2022-02-01)
https://issues.apache.org/jira/browse/BEAM-13783: 
apache_beam.transforms.combinefn_lifecycle_test.LocalCombineFnLifecycleTest.test_combine
 is flaky (created 2022-02-01)
https://issues.apache.org/jira/browse/BEAM-13741: 
:sdks:java:extensions:sql:hcatalog:compileJava failing in 
beam_Release_NightlySnapshot  (created 2022-01-25)
https://issues.apache.org/jira/browse/BEAM-13708: flake: 
FlinkRunnerTest.testEnsureStdoutStdErrIsRestored (created 2022-01-20)
https://issues.apache.org/jira/browse/BEAM-13693: 
beam_PostCommit_Java_ValidatesRunner_Dataflow_Streaming timing out at 9 hours 
(created 2022-01-19)
https://issues.apache.org/jira/browse/BEAM-13575: Flink 
testParDoRequiresStableInput flaky (created 2021-12-28)
https://issues.apache.org/jira/browse/BEAM-13519: Java precommit flaky 
(timing out) (created 2021-12-22)
https://issues.apache.org/jira/browse/BEAM-13500: NPE in Flink Portable 
ValidatesRunner streaming suite (created 2021-12-21)
https://issues.apache.org/jira/browse/BEAM-13453: Flake in 
org.apache.beam.sdk.io.mqtt.MqttIOTest.testReadObject: Address already in use 
(created 2021-12-13)
https://issues.apache.org/jira/browse/BEAM-13393: GroupIntoBatchesTest is 
failing (created 2021-12-07)
https://issues.apache.org/jira/browse/BEAM-13367: 
[beam_PostCommit_Python36] [ 
apache_beam.io.gcp.experimental.spannerio_read_it_test] Failure summary 
(created 2021-12-01)
https://issues.apache.org/jira/browse/BEAM-13312: 
org.apache.beam.sdk.transforms.ParDoLifecycleTest.testTeardownCalledAfterExceptionInStartBundle
 is flaky in Java Spark ValidatesRunner suite  (created 2021-11-23)
https://issues.apache.org/jira/browse/BEAM-13311: 
org.apache.beam.sdk.transforms.ParDoLifecycleTest.testTeardownCalledAfterExceptionInProcessElementStateful
 is flaky in Java ValidatesRunner Flink suite. (created 2021-11-23)
https://issues.apache.org/jira/browse/BEAM-13234: Flake in 
StreamingWordCountIT.test_streaming_wordcount_it (created 2021-11-12)
https://issues.apache.org/jira/browse/BEAM-13025: pubsublite.ReadWriteIT 
flaky in beam_PostCommit_Java_DataflowV2   (created 2021-10-08)
https://issues.apache.org/jira/browse/BEAM-12928: beam_PostCommit_Python36 
- CrossLanguageSpannerIOTest - flakey failing (created 2021-09-21)
https://issues.apache.org/jira/browse/BEAM-12859: 
org.apache.beam.runners.dataflow.worker.fn.logging.BeamFnLoggingServiceTest.testMultipleClientsFailingIsHandledGracefullyByServer
 is flaky (created 2021-09-08)
https://issues.apache.org/jira/browse/BEAM-12858: 
org.apache.beam.sdk.io.gcp.datastore.RampupThrottlingFnTest.testRampupThrottler 
is flaky (created 2021-09-08)
https://issues.apache.org/jira/browse/BEAM-12809: 
testTwoTimersSettingEachOtherWithCreateAsInputBounded flaky (created 2021-08-26)
https://issues.apache.org/jira/browse/BEAM-12794: 
PortableRunnerTestWithExternalEnv.test_pardo_timers flaky (created 2021-08-24)
https://issues.apache.org/jira/browse/BEAM-12793: 
beam_PostRelease_NightlySnapshot failed (created 2021-08-24)
https://issues.apache.org/jira/browse/BEAM-12766: Already Exists: Dataset 
apache-beam-testing:python_bq_file_loads_NNN (created 2021-08-16)
https://issues.apache.org/jira/browse/BEAM-12673: 
apache_beam.examples.streaming_wordcount_it_test.StreamingWordCountIT.test_streaming_wordcount_it
 flakey (created 2021-07-28)
https://issues.apache.org/jira/browse/BEAM-12515: Python PreCommit flaking 
in PipelineOptionsTest.test_display_data (created 2021-06-18)
https://issues.apache.org/jira/browse/BEAM-12322: Python precommit flaky: 
Failed to read inputs in the data plane (created 2021-05-10)
https://issues.apache.org/jira/browse/BEAM-12320: 
PubsubTableProviderIT.testSQLSelectsArrayAttributes[0] failing in SQL 
PostCommit (created 2021-05-10)
https://issues.apache.org/jira/browse/BEAM-12291: 

Contributor permission for Jira tickets

2022-02-15 Thread Sami Niemi
Hello,

This is Sami from Solita. I’m working on ByteKeyRange and 
ByteKeyRestrictionTracker for Python SDK and I would need contributor 
permissions so I could create/assign tickets in Jira.

Thank you,
Sami Niemi


RE: Re: Re: [Question][Contribution] Python SDK ByteKeyRange

2022-02-15 Thread Sami Niemi
Hi Ismaël,

What I’ve currently been working on locally is almost 100% based on that Java 
implementation. I suppose I need to create a Jira issue and make the contribution.

On 2022/02/15 09:19:33 Ismaël Mejía wrote:
> Oh, forgot to add also the link to the tests that cover most of those
> unexpected cases:
> [2]
> https://github.com/apache/beam/blob/master/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/splittabledofn/ByteKeyRangeTrackerTest.java
>
>
> On Tue, Feb 15, 2022 at 10:17 AM Ismaël Mejía <ie...@gmail.com> wrote:
>
> > Great idea, please take a look at the Java ByteKeyRestrictionTracker
> > implementation for consistency [1]
> > I remember we had to deal with lots of corner cases so probably worth a
> > look.
> >
> > [1]
> > https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/splittabledofn/ByteKeyRangeTracker.java
> >
> > On Mon, Feb 14, 2022 at 6:39 PM Robert Bradshaw <ro...@google.com> wrote:
> >
> >> +1 to being forward looking and making restriction trackers.
> >> Hopefully the restriction tracker and existing range tracker could share
> >> 90% of their code.
> >>
> >> On Mon, Feb 14, 2022 at 9:36 AM Sami Niemi <sa...@solita.fi> wrote:
> >>
> >>> Hello Robert,
> >>>
> >>> Beam has documented only OffsetRangeTracker [1] for new SDF API. Since
> >>> Beam is moving away from Source API, I thought it would be nice to develop
> >>> IO connectors by using new SDFs. For this I need to create restriction
> >>> tracker that follows new SDF API.
> >>>
> >>> So I propose adding ByteKeyRange as new restriction class and
> >>> ByteKeyRestrictionTracker as new restriction tracker class. In my
> >>> implementation I’ve also used ByteKey class which are given to restriction.
> >>>
> >>> 1. https://github.com/apache/beam/blob/7eb7fd017a43353204eb8037603409dda7e0414a/sdks/python/apache_beam/io/restriction_trackers.py#L76
> >>>
> >>> On 2022/02/11 18:27:23 Robert Bradshaw wrote:
> >>>
> >>> > Hi Sam! Glad to hear you're willing to contribute.
> >>> >
> >>> > Though the name is a bit different, I'm wondering if this is already
> >>> > present as LexicographicKeyRangeTracker.
> >>> >
> >>> > https://github.com/apache/beam/blob/release-2.35.0/sdks/python/apache_beam/io/range_trackers.py#L349
> >>> >
> >>> > On Fri, Feb 11, 2022 at 9:54 AM Ahmet Altay <al...@google.com> wrote:
> >>> > >
> >>> > > Hi Sami. Thank you for your interest.
> >>> > >
> >>> > > Adding people who might be able to comment: @Chamikara Jayalath
> >>> > > @Lukasz Cwik
> >>> > >
> >>> > > On Thu, Feb 10, 2022 at 8:38 AM Sami Niemi <sa...@solita.fi> wrote:
> >>> > >>
> >>> > >> Hello,
> >>> > >>
> >>> > >> I noticed that Python SDK only has implementation for
> >>> > >> OffsetRangeTracker and OffsetRange while Java also has ByteKeyRange and
> >>> > >> -Tracker.
> >>> > >>
> >>> > >> I have currently created simple implementations of following Python
> >>> > >> classes:
> >>> > >>
> >>> > >> ByteKey
> >>> > >> ByteKeyRange
> >>> > >> ByteKeyRestrictionTracker
> >>> > >>
> >>> > >> I would like to make contribution and make these available in
> >>> > >> Python SDK in addition to OffsetRange and -Tracker. I would like to hear
> >>> > >> any thoughts about this and should I make a contribution.
> >>> > >>
> >>> > >> Thank you,
> >>> > >>
> >>> > >> Sami Niemi
> >>> >
> >>>
> >>> *SAMI NIEMI*
> >>> Data Engineer
> >>> +358 50 412 2115
> >>> sami.ni...@solita.fi
> >>>
> >>>
> >>>
> >>> *SOLITA*
> >>> Eteläesplanadi 8
> >>> 00130 Helsinki
> >>> solita.fi 
> >>>
> >>>
> >>>
> >>
>


Re: Re: [Question][Contribution] Python SDK ByteKeyRange

2022-02-15 Thread Ismaël Mejía
Oh, forgot to add also the link to the tests that cover most of those
unexpected cases:
[2]
https://github.com/apache/beam/blob/master/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/splittabledofn/ByteKeyRangeTrackerTest.java


On Tue, Feb 15, 2022 at 10:17 AM Ismaël Mejía wrote:

> Great idea, please take a look at the Java ByteKeyRestrictionTracker
> implementation for consistency [1]
> I remember we had to deal with lots of corner cases so probably worth a
> look.
>
> [1]
> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/splittabledofn/ByteKeyRangeTracker.java
>
>
> On Mon, Feb 14, 2022 at 6:39 PM Robert Bradshaw 
> wrote:
>
>> +1 to being forward looking and making restriction trackers.
>> Hopefully the restriction tracker and existing range tracker could share
>> 90% of their code.
>>
>> On Mon, Feb 14, 2022 at 9:36 AM Sami Niemi wrote:
>>
>>> Hello Robert,
>>>
>>> Beam has documented only OffsetRangeTracker [1] for new SDF API. Since
>>> Beam is moving away from Source API, I thought it would be nice to develop
>>> IO connectors by using new SDFs. For this I need to create restriction
>>> tracker that follows new SDF API.
>>>
>>> So I propose adding ByteKeyRange as new restriction class and
>>> ByteKeyRestrictionTracker as new restriction tracker class. In my
>>> implementation I’ve also used ByteKey class which are given to restriction.
>>>
>>> 1. https://github.com/apache/beam/blob/7eb7fd017a43353204eb8037603409dda7e0414a/sdks/python/apache_beam/io/restriction_trackers.py#L76
>>>
>>> On 2022/02/11 18:27:23 Robert Bradshaw wrote:
>>>
>>> > Hi Sam! Glad to hear you're willing to contribute.
>>> >
>>> > Though the name is a bit different, I'm wondering if this is already
>>> > present as LexicographicKeyRangeTracker.
>>> >
>>> > https://github.com/apache/beam/blob/release-2.35.0/sdks/python/apache_beam/io/range_trackers.py#L349
>>> >
>>> > On Fri, Feb 11, 2022 at 9:54 AM Ahmet Altay wrote:
>>> > >
>>> > > Hi Sami. Thank you for your interest.
>>> > >
>>> > > Adding people who might be able to comment: @Chamikara Jayalath
>>> > > @Lukasz Cwik
>>> > >
>>> > > On Thu, Feb 10, 2022 at 8:38 AM Sami Niemi wrote:
>>> > >>
>>> > >> Hello,
>>> > >>
>>> > >> I noticed that Python SDK only has implementation for
>>> > >> OffsetRangeTracker and OffsetRange while Java also has ByteKeyRange and
>>> > >> -Tracker.
>>> > >>
>>> > >> I have currently created simple implementations of following Python
>>> > >> classes:
>>> > >>
>>> > >> ByteKey
>>> > >> ByteKeyRange
>>> > >> ByteKeyRestrictionTracker
>>> > >>
>>> > >> I would like to make contribution and make these available in
>>> > >> Python SDK in addition to OffsetRange and -Tracker. I would like to hear
>>> > >> any thoughts about this and should I make a contribution.
>>> > >>
>>> > >> Thank you,
>>> > >>
>>> > >> Sami Niemi
>>> >
>>>
>>> *SAMI NIEMI*
>>> Data Engineer
>>> +358 50 412 2115 <+358%2050%204122115>
>>> sami.ni...@solita.fi
>>>
>>>
>>>
>>> *SOLITA*
>>> Eteläesplanadi 8
>>> 00130 Helsinki
>>> solita.fi 
>>>
>>>
>>>
>>


Re: Re: [Question][Contribution] Python SDK ByteKeyRange

2022-02-15 Thread Ismaël Mejía
Great idea, please take a look at the Java ByteKeyRangeTracker
implementation for consistency [1].
I remember we had to deal with lots of corner cases, so it is probably
worth a look.

[1]
https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/splittabledofn/ByteKeyRangeTracker.java
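
To give a feel for the kind of corner cases meant here, below is a minimal
sketch in plain Python. It is only an illustration under stated assumptions:
it is not the Java implementation and not an existing Beam Python API, it
does not subclass Beam's RestrictionTracker, and the class and method names
(ByteKeyRange, ByteKeyRestrictionTracker, next_key, checkpoint) are
hypothetical. It also assumes that an empty end key means "no upper bound".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ByteKeyRange:
    """Half-open range [start, end) of byte-string keys.

    Assumption for this sketch: an empty end key means "no upper bound".
    """
    start: bytes = b""
    end: bytes = b""

    def contains(self, key: bytes) -> bool:
        # Python compares bytes lexicographically by unsigned byte value.
        return key >= self.start and (self.end == b"" or key < self.end)


def next_key(key: bytes) -> bytes:
    """Smallest key strictly greater than `key` in lexicographic order."""
    return key + b"\x00"


class ByteKeyRestrictionTracker:
    """Hypothetical tracker; it mimics the claim/checkpoint idea only and
    does not subclass or call any Beam class."""

    def __init__(self, key_range: ByteKeyRange) -> None:
        self._range = key_range
        self._last_claimed: Optional[bytes] = None

    def current_restriction(self) -> ByteKeyRange:
        return self._range

    def try_claim(self, key: bytes) -> bool:
        # Corner case: claims must be strictly increasing.
        if self._last_claimed is not None and key <= self._last_claimed:
            raise ValueError("keys must be claimed in strictly increasing order")
        # Corner case: a key before the range start is a caller bug.
        if key < self._range.start:
            raise ValueError("key is before the start of the range")
        # A key at or past the end means the restriction is finished.
        if not self._range.contains(key):
            return False
        self._last_claimed = key
        return True

    def checkpoint(self) -> Optional[ByteKeyRange]:
        """Shrink this tracker to the claimed part and return the remainder.

        Returns None when nothing has been claimed; this sketch leaves that
        corner case (which needs an explicit "empty range") unhandled.
        """
        if self._last_claimed is None:
            return None
        split_key = next_key(self._last_claimed)
        residual = ByteKeyRange(split_key, self._range.end)
        self._range = ByteKeyRange(self._range.start, split_key)
        return residual
```

In this sketch the split point is the last claimed key with a zero byte
appended, since that is the smallest key that sorts strictly after it, so the
primary and residual ranges neither overlap nor leave a gap.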


On Mon, Feb 14, 2022 at 6:39 PM Robert Bradshaw  wrote:

> +1 to being forward looking and making restriction trackers. Hopefully the
> restriction tracker and existing range tracker could share 90% of their
> code.
>
> On Mon, Feb 14, 2022 at 9:36 AM Sami Niemi  wrote:
>
>> Hello Robert,
>>
>> Beam has documented only OffsetRangeTracker [1] for the new SDF API. Since
>> Beam is moving away from the Source API, I thought it would be nice to
>> develop IO connectors using the new SDFs. For this I need to create a
>> restriction tracker that follows the new SDF API.
>>
>> So I propose adding ByteKeyRange as a new restriction class and
>> ByteKeyRestrictionTracker as a new restriction tracker class. In my
>> implementation I’ve also used a ByteKey class, whose instances are given
>> to the restriction.
>>
>> [1]
>> https://github.com/apache/beam/blob/7eb7fd017a43353204eb8037603409dda7e0414a/sdks/python/apache_beam/io/restriction_trackers.py#L76
>>
>> On 2022/02/11 18:27:23 Robert Bradshaw wrote:
>>
>> > Hi Sam! Glad to hear you're willing to contribute.
>> >
>> > Though the name is a bit different, I'm wondering if this is already
>> > present as LexicographicKeyRangeTracker.
>> >
>> > https://github.com/apache/beam/blob/release-2.35.0/sdks/python/apache_beam/io/range_trackers.py#L349
>> >
>> > On Fri, Feb 11, 2022 at 9:54 AM Ahmet Altay wrote:
>> > >
>> > > Hi Sami. Thank you for your interest.
>> > >
>> > > Adding people who might be able to comment: @Chamikara Jayalath
>> > > @Lukasz Cwik
>> > >
>> > > On Thu, Feb 10, 2022 at 8:38 AM Sami Niemi wrote:
>> > >>
>> > >> Hello,
>> > >>
>> > >> I noticed that the Python SDK only has implementations of
>> > >> OffsetRangeTracker and OffsetRange, while Java also has ByteKeyRange
>> > >> and ByteKeyRangeTracker.
>> > >>
>> > >> I have currently created simple implementations of the following
>> > >> Python classes:
>> > >>
>> > >> ByteKey
>> > >> ByteKeyRange
>> > >> ByteKeyRestrictionTracker
>> > >>
>> > >> I would like to contribute these and make them available in the
>> > >> Python SDK in addition to OffsetRange and OffsetRangeTracker. I would
>> > >> like to hear any thoughts about this and whether I should make a
>> > >> contribution.
>> > >>
>> > >> Thank you,
>> > >> Sami Niemi
>> >
>>
>