Re: DRAFT - Beam board report June 2020

2020-06-09 Thread Jean-Baptiste Onofre
Hi,

It looks good with the latest proposed changes.

Regards
JB

> On June 9, 2020, at 20:36, Kenneth Knowles wrote:
> 
> Ping! It is now June, and time to submit this report. Please add interesting 
> tidbits from the last quarter. Perhaps find highlights in 
> https://github.com/apache/beam/blob/master/CHANGES.md 
> 
> 
> Kenn
> 
> On Wed, Mar 25, 2020 at 10:40 AM Kenneth Knowles wrote:
> Hi all,
> 
> I just finally got a chance to finish and submit the March board report 
> (late, sorry).
> 
> I want to have the board report draft available earlier so we can make notes 
> whenever things happen. Just like CHANGES.md is for the code, this is for the 
> project/community.
> 
> https://s.apache.org/beam-report-2020-06 
> 
> 
> You can read past reports at 
> https://whimsy.apache.org/board/minutes/Beam.html to get a feel for it. 
> Here are some specific examples of things that are good to add:
> 
>  - interesting technical discussions that steer the project
>  - major integrations with other projects
>  - community events
>  - major user facing addition/deprecation (like the Flink and Python version 
> and LTS discussions)
> 
> It is OK to add very rough data and not be too careful with language. I will 
> play editor and make it all fit together.
> 
> Kenn



Re: Remove EOL'd Runners

2020-06-09 Thread Tyson Hamilton
OK first PR up for Gearpump removal: https://github.com/apache/beam/pull/11960

If someone could run precommits, that would be appreciated. One thing I'm unsure 
of is whether there are any additional Jenkins configs outside the repo that need 
updating. I don't think there are, but if anyone knows otherwise, please speak up.


On 2020/06/09 17:43:08, Ahmet Altay  wrote: 
> Thank you Tyson!
> 
> On Tue, Jun 9, 2020 at 10:20 AM Thomas Weise  wrote:
> 
> > +1
> >
> >
> > On Tue, Jun 9, 2020 at 9:41 AM Robert Bradshaw 
> > wrote:
> >
> >> Makes sense to me.
> >>
> >> On Tue, Jun 9, 2020 at 8:45 AM Maximilian Michels  wrote:
> >>
> >>> Thanks for the heads-up, Tyson! It's a sensible decision to remove
> >>> unsupported runners.
> >>>
> >>> -Max
> >>>
> >>> On 09.06.20 16:51, Tyson Hamilton wrote:
> >>> > Hi All,
> >>> >
> >>> > As part of the Fixit [1] I'd like to remove EOL'd runners, Apex and
> >>> Gearpump, as described in BEAM- [2]. This will be a big PR I think and
> >>> didn't want anyone to be surprised. There is already some agreement in the
> >>> linked Jira issue. If there are no objections I'll get started later today
> >>> or tomorrow.
> >>> >
> >>> > -Tyson
> >>> >
> >>> >
> >>> > [1]:
> >>> https://lists.apache.org/thread.html/r9ddc77a8fee58ad02f68e2d9a7f054aab3e55717cc88ad1d5bc49311%40%3Cdev.beam.apache.org%3E
> >>> > [2]: https://issues.apache.org/jira/browse/BEAM-
> >>> >
> >>>
> >>
> 


Re: Ensuring messages are processed and emitted in-order

2020-06-09 Thread Catlyn Kong
Thanks a lot for the response!

We have several business use cases that rely strongly on ordering by Kafka 
offset:
1) streaming unwindowed inner join: say we want to join users with reviews on 
user_id. Here are the schemas for the two streams:
  user:
    user_id
    name
    timestamp
  reviews:
    review_id
    user_id
    timestamp
Here are the messages in each stream, ordered by Kafka offset:
  user:
    (1, name_a, 60), (2, name_b, 120), (1, name_c, 240)
  reviews:
    (ABC, 1, 90), (DEF, 2, 360)
I would expect to receive the following output messages:
  (1, name_a, ABC) at timestamp 90
  (1, name_c, ABC) at timestamp 240
  (2, name_b, DEF) at timestamp 360
This can be done in native Flink since the Flink Kafka consumer reads from each 
partition sequentially. But without an ordering guarantee, we can end up with 
arbitrary results. So how would we implement this in Beam?
2) unwindowed aggregation: aggregate all the employees for every organization. 
Say we have a new employee stream with the following schema:
  new_employee:
    organization_id
    employee_name
And here are the messages ordered by Kafka offset:
  (1, name_a), (2, name_b), (2, name_c), (1, name_d)
I would expect the output to be:
  (1, [name_a]), (2, [name_b]), (2, [name_b, name_c]), (1, [name_a, name_d])
Again, without an ordering guarantee, the result is non-deterministic. 
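
To make the expected semantics of the two examples concrete, here is a minimal plain-Python sketch. It is not Beam or Flink code, and the function names are hypothetical. It assumes each stream is delivered in Kafka offset order and, for the join, that the two streams are interleaved by their timestamp field, which is one reading of the expected output above.

```python
from collections import defaultdict


def unwindowed_join(users, reviews):
    """Join two offset-ordered streams on user_id, emitting on every update.

    users:   list of (user_id, name, timestamp) in offset order
    reviews: list of (review_id, user_id, timestamp) in offset order
    """
    # Assumption: the two streams are interleaved by timestamp. The sort is
    # stable, so events with equal timestamps keep their arrival order.
    events = [(ts, "user", (uid, name)) for uid, name, ts in users]
    events += [(ts, "review", (rid, uid)) for rid, uid, ts in reviews]
    events.sort(key=lambda event: event[0])

    latest_name = {}                  # user_id -> most recent name
    seen_reviews = defaultdict(list)  # user_id -> review_ids seen so far
    output = []
    for ts, kind, payload in events:
        if kind == "user":
            uid, name = payload
            latest_name[uid] = name
            # A user update re-emits the join for every review already seen.
            output.extend((uid, name, rid, ts) for rid in seen_reviews[uid])
        else:
            rid, uid = payload
            seen_reviews[uid].append(rid)
            if uid in latest_name:
                output.append((uid, latest_name[uid], rid, ts))
    return output


def running_aggregation(new_employees):
    """Emit the growing employee list of an organization after each message."""
    members = defaultdict(list)  # organization_id -> names seen so far
    output = []
    for org_id, name in new_employees:
        members[org_id].append(name)
        output.append((org_id, list(members[org_id])))  # copy the snapshot
    return output


if __name__ == "__main__":
    users = [(1, "name_a", 60), (2, "name_b", 120), (1, "name_c", 240)]
    reviews = [("ABC", 1, 90), ("DEF", 2, 360)]
    print(unwindowed_join(users, reviews))
    print(running_aggregation(
        [(1, "name_a"), (2, "name_b"), (2, "name_c"), (1, "name_d")]))
```

Both functions depend on the iteration order of their input: swapping (1, name_a, 60) and (1, name_c, 240) in the user stream, for example, changes which name is joined with review ABC. That order dependence is what the ordering guarantee has to protect.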

Change data capture (CDC) streams are a very common use case for our data 
pipeline. As in the examples above we rely on Kafka offsets to make sure we 
process data mutations in the proper order. While in some cases we have Flink 
native solutions to these problems (Flink provides ordering guarantees within 
the chosen key), we are now building some new Beam applications that would 
require ordering guarantees. What is the recommended approach in Beam for such 
use cases? If this isn’t currently supported, are there any near-term plans to add 
native ordering support in Beam?


On 2020/06/09 20:37:22, Luke Cwik  wrote: 
> This will likely break due to:
> * workers can have more than one thread and hence process the source in
> parallel
> * splitting a source allows for the source to be broken up into multiple
> restrictions and hence the runner can process those restrictions in any
> order they want. (let's say your kafka partition has unconsumed commit
> offset range [20, 100), this could be split into [20, 60), [60, 100) and
> the [60, 100) offset range could be processed first)
> 
> You're right that you need to sort the output however you want within your
> DoFn before you make external calls to Kafka (this prevents you from using
> the KafkaIO sink implementation as a transform). There is an annotation
> @RequiresTimeSortedInput which is a special case for this sorting if you
> want it to be sorted by the element's timestamp but still you'll need to
> write to Kafka directly yourself from your DoFn.
> 
> On Mon, Jun 8, 2020 at 4:24 PM Hadi Zhang wrote:
> 
> > We are using the Beam 2.20 Python SDK on a Flink 1.9 runner. Our
> > messages originate from a custom source that consumes messages from a
> > Kafka topic and emits them in the order of their Kafka offsets to a
> > DoFn. After this DoFn processes the messages, they are emitted to a
> > custom sink that sends messages to a Kafka topic.
> >
> > We want to process those messages in the order in which we receive
> > them from Kafka and then emit them to the Kafka sink in the same
> > order, but based on our understanding Beam does not provide an
> > in-order transport. However, in practice we noticed that with a Python
> > SDK worker on Flink and a parallelism setting of 1 and one sdk_worker
> > instance, messages seem to be both processed and emitted in order. Is
> > that implementation-specific in-order behavior something that we can
> > rely on, or is it very likely that this will break at some future
> > point?
> >
> > In case it's not recommended to depend on that behavior what is the
> > best approach for in-order processing?
> >
> > https://stackoverflow.com/questions/45888719/processing-total-ordering-of-events-by-key-using-apache-beam
> > recommends to order events in a heap, but according to our
> > understanding this approach will only work when directly writing to an
> > external system.
> >
> 

Re: DRAFT - Beam board report June 2020

2020-06-09 Thread Ahmet Altay
Thank you Kenn. I added a few suggestions and tagged people to verify those.

On Tue, Jun 9, 2020 at 11:36 AM Kenneth Knowles  wrote:

> Ping! It is now June, and time to submit this report. Please add
> interesting tidbits from the last quarter. Perhaps find highlights in
> https://github.com/apache/beam/blob/master/CHANGES.md
>
> Kenn
>
> On Wed, Mar 25, 2020 at 10:40 AM Kenneth Knowles  wrote:
>
>> Hi all,
>>
>> I just finally got a chance to finish and submit the March board report
>> (late, sorry).
>>
>> I want to have the board report draft available earlier so we can make
>> notes whenever things happen. Just like CHANGES.md is for the code, this is
>> for the project/community.
>>
>> https://s.apache.org/beam-report-2020-06
>>
>> You can read past reports at
>> https://whimsy.apache.org/board/minutes/Beam.html to get a feel for it.
>> Here are some specific examples of things that are good to add:
>>
>>  - interesting technical discussions that steer the project
>>  - major integrations with other projects
>>  - community events
>>  - major user facing addition/deprecation (like the Flink and Python
>> version and LTS discussions)
>>
>> It is OK to add very rough data and not be too careful with language. I
>> will play editor and make it all fit together.
>>
>> Kenn
>>
>


Re: Fwd: [DISCUSSION] Use github actions for python wheels ?

2020-06-09 Thread Ahmet Altay
Hi Tobiasz,

Thank you for your work on this. I think this is making great progress and
I agree that it will simplify the release process. I asked a few clarifying
questions on the PR. (I do not want to repeat it here to avoid diverging
the conversation.)

Thank you!
Ahmet

On Tue, Jun 9, 2020 at 2:39 AM Tobiasz Kędzierski <
tobiasz.kedzier...@polidea.com> wrote:

> Hi,
>
> I've added some important updates to
> https://github.com/apache/beam/pull/11877 and I wanted to share some
> thoughts with you about possible improvements:
>
> When releasing a new version of Beam, the script
> *build_release_candidate.sh* is executed. It builds sources and puts them
> into the GCS staging bucket, where they are consumed by CI jobs in a separate
> repository (beam-wheels). Then they are downloaded and processed by the
> *sign_hash_python_wheels.sh* script.
>
> By using github actions this process could be simplified as follows:
> 1. Within *build_release_candidate.sh* *release* and *release candidate*
> branches are pushed to the remote repository (this is done by it now).
> 2. gh-actions will build sources and wheels based on these branches.
> 3. *build_release_candidate.sh* could verify status of the build by using
> github api and corresponding data (name of the branch, commit hash) and
> after successful build download sources and wheel files from gh-action
> artifacts and use them in further actions.
>
> Happy to know your opinion on this
>
> BR
> Tobiasz Kędzierski
>
> On Mon, Jun 1, 2020 at 11:02 PM Ahmet Altay  wrote:
>
>> > @Ahmet Altay  happy to understand the extent of what
>> you had in mind, maybe the extensions are not as important to plan out, as
>> they're straightforwardly bolted on (ex: daily builds).  More tactically
>> would be valuable to ensure I understand what all needs to occur.  Any
>> other source of info to consume other than
>> https://github.com/apache/beam-wheels and
>> https://beam.apache.org/contribute/release-guide/.
>>
>> I added a bit more details to
>> https://issues.apache.org/jira/browse/BEAM-9388 as a comment, so that it
>> is preserved in the JIRA. Thank you all for working on this.
>>
>> On Mon, Jun 1, 2020 at 9:20 AM Kamil Wasilewski <
>> kamil.wasilew...@polidea.com> wrote:
>>
>>> "unistd.h" C header is present on POSIX systems (MacOS and Linux), but
>>> not on Windows, therefore you can't build a wheel for Windows.
>>>
>>> I took a look and "statesampler_fast.pyx" uses "unistd.h" only because
>>> of the `usleep` function. Unless we use C++ which offers [1], the solution
>>> would be to search for the equivalent of `usleep` that works on Windows.
>>>
>>
>> +Robert Bradshaw  +Valentyn Tymofieiev
>>  - Do you have any suggestions on how building
>> wheels could work on Windows?
>>
>>
>>>
>>> [1] https://en.cppreference.com/w/cpp/thread/sleep_for
>>>
>>
>


Re: Python2.7 Beam End-of-Life Date

2020-06-09 Thread Ahmet Altay
Thank you for re-opening this, Valentyn. I am in favor of EOLing py2 support
sooner rather than later. The reality is that we will not be able to
effectively support Beam on Python 2 for long now that the ecosystem has
already EOLed Python 2. That said, a significant chunk (but no longer a
majority) of our users are still using Python 2. Upgrades are painful, and
might be especially painful nowadays. It would be good to hear counter
viewpoints and user voices related to this.

On Thu, Jun 4, 2020 at 4:53 PM Valentyn Tymofieiev 
wrote:

> Back at the end of February we decided to revisit this conversation in 3
> months. Do folks on this thread have any new input or perspective regarding
> us balancing "user pain/contributor pain/our ability to continuously test
> with python 2 in a shifting environment"?
>
> Some new information on my end is that we have been seeing steady adoption
> of Python 3 among Beam Python users in Dataflow, particularly strong
> adoption among streaming users, and Dataflow is sunsetting Python 2 support
> for all released Beam SDKs later this year [1]. We will have to remove
> Python 2 Beam test suites that use Dataflow  when Dataflow runner disables
> Py2 support if this happens before Beam Py2 EOL (when we have to remove all
> Py2 suites), including performance tests that still use Dataflow on Python
> 3.
>
> I am curious how much motivation there is in the community at this moment
> to continue Py2 support in Beam,  whether any previous Py3 migration
> blockers were resolved or any new blockers discovered among Beam users.
>
> [1] https://cloud.google.com/python/docs/python2-sunset/#dataflow
>
> On Fri, May 8, 2020 at 3:52 PM Valentyn Tymofieiev 
> wrote:
>
>> That's good news! Thanks for sharing.
>>
>> Another datapoint, here are a few of Beam's dependencies that no longer
>> release new py2 artifacts (I looked at REQUIRED_PACKAGES +  aws, gcp, and
>> interactive extras):
>>
>> hdfs
>> numpy
>> pyarrow
>> ipython
>>
>> There are more if we include transitive dependencies and test-only
>> packages. I also remember encountering one issue last month that was broken
>> only on Py2, which we had to go back and fix.
>>
>> If others have noticed frictions related to ongoing Py2 support or have
>> updates on previously mentioned Py3 migration blockers, feel free to post
>> them.
>>
>> On Fri, May 8, 2020 at 9:19 AM Robert Bradshaw 
>> wrote:
>>
>>> It hasn't been 3 months yet, but I wanted to call out a milestone that
>>> Python 3 downloads crossed the 50% threshold on pypi, if just briefly.
>>>
>>> On Thu, Feb 13, 2020 at 12:40 AM Ismaël Mejía  wrote:
>>> >
>>> > > I would suggest re-evaluating this within the next 3 months again.
>>> We need to balance between user pain/contributor pain/our ability to
>>> continuously test with python 2 in a shifting environment.
>>> >
>>> > Good idea to re-evaluate in 3 months; at that point distributions
>>> will probably also be phasing out python2 by default, which will
>>> definitely help in this direction.
>>> > Thanks for updating the roadmap Ahmet
>>> >
>>> >
>>> > On Thu, Feb 13, 2020 at 2:49 AM Ahmet Altay  wrote:
>>> >>
>>> >>
>>> >>
>>> >> On Wed, Feb 12, 2020 at 1:29 AM Ismaël Mejía 
>>> wrote:
>>> >>>
>>> >>> I am with Chad on this, we should probably extend it a bit more,
>>> even if it
>>> >>> makes us struggle a bit at least we have some workarounds as Robert
>>> suggests,
>>> >>> and as Chad said there are still many people playing the python 3
>>> catchup game,
>>> >>> so worth to support those users.
>>> >>>
>>> >>>
>>> >>> But maybe it is worth to evaluate the current state later in the
>>> year.
>>> >>
>>> >>
>>> >> I would suggest re-evaluating this within the next 3 months again. We
>>> need to balance between user pain/contributor pain/our ability to
>>> continuously test with python 2 in a shifting environment.
>>> >>
>>> >>>
>>> >>> In the
>>> >>> meantime can someone please update our Roadmap in the website with
>>> this info and
>>> >>> where we are with Python 3 support (it looks not up to date).
>>> >>> https://beam.apache.org/roadmap/
>>> >>
>>> >>
>>> >> I made a minor change to update that page (
>>> https://github.com/apache/beam/pull/10848). A more comprehensive update
>>> to that page and linked (
>>> https://beam.apache.org/roadmap/python-sdk/#python-3-support) would
>>> still be welcome.
>>> >>
>>> >>>
>>> >>>
>>> >>> - Ismaël
>>> >>>
>>> >>>
>>> >>> On Tue, Feb 4, 2020 at 10:49 PM Robert Bradshaw 
>>> wrote:
>>> 
>>>   On Tue, Feb 4, 2020 at 12:12 PM Chad Dombrova 
>>> wrote:
>>>  >>
>>>  >>  Not to mention that all the nice work for the type hints will
>>> have to be redone for 3.x.
>>>  >
>>>  > Note that there's a tool for automatically converting type
>>> comments to annotations: https://github.com/ilevkivskyi/com2ann
>>>  >
>>>  > So don't let that part bother you.
>>> 
>>>  +1, I wouldn't worry about what can be easily automated.
>>> 
>>>  > I'm curious what other features

Re: Ensuring messages are processed and emitted in-order

2020-06-09 Thread Luke Cwik
This will likely break due to:
* workers can have more than one thread and hence process the source in
parallel
* splitting a source allows for the source to be broken up into multiple
restrictions and hence the runner can process those restrictions in any
order they want. (let's say your kafka partition has unconsumed commit
offset range [20, 100), this could be split into [20, 60), [60, 100) and
the [60, 100) offset range could be processed first)

You're right that you need to sort the output however you want within your
DoFn before you make external calls to Kafka (this prevents you from using
the KafkaIO sink implementation as a transform). There is an annotation
@RequiresTimeSortedInput which is a special case for this sorting if you
want it to be sorted by the element's timestamp but still you'll need to
write to Kafka directly yourself from your DoFn.
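
As a rough illustration of the "sort within your DoFn" idea, here is a minimal plain-Python sketch of a heap-based reorder buffer, the approach the linked Stack Overflow answer describes. It is not Beam API code: the class name is hypothetical, and it assumes offsets are unique and contiguous from a known first offset, which does not hold for every Kafka topic (compacted or transactional topics can have gaps). In a real pipeline the buffer would have to live in per-key state.

```python
import heapq


class OffsetReorderBuffer:
    """Releases (offset, value) pairs in strictly increasing offset order."""

    def __init__(self, first_offset):
        self._next = first_offset  # next offset that may be emitted
        self._heap = []            # min-heap of buffered (offset, value) pairs

    def add(self, offset, value):
        """Buffer one element and return the elements that are now ready."""
        heapq.heappush(self._heap, (offset, value))
        ready = []
        # Drain the heap while its smallest offset is the one we are waiting for.
        while self._heap and self._heap[0][0] == self._next:
            ready.append(heapq.heappop(self._heap))
            self._next += 1
        return ready


if __name__ == "__main__":
    # Simulate the split from the example: [60, 100) is processed before [20, 60).
    buffer = OffsetReorderBuffer(first_offset=20)
    emitted = []
    for offset in list(range(60, 100)) + list(range(20, 60)):
        emitted.extend(buffer.add(offset, "msg-%d" % offset))
    print([offset for offset, _ in emitted] == list(range(20, 100)))
```

Nothing is emitted while only the [60, 100) range has arrived. Once offset 59 arrives, the buffered tail drains in one go, so the downstream write sees offsets 20 through 99 in order at the cost of holding up to one split's worth of elements in memory.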

On Mon, Jun 8, 2020 at 4:24 PM Hadi Zhang  wrote:

> We are using the Beam 2.20 Python SDK on a Flink 1.9 runner. Our
> messages originate from a custom source that consumes messages from a
> Kafka topic and emits them in the order of their Kafka offsets to a
> DoFn. After this DoFn processes the messages, they are emitted to a
> custom sink that sends messages to a Kafka topic.
>
> We want to process those messages in the order in which we receive
> them from Kafka and then emit them to the Kafka sink in the same
> order, but based on our understanding Beam does not provide an
> in-order transport. However, in practice we noticed that with a Python
> SDK worker on Flink and a parallelism setting of 1 and one sdk_worker
> instance, messages seem to be both processed and emitted in order. Is
> that implementation-specific in-order behavior something that we can
> rely on, or is it very likely that this will break at some future
> point?
>
> In case it's not recommended to depend on that behavior what is the
> best approach for in-order processing?
>
> https://stackoverflow.com/questions/45888719/processing-total-ordering-of-events-by-key-using-apache-beam
> recommends to order events in a heap, but according to our
> understanding this approach will only work when directly writing to an
> external system.
>


Remove EOL'd Runners

2020-06-09 Thread Tyson Hamilton
Hi All,

As part of the Fixit [1] I'd like to remove EOL'd runners, Apex and Gearpump, 
as described in BEAM- [2]. This will be a big PR I think and didn't want 
anyone to be surprised. There is already some agreement in the linked Jira 
issue. If there are no objections I'll get started later today or tomorrow.

-Tyson


[1]: 
https://lists.apache.org/thread.html/r9ddc77a8fee58ad02f68e2d9a7f054aab3e55717cc88ad1d5bc49311%40%3Cdev.beam.apache.org%3E
[2]: https://issues.apache.org/jira/browse/BEAM-


Re: DRAFT - Beam board report June 2020

2020-06-09 Thread Kenneth Knowles
Ping! It is now June, and time to submit this report. Please add
interesting tidbits from the last quarter. Perhaps find highlights in
https://github.com/apache/beam/blob/master/CHANGES.md

Kenn

On Wed, Mar 25, 2020 at 10:40 AM Kenneth Knowles  wrote:

> Hi all,
>
> I just finally got a chance to finish and submit the March board report
> (late, sorry).
>
> I want to have the board report draft available earlier so we can make
> notes whenever things happen. Just like CHANGES.md is for the code, this is
> for the project/community.
>
> https://s.apache.org/beam-report-2020-06
>
> You can read past reports at
> https://whimsy.apache.org/board/minutes/Beam.html to get a feel for it.
> Here are some specific examples of things that are good to add:
>
>  - interesting technical discussions that steer the project
>  - major integrations with other projects
>  - community events
>  - major user facing addition/deprecation (like the Flink and Python
> version and LTS discussions)
>
> It is OK to add very rough data and not be too careful with language. I
> will play editor and make it all fit together.
>
> Kenn
>


Re: Remove EOL'd Runners

2020-06-09 Thread Ahmet Altay
Thank you Tyson!

On Tue, Jun 9, 2020 at 10:20 AM Thomas Weise  wrote:

> +1
>
>
> On Tue, Jun 9, 2020 at 9:41 AM Robert Bradshaw 
> wrote:
>
>> Makes sense to me.
>>
>> On Tue, Jun 9, 2020 at 8:45 AM Maximilian Michels  wrote:
>>
>>> Thanks for the heads-up, Tyson! It's a sensible decision to remove
>>> unsupported runners.
>>>
>>> -Max
>>>
>>> On 09.06.20 16:51, Tyson Hamilton wrote:
>>> > Hi All,
>>> >
>>> > As part of the Fixit [1] I'd like to remove EOL'd runners, Apex and
>>> Gearpump, as described in BEAM- [2]. This will be a big PR I think and
>>> didn't want anyone to be surprised. There is already some agreement in the
>>> linked Jira issue. If there are no objections I'll get started later today
>>> or tomorrow.
>>> >
>>> > -Tyson
>>> >
>>> >
>>> > [1]:
>>> https://lists.apache.org/thread.html/r9ddc77a8fee58ad02f68e2d9a7f054aab3e55717cc88ad1d5bc49311%40%3Cdev.beam.apache.org%3E
>>> > [2]: https://issues.apache.org/jira/browse/BEAM-
>>> >
>>>
>>


Re: Remove EOL'd Runners

2020-06-09 Thread Thomas Weise
+1


On Tue, Jun 9, 2020 at 9:41 AM Robert Bradshaw  wrote:

> Makes sense to me.
>
> On Tue, Jun 9, 2020 at 8:45 AM Maximilian Michels  wrote:
>
>> Thanks for the heads-up, Tyson! It's a sensible decision to remove
>> unsupported runners.
>>
>> -Max
>>
>> On 09.06.20 16:51, Tyson Hamilton wrote:
>> > Hi All,
>> >
>> > As part of the Fixit [1] I'd like to remove EOL'd runners, Apex and
>> Gearpump, as described in BEAM- [2]. This will be a big PR I think and
>> didn't want anyone to be surprised. There is already some agreement in the
>> linked Jira issue. If there are no objections I'll get started later today
>> or tomorrow.
>> >
>> > -Tyson
>> >
>> >
>> > [1]:
>> https://lists.apache.org/thread.html/r9ddc77a8fee58ad02f68e2d9a7f054aab3e55717cc88ad1d5bc49311%40%3Cdev.beam.apache.org%3E
>> > [2]: https://issues.apache.org/jira/browse/BEAM-
>> >
>>
>


Re: Remove EOL'd Runners

2020-06-09 Thread Robert Bradshaw
Makes sense to me.

On Tue, Jun 9, 2020 at 8:45 AM Maximilian Michels  wrote:

> Thanks for the heads-up, Tyson! It's a sensible decision to remove
> unsupported runners.
>
> -Max
>
> On 09.06.20 16:51, Tyson Hamilton wrote:
> > Hi All,
> >
> > As part of the Fixit [1] I'd like to remove EOL'd runners, Apex and
> Gearpump, as described in BEAM- [2]. This will be a big PR I think and
> didn't want anyone to be surprised. There is already some agreement in the
> linked Jira issue. If there are no objections I'll get started later today
> or tomorrow.
> >
> > -Tyson
> >
> >
> > [1]:
> https://lists.apache.org/thread.html/r9ddc77a8fee58ad02f68e2d9a7f054aab3e55717cc88ad1d5bc49311%40%3Cdev.beam.apache.org%3E
> > [2]: https://issues.apache.org/jira/browse/BEAM-
> >
>


[GitHub] [beam-site] TheNeuralBit merged pull request #604: Update beam-site for release 2.22.0

2020-06-09 Thread GitBox


TheNeuralBit merged pull request #604:
URL: https://github.com/apache/beam-site/pull/604


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




Re: Remove EOL'd Runners

2020-06-09 Thread Maximilian Michels
Thanks for the heads-up, Tyson! It's a sensible decision to remove
unsupported runners.

-Max

On 09.06.20 16:51, Tyson Hamilton wrote:
> Hi All,
> 
> As part of the Fixit [1] I'd like to remove EOL'd runners, Apex and Gearpump, 
> as described in BEAM- [2]. This will be a big PR I think and didn't want 
> anyone to be surprised. There is already some agreement in the linked Jira 
> issue. If there are no objections I'll get started later today or tomorrow.
> 
> -Tyson
> 
> 
> [1]: 
> https://lists.apache.org/thread.html/r9ddc77a8fee58ad02f68e2d9a7f054aab3e55717cc88ad1d5bc49311%40%3Cdev.beam.apache.org%3E
> [2]: https://issues.apache.org/jira/browse/BEAM-
> 


Beam Dependency Check Report (2020-06-09)

2020-06-09 Thread Apache Jenkins Server
ERROR: File 'src/build/dependencyUpdates/beam-dependency-check-report.html' does not exist

Re: Fwd: [DISCUSSION] Use github actions for python wheels ?

2020-06-09 Thread Tobiasz Kędzierski
Hi,

I've added some important updates to
https://github.com/apache/beam/pull/11877 and I wanted to share some
thoughts with you about possible improvements:

When releasing a new version of Beam, the script
*build_release_candidate.sh* is executed. It builds sources and puts them
into the GCS staging bucket, where they are consumed by CI jobs in a separate
repository (beam-wheels). Then they are downloaded and processed by the
*sign_hash_python_wheels.sh* script.

By using github actions this process could be simplified as follows:
1. Within *build_release_candidate.sh* *release* and *release candidate*
branches are pushed to the remote repository (this is done by it now).
2. gh-actions will build sources and wheels based on these branches.
3. *build_release_candidate.sh* could verify status of the build by using
github api and corresponding data (name of the branch, commit hash) and
after successful build download sources and wheel files from gh-action
artifacts and use them in further actions.
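
Step 3 could look roughly like the sketch below. It is written in Python only to illustrate the logic, not as a proposed change to the shell script, and it assumes the GitHub REST "list workflow runs for a repository" endpoint with its `branch` filter. The helper names and the branch and commit values in the usage example are hypothetical.

```python
import json
import urllib.parse
import urllib.request

RUNS_URL = "https://api.github.com/repos/{repo}/actions/runs?branch={branch}"


def find_run(payload, commit_sha):
    """Return the workflow run built from commit_sha, or None if there is none."""
    for run in payload.get("workflow_runs", []):
        if run.get("head_sha") == commit_sha:
            return run
    return None


def build_succeeded(run):
    """True only for a run that has finished and concluded successfully."""
    return (
        run is not None
        and run.get("status") == "completed"
        and run.get("conclusion") == "success"
    )


def fetch_runs(repo, branch):
    """Fetch workflow runs for a branch (network call, unauthenticated here)."""
    url = RUNS_URL.format(repo=repo, branch=urllib.parse.quote(branch, safe=""))
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Hypothetical payload standing in for fetch_runs("apache/beam", "release-x.y.z").
    payload = {"workflow_runs": [
        {"head_sha": "abc123", "status": "completed", "conclusion": "success"},
        {"head_sha": "def456", "status": "in_progress", "conclusion": None},
    ]}
    print(build_succeeded(find_run(payload, "abc123")))
```

Matching on both branch name and commit hash, as proposed above, keeps the script from picking up artifacts of an older run on the same branch. Unauthenticated API requests are rate limited, so a real script would likely pass a token and poll with a delay until the run completes.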

Happy to know your opinion on this

BR
Tobiasz Kędzierski

On Mon, Jun 1, 2020 at 11:02 PM Ahmet Altay  wrote:

> > @Ahmet Altay  happy to understand the extent of what
> you had in mind, maybe the extensions are not as important to plan out, as
> they're straightforwardly bolted on (ex: daily builds).  More tactically
> would be valuable to ensure I understand what all needs to occur.  Any
> other source of info to consume other than
> https://github.com/apache/beam-wheels and
> https://beam.apache.org/contribute/release-guide/.
>
> I added a bit more details to
> https://issues.apache.org/jira/browse/BEAM-9388 as a comment, so that it
> is preserved in the JIRA. Thank you all for working on this.
>
> On Mon, Jun 1, 2020 at 9:20 AM Kamil Wasilewski <
> kamil.wasilew...@polidea.com> wrote:
>
>> "unistd.h" C header is present on POSIX systems (MacOS and Linux), but
>> not on Windows, therefore you can't build a wheel for Windows.
>>
>> I took a look and "statesampler_fast.pyx" uses "unistd.h" only because of
>> the `usleep` function. Unless we use C++ which offers [1], the solution
>> would be to search for the equivalent of `usleep` that works on Windows.
>>
>
> +Robert Bradshaw  +Valentyn Tymofieiev
>  - Do you have any suggestions on how building
> wheels could work on Windows?
>
>
>>
>> [1] https://en.cppreference.com/w/cpp/thread/sleep_for
>>
>