Re: Proposal: Dynamic timer support (BEAM-6857)

2019-10-28 Thread Reuven Lax
Just to circle back around, after the discussion on this thread I propose modifying the proposed API as follows: class MyDoFn extends DoFn { @TimerFamily("timers") TimerSpec timers = TimerSpecs.timerFamily(TimeDomain(EVENT_TIME)); @ProcessElement public void process(@Element String e,

Re: Proposal: Dynamic timer support (BEAM-6857)

2019-10-28 Thread Reuven Lax
On Wed, Oct 23, 2019 at 1:21 AM Jan Lukavský wrote: > Hi Reuven, > > yes, if this change is intended to be used by end users, then > DoFnSignatures cannot be used, agree on that. Regarding the relationship > with dynamic state - I agree that this is separate problem, but because it > is close

Re: Rethinking the Flink Runner modes

2019-10-28 Thread Thomas Weise
The current semantics of flink_master are tied to the Flink Java API. The Flink client / Java API isn't a "REST API". It now uses the REST API somewhere deep in RemoteEnvironment when the flink_master value is host:port, but it does a lot of other things as well, such as parsing config files and

Re: (mini-doc) Beam (Flink) portable job templates

2019-10-28 Thread Chad Dombrova
Thanks for the follow up, Thomas. On Mon, Oct 28, 2019 at 7:55 PM Thomas Weise wrote: > Follow-up for users looking to run portable pipelines on Flink: > > After prototyping the generate-jar-file approach for internal deployment > and some related discussion, the conclusion was that it is too

Re: (mini-doc) Beam (Flink) portable job templates

2019-10-28 Thread Thomas Weise
Follow-up for users looking to run portable pipelines on Flink: After prototyping the generate-jar-file approach for internal deployment and some related discussion, the conclusion was that it is too limiting. The sticking point is that the jar file would need to be generated at container build

Re: RFC: python static typing PR

2019-10-28 Thread Robert Burke
As someone who cribs from the Python SDK to make changes in the Go SDK, this will make things much easier to follow! Thank you. On Mon, Oct 28, 2019, 6:52 PM Chad Dombrova wrote: > > Wow, that is an incredible amount of work! >> > > Some people meditate. I annotate ;) > > I'm definitely of the

Re: [DISCUSS] How to stopp SdkWorker in SdkHarness

2019-10-28 Thread jincheng sun
Sure, thank you for your confirmation, Luke! :) Best, Jincheng On Tue, Oct 29, 2019 at 1:20 AM Luke Cwik wrote: > I would go with creating JIRAs and PRs directly since this doesn't seem to > be contentious since you have received feedback from a few folks and they > are all suggesting the same thing. > > On

Re: RFC: python static typing PR

2019-10-28 Thread Chad Dombrova
> Wow, that is an incredible amount of work! > Some people meditate. I annotate ;) I'm definitely of the opinion that there's no viable counterargument to the > value of types, especially for large or complex codebases. > Agreed. That's part of why I waited until I got the whole thing passing

Re: RFC: python static typing PR

2019-10-28 Thread Ahmet Altay
Thank you Chad, and everyone else who helped with reviews so far. I think this is a positive change. I do share the worry of one more lint. I hope that we can hear from the majority of Beam Python contributors. Good tooling and educational information would be helpful here. Ideally this will serve

Re: [Discuss] Ideas for Apache Beam presence in social media

2019-10-28 Thread Matthias Baetens
Awesome Aizhamal :) Lmk if I can be of any help! On Mon, Oct 28, 2019, 11:14 Aizhamal Nurmamat kyzy wrote: > Thank you Matthias, > > I was supposed to write up the documentation.. sorry this slipped > through the cracks. I will prepare the PR by the end of the week. > > On Tue, Oct 22,

Re: Quota issues again

2019-10-28 Thread Kenneth Knowles
It may also be advantageous to separate most submodules to not run a giant generic Java precommit. Each IO really only needs its own, and to register itself in the global Java precommit run only for the core. The bookkeeping may become quite a lot, but this is the natural structure. Kenn On Mon,

Re: RFC: python static typing PR

2019-10-28 Thread Kenneth Knowles
Wow, that is an incredible amount of work! I'm definitely of the opinion that there's no viable counterargument to the value of types, especially for large or complex codebases. This kind of check must be in precommit or it will become perma-red very quickly. Kenn On Mon, Oct 28, 2019 at 4:21

Re: Quota issues again

2019-10-28 Thread Chad Dombrova
Can we get more aggressive about separating tests into groups by those that are dependent on other languages and those that are not? I think we could dramatically reduce our backlog if we didn’t run all of the Java tests every time a commit is made that only affects python code, and vice versa.
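
A rough sketch of the idea above, in Python: map the paths touched by a commit to the language suites that need to run, and fall back to running everything when a path is shared or unknown. The prefix table and the suite names are hypothetical and only illustrate the grouping; this is not how the Beam precommit jobs are actually configured.

```python
# Hypothetical mapping from path prefixes to the test suites they affect.
SUITE_PREFIXES = {
    "sdks/python/": {"python"},
    "sdks/java/": {"java"},
    "sdks/go/": {"go"},
}

# Assumed full set of suites; used when a change cannot be attributed to one language.
ALL_SUITES = {"python", "java", "go"}


def suites_for_change(changed_paths):
    """Return the set of suites to run for a list of changed file paths."""
    needed = set()
    for path in changed_paths:
        for prefix, suites in SUITE_PREFIXES.items():
            if path.startswith(prefix):
                needed |= suites
                break
        else:
            # Path is outside every known prefix (shared code, build files):
            # be conservative and run all suites.
            return set(ALL_SUITES)
    return needed
```

With this table, a commit that only touches files under sdks/python/ would select the Python suite alone, while any change to shared files would still trigger every suite.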

Re: Python Precommit duration pushing 2 hours

2019-10-28 Thread Pablo Estrada
*not deciles, but 9-percentiles : ) On Mon, Oct 28, 2019 at 5:31 PM Pablo Estrada wrote: > I've run the tests in Python 2 (without cython), and used a utility to > track runtime for each test method. I found some of the following things: > - Total test methods run: 2665 > - Total test runtime:

Re: Python Precommit duration pushing 2 hours

2019-10-28 Thread Pablo Estrada
I've run the tests in Python 2 (without cython), and used a utility to track runtime for each test method. I found some of the following things: - Total test methods run: 2665 - Total test runtime: 990 seconds - Deciles of time spent: - 1949 tests run in the first 9% of time - 173 in the 9-18%
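
For anyone who wants to reproduce this kind of breakdown, here is a minimal sketch in Python. It assumes per-test durations are already available as a list of seconds, sorts tests fastest first, and counts how many tests start within each 9% band of cumulative runtime. The function name and the choice to assign a test to the band in which it starts are assumptions for illustration, not necessarily what the utility mentioned above does.

```python
from collections import Counter


def count_tests_by_time_band(durations, band_width=0.09):
    """Count tests per band of cumulative runtime, fastest tests first.

    Band 0 covers the first 9% of total runtime, band 1 covers 9-18%, and so on.
    """
    total = sum(durations)
    if total <= 0:
        return {}
    counts = Counter()
    elapsed = 0.0
    for duration in sorted(durations):
        # Assumption: a test belongs to the band in which it starts.
        band = int((elapsed / total) / band_width)
        counts[band] += 1
        elapsed += duration
    return dict(counts)
```

A heavily skewed result, with most tests in band 0 and a handful spread over the remaining bands, points at a small number of slow tests dominating the suite.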

Re: Pipeline AttributeError on Python3

2019-10-28 Thread Valentyn Tymofieiev
+user@, bcc: dev@ https://issues.apache.org/jira/browse/BEAM-6158 may be contributing to this issue, although we saw instances of this bug in exactly opposite scenarios - when pipeline was defined *in one file*, but not in multiple files. Could you try replacing instances of super() in

Re: RFC: python static typing PR

2019-10-28 Thread Valentyn Tymofieiev
Thanks a lot, Chad. Looking at the PR, I am incredibly happy to see explicit type annotations throughout Beam codebase. I believe this is a step in the right direction even if the tooling were not able to do any inference at all. The effort required from developers to add annotations in their code

Re: Rethinking the Flink Runner modes

2019-10-28 Thread Kyle Weaver
Filed https://issues.apache.org/jira/browse/BEAM-8507 for the issue I mentioned. On Mon, Oct 28, 2019 at 4:12 PM Kyle Weaver wrote: > > I'd like to see this issue resolved before 2.17 as changing the public > API once it's released will be harder. > > +1. In particular, I misunderstood that

Re: Rethinking the Flink Runner modes

2019-10-28 Thread Kyle Weaver
> I'd like to see this issue resolved before 2.17 as changing the public API once it's released will be harder. +1. In particular, I misunderstood that [auto] is not supported by `FlinkUberJarJobServer`. Since [auto] is now the default, it's broken for Python 3.6+.

Re: Python Precommit duration pushing 2 hours

2019-10-28 Thread Pablo Estrada
I have written https://github.com/apache/beam/pull/9910 to reduce FnApiRunnerTest variations. I'm not in a rush to merge, but rather happy to start a discussion. I'll also try to figure out if there are other tests slowing down the suite significantly. Best -P. On Fri, Oct 25, 2019 at 7:41 PM

Re: Quota issues again

2019-10-28 Thread Mikhail Gryzykhin
Quota jira issue: https://issues.apache.org/jira/browse/BEAM-8195 On Mon, Oct 28, 2019 at 2:05 PM Mikhail Gryzykhin wrote: > Hi everyone, > > > While validating the release branch, I got a failure due to quota again. Also, the current > queue time for jobs is more than 1.5 hours. > > > I'm not sure if it

Re: Rethinking the Flink Runner modes

2019-10-28 Thread Robert Bradshaw
Thanks for bringing this to the list. Some comments below, though it would be good to get additional feedback beyond those that have been participating on the PR, if any. I'd like to see this issue resolved before 2.17 as changing the public API once it's released will be harder. On Mon, Oct 28,

Re: [Question] Cannot resolve symbol 'AutoValue_KafkaIO_WriteRecords'

2019-10-28 Thread Kirill Kozlov
I found this document [1] helpful when setting up IntelliJ to work with Beam. Make sure that (File | Settings | Build, Execution, Deployment | Build Tools | Gradle | Runner) has "Delegate IDE build/run actions to Gradle" checked and "Run test using" is set to "Gradle Test Runner". Under

Pipeline AttributeError on Python3

2019-10-28 Thread Rakesh Kumar
Hi All, We have noticed a weird intermittent issue on Python 3 that we don't run into on Python 2. Sometimes when we try to submit the pipeline, we get an AttributeError (see the stack trace below). We have double-checked, and the attributes/methods are present in the

Re: RFC: python static typing PR

2019-10-28 Thread Robert Bradshaw
Thanks, Chad, this has been a herculean task. I'm excited for the additional tooling and documentation explicit types can bring to our code, even if tooling such as mypy isn't able to do as much inference for obvious cases as I would like. This will, of course, put another burden on developers in

Quota issues again

2019-10-28 Thread Mikhail Gryzykhin
Hi everyone, While validating the release branch, I got a failure due to quota again. Also, the current queue time for jobs is more than 1.5 hours. I'm not sure if it is worth starting another thread on test efficiency, but I still want to keep this mail to highlight the issues. See PS for links.

Re: [UPDATE] Preparing for Beam 2.17.0 release

2019-10-28 Thread Ahmet Altay
On Mon, Oct 28, 2019 at 12:44 PM Gleb Kanterov wrote: > It looks like BigQueryIO DIRECT_READ is broken since 2.16.0, I've added a > ticket describing the problem and possible fix, see BEAM-8504 > [1]. > Should this be added to 2.16 blog post as

Re: [UPDATE] Preparing for Beam 2.17.0 release

2019-10-28 Thread Gleb Kanterov
It looks like BigQueryIO DIRECT_READ is broken since 2.16.0, I've added a ticket describing the problem and possible fix, see BEAM-8504 [1]. [1]: https://issues.apache.org/jira/browse/BEAM-8504 On Wed, Oct 23, 2019 at 9:19 PM Kenneth Knowles

Re: Is there good way to make Python SDK docs draft accessible?

2019-10-28 Thread Udi Meiri
I believe that generating pydoc for the website is still a manual process (unlike the rest of the website?). The reviewer will need to manually generate the docs (checkout the PR, run tox -e docs). On Mon, Oct 28, 2019 at 10:55 AM Yoshiki Obata wrote: > Hi all. > > I'm working on enabling to

Re: [Discuss] Ideas for Apache Beam presence in social media

2019-10-28 Thread Aizhamal Nurmamat kyzy
Thank you Matthias, I was supposed to write up the documentation.. sorry this got slipped through the cracks. I will prepare the PR until the end of the week. On Tue, Oct 22, 2019, 12:51 AM Matthias Baetens wrote: > Thanks Thomas. > > Happy to help on the doc side when I find some time :) I'll

Is there good way to make Python SDK docs draft accessible?

2019-10-28 Thread Yoshiki Obata
Hi all. I'm working on enabling Python SDK docs generation with Python 3 [1]. I have modified the scripts, and the generated docs now need to be reviewed by someone else. But there seems to be no existing way to upload generated docs to somewhere anyone can access, unlike the website HTML, which can be uploaded

Re: Go IOs ?

2019-10-28 Thread Chamikara Jayalath
On Mon, Oct 28, 2019 at 9:15 AM Robert Burke wrote: > There are IOs that work [1], but they haven't been vetted for production > use (performance, overhead etc) just yet. As such, I don't recommend > putting them on the site at this time. Of course, folks are free to discuss > and disagree with me

RFC: python static typing PR

2019-10-28 Thread Chad Dombrova
Hi all, I've been working on a PR to add static typing to the beam python sdk for the past 4 months or so. This has been an epic journey which has required chasing down numerous fixes across several other projects (mypy, pylint, python-future), but the mypy tests are now passing! I'm not sure

Re: [DISCUSS] How to stopp SdkWorker in SdkHarness

2019-10-28 Thread Luke Cwik
I would go with creating JIRAs and PRs directly since this doesn't seem to be contentious since you have received feedback from a few folks and they are all suggesting the same thing. On Sun, Oct 27, 2019 at 9:27 PM jincheng sun wrote: > Hi all, > > Thanks a lot for your feedback. It seems that

Re: [spark structured streaming runner] merge to master?

2019-10-28 Thread Alexey Romanenko
Let me share some of my thoughts on this. >> - shall we filter out the package name from the release? >> Until the new runner is ready to be used in production (or, at least, for beta testing, in which case users should be clearly warned about that), I believe we need to filter

Re: Go IOs ?

2019-10-28 Thread Robert Burke
There are IOs that work [1], but they haven't been vetted for production use (performance, overhead etc) just yet. As such, I don't recommend putting them on the site at this time. Of course, folks are free to discuss and disagree with me on that :). The Go SDK doesn't yet have a way of scaling

Re: Go IOs ?

2019-10-28 Thread Chamikara Jayalath
Seems like we have several file-based IO connectors and datastore here: https://github.com/apache/beam/tree/master/sdks/go/pkg/beam/io We should definitely add these to the list. I guess the bulk of IOs will be supported through the cross-language transforms framework when we have that for the Go SDK.

Re: Apache Pulsar connector for Beam

2019-10-28 Thread Yijie Shen
I've written pulsar-flink and pulsar-spark connectors before. Please feel free to ping me if you need some help. Best, Yijie On Mon, Oct 28, 2019 at 1:21 AM Sijie Guo wrote: > Add Yijie in the loop. He can help from Pulsar side. > > - Sijie > > On Sat, Oct 26, 2019 at 7:23 PM Taher Koitawala

Re: Java PortableRunner package name

2019-10-28 Thread Michał Walenia
Hi, thanks for the opinion. I think that's a very good argument against the change. I'll stick with changing package names. Have a good day, Michal On Mon, Oct 28, 2019 at 2:41 PM Maximilian Michels wrote: > Hi Michal, > > the package name looks good to me. > > -1 on the name change. For

Re: Java PortableRunner package name

2019-10-28 Thread Maximilian Michels
Hi Michal, the package name looks good to me. -1 on the name change. For users, the current name "PortableRunner" reflects best what it does, running portable pipelines. The details of the translation and the submission process do not have to be reflected in the name. Cheers, Max On

Rethinking the Flink Runner modes

2019-10-28 Thread Maximilian Michels
Hi, Robert and Kyle have been doing great work to simplify submitting portable pipelines with the Flink Runner. Part of this is having a Python "FlinkRunner" which handles bringing up a Beam job server and submitting the pipeline directly via the Flink REST API. One building block is the

Re: Java PortableRunner package name

2019-10-28 Thread Michał Walenia
Hi all, thank you for your replies and ideas. My proposition is to move PortableRunner to package sdks.java.portability. I really like the idea of renaming it - PortableRunnerClient looks like a good idea. WDYT? Regards, Michal On Wed, Oct 23, 2019 at 12:09 PM Ismaël Mejía wrote: > +Ankur

Beam Dependency Check Report (2019-10-28)

2019-10-28 Thread Apache Jenkins Server
High Priority Dependency Updates Of Beam Python SDK:
Dependency Name | Current Version | Latest Version | Release Date Of the Current Used Version | Release Date Of The Latest Release | JIRA Issue
mock | 2.0.0 | 3.0.5 | 2019-05-20