Re: Exploding windows and FnApiDoFnRunner

2020-05-01 Thread Reuven Lax
FnApiDoFnRunner does run Java DoFns. On Fri, May 1, 2020 at 9:10 PM Robert Burke wrote: > In the Go SDK this optimization is handled on the SDK side, in the pardo > execution node not on the runner side of the FnAPI > > But I think I'm about to learn that FnApiDoFnRunner is something that runs

Re: Exploding windows and FnApiDoFnRunner

2020-05-01 Thread Robert Burke
In the Go SDK this optimization is handled on the SDK side, in the pardo execution node not on the runner side of the FnAPI But I think I'm about to learn that FnApiDoFnRunner is something that runs on the Java SDK side rather than on the runner side, despite the name. On Fri, May 1, 2020, 9:02

Re: Exploding windows and FnApiDoFnRunner

2020-05-01 Thread Reuven Lax
Ah - so we don't implement the optimization of not expanding the windows if not necessary? On Fri, May 1, 2020 at 8:56 PM Luke Cwik wrote: > In all the processElementYYY methods the currentWindow is assigned as can > be seen here as we loop over the set of windows: > >

Re: Exploding windows and FnApiDoFnRunner

2020-05-01 Thread Luke Cwik
In all the processElementYYY methods the currentWindow is assigned as can be seen here as we loop over the set of windows: https://github.com/apache/beam/blob/9bb2990c0f6c08dd33d9c6fa1fd91842c644a8e3/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/FnApiDoFnRunner.java#L738 On Fri, May

Exploding windows and FnApiDoFnRunner

2020-05-01 Thread Reuven Lax
In Beam a WindowedValue can contain multiple windows, because an element can be in multiple windows at once (for example, sliding windows). Usually we keep these elements unexpanded, but if the user's DoFn observes the window then we have to "explode" the element out, and we run the process
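
A minimal sketch of the "explode" idea described in this message, written in plain Python rather than against the Beam API. The names `WindowedValue`, `explode`, and `run_process` here are hypothetical stand-ins for illustration and are not Beam's actual classes or the FnApiDoFnRunner code. It assumes windows can be represented as simple string labels and that the caller already knows whether the user function observes the window.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Optional, Tuple


@dataclass(frozen=True)
class WindowedValue:
    """Hypothetical stand-in: one value that belongs to several windows."""
    value: str
    windows: Tuple[str, ...]  # window labels, e.g. overlapping sliding windows


def explode(wv: WindowedValue) -> Iterator[WindowedValue]:
    """Yield one single-window copy of the element per window."""
    for window in wv.windows:
        yield WindowedValue(wv.value, (window,))


def run_process(
    wv: WindowedValue,
    fn: Callable[[str, Optional[str]], str],
    observes_window: bool,
) -> List[WindowedValue]:
    """Call fn once if it ignores the window, else once per window."""
    if not observes_window:
        # Unexpanded path: a single call, and the output keeps all windows.
        return [WindowedValue(fn(wv.value, None), wv.windows)]
    # Exploded path: one call per window, each output in exactly one window.
    return [
        WindowedValue(fn(single.value, single.windows[0]), single.windows)
        for single in explode(wv)
    ]
```

With an element in two windows, the unexpanded path makes one call and the exploded path makes two, which is the cost the optimization discussed in this thread tries to avoid when the window is not observed.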

Re: NPE in Calcite dialect when input PCollection has logical type in schema, from JdbcIO Transform

2020-05-01 Thread rahul patwari
Thanks for your suggestion, Brian. I will move the logical types explicitly defined for JdbcIO in org.apache.beam.sdk.io.jdbc.LogicalTypes to org.apache.beam.sdk.schemas.logicaltypes with a URN identifier. If any other IO defines logical types which correspond to SQL data types, all those

Re: JIRA priorities explaination

2020-05-01 Thread Ahmet Altay
+1 sounds good to me. Oftentimes I confused the relative priorities of critical/blocker/major. On Fri, May 1, 2020 at 3:05 PM Tyson Hamilton wrote: > Proposal sounds good to me! The tool tips will be fantastic. > > On Fri, May 1, 2020 at 3:03 PM Robert Bradshaw > wrote: > >> On Fri, May 1,

Re: JIRA priorities explaination

2020-05-01 Thread Tyson Hamilton
Proposal sounds good to me! The tool tips will be fantastic. On Fri, May 1, 2020 at 3:03 PM Robert Bradshaw wrote: > On Fri, May 1, 2020 at 2:34 PM Kenneth Knowles wrote: > > > > Coming back to this thread (again!) > > > > I wrote up https://beam.apache.org/contribute/jira-priorities/ and >

Re: JIRA priorities explaination

2020-05-01 Thread Luke Cwik
Sounds good to me. On Fri, May 1, 2020 at 2:34 PM Kenneth Knowles wrote: > Coming back to this thread (again!) > > I wrote up https://beam.apache.org/contribute/jira-priorities/ and > https://beam.apache.org/contribute/release-blockers/ and I have had > success communicating using these docs. >

Re: JIRA priorities explaination

2020-05-01 Thread Robert Bradshaw
On Fri, May 1, 2020 at 2:34 PM Kenneth Knowles wrote: > > Coming back to this thread (again!) > > I wrote up https://beam.apache.org/contribute/jira-priorities/ and > https://beam.apache.org/contribute/release-blockers/ and I have had success > communicating using these docs. > > However, some

Re: JIRA priorities explaination

2020-05-01 Thread Kenneth Knowles
Coming back to this thread (again!) I wrote up https://beam.apache.org/contribute/jira-priorities/ and https://beam.apache.org/contribute/release-blockers/ and I have had success communicating using these docs. However, some people get confused because the existing Jira priorities have tooltips

Re: NPE in Calcite dialect when input PCollection has logical type in schema, from JdbcIO Transform

2020-05-01 Thread Brian Hulette
On Thu, Apr 30, 2020 at 11:26 PM rahul patwari wrote: > Hi, > > A JIRA ticket is raised to track this bug: BEAM-8307 > > > I have raised a PR: https://github.com/apache/beam/pull/11581 to fix the > issue. > > This PR takes care of using BeamSql

Re: Automation for Jira

2020-05-01 Thread Kenneth Knowles
Based on the mild consensus and my availability, I just did #1. I have not done any others. It seems #2 may be infeasible [1] and I am convinced that we should not auto-close. I'll update again in a bit... Kenn [1] https://jira.atlassian.com/browse/JRACLOUD-28064 On Wed, Apr 29, 2020 at 2:54 PM

Re: Non-trivial joins examples

2020-05-01 Thread Jan Lukavský
Interestingly, I'm currently also working on a proposal for generic join semantics. I plan to send a proposal for review, but unfortunately, there are still other things keeping me busy. I take this opportunity to review high-level thoughts, maybe someone can give some points. The general

Re: [DISCUSS] How many Python 3.x minor versions should Beam Python SDK aim to support concurrently?

2020-05-01 Thread Valentyn Tymofieiev
Hi Yoshiki, Thanks a lot for your help with Python 3 support so far and most recently, with your work on Python 3.8. Overall the proposal sounds good to me. I see several aspects here that we need to address: 1) We can seed the smoke test suite with typehints tests, and add more tests later if

Re: Companies using Beam?

2020-05-01 Thread Aizhamal Nurmamat kyzy
Another example of similar pages: https://www.instaclustr.com/resource-type/case-studies/. The whole resources section is very useful. It would be good for Beam to have links to webinars, meetups, books, whitepapers, etc. from the website where new users land. Simpler example:

Re: possible bug in AvroUtils

2020-05-01 Thread Brian Hulette
Let's discuss details on the jira. I could maybe take it, but could use advice on the right course of action. On Fri, May 1, 2020 at 6:05 AM Ismaël Mejía wrote: > I dug deeper and found that this global static change was introduced > since the beginning of the Avro / Beam Schema support (Beam

Re: [REVIEW][please pause website changes] Migrated the Beam website to Hugo

2020-05-01 Thread Brian Hulette
Regarding move detection: I worked with Nam on this some on the-asf slack. We couldn't make squashing into a single large commit work - when I did it, `git log` still showed many dropped and added files. Breaking out a single commit with the file moves was the best we could manage. I tested a PR

Re: [REVIEW][please pause website changes] Migrated the Beam website to Hugo

2020-05-01 Thread Robert Bradshaw
I just took a look, and added a couple of comments, but it mostly looks good. Thanks for creating a commit that preserves changes; that's a big improvement. +1 to Ahmet's suggestion about breaking the huge commit up a bit more. I would suggest one that adds the mechanics (etc.), one that applies a

Re: [DISCUSS] How many Python 3.x minor versions should Beam Python SDK aim to support concurrently?

2020-05-01 Thread Yoshiki Obata
Hello everyone. I'm working on Python 3.8 support[1] and now is the time for preparing test infrastructure. According to the discussion, I've considered how to prioritize tests. My plan is as below. I'd like to get your thoughts on this. - With all low-pri Python, apache_beam.typehints.*_test

Re: Non-trivial joins examples

2020-05-01 Thread Kenneth Knowles
+dev @beam and some people who I talk about joins with Interesting! It is a lot to take in and fully grok the code, so calling in reinforcements... Generally, I think there's agreement that for a lot of real use cases, you have to roll your own join using the lower level Beam primitives. So I

Re: [REVIEW][please pause website changes] Migrated the Beam website to Hugo

2020-05-01 Thread Kenneth Knowles
I believe taking Brian and Robert's advice to help git detect moves (even more than you already have) will make this much more manageable. I just tried it out and squashing commits brings it to "631 files changed, 10363 insertions(+), 9945 deletions(-)" according to git, so that is more manageable

Re: Greetings from Tyson

2020-05-01 Thread Kenneth Knowles
Welcome! On Thu, Apr 30, 2020 at 9:48 AM Ruoyun Huang wrote: > Welcome Tyson! > > On Thu, Apr 30, 2020 at 6:44 AM Connell O'Callaghan > wrote: > >> Welcome Tyson!!! >> >> >> >> On Thu, Apr 30, 2020 at 6:12 AM Ismaël Mejía wrote: >> >>> Welcome! >>> >>> On Thu, Apr 30, 2020 at 12:27 AM Alan

Re: Companies using Beam?

2020-05-01 Thread Kenneth Knowles
+1 to a "Powered By" style page. These are pretty common. Like https://calcite.apache.org/docs/powered_by.html I think applying a designer's skills might result in something a bit cooler looking... Kenn On Thu, Apr 30, 2020 at 9:24 AM Austin Bennett wrote: > A first pass, something like: > >

Re: possible bug in AvroUtils

2020-05-01 Thread Ismaël Mejía
I dug deeper and found that this global static change was introduced since the beginning of the Avro / Beam Schema support (Beam 2.15.0): https://github.com/apache/beam/commit/2a40c576cfb On Thu, Apr 30, 2020 at 8:52 PM Ismaël Mejía wrote: > > Created

Re: Jenkins jobs not running for my PR 10438

2020-05-01 Thread Ismaël Mejía
done On Fri, May 1, 2020 at 5:31 AM Tomo Suzuki wrote: > > Hi Beam committers, > > Would you trigger the precommit checks for this PR? > https://github.com/apache/beam/pull/11586 > > Regards, > Tomo

NPE in Calcite dialect when input PCollection has logical type in schema, from JdbcIO Transform

2020-05-01 Thread rahul patwari
Hi, A JIRA ticket is raised to track this bug: BEAM-8307 I have raised a PR: https://github.com/apache/beam/pull/11581 to fix the issue. This PR takes care of using BeamSql with JdbcIO. I would be interested to contribute if any other IOs