Re: [Discuss] Ideas for Apache Beam presence in social media

2019-06-04 Thread Kenneth Knowles
Bringing the PMC's conclusion back to this list, we are happy to start with
the following arrangement:

 - Doc/spreadsheet/etc readable by dev@ (aka the public), writable by some
group of contributors to set up a queue of news
 - Any member of PMC approves and executes the posts, with enough time
elapsing to consider it lazy consensus

Any mistake transcribing this conclusion is my own. And of course nothing
is permanent, but we try and iterate.

Kenn

On Mon, Jun 3, 2019 at 2:18 PM Aizhamal Nurmamat kyzy 
wrote:

> Hello folks,
>
> I have created a spreadsheet where people can suggest tweets [1]. It
> contains a couple of tweets that have been tweeted as examples. Also, there
> are a couple others that I will ask PMC members to review in the next few
> days.
>
> I have also created a blog post[2] to invite community members to
> participate by proposing tweets / retweets.
>
> Does this look OK to everyone? I’d love to try it out and see if it drives
> engagement in the community. If not we can always change the processes.
>
> Thanks,
> aizhamal
>
> [1] s.apache.org/beam-tweets
> [2] https://github.com/apache/beam/pull/8747
>
> On Fri, May 24, 2019 at 4:26 PM Kenneth Knowles  wrote:
>
>> Thanks for taking on this work!
>>
>> Kenn
>>
>> On Fri, May 24, 2019 at 2:52 PM Aizhamal Nurmamat kyzy <
>> aizha...@google.com> wrote:
>>
>>> Hi everyone,
>>>
>>> I'd like to pilot this if that's okay by everyone. I'll set up a
>>> spreadsheet, write a blog post publicizing it, and perhaps send out a
>>> tweet. We can improve the process later with tools if necessary.
>>>
>>> Thanks all and have a great weekend!
>>> Aizhamal
>>>
>>> On Tue, May 21, 2019 at 8:37 PM Kenneth Knowles  wrote:
>>>
>>>> Great idea.
>>>>
>>>> Austin - point well taken about whether the PMC really has to
>>>> micro-manage here. The stakes are potentially very high, but so are the
>>>> stakes for code and website changes.
>>>>
>>>> I know that comdev votes authoring privileges to people who are not
>>>> committers, but they are not speaking on behalf of comdev but under their
>>>> own name.
>>>>
>>>> Let's definitely find a way to be effective on social media.
>>>>
>>>> Kenn
>>>>
>>>> On Tue, May 21, 2019 at 4:14 AM Maximilian Michels wrote:

>>>>> Hi Aizhamal,
>>>>>
>>>>> This is a great idea. I think it would help Beam to be more prominent on
>>>>> social media.
>>>>>
>>>>> We also need to discuss this on the private@ mailing list, but I don't
>>>>> see anything standing in the way if the PMC always gets to approve the
>>>>> proposed social media postings.
>>>>>
>>>>> I could even imagine that the PMC gives rights to a Beam community
>>>>> member to post in their name.
>>>>>
>>>>> Thanks,
>>>>> Max
>>>>>
>>>>> On 21.05.19 03:09, Austin Bennett wrote:
>>>>> > Is PMC definitely in charge of this (approving, communication channel,
>>>>> > etc)?
>>>>> >
>>>>> > There could even be a more concrete pull-request-like function even for
>>>>> > things like tweets (to minimize cut/paste operations)?
>>>>> >
>>>>> > I remember a bit of a mechanism having been proposed some time ago (in
>>>>> > another circumstance), though it doesn't look like it made it terribly far:
>>>>> > http://www.redhenlab.org/home/the-cognitive-core-research-topics-in-red-hen/the-barnyard/-slick-tweeting
>>>>> > (I haven't otherwise seen such functionality).
>>>>> >
>>>>> > On Mon, May 20, 2019 at 4:54 PM Robert Burke wrote:
>>>>> >
>>>>> > +1
>>>>> > As a twitter user, I like this idea.
>>>>> >
>>>>> > On Mon, 20 May 2019 at 15:18, Aizhamal Nurmamat kyzy
>>>>> > <aizha...@google.com> wrote:
>>>>> >
>>>>> > Hello everyone,
>>>>> >
>>>>> > What does the community think of making Apache Beam’s social
>>>>> > media presence more active and more community driven?
>>>>> >
>>>>> > The Slack and StackOverflow for Apache Beam offer pretty nice
>>>>> > support, but we still could utilize Twitter & LinkedIn better to
>>>>> > share more interesting Beam news. For example, we could tweet to
>>>>> > welcome new committers, announce new features consistently,
>>>>> > share and recognize contributions, promote events and meetups,
>>>>> > share other news that is relevant to Beam, big data, etc.
>>>>> >
>>>>> > I understand that PMC members may not have time to do curation,
>>>>> > moderation and creation of content; so I was wondering if we
>>>>> > could create a spreadsheet where community members could propose
>>>>> > posts with publishing dates, and let somebody filter,
>>>>> > moderate, and manage it; then send to a PMC member for publication.
>>>>> >
>>>>> > I would love to help where I can in this regard. I’ve had some
>>>>> > experience doing social media elsewhere in the

Re: Removing shading by default within BeamModulePlugin.groovy

2019-06-04 Thread Kenneth Knowles
Nice! This is a huge step. One thing that showed up in the last big gradle
change was needing to check the generated poms.

Kenn

On Tue, Jun 4, 2019 at 5:07 PM Lukasz Cwik  wrote:

> Since we have been migrating to vendoring instead of shading[1], and
> following previous vendoring efforts[2, 3], I have opened PR 8762[4],
> which migrates all projects that weren't actually shading anything to not
> perform any shading. This required me to fix up all intra-project
> dependencies and release publishing.
>
> The following is a list of all project paths which are still using shading
> for some reason:
> model/*
> sdks/java/core
> sdks/java/extensions/kryo
> sdks/java/extensions/sql
> sdks/java/extensions/sql/jdbc
> sdks/java/harness
> runners/spark/job-server
> runners/direct-java
> runners/samza/job-server
> runners/google-cloud-dataflow-java/worker
> runners/google-cloud-dataflow-java/worker/legacy-worker
> runners/google-cloud-dataflow-java/worker/windmill
> vendor/*
>
> Out of the list above, migrating sdks/java/core and runners/direct-java
> (in that order) would provide the most benefit to moving away from shading
> within our project. Many of the others are either shaded proto classes or
> applications (e.g. job-servers, harness, sql jdbc) and either require
> shading to be compatible with vendoring or aren't meant to be used as
> dependencies.
>
> Since this is a larger change that cuts across so many projects there is
> risk for breakage. I'm looking for people to help test the change and
> validate any scenarios that they are specifically interested in. I'm
> planning to run several of the postcommits on my PR and check that we can
> build a release in addition to any efforts others provide before looking to
> have the change merged.
>
> The following guidance should help those who edit Gradle build files
> (after this change is merged):
> * For projects that don't perform any shading, those projects have been
> migrated to use the default configurations that the Gradle Java plugin
> uses[5]. Note that the default configurations we use have been deprecated.
> * For projects that depend on another project that isn't shaded, the intra
> project configuration has been swapped to use compile / testRuntime instead
> of shadow and shadowTest
> * Existing projects that are still shaded should use the shadow and
> shadowTest configurations as before.
>
> 1:
> https://lists.apache.org/thread.html/4c12db35b40a6d56e170cd6fc8bb0ac4c43a99aa3cb7dbae54176815@%3Cdev.beam.apache.org%3E
> 2:
> https://lists.apache.org/thread.html/4c12db35b40a6d56e170cd6fc8bb0ac4c43a99aa3cb7dbae54176815@%3Cdev.beam.apache.org%3E
> 3:
> https://lists.apache.org/thread.html/972b5175641f4eaf7ec92870cc0ff72fa52e6f0bbaccc384a3814e45@%3Cdev.beam.apache.org%3E
> 4: https://github.com/apache/beam/pull/8762
> 5:
> https://docs.gradle.org/current/userguide/java_plugin.html#sec:java_plugin_and_dependency_management
>


Re: [VOTE] Release 2.13.0, release candidate #2

2019-06-04 Thread Ankur Goenka
Final few things remaining for the release
* Please review https://github.com/apache/beam/pull/8667

After that, we can:
* Finalize the release version in JIRA (PMC help needed)
* List the release version at reporter.apache.org (PMC help needed)
* Promote the release.

On Tue, Jun 4, 2019 at 5:02 PM Ahmet Altay  wrote:

> I would suggest having a single way of tracking cherry-pick requests for an
> RC. Currently we use emails on the RC thread, open PRs, and Jiras tagged
> for the release. This is confusing for the person doing the release while
> they are juggling multiple things. How about we ask all cherry-pick requests
> to have a JIRA filed against that release and marked as a blocker?
>
> On Tue, Jun 4, 2019 at 1:05 PM Ankur Goenka  wrote:
>
>> That makes sense.
>> I would also like to add that the corresponding PR should be added to an
>> open blocking Jira for the release to keep a single source to check.
>>
>> On Tue, Jun 4, 2019 at 12:15 PM Kenneth Knowles  wrote:
>>
>>> I would actually suggest that the following search needs to be triaged
>>> to zero before cutting an RC:
>>> https://github.com/apache/beam/pulls?utf8=%E2%9C%93&q=is%3Apr+is%3Aopen+base%3Arelease-2.13.0
>>> .
>>>
>>> On Tue, Jun 4, 2019 at 11:17 AM Ankur Goenka  wrote:
>>>
>>>> Sorry, I missed the comment about not including the weekend in the 72-hour
>>>> voting period.
>>>>
>>>> I meant to update the blog post
>>>> https://github.com/apache/beam/pull/8667/files once we have finalized
>>>> the RC so that it can be consistent. Please add any comments to the PR and I
>>>> can incorporate them.
>>>>
>>>> As we did not go for a 3rd RC and
>>>> https://github.com/apache/beam/pull/8714 was not blocking the 2.13
>>>> release, I went with the release.
>>>>
>>>> I have released the Maven artifacts for Beam, so I suppose we cannot
>>>> do another RC for 2.13.0.
>>>> If we need anything urgently in 2.13 then we can do a bug fix release
>>>> 2.13.1.


>>>> On Tue, Jun 4, 2019 at 8:59 AM Thomas Weise wrote:
>>>>
>>>>> This seems rushed, and things are falling through the cracks.
>>>>>
>>>>> Max had requested not to include the weekend in the voting period.
>>>>>
>>>>> Valentyn: I had the same question on the first RC. The PR should be
>>>>> included in the vote for review. You can find it here:
>>>>> https://github.com/apache/beam/pull/8667/files
>>>>>
>>>>> I had requested to include the following backport PR before the RC:
>>>>> https://github.com/apache/beam/pull/8714 - It's not blocking, but it
>>>>> would be nice if someone could merge it for any future release from this
>>>>> branch.
>>>>>
>>>>> Thanks,
>>>>> Thomas
>>>>>
>>>>>
>>>>> On Tue, Jun 4, 2019 at 1:59 AM Maximilian Michels wrote:
>>>>>
>>>>>> The summary is not correct. Binding votes (in order):
>>>>>>
>>>>>> Ahmet Altay
>>>>>> Robert Bradshaw
>>>>>> Maximilian Michels
>>>>>> Jean-Baptiste Onofré
>>>>>> Lukasz Cwik
>>>>>>
>>>>>> A total of 5 binding votes.
>>>>>>
>>>>>> On 04.06.19 02:37, Ankur Goenka wrote:
>>>>>> > +1
>>>>>> > Thanks for validating the release and voting.
>>>>>> > With 0(-1), 6(+1) and 3(+1 binding) votes, I am concluding the voting
>>>>>> > process.
>>>>>> > I am going ahead with the release and will keep the community posted
>>>>>> > with the updates.
>>>>>> >
>>>>>> > On Mon, Jun 3, 2019 at 1:57 PM Andrew Pilloud wrote:
>>>>>> >
>>>>>> > +1 Reviewed the Nexmark java and SQL perfkit graphs, no obvious
>>>>>> > regressions over the previous release.
>>>>>> >
>>>>>> > On Mon, Jun 3, 2019 at 1:15 PM Lukasz Cwik wrote:
>>>>>> >
>>>>>> > Thanks for the clarification.
>>>>>> >
>>>>>> > On Mon, Jun 3, 2019 at 11:40 AM Ankur Goenka <goe...@google.com> wrote:
>>>>>> >
>>>>>> > Yes, I meant I will close the voting at 5pm and start the
>>>>>> > release process.
>>>>>> >
>>>>>> > On Mon, Jun 3, 2019, 10:59 AM Lukasz Cwik <lc...@google.com> wrote:
>>>>>> >
>>>>>> > Ankur, did you mean to say you're going to close the vote
>>>>>> > today at 5pm? (and then complete the release afterwards)
>>>>>> >
>>>>>> > On Mon, Jun 3, 2019 at 10:54 AM Ankur Goenka <goe...@google.com> wrote:
>>>>>> >
>>>>>> > Thanks for validating and voting.
>>>>>> >
>>>>>> > We have 4 binding votes.
>>>>>> > I will complete the release today at 5PM. Please raise
>>>>>> > any concerns before that.
>>>>>> >
>>>>>> > Thanks,
>>>>>> > Ankur
>>>>>> >
>>>>>> > On Mon, Jun 3, 2019 at 8:36 AM Lukasz Cwik
>>>>>> >

Removing shading by default within BeamModulePlugin.groovy

2019-06-04 Thread Lukasz Cwik
Since we have been migrating to vendoring instead of shading[1], and
following previous vendoring efforts[2, 3], I have opened PR 8762[4],
which migrates all projects that weren't actually shading anything to not
perform any shading. This required me to fix up all intra-project
dependencies and release publishing.

The following is a list of all project paths which are still using shading
for some reason:
model/*
sdks/java/core
sdks/java/extensions/kryo
sdks/java/extensions/sql
sdks/java/extensions/sql/jdbc
sdks/java/harness
runners/spark/job-server
runners/direct-java
runners/samza/job-server
runners/google-cloud-dataflow-java/worker
runners/google-cloud-dataflow-java/worker/legacy-worker
runners/google-cloud-dataflow-java/worker/windmill
vendor/*

Out of the list above, migrating sdks/java/core and runners/direct-java (in
that order) would provide the most benefit to moving away from shading
within our project. Many of the others are either shaded proto classes or
applications (e.g. job-servers, harness, sql jdbc) and either require
shading to be compatible with vendoring or aren't meant to be used as
dependencies.

Since this is a larger change that cuts across so many projects there is
risk for breakage. I'm looking for people to help test the change and
validate any scenarios that they are specifically interested in. I'm
planning to run several of the postcommits on my PR and check that we can
build a release in addition to any efforts others provide before looking to
have the change merged.

The following guidance should help those who edit Gradle build files (after
this change is merged):
* For projects that don't perform any shading, those projects have been
migrated to use the default configurations that the Gradle Java plugin
uses[5]. Note that the default configurations we use have been deprecated.
* For projects that depend on another project that isn't shaded, the intra
project configuration has been swapped to use compile / testRuntime instead
of shadow and shadowTest
* Existing projects that are still shaded should use the shadow and
shadowTest configurations as before.
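
As a rough illustration of the guidance above (a sketch, not code from PR 8762), the helper below models which Gradle configuration name an intra-project dependency would use, depending on whether the dependency is in the still-shaded list. The function names, the reading of the rule as "shaded dependency means shadow/shadowTest, otherwise compile/testRuntime", and the path some/unshaded/project are assumptions made for illustration; only the configuration names and the shaded project paths come from this message.

```python
from fnmatch import fnmatchcase

# Project paths this message lists as still using shading (wildcards as written).
STILL_SHADED = [
    "model/*",
    "sdks/java/core",
    "sdks/java/extensions/kryo",
    "sdks/java/extensions/sql",
    "sdks/java/extensions/sql/jdbc",
    "sdks/java/harness",
    "runners/spark/job-server",
    "runners/direct-java",
    "runners/samza/job-server",
    "runners/google-cloud-dataflow-java/worker",
    "runners/google-cloud-dataflow-java/worker/legacy-worker",
    "runners/google-cloud-dataflow-java/worker/windmill",
    "vendor/*",
]


def is_still_shaded(project_path):
    """Return True if the path matches an entry in the list above."""
    return any(fnmatchcase(project_path, pattern) for pattern in STILL_SHADED)


def dependency_configuration(dependency_path, for_tests=False):
    """Pick a configuration name for an intra-project dependency (assumed rule)."""
    if is_still_shaded(dependency_path):
        # Shaded projects keep the shadow / shadowTest configurations.
        return "shadowTest" if for_tests else "shadow"
    # Unshaded projects use the Gradle Java plugin defaults named in the message.
    return "testRuntime" if for_tests else "compile"


# "some/unshaded/project" is a made-up path used only for this example.
print(dependency_configuration("sdks/java/core"))
print(dependency_configuration("some/unshaded/project"))
print(dependency_configuration("some/unshaded/project", for_tests=True))
```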

1:
https://lists.apache.org/thread.html/4c12db35b40a6d56e170cd6fc8bb0ac4c43a99aa3cb7dbae54176815@%3Cdev.beam.apache.org%3E
2:
https://lists.apache.org/thread.html/4c12db35b40a6d56e170cd6fc8bb0ac4c43a99aa3cb7dbae54176815@%3Cdev.beam.apache.org%3E
3:
https://lists.apache.org/thread.html/972b5175641f4eaf7ec92870cc0ff72fa52e6f0bbaccc384a3814e45@%3Cdev.beam.apache.org%3E
4: https://github.com/apache/beam/pull/8762
5:
https://docs.gradle.org/current/userguide/java_plugin.html#sec:java_plugin_and_dependency_management


Re: [VOTE] Release 2.13.0, release candidate #2

2019-06-04 Thread Ahmet Altay
I would suggest having a single way of tracking cherry-pick requests for an RC.
Currently we use emails on the RC thread, open PRs, and Jiras tagged for
the release. This is confusing for the person doing the release while they are
juggling multiple things. How about we ask all cherry-pick requests to have
a JIRA filed against that release and marked as a blocker?
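
To make the proposal concrete, here is a minimal sketch of the check a release manager could run under that rule: every cherry-pick request must point at a JIRA filed against the release and marked as a blocker. The CherryPickRequest record, its field names, and the JIRA key BEAM-0001 are hypothetical and exist only for this example; nothing here calls the real JIRA or GitHub APIs.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CherryPickRequest:
    """Hypothetical record of one cherry-pick request for a release branch."""
    pr_url: str
    jira_key: Optional[str] = None      # e.g. "BEAM-0001" (made-up key)
    fix_version: Optional[str] = None   # release the JIRA is filed against
    is_blocker: bool = False


def untracked_requests(requests: List[CherryPickRequest],
                       release: str) -> List[CherryPickRequest]:
    """Return requests that lack a blocker JIRA filed against `release`."""
    return [
        r for r in requests
        if not (r.jira_key and r.fix_version == release and r.is_blocker)
    ]


requests = [
    # Tracked the proposed way: JIRA exists, targets the release, is a blocker.
    CherryPickRequest("https://github.com/apache/beam/pull/8667",
                      jira_key="BEAM-0001", fix_version="2.13.0",
                      is_blocker=True),
    # Only mentioned as an open PR, with no JIRA attached.
    CherryPickRequest("https://github.com/apache/beam/pull/8714"),
]

for request in untracked_requests(requests, "2.13.0"):
    print("needs a blocker JIRA:", request.pr_url)
```

With a single source like this, an empty result would mean there is nothing left to reconcile across emails, PRs, and tags.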

On Tue, Jun 4, 2019 at 1:05 PM Ankur Goenka  wrote:

> That makes sense.
> I would also like to add that the corresponding PR should be added to an
> open blocking Jira for the release to keep a single source to check.
>
> On Tue, Jun 4, 2019 at 12:15 PM Kenneth Knowles  wrote:
>
>> I would actually suggest that the following search needs to be triaged to
>> zero before cutting an RC:
>> https://github.com/apache/beam/pulls?utf8=%E2%9C%93&q=is%3Apr+is%3Aopen+base%3Arelease-2.13.0
>> .
>>
>> On Tue, Jun 4, 2019 at 11:17 AM Ankur Goenka  wrote:
>>
>>> Sorry, I missed the comment about not including the weekend in the 72-hour
>>> voting period.
>>>
>>> I meant to update the blog post
>>> https://github.com/apache/beam/pull/8667/files once we have finalized
>>> the RC so that it can be consistent. Please add any comments to the PR and I
>>> can incorporate them.
>>>
>>> As we did not go for a 3rd RC and https://github.com/apache/beam/pull/8714 was
>>> not blocking the 2.13 release, I went with the release.
>>>
>>> I have released the Maven artifacts for Beam, so I suppose we cannot
>>> do another RC for 2.13.0.
>>> If we need anything urgently in 2.13 then we can do a bug fix release
>>> 2.13.1.
>>>
>>>
>>> On Tue, Jun 4, 2019 at 8:59 AM Thomas Weise  wrote:
>>>
>>>> This seems rushed, and things are falling through the cracks.
>>>>
>>>> Max had requested not to include the weekend in the voting period.
>>>>
>>>> Valentyn: I had the same question on the first RC. The PR should be
>>>> included in the vote for review. You can find it here:
>>>> https://github.com/apache/beam/pull/8667/files
>>>>
>>>> I had requested to include the following backport PR before the RC:
>>>> https://github.com/apache/beam/pull/8714 - It's not blocking, but it
>>>> would be nice if someone could merge it for any future release from this
>>>> branch.
>>>>
>>>> Thanks,
>>>> Thomas


>>>> On Tue, Jun 4, 2019 at 1:59 AM Maximilian Michels wrote:
>>>>
>>>>> The summary is not correct. Binding votes (in order):
>>>>>
>>>>> Ahmet Altay
>>>>> Robert Bradshaw
>>>>> Maximilian Michels
>>>>> Jean-Baptiste Onofré
>>>>> Lukasz Cwik
>>>>>
>>>>> A total of 5 binding votes.
>>>>>
>>>>> On 04.06.19 02:37, Ankur Goenka wrote:
>>>>> > +1
>>>>> > Thanks for validating the release and voting.
>>>>> > With 0(-1), 6(+1) and 3(+1 binding) votes, I am concluding the voting
>>>>> > process.
>>>>> > I am going ahead with the release and will keep the community posted
>>>>> > with the updates.
>>>>> >
>>>>> > On Mon, Jun 3, 2019 at 1:57 PM Andrew Pilloud wrote:
>>>>> >
>>>>> > +1 Reviewed the Nexmark java and SQL perfkit graphs, no obvious
>>>>> > regressions over the previous release.
>>>>> >
>>>>> > On Mon, Jun 3, 2019 at 1:15 PM Lukasz Cwik wrote:
>>>>> >
>>>>> > Thanks for the clarification.
>>>>> >
>>>>> > On Mon, Jun 3, 2019 at 11:40 AM Ankur Goenka <goe...@google.com> wrote:
>>>>> >
>>>>> > Yes, I meant I will close the voting at 5pm and start the
>>>>> > release process.
>>>>> >
>>>>> > On Mon, Jun 3, 2019, 10:59 AM Lukasz Cwik <lc...@google.com> wrote:
>>>>> >
>>>>> > Ankur, did you mean to say you're going to close the vote
>>>>> > today at 5pm? (and then complete the release afterwards)
>>>>> >
>>>>> > On Mon, Jun 3, 2019 at 10:54 AM Ankur Goenka <goe...@google.com> wrote:
>>>>> >
>>>>> > Thanks for validating and voting.
>>>>> >
>>>>> > We have 4 binding votes.
>>>>> > I will complete the release today at 5PM. Please raise
>>>>> > any concerns before that.
>>>>> >
>>>>> > Thanks,
>>>>> > Ankur
>>>>> >
>>>>> > On Mon, Jun 3, 2019 at 8:36 AM Lukasz Cwik <lc...@google.com> wrote:
>>>>> >
>>>>> > Since the gearpump issue has been ongoing since
>>>>> > 2.10, I can't consider it a blocker for this
>>>>> > release and am voting +1.
>>>>> >
>>>>> > On Mon, Jun 3, 2019 at 7:13 AM Jean-Baptiste Onofré wrote:
>>>>> >

Re: [DISCUSS] Portability representation of schemas

2019-06-04 Thread Brian Hulette
Yeah that's what I meant. It does seem reasonable to scope any
registry by pipeline and not by PCollection. Then it seems we would want
the entire LogicalType (including the `FieldType representation` field) as
the value type, and not just LogicalTypeConversion. Otherwise we're
separating the representations from the conversions, and duplicating the
representations. You did say a "registry of logical types", so maybe that
is what you meant.

Brian
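
A minimal sketch of the idea discussed above, assuming a registry that is scoped to the pipeline and stores the whole logical type (representation plus conversion) as the value, so the representation is not duplicated per PCollection. The class names, the string stand-in for `FieldType representation`, and the id "example:user_id" are all hypothetical; this is not Beam's actual proto or SDK API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class LogicalType:
    """Hypothetical stand-in: representation and conversions kept together."""
    type_id: str
    representation: str                       # stand-in for a FieldType
    to_representation: Optional[Callable] = None
    from_representation: Optional[Callable] = None


class PipelineLogicalTypeRegistry:
    """One registry per pipeline; schemas refer to logical types by id."""

    def __init__(self) -> None:
        self._types: Dict[str, LogicalType] = {}

    def register(self, logical_type: LogicalType) -> None:
        existing = self._types.get(logical_type.type_id)
        if existing is not None and existing != logical_type:
            raise ValueError(
                f"conflicting definitions for {logical_type.type_id}")
        self._types[logical_type.type_id] = logical_type

    def lookup(self, type_id: str) -> LogicalType:
        return self._types[type_id]


registry = PipelineLogicalTypeRegistry()
registry.register(LogicalType("example:user_id", "INT64",
                              to_representation=int,
                              from_representation=str))

# Two schemas (for two PCollections) share one definition through its id.
schema_a = {"owner": "example:user_id"}
schema_b = {"reviewer": "example:user_id"}
shared = registry.lookup(schema_a["owner"])
print(shared is registry.lookup(schema_b["reviewer"]))
```

Keeping the conversion next to the representation means a lookup by id returns everything an SDK would need, which is the point made above about not separating the two.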

On Tue, Jun 4, 2019 at 1:21 PM Reuven Lax  wrote:

>
>
> On Tue, Jun 4, 2019 at 9:20 AM Brian Hulette  wrote:
>
>>
>>
>> On Mon, Jun 3, 2019 at 10:04 PM Reuven Lax  wrote:
>>
>>>
>>>
>>> On Mon, Jun 3, 2019 at 12:27 PM Brian Hulette 
>>> wrote:
>>>
>>>> > It has to go into the proto somewhere (since that's the only way the
>>>> > SDK can get it), but I'm not sure they should be considered integral parts
>>>> > of the type.
>>>> Are you just advocating for an approach where any SDK-specific
>>>> information is stored outside of the Schema message itself so that Schema
>>>> really does just represent the type? That seems reasonable to me, and
>>>> alleviates my concerns about how this applies to columnar encodings a bit
>>>> as well.

>>>
>>> Yes, that's exactly what I'm advocating.
>>>
>>>

>>>> We could lift all of the LogicalTypeConversion messages out of the
>>>> Schema and the LogicalType like this:
>>>>
>>>> message SchemaCoder {
>>>>   Schema schema = 1;
>>>>   LogicalTypeConversion root_conversion = 2;
>>>>   // only necessary for user type aliases; portable logical types by
>>>>   // definition have nothing SDK-specific
>>>>   map attribute_conversions = 3;
>>>> }

>>>
>>> I'm not sure what the map is for? I think we have status quo without it.
>>>
>>
>> My intention was that the SDK-specific information (to/from functions)
>> for any nested fields that are themselves user type aliases would be stored
>> in this map. That was the motivation for my next question, if we don't
>> allow user types to be nested within other user types we may not need it.
>>
>
> Oh, is this meant to contain the ids of all the logical types in this
> schema? If so I don't think SchemaCoder is the right place for this. Any
> "registry" of logical types should be global to the pipeline, not scoped to
> a single PCollection IMO.
>
>
>> I may be missing your meaning - but I think we currently only have status
>> quo without this map in the Java SDK because Schema.LogicalType is just an
>> interface that must be implemented. It's appropriate for just portable
>> logical types, not user-type aliases. Note I've adopted Kenn's terminology
>> where portable logical type is a type that can be identified by just a URN
>> and maybe some parameters, while a user type alias needs some SDK specific
>> information, like a class and to/from UDFs.
>>
>>
>>>
>>>> I think a critical question (that has implications for the above
>>>> proposal) is how/if the two different concepts Kenn mentioned are allowed
>>>> to nest. For example, you could argue it's redundant to have a user type
>>>> alias that has a Row representation with a field that is itself a user type
>>>> alias, because instead you could just have a single top-level type alias
>>>> with to/from functions that pack and unpack the entire hierarchy. On the
>>>> other hand, I think it does make sense for a user type alias or a truly
>>>> portable logical type to have a field that is itself a truly portable
>>>> logical type (e.g. a user type alias or portable type with a DateTime).
>>>>
>>>> I've been assuming that user-type aliases could be nested, but should
>>>> we disallow that? Or should we go the other way and require that logical
>>>> types define at most one "level"?

>>>
>>> No I think it's useful to allow things to be nested (though of course
>>> the nesting must terminate).
>>>
>>
>>>

>>>> Brian
>>>>
>>>> On Mon, Jun 3, 2019 at 11:08 AM Kenneth Knowles wrote:
>>>>
>>>>>
>>>>> On Mon, Jun 3, 2019 at 10:53 AM Reuven Lax wrote:
>>>>>
>>>>>> So I feel a bit leery about making the to/from functions a
>>>>>> fundamental part of the portability representation. In my mind, that is
>>>>>> very tied to a specific SDK/language. A SDK (say the Java SDK) wants to
>>>>>> allow users to use a wide variety of native types with schemas, and under
>>>>>> the covers uses the to/from functions to implement that. However from the
>>>>>> portable Beam perspective, the schema itself should be the real "type" of
>>>>>> the PCollection; the to/from methods are simply a way that a particular SDK
>>>>>> makes schemas easier to use. It has to go into the proto somewhere (since
>>>>>> that's the only way the SDK can get it), but I'm not sure they should be
>>>>>> considered integral parts of the type.
>>>>>>
>>>>>
>>>>> On the doc in a couple places this distinction was made:
>>>>>
>>>>> * For truly portable logical types, no instructions for the SDK are
>>>>> needed. Instead, they require:
>>>>>    - URN: a standardized

Re: [DISCUSS] Portability representation of schemas

2019-06-04 Thread Reuven Lax
On Tue, Jun 4, 2019 at 9:20 AM Brian Hulette  wrote:

>
>
> On Mon, Jun 3, 2019 at 10:04 PM Reuven Lax  wrote:
>
>>
>>
>> On Mon, Jun 3, 2019 at 12:27 PM Brian Hulette 
>> wrote:
>>
>>> > It has to go into the proto somewhere (since that's the only way the
>>> > SDK can get it), but I'm not sure they should be considered integral parts
>>> > of the type.
>>> Are you just advocating for an approach where any SDK-specific
>>> information is stored outside of the Schema message itself so that Schema
>>> really does just represent the type? That seems reasonable to me, and
>>> alleviates my concerns about how this applies to columnar encodings a bit
>>> as well.
>>>
>>
>> Yes, that's exactly what I'm advocating.
>>
>>
>>>
>>> We could lift all of the LogicalTypeConversion messages out of the
>>> Schema and the LogicalType like this:
>>>
>>> message SchemaCoder {
>>>   Schema schema = 1;
>>>   LogicalTypeConversion root_conversion = 2;
>>>   // only necessary for user type aliases; portable logical types by
>>>   // definition have nothing SDK-specific
>>>   map attribute_conversions = 3;
>>> }
>>>
>>
>> I'm not sure what the map is for? I think we have status quo without it.
>>
>
> My intention was that the SDK-specific information (to/from functions) for
> any nested fields that are themselves user type aliases would be stored in
> this map. That was the motivation for my next question, if we don't allow
> user types to be nested within other user types we may not need it.
>

Oh, is this meant to contain the ids of all the logical types in this
schema? If so I don't think SchemaCoder is the right place for this. Any
"registry" of logical types should be global to the pipeline, not scoped to
a single PCollection IMO.


> I may be missing your meaning - but I think we currently only have status
> quo without this map in the Java SDK because Schema.LogicalType is just an
> interface that must be implemented. It's appropriate for just portable
> logical types, not user-type aliases. Note I've adopted Kenn's terminology
> where portable logical type is a type that can be identified by just a URN
> and maybe some parameters, while a user type alias needs some SDK specific
> information, like a class and to/from UDFs.
>
>
>>
>>> I think a critical question (that has implications for the above
>>> proposal) is how/if the two different concepts Kenn mentioned are allowed
>>> to nest. For example, you could argue it's redundant to have a user type
>>> alias that has a Row representation with a field that is itself a user type
>>> alias, because instead you could just have a single top-level type alias
>>> with to/from functions that pack and unpack the entire hierarchy. On the
>>> other hand, I think it does make sense for a user type alias or a truly
>>> portable logical type to have a field that is itself a truly portable
>>> logical type (e.g. a user type alias or portable type with a DateTime).
>>>
>>> I've been assuming that user-type aliases could be nested, but should we
>>> disallow that? Or should we go the other way and require that logical types
>>> define at most one "level"?
>>>
>>
>> No I think it's useful to allow things to be nested (though of course the
>> nesting must terminate).
>>
>
>>
>>>
>>> Brian
>>>
>>> On Mon, Jun 3, 2019 at 11:08 AM Kenneth Knowles  wrote:
>>>

>>>> On Mon, Jun 3, 2019 at 10:53 AM Reuven Lax wrote:
>>>>
>>>>> So I feel a bit leery about making the to/from functions a fundamental
>>>>> part of the portability representation. In my mind, that is very tied to a
>>>>> specific SDK/language. A SDK (say the Java SDK) wants to allow users to use
>>>>> a wide variety of native types with schemas, and under the covers uses the
>>>>> to/from functions to implement that. However from the portable Beam
>>>>> perspective, the schema itself should be the real "type" of the
>>>>> PCollection; the to/from methods are simply a way that a particular SDK
>>>>> makes schemas easier to use. It has to go into the proto somewhere (since
>>>>> that's the only way the SDK can get it), but I'm not sure they should be
>>>>> considered integral parts of the type.
>>>>>
>>>>
>>>> On the doc in a couple places this distinction was made:
>>>>
>>>> * For truly portable logical types, no instructions for the SDK are
>>>> needed. Instead, they require:
>>>>    - URN: a standardized identifier any SDK can recognize
>>>>    - A spec: what is the universe of values in this type?
>>>>    - A representation: how is it represented in built-in types? This is
>>>>      how SDKs who do not know/care about the URN will process it
>>>>    - (optional): SDKs choose preferred SDK-specific types to embed the
>>>>      values in. SDKs have to know about the URN and choose for themselves.
>>>>
>>>> * For user-level type aliases, written as convenience by the user in
>>>> their pipeline, what Java schemas have today:
>>>>    - to/from UDFs: the code is SDK-specific
>>>>    - some representation of the intended type (like java class):

Re: Join the Beam Community Request Email

2019-06-04 Thread Brian Hulette
Some of the pain of switching to Hugo may be alleviated by the `hugo import
jekyll` command: https://gohugo.io/commands/hugo_import_jekyll/

I guess that will probably just take care of the markdown flavor, but at
least that's one less thing to worry about.

Brian

On Mon, Jun 3, 2019 at 11:08 PM Melissa Pashniak 
wrote:

>
> We currently use Jekyll to generate our HTML files, and there are some
> plugins and hacks to add multilingual support, but it's not built in. As I
> understand it, that's one of the reasons Kubernetes moved away from Jekyll to
> Hugo about a year ago (see their blog post for more details [1]). They
> wanted multilingual support and a language switcher, and after
> investigating Jekyll options, settled on Hugo as it comes with built-in
> multilingual support [2].
>
> We could try to go the Jekyll plugin route to add additional languages in
> Jekyll, but if we want maintainable docs that are in many languages and a
> language switcher as a long-term goal, we might want to consider moving.
> Either way, we'd need to do some work.
>
> For the Jekyll option, we'd (likely) need to:
>
> - evaluate the existing multilingual plugins, see if they do what we
> want, and make any needed code changes
> - change the site directory structure (I suspect)
> - update all of the existing Jenkins jobs to stage/test, and any other
> scripts that deal with building the site. This would likely require someone
> who knows how the building and syncing of the website is put together and
> who has access to the build machines.
> - update the existing local testing/staging scripts for contributors
>
> If we moved to Hugo for built-in support, we'd need to:
>
> - rip out Jekyll and get Hugo set up and configured
> - (same as above, but more substantial changes needed) update all of the
> existing Jenkins jobs to stage/test, and any other scripts that deal with
> building the site. This would likely require someone who knows how the
> building and syncing of the website is put together and who has access
> to the build machines.
> - update the existing local testing/staging scripts for contributors
> - figure out replacements (IF they don't exist for Hugo) for some of our
> Jekyll plugins (for example, github snippet grabber, link checker)
> - make some changes to our markdown files, since Hugo uses a different
> flavor of markdown
> - probably other things I am not aware of ;-)
>
> Thoughts?
>
> [1] https://kubernetes.io/blog/2018/05/05/hugo-migration/
> [2] https://gohugo.io/content-management/multilingual/
>
>
>
> On Fri, May 24, 2019 at 10:50 AM Lukasz Cwik  wrote:
>
>> Welcome Zhang, I have added you as a contributor to the Apache Beam JIRA.
>>
>> I would suggest you take a look at the contribution guide[1] to learn
>> how to get started.
>>
>> If I understand correctly, you're interested in translating several
>> documents found on the Beam website. If so, Melissa would be a good contact
>> since she has helped with our documentation a lot.
>> Melissa, would you know if we have i18n support built into the website
>> already?
>>
>> 1: https://beam.apache.org/contribute/
>>
>> On Thu, May 23, 2019 at 6:24 PM 图霸群英  wrote:
>>
>>> Hello everyone! My name is Zhang Haitao and I am from Beijing, China.
>>> I created a Beam Chinese hobby group in China.
>>> I have also published Beam-related articles on InfoQ:
>>> https://www.infoq.cn/profile/1280576
>>> I would now like to join the Beam community.
>>>
>>> On May 27th, I will also take part in promoting Beam technology at QCon:
>>> https://2019.qconguangzhou.com/presentation/1822
>>>
>>> I am now organizing people in China to translate the English help documents
>>> into Chinese.
>>> I would appreciate your advice and support.
>>>
>>> My GitHub account: xsm110
>>> Example:
>>> https://github.com/xsm110/Apache-Beam-Example
>>> Apache User ID: zhanghaitao8
>>> JIRA: zhanghaitao8
>>>
>>> Attached is the ICLA that I signed and submitted to the ASF.
>>>
>>


Re: [VOTE] Release 2.13.0, release candidate #2

2019-06-04 Thread Ankur Goenka
That makes sense.
I would also like to add that the corresponding PR should be added to an
open blocking Jira for the release, to keep a single source to check.

On Tue, Jun 4, 2019 at 12:15 PM Kenneth Knowles  wrote:

> I would actually suggest that the following search needs to be triaged to
> zero before cutting an RC:
> https://github.com/apache/beam/pulls?utf8=%E2%9C%93&q=is%3Apr+is%3Aopen+base%3Arelease-2.13.0
> .
>
> On Tue, Jun 4, 2019 at 11:17 AM Ankur Goenka  wrote:
>
>> Sorry, I missed the comment about not including weekends in the 72-hour
>> voting period.
>>
>> I meant to update the blog post
>> https://github.com/apache/beam/pull/8667/files once we have finalized
>> the RC so that it can be consistent. Please add any comments to the PR and I
>> can incorporate them.
>>
>> As we did not go for a 3rd RC and https://github.com/apache/beam/pull/8714 was
>> not blocking the 2.13 release, I went ahead with the release.
>>
>> I have released the Maven artifacts for Beam, so I suppose we cannot do
>> another RC for 2.13.0.
>> If we need anything urgently in 2.13, we can do a bug fix release,
>> 2.13.1.
>>
>>
>> On Tue, Jun 4, 2019 at 8:59 AM Thomas Weise  wrote:
>>
>>> This seems rushed, and things are falling through the cracks.
>>>
>>> Max had requested not to include the weekend in the voting period.
>>>
>>> Valentyn: I had the same question on the first RC. The PR should be
>>> included in the vote for review. You can find it here:
>>> https://github.com/apache/beam/pull/8667/files
>>>
>>> I had requested to include the following backport PR before the RC:
>>> https://github.com/apache/beam/pull/8714 - It's not blocking, but it would
>>> be nice if someone could merge it for any future release from this branch.
>>>
>>> Thanks,
>>> Thomas
>>>
>>>

Re: [VOTE] Release 2.13.0, release candidate #2

2019-06-04 Thread Kenneth Knowles
I would actually suggest that the following search needs to be triaged to
zero before cutting an RC:
https://github.com/apache/beam/pulls?utf8=%E2%9C%93&q=is%3Apr+is%3Aopen+base%3Arelease-2.13.0
.
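
As a rough sketch, the same search can be generated for any release branch.
The helper name below is hypothetical; only the search qualifiers (is:pr,
is:open, base:<branch>) are taken from the link above.

```python
from urllib.parse import urlencode


def open_release_prs_url(branch, repo="apache/beam"):
    """Build the GitHub search URL for open PRs targeting a release branch.

    `open_release_prs_url` is an illustrative helper, not part of any Beam
    tooling.
    """
    query = "is:pr is:open base:{}".format(branch)
    # urlencode escapes ":" as %3A and spaces as "+".
    return "https://github.com/{}/pulls?{}".format(repo, urlencode({"q": query}))


url = open_release_prs_url("release-2.13.0")
```

The release manager would open this URL before cutting an RC and confirm the
result list is empty.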

On Tue, Jun 4, 2019 at 11:17 AM Ankur Goenka  wrote:

> Sorry, I missed the comment about not including weekends in the 72-hour
> voting period.
>
> I meant to update the blog post
> https://github.com/apache/beam/pull/8667/files once we have finalized the
> RC so that it can be consistent. Please add any comments to the PR and I can
> incorporate them.
>
> As we did not go for a 3rd RC and https://github.com/apache/beam/pull/8714 was
> not blocking the 2.13 release, I went ahead with the release.
>
> I have released the Maven artifacts for Beam, so I suppose we cannot do
> another RC for 2.13.0.
> If we need anything urgently in 2.13, we can do a bug fix release,
> 2.13.1.
>
>
> On Tue, Jun 4, 2019 at 8:59 AM Thomas Weise  wrote:
>
>> This seems rushed, and things are falling through the cracks.
>>
>> Max had requested not to include the weekend in the voting period.
>>
>> Valentyn: I had the same question on the first RC. The PR should be
>> included in the vote for review. You can find it here:
>> https://github.com/apache/beam/pull/8667/files
>>
>> I had requested to include the following backport PR before the RC:
>> https://github.com/apache/beam/pull/8714 - It's not blocking, but it would
>> be nice if someone could merge it for any future release from this branch.
>>
>> Thanks,
>> Thomas
>>
>>
>> On Tue, Jun 4, 2019 at 1:59 AM Maximilian Michels  wrote:
>>
>>> The summary is not correct. Binding votes (in order):
>>>
>>> Ahmet Altay
>>> Robert Bradshaw
>>> Maximilian Michels
>>> Jean-Baptiste Onofré
>>> Lukasz Cwik
>>>
>>> A total of 5 binding votes.
>>>
>>> On 04.06.19 02:37, Ankur Goenka wrote:
>>> > +1
>>> > Thanks for validating the release and voting.
>>> > With 0(-1), 6(+1) and 3(+1 binding) votes, I am concluding the voting
>>> > process.
>>> > I am going ahead with the release and will keep the community posted
>>> > with the updates.
>>> >
>>> > On Mon, Jun 3, 2019 at 1:57 PM Andrew Pilloud >> > > wrote:
>>> >
>>> > +1 Reviewed the Nexmark java and SQL perfkit graphs, no obvious
>>> > regressions over the previous release.
>>> >
>>> > On Mon, Jun 3, 2019 at 1:15 PM Lukasz Cwik >> > > wrote:
>>> >
>>> > Thanks for the clarification.
>>> >
>>> > On Mon, Jun 3, 2019 at 11:40 AM Ankur Goenka <
>>> goe...@google.com
>>> > > wrote:
>>> >
>>> > Yes, i meant i will close the voting at 5pm and start the
>>> > release process.
>>> >
>>> > On Mon, Jun 3, 2019, 10:59 AM Lukasz Cwik <
>>> lc...@google.com
>>> > > wrote:
>>> >
>>> > Ankur, did you mean to say your going to close the vote
>>> > today at 5pm? (and then complete the release
>>> afterwards)
>>> >
>>> > On Mon, Jun 3, 2019 at 10:54 AM Ankur Goenka
>>> > mailto:goe...@google.com>> wrote:
>>> >
>>> > Thanks for validating and voting.
>>> >
>>> > We have 4 binding votes.
>>> > I will complete the release today 5PM. Please raise
>>> > any concerns before that.
>>> >
>>> > Thanks,
>>> > Ankur
>>> >
>>> > On Mon, Jun 3, 2019 at 8:36 AM Lukasz Cwik
>>> > mailto:lc...@google.com>>
>>> wrote:
>>> >
>>> > Since the gearpump issue has been ongoing since
>>> > 2.10, I can't consider it a blocker for this
>>> > release and am voting +1.
>>> >
>>> > On Mon, Jun 3, 2019 at 7:13 AM Jean-Baptiste
>>> > Onofré >> > > wrote:
>>> >
>>> > +1 (binding)
>>> >
>>> > Quickly tested on beam-samples.
>>> >
>>> > Regards
>>> > JB
>>> >
>>> > On 31/05/2019 04:52, Ankur Goenka wrote:
>>> >  > Hi everyone,
>>> >  >
>>> >  > Please review and vote on the release
>>> > candidate #2 for the version
>>> >  > 2.13.0, as follows:
>>> >  >
>>> >  > [ ] +1, Approve the release
>>> >  > [ ] -1, Do not approve the release
>>> > (please provide specific comments)
>>> >  >
>>> >  > The complete staging area is available
>>> > for your review, which includes:
>>> >

Re: Jira tracker permission

2019-06-04 Thread Kenneth Knowles
Welcome!

On Tue, Jun 4, 2019 at 9:03 AM Ahmet Altay  wrote:

> Welcome!
>
> On Mon, Jun 3, 2019 at 10:31 PM Pablo Estrada  wrote:
>
>> I've added you as contributor - welcome
>> -P.
>>
>> On Mon, Jun 3, 2019, 9:16 PM Yichi Zhang  wrote:
>>
>>> Hi, beam-dev,
>>>
>>> This is Yichi Zhang from Google. I just started looking into Beam
>>> projects and will be actively working on the Beam SDK. Could someone grant
>>> me permission to the Beam Jira issue tracker? My Jira username is yichi.
>>>
>>> Looking forward to working with everyone else.
>>>
>>> Thanks,
>>> Yichi
>>>
>>


Re: [VOTE] Release 2.13.0, release candidate #2

2019-06-04 Thread Ankur Goenka
Sorry, I missed the comment about not including weekends in the 72-hour
voting period.

I meant to update the blog post
https://github.com/apache/beam/pull/8667/files once we have finalized the
RC so that it can be consistent. Please add any comments to the PR and I can
incorporate them.

As we did not go for a 3rd RC and https://github.com/apache/beam/pull/8714 was
not blocking the 2.13 release, I went ahead with the release.

I have released the Maven artifacts for Beam, so I suppose we cannot do
another RC for 2.13.0.
If we need anything urgently in 2.13, we can do a bug fix release,
2.13.1.


On Tue, Jun 4, 2019 at 8:59 AM Thomas Weise  wrote:

> This seems rushed, and things are falling through the cracks.
>
> Max had requested not to include the weekend in the voting period.
>
> Valentyn: I had the same question on the first RC. The PR should be
> included in the vote for review. You can find it here:
> https://github.com/apache/beam/pull/8667/files
>
> I had requested to include the following backport PR before the RC:
> https://github.com/apache/beam/pull/8714 - It's not blocking, but it would
> be nice if someone could merge it for any future release from this branch.
>
> Thanks,
> Thomas
>
>
> On Tue, Jun 4, 2019 at 1:59 AM Maximilian Michels  wrote:
>
>> The summary is not correct. Binding votes (in order):
>>
>> Ahmet Altay
>> Robert Bradshaw
>> Maximilian Michels
>> Jean-Baptiste Onofré
>> Lukasz Cwik
>>
>> A total of 5 binding votes.
>>
>> On 04.06.19 02:37, Ankur Goenka wrote:
>> > +1
>> > Thanks for validating the release and voting.
>> > With 0(-1), 6(+1) and 3(+1 binding) votes, I am concluding the voting
>> > process.
>> > I am going ahead with the release and will keep the community posted
>> > with the updates.
>> >
>> > On Mon, Jun 3, 2019 at 1:57 PM Andrew Pilloud > > > wrote:
>> >
>> > +1 Reviewed the Nexmark java and SQL perfkit graphs, no obvious
>> > regressions over the previous release.
>> >
>> > On Mon, Jun 3, 2019 at 1:15 PM Lukasz Cwik > > > wrote:
>> >
>> > Thanks for the clarification.
>> >
>> > On Mon, Jun 3, 2019 at 11:40 AM Ankur Goenka > > > wrote:
>> >
>> > Yes, i meant i will close the voting at 5pm and start the
>> > release process.
>> >
>> > On Mon, Jun 3, 2019, 10:59 AM Lukasz Cwik > > > wrote:
>> >
>> > Ankur, did you mean to say your going to close the vote
>> > today at 5pm? (and then complete the release afterwards)
>> >
>> > On Mon, Jun 3, 2019 at 10:54 AM Ankur Goenka
>> > mailto:goe...@google.com>> wrote:
>> >
>> > Thanks for validating and voting.
>> >
>> > We have 4 binding votes.
>> > I will complete the release today 5PM. Please raise
>> > any concerns before that.
>> >
>> > Thanks,
>> > Ankur
>> >
>> > On Mon, Jun 3, 2019 at 8:36 AM Lukasz Cwik
>> > mailto:lc...@google.com>> wrote:
>> >
>> > Since the gearpump issue has been ongoing since
>> > 2.10, I can't consider it a blocker for this
>> > release and am voting +1.
>> >
>> > On Mon, Jun 3, 2019 at 7:13 AM Jean-Baptiste
>> > Onofré > > > wrote:
>> >
>> > +1 (binding)
>> >
>> > Quickly tested on beam-samples.
>> >
>> > Regards
>> > JB
>> >
>> > On 31/05/2019 04:52, Ankur Goenka wrote:
>> >  > Hi everyone,
>> >  >
>> >  > Please review and vote on the release
>> > candidate #2 for the version
>> >  > 2.13.0, as follows:
>> >  >
>> >  > [ ] +1, Approve the release
>> >  > [ ] -1, Do not approve the release
>> > (please provide specific comments)
>> >  >
>> >  > The complete staging area is available
>> > for your review, which includes:
>> >  > * JIRA release notes [1],
>> >  > * the official Apache source release to
>> > be deployed to dist.apache.org
>> > 
>> >  >  [2], which is
>> > signed with the key with
>> >  > fingerprint
>> > 

Re: [DISCUSS] Portability representation of schemas

2019-06-04 Thread Brian Hulette
On Mon, Jun 3, 2019 at 10:04 PM Reuven Lax  wrote:

>
>
> On Mon, Jun 3, 2019 at 12:27 PM Brian Hulette  wrote:
>
>> > It has to go into the proto somewhere (since that's the only way the
>> SDK can get it), but I'm not sure they should be considered integral parts
>> of the type.
>> Are you just advocating for an approach where any SDK-specific
>> information is stored outside of the Schema message itself so that Schema
>> really does just represent the type? That seems reasonable to me, and
>> alleviates my concerns about how this applies to columnar encodings a bit
>> as well.
>>
>
> Yes, that's exactly what I'm advocating.
>
>
>>
>> We could lift all of the LogicalTypeConversion messages out of the Schema
>> and the LogicalType like this:
>>
>> message SchemaCoder {
>>   Schema schema = 1;
>>   LogicalTypeConversion root_conversion = 2;
>>   map attribute_conversions = 3; // only
>> necessary for user type aliases, portable logical types by definition have
>> nothing SDK-specific
>> }
>>
>
> I'm not sure what the map is for? I think we have status quo without it.
>

My intention was that the SDK-specific information (to/from functions) for
any nested fields that are themselves user type aliases would be stored in
this map. That was the motivation for my next question: if we don't allow
user types to be nested within other user types, we may not need it.
I may be missing your meaning - but I think we currently only have status
quo without this map in the Java SDK because Schema.LogicalType is just an
interface that must be implemented. It's appropriate for just portable
logical types, not user-type aliases. Note I've adopted Kenn's terminology
where portable logical type is a type that can be identified by just a URN
and maybe some parameters, while a user type alias needs some SDK specific
information, like a class and to/from UDFs.
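
To make the distinction concrete, here is a minimal sketch in Python. It is
not the Java SDK's Schema.LogicalType API or any proposed proto; every class,
field, and URN below is hypothetical and only illustrates a portable logical
type (URN plus representation) next to user type aliases (to/from functions)
that nest.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class PortableLogicalType:
    """Identified by a URN alone; carries nothing SDK-specific (hypothetical)."""
    urn: str
    representation: str


@dataclass
class UserTypeAlias:
    """Row representation plus SDK-specific to/from functions (hypothetical)."""
    fields: Dict[str, Any]  # field name -> field type, which may be another alias
    to_row: Callable[[Any], Dict[str, Any]]
    from_row: Callable[[Dict[str, Any]], Any]


@dataclass
class Point:
    x: int
    y: int


@dataclass
class Segment:
    start: Point
    end: Point


POINT = UserTypeAlias(
    fields={"x": "INT64", "y": "INT64"},
    to_row=lambda p: {"x": p.x, "y": p.y},
    from_row=lambda r: Point(r["x"], r["y"]),
)

# A user type alias whose fields are themselves user type aliases: the outer
# conversion delegates to the nested one instead of unpacking everything itself.
SEGMENT = UserTypeAlias(
    fields={"start": POINT, "end": POINT},
    to_row=lambda s: {"start": POINT.to_row(s.start), "end": POINT.to_row(s.end)},
    from_row=lambda r: Segment(POINT.from_row(r["start"]), POINT.from_row(r["end"])),
)

# A portable logical type needs no functions, only an agreed identifier.
DATETIME = PortableLogicalType(urn="example:logical_type:datetime", representation="INT64")
```

If nesting were disallowed, SEGMENT would have to pack and unpack the Point
fields directly, and a per-field conversion map would have nothing to hold.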


>
>> I think a critical question (that has implications for the above
>> proposal) is how/if the two different concepts Kenn mentioned are allowed
>> to nest. For example, you could argue it's redundant to have a user type
>> alias that has a Row representation with a field that is itself a user type
>> alias, because instead you could just have a single top-level type alias
>> with to/from functions that pack and unpack the entire hierarchy. On the
>> other hand, I think it does make sense for a user type alias or a truly
>> portable logical type to have a field that is itself a truly portable
>> logical type (e.g. a user type alias or portable type with a DateTime).
>>
>> I've been assuming that user-type aliases could be nested, but should we
>> disallow that? Or should we go the other way and require that logical types
>> define at most one "level"?
>>
>
> No I think it's useful to allow things to be nested (though of course the
> nesting must terminate).
>

>
>>
>> Brian
>>
>> On Mon, Jun 3, 2019 at 11:08 AM Kenneth Knowles  wrote:
>>
>>>
>>> On Mon, Jun 3, 2019 at 10:53 AM Reuven Lax  wrote:
>>>
>>>> So I feel a bit leery about making the to/from functions a fundamental
>>>> part of the portability representation. In my mind, that is very tied to a
>>>> specific SDK/language. A SDK (say the Java SDK) wants to allow users to use
>>>> a wide variety of native types with schemas, and under the covers uses the
>>>> to/from functions to implement that. However from the portable Beam
>>>> perspective, the schema itself should be the real "type" of the
>>>> PCollection; the to/from methods are simply a way that a particular SDK
>>>> makes schemas easier to use. It has to go into the proto somewhere (since
>>>> that's the only way the SDK can get it), but I'm not sure they should be
>>>> considered integral parts of the type.
>>>>
>>>
>>> On the doc in a couple places this distinction was made:
>>>
>>> * For truly portable logical types, no instructions for the SDK are
>>> needed. Instead, they require:
>>>- URN: a standardized identifier any SDK can recognize
>>>- A spec: what is the universe of values in this type?
>>>- A representation: how is it represented in built-in types? This is
>>> how SDKs who do not know/care about the URN will process it
>>>- (optional): SDKs choose preferred SDK-specific types to embed the
>>> values in. SDKs have to know about the URN and choose for themselves.
>>>
>>> * For user-level type aliases, written as convenience by the user in
>>> their pipeline, what Java schemas have today:
>>>- to/from UDFs: the code is SDK-specific
>>>- some representation of the intended type (like java class): also
>>> SDK specific
>>>- a representation
>>>- any "id" is just like other ids in the pipeline, just avoiding
>>> duplicating the proto
>>>- Luke points out that nesting these can give multiple SDKs a hint
>>>
>>> In my mind the remaining complexity is whether or not we need to be able
>>> to move between the two. Composite PTransforms, for example, do have
>>> fluidity between being strictly user-defined versus 

Re: Jira tracker permission

2019-06-04 Thread Ahmet Altay
Welcome!

On Mon, Jun 3, 2019 at 10:31 PM Pablo Estrada  wrote:

> I've added you as contributor - welcome
> -P.
>
> On Mon, Jun 3, 2019, 9:16 PM Yichi Zhang  wrote:
>
>> Hi, beam-dev,
>>
>> This is Yichi Zhang from Google. I just started looking into Beam
>> projects and will be actively working on the Beam SDK. Could someone grant me
>> permission to the Beam Jira issue tracker? My Jira username is yichi.
>>
>> Looking forward to working with everyone else.
>>
>> Thanks,
>> Yichi
>>
>


Re: [VOTE] Release 2.13.0, release candidate #2

2019-06-04 Thread Thomas Weise
This seems rushed, and things are falling through the cracks.

Max had requested not to include the weekend in the voting period.

Valentyn: I had the same question on the first RC. The PR should be
included in the vote for review. You can find it here:
https://github.com/apache/beam/pull/8667/files

I had requested to include the following backport PR before the RC:
https://github.com/apache/beam/pull/8714 - It's not blocking, but it would be
nice if someone could merge it for any future release from this branch.

Thanks,
Thomas


On Tue, Jun 4, 2019 at 1:59 AM Maximilian Michels  wrote:

> The summary is not correct. Binding votes (in order):
>
> Ahmet Altay
> Robert Bradshaw
> Maximilian Michels
> Jean-Baptiste Onofré
> Lukasz Cwik
>
> A total of 5 binding votes.
>
> On 04.06.19 02:37, Ankur Goenka wrote:
> > +1
> > Thanks for validating the release and voting.
> > With 0(-1), 6(+1) and 3(+1 binding) votes, I am concluding the voting
> > process.
> > I am going ahead with the release and will keep the community posted
> > with the updates.
> >
> > On Mon, Jun 3, 2019 at 1:57 PM Andrew Pilloud  > > wrote:
> >
> > +1 Reviewed the Nexmark java and SQL perfkit graphs, no obvious
> > regressions over the previous release.
> >
> > On Mon, Jun 3, 2019 at 1:15 PM Lukasz Cwik  > > wrote:
> >
> > Thanks for the clarification.
> >
> > On Mon, Jun 3, 2019 at 11:40 AM Ankur Goenka  > > wrote:
> >
> > Yes, i meant i will close the voting at 5pm and start the
> > release process.
> >
> > On Mon, Jun 3, 2019, 10:59 AM Lukasz Cwik  > > wrote:
> >
> > Ankur, did you mean to say your going to close the vote
> > today at 5pm? (and then complete the release afterwards)
> >
> > On Mon, Jun 3, 2019 at 10:54 AM Ankur Goenka
> > mailto:goe...@google.com>> wrote:
> >
> > Thanks for validating and voting.
> >
> > We have 4 binding votes.
> > I will complete the release today 5PM. Please raise
> > any concerns before that.
> >
> > Thanks,
> > Ankur
> >
> > On Mon, Jun 3, 2019 at 8:36 AM Lukasz Cwik
> > mailto:lc...@google.com>> wrote:
> >
> > Since the gearpump issue has been ongoing since
> > 2.10, I can't consider it a blocker for this
> > release and am voting +1.
> >
> > On Mon, Jun 3, 2019 at 7:13 AM Jean-Baptiste
> > Onofré  > > wrote:
> >
> > +1 (binding)
> >
> > Quickly tested on beam-samples.
> >
> > Regards
> > JB
> >
> > On 31/05/2019 04:52, Ankur Goenka wrote:
> >  > Hi everyone,
> >  >
> >  > Please review and vote on the release
> > candidate #2 for the version
> >  > 2.13.0, as follows:
> >  >
> >  > [ ] +1, Approve the release
> >  > [ ] -1, Do not approve the release
> > (please provide specific comments)
> >  >
> >  > The complete staging area is available
> > for your review, which includes:
> >  > * JIRA release notes [1],
> >  > * the official Apache source release to
> > be deployed to dist.apache.org
> > 
> >  >  [2], which is
> > signed with the key with
> >  > fingerprint
> > 6356C1A9F089B0FA3DE8753688934A6699985948 [3],
> >  > * all artifacts to be deployed to the
> > Maven Central Repository [4],
> >  > * source code tag "v2.13.0-RC2" [5],
> >  > * website pull request listing the
> > release [6] and publishing the API
> >  > reference manual [7].
> >  > * Python artifacts are deployed along
> > with the source release to the
> >  > dist.apache.org 
> >  [2].
> >   

Re: [PROPOSAL] Standardize Gradle structure in Python SDK

2019-06-04 Thread Lukasz Cwik
On Mon, Jun 3, 2019 at 5:13 PM Valentyn Tymofieiev 
wrote:

> Hey Mark & others,
>
> We've been following the structure proposed in this thread to extend test
> coverage for Beam Python SDK on Python 3.5, 3.6, 3.7 interpreters, see [1].
>
> This structure allowed us to add 3.x suites without slowing down the
> pre/postcommit execution time. We can actually see a drop in precommit
> latency [2] around March 23, when we first made some Python 3.x suites run in
> parallel, and we have added more suites since then without slowing down
> pre/postcommits. Therefore I am in favor of this proposal, especially since
> AFAIK we don't have a better one. Thanks a lot!
>
> I do have some feedback on this proposal:
>
> 1. There is a duplication of gradle code between test suites for different
> python minor versions, for example see the identical definition of
> DirectRunner PostCommitIT suite for Python 3.6 and Python 3.7 [4,5].
>
> A possible solution to reduce the duplication is to move common code that
> defines a task into a separate groovy file shared across multiple gradle
> files. We have an example of this, where enablePythonPerformanceTest() is
> defined in BeamModulePlugin.groovy, and used in several build.gradle files
> to create a gradle task required for performance tests, see: [6]. I
> followed the same example in a Python 3 test suite for Portable Flink
> Runner I am working on [3], however I am not sure if BeamModulePlugin is
> the best place to define common gradle tasks needed for Python CI.
> Perhaps we can make a separate groovy file for this purpose in
> sdks/python/test-suites?
>

I would suggest placing the shared code in a new file in
https://github.com/apache/beam/tree/master/buildSrc/src/main/groovy/org/apache/beam/gradle,
we have several other groovy files related to building defined there
already.


> 2. Python 3 test suites currently live in sdks/python/test-suites, while
> most Python 2 suites are still defined in sdks/python/build.gradle.
>
> This may cause confusion for folks working on adding new Python suites. If
> there is overall agreement on the proposed structure, I suggest we start
> moving Python 2 CI tasks out of sdks/python/build.gradle into
> sdks/python/test-suites/[runner]/py27/build.gradle, or a common groovy
> file. If there are better alternatives we can continue discussing them here.
>

For the runner-specific Java ITs, we have been trying to get the tests to
be placed inside the runner's own directory instead of underneath the SDK
directories. This was to make it easy to associate test task -> runner. So
an alternative would be to have runners/[runner]/test-suites/py27/... be
that location.
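
To compare the two layouts side by side, here is a small sketch that
enumerates the build.gradle locations for a runner/version cross-product.
The function name and the `layout` switch are hypothetical, and the runner and
version names passed in are only examples; the path templates come from the
two proposals above.

```python
from itertools import product


def suite_paths(runners, versions, layout="sdk"):
    """List build.gradle paths for every (runner, Python version) pair.

    layout="sdk" follows sdks/python/test-suites/[runner]/[version];
    layout="runner" follows runners/[runner]/test-suites/[version].
    """
    if layout == "sdk":
        template = "sdks/python/test-suites/{runner}/{version}/build.gradle"
    elif layout == "runner":
        template = "runners/{runner}/test-suites/{version}/build.gradle"
    else:
        raise ValueError("unknown layout: " + layout)
    return [
        template.format(runner=runner, version=version)
        for runner, version in product(runners, versions)
    ]


sdk_layout = suite_paths(["direct"], ["py27", "py36", "py37"])
runner_layout = suite_paths(["direct"], ["py27", "py36", "py37"], layout="runner")
```

Either way the number of build files grows as runners times versions, which is
why sharing the task definitions in one groovy file matters.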


> Thanks,
> Valentyn
>
>
> [1] https://github.com/apache/beam/tree/master/sdks/python/test-suites
> [2]
> http://104.154.241.245/d/_TNndF2iz/pre-commit-test-latency?orgId=1=1546507894013=1554189164736
> [3] https://github.com/apache/beam/pull/8745
> [4]
> https://github.com/apache/beam/blob/291f1e9fb5ce5ee4bb7e2519ffe40334fb5c08c5/sdks/python/test-suites/direct/py36/build.gradle#L27
> [5]
> https://github.com/apache/beam/blob/291f1e9fb5ce5ee4bb7e2519ffe40334fb5c08c5/sdks/python/test-suites/direct/py37/build.gradle#L27
> [6]
> https://github.com/apache/beam/search?q=enablePythonPerformanceTest_q=enablePythonPerformanceTest
>
>
> On Fri, Mar 29, 2019 at 9:45 AM Udi Meiri  wrote:
>
>> I don't use gradle commands for Python development either, because they
>> are slow (no incremental testing).
>>
>>
>>
>> On Fri, Mar 29, 2019 at 9:16 AM Michael Luckey 
>> wrote:
>>
>>>
>>>
>>> On Fri, Mar 29, 2019 at 2:31 PM Robert Bradshaw 
>>> wrote:
>>>
>>>> On Fri, Mar 29, 2019 at 12:54 PM Michael Luckey
>>>> wrote:
>>>> >
>>>> > Really like the idea of improving here.
>>>> >
>>>> > Unfortunately, I haven't worked with python on that scale yet, so
>>>> > bear with my naive understandings in this regard. If I understand
>>>> > correctly, the suggestion will result in a couple of projects consisting
>>>> > only of a build.gradle file to kind of work around gradle's decision not
>>>> > to parallelize within projects, right? In consequence, this also kind of
>>>> > decouples projects from their content - the stuff which constitutes the
>>>> > project - and forces the build file to 'somehow' reach out to content of
>>>> > other (only python root?) projects. That is, it couples projects. This
>>>> > somehow 'feels non natural' to me. But, of course, might be the path to
>>>> > go. As I said before, never worked on python on that scale.
>>>>
>>>> It feels a bit odd to me as well. Is it possible to have multiple
>>>> projects per directory (e.g. a suite of testing ones) rather than
>>>> having to break things up like this, especially if the goal is
>>>> primarily to get parallel running of tests? Especially if we could
>>>> automatically create the cross-product rather than manually? There
>>>> also seems to be some redundancy with what tox is doing here.

>>>
>>> Not sure, whether I understand correctly. But I do not think that's
>>> 

Re: BQ IT tests fail on TestDataflowRunner - Python SDK

2019-06-04 Thread Tanay Tummalapalli
I didn't have any other changes.
I ran the tests with a clean virtualenv as you suggested and it works now.
:)

Thanks Ahmet and Chamikara!

On Tue, Jun 4, 2019 at 6:36 AM Chamikara Jayalath 
wrote:

> Sounds like your input job was somehow incompatible with the Dataflow
> worker. Running using a clean virtual env should help verify as Ahmet
> mentioned.
>
> On Mon, Jun 3, 2019 at 5:44 PM Ahmet Altay  wrote:
>
>> Do you have any other changes? Are you trying from head with a clean
>> virtual environment?
>>
>> If you can share a link to dataflow job (in the apache-beam-testing GCP
>> project), we can try to look at additional logs as well.
>>
>> On Mon, Jun 3, 2019 at 1:42 PM Tanay Tummalapalli 
>> wrote:
>>
>>> Hi everyone,
>>>
>>> I ran the Integration Tests -
>>> BigQueryStreamingInsertTransformIntegrationTests[1] and
>>> BigQueryFileLoadsIT[2] on the master branch locally, with the following
>>> command:
>>> ./scripts/run_integration_test.sh --test_opts
>>> --tests=apache_beam.io.gcp.bigquery_test:BigQueryStreamingInsertTransformIntegrationTests
>>> The Dataflow jobs for the tests failed with the following error:
>>> root: INFO: 2019-06-03T18:36:53.021Z: JOB_MESSAGE_ERROR: Traceback
>>> (most recent call last):
>>> File
>>> "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py",
>>> line 649, in do_work
>>> work_executor.execute()
>>> File
>>> "/usr/local/lib/python2.7/dist-packages/dataflow_worker/executor.py",
>>> line 150, in execute
>>> test_shuffle_sink=self._test_shuffle_sink)
>>> File
>>> "/usr/local/lib/python2.7/dist-packages/dataflow_worker/executor.py",
>>> line 116, in create_operation
>>> is_streaming=False)
>>> File "apache_beam/runners/worker/operations.py", line 962, in
>>> apache_beam.runners.worker.operations.create_operation
>>> op = BatchGroupAlsoByWindowsOperation(
>>> File "dataflow_worker/shuffle_operations.py", line 219, in
>>> dataflow_worker.shuffle_operations.BatchGroupAlsoByWindowsOperation.
>>> __init__
>>> self.windowing = deserialize_windowing_strategy(self.spec.window_fn)
>>> File "dataflow_worker/shuffle_operations.py", line 207, in
>>> dataflow_worker.shuffle_operations.deserialize_windowing_strategy
>>> return pickler.loads(serialized_data)
>>> File
>>> "/usr/local/lib/python2.7/dist-packages/apache_beam/internal/pickler.py",
>>> line 248, in loads
>>> c = base64.b64decode(encoded)
>>> File "/usr/lib/python2.7/base64.py", line 78, in b64decode
>>> raise TypeError(msg)
>>> TypeError: Incorrect padding
>>>
>>>
>>> I tested the same tests on the 2.13.0-RC#2 branch as well and they
>>> passed. These tests also don't fail in the most recent Python post-commit
>>> tests[3-5].
>>>
>>> Keeping in mind the recent b64 changes in BQ, none of the tests in the
>>> test classes mentioned above makes use of a "BYTES" type field.
>>> Would love to get pointers to possible reasons.
>>>
>>> Thank You
>>> - TT
>>>
>>> [1]
>>> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/gcp/bigquery_test.py#L479-L630
>>> [2]
>>> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/gcp/bigquery_file_loads_test.py#L358-L528
>>> [3]
>>> https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/
>>> [4]
>>> https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/
>>> [5]
>>> https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/
>>>
>>
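
For reference, the "Incorrect padding" message at the bottom of the traceback
above is what base64.b64decode reports when its input is not a well-formed
base64 string, for example when a serialized payload is truncated or was
produced in a form the reader does not expect. A minimal sketch of the
symptom only; it does not reproduce the Dataflow worker or the Beam pickler:

```python
import base64
import binascii


def is_valid_base64(encoded):
    """Return True if `encoded` decodes cleanly, False on a decoding error."""
    try:
        base64.b64decode(encoded)
        return True
    except (binascii.Error, TypeError):
        # Python 3 raises binascii.Error; Python 2.7 raised TypeError,
        # which is the exception type shown in the traceback.
        return False


good = base64.b64encode(b"beam")  # padded to a multiple of four characters
truncated = good[:-1]             # one trailing "=" lost
```

Here is_valid_base64(good) holds while is_valid_base64(truncated) does not,
which is the same class of failure the worker hit while deserializing the
windowing strategy.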


Re: [VOTE] Release 2.13.0, release candidate #2

2019-06-04 Thread Maximilian Michels

The summary is not correct. Binding votes (in order):

Ahmet Altay
Robert Bradshaw
Maximilian Michels
Jean-Baptiste Onofré
Lukasz Cwik

A total of 5 binding votes.
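
A small sketch of how such a tally could be checked mechanically, assuming the
rule stated in the vote email (majority approval with at least 3 PMC
affirmative votes). The function is hypothetical, and the vote list holds only
the votes visible in this thread, so it is not a complete record.

```python
def tally(votes, pmc_members, min_binding=3):
    """Count votes; `votes` maps voter name to +1 or -1.

    Returns (binding +1 count, total +1 count, total -1 count, passed).
    """
    plus = [name for name, vote in votes.items() if vote > 0]
    minus = [name for name, vote in votes.items() if vote < 0]
    binding_plus = [name for name in plus if name in pmc_members]
    # Majority approval plus the minimum number of binding (PMC) +1 votes.
    passed = len(plus) > len(minus) and len(binding_plus) >= min_binding
    return len(binding_plus), len(plus), len(minus), passed


# Binding voters as listed above.
pmc_members = {
    "Ahmet Altay",
    "Robert Bradshaw",
    "Maximilian Michels",
    "Jean-Baptiste Onofré",
    "Lukasz Cwik",
}
# Votes visible in this thread; Andrew Pilloud's +1 is non-binding.
votes = {name: 1 for name in pmc_members}
votes["Andrew Pilloud"] = 1

result = tally(votes, pmc_members)
```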

On 04.06.19 02:37, Ankur Goenka wrote:

+1
Thanks for validating the release and voting.
With 0(-1), 6(+1) and 3(+1 binding) votes, I am concluding the voting 
process.
I am going ahead with the release and will keep the community posted 
with the updates.


On Mon, Jun 3, 2019 at 1:57 PM Andrew Pilloud wrote:


+1 Reviewed the Nexmark java and SQL perfkit graphs, no obvious
regressions over the previous release.

On Mon, Jun 3, 2019 at 1:15 PM Lukasz Cwik <lc...@google.com> wrote:

Thanks for the clarification.

On Mon, Jun 3, 2019 at 11:40 AM Ankur Goenka <goe...@google.com> wrote:

Yes, I meant I will close the voting at 5pm and start the
release process.

On Mon, Jun 3, 2019, 10:59 AM Lukasz Cwik <lc...@google.com> wrote:

Ankur, did you mean to say you're going to close the vote
today at 5pm? (and then complete the release afterwards)

On Mon, Jun 3, 2019 at 10:54 AM Ankur Goenka
<goe...@google.com> wrote:

Thanks for validating and voting.

We have 4 binding votes.
I will complete the release today 5PM. Please raise
any concerns before that.

Thanks,
Ankur

On Mon, Jun 3, 2019 at 8:36 AM Lukasz Cwik
<lc...@google.com> wrote:

Since the gearpump issue has been ongoing since
2.10, I can't consider it a blocker for this
release and am voting +1.

On Mon, Jun 3, 2019 at 7:13 AM Jean-Baptiste
Onofré <j...@nanthrax.net> wrote:

+1 (binding)

Quickly tested on beam-samples.

Regards
JB

On 31/05/2019 04:52, Ankur Goenka wrote:
 > Hi everyone,
 >
 > Please review and vote on the release candidate #2 for the version
 > 2.13.0, as follows:
 >
 > [ ] +1, Approve the release
 > [ ] -1, Do not approve the release (please provide specific comments)
 >
 > The complete staging area is available for your review, which includes:
 > * JIRA release notes [1],
 > * the official Apache source release to be deployed to dist.apache.org
 > [2], which is signed with the key with fingerprint
 > 6356C1A9F089B0FA3DE8753688934A6699985948 [3],
 > * all artifacts to be deployed to the Maven Central Repository [4],
 > * source code tag "v2.13.0-RC2" [5],
 > * website pull request listing the release [6] and publishing the API
 > reference manual [7].
 > * Python artifacts are deployed along with the source release to the
 > dist.apache.org [2].
 > * Validation sheet with a tab for 2.13.0 release to help with validation
 > [8].
 >
 > The vote will be open for at least 72 hours. It is adopted by majority
 > approval, with at least 3 PMC affirmative votes.
 >
 > Thanks,
 > Ankur
 >
 > [1]
 > https://jira.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527=12345166
 > [2] https://dist.apache.org/repos/dist/dev/beam/2.13.0/
 > [3] https://dist.apache.org/repos/dist/release/beam/KEYS
 > [4]

Re: Timer support in Flink

2019-06-04 Thread Robert Bradshaw
One issue with the fully expanded version is that it's so large it's
hard to read.

I think it would be useful to make the ~ entries (at least) clickable
or with hover tool tips. It would be nice to be able to expand columns
individually as well.

On Tue, Jun 4, 2019 at 7:20 AM Melissa Pashniak  wrote:
>
>
> Yeah, people's eyes likely jump to the big "What is being computed?" header 
> first and skip the small font "expand details" (that's what my eyes did 
> anyway!) Even just moving the expand/collapse to be AFTER the header of the 
> table (or down to the next line)  and making the font bigger might help a 
> lot. And maybe making the text more explicit: "Click to expand for more 
> details".
>
> I'm traveling right now so can't take an in-depth look, but this might be 
> doable by changing the order of things in [1] and the font size in [2]. I'll 
> add this info to the JIRA also.
>
> [1] 
> https://github.com/apache/beam/blame/master/website/src/_includes/capability-matrix.md#L18
> [2] 
> https://github.com/apache/beam/blob/master/website/src/_sass/capability-matrix.scss#L130
>
>
> On Mon, Jun 3, 2019 at 2:15 AM Maximilian Michels  wrote:
>>
>> Good point. I think I discovered the detailed view when I made changes
>> to the source code. Classic tunnel-vision problem :)
>>
>> On 30.05.19 12:57, Reza Rokni wrote:
>> > :-)
>> >
>> > https://issues.apache.org/jira/browse/BEAM-7456
>> >
>> > On Thu, 30 May 2019 at 18:41, Alex Van Boxel wrote:
>> >
>> > Oh... you can expand the matrix. Never saw that, this could indeed
>> > be better. So it isn't you.
>> >
>> >   _/
>> > _/ Alex Van Boxel
>> >
>> >
>> > On Thu, May 30, 2019 at 12:24 PM Reza Rokni wrote:
>> >
>> > PS, until it was just pointed out to me by Max, I had missed the
>> > (expand details) clickable link in the capability matrix.
>> >
>> > Probably just me, but do others think it's also easy to miss? If
>> > yes I will raise a Jira for it
>> >
>> > On Wed, 29 May 2019 at 19:52, Reza Rokni wrote:
>> >
>> > Thanx Max!
>> >
>> > Reza
>> >
>> > On Wed, 29 May 2019, 16:38 Maximilian Michels
>> > <m...@apache.org> wrote:
>> >
>> > Hi Reza,
>> >
>> > The detailed view of the capability matrix states: "The Flink Runner
>> > supports timers in non-merging windows."
>> >
>> > That is still the case. Other than that, timers should be working fine.
>> >
>> >  > It makes very heavy use of Event.Time timers and has to do some
>> >  > manual DoFn cache work to get around some O(heavy) issues.
>> >
>> > If you are running on Flink 1.5, timer deletion suffers from O(n)
>> > complexity which has been fixed in newer versions.
>> >
>> > Cheers,
>> > Max
>> >
>> > On 29.05.19 03:27, Reza Rokni wrote:
>> >  > Hi Flink experts,
>> >  >
>> >  > I am getting ready to push a PR around a utility class for timeseries join
>> >  >
>> >  > left.timestamp match to closest right.timestamp where right.timestamp <=
>> >  > left.timestamp.
>> >  >
>> >  > It makes very heavy use of Event.Time timers and has to do some manual
>> >  > DoFn cache work to get around some O(heavy) issues. Wanted to test
>> >  > things against Flink: In the capability matrix we have "~" for Timer
>> >  > support in Flink:
>> >  >
>> >  > https://beam.apache.org/documentation/runners/capability-matrix/
>> >  >
>> >  > Is that page outdated, if not what are the areas that still need to be
>> >  > addressed please?
>> >  >
>> >  > Cheers
>> >  >
>> >  > Reza
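
For readers unfamiliar with the join semantics Reza describes (each left
element is matched to the closest right element whose timestamp is not later
than its own), here is a minimal batch sketch in plain Python. It is only an
illustration of the matching rule: it is not the utility class from the PR,
it does not use Beam timers or state, and the function name asof_join and the
(timestamp, value) tuple layout are assumptions made for this example.

```python
from bisect import bisect_right


def asof_join(left, right):
    """Match each left (timestamp, value) pair to the closest right pair
    whose timestamp is <= the left timestamp, or None if there is none.

    Hypothetical helper for illustration only; not part of Apache Beam.
    """
    # Sort the right side once so each lookup is a binary search.
    right_sorted = sorted(right, key=lambda pair: pair[0])
    right_timestamps = [ts for ts, _ in right_sorted]

    joined = []
    for ts, value in left:
        # Index of the first right timestamp strictly greater than ts.
        idx = bisect_right(right_timestamps, ts)
        match = right_sorted[idx - 1] if idx > 0 else None
        joined.append((ts, value, match))
    return joined


if __name__ == "__main__":
    left = [(5, "a"), (1, "b"), (10, "c")]
    right = [(2, "x"), (5, "y"), (7, "z")]
    for row in asof_join(left, right):
        print(row)
```

A streaming implementation additionally has to decide when a match is final,
which is presumably where the event-time timers mentioned in the thread come
in; the sketch sidesteps that by assuming both sides are fully available.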

Re: Join the Beam Community Request Email

2019-06-04 Thread Melissa Pashniak
We currently use Jekyll to generate our HTML files, and there are some
plugins and hacks to add multilingual support, but it's not built-in. As I
understand it, that's one of the reasons Kubernetes moved away from Jekyll to
Hugo about a year ago (see their blog post for more details [1]). They
wanted multilingual support and a language switcher, and after
investigating Jekyll options, settled on Hugo as it comes with built-in
multilingual support [2].

We could try to go the Jekyll plugin route to add additional languages in
Jekyll, but if we want maintainable docs that are in many languages and a
language switcher as a long-term goal, we might want to consider moving.
Either way, we'd need to do some work.

For the Jekyll option, we'd need to (likely):

-  evaluate the existing multilingual plugins, see if they do what we want,
and make any needed code changes
-  change the site directory structure (I suspect this will be needed)
-  update all of the existing Jenkins jobs to stage/test, and any other
scripts that deal with building the site. This would require someone who
knows how the building and syncing of the website is put together, and
possibly access to the build machines
-  update the existing local testing/staging scripts for contributors

If we moved to Hugo for built-in support:

- rip out Jekyll and get Hugo set up and configured
- (same as above, but more substantial changes needed) update all of the
existing Jenkins jobs to stage/test, and any other scripts that deal with
building the site. This would require someone who knows how the building and
syncing of the website is put together, and possibly access to the build
machines
- update the existing local testing/staging scripts for contributors
- figure out replacements (IF they don't exist for Hugo) for some of our
Jekyll plugins (for example, GitHub snippet grabber, link checker)
- Hugo uses a different flavor of markdown, so we'll likely need to make
some changes to our markdown files
- probably other things I am not aware of ;-)

Thoughts?

[1] https://kubernetes.io/blog/2018/05/05/hugo-migration/
[2] https://gohugo.io/content-management/multilingual/



On Fri, May 24, 2019 at 10:50 AM Lukasz Cwik  wrote:

> Welcome Zhang, I have added you as a contributor to the Apache Beam JIRA.
>
> I would suggest you take a look at the contribution guide[1] to learn
> how to get started.
>
> If I understand correctly, you're interested in translating several
> documents found on the Beam website. If so, Melissa would be a good contact
> since she has helped with our documentation a lot.
> Melissa, would you know if we have i18n support built into the website
> already?
>
> 1: https://beam.apache.org/contribute/
>
> On Thu, May 23, 2019 at 6:24 PM 图霸群英  wrote:
>
>> Hello everyone! My name is Zhang Haitao and I am from Beijing, China.
>> I created a Beam Chinese hobby group in China.
>> I have also published Beam-related articles on InfoQ:
>> https://www.infoq.cn/profile/1280576
>> I want to join the Beam community now.
>>
>> On May 27th of this month, I will also take part in promoting Beam
>> technology at QCon.
>> https://2019.qconguangzhou.com/presentation/1822
>>
>> Now I am organizing people in China to translate the English help documents
>> into Chinese.
>> Any advice and support would be very welcome.
>>
>> My GitHub account: xsm110
>> Example:
>> https://github.com/xsm110/Apache-Beam-Example
>> Apache User ID : zhanghaitao8
>> JIRA: zhanghaitao8
>>
>> The attachment is the ICLA that I signed and submitted to the ASF.
>>
>