I opened https://github.com/apache/beam/pull/8319 to eliminate the
duplicate yaml file (and cover timestamp coder for the Python SDK). Would
appreciate if someone could take a look. (PR doesn't affect the
StrUtf8Coder subject, but it is required to fix a timer bug.)
Thanks,
Thomas
On Fri, Apr 12
This is a minor point, Robert Burke, but having access to the "stream" when
decoding/encoding could mean that you're reading/writing from the underlying
transport channel directly and don't need to copy the bytes into/from
memory.
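To illustrate the stream point, here is a minimal sketch that decodes length-prefixed UTF-8 strings straight from a file-like object, one element at a time, without first buffering the whole message. The names `read_varint` and `decode_string_from_stream` are hypothetical (not Beam APIs), `io.BytesIO` merely stands in for a transport channel, and the var-int length prefix is assumed from the nested encoding described elsewhere in this thread.

```python
import io


def read_varint(stream) -> int:
    """Read a base-128 varint (7 bits per byte, low bits first) from a stream."""
    result = 0
    shift = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("stream ended inside a varint")
        byte = chunk[0]
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: this was the last byte
            return result
        shift += 7


def decode_string_from_stream(stream) -> str:
    """Hypothetical helper: read one length-prefixed UTF-8 string from the stream."""
    length = read_varint(stream)
    payload = stream.read(length)
    if len(payload) != length:
        raise EOFError("stream ended inside a string payload")
    return payload.decode("utf-8")


# io.BytesIO stands in for the underlying transport channel (an assumption).
channel = io.BytesIO(b"\x02hi\x03abc")
first = decode_string_from_stream(channel)
second = decode_string_from_stream(channel)
```

The decoder only ever pulls the bytes of the current element, which is what makes it possible to sit directly on top of a channel rather than on a fully materialized byte array.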
On Wed, Apr 10, 2019 at 3:45 PM Kenneth Knowles wrote:
> On Mon, A
On Mon, Apr 8, 2019 at 4:03 PM Robert Bradshaw wrote:
> This email is already very long, but in summary I think the right
> answer is to just get rid of Outer altogether (except possibly for
> IOs, which we'd only preserve for legacy reasons until 3.0).
>
> - Robert
>
I had forgotten that compat
On Mon, 8 Apr 2019 at 16:03, Robert Bradshaw wrote:
> On Mon, Apr 8, 2019 at 8:04 PM Kenneth Knowles wrote:
> >
> > On Mon, Apr 8, 2019 at 1:57 AM Robert Bradshaw
> wrote:
> >>
> >> On Sat, Apr 6, 2019 at 12:08 AM Kenneth Knowles
> wrote:
> >> >
> >> > On Fri, Apr 5, 2019 at 2:24 PM Robert Bra
On Mon, Apr 8, 2019 at 8:04 PM Kenneth Knowles wrote:
>
> On Mon, Apr 8, 2019 at 1:57 AM Robert Bradshaw wrote:
>>
>> On Sat, Apr 6, 2019 at 12:08 AM Kenneth Knowles wrote:
>> >
>> > On Fri, Apr 5, 2019 at 2:24 PM Robert Bradshaw wrote:
>> >>
>> >> On Fri, Apr 5, 2019 at 6:24 PM Kenneth Knowles
On Mon, Apr 8, 2019 at 1:57 AM Robert Bradshaw wrote:
> On Sat, Apr 6, 2019 at 12:08 AM Kenneth Knowles wrote:
> >
> >
> >
> > On Fri, Apr 5, 2019 at 2:24 PM Robert Bradshaw
> wrote:
> >>
> >> On Fri, Apr 5, 2019 at 6:24 PM Kenneth Knowles wrote:
> >> >
> >> > Nested and unnested contexts are
On Sat, Apr 6, 2019 at 12:08 AM Kenneth Knowles wrote:
>
>
>
> On Fri, Apr 5, 2019 at 2:24 PM Robert Bradshaw wrote:
>>
>> On Fri, Apr 5, 2019 at 6:24 PM Kenneth Knowles wrote:
>> >
>> > Nested and unnested contexts are two different encodings. Can we just give
>> > them different URNs? We can
On Fri, Apr 5, 2019 at 2:24 PM Robert Bradshaw wrote:
> On Fri, Apr 5, 2019 at 6:24 PM Kenneth Knowles wrote:
> >
> > Nested and unnested contexts are two different encodings. Can we just
> give them different URNs? We can even just express the length-prefixed
> UTF-8 as a composition of the len
On Fri, Apr 5, 2019 at 6:24 PM Kenneth Knowles wrote:
>
> Nested and unnested contexts are two different encodings. Can we just give
> them different URNs? We can even just express the length-prefixed UTF-8 as a
> composition of the length-prefix URN and the UTF-8 URN.
It's not that simple, esp
Also, as for the backwards compatibility discussion, I don't believe
non-portable jobs will be able to be upgraded to portable jobs, so that
may be a good time to make upgrade-incompatible coder changes.
On Fri, Apr 5, 2019 at 1:44 PM Lukasz Cwik wrote:
> Robert, I filed h
Robert, I filed https://issues.apache.org/jira/browse/BEAM-7015 for
removing the Python SDK copy of standard_coders.yaml and assigned it to you.
On Fri, Apr 5, 2019 at 9:24 AM Kenneth Knowles wrote:
> Nested and unnested contexts are two different encodings. Can we just give
> them different URN
Nested and unnested contexts are two different encodings. Can we just give
them different URNs? We can even just express the length-prefixed UTF-8 as
a composition of the length-prefix URN and the UTF-8 URN.
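A rough sketch of what that composition could look like: a plain UTF-8 coder wrapped by a generic length-prefix coder. The class names `Utf8Coder` and `LengthPrefixCoder` and their methods are illustrative assumptions for this sketch, not the actual SDK classes or URN definitions.

```python
def _write_varint(n: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        bits = n & 0x7F
        n >>= 7
        if n:
            out.append(bits | 0x80)  # more bytes follow
        else:
            out.append(bits)
            return bytes(out)


def _read_varint(data: bytes, pos: int):
    """Decode a varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


class Utf8Coder:
    """Hypothetical unprefixed coder: just the UTF-8 bytes."""

    def encode(self, value: str) -> bytes:
        return value.encode("utf-8")

    def decode(self, data: bytes) -> str:
        return data.decode("utf-8")


class LengthPrefixCoder:
    """Hypothetical wrapper: varint length followed by the inner encoding."""

    def __init__(self, inner):
        self.inner = inner

    def encode(self, value) -> bytes:
        payload = self.inner.encode(value)
        return _write_varint(len(payload)) + payload

    def decode_from(self, data: bytes, pos: int = 0):
        length, pos = _read_varint(data, pos)
        end = pos + length
        if end > len(data):
            raise ValueError("truncated payload")
        return self.inner.decode(data[pos:end]), end


prefixed = LengthPrefixCoder(Utf8Coder())
buffer = prefixed.encode("nested") + prefixed.encode("context")
first, offset = prefixed.decode_from(buffer)
second, offset = prefixed.decode_from(buffer, offset)
```

With this shape the "nested" form is simply the wrapper applied to the "unnested" coder, so each encoding gets one unambiguous definition.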
On Fri, Apr 5, 2019 at 12:38 AM Robert Bradshaw wrote:
> On Fri, Apr 5, 2019 at 12:50 AM
On Fri, Apr 5, 2019 at 12:50 AM Heejong Lee wrote:
>
> Robert, does nested/unnested context work properly for Java?
I believe so. It is similar to the bytes coder, which length-prefixes or
not depending on the context.
> I can see that the Context is fixed to NESTED[1] and the encode method with
> the Con
Robert, does nested/unnested context work properly for Java? I can see that
the Context is fixed to NESTED[1] and the encode method with the Context
parameter is marked as deprecated[2].
[1]:
https://github.com/apache/beam/blob/0868e7544fd1e96db67ff5b9e70a67802c0f0c8e/sdks/java/core/src/main/java/
I don't know why there are two separate copies of
standard_coders.yaml--originally there was just one (though it did
live in the Python directory). I'm guessing a copy was made rather
than just pointing both to the new location, but that completely
defeats the point. I can't seem to access JIRA rig
My 2 cents is that the "Textual description" should be part of the
documentation of the URNs on the Proto messages, since that's the common
place. I've added a short description for the varints for example, and we
already have lengthier format & protocol descriptions there for iterables
and similar
On Thu, Apr 4, 2019 at 1:49 PM Robert Burke wrote:
> We should probably move the "java" version of the yaml file [1] to a
> common location rather than deep in the java hierarchy, or copying it for
> Go and Python, but that can be a separate task. It's probably non-trivial
> since it looks like i
On Thu, Apr 4, 2019 at 1:48 PM Kenneth Knowles wrote:
> I have to actually say that a collection of test cases is not a definition
> of a format. It is one of the pieces, and the other one is a textual
> description in a prominent, discoverable place.
>
A reference implementation can also serve
We should probably move the "java" version of the yaml file [1] to a common
location rather than deep in the java hierarchy, or copying it for Go and
Python, but that can be a separate task. It's probably non-trivial since it
looks like it's part of a java resources structure.
Luke, the Go SDK doe
I have to actually say that a collection of test cases is not a definition
of a format. It is one of the pieces, and the other one is a textual
description in a prominent, discoverable place.
Kenn
On Thu, Apr 4, 2019 at 1:28 PM Lukasz Cwik wrote:
>
>
> On Thu, Apr 4, 2019 at 1:15 PM Chamikara J
On Thu, Apr 4, 2019 at 1:15 PM Chamikara Jayalath
wrote:
>
>
> On Thu, Apr 4, 2019 at 12:15 PM Lukasz Cwik wrote:
>
>> standard_coders.yaml[1] is where we are currently defining these formats.
>> Unfortunately the Python SDK has its own copy[2].
>>
>
> Ah great. Thanks for the pointer. Any idea
On Thu, Apr 4, 2019 at 12:15 PM Lukasz Cwik wrote:
> standard_coders.yaml[1] is where we are currently defining these formats.
> Unfortunately the Python SDK has its own copy[2].
>
Ah great. Thanks for the pointer. Any idea why there's a separate copy for
Python? I didn't see a significant dif
standard_coders.yaml[1] is where we are currently defining these formats.
Unfortunately the Python SDK has its own copy[2].
Here is an example PR[3] that adds the "beam:coder:double:v1" as tests to
the Java and Python SDKs to ensure interoperability.
Robert Burke, does the Go SDK have a test wher
On Thu, Apr 4, 2019 at 11:50 AM Chamikara Jayalath
wrote:
>
>
> On Thu, Apr 4, 2019 at 11:29 AM Robert Bradshaw
> wrote:
>
>> A URN defines the encoding.
>>
>> There are (unfortunately) *two* encodings defined for a Coder (defined
>> by a URN), the nested and the unnested one. IIRC, in both Java
On Thu, Apr 4, 2019 at 11:29 AM Robert Bradshaw wrote:
> A URN defines the encoding.
>
> There are (unfortunately) *two* encodings defined for a Coder (defined
> by a URN), the nested and the unnested one. IIRC, in both Java and
> Python, the nested one prefixes with a var-int length, and the
> u
A URN defines the encoding.
There are (unfortunately) *two* encodings defined for a Coder (defined
by a URN), the nested and the unnested one. IIRC, in both Java and
Python, the nested one prefixes with a var-int length, and the
unnested one does not.
We should define the spec clearly and have cr
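A minimal sketch of the two encodings as described above, assuming the nested form is a var-int byte length followed by the UTF-8 bytes and the unnested form is the UTF-8 bytes alone. The function names are hypothetical, and the byte layout follows the "IIRC" description in this message rather than a written spec.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (low 7 bits first)."""
    if n < 0:
        raise ValueError("only non-negative lengths are handled in this sketch")
    out = bytearray()
    while True:
        bits = n & 0x7F
        n >>= 7
        if n:
            out.append(bits | 0x80)  # set the continuation bit
        else:
            out.append(bits)
            return bytes(out)


def encode_utf8(value: str, nested: bool) -> bytes:
    """Hypothetical helper: encode a string in the nested or unnested form."""
    payload = value.encode("utf-8")
    if nested:
        # Nested: the length prefix lets a reader find the element boundary.
        return encode_varint(len(payload)) + payload
    # Unnested: the element is assumed to run to the end of the buffer.
    return payload


unnested = encode_utf8("abc", nested=False)
nested = encode_utf8("abc", nested=True)
```

The same URN therefore yields two different byte strings for the same value, which is the ambiguity being discussed.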
Could this be a backwards-incompatible change that would break pipelines
from upgrading? If they have data in-flight in between operators, and we
change the coder, they would break?
I know very little about coders, but since nobody has mentioned it, I
wanted to make sure we have it in mind.
-P.
On
Agree that a coder URN defines the encoding. I see that string UTF-8 was
added to the proto enum, but it needs a written spec of the encoding.
Ideally some test data that different languages can use to drive compliance
testing.
Kenn
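One possible shape for such shared test data, sketched as a table of hex-encoded vectors plus a checker that any SDK's encoder could be run against. The vectors, field names, and functions here are made up for illustration and are not taken from standard_coders.yaml; the nested form assumes a var-int length prefix.

```python
# Illustrative vectors only: UTF-8 bytes, with a varint length prefix when nested.
TEST_VECTORS = [
    {"value": "", "unnested_hex": "", "nested_hex": "00"},
    {"value": "abc", "unnested_hex": "616263", "nested_hex": "03616263"},
    {"value": "\u00e9", "unnested_hex": "c3a9", "nested_hex": "02c3a9"},
]


def check_encoder(encode, vectors=TEST_VECTORS):
    """Run encode(value, nested) against every vector; return the mismatches."""
    failures = []
    for case in vectors:
        for nested, key in ((False, "unnested_hex"), (True, "nested_hex")):
            actual = encode(case["value"], nested).hex()
            if actual != case[key]:
                failures.append((case["value"], nested, actual, case[key]))
    return failures


def reference_encode(value: str, nested: bool) -> bytes:
    """Hypothetical encoder under test; handles single-byte lengths only."""
    payload = value.encode("utf-8")
    if not nested:
        return payload
    if len(payload) >= 0x80:
        raise ValueError("this sketch only handles lengths below 128")
    return bytes([len(payload)]) + payload


failures = check_encoder(reference_encode)
```

Keeping the vectors as plain hex strings means the same data file could drive compliance tests in each language without any language-specific parsing.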
On Wed, Apr 3, 2019 at 6:21 PM Robert Burke wrote:
> String UT
String UTF8 was recently added as a "standard coder" URN in the protos,
but I don't think that developed beyond Java, so adding it to Python would
be reasonable in my opinion.
The Go SDK handles Strings as "custom coders" presently, which for Go are
always length prefixed (and reported to the Runn