Hi Charles,

It's only for Beam schema DATETIME field.

-Rui

On Tue, Nov 6, 2018 at 12:55 AM Charles Chen <c...@google.com> wrote:

> Is the proposal to do this for both Beam Schema DATETIME fields as well as
> for Beam timestamps in general? The latter likely has a bunch of
> downstream consequences for all runners.
>
> On Tue, Nov 6, 2018 at 12:38 AM Ismaël Mejía <ieme...@gmail.com> wrote:
>
>> +1 to more precision even to the nano level, probably via Reuven's
>> proposal of a different internal representation.
>> On Tue, Nov 6, 2018 at 9:19 AM Robert Bradshaw <rober...@google.com> wrote:
>> >
>> > +1 to offering more granular timestamps in general. I think it will be
>> > odd if setting the element timestamp from a row DATETIME field is
>> > lossy, so we should seriously consider upgrading that as well.
>> > On Tue, Nov 6, 2018 at 6:42 AM Charles Chen <c...@google.com> wrote:
>> > >
>> > > One related issue that came up before is that we (perhaps unnecessarily) restrict the precision of timestamps in the Python SDK to milliseconds because of legacy reasons related to the Java runner's use of Joda time. Perhaps Beam portability should natively use a more granular timestamp unit.
>> > >
>> > > On Mon, Nov 5, 2018 at 9:34 PM Rui Wang <ruw...@google.com> wrote:
>> > >>
>> > >> Thanks Reuven!
>> > >>
>> > >> I think Reuven gives the third option:
>> > >>
>> > >> Change internal representation of DATETIME field in Row. Still keep public ReadableDateTime getDateTime(String fieldName) API to be compatible with existing code. And I think we could add one more API to getDataTimeNanosecond. This option is different from the option one because option one actually maintains two implementation of time.
>> > >>
>> > >> -Rui
>> > >>
>> > >> On Mon, Nov 5, 2018 at 9:26 PM Reuven Lax <re...@google.com> wrote:
>> > >>>
>> > >>> I would vote that we change the internal representation of Row to something other than Joda. Java 8 times would give us at least microseconds, and if we want nanoseconds we could simply store it as a number.
>> > >>>
>> > >>> We should still keep accessor methods that return and take Joda objects, as the rest of Beam still depends on Joda.
>> > >>>
>> > >>> Reuven
>> > >>>
>> > >>> On Mon, Nov 5, 2018 at 9:21 PM Rui Wang <ruw...@google.com> wrote:
>> > >>>>
>> > >>>> Hi Community,
>> > >>>>
>> > >>>> The DATETIME field in Beam Schema/Row is implemented by Joda's Datetime (see Row.java#L611 and Row.java#L169). Joda's Datetime is limited to the precision of millisecond. It has good enough precision to represent timestamp of event time, but it is not enough for the real "time" data. For the "time" type data, we probably need to support even up to the precision of nanosecond.
>> > >>>>
>> > >>>> Unfortunately, Joda decided to keep the precision of millisecond: https://github.com/JodaOrg/joda-time/issues/139.
>> > >>>>
>> > >>>> If we want to support the precision of nanosecond, we could have two options:
>> > >>>>
>> > >>>> Option one: utilize current FieldType's metadata field, such that we could set something into meta data and Row could check the metadata to decide what's saved in DATETIME field: Joda's Datetime or an implementation that supports nanosecond.
>> > >>>>
>> > >>>> Option two: have another field (maybe called TIMESTAMP field?), to have an implementation to support higher precision of time.
>> > >>>>
>> > >>>> What do you think about the need of higher precision for time type and which option is preferred?
>> > >>>>
>> > >>>> -Rui
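
For readers following the thread, here is a minimal sketch of the idea discussed above: keep a nanosecond-precision value internally while still exposing a millisecond view that a Joda-based accessor could be built on. It is only an illustration under stated assumptions, not Beam's implementation. NanoTimestamp and its methods are hypothetical names, and only java.time from the JDK is used (a Joda object would be constructed from the millisecond value).

```java
import java.time.Instant;

public class NanoDateTimeSketch {

    // Hypothetical holder, NOT Beam's Row: stores epoch seconds plus a
    // nanosecond-of-second adjustment, i.e. "simply store it as a number".
    static final class NanoTimestamp {
        private final long epochSecond;
        private final int nanoOfSecond;

        NanoTimestamp(long epochSecond, int nanoOfSecond) {
            this.epochSecond = epochSecond;
            this.nanoOfSecond = nanoOfSecond;
        }

        static NanoTimestamp of(Instant instant) {
            return new NanoTimestamp(instant.getEpochSecond(), instant.getNano());
        }

        // Full-precision view.
        Instant toInstant() {
            return Instant.ofEpochSecond(epochSecond, nanoOfSecond);
        }

        // Millisecond view; sub-millisecond digits are dropped here, which is
        // the loss a millisecond-only accessor would have to accept.
        long toEpochMillis() {
            return toInstant().toEpochMilli();
        }
    }

    public static void main(String[] args) {
        Instant original = Instant.parse("2018-11-06T00:55:00.123456789Z");
        NanoTimestamp ts = NanoTimestamp.of(original);

        // Nanosecond precision survives the round trip.
        System.out.println(ts.toInstant());
        // The millisecond view keeps only .123 of the fractional second.
        System.out.println(Instant.ofEpochMilli(ts.toEpochMillis()));
    }
}
```

The design choice shown is the one Reuven describes: a single internal representation with a compatibility accessor, rather than two parallel implementations of time as in option one.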