Thanks Reuven!

I think Reuven is proposing a third option:

Change the internal representation of the DATETIME field in Row, but keep
the public ReadableDateTime getDateTime(String fieldName) API so that
existing code stays compatible. I think we could also add one more API,
such as getDateTimeNanosecond, for callers that need the higher precision.
This differs from option one because option one actually maintains two
implementations of time.
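
A minimal sketch of the third option, assuming the internal value is stored as a java.time.Instant. The class DateTimeFieldSketch and both accessor names are hypothetical stand-ins for illustration, not Beam's actual Row API, and the Joda wrapping step is only described in a comment so the example needs nothing beyond the JDK.

```java
import java.time.Instant;

/** Hypothetical stand-in for the value of a Row DATETIME field; not Beam's Row class. */
final class DateTimeFieldSketch {
  // Assumed internal representation: java.time.Instant keeps nanosecond precision.
  private final Instant value;

  DateTimeFieldSketch(Instant value) {
    this.value = value;
  }

  /**
   * Millisecond view for existing callers. In Beam, this value could be wrapped
   * in a Joda object, e.g. new org.joda.time.DateTime(millis), so that
   * getDateTime(String fieldName) keeps returning a ReadableDateTime.
   */
  long getDateTimeMillis() {
    return value.toEpochMilli(); // sub-millisecond digits are dropped here
  }

  /** Hypothetical new accessor: nanoseconds since the epoch, as a plain long. */
  long getDateTimeNanos() {
    // The exact-arithmetic helpers throw ArithmeticException on overflow
    // instead of silently wrapping around.
    return Math.addExact(
        Math.multiplyExact(value.getEpochSecond(), 1_000_000_000L), value.getNano());
  }
}
```

With this shape there is a single stored value and two views of it, which is the difference from option one. One design point to weigh: a long of epoch nanoseconds only covers roughly 292 years on either side of 1970, so an accessor like this would need a defined behavior for values outside that range.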

-Rui

On Mon, Nov 5, 2018 at 9:26 PM Reuven Lax <re...@google.com> wrote:

> I would vote that we change the internal representation of Row to
> something other than Joda. Java 8 times would give us at least
> microseconds, and if we want nanoseconds we could simply store it as a
> number.
>
> We should still keep accessor methods that return and take Joda objects,
> as the rest of Beam still depends on Joda.
>
> Reuven
>
> On Mon, Nov 5, 2018 at 9:21 PM Rui Wang <ruw...@google.com> wrote:
>
>> Hi Community,
>>
>> The DATETIME field in Beam Schema/Row is implemented by Joda's DateTime
>> (see Row.java#L611
>> <https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/values/Row.java#L611>
>>  and Row.java#L169
>> <https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/values/Row.java#L169>).
>> Joda's DateTime is limited to millisecond precision. That is precise
>> enough to represent event-time timestamps, but it is not enough for
>> real "time" data. For "time" type data, we probably need to support
>> precision up to the nanosecond.
>>
>> Unfortunately, Joda decided to stay at millisecond precision:
>> https://github.com/JodaOrg/joda-time/issues/139.
>>
>> If we want to support nanosecond precision, we have two options:
>>
>> Option one: use the existing metadata field on FieldType
>> <https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/Schema.java#L421>,
>> so that we can set a marker in the metadata and Row can check it to
>> decide what is stored in the DATETIME field: Joda's DateTime or an
>> implementation that supports nanoseconds.
>>
>> Option two: add another field (maybe called a TIMESTAMP field?) backed
>> by an implementation that supports higher-precision time.
>>
>> What do you think about the need for higher precision in the time type,
>> and which option do you prefer?
>>
>> -Rui
>>
>
