Re: Timestamp Based Incremental Reading in Iceberg ...

2020-09-11 Thread Ryan Blue
estamp of the last change, or we want to use it only as a monotonously increasing more human readable identifier? Do we want to compare this timestamp against some external source, or we just want to

Re: Timestamp Based Incremental Reading in Iceberg ...

2020-09-10 Thread Ashish Mehta
Do we want to compare this timestamp against some external source, or we just want to compare this timestamp with other timestamps in the different snapshots of the same table? So, there are tw

Re: Timestamp Based Incremental Reading in Iceberg ...

2020-09-10 Thread Ryan Blue
timestamps in the different snapshots of the same table? So, there are two asks: 1). Whether to have a timestamp based API for delta reading; 2). How to enforce and implement a service/protocol for

Re: Timestamp Based Incremental Reading in Iceberg ...

2020-09-10 Thread Ryan Blue
timestamp based API for delta reading; 2). How to enforce and implement a service/protocol for timestamp sync among all clients. 1). +1 to have it as Jingsong and Gautam suggested. Snapshot ID could be source of truth in any cases.

Re: Timestamp Based Incremental Reading in Iceberg ...

2020-09-10 Thread Gautam
ce and implement a service/protocol for timestamp sync among all clients. 1). +1 to have it as Jingsong and Gautam suggested. Snapshot ID could be source of truth in any cases. 2). IMO, it should be an external package to Iceberg.

Re: Timestamp Based Incremental Reading in Iceberg ...

2020-09-09 Thread Ryan Blue
O, it should be an external package to Iceberg. Miao From: OpenInx Reply-To: "dev@iceberg.apache.org" Date: Tuesday, September 8, 2020 at 7:55 PM To: Iceberg Dev List Subject: Re: Timestamp Based Incremental Reading in Iceberg ...

Re: Timestamp Based Incremental Reading in Iceberg ...

2020-09-09 Thread Peter Vary
8, 2020 at 7:55 PM To: Iceberg Dev List Subject: Re: Timestamp Based Incremental Reading in Iceberg ... I agree that it's helpful to allow users to read the incremental delta based timestamp, as Jingsong said timestamp is more friendly. My question

Re: Timestamp Based Incremental Reading in Iceberg ...

2020-09-08 Thread Miao Wang
Date: Tuesday, September 8, 2020 at 7:55 PM To: Iceberg Dev List Subject: Re: Timestamp Based Incremental Reading in Iceberg ... I agree that it's helpful to allow users to read the incremental delta based timestamp, as Jingsong said timestamp is more friendly. My question is how to

Re: Timestamp Based Incremental Reading in Iceberg ...

2020-09-08 Thread Peter Vary
Newbie here, but if I understand correctly, the client knows the previous snapshot and the corresponding timestamp. It could be the responsibility of the client to generate a new timestamp which is higher than or equal to the previous one. There might be checks implemented on commit to prevent smaller
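
The idea in this message can be sketched roughly as follows. This is only an illustration under assumptions, not Iceberg code: `next_commit_timestamp` and `validate_commit` are hypothetical helper names, and timestamps are assumed to be milliseconds since the epoch. The writer takes the larger of its own clock and the previous snapshot's timestamp, and a commit-time check rejects anything smaller.

```python
import time


def next_commit_timestamp(previous_ts_ms, now_ms=None):
    """Return a commit timestamp that is never below the previous snapshot's.

    Hypothetical helper, not part of Iceberg. `now_ms` can be injected for
    testing; by default the local wall clock is used.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    # If the local clock lags behind, reuse the previous timestamp.
    return max(previous_ts_ms, now_ms)


def validate_commit(previous_ts_ms, new_ts_ms):
    """Commit-time check (assumed): reject timestamps that go backwards."""
    if new_ts_ms < previous_ts_ms:
        raise ValueError(
            f"commit timestamp {new_ts_ms} is smaller than previous {previous_ts_ms}"
        )
    return new_ts_ms
```

With this approach two consecutive snapshots may share the same timestamp, so a snapshot ID would still be needed to tell them apart.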

Re: Timestamp Based Incremental Reading in Iceberg ...

2020-09-08 Thread OpenInx
I agree that it's helpful to allow users to read the incremental delta based timestamp, as Jingsong said timestamp is more friendly. My question is how to implement this? If just attach the client's timestamp to the Iceberg table when committing, then different clients may have different tim

Re: Timestamp Based Incremental Reading in Iceberg ...

2020-09-08 Thread Jingsong Li
+1 for timestamps are linear, in implementation, maybe the writer only needs to look at the previous snapshot timestamp. We're trying to think of Iceberg as a message queue. Let's take the popular queue Kafka as an example: Iceberg has snapshotId and timestamp, correspondingly, Kafka has offset and

Re: Timestamp Based Incremental Reading in Iceberg ...

2020-09-08 Thread Sud
We are using incremental read for Iceberg tables which get quite a few appends (~500-1000 per hour), but instead of using timestamps we use snapshot IDs and track the state of the last read snapshot ID. We are using timestamp as fallback when the state is incorrect, but as you mentioned if timestamps are
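
A minimal sketch of the pattern described in this message, under stated assumptions: the snapshot log is modeled as a plain list of `(snapshot_id, timestamp_ms)` pairs in commit order, and `snapshots_to_read` is a hypothetical function, not an Iceberg API. The reader resumes from the last read snapshot ID when that state is valid and only falls back to a timestamp when it is not.

```python
def snapshots_to_read(log, last_read_id=None, fallback_ts_ms=None):
    """Pick the snapshot IDs an incremental reader should process next.

    `log` is a list of (snapshot_id, timestamp_ms) tuples in commit order
    (an assumed, simplified model of a table's snapshot history).
    """
    ids = [snapshot_id for snapshot_id, _ in log]

    # Preferred path: resume right after the last snapshot that was read.
    if last_read_id in ids:
        return ids[ids.index(last_read_id) + 1:]

    # Fallback path: the stored ID is missing or unknown, so use a timestamp.
    if fallback_ts_ms is not None:
        return [sid for sid, ts in log if ts > fallback_ts_ms]

    # No usable state at all: read everything.
    return ids
```

The fallback path is only as reliable as the timestamps themselves, which is why the snapshot ID is used as the primary state here.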

Timestamp Based Incremental Reading in Iceberg ...

2020-09-08 Thread Gautam
Hello Devs, We are looking into adding workflows that read data incrementally based on commit time. The ability to read deltas between start / end commit timestamps on a table and ability to resume reading from last read end timestamp. In that regard, we need the timestamps to be