Hi Ming,

Yes, I think there must be a primary key. We can compute the bucket from the primary key and use it to find which executor to visit. This is the Primary Key Query Service.
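
To make the routing idea concrete, here is a minimal sketch, not the POC code. Everything in it is an assumption for illustration: the names `bucket_for` and `executor_for` are hypothetical, CRC32 stands in for whatever hash the real bucket assignment uses, and the bucket-to-executor mapping is a simple modulo over the address list returned by the AddressServer.

```python
import zlib
from typing import Sequence


def bucket_for(primary_key: bytes, num_buckets: int) -> int:
    """Map a serialized primary key to a bucket in [0, num_buckets).

    Assumption: CRC32 is only a stand-in hash. It is deterministic across
    processes, which a client and a writer must agree on; the hash used by
    the actual table may differ.
    """
    if num_buckets <= 0:
        raise ValueError("num_buckets must be positive")
    return zlib.crc32(primary_key) % num_buckets


def executor_for(
    primary_key: bytes, num_buckets: int, executors: Sequence[str]
) -> str:
    """Pick the executor address that serves the key's bucket.

    Assumption: bucket b is served by executors[b % len(executors)].
    """
    if not executors:
        raise ValueError("no executors available")
    bucket = bucket_for(primary_key, num_buckets)
    return executors[bucket % len(executors)]


if __name__ == "__main__":
    # Hypothetical addresses, as if obtained from the AddressServer.
    addresses = ["host-a:9000", "host-b:9000"]
    print(executor_for(b"user-42", 4, addresses))
```

With this shape the client sends one RPC per lookup key instead of asking every executor in turn, which is why the lookup key has to contain the full primary key.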

Then we could introduce more Query Service types. For example, a Secondary Indexed Query Service could be queried by another field to get the primary key (maybe using RocksDB to maintain the index), and then query the Primary Key Query Service to get the whole value.

The Secondary Indexed Query Service and the Primary Key Query Service are independent and unrelated, but we could use the snapshot ID to do some consistency alignment between them. That would be more complicated, though.

These things can be imagined, but they need lots of work. I just created a POC for the first version; it is very rough:
https://github.com/apache/incubator-paimon/pull/2110

Best,
Jingsong

On Tue, Oct 10, 2023 at 3:36 PM Ming Li <[email protected]> wrote:
>
> Thanks for the proposal!
> It is a common scenario for multiple applications to share the same
> dimension table. As described in the design document, the TableQuery client
> will obtain the addresses of all Executors from the AddressServer and then
> request them through RPC. I have a question about this: How does the
> TableQuery client decide which Executor to request? Request all Executors
> in turn? Or is it restricted that the key of lookup must contain bucket-key?
>
> Best,
> Ming Li
>
>
> On Sun, Oct 8, 2023 at 18:35, Jingsong Li <[email protected]> wrote:
>
> > Hi all,
> >
> > I want to bring up a discussion about Paimon QueryService [1].
> >
> > Paimon primary key table already provides LSM file structure, it is a
> > pity that the paimon can not provide a queryable service for lookup.
> >
> > A distributed service can download Paimon files locally and provide a
> > Lookup service. It does not affect the write process and read process,
> > it is a separate server. It can be used as:
> >
> > 1. Flink Lookup Join, reuse by multiple Flink Jobs.
> > 2. Online Service Lookup, this requires high stability. (it may not be
> > so stable in the first version)
> >
> > See more in PIP [1].
> >
> > This PIP is a high-level design for Paimon QueryService, not including
> > all details.
> >
> > [1]
> > https://cwiki.apache.org/confluence/display/PAIMON/PIP-10%3A+Introduce+Paimon+QueryService
> >
> > Best,
> > Jingsong
> >
