Hi Tanu,

Some points to consider:

1. A UUID has a fixed size, whereas your domain object keys may be smaller or larger (I don't know their size). Smaller keys reduce storage requirements.
2. UUIDs don't compress well. Your domain object keys may compress better.
3. From the Bloom filter perspective, I don't think there is any difference unless the key sizes differ greatly.
4. If the domain object keys are already unique, what is the purpose of appending create_date?
5. If you query by "primary key minus timestamp", the entire record key column will have to be read to find matches, so Bloom filters won't help here.
6. What do the domain object keys look like? Will they also be included in any other field in the record? Would you ever want to query on domain object keys?
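
To illustrate point 5, here is a minimal Python sketch of a toy Bloom filter. It is not Hudi's implementation, and the class, the key values, and the filter sizes are all made up for illustration. It only shows the general property: a Bloom filter can answer "might this exact key be present?", so a lookup by a key prefix (the key minus the timestamp) gets no help from it and has to fall back to scanning the keys.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter for illustration only (not Hudi's implementation)."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # True means "possibly present"; False means "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


# Hypothetical record keys: <domain_object_key_1>,<domain_object_key_2>,<create_date>
record_keys = [
    "acct-1001,order-77,2020-10-15",
    "acct-1001,order-78,2020-10-15",
    "acct-2002,order-12,2020-10-14",
]

bloom = BloomFilter()
for key in record_keys:
    bloom.add(key)

full_key = record_keys[0]
partial_key = "acct-1001,order-77"  # the primary key minus the timestamp

# An exact-key lookup can use the filter.
full_hit = bloom.might_contain(full_key)

# The partial key was never inserted, so the filter says nothing useful about it.
partial_hit = bloom.might_contain(partial_key)

# The only way to answer the partial-key query is to scan every stored key.
matches = [k for k in record_keys if k.startswith(partial_key + ",")]

print("exact key lookup:", full_hit)
print("partial key lookup:", partial_hit)
print("matches found by scanning:", matches)
```

The exact-key check always reports a possible match for a key that was added, while the partial key is found only by the scan at the end.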

Thanks
Prashant

On Thu, Oct 15, 2020 at 8:21 PM tanu dua <[email protected]> wrote:

> read query pattern will be (partition key + primary key minus timestamp)
> where my primary key is domain keys + timestamp.
>
> Read Write queries are as per dataset but mostly all the tables are read
> and write frequently and equally
>
> Read will be mostly done by providing the partitions and not by blanket
> query.
>
> If we have to choose between read and write I will choose write but I want
> to stick only with COW table.
>
> Please let me know if you need more information.
>
>
> On Thu, 15 Oct 2020 at 5:48 PM, Sivabalan <[email protected]> wrote:
>
> > Can you give us a sense of how your read workload looks like? Depending on
> > that read perf could vary.
> >
> > On Thu, Oct 15, 2020 at 4:06 AM Tanuj <[email protected]> wrote:
> >
> > > Hi all,
> > > We don't have an "UPDATE" use case and all ingested rows will be "INSERT"
> > > so what is the best way to define PRIMARY key. As of now we have designed
> > > primary key as per domain object with create_date which is -
> > > <domain_object_key_1>,<domain_object_key_2>,<create_date>
> > >
> > > Since its always an INSERT for us , I can potentially use UUID as well.
> > >
> > > We use keys for Bloom Index in HUDI so just wanted to know if I get a
> > > better performance in writing if I will have the UUID vs composite domain
> > > keys.
> > >
> > > I believe read is not impacted as per the Primary Key as its not being
> > > considered ?
> > >
> > > Please suggest
> > >
> >
> > --
> > Regards,
> > -Sivabalan
> >
