Hi Rahul,

I am sorry, I didn't understand the use case properly. Can you please explain with an example? Let me put my version of understanding based on your email.

> In json file, every time I will pass a fixed value for a key field.

Are you saying that you will always have only one value for every key? Example: Rahul -> "Some Value"
> Currently if I load data like this, only 1 entry per file loads.

What do you mean by this line? Do you mean that currently you are loading data like this and only 1 entry per file is loading? Isn't that what you are trying to achieve in the line above?

> I don't want the same key's values to be skipped while inserting.

Are you saying that you want the same values also repeated for your keys, e.g. if the Rahul primary_key has "Some Value" inserted 5 times, then you would want that to appear 5 times in your store?

In summary: it appears that what you want is that if someone enters 5 values, all 5 are kept even if they are the same. So you need something like the below:

| primary_key | Values                            |
| Rahul       | "Some Value", "Some Value", ..... |

Let me know if my understanding is correct.

Thanks
Kabeer.

> Dear Omar/Kabeer
> In one of my use cases, think like I don't want updates at all. In json file,
> every time I will pass a fixed value for a key field. Currently if I load
> data like this, only 1 entry per file loads. I don't want the same key's
> values to be skipped while inserting.
> Thanks & Regards
> Rahul

On Apr 5 2019, at 9:11 am, Unknown wrote:
>
>
> On 2019/04/04 19:48:39, Kabeer Ahmed <[email protected]> wrote:
> > Omkar - there might be various reasons to have duplicates, e.g. handling trades
> > in a given day from a single client, tracking visitor click data to the
> > website, etc.
> >
> > Rahul - if you can give more details about your requirements, then we can
> > come up with a solution.
> > I have never used INSERT & BULK_INSERT at all and I am not sure if these
> > options (insert and bulk_insert) allow the user to specify the logic that
> > you are seeking. Without knowing your exact requirement, I can still give a
> > suggestion: look into the option of implementing your own
> > combineAndGetUpdateValue() logic.
> > Let's say all your values for a particular key are strings. You could append
> > the new string values to the existing values and store them as:
> >
> > key | Value
> > Rahul | Nice
> >
> > // when there is another entry, append it to the existing value with a
> > // comma separator, say:
> >
> > key | Value
> > Rahul | Nice, Person
> >
> > When you retrieve the key's values you could then decide how to ship them back
> > to the user as you want - which is something you would know based on your
> > requirement - since your json anyway has multiple ways to insert values for a key.
> >
> > Feel free to reach out if you need help and I will help you as much as I can.
> >
> > On Apr 4 2019, at 6:35 pm, Omkar Joshi <[email protected]> wrote:
> > > Hi Rahul,
> > >
> > > Thanks for trying out Hudi!!
> > > Any reason why you need to have duplicates in the HUDI dataset? Will you ever
> > > be updating it later?
> > >
> > > Thanks,
> > > Omkar
> > >
> > > On Thu, Apr 4, 2019 at 1:33 AM [email protected] <
> > > [email protected]> wrote:
> > >
> > > > Dear All
> > > > I am using a cow table with INSERT/BULK_INSERT.
> > > > I am loading the data from json files.
> > > >
> > > > If an existing key in the hudi dataset is loaded again, then only the new
> > > > data with that key is showing. Can I show both records? (In INSERT)
> > > >
> > > > If the same key is there multiple times in a source json file, then only
> > > > one key is getting loaded. Can I load duplicate keys from the same
> > > > file? (both insert/bulk_insert)
> > > >
> > > > Thanks & Regards
> > > > Rahul
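
To make the combineAndGetUpdateValue() suggestion quoted above a little more concrete, here is a rough, untested sketch of what the append-on-merge logic could look like. Treat everything in it as an assumption: the class name AppendingPayload and the field name "value" are made up for illustration, and a real payload has to implement the HoodieRecordPayload interface of the Hudi version you run (which also requires preCombine() and getInsertValue(), and wraps the return value below in that version's Option type), so please check the exact package names and signatures against your release.

import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.generic.IndexedRecord;

/**
 * Sketch of the merge step only: keep the stored row and append the incoming
 * "value" field to it with a comma separator, i.e.
 * (Rahul | Nice) merged with (Rahul | Person) gives (Rahul | Nice, Person).
 */
public class AppendingPayload {

    // The record arriving in the new commit for this key.
    private final GenericRecord incoming;

    public AppendingPayload(GenericRecord incoming) {
        this.incoming = incoming;
    }

    public IndexedRecord combineAndGetUpdateValue(IndexedRecord currentValue, Schema schema) throws IOException {
        GenericRecord existing = (GenericRecord) currentValue;

        // Start from a copy of the stored record so every other column is preserved.
        GenericRecord merged = new GenericData.Record(schema);
        for (Schema.Field field : schema.getFields()) {
            merged.put(field.name(), existing.get(field.name()));
        }

        // Append the new value to the old one, comma separated, instead of replacing it.
        merged.put("value", existing.get("value") + ", " + incoming.get("value"));
        return merged;
    }
}

Such a class would then be pointed at through the writer's payload class configuration (for the Spark datasource that is the payload class write option, hoodie.datasource.write.payload.class in recent releases - again, please verify the option name for the version you are on).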
