Re: Optimisation on join in case of all the data to be joined present in the same machine (region server)

2018-04-16 Thread Josh Elser
That's a great suggestion too, Pedro! Sounds like both are ultimately achieving the same thing. I just didn't know what all was possible inside of Kafka Streams ;). Thanks for sharing. On 4/16/18 2:33 PM, Pedro Boado wrote: I guess this thread is not about kafka streams but what Josh

Re: Optimisation on join in case of all the data to be joined present in the same machine (region server)

2018-04-16 Thread Pedro Boado
I guess this thread is not about Kafka Streams, but what Josh suggested is basically my last-resort plan for building Kafka Streams, as you'll be constrained by the HBase/Phoenix upsert rate (you'll be doing 5x the number of upserts). In my experience Kafka Streams is not bad at all doing this kind

Re: Optimisation on join in case of all the data to be joined present in the same machine (region server)

2018-04-16 Thread Rabin Banerjee
Thanks, Josh! On Mon, Apr 16, 2018 at 11:16 PM, Josh Elser wrote: > Please keep communication on the mailing list. > > Remember that you can execute partial-row upserts with Phoenix. As long as > you can generate the primary key from each stream, you don't need to do >

Re: Optimisation on join in case of all the data to be joined present in the same machine (region server)

2018-04-16 Thread Josh Elser
Please keep communication on the mailing list. Remember that you can execute partial-row upserts with Phoenix. As long as you can generate the primary key from each stream, you don't need to do anything special in Kafka Streams. You can just submit 5 UPSERTs (one for each stream), and the
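
A minimal sketch of the partial-row upsert idea described above, assuming a single wide table keyed on the shared id. The table name EVENTS_WIDE, the key column ID, the column names, and the helper partial_upsert are all hypothetical and do not come from the thread. The helper only builds the statement text and its parameter list; actually executing it against Phoenix (for example through a JDBC or Python driver) is left out.

```python
from typing import List, Mapping, Tuple


def partial_upsert(table: str, key_col: str, key: object,
                   values: Mapping[str, object]) -> Tuple[str, List[object]]:
    """Build a parameterised UPSERT that writes the key plus a subset of columns.

    Columns not listed are simply not part of the statement, so each stream
    can write only the columns it owns.
    """
    cols = [key_col, *values.keys()]
    placeholders = ", ".join("?" for _ in cols)
    sql = f"UPSERT INTO {table} ({', '.join(cols)}) VALUES ({placeholders})"
    return sql, [key, *values.values()]


# One statement per stream, all sharing the same primary key value.
# Table and column names are illustrative assumptions.
sql_a, params_a = partial_upsert("EVENTS_WIDE", "ID", 42, {"A_COL": "from stream A"})
sql_b, params_b = partial_upsert("EVENTS_WIDE", "ID", 42, {"B_COL": "from stream B"})
print(sql_a, params_a)
print(sql_b, params_b)
```

Each stream would issue its own statement like this, so no coordination between the streams is needed beyond agreeing on how the key is generated.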

Re: Optimisation on join in case of all the data to be joined present in the same machine (region server)

2018-04-16 Thread Josh Elser
Short answer: no. You're going to be much better off de-normalizing your five tables into one table and eliminating the need for this JOIN. What made you decide to use Phoenix in the first place? On 4/16/18 6:04 AM, Rabin Banerjee wrote: Hi all, I am new to Phoenix, I wanted to know
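
To illustrate why a de-normalized table removes the need for the JOIN, here is a small in-memory model, not Phoenix itself. It assumes that writes to the same key accumulate columns in one row, which is the behaviour the partial-row upsert suggestion in this thread relies on. The function apply_upserts and the column names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping, Tuple


def apply_upserts(
    upserts: Iterable[Tuple[int, Mapping[str, str]]]
) -> Dict[int, Dict[str, str]]:
    """Model a wide table as key -> {column: value}.

    Each upsert touches only some columns; columns written earlier for the
    same key are kept, so the row ends up holding data from every source.
    """
    table: Dict[int, Dict[str, str]] = defaultdict(dict)
    for key, columns in upserts:
        table[key].update(columns)
    return dict(table)


# Writes arriving from three of the five sources, in arbitrary order.
rows = apply_upserts([
    (1, {"A_COL": "a1"}),
    (2, {"A_COL": "a2"}),
    (1, {"B_COL": "b1"}),
    (1, {"C_COL": "c1"}),
])
print(rows[1])
```

Reading the combined record for a key is then a single-row lookup on the wide table rather than a five-way join.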

Optimisation on join in case of all the data to be joined present in the same machine (region server)

2018-04-16 Thread Rabin Banerjee
Hi all, I am new to Phoenix. I wanted to know: if I have to join 5 huge tables that are all keyed on the same id (i.e. one id column is common to all of them), is there any optimization I can add to make this join faster, as all the data for a particular key for all 5 tables will