Thanks a lot for your help! I see that everyone is really against the "many small tables" setup. Is it that inefficient? I did it that way because having many tables felt more natural, but I can give the alternatives a shot and see if that helps.
Isart

On Wed, Jun 3, 2015 at 9:10 PM, Alex Kamil <[email protected]> wrote:

> Isart, you can try to eliminate joins by embedding smaller tables into
> larger ones wherever possible. We do this with ARRAY
> <https://phoenix.apache.org/array_type.html> (and by writing some UDFs
> <https://phoenix.apache.org/udf.html> for refined filtering in these
> nested tables). Or, as James suggested, Views might be helpful.
>
> Alex
>
> On Wed, Jun 3, 2015 at 10:02 AM, James Taylor <[email protected]> wrote:
>
>> Rather than use a SALT_BUCKET of 2, just don't salt the table at all. It
>> never makes sense to have a SALT_BUCKET of 1, though.
>>
>> How many total tables do you have? Are you using views at all
>> (http://phoenix.apache.org/views.html)?
>>
>> Thanks,
>> James
>>
>> On Wednesday, June 3, 2015, Puneet Kumar Ojha <[email protected]> wrote:
>>
>>> Do not use SALT_BUCKET=32 for the smaller join tables. Use a salt
>>> number of 1 or 2.
>>>
>>> Increase the handler count to 60. Recommended RAM is at least 16GB per RS.
>>>
>>> Your join query performance should increase and the cluster will be
>>> stable.
>>>
>>> *From:* Isart Montane [mailto:[email protected]]
>>> *Sent:* Wednesday, June 03, 2015 4:44 PM
>>> *To:* [email protected]
>>> *Subject:* Recommendations on phoenix setup
>>>
>>> Hi,
>>>
>>> I would like to use Phoenix to replace a few of our databases, and I've
>>> been doing some tests in that direction. So far it's been working all
>>> right, but I wanted to share it with you to see if I can get some
>>> recommendations from others' experiences.
>>>
>>> Our dataset has one big table (around 200G) and around 100k smaller
>>> tables (the biggest is 5-6G, but 90% are less than 1G). The application
>>> mainly runs joins between one or two of these small tables and the big
>>> one to return just a few rows back to the app.
>>>
>>> So far it's been working OK in a 4-node test cluster (64G of RAM in
>>> total).
>>>
>>> All the tables are created with SALT_BUCKETS=32, COMPRESSION='snappy'.
>>>
>>> Is anyone running a similar setup? Any tips on how much RAM I should use?
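[Editor's note: a minimal sketch of the ARRAY-embedding approach Alex describes, based on the Phoenix array documentation he links. All table and column names here are hypothetical, not from the thread.]

```sql
-- Instead of keeping the small table separate and joining it to the big
-- one, embed its values as an ARRAY column on the big table:
CREATE TABLE metrics (
    host    VARCHAR PRIMARY KEY,
    cpu_pct DECIMAL,
    tags    VARCHAR ARRAY   -- embedded "small table" values
);

UPSERT INTO metrics VALUES ('host1', 42.5, ARRAY['prod', 'eu-west']);

-- Filter on the embedded values without a join:
SELECT host, cpu_pct FROM metrics WHERE 'prod' = ANY(tags);
```

For filtering logic more refined than `ANY`, a UDF over the array column (as Alex mentions) would take the array as an argument and return a BOOLEAN usable in the WHERE clause.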
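[Editor's note: a sketch of the salting advice from James and Puneet, with hypothetical table names: salt only the large, write-heavy table, and leave the small join tables unsalted rather than giving them SALT_BUCKETS=32.]

```sql
-- Large table: pre-split writes across region servers with salting.
CREATE TABLE big_events (
    id  BIGINT NOT NULL PRIMARY KEY,
    val VARCHAR
) SALT_BUCKETS = 8, COMPRESSION = 'SNAPPY';

-- Small join table: no SALT_BUCKETS clause at all
-- (per James, a SALT_BUCKETS of 1 never makes sense).
CREATE TABLE lookup_small (
    id   BIGINT NOT NULL PRIMARY KEY,
    name VARCHAR
) COMPRESSION = 'SNAPPY';
```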
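[Editor's note: since both Alex and James point at views as an alternative to 100k physical tables, here is a sketch of that pattern under the assumption that the small tables share a common schema; names are hypothetical. Many Phoenix views can share one physical table, distinguished by a key column.]

```sql
-- One physical table backing many logical "small tables":
CREATE TABLE all_small (
    tenant_id VARCHAR NOT NULL,
    row_key   VARCHAR NOT NULL,
    val       VARCHAR
    CONSTRAINT pk PRIMARY KEY (tenant_id, row_key)
);

-- One view per former small table, filtered on the key column:
CREATE VIEW small_t1 AS SELECT * FROM all_small WHERE tenant_id = 't1';
```

This keeps the region count (and region-server memory pressure) bounded by one table instead of 100k.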
