I'm a little hesitant about this because of a few things I've noticed across many installations:

* Salted tables are *not* always more efficient. In fact, I've found myself advising against salted tables more often than I expected. Certain kinds of queries require much more work on a salted table than on an unsalted one.

* If you treat salt buckets as a measure of parallelism for a table, the system cannot correctly judge how much parallelism the cluster as a whole should have. For example, with 10 RS and 1 Phoenix table, you would want to start with 10 salt buckets. However, with 10 RS and 100 Phoenix tables, you'd *maybe* want 3 salt buckets. It's hard to make system-wide decisions correctly without a global view of the entire system.

I think James was trying to capture some of this in his use of "relatively conservative defaults", but I'd take that a bit further and say I consider it harmful for Phoenix to do that out of the box.

However, I would flip the question upside down instead: given a table's schema and its important queries, what kind of suggestions can Phoenix, as a database, make to _recommend_ that the user enable salting on that table?

On 12/8/17 12:34 PM, James Taylor wrote:
Hi Flavio,
I like the idea of “adaptable configuration” where you specify a config value as a % of some cluster resource (with relatively conservative defaults). Salting is somewhat of a gray area though as it’s not config based, but driven by your DDL. One solution you could implement on top of Phoenix is scripting for DDL that fills in the salt bucket parameter based on cluster size.
Thanks,
James
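
A minimal sketch of the DDL scripting James describes, assuming the region server count has already been looked up elsewhere (for example from the HBase shell) and is passed in as a plain number. The helper names `salt_buckets_for` and `create_table_ddl` and the `cap` parameter are hypothetical and not part of Phoenix; only the `SALT_BUCKETS` table option and its 1 to 256 range come from Phoenix itself. The `cap` argument is there because of the many-tables caveat raised in the reply above: matching the region server count is only a starting point.

```python
# Hypothetical helpers, not part of Phoenix: they fill in SALT_BUCKETS from a
# region server count that the caller has already obtained by other means.

PHOENIX_MAX_SALT_BUCKETS = 256  # Phoenix accepts SALT_BUCKETS values from 1 to 256


def salt_buckets_for(region_servers: int, cap: int = PHOENIX_MAX_SALT_BUCKETS) -> int:
    """Return the region server count, limited to at most `cap` buckets."""
    if region_servers < 1:
        raise ValueError("region_servers must be at least 1")
    if not 1 <= cap <= PHOENIX_MAX_SALT_BUCKETS:
        raise ValueError("cap must be between 1 and 256")
    return min(region_servers, cap)


def create_table_ddl(table: str, columns_sql: str, region_servers: int,
                     cap: int = PHOENIX_MAX_SALT_BUCKETS) -> str:
    """Build a CREATE TABLE statement with the salt bucket count filled in."""
    buckets = salt_buckets_for(region_servers, cap)
    return f"CREATE TABLE {table} ({columns_sql}) SALT_BUCKETS = {buckets}"


if __name__ == "__main__":
    columns = "ID BIGINT NOT NULL PRIMARY KEY, PAYLOAD VARCHAR"
    # One table on a 10-node cluster: match the region server count.
    print(create_table_ddl("EVENTS", columns, region_servers=10))
    # Many tables on the same cluster: the operator chooses a lower cap.
    print(create_table_ddl("EVENTS", columns, region_servers=10, cap=3))
```

The script only generates the statement; whether a given table should be salted at all, and what cap is reasonable, remains a judgment call for whoever knows the schema and the query patterns.
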

On Tue, Dec 5, 2017 at 12:50 AM Flavio Pompermaier wrote:

    Hi to all,
    as stated in the documentation[1], "for optimal performance,
    number of salt buckets should match number of region servers".
    So why not add an AUTO/DEFAULT option for salting that defaults
    this parameter to the number of region servers?
    Otherwise I have to manually connect to HBase, retrieve that number
    and pass it to Phoenix...
    What do you think?

    [1] https://phoenix.apache.org/performance.html#Salting

    Best,
    Flavio
