Hello,
We need to run some queries with store.format set to parquet and others
with it set to CSV. So far we have experimented with setting the store
format per session, using two separate user logins as a sort of context
switch. I'm wondering whether the group here has suggestions for a
better way to handle this, particularly one that will scale a little
better for us.
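
To make the context switch concrete, here is a rough sketch of the
statements each session has to issue, in order, on the same connection.
This is only an illustration, not our actual code: the helper name
statements_for and the table paths are hypothetical, and how the
statements are submitted (JDBC, ODBC, REST) is left out on purpose.

```python
# Hypothetical helper: returns the statements one session must run, in order.
# Both statements have to reach the same Drill session for the option to apply.
ALLOWED_FORMATS = {"parquet", "csv"}  # the two formats we switch between


def statements_for(store_format, query):
    fmt = store_format.lower()
    if fmt not in ALLOWED_FORMATS:
        raise ValueError("unsupported store.format: " + store_format)
    return [
        # Session-scoped option; it does not affect other sessions.
        "ALTER SESSION SET `store.format` = '{}'".format(fmt),
        query,
    ]


# Example with made-up table names.
csv_statements = statements_for(
    "CSV",
    "CREATE TABLE dfs.tmp.`report_csv` AS SELECT * FROM dfs.tmp.`source`",
)
```

Because the option is session-scoped, anything that routes the second
statement to a different session than the first loses the setting,
which is the crux of the problem described next.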
The main problem with this approach comes when we introduce multiple
drillbits/HA: we need to be sure that the session and the settings we
rely on are respected across all drillbits (whether with an HAProxy +
sticky session approach or any other approach). There is a separate
thread (which I've chosen not to hijack) that discusses HA Drill in
general. You might think of my question as similar, but with the added
need for a context switch to support multiple Drill
configurations/session options.
Here are the approaches we have tried or considered so far. I'm
wondering whether you have any general advice as to which would be best
for us, considering future plans for Drill itself. For example, we can
write our own plugin(s) if that turns out to be the smartest approach:
- embedding the store.format option in the query itself by chaining
multiple queries together separated by a comma (it appears that this
doesn't work at all)
- writing some sort of plugin that would let us scale our current
approach somehow (I realize that this is vague)
- a "foreman" approach, where we stick with our current approach and
direct all requests to our "foreman"/master, with the hope and
expectation that it will farm out work to the workers/slaves
- running multiple clusters, each configured with different settings
Each of these approaches seems to have its pros and cons. To reiterate:
which of them do you think would be the smartest and most future-proof
for us to take?
Thanks in advance!