Thanks for replying, Adar. I did some math, and in our case we are hitting another Kudu limit: 60 tablets per node. We use high-density nodes with 2 24-core CPUs, so we have 88 hyperthreaded cores per node, or 88*24=2112 cores in total. But I cannot create more than 60*24=1440 tablets per table. It looks like the tablets for my largest table will be around 8-10 GB in size. Should I be worried, given that the recommendation is to keep tablets at about 1 GB?
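For reference, here is the arithmetic above as a small Python sketch. The node count of 24 is inferred from the multiplications above, the function names are made up for illustration, and the 12,960 GB table size is a hypothetical figure chosen only so that the result lands in the 8-10 GB range. It assumes one Impala read thread per tablet, as described in the quoted reply.

```python
def tablet_budget(nodes, threads_per_node, max_tablets_per_node):
    """Return (total hyperthreaded cores, max tablets) for the cluster."""
    total_threads = nodes * threads_per_node
    max_tablets = nodes * max_tablets_per_node
    return total_threads, max_tablets


def max_scan_threads(tablets, total_threads):
    """One read thread per tablet, so parallelism is capped by both numbers."""
    return min(tablets, total_threads)


def tablet_size_gb(table_size_gb, tablets):
    """Average tablet size if the table is spread evenly over its tablets."""
    return table_size_gb / tablets


# Assumed cluster: 24 nodes, 88 hyperthreaded cores each, 60 tablets per node.
total_threads, max_tablets = tablet_budget(24, 88, 60)
print(total_threads, max_tablets)                    # 2112 1440
print(max_scan_threads(max_tablets, total_threads))  # 1440

# Hypothetical table size, for illustration only.
print(tablet_size_gb(12960, max_tablets))            # 9.0
```

So with the 60-tablets-per-node cap, a scan of one table could keep at most 1440 of the 2112 cores busy, and the tablet size is whatever the table size divided by 1440 works out to.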
On Wed, Oct 17, 2018 at 8:06 PM Adar Lieber-Dembo <a...@cloudera.com> wrote:
> Hi Boris,
>
> > Also, when they say tablets - I assume this is before replication? so in
> > reality, it is number of nodes x cpu cores / replication factor? If this is
> > the case, it is not looking good...
>
> No, I think this is post-replication. The underlying assumption is
> that you want to maximize parallelism for large tables, and since
> Impala only uses one read thread per tablet, that means ensuring the
> number of tablets is close or equal to the overall number of cores.
> However, during a scan Impala will choose one of the tablet's replicas
> to read from, so you don't need to "reserve" a core for the other
> replicas.
>
> >> can someone clarify if this recommendation below - does it mean
> >> physical or hyper-threaded CPU cores? quite a big difference...
>
> I think this refers to hyper-threaded CPU cores (i.e. a CPU unit
> capable of executing an OS thread). But I'd be curious to hear if your
> workload is substantially more or less performant either way.