Hello All, I'm looking for a methodology for deciding the cluster capacity for Hive.
Can anyone recommend best practices for choosing a cluster capacity that lets us query data efficiently in Hive? Please note that we have external tables in Hive pointing to S3, so we use Hive only for querying the data.

*Thanks,*
*Sai.*
