Hello!

We have Hadoop/HDFS running with YARN/Spark on worker nodes for processing jobs that are run on a schedule. We would like to introduce a queue for Spark "streaming" jobs that run indefinitely/do not exit, without interfering with the scheduled jobs or Hadoop/HBase/HDFS. We currently limit YARN to 11 CPUs, and want to bump it up to 14 CPUs to handle this additional queue. Is this a sensible thing to do on the workers themselves? From some profiling it seems like the non-YARN/Spark processes don't require a huge amount of CPU, but is there a recommended resource allotment for Hadoop/HBase/HDFS that I can reference? Each worker has 24 CPUs, 125 GB RAM, and 8 disks.
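
For context, this is roughly the change I have in mind, just a sketch: I'm assuming the CapacityScheduler, and the "streaming" queue name and the capacity percentages are placeholders.

In yarn-site.xml on each worker, bump the vcores handed to YARN:

  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>14</value>  <!-- was 11; node has 24 CPUs total -->
  </property>

In capacity-scheduler.xml, add a dedicated queue for the long-running streaming jobs:

  <property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>default,streaming</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.default.capacity</name>
    <value>80</value>  <!-- scheduled batch jobs -->
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.streaming.capacity</name>
    <value>20</value>  <!-- indefinite streaming jobs -->
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.streaming.maximum-capacity</name>
    <value>20</value>  <!-- hard cap so streaming can't crowd out batch -->
  </property>

The streaming jobs would then be submitted with spark-submit --queue streaming (or spark.yarn.queue=streaming).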

Thanks!

