Our quantitative team uses Spark as part of its daily work.  One of the
more common problems we run into is people unintentionally leaving their
shells open throughout the day.  This ties up memory in the cluster and
leaves others with limited resources to run their jobs.

With something like Hive, or with most client applications for SQL
databases, this isn't really an issue, but with Spark it's a significant
inconvenience for non-technical users.  Someone ends up posting in chat
throughout the day to check whether people are actually using their
shells, or to tell them to 'get off the cluster'.

Just wondering if anyone else has run into this kind of issue and how they
are managing it.  One idea we've had is to implement an 'idle timeout'
monitor for the shell, though on the surface this appears quite
challenging.  A rough sketch of the kind of thing we mean is below.
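
For concreteness, here's a very rough, untested sketch of such a monitor,
pasted into (or :load-ed by) the shell.  It assumes the shell's
SparkContext is bound to 'sc', it only treats running jobs as activity,
and the 30-minute timeout is just an example value:

  import java.util.concurrent.{Executors, TimeUnit}

  val idleTimeoutMs = 30 * 60 * 1000L  // example: stop after 30 idle minutes
  @volatile var lastActive = System.currentTimeMillis()

  val monitor = Executors.newSingleThreadScheduledExecutor()
  monitor.scheduleAtFixedRate(new Runnable {
    def run(): Unit = {
      // Any running job counts as activity; typing at the prompt does not,
      // so a real version would also need to hook the REPL's input loop.
      if (sc.statusTracker.getActiveJobIds().nonEmpty) {
        lastActive = System.currentTimeMillis()
      } else if (System.currentTimeMillis() - lastActive > idleTimeoutMs) {
        System.err.println("Shell idle past timeout; releasing resources.")
        sc.stop()          // frees the shell's executors
        monitor.shutdown()
      }
    }
  }, 1, 1, TimeUnit.MINUTES)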
