Hello. We recently had a user who ran an ad-hoc Hive query containing a JOIN clause with no ON predicate. The join produced a huge result set (effectively a Cartesian product) that filled up our HDFS storage.
I am wondering what features and strategies exist to limit the damage that ad-hoc queries like this can do. I have looked at HDFS quotas, which might be of some help. Is there anything else? Any tips and good links would be appreciated.

Best regards,
Thomas
