I would recommend looking at Hive strict mode first
(hive.mapred.mode=strict). It makes users think about their queries by
rejecting certain operations that can be unexpectedly expensive, such as
a full scan over a partitioned fact table when only a subset of the
partitions is needed, or a join with no ON predicate, which is the
Cartesian-product case you ran into.

http://my.safaribooksonline.com/book/databases/hadoop/9781449326944/10dot-tuning/strict_mode_tuning_html
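
As a rough sketch of what that looks like in a session (the table and
column names here are made up for illustration, and exact behaviour can
differ between Hive versions):

```sql
-- Turn on strict mode for the current session.
SET hive.mapred.mode=strict;

-- Hypothetical tables. With strict mode on, a join with no ON predicate
-- (a Cartesian product) should be rejected at compile time rather than run.
SELECT * FROM orders o JOIN customers c;

-- A query on a partitioned table has to filter on a partition column.
-- Here fact_events is assumed to be partitioned by dt.
SELECT * FROM fact_events WHERE dt = '2014-04-23';

-- ORDER BY must be paired with a LIMIT in strict mode.
SELECT * FROM fact_events WHERE dt = '2014-04-23' ORDER BY event_time LIMIT 100;
```

Setting the same property in hive-site.xml makes it the default for
everyone, although depending on how your cluster is configured users may
still be able to switch it back per session, so treat it as a guard rail
rather than a hard limit.
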
On Apr 23, 2014 9:51 AM, "Thomas Larsson" <[email protected]> wrote:

> Hello.
>
> We recently had a user who ran an ad-hoc Hive query with a JOIN clause
> without an ON predicate, producing a huge result set that filled up
> our HDFS storage.
>
> I am wondering what support and strategies there are to help limit the
> damage that ad-hoc queries like this can do. I have looked at HDFS quotas,
> which might be of some help. Is there anything else?
>
> Any tips and good links would be appreciated.
>
> Best Regards
> Thomas
>
>
