Re: Hash Aggregate Memory usage

2016-06-01 Thread rahul challapalli
Thanks for the information, Jacques. Based on the above formula from Jacques, the hash agg operator should not be using ~33MB of memory when we have only 1 group (and 7 varchar columns in the group). As per Aman's suggestion, I tried using fixed-width columns in the group by and the memory usage

Re: Hash Aggregate Memory usage

2016-05-27 Thread Jacques Nadeau
A year or so ago I gave a presentation at the MapR sales kickoff that covered the memory characteristics of operators. Unfortunately, I don't have access to the content, but hopefully someone internal to MapR has it (maybe Ellen or Neeraja). Approximately (from memory):

Re: Hash Aggregate Memory usage

2016-05-27 Thread Aman Sinha
Rahul, can you send me the query profile separately? Also, can you try a group-by on fixed-width columns instead of Varchar? With a single group, the hash table itself should be consuming a relatively small amount of memory. On Fri, May 27, 2016 at 11:14 AM, Zelaine Fong wrote:

Re: Hash Aggregate Memory usage

2016-05-27 Thread Zelaine Fong
My guess would be that for hashing, a hash table is pre-allocated based on the number of keys in the hash. That would explain why with more keys, the memory usage grows. But that's just my guess. Someone who really understands how this works should chime in :). -- Zelaine On Fri, May 27, 2016

Re: Hash Aggregate Memory usage

2016-05-27 Thread rahul challapalli
Any inputs on this one?
On Wed, May 25, 2016 at 7:51 PM, rahul challapalli <challapallira...@gmail.com> wrote:
> It's using hash aggregation.
> On May 25, 2016 7:48 PM, "Zelaine Fong" wrote:
>> What does the explain plan show? I.e., is the group by being done via a

Re: Hash Aggregate Memory usage

2016-05-25 Thread rahul challapalli
It's using hash aggregation.
On May 25, 2016 7:48 PM, "Zelaine Fong" wrote:
> What does the explain plan show? I.e., is the group by being done via a
> hash agg or a streaming agg? If it's a streaming agg, then you still have
> to sort the entire data set before you reduce

Re: Hash Aggregate Memory usage

2016-05-25 Thread Zelaine Fong
Oops, my bad. I just noticed you did indicate that the query plan shows usage of a hash agg.
-- Zelaine
On Wed, May 25, 2016 at 7:48 PM, Zelaine Fong wrote:
> What does the explain plan show? I.e., is the group by being done via a
> hash agg or a streaming agg? If it's a

Re: Hash Aggregate Memory usage

2016-05-25 Thread Zelaine Fong
What does the explain plan show? I.e., is the group by being done via a hash agg or a streaming agg? If it's a streaming agg, then you still have to sort the entire data set before you reduce it down to a single group. That would explain the increase in memory as you add group by keys. --
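
For anyone following along, here is a rough Python sketch of the difference between the two strategies mentioned above. It is only an illustration under my own assumptions, not how Drill actually implements either operator: the function names `hash_aggregate` and `streaming_aggregate` and the sample rows are made up. The point is that the hash version keeps one entry per distinct group, while the streaming version has to sort the whole input first, even if everything collapses into a single group.

```python
from itertools import groupby


def hash_aggregate(rows, key_cols):
    """Count rows per group with a hash table (hypothetical helper).

    Only one entry per distinct group key is retained.
    """
    counts = {}
    for row in rows:
        key = tuple(row[c] for c in key_cols)
        counts[key] = counts.get(key, 0) + 1
    return counts


def streaming_aggregate(rows, key_cols):
    """Count rows per group by sorting first (hypothetical helper).

    The entire input is sorted on the group keys before any reduction,
    so every row is held at once regardless of how few groups exist.
    """
    def key_fn(row):
        return tuple(row[c] for c in key_cols)

    ordered = sorted(rows, key=key_fn)  # full sort of the data set
    return {key: sum(1 for _ in group)
            for key, group in groupby(ordered, key=key_fn)}


# Made-up sample data: three rows that all fall into a single group.
rows = [
    {"a": "x", "b": 1},
    {"a": "x", "b": 1},
    {"a": "x", "b": 1},
]

hash_result = hash_aggregate(rows, ["a", "b"])
stream_result = streaming_aggregate(rows, ["a", "b"])
```

Both return the same counts; they differ only in how much of the input has to be held while computing them.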