On Sat, Mar 7, 2009 at 2:35 AM, Doug Judd <[email protected]> wrote:
> Well, what exactly is the workload?  Are you inserting data?  Are you doing
> queries?  How many queries per second?  Are they long scans or individual
> cell lookups?  That might tell us why the blocks in the cache are pinned.

In the initial phase I'm doing a full table scan, which gives 100% CPU
utilization on one of the 4 CPUs. I'm caching the data in memory to avoid
hitting the table again in later phases.

In the second phase I'm doing queries only (on a different table than in
the previous phase). These are single-row (and in most cases single-cell)
lookups. The load is about 70% on one of the 4 CPUs.

This phase basically works by passing over a table containing text data,
breaking the text up into tokens, and computing TF-IDF for every word.
I'm again using some caching to reduce the number of times the database
is consulted, but the crash occurs before this cache has a chance to
work at all.

The TF-IDF computation works by consulting two tables: it creates two
scanners and performs exactly 2 single-cell lookups on these tables.
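
For context, here is roughly what two single-cell lookups of that kind look
like against the Hypertable C++ client API. This is only an illustrative
sketch: the install path, table names, and column names are made up, not
taken from Mateusz's actual program.

    #include <iostream>
    #include <string>

    #include "Common/Compat.h"
    #include "Hypertable/Lib/Client.h"

    using namespace Hypertable;

    // Fetch a single cell (one row, one column) and return its value as a
    // string; returns an empty string if the cell does not exist.
    static std::string lookup_cell(TablePtr &table, const char *row,
                                   const char *column) {
      ScanSpecBuilder ssb;
      ssb.add_cell(row, column);   // restrict the scan to exactly one cell
      TableScannerPtr scanner = table->create_scanner(ssb.get());
      Cell cell;
      if (scanner->next(cell))
        return std::string(reinterpret_cast<const char *>(cell.value),
                           cell.value_len);
      return std::string();
    }

    int main() {
      // Install directory is an assumption; adjust for the local setup.
      ClientPtr client = new Client("/opt/hypertable/current");
      TablePtr term_freq = client->open_table("TermFreq");   // hypothetical
      TablePtr doc_freq  = client->open_table("DocFreq");    // hypothetical

      // Two scanners, two single-cell lookups per token, as described above.
      std::string tf = lookup_cell(term_freq, "doc42", "count:hello");
      std::string df = lookup_cell(doc_freq,  "hello", "count");

      std::cout << "tf=" << tf << " df=" << df << std::endl;
      return 0;
    }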

The query rate is not high; I'd guess about 4-10 queries per second.

Mateusz


>
> On Fri, Mar 6, 2009 at 5:29 PM, Mateusz Berezecki <[email protected]> wrote:
>>
>> On Sat, Mar 7, 2009 at 2:08 AM, Doug Judd <[email protected]> wrote:
>> > In the HT_FATALF statement, print out the size of the block that is
>> > being inserted into the block cache.  If it is larger than 200MB, then
>> > that would be the reason.  If the size of the block looks reasonable,
>> > maybe you can add a method to the BlockCache called something like
>> > tell_me_why_insert_and_checkout_is_failing(), which is an exact copy of
>> > insert_and_checkout but prints out a bunch of diagnostic information
>> > as to why it cannot insert the object into the cache.  Then, right
>> > before the call to HT_FATALF, call this method and see what it prints
>> > out.
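
A rough sketch of what such a diagnostic method might look like, using the
member names visible in the FileBlockCache snippet quoted further down
(m_cache, m_avail_memory, ref_count, length). The method name comes from
Doug's suggestion; the signature and output format are assumptions, not
code from the tree.

    // Walks the cache the same way the eviction loop does, but only reports
    // how much memory is pinned (ref_count > 0) versus evictable, so the
    // caller can see why insert_and_checkout() cannot make room.
    void FileBlockCache::tell_me_why_insert_and_checkout_is_failing(uint32_t length) {
      uint64_t pinned_bytes = 0, evictable_bytes = 0;
      size_t pinned_blocks = 0, evictable_blocks = 0;

      for (BlockCache::iterator iter = m_cache.begin();
           iter != m_cache.end(); ++iter) {
        if ((*iter).ref_count == 0) {
          evictable_blocks++;
          evictable_bytes += (*iter).length;
        }
        else {
          pinned_blocks++;
          pinned_bytes += (*iter).length;
        }
      }

      HT_ERROR_OUT << "insert_and_checkout: need " << length
                   << " bytes, available " << m_avail_memory
                   << ", evictable " << evictable_blocks << " blocks ("
                   << evictable_bytes << " bytes), pinned "
                   << pinned_blocks << " blocks ("
                   << pinned_bytes << " bytes)" << HT_END;
    }
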
>>
>> OK, so it's running out of memory for some strange reason, but
>> surprisingly the RangeServer process is only at 15.1% of total memory
>> utilization:
>>
>> 1236389033 ERROR Hypertable.RangeServer : insert_and_checkout
>>   (/home/mateusz/hypertable/src/cc/Hypertable/RangeServer/FileBlockCache.cc:114):
>>   available memory : 303465
>> 1236389033 ERROR Hypertable.RangeServer : insert_and_checkout
>>   (/home/mateusz/hypertable/src/cc/Hypertable/RangeServer/FileBlockCache.cc:115):
>>   length : 996243
>>
>> It looks like the part of the code below the // make room comment
>> is unable to free enough room, and the only way that can happen is by
>> reaching the iter == m_cache.end() condition before enough space has
>> been freed.
>>
>>  95   // make room
>>  96   if (m_avail_memory < length) {
>>  97     BlockCache::iterator iter = m_cache.begin();
>>  98     while (iter != m_cache.end()) {
>>  99       if ((*iter).ref_count == 0) {
>> 100         m_avail_memory += (*iter).length;
>> 101         delete [] (*iter).block;
>> 102         iter = m_cache.erase(iter);
>> 103         if (m_avail_memory >= length)
>> 104           break;
>> 105       }
>> 106       else
>> 107         ++iter;
>> 108     }
>> 109   }
>> 110
>> 111   if (m_avail_memory < length)
>> 112   {
>> 113     HT_ERROR_OUT << "Out of MEMORY!" << HT_END;
>> 114     HT_ERROR_OUT << "available memory : " << m_avail_memory << HT_END;
>> 115     HT_ERROR_OUT << "length : " << length << HT_END;
>> 116     return false;
>> 117   }
>>
>> Is there anything I can do to work around this?
>>
>> Mateusz
