Hey Doug,

Yes, that's exactly what was happening.  I've since rebuilt everything
with tcmalloc/google-perftools according to the docs, and memory usage
has become more manageable, but I still see high consumption and
eventual memory exhaustion during heavy updates.
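
For reference, here's roughly the sequence I used for the rebuild (the
package name and paths below are from memory, so adjust for your
distro and source/build directories):

  sudo apt-get install libgoogle-perftools-dev   # provides tcmalloc
  cd ~/build/hypertable
  cmake ~/src/hypertable    # cmake found tcmalloc on its own on my box
  make && make install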

A new problem I've encountered with the tcmalloc-built binaries is
that the ThriftBroker hangs soon after it completes some random number
of reads or updates, usually within a minute or two of activity.  I
tried using the non-tcmalloc ThriftBroker binary with the currently
running tcmalloc master/rangeservers/kosmosbrokers and it still hung.
I'm going to go back and start a fresh Hypertable instance with the
non-tcmalloc binaries for everything to see if the problem goes away.
It could also be that some changes to our app code are causing the
ThriftBroker hangs; we'll see.
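
For the archives: the skip-errors property Doug mentions below just
goes in the standard key=value config file (I'm assuming the usual
conf/hypertable.cfg location here):

  # tell the RangeServer to skip commit log errors instead of dying
  Hypertable.CommitLog.SkipErrors=true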

Thanks for the update btw! :-)

Josh

On Wed, Apr 15, 2009 at 9:31 PM, Doug Judd <[email protected]> wrote:
> Hi Josh,
>
> Is it possible that the system underwent heavy update activity during that
> time period?  We don't have request throttling in place yet (should be out
> next week), so it is possible for the RangeServer to exhaust memory under
> heavy update workloads.  It looks like the commit log got
> truncated/corrupted when the machine died.  You can tell the RangeServer to
> skip commit log errors with the following property:
>
> Hypertable.CommitLog.SkipErrors=true
>
> The data in the commit log that gets skipped will most likely be lost.
>
> - Doug
>
> On Mon, Apr 13, 2009 at 1:10 PM, Josh Adams <[email protected]> wrote:
>>
>> On Mon, Apr 13, 2009 at 9:58 AM, Doug Judd <[email protected]> wrote:
>> > No, it shouldn't.  One thing that might help is to install tcmalloc
>> > (google-perftools) and then re-build.  You'll need to have tcmalloc
>> > installed in all your runtime environments.
>>
>> Ok thanks, I'll try that out hopefully this week and let you know.
>>
>> > 157 on it a while back.  It would be interesting to know if the disk
>> > subsystems on any of your machines are getting saturated during this low
>> > throughput condition.  If so, then there probably is not much we can do
>>
>> Good point, I'll keep an eye on that.
>>
>> I was out of town on a short trip over the weekend and I wasn't
>> watching our Hypertable instance very closely.  During the early
>> morning hours on Saturday, it looks like each of the four machines
>> running RangeServer/kosmosBroker/ThriftBroker saw its memory usage
>> spike heavily for about an hour.  The root RangeServer started swapping and
>> the machine went down later that day.  I can't start the instance back
>> up at the moment because the root RangeServer is complaining about
>> this error and dies when I try starting it:
>>
>> 1239651998 ERROR Hypertable.RangeServer : load_next_valid_header (/data/tmp/dev/src/hypertable/6d5fdd1/src/cc/Hypertable/Lib/CommitLogBlockStream.cc:148): Hypertable::Exception: Error reading 34 bytes from DFS fd 1057 - HYPERTABLE failed expectation
>>        at virtual size_t Hypertable::DfsBroker::Client::read(int32_t, void*, size_t) (/data/tmp/dev/src/hypertable/6d5fdd1/src/cc/DfsBroker/Lib/Client.cc:258)
>>        at size_t Hypertable::ClientBufferedReaderHandler::read(void*, size_t) (/data/tmp/dev/src/hypertable/6d5fdd1/src/cc/DfsBroker/Lib/ClientBufferedReaderHandler.cc:161): empty queue
>>
>> I've attached a file containing the relevant errors at the end of its
>> log and also the whole kosmosBroker log file for that startup attempt.
>>
>> Cheers,
>> Josh
>>
>>
