Stack traces are here: https://gist.github.com/kbzod/e6e21ea15cf5670ba534
This time something showed up in the monitor; often there is no stack trace
there. The thread dump is from setting ACCUMULO_KILL_CMD to "kill -3 %p".

Thanks again
Bill

On Tue, May 27, 2014 at 5:09 PM, Bill Havanki <[email protected]> wrote:

> I left the default key size constraint in place. I had set the tserver
> message size up from 1 GB to 1.5 GB, but it didn't help. (I forgot that
> config item.)
>
> Stack trace(s) coming up! I got tired of failures all day so I'm running a
> different test that will hopefully work. I'll re-break it shortly :D
>
>
> On Tue, May 27, 2014 at 5:04 PM, Josh Elser <[email protected]> wrote:
>
>> Stack traces would definitely be helpful, IMO.
>>
>> (or interesting if nothing else :D)
>>
>>
>> On 5/27/14, 4:55 PM, Bill Havanki wrote:
>>
>>> No sir. I am seeing general out-of-heap-space messages, nothing about
>>> direct buffers. One specific example would be while Thrift is writing
>>> to a ByteArrayOutputStream to send off scan results. (I can get an
>>> exact stack trace - easily :} - if it would be helpful.) It seems as if
>>> there just isn't enough heap left, after controlling for what I have so
>>> far.
>>>
>>> As a clarification of my original email: each row has 100 cells, and
>>> each cell has a 100 MB value. So, one row would occupy just over 10 GB.
>>>
>>>
>>> On Tue, May 27, 2014 at 4:49 PM, <[email protected]> wrote:
>>>
>>>> Are you seeing something similar to the error in
>>>> https://issues.apache.org/jira/browse/ACCUMULO-2495?
>>>>
>>>> ----- Original Message -----
>>>>
>>>> From: "Bill Havanki" <[email protected]>
>>>> To: "Accumulo Dev List" <[email protected]>
>>>> Sent: Tuesday, May 27, 2014 4:30:59 PM
>>>> Subject: Supporting large values
>>>>
>>>> I'm trying to run a stress test where each row in a table has 100
>>>> cells, each with a value of 100 MB of random data. (This is using Bill
>>>> Slacum's memory stress test tool.) Despite fiddling with the cluster
>>>> configuration, I always run out of tablet server heap space before too
>>>> long.
>>>>
>>>> Here are the configurations I've tried so far, with valuable guidance
>>>> from Busbey and madrob:
>>>>
>>>> - native maps are enabled, tserver.memory.maps.max = 8G
>>>> - table.compaction.minor.logs.threshold = 8
>>>> - tserver.walog.max.size = 1G
>>>> - tablet server has a 4G heap (-Xmx4g)
>>>> - table is pre-split into 8 tablets (split points 0x20, 0x40, 0x60,
>>>>   ...); 5 tablet servers are available
>>>> - tserver.cache.data.size = 256M
>>>> - tserver.cache.index.size = 40M (keys are small - 4 bytes - in this
>>>>   test)
>>>> - table.scan.max.memory = 256M
>>>> - tserver.readahead.concurrent.max = 4 (default is 16)
>>>>
>>>> It's often hard to tell where the OOM error comes from, but I have
>>>> frequently seen it coming from Thrift as it is writing out scan
>>>> results.
>>>>
>>>> Does anyone have any good conventions for supporting large values?
>>>> (Warning: I'll want to work on large keys (and tiny values) next! :) )
>>>>
>>>> Thanks very much
>>>> Bill
>>>>
>>>> --
>>>> // Bill Havanki
>>>> // Solutions Architect, Cloudera Govt Solutions
>>>> // 443.686.9283
>>>
>>
>
> --
> // Bill Havanki
> // Solutions Architect, Cloudera Govt Solutions
> // 443.686.9283

--
// Bill Havanki
// Solutions Architect, Cloudera Govt Solutions
// 443.686.9283
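[Editor's note: for readers reproducing this setup, here is a sketch of how the tserver-side settings listed in the original message would look in accumulo-site.xml. The values simply restate the thread, not recommendations; per-table properties such as table.scan.max.memory and table.compaction.minor.logs.threshold would instead be set at runtime, e.g. with the shell's `config -t <table> -s <property>=<value>` command.]

```xml
<!-- accumulo-site.xml sketch restating the thread's tserver settings -->
<property>
  <name>tserver.memory.maps.max</name>
  <value>8G</value>
</property>
<property>
  <name>tserver.walog.max.size</name>
  <value>1G</value>
</property>
<property>
  <name>tserver.cache.data.size</name>
  <value>256M</value>
</property>
<property>
  <name>tserver.cache.index.size</name>
  <value>40M</value>
</property>
<property>
  <name>tserver.readahead.concurrent.max</name>
  <value>4</value>
</property>
```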

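[Editor's note: the numbers quoted in the thread make the heap pressure easy to check. A back-of-the-envelope sketch follows; the 2x factor for Thrift serialization is an assumption (the batch held once in the scan buffer plus one copy in the ByteArrayOutputStream), not a measured figure.]

```python
# Rough heap arithmetic for the test described in the thread.
MB = 1024 * 1024
GB = 1024 * MB

cells_per_row = 100
value_size = 100 * MB                 # 100 MB per cell
row_size = cells_per_row * value_size # one full row, ~10 GB

scan_batch = 256 * MB                 # table.scan.max.memory
readahead = 4                         # tserver.readahead.concurrent.max
# ASSUMPTION: each in-flight scan roughly doubles its batch in heap
# while Thrift copies it into a ByteArrayOutputStream.
inflight = readahead * scan_batch * 2

heap = 4 * GB                         # -Xmx4g

print(f"one row:         {row_size / GB:.1f} GiB")
print(f"in-flight scans: {inflight / GB:.1f} GiB of a {heap / GB:.0f} GiB heap")
```

Under these assumptions, concurrent scans alone can claim half of the 4 GB heap before caches, the in-memory map overflow path, or anything else is counted, which is consistent with the OOMs appearing in Thrift's scan-result serialization.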