[ https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13115726#comment-13115726 ]

Mck SembWever commented on CASSANDRA-3150:
------------------------------------------

bq. you haven't done any messing with index_interval by chance?
Once, around the time this issue was created. I tried two smaller values (the 
first attempt gave an OOM), but it didn't change much. The value is now back at 
the original 128.

bq. So how many memtables do you have at once and how many rows can fit in a 
memtable?

Skinny rows, three columns: type, timestamp, data. type is a short string and 
indexed, timestamp is a long and indexed, and data is a Thrift-serialised bean 
of 100 bytes to 1 KB.

The memtable threshold is set to 1024 (chosen since we're using Xmx8g and 
didn't want a huge number of sstable files).
So I guess that means we could fit up to ~5 million rows in a memtable...
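A back-of-envelope check of that estimate (a sketch only: the 1024 MB threshold and the ~200-byte average row size are my assumptions, not figures from the ticket):

```java
// Back-of-envelope row estimate; both inputs are assumptions, not ticket facts.
public class MemtableEstimate {
    // memtableMb: assumed memtable threshold in MB;
    // avgRowBytes: guessed average row size including per-column overhead
    static long rowsPerMemtable(long memtableMb, long avgRowBytes) {
        return memtableMb * 1024 * 1024 / avgRowBytes;
    }

    public static void main(String[] args) {
        // 1024 MB threshold, ~200 bytes per row (type + timestamp + small data bean)
        System.out.println(rowsPerMemtable(1024, 200)); // prints 5368709
    }
}
```

So with these guessed sizes, roughly 5.4 million rows per memtable, consistent with the ~5 million estimate above.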

But as I wrote above, neither a flush nor a compact helped the situation (in 
fact each Hadoop job starts and finishes with a flush), at least not with 0.8.4 
or 0.8.5. It now seems much better with 0.8.6: I still see tasks go over 100% 
through the day, but the overrun doesn't worsen as badly as it did before.
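For context, the paging pattern at issue looks roughly like this (a simplified sketch, not the actual ColumnFormatRecordReader code; all names are illustrative, and I assume each page returns rows strictly after the start key). The reader pages through a split by re-issuing range queries from the last key seen, so a split whose end token is out of whack, e.g. one that wraps the whole ring, keeps the loop running until every row has been read:

```java
import java.util.List;

// Illustrative sketch of token-range paging, not the real CFRR code.
class PagingSketch {
    // Stand-in for a get_range_slices-style call; assumed to return
    // rows strictly after `start`, up to `count` of them, bounded by `end`.
    interface RangeClient {
        List<String> getRangeSlice(String start, String end, int count);
    }

    static long readSplit(RangeClient client, String startToken,
                          String endToken, int pageSize) {
        long rows = 0;
        String start = startToken;
        while (true) {
            List<String> page = client.getRangeSlice(start, endToken, pageSize);
            rows += page.size();
            // A short page means the split is exhausted.
            if (page.size() < pageSize) return rows;
            // Resume from the last key seen. If endToken is wrong (e.g. the
            // split wraps the whole ring), this terminates only after the
            // reader has paged through every row in the cluster.
            start = page.get(page.size() - 1);
        }
    }
}
```

That matches the symptom reported: a handful of map tasks reading tens of millions of rows from splits that should have held ~196608.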
                
> ColumnFormatRecordReader loops forever (StorageService.getSplits(..) out of 
> whack)
> ----------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-3150
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3150
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>    Affects Versions: 0.8.4, 0.8.5
>            Reporter: Mck SembWever
>            Assignee: Mck SembWever
>            Priority: Critical
>             Fix For: 0.8.6
>
>         Attachments: CASSANDRA-3150.patch, Screenshot-Counters for 
> task_201109212019_1060_m_000029 - Mozilla Firefox.png, Screenshot-Hadoop map 
> task list for job_201109212019_1060 on cassandra01 - Mozilla Firefox.png, 
> attempt_201109071357_0044_m_003040_0.grep-get_range_slices.log, 
> fullscan-example1.log
>
>
> From http://thread.gmane.org/gmane.comp.db.cassandra.user/20039
> {quote}
> bq. Cassandra-0.8.4 w/ ByteOrderedPartitioner
> bq. CFIF's inputSplitSize=196608
> bq. 3 map tasks (from 4013) is still running after read 25 million rows.
> bq. Can this be a bug in StorageService.getSplits(..) ?
> getSplits looks pretty foolproof to me but I guess we'd need to add
> more debug logging to rule out a bug there for sure.
> I guess the main alternative would be a bug in the recordreader paging.
> {quote}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
