[jira] [Issue Comment Edited] (CASSANDRA-3150) ColumnFormatRecordReader loops forever (StorageService.getSplits(..) out of whack)

2011-10-30 Thread Mck SembWever (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13139728#comment-13139728
 ] 

Mck SembWever edited comment on CASSANDRA-3150 at 10/30/11 7:46 PM:


Back after an upgrade to cassandra-1.0.0

Example job start logs:
{noformat}
[ INFO] 20:39:21  Restricting input range: 3589a548d20f80a7b41368b59973bcbc -- 36f0bedaf02a49a3b41368b59973bcbc []  at no.finntech.countstats.reduce.rolled.internal.GenericCountAggregation.configureCountToAggregateMapper(GenericCountAggregation.java:222)
[ INFO] 20:39:21  Corresponding time range is Sun Oct 30 00:00:00 CEST 2011 (131992560) -- Sun Oct 30 23:00:00 CET 2011 (132001200) []  at no.finntech.countstats.reduce.rolled.internal.GenericCountAggregation.configureCountToAggregateMapper(GenericCountAggregation.java:225)
[ INFO] 20:39:21  Starting AdIdCountAggregation-phase1-DAY_2011303 ( to=1320002999000)) []  at no.finntech.countstats.reduce.rolled.internal.GenericCountAggregation.run(GenericCountAggregation.java:142)
[DEBUG] 20:39:21  adding ColumnFamilySplit{startToken='3589a548d20f80a7b41368b59973bcbc', endToken='36f0bedaf02a49a3b41368b59973bcbc', dataNodes=[0.0.0.0]} []  at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:210)
{noformat}

This split in fact contains ~40 million rows, so with a split size of 393216 roughly 100 splits were expected to be returned.
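
As a rough check on that expectation, using only the numbers above:
{noformat}
40,000,000 rows / 393,216 rows per split = ~102 splits expected
{noformat}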

I'm also very confused by the {{dataNodes=[0.0.0.0]}}. This looks wrong: I know the data lies on the third node (cassandra03), whereas the job is running on the first node (cassandra01).

  was (Author: michaelsembwever):
Back after an upgrade to cassandra-1.0.0

Example job start logs{noformat}[ INFO] 20:39:21  Restricting input range: 
3589a548d20f80a7b41368b59973bcbc -- 36f0bedaf02a49a3b41368b59973bcbc []  at 
no.finntech.countstats.reduce.rolled.internal.GenericCountAggregation.configureCountToAggregateMapper(GenericCountAggregation.java:222)
[ INFO] 20:39:21  Corresponding time range is Sun Oct 30 00:00:00 CEST 2011 
(131992560) -- Sun Oct 30 23:00:00 CET 2011 (132001200) []  at 
no.finntech.countstats.reduce.rolled.internal.GenericCountAggregation.configureCountToAggregateMapper(GenericCountAggregation.java:225)
[ INFO] 20:39:21  Starting AdIdCountAggregation-phase1-DAY_2011303 ( 
to=1320002999000)) []  at 
no.finntech.countstats.reduce.rolled.internal.GenericCountAggregation.run(GenericCountAggregation.java:142)
[DEBUG] 20:39:21  adding 
ColumnFamilySplit{startToken='3589a548d20f80a7b41368b59973bcbc', 
endToken='36f0bedaf02a49a3b41368b59973bcbc', dataNodes=[0.0.0.0]} []  at 
org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:210)
{noformat}

In this split there is in fact ~40 million rows, and with a split size of 
393216, it is expected ~100 splits to be returned.

I'm also very confused by the {{dataNodes=[0.0.0.0]}}. this looks to be wrong, 
i know that data lies on the third node: cassandra03, where-else the job is 
running on the first node: cassandra01.
  
> ColumnFormatRecordReader loops forever (StorageService.getSplits(..) out of 
> whack)
> --
>
> Key: CASSANDRA-3150
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3150
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Affects Versions: 0.8.4, 0.8.5
>Reporter: Mck SembWever
>Assignee: Mck SembWever
>Priority: Critical
> Fix For: 0.8.6
>
> Attachments: CASSANDRA-3150.patch, Screenshot-Counters for 
> task_201109212019_1060_m_29 - Mozilla Firefox.png, Screenshot-Hadoop map 
> task list for job_201109212019_1060 on cassandra01 - Mozilla Firefox.png, 
> attempt_201109071357_0044_m_003040_0.grep-get_range_slices.log, 
> fullscan-example1.log
>
>
> From http://thread.gmane.org/gmane.comp.db.cassandra.user/20039
> {quote}
> bq. Cassandra-0.8.4 w/ ByteOrderedPartitioner
> bq. CFIF's inputSplitSize=196608
> bq. 3 map tasks (from 4013) is still running after read 25 million rows.
> bq. Can this be a bug in StorageService.getSplits(..) ?
> getSplits looks pretty foolproof to me but I guess we'd need to add
> more debug logging to rule out a bug there for sure.
> I guess the main alternative would be a bug in the recordreader paging.
> {quote}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-3150) ColumnFormatRecordReader loops forever (StorageService.getSplits(..) out of whack)

2011-09-27 Thread Mck SembWever (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13115278#comment-13115278
 ] 

Mck SembWever edited comment on CASSANDRA-3150 at 9/27/11 7:14 AM:
---

No (the title is incorrect).

That split contains millions of rows, despite StorageService.getSplits(..) being called with splitSize 393216.
In the worst situations the split has 20 million rows in it. These were all valid and all had to be processed for the Hadoop job to produce accurate results.
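
For scale, comparing that worst case against the requested split size:
{noformat}
20,000,000 rows / 393,216 rows per split = ~51, i.e. a single split holding ~50x the requested row count
{noformat}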

  was (Author: michaelsembwever):
No. (the title is incorrect).

That split contains millions of rows, despite StorageService.getSplits(..) 
being called with splitSize 393216.
In the worse situations the split has 20 millions rows in it. This were all 
valid are had to be process for the hadoop job to produce accurate results.
  
> ColumnFormatRecordReader loops forever (StorageService.getSplits(..) out of 
> whack)
> --
>
> Key: CASSANDRA-3150
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3150
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Affects Versions: 0.8.4, 0.8.5
>Reporter: Mck SembWever
>Assignee: Mck SembWever
>Priority: Critical
> Fix For: 0.8.6
>
> Attachments: CASSANDRA-3150.patch, Screenshot-Counters for 
> task_201109212019_1060_m_29 - Mozilla Firefox.png, Screenshot-Hadoop map 
> task list for job_201109212019_1060 on cassandra01 - Mozilla Firefox.png, 
> attempt_201109071357_0044_m_003040_0.grep-get_range_slices.log, 
> fullscan-example1.log
>
>
> From http://thread.gmane.org/gmane.comp.db.cassandra.user/20039
> {quote}
> bq. Cassandra-0.8.4 w/ ByteOrderedPartitioner
> bq. CFIF's inputSplitSize=196608
> bq. 3 map tasks (from 4013) is still running after read 25 million rows.
> bq. Can this be a bug in StorageService.getSplits(..) ?
> getSplits looks pretty foolproof to me but I guess we'd need to add
> more debug logging to rule out a bug there for sure.
> I guess the main alternative would be a bug in the recordreader paging.
> {quote}





[jira] [Issue Comment Edited] (CASSANDRA-3150) ColumnFormatRecordReader loops forever (StorageService.getSplits(..) out of whack)

2011-09-26 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13114220#comment-13114220
 ] 

Mck SembWever edited comment on CASSANDRA-3150 at 9/26/11 8:29 AM:
---

{{fullscan-example1.log}} is debug from a "full scan" job. It scans data over a 
full year (and since the cluster's ring range only holds 3 months of data such 
a job guarantees a full scan).

In the debug you see the splits.
{{`nodetool ring`}} gives

{noformat}
Address         DC   Rack  Status  State   Load      Owns    Token
                                                             Token(bytes[5554])
152.90.241.22   DC1  RAC1  Up      Normal  16.65 GB  33.33%  Token(bytes[00])
152.90.241.23   DC2  RAC1  Up      Normal  63.22 GB  33.33%  Token(bytes[2aaa])
152.90.241.24   DC1  RAC1  Up      Normal  72.4 KB   33.33%  Token(bytes[5554])
{noformat}

The problematic split ends up being
{noformat}
ColumnFamilySplit{startToken='0528cbe0b2b5ff6b816c68b59973bcbc', endToken='2aaa', dataNodes=[cassandra02.finn.no]}
{noformat}
This is the split that's receiving new data (5-10k rows/second). This new data is being written directly using {{StorageProxy.mutate(..)}} with code somewhat similar to the second example in [wiki: ScribeToCassandra|http://wiki.apache.org/cassandra/ScribeToCassandra].
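
A minimal sketch of that kind of direct write path, for orientation only (the keyspace, column family and class names are illustrative assumptions, not the actual job code, and the calls follow the 0.8-era internal API):
{noformat}
import java.util.Arrays;
import org.apache.cassandra.db.RowMutation;
import org.apache.cassandra.db.filter.QueryPath;
import org.apache.cassandra.service.StorageProxy;
import org.apache.cassandra.thrift.ConsistencyLevel;
import org.apache.cassandra.utils.ByteBufferUtil;

public class DirectCountWriter
{
    // Write a single column straight through StorageProxy, bypassing Thrift.
    // "CountKeyspace" and "Counts" are made-up names for this sketch.
    public static void writeColumn(String rowKey, String columnName, String value) throws Exception
    {
        RowMutation rm = new RowMutation("CountKeyspace", ByteBufferUtil.bytes(rowKey));
        rm.add(new QueryPath("Counts", null, ByteBufferUtil.bytes(columnName)),
               ByteBufferUtil.bytes(value),
               System.currentTimeMillis());          // timestamp for the column
        // Hand the mutation to the cluster at CL.ONE; the writing process is itself part of the ring.
        StorageProxy.mutate(Arrays.asList(rm), ConsistencyLevel.ONE);
    }
}
{noformat}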

This cluster has {{binary_memtable_throughput_in_mb: 1024}} and {{Xmx8g}}. There are 3 nodes in the cluster, each with 48g RAM and 24 CPUs.

  was (Author: michaelsembwever):
{{fullscan-example1.log}} is debug from a "full scan" job. It scans data 
over a full year (and since the cluster's ring range only holds 3 months of 
data such a job guarantees a full scan).

In the debug you see the splits.
{{`nodetool ring`}} gives

{noformat}
Address         DC   Rack  Status  State   Load      Owns    Token
                                                             Token(bytes[5554])
152.90.241.22   DC1  RAC1  Up      Normal  16.65 GB  33.33%  Token(bytes[00])
152.90.241.23   DC2  RAC1  Up      Normal  63.22 GB  33.33%  Token(bytes[2aaa])
152.90.241.24   DC1  RAC1  Up      Normal  72.4 KB   33.33%  Token(bytes[5554])
{noformat}

The problematic split ends up being 
{noformat}ColumnFamilySplit{startToken='0528cbe0b2b5ff6b816c68b59973bcbc', 
endToken='2aaa', 
dataNodes=[cassandra02.finn.no]}{noformat} This is the split that's receiving 
new data (5-10k rows/second). This new data is being written directly using 
{{StorageProxy.mutate(..)}} with code somewhat similar to the second example in 
[wiki: ScribeToCassandra|http://wiki.apache.org/cassandra/ScribeToCassandra].
  
> ColumnFormatRecordReader loops forever (StorageService.getSplits(..) out of 
> whack)
> --
>
> Key: CASSANDRA-3150
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3150
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Affects Versions: 0.8.4, 0.8.5
>Reporter: Mck SembWever
>Assignee: Mck SembWever
>Priority: Critical
> Fix For: 0.8.6
>
> Attachments: CASSANDRA-3150.patch, Screenshot-Counters for 
> task_201109212019_1060_m_29 - Mozilla Firefox.png, Screenshot-Hadoop map 
> task list for job_201109212019_1060 on cassandra01 - Mozilla Firefox.png, 
> attempt_201109071357_0044_m_003040_0.grep-get_range_slices.log, 
> fullscan-example1.log
>
>
> From http://thread.gmane.org/gmane.comp.db.cassandra.user/20039
> {quote}
> bq. Cassandra-0.8.4 w/ ByteOrderedPartitioner
> bq. CFIF's inputSplitSize=196608
> bq. 3 map tasks (from 4013) is still running after read 25 million rows.
> bq. Can this be a bug in StorageService.getSplits(..) ?
> getSplits looks pretty foolproof to me but I guess we'd need to add
> more debug logging to rule out a bug there for sure.
> I guess the main alternative would be a bug in the recordreader paging.
> {quote}





[jira] [Issue Comment Edited] (CASSANDRA-3150) ColumnFormatRecordReader loops forever (StorageService.getSplits(..) out of whack)

2011-09-25 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13114220#comment-13114220
 ] 

Mck SembWever edited comment on CASSANDRA-3150 at 9/25/11 11:32 AM:


{{fullscan-example1.log}} is debug from a "full scan" job. It scans data over a 
full year (and since the cluster's ring range only holds 3 months of data such 
a job guarantees a full scan).

In the debug you see the splits.
{{`nodetool ring`}} gives

{noformat}
Address         DC   Rack  Status  State   Load      Owns    Token
                                                             Token(bytes[5554])
152.90.241.22   DC1  RAC1  Up      Normal  16.65 GB  33.33%  Token(bytes[00])
152.90.241.23   DC2  RAC1  Up      Normal  63.22 GB  33.33%  Token(bytes[2aaa])
152.90.241.24   DC1  RAC1  Up      Normal  72.4 KB   33.33%  Token(bytes[5554])
{noformat}

The problematic split ends up being
{noformat}
ColumnFamilySplit{startToken='0528cbe0b2b5ff6b816c68b59973bcbc', endToken='2aaa', dataNodes=[cassandra02.finn.no]}
{noformat}
This is the split that's receiving new data (5-10k rows/second). This new data is being written directly using {{StorageProxy.mutate(..)}} with code somewhat similar to the second example in [wiki: ScribeToCassandra|http://wiki.apache.org/cassandra/ScribeToCassandra].

  was (Author: michaelsembwever):
{{fullscan-example1.log}} is debug from a "full scan" job. It scans data 
over a full year (and since the cluster's ring range only holds 3 months of 
data such a job guarantees a full scan).

In the debug you see the splits.
{{`nodetool ring`}} gives

{noformat}
Address         DC   Rack  Status  State   Load      Owns    Token
                                                             Token(bytes[5554])
152.90.241.22   DC1  RAC1  Up      Normal  16.65 GB  33.33%  Token(bytes[00])
152.90.241.23   DC2  RAC1  Up      Normal  63.22 GB  33.33%  Token(bytes[2aaa])
152.90.241.24   DC1  RAC1  Up      Normal  72.4 KB   33.33%  Token(bytes[5554])
{noformat}

The problematic split ends up being 
{noformat}ColumnFamilySplit{startToken='0528cbe0b2b5ff6b816c68b59973bcbc', 
endToken='2aaa', 
dataNodes=[cassandra02.finn.no]}{noformat} This is the split that's receiving 
new data (5-10k rows/second).
  
> ColumnFormatRecordReader loops forever (StorageService.getSplits(..) out of 
> whack)
> --
>
> Key: CASSANDRA-3150
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3150
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Affects Versions: 0.8.4, 0.8.5
>Reporter: Mck SembWever
>Assignee: Mck SembWever
>Priority: Critical
> Fix For: 0.8.6
>
> Attachments: CASSANDRA-3150.patch, Screenshot-Counters for 
> task_201109212019_1060_m_29 - Mozilla Firefox.png, Screenshot-Hadoop map 
> task list for job_201109212019_1060 on cassandra01 - Mozilla Firefox.png, 
> attempt_201109071357_0044_m_003040_0.grep-get_range_slices.log, 
> fullscan-example1.log
>
>
> From http://thread.gmane.org/gmane.comp.db.cassandra.user/20039
> {quote}
> bq. Cassandra-0.8.4 w/ ByteOrderedPartitioner
> bq. CFIF's inputSplitSize=196608
> bq. 3 map tasks (from 4013) is still running after read 25 million rows.
> bq. Can this be a bug in StorageService.getSplits(..) ?
> getSplits looks pretty foolproof to me but I guess we'd need to add
> more debug logging to rule out a bug there for sure.
> I guess the main alternative would be a bug in the recordreader paging.
> {quote}





[jira] [Issue Comment Edited] (CASSANDRA-3150) ColumnFormatRecordReader loops forever (StorageService.getSplits(..) out of whack)

2011-09-25 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13114220#comment-13114220
 ] 

Mck SembWever edited comment on CASSANDRA-3150 at 9/25/11 11:30 AM:


{{fullscan-example1.log}} is debug from a "full scan" job. It scans data over a 
full year (and since the cluster's ring range only holds 3 months of data such 
a job guarantees a full scan).

In the debug you see the splits.
{{`nodetool ring`}} gives

{noformat}
Address         DC   Rack  Status  State   Load      Owns    Token
                                                             Token(bytes[5554])
152.90.241.22   DC1  RAC1  Up      Normal  16.65 GB  33.33%  Token(bytes[00])
152.90.241.23   DC2  RAC1  Up      Normal  63.22 GB  33.33%  Token(bytes[2aaa])
152.90.241.24   DC1  RAC1  Up      Normal  72.4 KB   33.33%  Token(bytes[5554])
{noformat}

The problematic split ends up being
{noformat}
ColumnFamilySplit{startToken='0528cbe0b2b5ff6b816c68b59973bcbc', endToken='2aaa', dataNodes=[cassandra02.finn.no]}
{noformat}
This is the split that's receiving new data (5-10k rows/second).

  was (Author: michaelsembwever):
Here's debug from a "full scan" job. It scans data over a full year (and 
since the cluster's ring range only holds 3 months of data such a job 
guarantees a full scan).

In the debug you see the splits.
{{`nodetool ring`}} gives

{noformat}
Address         DC   Rack  Status  State   Load      Owns    Token
                                                             Token(bytes[5554])
152.90.241.22   DC1  RAC1  Up      Normal  16.65 GB  33.33%  Token(bytes[00])
152.90.241.23   DC2  RAC1  Up      Normal  63.22 GB  33.33%  Token(bytes[2aaa])
152.90.241.24   DC1  RAC1  Up      Normal  72.4 KB   33.33%  Token(bytes[5554])
{noformat}

The problematic split ends up being 
{noformat}ColumnFamilySplit{startToken='0528cbe0b2b5ff6b816c68b59973bcbc', 
endToken='2aaa', 
dataNodes=[cassandra02.finn.no]}{noformat} This is the split that's receiving 
new data (5-10k rows/second).
  
> ColumnFormatRecordReader loops forever (StorageService.getSplits(..) out of 
> whack)
> --
>
> Key: CASSANDRA-3150
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3150
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Affects Versions: 0.8.4, 0.8.5
>Reporter: Mck SembWever
>Assignee: Mck SembWever
>Priority: Critical
> Fix For: 0.8.6
>
> Attachments: CASSANDRA-3150.patch, Screenshot-Counters for 
> task_201109212019_1060_m_29 - Mozilla Firefox.png, Screenshot-Hadoop map 
> task list for job_201109212019_1060 on cassandra01 - Mozilla Firefox.png, 
> attempt_201109071357_0044_m_003040_0.grep-get_range_slices.log, 
> fullscan-example1.log
>
>
> From http://thread.gmane.org/gmane.comp.db.cassandra.user/20039
> {quote}
> bq. Cassandra-0.8.4 w/ ByteOrderedPartitioner
> bq. CFIF's inputSplitSize=196608
> bq. 3 map tasks (from 4013) is still running after read 25 million rows.
> bq. Can this be a bug in StorageService.getSplits(..) ?
> getSplits looks pretty foolproof to me but I guess we'd need to add
> more debug logging to rule out a bug there for sure.
> I guess the main alternative would be a bug in the recordreader paging.
> {quote}





[jira] [Issue Comment Edited] (CASSANDRA-3150) ColumnFormatRecordReader loops forever (StorageService.getSplits(..) out of whack)

2011-09-25 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13114220#comment-13114220
 ] 

Mck SembWever edited comment on CASSANDRA-3150 at 9/25/11 11:27 AM:


Here's debug from a "full scan" job. It scans data over a full year (and since 
the cluster's ring range only holds 3 months of data such a job guarantees a 
full scan).

In the debug you see the splits.
{{`nodetool ring`}} gives

{noformat}
Address         DC   Rack  Status  State   Load      Owns    Token
                                                             Token(bytes[5554])
152.90.241.22   DC1  RAC1  Up      Normal  16.65 GB  33.33%  Token(bytes[00])
152.90.241.23   DC2  RAC1  Up      Normal  63.22 GB  33.33%  Token(bytes[2aaa])
152.90.241.24   DC1  RAC1  Up      Normal  72.4 KB   33.33%  Token(bytes[5554])
{noformat}

The problematic split ends up being
{noformat}
ColumnFamilySplit{startToken='0528cbe0b2b5ff6b816c68b59973bcbc', endToken='2aaa', dataNodes=[cassandra02.finn.no]}
{noformat}
This is the split that's receiving new data (5-10k rows/second).

  was (Author: michaelsembwever):
Here's debug from a "full scan" job. It scans data over a full year (and 
since the cluster's ring range only holds 3 months of data such a job 
guarantees a full scan).

In the debug you see the splits.
{{`nodetool ring`}} gives

{noformat}
Address         DC   Rack  Status  State   Load      Owns    Token
                                                             Token(bytes[5554])
152.90.241.22   DC1  RAC1  Up      Normal  16.65 GB  33.33%  Token(bytes[00])
152.90.241.23   DC2  RAC1  Up      Normal  63.22 GB  33.33%  Token(bytes[2aaa])
152.90.241.24   DC1  RAC1  Up      Normal  72.4 KB   33.33%  Token(bytes[5554])
{noformat}

The problematic split ends up being 
{noformat}ColumnFamilySplit{startToken='0528cbe0b2b5ff6b816c68b59973bcbc', 
endToken='2aaa', 
dataNodes=[cassandra02.finn.no]}{noformat}
  
> ColumnFormatRecordReader loops forever (StorageService.getSplits(..) out of 
> whack)
> --
>
> Key: CASSANDRA-3150
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3150
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Affects Versions: 0.8.4, 0.8.5
>Reporter: Mck SembWever
>Assignee: Mck SembWever
>Priority: Critical
> Fix For: 0.8.6
>
> Attachments: CASSANDRA-3150.patch, Screenshot-Counters for 
> task_201109212019_1060_m_29 - Mozilla Firefox.png, Screenshot-Hadoop map 
> task list for job_201109212019_1060 on cassandra01 - Mozilla Firefox.png, 
> attempt_201109071357_0044_m_003040_0.grep-get_range_slices.log, 
> fullscan-example1.log
>
>
> From http://thread.gmane.org/gmane.comp.db.cassandra.user/20039
> {quote}
> bq. Cassandra-0.8.4 w/ ByteOrderedPartitioner
> bq. CFIF's inputSplitSize=196608
> bq. 3 map tasks (from 4013) is still running after read 25 million rows.
> bq. Can this be a bug in StorageService.getSplits(..) ?
> getSplits looks pretty foolproof to me but I guess we'd need to add
> more debug logging to rule out a bug there for sure.
> I guess the main alternative would be a bug in the recordreader paging.
> {quote}





[jira] [Issue Comment Edited] (CASSANDRA-3150) ColumnFormatRecordReader loops forever (StorageService.getSplits(..) out of whack)

2011-09-25 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13114220#comment-13114220
 ] 

Mck SembWever edited comment on CASSANDRA-3150 at 9/25/11 11:23 AM:


Here's debug from a "full scan" job. It scans data over a full year (and since 
the cluster's ring range only holds 3 months of data such a job guarantees a 
full scan).

In the debug you see the splits.
{{`nodetool ring`}} gives

{noformat}
Address         DC   Rack  Status  State   Load      Owns    Token
                                                             Token(bytes[5554])
152.90.241.22   DC1  RAC1  Up      Normal  16.65 GB  33.33%  Token(bytes[00])
152.90.241.23   DC2  RAC1  Up      Normal  63.22 GB  33.33%  Token(bytes[2aaa])
152.90.241.24   DC1  RAC1  Up      Normal  72.4 KB   33.33%  Token(bytes[5554])
{noformat}

The problematic split ends up being
{noformat}
ColumnFamilySplit{startToken='0528cbe0b2b5ff6b816c68b59973bcbc', endToken='2aaa', dataNodes=[cassandra02.finn.no]}
{noformat}

  was (Author: michaelsembwever):
Here's debug from a "full scan" job. It scans data over a full year (and 
since the cluster's ring range only hold 3 months of data this job guarantees a 
full scan).

In the debug you see the splits.
{{`nodetool ring`}} gives

{noformat}
Address         DC   Rack  Status  State   Load      Owns    Token
                                                             Token(bytes[5554])
152.90.241.22   DC1  RAC1  Up      Normal  16.65 GB  33.33%  Token(bytes[00])
152.90.241.23   DC2  RAC1  Up      Normal  63.22 GB  33.33%  Token(bytes[2aaa])
152.90.241.24   DC1  RAC1  Up      Normal  72.4 KB   33.33%  Token(bytes[5554])
{noformat}

The problematic split ends up being 
{noformat}ColumnFamilySplit{startToken='0528cbe0b2b5ff6b816c68b59973bcbc', 
endToken='2aaa', 
dataNodes=[cassandra02.finn.no]}{noformat}
  
> ColumnFormatRecordReader loops forever (StorageService.getSplits(..) out of 
> whack)
> --
>
> Key: CASSANDRA-3150
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3150
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Affects Versions: 0.8.4, 0.8.5
>Reporter: Mck SembWever
>Assignee: Mck SembWever
>Priority: Critical
> Fix For: 0.8.6
>
> Attachments: CASSANDRA-3150.patch, Screenshot-Counters for 
> task_201109212019_1060_m_29 - Mozilla Firefox.png, Screenshot-Hadoop map 
> task list for job_201109212019_1060 on cassandra01 - Mozilla Firefox.png, 
> attempt_201109071357_0044_m_003040_0.grep-get_range_slices.log, 
> fullscan-example1.log
>
>
> From http://thread.gmane.org/gmane.comp.db.cassandra.user/20039
> {quote}
> bq. Cassandra-0.8.4 w/ ByteOrderedPartitioner
> bq. CFIF's inputSplitSize=196608
> bq. 3 map tasks (from 4013) is still running after read 25 million rows.
> bq. Can this be a bug in StorageService.getSplits(..) ?
> getSplits looks pretty foolproof to me but I guess we'd need to add
> more debug logging to rule out a bug there for sure.
> I guess the main alternative would be a bug in the recordreader paging.
> {quote}





[jira] [Issue Comment Edited] (CASSANDRA-3150) ColumnFormatRecordReader loops forever (StorageService.getSplits(..) out of whack)

2011-09-23 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13113326#comment-13113326
 ] 

Mck SembWever edited comment on CASSANDRA-3150 at 9/23/11 7:30 PM:
---

This is persisting as a problem, and it continuously gets worse through the day.
I can see two thirds of my cf being read from one split.
It doesn't matter whether it's ByteOrderedPartitioner or RandomPartitioner, although I'm having a greater problem with the former. A compact and restart didn't help.
I've attached two screenshots from the Hadoop web pages (you can see 36 million map input records when cassandra.input.split.size is set to 393216).

bq. ... double-check how many rows there really are in the given split range.
I can confirm that if I let these jobs run through to completion (however terribly long they may take), the results are correct. It would seem that the split ranges are incorrect (not the rows within them).

  was (Author: michaelsembwever):
This is persisting to be a problem. And continuously gets worse through the 
day.
I can see like half my cf being read from one split.
It doesn't matter if it's ByteOrderingPartition or RandomPartitioner.
I've attached two screenshots from hadoop webpages (you see 36million map input 
records when cassandra.input.split.size is set to 393216).

bq. ... double-check how many rows there really are in the given split range.
I can confirm that if i let these jobs run through to their completion (despite 
how terribly long they may take) that the results are correct. It would seem 
that the split ranges are incorrect (not the rows within in them).
  
> ColumnFormatRecordReader loops forever (StorageService.getSplits(..) out of 
> whack)
> --
>
> Key: CASSANDRA-3150
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3150
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Affects Versions: 0.8.4
>Reporter: Mck SembWever
>Assignee: Mck SembWever
>Priority: Critical
> Attachments: CASSANDRA-3150.patch, Screenshot-Counters for 
> task_201109212019_1060_m_29 - Mozilla Firefox.png, Screenshot-Hadoop map 
> task list for job_201109212019_1060 on cassandra01 - Mozilla Firefox.png, 
> attempt_201109071357_0044_m_003040_0.grep-get_range_slices.log
>
>
> From http://thread.gmane.org/gmane.comp.db.cassandra.user/20039
> {quote}
> bq. Cassandra-0.8.4 w/ ByteOrderedPartitioner
> bq. CFIF's inputSplitSize=196608
> bq. 3 map tasks (from 4013) is still running after read 25 million rows.
> bq. Can this be a bug in StorageService.getSplits(..) ?
> getSplits looks pretty foolproof to me but I guess we'd need to add
> more debug logging to rule out a bug there for sure.
> I guess the main alternative would be a bug in the recordreader paging.
> {quote}





[jira] [Issue Comment Edited] (CASSANDRA-3150) ColumnFormatRecordReader loops forever (StorageService.getSplits(..) out of whack)

2011-09-23 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13113326#comment-13113326
 ] 

Mck SembWever edited comment on CASSANDRA-3150 at 9/23/11 12:05 PM:


This is persisting as a problem, and it continuously gets worse through the day.
I can see about half my cf being read from one split.
It doesn't matter whether it's ByteOrderedPartitioner or RandomPartitioner.
I've attached two screenshots from the Hadoop web pages (you can see 36 million map input records when cassandra.input.split.size is set to 393216).

bq. ... double-check how many rows there really are in the given split range.
I can confirm that if I let these jobs run through to completion (however terribly long they may take), the results are correct. It would seem that the split ranges are incorrect (not the rows within them).

  was (Author: michaelsembwever):
This is persisting to be a problem. And continuously gets worse through the 
day.
I can see like half my cf being read from one split.
It doesn't matter if it's ByteOrderingPartition or RandomPartitioner.
I've attached two screenshots from hadoop webpages (you see 36million map input 
records when cassandra.input.split.size is set to 393216).
  
> ColumnFormatRecordReader loops forever (StorageService.getSplits(..) out of 
> whack)
> --
>
> Key: CASSANDRA-3150
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3150
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Affects Versions: 0.8.4
>Reporter: Mck SembWever
>Assignee: Mck SembWever
>Priority: Critical
> Attachments: CASSANDRA-3150.patch, Screenshot-Counters for 
> task_201109212019_1060_m_29 - Mozilla Firefox.png, Screenshot-Hadoop map 
> task list for job_201109212019_1060 on cassandra01 - Mozilla Firefox.png, 
> attempt_201109071357_0044_m_003040_0.grep-get_range_slices.log
>
>
> From http://thread.gmane.org/gmane.comp.db.cassandra.user/20039
> {quote}
> bq. Cassandra-0.8.4 w/ ByteOrderedPartitioner
> bq. CFIF's inputSplitSize=196608
> bq. 3 map tasks (from 4013) is still running after read 25 million rows.
> bq. Can this be a bug in StorageService.getSplits(..) ?
> getSplits looks pretty foolproof to me but I guess we'd need to add
> more debug logging to rule out a bug there for sure.
> I guess the main alternative would be a bug in the recordreader paging.
> {quote}





[jira] [Issue Comment Edited] (CASSANDRA-3150) ColumnFormatRecordReader loops forever (StorageService.getSplits(..) out of whack)

2011-09-23 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13113326#comment-13113326
 ] 

Mck SembWever edited comment on CASSANDRA-3150 at 9/23/11 11:04 AM:


This is persisting as a problem, and it continuously gets worse through the day.
I can see about half my cf being read from one split.
It doesn't matter whether it's ByteOrderedPartitioner or RandomPartitioner.
I've attached two screenshots from the Hadoop web pages (you can see 36 million map input records when cassandra.input.split.size is set to 393216).

  was (Author: michaelsembwever):
This is persisting to be a problem. And continuously gets worse through the 
day.
I can see like half my cf being read from one split.
It doesn't matter if it's ByteOrderingPartition or RandomPartitioner.
I've attached two screenshots from hadoop webpages.
  
> ColumnFormatRecordReader loops forever (StorageService.getSplits(..) out of 
> whack)
> --
>
> Key: CASSANDRA-3150
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3150
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Affects Versions: 0.8.4
>Reporter: Mck SembWever
>Assignee: Mck SembWever
>Priority: Critical
> Attachments: CASSANDRA-3150.patch, Screenshot-Counters for 
> task_201109212019_1060_m_29 - Mozilla Firefox.png, Screenshot-Hadoop map 
> task list for job_201109212019_1060 on cassandra01 - Mozilla Firefox.png, 
> attempt_201109071357_0044_m_003040_0.grep-get_range_slices.log
>
>
> From http://thread.gmane.org/gmane.comp.db.cassandra.user/20039
> {quote}
> bq. Cassandra-0.8.4 w/ ByteOrderedPartitioner
> bq. CFIF's inputSplitSize=196608
> bq. 3 map tasks (from 4013) is still running after read 25 million rows.
> bq. Can this be a bug in StorageService.getSplits(..) ?
> getSplits looks pretty foolproof to me but I guess we'd need to add
> more debug logging to rule out a bug there for sure.
> I guess the main alternative would be a bug in the recordreader paging.
> {quote}





[jira] [Issue Comment Edited] (CASSANDRA-3150) ColumnFormatRecordReader loops forever

2011-09-10 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102062#comment-13102062
 ] 

Mck SembWever edited comment on CASSANDRA-3150 at 9/10/11 6:04 PM:
---

Debug from a task that was still running at 1200%

The initial split for this CFRR is 
30303030303031333131313739353337303038d4e7f72db2ed11e09d7c68b59973a5d8 : 
303030303030313331323631393735313231381778518cc00711e0acb968b59973a5d8

This job was run with 
 cassandra.input.split.size=196608
 cassandra.range.batch.size=16000

therefore there shouldn't be more than 13 calls to get_range_slices(..) in this task. There were already 166 calls in this log.

What I can see here is that the original split for this task is just way too big, and this comes from {{describe_splits(..)}}, which in turn depends on "index_interval". Reading {{StorageService.getSplits(..)}}, I would guess that the split can in fact contain many more keys with the default sampling of 128. The question is how low can/should I bring index_interval (this cf can have up to 8 billion rows)?
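
The expected call count follows directly from the two settings above:
{noformat}
ceil(cassandra.input.split.size / cassandra.range.batch.size) = ceil(196608 / 16000) = 13 calls to get_range_slices(..) per task
{noformat}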

  was (Author: michaelsembwever):
Debug from a task that was still running at 1200%

The initial split for this CFRR is 
30303030303031333131313739353337303038d4e7f72db2ed11e09d7c68b59973a5d8 : 
303030303030313331323631393735313231381778518cc00711e0acb968b59973a5d8

This job was run with 
 cassandra.input.split.size=196608
 cassandra.range.batch.size=16000

therefore there shouldn't be more than 13 calls to get_range_slices(..) in this 
task. There was already 166 calls in this log.


What i can see here is that the original split for this task is just way too 
big and this comes from {{describe_splits(..)}}
which in turn depends on "index_interval". Reading 
{{StorageService.getSplits(..)}} i would guess that the split can in fact 
contain many more keys with the default sampling of 128. Question is how low 
can/should i bring index_interval ?
  
> ColumnFormatRecordReader loops forever
> --
>
> Key: CASSANDRA-3150
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3150
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Affects Versions: 0.8.4
>Reporter: Mck SembWever
>Assignee: Mck SembWever
>Priority: Critical
> Attachments: CASSANDRA-3150.patch, 
> attempt_201109071357_0044_m_003040_0.grep-get_range_slices.log
>
>
> From http://thread.gmane.org/gmane.comp.db.cassandra.user/20039
> {quote}
> bq. Cassandra-0.8.4 w/ ByteOrderedPartitioner
> bq. CFIF's inputSplitSize=196608
> bq. 3 map tasks (from 4013) is still running after read 25 million rows.
> bq. Can this be a bug in StorageService.getSplits(..) ?
> getSplits looks pretty foolproof to me but I guess we'd need to add
> more debug logging to rule out a bug there for sure.
> I guess the main alternative would be a bug in the recordreader paging.
> {quote}





[jira] [Issue Comment Edited] (CASSANDRA-3150) ColumnFormatRecordReader loops forever

2011-09-10 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102062#comment-13102062
 ] 

Mck SembWever edited comment on CASSANDRA-3150 at 9/10/11 6:02 PM:
---

Debug from a task that was still running at 1200%

The initial split for this CFRR is 
30303030303031333131313739353337303038d4e7f72db2ed11e09d7c68b59973a5d8 : 
303030303030313331323631393735313231381778518cc00711e0acb968b59973a5d8

This job was run with 
 cassandra.input.split.size=196608
 cassandra.range.batch.size=16000

therefore there shouldn't be more than 13 calls to get_range_slices(..) in this task. There were already 166 calls in this log.

What I can see here is that the original split for this task is just way too big, and this comes from {{describe_splits(..)}}, which in turn depends on "index_interval". Reading {{StorageService.getSplits(..)}}, I would guess that the split can in fact contain many more keys with the default sampling of 128. The question is how low can/should I bring index_interval?

  was (Author: michaelsembwever):
Debug from a task that was still running at 1200%

The initial split for this CFRR is 
30303030303031333131313739353337303038d4e7f72db2ed11e09d7c68b59973a5d8 : 
303030303030313331323631393735313231381778518cc00711e0acb968b59973a5d8

This job was run with 
 cassandra.input.split.size=196608
 cassandra.range.batch.size=16000

therefore there shouldn't be more than 13 calls to get_range_slices(..) in this 
task. There was already 166 calls in this log.


What i can see here is that the original split for this task is just way too 
big and this comes from {{describe_splits(..)}}
  
> ColumnFormatRecordReader loops forever
> --
>
> Key: CASSANDRA-3150
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3150
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Affects Versions: 0.8.4
>Reporter: Mck SembWever
>Assignee: Mck SembWever
>Priority: Critical
> Attachments: CASSANDRA-3150.patch, 
> attempt_201109071357_0044_m_003040_0.grep-get_range_slices.log
>
>
> From http://thread.gmane.org/gmane.comp.db.cassandra.user/20039
> {quote}
> bq. Cassandra-0.8.4 w/ ByteOrderedPartitioner
> bq. CFIF's inputSplitSize=196608
> bq. 3 map tasks (from 4013) is still running after read 25 million rows.
> bq. Can this be a bug in StorageService.getSplits(..) ?
> getSplits looks pretty foolproof to me but I guess we'd need to add
> more debug logging to rule out a bug there for sure.
> I guess the main alternative would be a bug in the recordreader paging.
> {quote}





[jira] [Issue Comment Edited] (CASSANDRA-3150) ColumnFormatRecordReader loops forever

2011-09-07 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13099263#comment-13099263
 ] 

Jonathan Ellis edited comment on CASSANDRA-3150 at 9/7/11 7:57 PM:
---

If Token's Comparable implementation is broken, anything is possible.  But I 
don't think it is.  In your case, for instance, BOP is sorting the shorter 
token correctly since 3 < 7.
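
A minimal illustration of why the length difference doesn't matter for that ordering (a plain unsigned lexicographic compare written inline here, rather than Cassandra's own BytesToken comparator; the byte values are just the leading bytes of the two tokens discussed above):
{noformat}
public class TokenOrderCheck
{
    // Unsigned, byte-by-byte lexicographic compare -- the ordering
    // ByteOrderedPartitioner applies to its tokens.
    static int compareUnsigned(byte[] a, byte[] b)
    {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++)
        {
            int cmp = (a[i] & 0xff) - (b[i] & 0xff);
            if (cmp != 0)
                return cmp;
        }
        return a.length - b.length; // only on a common prefix does length decide
    }

    public static void main(String[] args)
    {
        byte[] longToken  = { 0x30, 0x30, 0x30, 0x30, 0x30 };        // 3030303030... (truncated prefix)
        byte[] shortToken = { 0x76, 0x11, (byte) 0x83, 0x03, 0x76 };  // 7611830376... (truncated prefix)
        // 0x30 < 0x76 on the very first byte, so the longer token still sorts first.
        System.out.println(compareUnsigned(longToken, shortToken) < 0); // prints true
    }
}
{noformat}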

Load won't decrease until you run cleanup.

  was (Author: jbellis):
If Token's Comparable implementation is broken, anything is possible.  But 
I don't think it is.  In your case, for instance, OPP is sorting the shorter 
token correctly since 3 < 7.

Load won't decrease until you run cleanup.
  
> ColumnFormatRecordReader loops forever
> --
>
> Key: CASSANDRA-3150
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3150
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Affects Versions: 0.8.4
>Reporter: Mck SembWever
>Assignee: Mck SembWever
>Priority: Critical
> Attachments: CASSANDRA-3150.patch
>
>
> From http://thread.gmane.org/gmane.comp.db.cassandra.user/20039
> {quote}
> bq. Cassandra-0.8.4 w/ ByteOrderedPartitioner
> bq. CFIF's inputSplitSize=196608
> bq. 3 map tasks (from 4013) is still running after read 25 million rows.
> bq. Can this be a bug in StorageService.getSplits(..) ?
> getSplits looks pretty foolproof to me but I guess we'd need to add
> more debug logging to rule out a bug there for sure.
> I guess the main alternative would be a bug in the recordreader paging.
> {quote}





[jira] [Issue Comment Edited] (CASSANDRA-3150) ColumnFormatRecordReader loops forever

2011-09-07 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13099254#comment-13099254
 ] 

Mck SembWever edited comment on CASSANDRA-3150 at 9/7/11 7:43 PM:
--

What about the case where tokens of different lengths exist? Could get_range_slices be busted there?
I don't know if this is actually possible, but from
{noformat}
Address         Status  State   Load        Owns    Token
                                                    Token(bytes[76118303760208547436305468318170713656])
152.90.241.22   Up      Normal  270.46 GB   33.33%  Token(bytes[30303030303031333131313739353337303038d4e7f72db2ed11e09d7c68b59973a5d8])
152.90.241.24   Up      Normal  247.89 GB   33.33%  Token(bytes[303030303030313331323631393735313231381778518cc00711e0acb968b59973a5d8])
152.90.241.23   Up      Normal  1.1 TB      33.33%  Token(bytes[76118303760208547436305468318170713656])
{noformat}
you can see the real tokens are very long compared to the initial_tokens the cluster was configured with. (The two long tokens have been moved off their initial_tokens, and note that the load on .23 never decreased to ~300GB as it should have...).

  was (Author: michaelsembwever):
What about the case where tokens of different length exist.
I don't know if this is actually possible but from 
{noformat}
Address         Status  State   Load        Owns    Token
                                                    Token(bytes[76118303760208547436305468318170713656])
152.90.241.22   Up      Normal  270.46 GB   33.33%  Token(bytes[30303030303031333131313739353337303038d4e7f72db2ed11e09d7c68b59973a5d8])
152.90.241.24   Up      Normal  247.89 GB   33.33%  Token(bytes[303030303030313331323631393735313231381778518cc00711e0acb968b59973a5d8])
152.90.241.23   Up      Normal  1.1 TB      33.33%  Token(bytes[76118303760208547436305468318170713656])
{noformat}
you see the real tokens are very long compared to the initial_tokens the 
cluster was configured with. (The two long tokens have been moved off their 
initial_tokens, and to note the load on .23 never decreased to ~300GB as it 
should have...).
  
> ColumnFormatRecordReader loops forever
> --
>
> Key: CASSANDRA-3150
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3150
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Affects Versions: 0.8.4
>Reporter: Mck SembWever
>Assignee: Mck SembWever
>Priority: Critical
> Attachments: CASSANDRA-3150.patch
>
>
> From http://thread.gmane.org/gmane.comp.db.cassandra.user/20039
> {quote}
> bq. Cassandra-0.8.4 w/ ByteOrderedPartitioner
> bq. CFIF's inputSplitSize=196608
> bq. 3 map tasks (from 4013) is still running after read 25 million rows.
> bq. Can this be a bug in StorageService.getSplits(..) ?
> getSplits looks pretty foolproof to me but I guess we'd need to add
> more debug logging to rule out a bug there for sure.
> I guess the main alternative would be a bug in the recordreader paging.
> {quote}





[jira] [Issue Comment Edited] (CASSANDRA-3150) ColumnFormatRecordReader loops forever

2011-09-07 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13099254#comment-13099254
 ] 

Mck SembWever edited comment on CASSANDRA-3150 at 9/7/11 7:38 PM:
--

What about the case where tokens of different lengths exist?
I don't know if this is actually possible, but from
{noformat}
Address         Status  State   Load        Owns    Token
                                                    Token(bytes[76118303760208547436305468318170713656])
152.90.241.22   Up      Normal  270.46 GB   33.33%  Token(bytes[30303030303031333131313739353337303038d4e7f72db2ed11e09d7c68b59973a5d8])
152.90.241.24   Up      Normal  247.89 GB   33.33%  Token(bytes[303030303030313331323631393735313231381778518cc00711e0acb968b59973a5d8])
152.90.241.23   Up      Normal  1.1 TB      33.33%  Token(bytes[76118303760208547436305468318170713656])
{noformat}
you can see the real tokens are very long compared to the initial_tokens the cluster was configured with. (The two long tokens have been moved off their initial_tokens, and note that the load on .23 never decreased to ~300GB as it should have...).

  was (Author: michaelsembwever):
What about the case where tokens of different length exist.
I don't know if this is actually possible but from 
{noformat}
Address         Status  State   Load        Owns    Token
                                                    Token(bytes[76118303760208547436305468318170713656])
152.90.241.22   Up      Normal  270.46 GB   33.33%  Token(bytes[30303030303031333131313739353337303038d4e7f72db2ed11e09d7c68b59973a5d8])
152.90.241.24   Up      Normal  247.89 GB   33.33%  Token(bytes[303030303030313331323631393735313231381778518cc00711e0acb968b59973a5d8])
152.90.241.23   Up      Normal  1.1 TB      33.33%  Token(bytes[76118303760208547436305468318170713656])
{noformat}
you see the real tokens are very long compared to the initial_tokens the 
cluster was configured with. (The two long tokens has since been moved, and to 
note the load on .23 never decreased to ~300GB as it should have...).
  
> ColumnFormatRecordReader loops forever
> --
>
> Key: CASSANDRA-3150
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3150
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Affects Versions: 0.8.4
>Reporter: Mck SembWever
>Assignee: Mck SembWever
>Priority: Critical
> Attachments: CASSANDRA-3150.patch
>
>
> From http://thread.gmane.org/gmane.comp.db.cassandra.user/20039
> {quote}
> bq. Cassandra-0.8.4 w/ ByteOrderedPartitioner
> bq. CFIF's inputSplitSize=196608
> bq. 3 map tasks (from 4013) is still running after read 25 million rows.
> bq. Can this be a bug in StorageService.getSplits(..) ?
> getSplits looks pretty foolproof to me but I guess we'd need to add
> more debug logging to rule out a bug there for sure.
> I guess the main alternative would be a bug in the recordreader paging.
> {quote}





[jira] [Issue Comment Edited] (CASSANDRA-3150) ColumnFormatRecordReader loops forever

2011-09-07 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13099234#comment-13099234
 ] 

Mck SembWever edited comment on CASSANDRA-3150 at 9/7/11 7:24 PM:
--

Here keyRange is startToken to split.getEndToken();
startToken is updated on each iteration to the last row read (each iteration is batchRowCount rows).

What happens if split.getEndToken() doesn't correspond to any of the rowKeys?
To me it reads that startToken will hop over split.getEndToken() and get_range_slices(..) will start querying against wrapping ranges. This will still return rows, and so the iteration will continue, now forever.

The only ways out of this code today are (a) startToken equals split.getEndToken(), or (b) get_range_slices(..) is called with startToken equal to split.getEndToken(), or with a gap so small that no rows exist in between.
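
A simplified sketch of the paging loop being described (this is not the actual ColumnFamilyRecordReader code; the method and the {{tokenForKey}} helper are condensed illustrations, and the Thrift calls follow the 0.8-era API):
{noformat}
import java.util.List;
import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.ColumnParent;
import org.apache.cassandra.thrift.ConsistencyLevel;
import org.apache.cassandra.thrift.KeyRange;
import org.apache.cassandra.thrift.KeySlice;
import org.apache.cassandra.thrift.SlicePredicate;

public class PagingSketch
{
    /**
     * Pages through one split. If no returned row's token ever equals endToken,
     * startToken advances past it, the queried range effectively wraps, rows keep
     * being returned, and the loop never terminates.
     */
    static void scanSplit(Cassandra.Client client, ColumnParent parent, SlicePredicate predicate,
                          String startToken, String endToken, int batchRowCount) throws Exception
    {
        while (true)
        {
            if (startToken.equals(endToken))
                return;                                     // (a) the only clean exit

            KeyRange keyRange = new KeyRange(batchRowCount)
                                    .setStart_token(startToken)
                                    .setEnd_token(endToken);
            List<KeySlice> rows = client.get_range_slices(parent, predicate, keyRange, ConsistencyLevel.ONE);

            if (rows.isEmpty())
                return;                                     // (b) nothing left between the two tokens

            // ... hand the batch of rows to the mapper ...

            // Advance to the token of the last row read; if that token has already
            // hopped over endToken, the next call queries a wrapping range.
            startToken = tokenForKey(rows.get(rows.size() - 1).key);
        }
    }

    // Stand-in for asking the partitioner for a key's token: simply hex-encode the key bytes,
    // which matches how ByteOrderedPartitioner tokens are printed.
    static String tokenForKey(java.nio.ByteBuffer key)
    {
        StringBuilder sb = new StringBuilder();
        for (int i = key.position(); i < key.limit(); i++)
            sb.append(String.format("%02x", key.get(i) & 0xff));
        return sb.toString();
    }
}
{noformat}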

  was (Author: michaelsembwever):
Here keyRange is startToken to split.getEndToken()
startToken is updated each iterate to the last row read (each iterate is 
batchRowCount rows).

What happens if split.getEndToken() doesn't correspond to any of the rowKeys?
To me it reads that startToken will hop over split.getEndToken() and 
get_range_slices(..) will start returning wrapping ranges. This will still 
return rows and so the iteration will continue, now forever.

The only way out for this code today is a) startToken equals 
split.getEndToken(), or b) get_range_slices(..) is called with startToken 
equals split.getEndToken() OR a gap so small there exists no rows in between.
  
> ColumnFormatRecordReader loops forever
> --
>
> Key: CASSANDRA-3150
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3150
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Affects Versions: 0.8.4
>Reporter: Mck SembWever
>Assignee: Mck SembWever
>Priority: Critical
> Attachments: CASSANDRA-3150.patch
>
>
> From http://thread.gmane.org/gmane.comp.db.cassandra.user/20039
> {quote}
> bq. Cassandra-0.8.4 w/ ByteOrderedPartitioner
> bq. CFIF's inputSplitSize=196608
> bq. 3 map tasks (from 4013) is still running after read 25 million rows.
> bq. Can this be a bug in StorageService.getSplits(..) ?
> getSplits looks pretty foolproof to me but I guess we'd need to add
> more debug logging to rule out a bug there for sure.
> I guess the main alternative would be a bug in the recordreader paging.
> {quote}





[jira] [Issue Comment Edited] (CASSANDRA-3150) ColumnFormatRecordReader loops forever

2011-09-07 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13099234#comment-13099234
 ] 

Mck SembWever edited comment on CASSANDRA-3150 at 9/7/11 7:17 PM:
--

Here keyRange is startToken to split.getEndToken();
startToken is updated on each iteration to the last row read (each iteration is batchRowCount rows).

What happens if split.getEndToken() doesn't correspond to any of the rowKeys?
To me it reads that startToken will hop over split.getEndToken() and get_range_slices(..) will start returning wrapping ranges. This will still return rows, and so the iteration will continue, now forever.

The only ways out of this code today are (a) startToken equals split.getEndToken(), or (b) get_range_slices(..) is called with startToken equal to split.getEndToken(), or with a gap so small that no rows exist in between.

  was (Author: michaelsembwever):
Here keyRange is startToken to split.getEndToken()
startToken is updated each iterate to the last row read (each iterate is 
batchRowCount rows).

What happens is split.getEndToken() doesn't correspond to any of the rowKeys?
To me it reads that startToken will hop over split.getEndToken() and 
get_rage_slices(..) will start returning wrapping ranges. This will still 
return rows and so the iteration will continue, now forever.

The only way out for this code today is a) startToken equals 
split.getEndToken(), or b) get_range_slices(..) is called with startToken 
equals split.getEndToken() OR a gap so small there exists no rows in between.
  
> ColumnFormatRecordReader loops forever
> --
>
> Key: CASSANDRA-3150
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3150
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Affects Versions: 0.8.4
>Reporter: Mck SembWever
>Assignee: Mck SembWever
>Priority: Critical
> Attachments: CASSANDRA-3150.patch
>
>
> From http://thread.gmane.org/gmane.comp.db.cassandra.user/20039
> {quote}
> bq. Cassandra-0.8.4 w/ ByteOrderedPartitioner
> bq. CFIF's inputSplitSize=196608
> bq. 3 map tasks (from 4013) is still running after read 25 million rows.
> bq. Can this be a bug in StorageService.getSplits(..) ?
> getSplits looks pretty foolproof to me but I guess we'd need to add
> more debug logging to rule out a bug there for sure.
> I guess the main alternative would be a bug in the recordreader paging.
> {quote}
