[jira] [Commented] (CASSANDRA-3150) ColumnFormatRecordReader loops forever (StorageService.getSplits(..) out of whack)

2011-10-30 Thread Mck SembWever (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13139728#comment-13139728
 ] 

Mck SembWever commented on CASSANDRA-3150:
--

Back after an upgrade to cassandra-1.0.0

Example job start logs:
{noformat}
[ INFO] 20:39:21  Restricting input range: 3589a548d20f80a7b41368b59973bcbc -- 36f0bedaf02a49a3b41368b59973bcbc []  at no.finntech.countstats.reduce.rolled.internal.GenericCountAggregation.configureCountToAggregateMapper(GenericCountAggregation.java:222)
[ INFO] 20:39:21  Corresponding time range is Sun Oct 30 00:00:00 CEST 2011 (131992560) -- Sun Oct 30 23:00:00 CET 2011 (132001200) []  at no.finntech.countstats.reduce.rolled.internal.GenericCountAggregation.configureCountToAggregateMapper(GenericCountAggregation.java:225)
[ INFO] 20:39:21  Starting AdIdCountAggregation-phase1-DAY_2011303 ( to=1320002999000)) []  at no.finntech.countstats.reduce.rolled.internal.GenericCountAggregation.run(GenericCountAggregation.java:142)
[DEBUG] 20:39:21  adding ColumnFamilySplit{startToken='3589a548d20f80a7b41368b59973bcbc', endToken='36f0bedaf02a49a3b41368b59973bcbc', dataNodes=[0.0.0.0]} []  at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:210)
{noformat}

This split in fact contains ~40 million rows, so with a split size of 393216 one would expect ~100 splits to be returned.
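
A rough sanity check of that expectation (a minimal sketch; it assumes the ~40 million row figure is accurate):
{code}
// Expected number of splits for this token range, given the configured split size.
public class SplitEstimate
{
    public static void main(String[] args)
    {
        long rowsInRange = 40000000L;   // observed rows in the problematic range (approx.)
        long splitSize = 393216L;       // the splitSize passed to StorageService.getSplits(..)
        long expectedSplits = (rowsInRange + splitSize - 1) / splitSize;  // ceiling division
        System.out.println(expectedSplits);  // ~102, yet a single split was returned
    }
}
{code}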

I'm also very confused by the {{dataNodes=[0.0.0.0]}}. This looks wrong: I know the data lies on the third node (cassandra03), whereas the job is running on the first node (cassandra01).

 ColumnFormatRecordReader loops forever (StorageService.getSplits(..) out of 
 whack)
 --

 Key: CASSANDRA-3150
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3150
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 0.8.4, 0.8.5
Reporter: Mck SembWever
Assignee: Mck SembWever
Priority: Critical
 Fix For: 0.8.6

 Attachments: CASSANDRA-3150.patch, Screenshot-Counters for 
 task_201109212019_1060_m_29 - Mozilla Firefox.png, Screenshot-Hadoop map 
 task list for job_201109212019_1060 on cassandra01 - Mozilla Firefox.png, 
 attempt_201109071357_0044_m_003040_0.grep-get_range_slices.log, 
 fullscan-example1.log


 From http://thread.gmane.org/gmane.comp.db.cassandra.user/20039
 {quote}
 bq. Cassandra-0.8.4 w/ ByteOrderedPartitioner
 bq. CFIF's inputSplitSize=196608
 bq. 3 map tasks (from 4013) is still running after read 25 million rows.
 bq. Can this be a bug in StorageService.getSplits(..) ?
 getSplits looks pretty foolproof to me but I guess we'd need to add
 more debug logging to rule out a bug there for sure.
 I guess the main alternative would be a bug in the recordreader paging.
 {quote}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3150) ColumnFormatRecordReader loops forever (StorageService.getSplits(..) out of whack)

2011-09-27 Thread Mck SembWever (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13115278#comment-13115278
 ] 

Mck SembWever commented on CASSANDRA-3150:
--

No (the title is incorrect).

That split contains millions of rows, despite StorageService.getSplits(..) being called with splitSize 393216.
In the worst situations the split has 20 million rows in it. These rows were all valid and had to be processed for the hadoop job to produce accurate results.





[jira] [Commented] (CASSANDRA-3150) ColumnFormatRecordReader loops forever (StorageService.getSplits(..) out of whack)

2011-09-27 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13115451#comment-13115451
 ] 

Jonathan Ellis commented on CASSANDRA-3150:
---

You haven't done any messing with index_interval by chance?

How much of the difference between 400K and 20M can be explained by new rows being added?





[jira] [Commented] (CASSANDRA-3150) ColumnFormatRecordReader loops forever (StorageService.getSplits(..) out of whack)

2011-09-27 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13115468#comment-13115468
 ] 

T Jake Luciani commented on CASSANDRA-3150:
---

bq. This is the split that's receiving new data (5-10k rows/second).

So how many memtables do you have at once, and how many rows can fit in a memtable? If you have large memtables and tiny rows, that would throw getSplits() off, since the splits are generated from SSTables only.
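
A minimal sketch of how that skews the estimate (hypothetical numbers; it assumes split sizing is driven by the sampled index keys of the SSTables, roughly one sample per index_interval rows, so rows still sitting in memtables are invisible to it):
{code}
// Hypothetical illustration: rows that only live in memtables are invisible
// to an estimate based on SSTable index samples (one sample per index_interval rows).
public class SampleBasedEstimate
{
    public static void main(String[] args)
    {
        int indexInterval = 128;             // default index_interval
        long sampledKeysInRange = 1000L;     // hypothetical samples found for the range
        long rowsOnlyInMemtables = 5000000L; // hypothetical rows not yet flushed

        long estimatedRows = sampledKeysInRange * indexInterval;  // 128000
        long actualRows = estimatedRows + rowsOnlyInMemtables;    // 5128000

        // With splitSize=393216 the range looks like a single split,
        // even though it really holds millions of rows.
        System.out.println("estimated=" + estimatedRows + ", actual=" + actualRows);
    }
}
{code}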





[jira] [Commented] (CASSANDRA-3150) ColumnFormatRecordReader loops forever (StorageService.getSplits(..) out of whack)

2011-09-27 Thread Mck SembWever (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13115726#comment-13115726
 ] 

Mck SembWever commented on CASSANDRA-3150:
--

bq. You haven't done any messing with index_interval by chance?
Once, around the time this issue was created. I tried two smaller values (the first attempt gave an OOM) but it didn't change much. The value is now back at the original 128.

bq. So how many memtables do you have at once and how many rows can fit in a 
memtable?

Skinny rows, three columns: type, timestamp, data. The type is a short string and indexed, the timestamp is a long and indexed, and the data is a thrift-serialised bean of 100 bytes to 1k.

The memtable threshold is set to 1024 (this was decided since we're using Xmx8g and didn't want a huge number of sstable files).
So I guess that means we could fit up to ~5 million rows in a memtable...
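
Rough arithmetic behind that guess (a minimal sketch, assuming the 1024 refers to a 1024 MB memtable threshold and an average serialized row of roughly 200 bytes):
{code}
// Back-of-the-envelope check of the "~5 million rows per memtable" guess.
// Assumes a 1024 MB memtable threshold and ~200 bytes per serialized row.
public class MemtableRowEstimate
{
    public static void main(String[] args)
    {
        long memtableBytes = 1024L * 1024L * 1024L;      // 1024 MB
        long avgRowBytes = 200L;                         // assumed average row size
        System.out.println(memtableBytes / avgRowBytes); // ~5.3 million rows
    }
}
{code}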

But above I wrote that neither a flush nor a compact helped the situation (in fact each hadoop job starts and finishes with a flush), at least not with 0.8.4 and 0.8.5. Now it seems much better with 0.8.6: through the day I still see tasks go over 100%, but it doesn't get as badly out of whack over the day as it did before.





[jira] [Commented] (CASSANDRA-3150) ColumnFormatRecordReader loops forever (StorageService.getSplits(..) out of whack)

2011-09-26 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13114702#comment-13114702
 ] 

T Jake Luciani commented on CASSANDRA-3150:
---

bq. The problematic split ends up being ...

So you are saying that split keeps getting called over and over? I don't see that in the log; or does it hang?





[jira] [Commented] (CASSANDRA-3150) ColumnFormatRecordReader loops forever (StorageService.getSplits(..) out of whack)

2011-09-25 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13114196#comment-13114196
 ] 

Mck SembWever commented on CASSANDRA-3150:
--

Yes, I now see a CFRR that went to 600%. But it's still a long way from problematic.

My understanding of how the sampled keys are generated is that it all happens when an sstable is read.
If a restart or a (manual) compact doesn't help, why did an upgrade help?

What can I investigate to provide more debug here?





[jira] [Commented] (CASSANDRA-3150) ColumnFormatRecordReader loops forever (StorageService.getSplits(..) out of whack)

2011-09-24 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13114115#comment-13114115
 ] 

Jonathan Ellis commented on CASSANDRA-3150:
---

I'd love to think we fixed it, but nothing in the 0.8.6 changes looks relevant. I suspect it's still lurking, waiting to bite you.





[jira] [Commented] (CASSANDRA-3150) ColumnFormatRecordReader loops forever (StorageService.getSplits(..) out of whack)

2011-09-23 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13113326#comment-13113326
 ] 

Mck SembWever commented on CASSANDRA-3150:
--

This is persisting as a problem, and it continuously gets worse through the day.
I can see something like half my cf being read from one split.
It doesn't matter whether it's ByteOrderedPartitioner or RandomPartitioner.
I've attached two screenshots from the hadoop web pages.





[jira] [Commented] (CASSANDRA-3150) ColumnFormatRecordReader loops forever

2011-09-10 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13102115#comment-13102115
 ] 

Mck SembWever commented on CASSANDRA-3150:
--

There's a lot to learn about cassandra, so forgive my ignorance in so many areas.
So how can {{StorageService.getSplits(..)}} be so out of whack? Is there anything I can tune to improve this situation?
(Or is there any other debug I can provide?)





[jira] [Commented] (CASSANDRA-3150) ColumnFormatRecordReader loops forever

2011-09-10 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13102126#comment-13102126
 ] 

Jonathan Ellis commented on CASSANDRA-3150:
---

I don't know.  I think what I would do would be to use sstablekeys to 
double-check how many rows there really are in the given split range.





[jira] [Commented] (CASSANDRA-3150) ColumnFormatRecordReader loops forever

2011-09-07 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13099217#comment-13099217
 ] 

Jonathan Ellis commented on CASSANDRA-3150:
---

The start==end check on line 234 is a special case, because start==end is a wrapping Range.

The main "stop when we're done" logic is this:
{code}
rows = client.get_range_slices(new ColumnParent(cfName),
                               predicate,
                               keyRange,
                               consistencyLevel);

// nothing new? reached the end
if (rows.isEmpty())
{
    rows = null;
    return;
}
{code}






[jira] [Commented] (CASSANDRA-3150) ColumnFormatRecordReader loops forever

2011-09-07 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13099234#comment-13099234
 ] 

Mck SembWever commented on CASSANDRA-3150:
--

Here keyRange is startToken to split.getEndToken().
startToken is updated on each iteration to the last row read (each iteration is batchRowCount rows).

What happens if split.getEndToken() doesn't correspond to any of the row keys?
To me it reads that startToken will hop over split.getEndToken() and get_range_slices(..) will start returning wrapping ranges. These will still return rows, and so the iteration will continue, now forever.

The only ways out of this code today are a) startToken equals split.getEndToken(), or b) get_range_slices(..) is called with startToken equal to split.getEndToken() OR with a gap so small that no rows exist in between.
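
A minimal sketch of the loop as I read it (simplified, not the actual ColumnFamilyRecordReader code; tokenOf(..) is a stand-in for the partitioner's token of the last returned row, and split/client/predicate/etc. stand in for the reader's fields):
{code}
// Simplified sketch of the paging described above.
private void pageThroughSplit() throws Exception
{
    String startToken = split.getStartToken();
    while (true)
    {
        KeyRange keyRange = new KeyRange(batchRowCount)
                                .setStart_token(startToken)
                                .setEnd_token(split.getEndToken());
        List<KeySlice> rows = client.get_range_slices(new ColumnParent(cfName),
                                                      predicate,
                                                      keyRange,
                                                      consistencyLevel);
        if (rows.isEmpty())
            return;                                       // exit (b): nothing left before endToken
        startToken = tokenOf(rows.get(rows.size() - 1));  // advance to the last row read
        if (startToken.equals(split.getEndToken()))
            return;                                       // exit (a): landed exactly on endToken
        // If startToken ever sorted after split.getEndToken(), the next keyRange
        // would wrap around the ring and keep returning rows -- looping forever.
    }
}
{code}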





[jira] [Commented] (CASSANDRA-3150) ColumnFormatRecordReader loops forever

2011-09-07 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13099243#comment-13099243
 ] 

Jonathan Ellis commented on CASSANDRA-3150:
---

The next startToken always comes from the most recently returned range (s.t. start < range <= end), so unless there's a bad bug in get_range_slices it can't ever sort after endToken.





[jira] [Commented] (CASSANDRA-3150) ColumnFormatRecordReader loops forever

2011-09-07 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13099247#comment-13099247
 ] 

Jonathan Ellis commented on CASSANDRA-3150:
---

bq. a gap so small there exists no rows in between

Right.  So you page through with startToken increasing, until either you hit 
endToken or you get no rows back.

(Recall that the rows from get_range_slices come back in token order.)





[jira] [Commented] (CASSANDRA-3150) ColumnFormatRecordReader loops forever

2011-09-07 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13099254#comment-13099254
 ] 

Mck SembWever commented on CASSANDRA-3150:
--

What about the case where tokens of different lengths exist?
I don't know if this is actually possible, but from
{noformat}
Address         Status State   Load        Owns    Token
                                                    Token(bytes[76118303760208547436305468318170713656])
152.90.241.22   Up     Normal  270.46 GB   33.33%  Token(bytes[30303030303031333131313739353337303038d4e7f72db2ed11e09d7c68b59973a5d8])
152.90.241.24   Up     Normal  247.89 GB   33.33%  Token(bytes[303030303030313331323631393735313231381778518cc00711e0acb968b59973a5d8])
152.90.241.23   Up     Normal  1.1 TB      33.33%  Token(bytes[76118303760208547436305468318170713656])
{noformat}
you can see the real tokens are very long compared to the initial_tokens the cluster was configured with. (The two long tokens have since been moved, and note that the load on .23 never decreased to ~300GB as it should have...).





[jira] [Commented] (CASSANDRA-3150) ColumnFormatRecordReader loops forever

2011-09-07 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13099263#comment-13099263
 ] 

Jonathan Ellis commented on CASSANDRA-3150:
---

If Token's Comparable implementation is broken, anything is possible. But I don't think it is. In your case, for instance, OPP is sorting the shorter token correctly since 3 < 7.
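
To illustrate that ordering point with hypothetical byte values (a standalone lexicographic comparison, not the actual Token.compareTo code):
{code}
// A token's length doesn't affect where it sorts: byte-ordered comparison
// decides on the first differing byte, so anything starting with '3' (0x33)
// sorts before anything starting with '7' (0x37), however long it is.
public class ByteOrderIllustration
{
    static int compare(byte[] a, byte[] b)
    {
        int min = Math.min(a.length, b.length);
        for (int i = 0; i < min; i++)
        {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0)
                return d;
        }
        return a.length - b.length;   // a shorter token that is a prefix sorts first
    }

    public static void main(String[] args)
    {
        byte[] longToken  = new byte[] { 0x33, 0x30, 0x33, 0x30, 0x33, 0x30 }; // starts with '3'
        byte[] shortToken = new byte[] { 0x37, 0x36, 0x31, 0x31 };             // starts with '7'
        System.out.println(compare(longToken, shortToken) < 0);  // true: the long token sorts first
    }
}
{code}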

Load won't decrease until you run cleanup.





[jira] [Commented] (CASSANDRA-3150) ColumnFormatRecordReader loops forever

2011-09-07 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13099269#comment-13099269
 ] 

Mck SembWever commented on CASSANDRA-3150:
--

bq. BOP is sorting the shorter token correctly since 3 < 7.
Sorry, so that doesn't explain this bug?

bq. Load won't decrease until you run cleanup.
That never worked.
Repair and cleanup are run every night; the moves were done one week ago and more than a couple of weeks ago.





[jira] [Commented] (CASSANDRA-3150) ColumnFormatRecordReader loops forever

2011-09-07 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13099281#comment-13099281
 ] 

Jonathan Ellis commented on CASSANDRA-3150:
---

bq. so that doesn't explain this bug?

I'm afraid not.

bq. Never worked.

Then I guess your tokens still aren't balanced (and nodetool is smoking crack). 
 It's virtually impossible to pick balanced tokens until we do CASSANDRA-2917.





[jira] [Commented] (CASSANDRA-3150) ColumnFormatRecordReader loops forever

2011-09-07 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13099943#comment-13099943
 ] 

Mck SembWever commented on CASSANDRA-3150:
--

I'll try to put some debug in so I can get a log of get_range_slices calls from CFRR... (this may take some days).
