[jira] [Commented] (CASSANDRA-5225) Missing columns, errors when requesting specific columns from wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-5225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13687111#comment-13687111 ]

Daniel Meyer commented on CASSANDRA-5225:
---

Added a dtest to cover this scenario: https://github.com/riptano/cassandra-dtest/blob/75bffeba0af410a41eb97b269ae1c94f4227c312/wide_rows_test.py

> Missing columns, errors when requesting specific columns from wide rows
> ---
>
>                 Key: CASSANDRA-5225
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5225
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.2.1
>            Reporter: Tyler Hobbs
>            Assignee: Sylvain Lebresne
>            Priority: Critical
>             Fix For: 1.2.2
>
>         Attachments: 5225.txt, pycassa-repro.py
>
>
> With Cassandra 1.2.1 (and probably 1.2.0), I'm seeing some problems with
> Thrift queries that request a set of specific column names when the row is
> very wide.
> To reproduce, I'm inserting 10 million columns into a single row and then
> randomly requesting three columns by name in a loop. It's common for only
> one or two of the three columns to be returned. I'm also seeing stack traces
> like the following in the Cassandra log:
> {noformat}
> ERROR 13:12:01,017 Exception in thread Thread[ReadStage:76,5,main]
> java.lang.RuntimeException: org.apache.cassandra.io.sstable.CorruptSSTableException: org.apache.cassandra.db.ColumnSerializer$CorruptColumnException: invalid column name length 0 (/var/lib/cassandra/data/Keyspace1/CF1/Keyspace1-CF1-ib-5-Data.db, 14035168 bytes remaining)
>     at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1576)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:662)
> Caused by: org.apache.cassandra.io.sstable.CorruptSSTableException: org.apache.cassandra.db.ColumnSerializer$CorruptColumnException: invalid column name length 0 (/var/lib/cassandra/data/Keyspace1/CF1/Keyspace1-CF1-ib-5-Data.db, 14035168 bytes remaining)
>     at org.apache.cassandra.db.columniterator.SSTableNamesIterator.<init>(SSTableNamesIterator.java:69)
>     at org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:81)
>     at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:68)
>     at org.apache.cassandra.db.CollationController.collectTimeOrderedData(CollationController.java:133)
>     at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
>     at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1358)
>     at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1215)
>     at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1127)
>     at org.apache.cassandra.db.Table.getRow(Table.java:355)
>     at org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:64)
>     at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1052)
>     at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1572)
>     ... 3 more
> {noformat}
> This doesn't seem to happen when the row is smaller, so it might have
> something to do with incremental large row compaction.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-5225) Missing columns, errors when requesting specific columns from wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-5225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13597683#comment-13597683 ]

Jonathan Ellis commented on CASSANDRA-5225:
---

Nevertheless, the bug fixed here was a regression introduced by CASSANDRA-3885 for 1.2.0.
[jira] [Commented] (CASSANDRA-5225) Missing columns, errors when requesting specific columns from wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-5225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13597675#comment-13597675 ]

Ahmed Bashir commented on CASSANDRA-5225:
---

The reporter for CASSANDRA-5210 (marked as a dupe of this) tested as far back as 0.8
[jira] [Commented] (CASSANDRA-5225) Missing columns, errors when requesting specific columns from wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-5225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596390#comment-13596390 ]

Jonathan Ellis commented on CASSANDRA-5225:
---

If you have a test case that fails against 1.1, please post it.
[jira] [Commented] (CASSANDRA-5225) Missing columns, errors when requesting specific columns from wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-5225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596204#comment-13596204 ]

Ahmed Bashir commented on CASSANDRA-5225:
---

This affects 1.1.x as well; will the fix be a part of 1.1.11?
[jira] [Commented] (CASSANDRA-5225) Missing columns, errors when requesting specific columns from wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-5225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13577309#comment-13577309 ]

Brandon Williams commented on CASSANDRA-5225:
---

+1
[jira] [Commented] (CASSANDRA-5225) Missing columns, errors when requesting specific columns from wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-5225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13576905#comment-13576905 ]

Elden Bishop commented on CASSANDRA-5225:
---

This patch also fixes CASSANDRA-5210. I'll mark that one as a dupe.
[jira] [Commented] (CASSANDRA-5225) Missing columns, errors when requesting specific columns from wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-5225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13574728#comment-13574728 ]

Brandon Williams commented on CASSANDRA-5225:
---

I applied the patch correctly, but the bug is in the pycassa script itself... I was hitting an edge case where it asked for the same column twice.
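
The duplicate-column edge case mentioned in this comment can be illustrated with a small sketch. It is not the attached pycassa-repro.py, whose contents are not shown in this thread; the function names and the string column names (`"0"`, `"1"`, ...) are hypothetical, and the returned columns are modelled as a plain dict rather than a live Thrift result. The idea is to draw the requested names without replacement, and to check for missing columns by name rather than by comparing counts, so that a repeated name is not mistaken for a lost column.

```python
import random


def pick_distinct_columns(num_columns, count=3, rng=random):
    """Pick `count` distinct column names from a row of `num_columns` columns.

    Column names are assumed (hypothetically) to be the decimal strings
    "0" .. str(num_columns - 1).
    """
    # sample() draws without replacement, so no name is requested twice.
    return [str(i) for i in rng.sample(range(num_columns), count)]


def missing_columns(requested, returned):
    """Return the requested names that are absent from the `returned` mapping."""
    return [name for name in requested if name not in returned]


if __name__ == "__main__":
    names = pick_distinct_columns(10_000_000)
    # Stand-in for a by-name read that dropped one column.
    result = {name: "value" for name in names[:2]}
    print("requested:", names)
    print("missing:  ", missing_columns(names, result))
```

A count-based check such as `len(result) != 3` would flag a request like `["1", "1", "2"]` as a failure even when the server returns both distinct columns; the name-based check above does not.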
[jira] [Commented] (CASSANDRA-5225) Missing columns, errors when requesting specific columns from wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-5225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13574358#comment-13574358 ]

Mukund commented on CASSANDRA-5225:
---

Yes..The patch works with my test code too..
[jira] [Commented] (CASSANDRA-5225) Missing columns, errors when requesting specific columns from wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-5225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13574340#comment-13574340 ]

Sylvain Lebresne commented on CASSANDRA-5225:
-

Are you sure you applied the patch correctly? I just tested the pycassa-repro.py test above: it fails every time without the patch but hasn't failed once with the patch.

> Missing columns, errors when requesting specific columns from wide rows
> ---
>
> Key: CASSANDRA-5225
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5225
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 1.2.1
> Reporter: Tyler Hobbs
> Priority: Critical
> Fix For: 1.2.2
> Attachments: 5225.txt, pycassa-repro.py
[jira] [Commented] (CASSANDRA-5225) Missing columns, errors when requesting specific columns from wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-5225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573789#comment-13573789 ]

Brandon Williams commented on CASSANDRA-5225:
-

It still doesn't pass :(

> Missing columns, errors when requesting specific columns from wide rows
> ---
>
> Key: CASSANDRA-5225
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5225
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 1.2.1
> Reporter: Tyler Hobbs
> Priority: Critical
> Fix For: 1.2.2
> Attachments: 5225.txt, pycassa-repro.py
[jira] [Commented] (CASSANDRA-5225) Missing columns, errors when requesting specific columns from wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-5225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573712#comment-13573712 ]

Sylvain Lebresne commented on CASSANDRA-5225:
-

Just from eyeballing the code, I would say that the line at https://github.com/apache/cassandra/blob/cassandra-1.2/src/java/org/apache/cassandra/io/sstable/IndexHelper.java#L179 should be:
{noformat}
if (!reversed)
{noformat}
i.e. both branches should be inverted. The goal of the lastIndex parameter is to ignore the index blocks we know are "behind" us. So when we go forward (not reversed), we want to look at [lastIndex, index.size()], not the contrary.

> Missing columns, errors when requesting specific columns from wide rows
> ---
>
> Key: CASSANDRA-5225
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5225
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 1.2.1
> Reporter: Tyler Hobbs
> Priority: Critical
> Fix For: 1.2.2
> Attachments: pycassa-repro.py
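The direction rule described in the comment above can be sketched roughly as follows. This is only an illustrative model, not the actual IndexHelper code: the class name DirectionalIndexSketch is hypothetical, each index block is reduced to the last column name it covers, and plain integers stand in for column names and the column comparator.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified model of a block lookup that honours lastIndex.
// Each entry of lastNames is the last column name covered by one index block.
class DirectionalIndexSketch {
    // Returns the position of the first block whose last name is >= name,
    // skipping the blocks that are already "behind" the reader.
    static int indexFor(List<Integer> lastNames, int name, boolean reversed, int lastIndex) {
        // Forward: blocks before lastIndex are behind us, so search [lastIndex, size).
        // Reversed: blocks after lastIndex are behind us, so search [0, lastIndex].
        int start = reversed ? 0 : lastIndex;
        int end = reversed ? lastIndex + 1 : lastNames.size();
        int pos = Collections.binarySearch(lastNames.subList(start, end), name);
        // A negative result encodes the insertion point as -(insertionPoint) - 1.
        int insertion = pos < 0 ? -pos - 1 : pos;
        return start + insertion;
    }

    public static void main(String[] args) {
        List<Integer> lastNames = Arrays.asList(5, 10, 15, 20);
        System.out.println(indexFor(lastNames, 18, false, 0));
        System.out.println(indexFor(lastNames, 7, true, 3));
    }
}
```

With the branches the other way round, a forward lookup would only ever see the blocks up to lastIndex, so any later column name would fall outside the searched window.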
[jira] [Commented] (CASSANDRA-5225) Missing columns, errors when requesting specific columns from wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-5225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573682#comment-13573682 ]

Yuki Morishita commented on CASSANDRA-5225:
---

It looks like Cassandra is reading from the wrong column index here (https://github.com/apache/cassandra/blob/cassandra-1.2/src/java/org/apache/cassandra/db/columniterator/SSTableNamesIterator.java#L236).

Suppose we have column indexes of [[1..5][6..10][11..15][16..20]] (numbers are column names), and we want to 'SELECT 2, 18 FROM CF'.
First, we check '2' against the indexes and get indexes[0].
Next, we check '18' against the indexes with a lastIndexIdx of 0. Because the second index check is limited to the sublist indexes[0, lastIndexIdx + 1] here (https://github.com/apache/cassandra/blob/cassandra-1.2/src/java/org/apache/cassandra/io/sstable/IndexHelper.java#L186), it checks against only the first two indexes and gets the wrong index position of indexes[2]. So it thinks '18' is not in the sstable.

In fact, when I removed the sublisting part from IndexHelper.indexFor, SSTableNamesIterator started returning correct values. But I don't know whether that's the right way to do it. [~slebresne]?

> Missing columns, errors when requesting specific columns from wide rows
> ---
>
> Key: CASSANDRA-5225
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5225
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 1.2.1
> Reporter: Tyler Hobbs
> Priority: Critical
> Fix For: 1.2.2
> Attachments: pycassa-repro.py
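The worked example in the comment above ('SELECT 2, 18' over four index blocks) can be replayed with a small stand-alone sketch. It is an assumption-laden model rather than the Cassandra code: IndexWindowSketch and blockFor are hypothetical names, each block is represented only by its last column name, and the searched window is taken to be the first two blocks, as the comment describes.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical model: blocks [1..5][6..10][11..15][16..20] are represented
// by their last column names 5, 10, 15 and 20.
class IndexWindowSketch {
    // Returns the position of the first block whose last name is >= name,
    // looking only at the first `window` blocks.
    static int blockFor(List<Integer> lastNames, int name, int window) {
        List<Integer> toSearch = lastNames.subList(0, window);
        int pos = Collections.binarySearch(toSearch, name);
        // A negative result encodes the insertion point as -(insertionPoint) - 1.
        return pos < 0 ? -pos - 1 : pos;
    }

    public static void main(String[] args) {
        List<Integer> lastNames = Arrays.asList(5, 10, 15, 20);
        System.out.println(blockFor(lastNames, 2, 4));   // prints 0: block [1..5]
        System.out.println(blockFor(lastNames, 18, 2));  // prints 2: block [11..15], which does not hold 18
        System.out.println(blockFor(lastNames, 18, 4));  // prints 3: block [16..20]
    }
}
```

In this model, searching the truncated window yields the position just past the window (2), while searching the full list yields the block that actually contains the column (3), which matches the symptom of requested columns going missing.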