[ 
https://issues.apache.org/jira/browse/CASSANDRA-5855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13733928#comment-13733928
 ] 

Jonathan Ellis commented on CASSANDRA-5855:
-------------------------------------------

It looks to me like we actually need to handle 3 cases:

# classic one-cell-per-column, no composites; the existing code
# CQL-style one-cell-per-column; the code you added
# COMPACT style, multiple columns in one cell; need to split them up and check 
each value

Nit: prefer assigning variables only once, e.g. we could write the 
validationName code [without addressing case 3 above] as

{code}
        ByteBuffer validationName;
        if (cfdef.isComposite && !cfdef.isCompact)
        {
            AbstractCompositeType comparator = (AbstractCompositeType) 
metadata.comparator;
            List<AbstractCompositeType.CompositeComponent> components = 
comparator.deconstruct(name);
            validationName = components.get(components.size() - 1).value;
        }
        else
        {
            validationName = name;
        }
{code}

                
> Scrub does not understand compound primary key created in CQL 3 beta
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-5855
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5855
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>            Reporter: J.B. Langston
>            Assignee: Tyler Hobbs
>         Attachments: 0001-Correctly-validate-sparse-composite-columns.patch
>
>
> We have a customer who was using the beta version of CQL 3 in DSE 3.0 which 
> includes Cassandra 1.1.9 plus patches backported from later versions.  
> They've now upgraded to DSE 3.1, which includes Cassandra 1.2.6 plus patches.
> When restarting for the first time after running upgradesstables, they 
> noticed the following error in the log:
> {noformat}
> Thread[SSTableBatchOpen:2,5,main]
> java.lang.AssertionError
>         at 
> org.apache.cassandra.utils.ByteBufferUtil.readBytes(ByteBufferUtil.java:401)
>         at 
> org.apache.cassandra.io.sstable.IndexSummary$IndexSummarySerializer.deserialize(IndexSummary.java:124)
>         at 
> org.apache.cassandra.io.sstable.SSTableReader.loadSummary(SSTableReader.java:426)
>         at 
> org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:360)
>         at 
> org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:201)
>         at 
> org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:154)
>         at 
> org.apache.cassandra.io.sstable.SSTableReader$1.run(SSTableReader.java:241)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> {noformat}
> This error was also reported on CASSANDRA-5703.  The comments suggested it 
> was caused by an empty row key, so I had them run scrub on it.  When they 
> did, scrub reported the following warning almost 4 million times:
> {noformat}
>  WARN [CompactionExecutor:27] 2013-08-02 10:13:13,041 OutputHandler.java 
> (line 52) Row at 530332255 is unreadable; skipping to next
>  WARN [CompactionExecutor:27] 2013-08-02 10:13:13,041 OutputHandler.java 
> (line 57) Non-fatal error reading row (stacktrace follows)
> java.lang.RuntimeException: Error validating row 
> DecoratedKey(139154446688383793922009760478335751546, 
> 735fc9da503b11e2844b123140ff209f)
>  at 
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:243)
>  at 
> org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:114)
>  at 
> org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:98)
>  at 
> org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:160)
>  at 
> org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:166)
>  at org.apache.cassandra.db.compaction.Scrubber.scrub(Scrubber.java:173)
>  at 
> org.apache.cassandra.db.compaction.CompactionManager.scrubOne(CompactionManager.java:529)
>  at 
> org.apache.cassandra.db.compaction.CompactionManager.doScrub(CompactionManager.java:518)
>  at 
> org.apache.cassandra.db.compaction.CompactionManager.access$400(CompactionManager.java:73)
>  at 
> org.apache.cassandra.db.compaction.CompactionManager$3.perform(CompactionManager.java:283)
>  at 
> org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:253)
>  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>  at java.lang.Thread.run(Thread.java:662)
> Caused by: org.apache.cassandra.db.marshal.MarshalException: String didn't 
> validate.
>  at org.apache.cassandra.db.marshal.UTF8Type.validate(UTF8Type.java:66)
>  at org.apache.cassandra.db.Column.validateFields(Column.java:292)
>  at 
> org.apache.cassandra.db.ColumnFamily.validateColumnFields(ColumnFamily.java:382)
>  at 
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:239)
>  ... 15 more
> {noformat}
> The customer did some testing and they've determined that the issue only 
> exists when taking a Cassandra 1.1 sstable with compound primary keys and 
> running scrub on it in either Cassandra 1.1 or 1.2.
> It appears that the scrub does not understand the 1.1 compound primary key so 
> is invalidating the row.
> The customer provided a cassandra data directory that's from DSE 3.0. Running 
> "nodetool scrub" in either DSE 3.0 or 3.1 generates all sorts of exceptions.
> If you fire up cassandra before running the scrub, running this query:
> {noformat}
> select count(*) from b_projectsubscription_project;
> {noformat}
> will return 88.
> After the scrub, it returns 0.
> When I discussed this with Alexey Yeschenko, he said that if he recalls 
> correctly, the beta didn't have row markers, and did not use a composite 
> comparator for simple primary keys. Whereas CQL3 final used 
> CompositeType(UTF8Type), the beta would just use UTF8Type. I asked him if 
> this could cause these errors, and he said he didn't think so because after 
> upgrading your schema created under the beta would still have UTF8Type, so 
> Cassandra would know how to handle it correctly. 
> Based on the customer's investigation, it sounds like this may be true of the 
> normal read/write path but not for scrub. However, given the error that 
> occurred at startup, this may be causing some issues aside from just scrub.  
> My theory is that scrub is looking at what's just a UTF8 string and trying to 
> interpret the first few bytes as the sentinels for a composite type.  When it 
> then tries to interpret the remaining bytes, if only part of a multi-byte 
> UTF8 character was in the remaining byte array, it might cause the UTF8 
> validation errors above.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to