[ https://issues.apache.org/jira/browse/CASSANDRA-3820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197496#comment-13197496 ]
Peter Schuller edited comment on CASSANDRA-3820 at 2/1/12 1:53 AM:
-------------------------------------------------------------------

Check whether the .bf files contain all zeroes above roughly 235 MB or so. If you have lots of rows, your BF will be that large.

We encountered a bug internally whereby all bloom filters larger than 2^31 bits were large on disk, but everything after the first 2^31 bits was all zeroes. Unfortunately I don't know whether this is specific to patches made to our branch, and I have been too busy to follow up and figure out whether it affects the upstream version.

But just run "tail -c 1000 | hexdump" on a .bf file. If you only see zeroes, this is the bug. Make sure to tail a large .bf file (taking the largest is easiest).

> Columns missing after upgrade from 0.8.5 to 1.0.7.
> --------------------------------------------------
>
>                 Key: CASSANDRA-3820
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3820
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.0.7
>            Reporter: Jason Harvey
>
> After an upgrade, one of our CFs had a lot of rows with missing columns. I've
> been able to reproduce in test conditions. Working on getting the tables to
> DataStax (data is private).
> 0.8 results:
> {code}
> [default@reddit] get CommentVote[36353467625f63333837336f32];
> => (column=date, value=313332333932323930392e3531, timestamp=1323922909506508)
> => (column=ip, value=REDACTED, timestamp=1327048432717348, ttl=2592000)
> => (column=name, value=31, timestamp=1327048433000740)
> => (column=REDACTED, value=30, timestamp=1323922909506432)
> => (column=thing1_id, value=REDACTED, timestamp=1323922909506475)
> => (column=thing2_id, value=REDACTED, timestamp=1323922909506486)
> => (column=REDACTED, value=31, timestamp=1323922909506518)
> => (column=REDACTED, value=30, timestamp=1323922909506497)
> {code}
> 1.0 results:
> {code}
> [default@reddit] get CommentVote[36353467625f63333837336f32];
> => (column=ip, value=REDACTED, timestamp=1327048432717348, ttl=2592000)
> => (column=name, value=31, timestamp=1327048433000740)
> {code}
> A few notes:
> * The rows with missing data were fully restored after scrubbing the sstables.
> * The row which I reproduced on happened to be split across multiple sstables.
> * When I copied the first sstable I found the row on, I was able to 'list'
> rows from the sstable, but any and all 'get' calls failed.
> * These SSTables were natively created on 0.8.5; they did not come from any
> previous upgrade.
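
The "tail -c 1000 | hexdump" check from the comment can also be scripted if there are many .bf files to inspect. The following is only a minimal sketch of that same check, not anything shipped with Cassandra: the function name `tail_is_all_zeroes` and its `nbytes` parameter are made up for illustration, and the 1000-byte default simply mirrors the command in the comment.

```python
import os


def tail_is_all_zeroes(path, nbytes=1000):
    """Return True if the last `nbytes` bytes of the file are all zero.

    Hypothetical helper: equivalent in spirit to eyeballing the output of
    `tail -c 1000 FILE | hexdump` and seeing only zeroes.
    """
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        # Seek to the start of the tail; for short files read everything.
        f.seek(max(0, size - nbytes))
        tail = f.read()
    # An empty file tells us nothing, so report False for it.
    return len(tail) > 0 and not any(tail)
```

A True result on the largest .bf file would match the symptom described in the comment; a False result only says that the last bytes of that file are not all zero.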