[jira] [Commented] (CASSANDRA-13687) Abnormal heap growth and CPU usage during repair.
[ https://issues.apache.org/jira/browse/CASSANDRA-13687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16099346#comment-16099346 ]

Stanislav Vishnevskiy commented on CASSANDRA-13687:
---------------------------------------------------

This cluster does not have any materialized views. For the last few nights the repair has finished successfully, but heap and CPU usage are still higher than on the other nodes, and that now seems to be the norm.

> Abnormal heap growth and CPU usage during repair.
> -------------------------------------------------
>
>                 Key: CASSANDRA-13687
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13687
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Stanislav Vishnevskiy
>         Attachments: 3.0.14cpu.png, 3.0.14heap.png, 3.0.14.png, 3.0.9heap.png, 3.0.9.png
>
> We recently upgraded from 3.0.9 to 3.0.14 to get the fix from CASSANDRA-13004.
> Sadly, on 3 of the last 7 nights we have had to wake up due to Cassandra dying on us. We currently don't have any data to help reproduce this, but since there aren't many commits between the two versions the cause might be obvious.
> Basically, we trigger a parallel incremental repair from a single node every night at 1AM. That node will sometimes start allocating heavily, keeping the heap maxed out and triggering GC. Some of these GC pauses can last up to 2 minutes, which effectively takes down the whole cluster due to timeouts to this node.
> The only workaround we currently have is to drain the node and restart the repair; the second attempt has worked fine every time.
> I attached heap charts from 3.0.9 and 3.0.14 during repair.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13687) Abnormal heap growth and CPU usage during repair.
[ https://issues.apache.org/jira/browse/CASSANDRA-13687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16085351#comment-16085351 ]

Stanislav Vishnevskiy commented on CASSANDRA-13687:
---------------------------------------------------

Just adding to this ticket: today the repair finished safely again with the 12GB heap, but it still used a lot of RAM. CPU usage is still higher, and repairs take about twice as long as they did on 3.0.9.
[jira] [Updated] (CASSANDRA-13687) Abnormal heap growth and CPU usage during repair.
[ https://issues.apache.org/jira/browse/CASSANDRA-13687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stanislav Vishnevskiy updated CASSANDRA-13687:
----------------------------------------------
    Summary: Abnormal heap growth and CPU usage during repair.  (was: Abnormal heap growth and long GC during repair.)
[jira] [Comment Edited] (CASSANDRA-13687) Abnormal heap growth and long GC during repair.
[ https://issues.apache.org/jira/browse/CASSANDRA-13687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16083666#comment-16083666 ]

Stanislav Vishnevskiy edited comment on CASSANDRA-13687 at 7/12/17 8:58 AM:
----------------------------------------------------------------------------

We just had this happen again; I am attaching screenshots of a similar time range, from before and after. As you can see in the [^3.0.14heap.png] image, at 1PM the heap spikes to 6GB, and then we have to take down the node because it makes the cluster start failing. We then change MAX_HEAP_SIZE to 12GB, bring the node up again, and repair. This time the heap spikes to 8GB and sticks there through the whole repair. It then drops to 600MB without a huge CMS, almost as if it were one big object. The node calling repair (1-1) is the only one with the heap growth; as [^3.0.9heap.png] shows, this did not happen during repair before, and all nodes looked similar.

Another interesting thing is CPU usage, as seen in [^3.0.14cpu.png]. The node performing the nodetool repair (in blue) is using far more CPU than the other nodes in the cluster. We compared this a week ago on 3.0.9 and that was not the case either. This feels like a bug in repair?

was (Author: stanislav):
We just had this happen again. I am attaching screenshots of similar time range again from before and after. As you can see in this [^3.0.14heap.png] image at 1PM the heap spikes to 6GB, then we have to take down the node cause it makes the cluster start failing. We then proceed to change MAX_HEAP_SIZE to 12GB and bring it up again and repair. This time it spikes to 8GB and sticks there though the whole repair. It then drops down to 600MB without a huge CMS almost like it was 1 big object. The node calling repair (1-1) is the only one with the heap growth. If you look at [^3.0.9heap.png] this used to not occur during repair and all nodes looked similar. Another interesting thing is CPU usage as seen in [^3.0.14cpu.png]. The node performing the node tool repair (in blue) is using way more CPU than the other node in the cluster. We compared this a week ago with 3.0.9 and this was also not true. This feels like a bug in repair?
[jira] [Updated] (CASSANDRA-13687) Abnormal heap growth and long GC during repair.
[ https://issues.apache.org/jira/browse/CASSANDRA-13687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stanislav Vishnevskiy updated CASSANDRA-13687:
----------------------------------------------
    Attachment: 3.0.9heap.png
                3.0.14heap.png
                3.0.14cpu.png
[jira] [Commented] (CASSANDRA-13687) Abnormal heap growth and long GC during repair.
[ https://issues.apache.org/jira/browse/CASSANDRA-13687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16083554#comment-16083554 ]

Stanislav Vishnevskiy commented on CASSANDRA-13687:
---------------------------------------------------

I am assuming you were referring to "Compacted partition maximum bytes"; the largest one is 20MB. The good news is that that partition is probably going to be deleted later this week, because we figured out a better way to deal with outlier users. That said, 20MB is well below the recommended 100MB limit.

I can't get anything off netstats currently; I will probably have to wait until it happens again. The question, though, is why this only happens on the node that is running the nodetool repair command. If this were a streaming issue, wouldn't other nodes also have it? Is there a specific bugfix that caused this behavior change? It seems really strange for a hotfix version bump to change behavior this way, and it is not documented anywhere. We run incremental repairs every 24 hours, so the cluster definitely was not behind.
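The figure discussed above comes from `nodetool cfstats` (renamed `tablestats` in later Cassandra versions). As a minimal sketch, here is how one might script the check against the commonly cited ~100MB large-partition guideline; the helper name and the sample output values are illustrative, not taken from this cluster:

```python
import re

# Illustrative excerpt of `nodetool cfstats` text output (values are made up).
SAMPLE = """\
        Compacted partition minimum bytes: 61
        Compacted partition maximum bytes: 20924300
        Compacted partition mean bytes: 1331
"""

def max_partition_bytes(cfstats_text):
    """Pull the 'Compacted partition maximum bytes' value out of the text output."""
    m = re.search(r"Compacted partition maximum bytes:\s*(\d+)", cfstats_text)
    return int(m.group(1)) if m else None

largest = max_partition_bytes(SAMPLE)
print(largest)                        # 20924300, i.e. roughly 20MB
print(largest > 100 * 1024 * 1024)    # False: below the ~100MB guideline
```

Piping real `nodetool cfstats <keyspace>.<table>` output through such a check is one way to confirm, as the commenter does here, that no single partition is anywhere near the size that usually explains repair-time heap pressure.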
[jira] [Updated] (CASSANDRA-13687) Abnormal heap growth and long GC during repair.
[ https://issues.apache.org/jira/browse/CASSANDRA-13687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stanislav Vishnevskiy updated CASSANDRA-13687:
----------------------------------------------
    Description: 
We recently upgraded from 3.0.9 to 3.0.14 to get the fix from CASSANDRA-13004.

Sadly 3 out of the last 7 nights we have had to wake up due Cassandra dying on us. We currently don't have any data to help reproduce this, but maybe since there aren't many commits between the 2 versions it might be obvious.

Basically we trigger a parallel incremental repair from a single node every night at 1AM. That node will sometimes start allocating a lot and keeping the heap maxed and triggering GC. Some of these GC can last up to 2 minutes. This effectively destroys the whole cluster due to timeouts to this node.

The only solution we currently have is to drain the node and restart the repair, it has worked fine the second time every time.

I attached heap charts from 3.0.9 and 3.0.14 during repair.

  was:
We recently upgraded from 3.0.9 to 3.0.14 to get the fix from CASSANDRA-13004 Sadly 3 out of the last 7 nights we have had to wake up due Cassandra dying on us. We currently don't have any data to help reproduce this, but maybe since there aren't many commits between the 2 version it might be obvious. Basically we trigger a parallel incremental repair from a single node every night at 1AM. That node will sometimes start allocating a lot and keeping the heap maxed and triggering GC. Some of these GC can last up to 2 minutes. This effectively destroys the whole cluster due to timeouts to this node. The only solution we currently have is to drain the node and restart the repair, it has worked fine the second time every time. I attached heap charts from 3.0.9 and 3.0.14 during repair.
[jira] [Updated] (CASSANDRA-13687) Abnormal heap growth and long GC during repair.
[ https://issues.apache.org/jira/browse/CASSANDRA-13687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stanislav Vishnevskiy updated CASSANDRA-13687:
----------------------------------------------
    Description: 
We recently upgraded from 3.0.9 to 3.0.14 to get the fix from CASSANDRA-13004.

Sadly 3 out of the last 7 nights we have had to wake up due Cassandra dying on us. We currently don't have any data to help reproduce this, but maybe since there aren't many commits between the 2 version it might be obvious.

Basically we trigger a parallel incremental repair from a single node every night at 1AM. That node will sometimes start allocating a lot and keeping the heap maxed and triggering GC. Some of these GC can last up to 2 minutes. This effectively destroys the whole cluster due to timeouts to this node.

The only solution we currently have is to drain the node and restart the repair, it has worked fine the second time every time.

I attached heap charts from 3.0.9 and 3.0.14 during repair.

  was:
We recently upgraded from 3.0.9 to 3.0.14 to get the fix from CASSANDRA-13004 Sadly 3 out of the last 7 nights we have had to wake up due Cassandra dying on us. We currently don't have any data to help reproduce this, but maybe since there aren't many commits between the 2 version it might be obvious. Basically we trigger a parallel incremental repair from a single node every night at 1AM. That node will sometimes start allocating a lot and keeping the heap maxed and triggering GC. Some of these GC can last up to 2 minutes. This effectively destroys the whole cluster due to timeouts to this node. The only solution we currently have is to drain the node and restart the repair, it has worked fine the second time every time. I attached heap charts from 3.0.9 and 3.0.14.
[jira] [Created] (CASSANDRA-13687) Abnormal heap growth and long GC during repair.
Stanislav Vishnevskiy created CASSANDRA-13687:
----------------------------------------------

             Summary: Abnormal heap growth and long GC during repair.
                 Key: CASSANDRA-13687
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13687
             Project: Cassandra
          Issue Type: Bug
            Reporter: Stanislav Vishnevskiy
         Attachments: 3.0.14.png, 3.0.9.png

We recently upgraded from 3.0.9 to 3.0.14 to get the fix from CASSANDRA-13004.

Sadly, on 3 of the last 7 nights we have had to wake up due to Cassandra dying on us. We currently don't have any data to help reproduce this, but since there aren't many commits between the two versions the cause might be obvious.

Basically, we trigger a parallel incremental repair from a single node every night at 1AM. That node will sometimes start allocating heavily, keeping the heap maxed out and triggering GC. Some of these GC pauses can last up to 2 minutes, which effectively takes down the whole cluster due to timeouts to this node.

The only workaround we currently have is to drain the node and restart the repair; the second attempt has worked fine every time.

I attached heap charts from 3.0.9 and 3.0.14.
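The nightly trigger described above amounts to a cron entry on the coordinating node. A hypothetical sketch (the binary path and log location are illustrative), relying on the fact that on Cassandra 3.0.x `nodetool repair` runs incrementally and in parallel by default:

```shell
# Hypothetical crontab entry on the node that coordinates the nightly repair.
# On Cassandra 3.0.x, `nodetool repair` is incremental and parallel by default,
# matching the "parallel incremental repair from a single node" described above.
0 1 * * * /usr/bin/nodetool repair >> /var/log/cassandra/nightly-repair.log 2>&1
```

Running the coordinator role from a single node each night is exactly why, in the symptoms reported here, only that one node shows the abnormal heap and CPU profile.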
[jira] [Commented] (CASSANDRA-13004) Corruption while adding a column to a table
[ https://issues.apache.org/jira/browse/CASSANDRA-13004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15838695#comment-15838695 ]

Stanislav Vishnevskiy commented on CASSANDRA-13004:
---------------------------------------------------

[~jjirsa] We read and write at LOCAL_QUORUM. We don't remember whether it was corrupted the same way on each node, but we couldn't read it correctly at LOCAL_ONE when we tried. We have since fixed this data by rewriting it.

> Corruption while adding a column to a table
> -------------------------------------------
>
>                 Key: CASSANDRA-13004
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13004
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Stanislav Vishnevskiy
>
> We had the following schema in production.
> {code:none}
> CREATE TYPE IF NOT EXISTS discord_channels.channel_recipient (
>     nick text
> );
> CREATE TYPE IF NOT EXISTS discord_channels.channel_permission_overwrite (
>     id bigint,
>     type int,
>     allow_ int,
>     deny int
> );
> CREATE TABLE IF NOT EXISTS discord_channels.channels (
>     id bigint,
>     guild_id bigint,
>     type tinyint,
>     name text,
>     topic text,
>     position int,
>     owner_id bigint,
>     icon_hash text,
>     recipients map<bigint, frozen<channel_recipient>>,
>     permission_overwrites map<bigint, frozen<channel_permission_overwrite>>,
>     bitrate int,
>     user_limit int,
>     last_pin_timestamp timestamp,
>     last_message_id bigint,
>     PRIMARY KEY (id)
> );
> {code}
> And then we executed the following alter.
> {code:none}
> ALTER TABLE discord_channels.channels ADD application_id bigint;
> {code}
> And one row (that we can tell) got corrupted at the same time and could no longer be read from the Python driver.
> {code:none}
> [E 161206 01:56:58 geventreactor:141] Error decoding response from Cassandra.
> ver(4); flags(); stream(27); op(8); offset(9); len(887); buffer: > '\x84\x00\x00\x1b\x08\x00\x00\x03w\x00\x00\x00\x02\x00\x00\x00\x01\x00\x00\x00\x0f\x00\x10discord_channels\x00\x08channels\x00\x02id\x00\x02\x00\x0eapplication_id\x00\x02\x00\x07bitrate\x00\t\x00\x08guild_id\x00\x02\x00\ticon_hash\x00\r\x00\x0flast_message_id\x00\x02\x00\x12last_pin_timestamp\x00\x0b\x00\x04name\x00\r\x00\x08owner_id\x00\x02\x00\x15permission_overwrites\x00!\x00\x02\x000\x00\x10discord_channels\x00\x1cchannel_permission_overwrite\x00\x04\x00\x02id\x00\x02\x00\x04type\x00\t\x00\x06allow_\x00\t\x00\x04deny\x00\t\x00\x08position\x00\t\x00\nrecipients\x00!\x00\x02\x000\x00\x10discord_channels\x00\x11channel_recipient\x00\x01\x00\x04nick\x00\r\x00\x05topic\x00\r\x00\x04type\x00\x14\x00\nuser_limit\x00\t\x00\x00\x00\x01\x00\x00\x00\x08\x03\x8a\x19\x8e\xf8\x82\x00\x01\xff\xff\xff\xff\x00\x00\x00\x04\x00\x00\xfa\x00\x00\x00\x00\x08\x00\x00\xfa\x00\x00\xf8G\xc5\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8b\xc0\xb5nB\x00\x02\x00\x00\x00\x08G\xc5\xffI\x98\xc4\xb4(\x00\x00\x00\x03\x8b\xc0\xa8\xff\xff\xff\xff\x00\x00\x01<\x00\x00\x00\x06\x00\x00\x00\x08\x03\x81L\xea\xfc\x82\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x81L\xea\xfc\x82\x00\n\x00\x00\x00\x04\x00\x00\x00\x01\x00\x00\x00\x04\x00\x00\x08\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1e\xe6\x8b\x80\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1e\xe6\x8b\x80\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x040\x07\xf8Q\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1f\x1b{\x82\x00\x00\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1f\x1b{\x82\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x07\xf8Q\x00\x00\x00\x04\x10\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1fH6\x82\x00\x01\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1fH6\x82\x00\x01\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x05\xe8A\x00\x00\x00\x04\x10\x02\x00\x00\x00\x00\x00\x08\x03\x8a+=\xca\xc0\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a+=\xca\xc0\x00\n\x00\x00\x0
0\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x08\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x8f\x979\x80\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x8f\x979\x80\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00 > > \x08\x01\x00\x00\x00\x04\xc4\xb4(\x00\xff\xff\xff\xff\x00\x00\x00O[f\x80Q\x07general\x05\xf8G\xc5\xffI\x98\xc4\xb4(\x00\xf8O[f\x80Q\x00\x00\x00\x02\x04\xf8O[f\x80Q\x00\xf8G\xc5\xffI\x98\x01\x00\x00\xf8O[f\x80Q\x00\x00\x00\x00\xf8G\xc5\xffI\x97\xc4\xb4(\x06\x00\xf8O\x7fe\x1fm\x08\x03\x00\x00\x00\x01\x00\x00\x00\x00\x04\x00\x00\x00\x00' > {code} > And then in cqlsh when trying to read the row we got this. > {code:none} > /usr/bin/cqlsh.py:632: DateOverFlowWarning: Some timestamps are larger than > Python datetime can represent. Timestamps are displayed in milliseconds from > epoch. > Traceback (most recent call last): > File "/usr/bin/cqlsh.py", line 1301, in perform_simple_statement > result = future.result() > File > "/usr/share/cassandra/lib/cassandra-driver-internal-only-3.5.0.post0-d8d0456.zip/cassandra-driver-
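For anyone reading the raw buffer above: its first nine bytes are the CQL native-protocol v4 frame header, which is exactly where the `ver(4); flags(); stream(27); op(8); offset(9); len(887)` values in the error line come from. A minimal decoding sketch:

```python
import struct

# First nine bytes of the buffer quoted in the error above.
hdr = b'\x84\x00\x00\x1b\x08\x00\x00\x03w'

# v3/v4 frame header layout: version(1), flags(1), stream(2, signed), opcode(1), length(4)
version, flags, stream, opcode, length = struct.unpack('>BBhBI', hdr)

print(version & 0x7f)  # 4: protocol v4 (the high bit marks a response frame)
print(stream)          # 27
print(opcode)          # 8: RESULT
print(length)          # 887 bytes of body follow the 9-byte header
```

The header itself decodes cleanly, so the driver's failure happens while parsing the RESULT body (the rows metadata and cell values), which is consistent with the corruption being in the stored row rather than in the transport framing.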
[jira] [Comment Edited] (CASSANDRA-13004) Corruption while adding a column to a table
[ https://issues.apache.org/jira/browse/CASSANDRA-13004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15731690#comment-15731690 ]

Stanislav Vishnevskiy edited comment on CASSANDRA-13004 at 12/8/16 7:57 PM:
----------------------------------------------------------------------------

We just ran into this issue on an older 3.0.7 cluster with a fairly low write velocity (70/sec).

was (Author: stanislav):
We just ran into this issue on an older 3.0.7 cluster with fairly low write velocity (500/sec).
[jira] [Commented] (CASSANDRA-13004) Corruption while adding a column to a table
[ https://issues.apache.org/jira/browse/CASSANDRA-13004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15731690#comment-15731690 ]

Stanislav Vishnevskiy commented on CASSANDRA-13004:
---------------------------------------------------

We just ran into this issue on an older 3.0.7 cluster with a fairly low write velocity (500/sec).
0\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x08\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x8f\x979\x80\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x8f\x979\x80\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00 > > \x08\x01\x00\x00\x00\x04\xc4\xb4(\x00\xff\xff\xff\xff\x00\x00\x00O[f\x80Q\x07general\x05\xf8G\xc5\xffI\x98\xc4\xb4(\x00\xf8O[f\x80Q\x00\x00\x00\x02\x04\xf8O[f\x80Q\x00\xf8G\xc5\xffI\x98\x01\x00\x00\xf8O[f\x80Q\x00\x00\x00\x00\xf8G\xc5\xffI\x97\xc4\xb4(\x06\x00\xf8O\x7fe\x1fm\x08\x03\x00\x00\x00\x01\x00\x00\x00\x00\x04\x00\x00\x00\x00' > {code} > And then in cqlsh when trying to read the row we got this. > {code:none} > /usr/bin/cqlsh.py:632: DateOverFlowWarning: Some timestamps are larger than > Python datetime can represent. Timestamps are displayed in milliseconds from > epoch. > Traceback (most recent call last): > File "/usr/bin/cqlsh.py", line 1301, in perform_simple_statement > result = future.result() > File > "/usr/share/cassandra/lib/cassandra-driver-internal-only-3.5.0.post0-d8d0456.zip/cassandra-driver-3.5.0.post0-d8d0456/cassandra/cluster.py", > line 3650, in result > raise self._final_exception > UnicodeDecodeError:
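The decode failure in the traceback above is reproducible with plain Python: a stray byte such as 0x80 cannot start a UTF-8 sequence, so decoding the damaged name column raises exactly this exception. The payload below is hypothetical, not the actual corrupted bytes.

```python
# Hypothetical corrupted payload: 0x80 cannot begin a UTF-8 sequence,
# so decoding fails the same way the driver did for the name column.
corrupt_name = b"ge\x80neral"

try:
    corrupt_name.decode("utf-8")
except UnicodeDecodeError as exc:
    # e.g. "'utf-8' codec can't decode byte 0x80 in position 2: invalid start byte"
    print(exc)

# A lossy fallback keeps the readable parts for inspection:
print(corrupt_name.decode("utf-8", errors="replace"))
```

The `errors="replace"` handler is how one could at least inspect such a row without the driver aborting the whole read.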
[jira] [Commented] (CASSANDRA-13004) Corruption while adding a column to a table
[ https://issues.apache.org/jira/browse/CASSANDRA-13004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15725002#comment-15725002 ] Stanislav Vishnevskiy commented on CASSANDRA-13004: --- On another node we saw this. {code:none} ERROR [MessagingService-Incoming-/10.10.0.129] 2016-12-06 01:44:17,430 CassandraDaemon.java:205 - Exception in thread Thread[MessagingService-Incoming-/10.10.0.129,5,main] java.lang.RuntimeException: Unknown column application_id during deserialization at org.apache.cassandra.db.Columns$Serializer.deserialize(Columns.java:432) ~[apache-cassandra-3.0.9.jar:3.0.9] at org.apache.cassandra.db.SerializationHeader$Serializer.deserializeForMessaging(SerializationHeader.java:427) ~[apache-cassandra-3.0.9.jar:3.0.9] at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.deserializeHeader(UnfilteredRowIteratorSerializer.java:190) ~[apache-cassandra-3.0.9.jar:3.0.9] at org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize30(PartitionUpdate.java:661) ~[apache-cassandra-3.0.9.jar:3.0.9] at org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize(PartitionUpdate.java:635) ~[apache-cassandra-3.0.9.jar:3.0.9] at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:334) ~[apache-cassandra-3.0.9.jar:3.0.9] at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:353) ~[apache-cassandra-3.0.9.jar:3.0.9] at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:290) ~[apache-cassandra-3.0.9.jar:3.0.9] at org.apache.cassandra.net.MessageIn.read(MessageIn.java:98) ~[apache-cassandra-3.0.9.jar:3.0.9] {code}
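The RuntimeException above arises when a node deserializes a mutation whose serialization header names a column its local schema does not have yet, i.e. the ALTER had not propagated to that replica. A minimal sketch of that lookup failure, with an illustrative column set rather than Cassandra's actual structures:

```python
# Illustrative stale schema on the receiving node: the ALTER that added
# application_id has not been applied here yet.
receiver_columns = {"id", "guild_id", "name", "topic", "last_pin_timestamp"}

def deserialize_column(name: str) -> str:
    # Mirrors the failure in Columns$Serializer.deserialize: every column
    # name in the message header must resolve against the local schema.
    if name not in receiver_columns:
        raise RuntimeError(f"Unknown column {name} during deserialization")
    return name

deserialize_column("name")              # resolves fine
# deserialize_column("application_id")  # raises RuntimeError: Unknown column
```

Once the schema change reaches the node, the lookup succeeds and the same mutation deserializes cleanly.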
[jira] [Comment Edited] (CASSANDRA-13004) Corruption while adding a column to a table
[ https://issues.apache.org/jira/browse/CASSANDRA-13004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15724156#comment-15724156 ] Stanislav Vishnevskiy edited comment on CASSANDRA-13004 at 12/6/16 3:01 AM: We found this in the logs at the exact time this happened. {code} ERROR [SharedPool-Worker-11] 2016-12-06 01:44:16,971 Message.java:617 - Unexpected exception during request; channel = [id: 0xbd9a77e9, /10.10.0.48:38317 => /10.10.0.129:9042] java.io.IOError: java.io.IOException: Corrupt value length 1485619006 encountered, as it exceeds the maximum of 268435456, which is set via max_value_size_in_mb in cassandra.yaml at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:222) ~[apache-cassandra-3.0.9.jar:3.0.9] at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:210) ~[apache-cassandra-3.0.9.jar:3.0.9] at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.0.9.jar:3.0.9] at org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:129) ~[apache-cassandra-3.0.9.jar:3.0.9] at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:369) ~[apache-cassandra-3.0.9.jar:3.0.9] at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:189) ~[apache-cassandra-3.0.9.jar:3.0.9] at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:158) ~[apache-cassandra-3.0.9.jar:3.0.9] at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.0.9.jar:3.0.9] at org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:509) ~[apache-cassandra-3.0.9.jar:3.0.9] at org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:369) 
~[apache-cassandra-3.0.9.jar:3.0.9] at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.0.9.jar:3.0.9] at org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:129) ~[apache-cassandra-3.0.9.jar:3.0.9] at org.apache.cassandra.db.transform.FilteredRows.isEmpty(FilteredRows.java:50) ~[apache-cassandra-3.0.9.jar:3.0.9] at org.apache.cassandra.db.transform.Filter.closeIfEmpty(Filter.java:73) ~[apache-cassandra-3.0.9.jar:3.0.9] at org.apache.cassandra.db.transform.Filter.applyToPartition(Filter.java:43) ~[apache-cassandra-3.0.9.jar:3.0.9] at org.apache.cassandra.db.transform.Filter.applyToPartition(Filter.java:26) ~[apache-cassandra-3.0.9.jar:3.0.9] at org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:96) ~[apache-cassandra-3.0.9.jar:3.0.9] at org.apache.cassandra.cql3.statements.SelectStatement.process(SelectStatement.java:707) ~[apache-cassandra-3.0.9.jar:3.0.9] at org.apache.cassandra.cql3.statements.SelectStatement.processResults(SelectStatement.java:400) ~[apache-cassandra-3.0.9.jar:3.0.9] at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:353) ~[apache-cassandra-3.0.9.jar:3.0.9] at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:227) ~[apache-cassandra-3.0.9.jar:3.0.9] at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:76) ~[apache-cassandra-3.0.9.jar:3.0.9] at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:206) ~[apache-cassandra-3.0.9.jar:3.0.9] at org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:487) ~[apache-cassandra-3.0.9.jar:3.0.9] at org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:464) ~[apache-cassandra-3.0.9.jar:3.0.9] at org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:130) ~[apache-cassandra-3.0.9.jar:3.0.9] at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:513) [apache-cassandra-3.0.9.jar:3.0.9] at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:407) [apache-cassandra-3.0.9.jar:3.0.9] at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.23.Final.jar:4.0.23.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) [netty-all-4.0.23.Final.jar:4.0.23.Final] at io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32) [netty-all-4.0.23.Final.ja
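The IOError above is a length-prefix sanity check: the serializer read a value length of 1485619006, which exceeds the 268435456-byte cap derived from max_value_size_in_mb, a strong sign the reader is misaligned or the data is corrupt. A stdlib sketch of such a guard (the layout and names are illustrative, not Cassandra's actual wire format):

```python
MAX_VALUE_SIZE = 256 * 1024 * 1024  # 268435456 bytes (max_value_size_in_mb = 256)

def read_value(buf: bytes, offset: int = 0):
    """Read a 4-byte big-endian length prefix, then that many value bytes."""
    length = int.from_bytes(buf[offset:offset + 4], "big")
    if length > MAX_VALUE_SIZE:
        # Corrupt or misaligned data usually surfaces as an absurd length.
        raise IOError(f"Corrupt value length {length} encountered, as it "
                      f"exceeds the maximum of {MAX_VALUE_SIZE}")
    start = offset + 4
    return buf[start:start + length], start + length

read_value(b"\x00\x00\x00\x03abc")  # -> (b"abc", 7)
```

Feeding it a prefix encoding 1485619006 raises the same kind of error as the log line, rather than attempting a multi-gigabyte allocation.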
[jira] [Updated] (CASSANDRA-13004) Corruption while adding a column to a table
[ https://issues.apache.org/jira/browse/CASSANDRA-13004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stanislav Vishnevskiy updated CASSANDRA-13004: -- Summary: Corruption while adding a column to a table (was: Corruption while adding a column to a table in production)
[jira] [Updated] (CASSANDRA-13004) Corruption while adding a column to a table in production
[ https://issues.apache.org/jira/browse/CASSANDRA-13004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stanislav Vishnevskiy updated CASSANDRA-13004: -- Description: We had the following schema in production. {code:text} CREATE TYPE IF NOT EXISTS discord_channels.channel_recipient ( nick text ); CREATE TYPE IF NOT EXISTS discord_channels.channel_permission_overwrite ( id bigint, type int, allow_ int, deny int ); CREATE TABLE IF NOT EXISTS discord_channels.channels ( id bigint, guild_id bigint, type tinyint, name text, topic text, position int, owner_id bigint, icon_hash text, recipients map<bigint, frozen<channel_recipient>>, permission_overwrites map<bigint, frozen<channel_permission_overwrite>>, bitrate int, user_limit int, last_pin_timestamp timestamp, last_message_id bigint, PRIMARY KEY (id) ); {code} And then we executed the following alter. {code:text} ALTER TABLE discord_channels.channels ADD application_id bigint; {code} And one row (that we can tell) got corrupted at the same time and could no longer be read from the Python driver. {code:text} [E 161206 01:56:58 geventreactor:141] Error decoding response from Cassandra. 
ver(4); flags(); stream(27); op(8); offset(9); len(887); buffer: '\x84\x00\x00\x1b\x08\x00\x00\x03w\x00\x00\x00\x02\x00\x00\x00\x01\x00\x00\x00\x0f\x00\x10discord_channels\x00\x08channels\x00\x02id\x00\x02\x00\x0eapplication_id\x00\x02\x00\x07bitrate\x00\t\x00\x08guild_id\x00\x02\x00\ticon_hash\x00\r\x00\x0flast_message_id\x00\x02\x00\x12last_pin_timestamp\x00\x0b\x00\x04name\x00\r\x00\x08owner_id\x00\x02\x00\x15permission_overwrites\x00!\x00\x02\x000\x00\x10discord_channels\x00\x1cchannel_permission_overwrite\x00\x04\x00\x02id\x00\x02\x00\x04type\x00\t\x00\x06allow_\x00\t\x00\x04deny\x00\t\x00\x08position\x00\t\x00\nrecipients\x00!\x00\x02\x000\x00\x10discord_channels\x00\x11channel_recipient\x00\x01\x00\x04nick\x00\r\x00\x05topic\x00\r\x00\x04type\x00\x14\x00\nuser_limit\x00\t\x00\x00\x00\x01\x00\x00\x00\x08\x03\x8a\x19\x8e\xf8\x82\x00\x01\xff\xff\xff\xff\x00\x00\x00\x04\x00\x00\xfa\x00\x00\x00\x00\x08\x00\x00\xfa\x00\x00\xf8G\xc5\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8b\xc0\xb5nB\x00\x02\x00\x00\x00\x08G\xc5\xffI\x98\xc4\xb4(\x00\x00\x00\x03\x8b\xc0\xa8\xff\xff\xff\xff\x00\x00\x01<\x00\x00\x00\x06\x00\x00\x00\x08\x03\x81L\xea\xfc\x82\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x81L\xea\xfc\x82\x00\n\x00\x00\x00\x04\x00\x00\x00\x01\x00\x00\x00\x04\x00\x00\x08\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1e\xe6\x8b\x80\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1e\xe6\x8b\x80\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x040\x07\xf8Q\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1f\x1b{\x82\x00\x00\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1f\x1b{\x82\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x07\xf8Q\x00\x00\x00\x04\x10\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1fH6\x82\x00\x01\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1fH6\x82\x00\x01\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x05\xe8A\x00\x00\x00\x04\x10\x02\x00\x00\x00\x00\x00\x08\x03\x8a+=\xca\xc0\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a+=\xca\xc0\x00\n\x00\x00\x00\x0
4\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x08\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x8f\x979\x80\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x8f\x979\x80\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00 \x08\x01\x00\x00\x00\x04\xc4\xb4(\x00\xff\xff\xff\xff\x00\x00\x00O[f\x80Q\x07general\x05\xf8G\xc5\xffI\x98\xc4\xb4(\x00\xf8O[f\x80Q\x00\x00\x00\x02\x04\xf8O[f\x80Q\x00\xf8G\xc5\xffI\x98\x01\x00\x00\xf8O[f\x80Q\x00\x00\x00\x00\xf8G\xc5\xffI\x97\xc4\xb4(\x06\x00\xf8O\x7fe\x1fm\x08\x03\x00\x00\x00\x01\x00\x00\x00\x00\x04\x00\x00\x00\x00' {code} And then in cqlsh when trying to read the row we got this. {code:text} /usr/bin/cqlsh.py:632: DateOverFlowWarning: Some timestamps are larger than Python datetime can represent. Timestamps are displayed in milliseconds from epoch. Traceback (most recent call last): File "/usr/bin/cqlsh.py", line 1301, in perform_simple_statement result = future.result() File "/usr/share/cassandra/lib/cassandra-driver-internal-only-3.5.0.post0-d8d0456.zip/cassandra-driver-3.5.0.post0-d8d0456/cassandra/cluster.py", line 3650, in result raise self._final_exception UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 2: invalid start byte {code} We tried to read the data and it would refuse to read the name column (the UTF8 error) and the last_pin_timestamp column had an absurdly large value. We ended up rewriting the whole row as we had the data in another place and it fixed the problem. However there is clearly a race condition in the schema change sub-system. Any ideas? was: We had the following schema in production. {code:cql} CREATE TYPE IF NOT EXISTS discord_channels.channel_recipient ( nick
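The DateOverFlowWarning behavior quoted above, falling back to raw milliseconds when a timestamp exceeds what Python's datetime can represent, can be sketched in plain Python. The function name is illustrative, not cqlsh's actual code; it shows why the corrupted last_pin_timestamp printed as an absurdly large number instead of a date.

```python
from datetime import datetime, timezone

def render_timestamp(ms: int) -> str:
    # Normal timestamps render as ISO 8601; absurdly large ones (like the
    # corrupted last_pin_timestamp) fall back to milliseconds from epoch.
    try:
        return datetime.fromtimestamp(ms / 1000, tz=timezone.utc).isoformat()
    except (OverflowError, OSError, ValueError):
        return str(ms)

render_timestamp(1480985056971)  # a sane 2016 timestamp, renders as a date
render_timestamp(2**62)          # beyond datetime's range, returned as-is
```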
[jira] [Updated] (CASSANDRA-13004) Corruption while adding a column to a table
[ https://issues.apache.org/jira/browse/CASSANDRA-13004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stanislav Vishnevskiy updated CASSANDRA-13004: -- Description: We had the following schema in production. {code:none} CREATE TYPE IF NOT EXISTS discord_channels.channel_recipient ( nick text ); CREATE TYPE IF NOT EXISTS discord_channels.channel_permission_overwrite ( id bigint, type int, allow_ int, deny int ); CREATE TABLE IF NOT EXISTS discord_channels.channels ( id bigint, guild_id bigint, type tinyint, name text, topic text, position int, owner_id bigint, icon_hash text, recipients map<bigint, frozen<channel_recipient>>, permission_overwrites map<bigint, frozen<channel_permission_overwrite>>, bitrate int, user_limit int, last_pin_timestamp timestamp, last_message_id bigint, PRIMARY KEY (id) ); {code} And then we executed the following alter. {code:none} ALTER TABLE discord_channels.channels ADD application_id bigint; {code} And one row (that we can tell) got corrupted at the same time and could no longer be read from the Python driver. {code:none} [E 161206 01:56:58 geventreactor:141] Error decoding response from Cassandra. 
ver(4); flags(); stream(27); op(8); offset(9); len(887); buffer: '\x84\x00\x00\x1b\x08\x00\x00\x03w\x00\x00\x00\x02\x00\x00\x00\x01\x00\x00\x00\x0f\x00\x10discord_channels\x00\x08channels\x00\x02id\x00\x02\x00\x0eapplication_id\x00\x02\x00\x07bitrate\x00\t\x00\x08guild_id\x00\x02\x00\ticon_hash\x00\r\x00\x0flast_message_id\x00\x02\x00\x12last_pin_timestamp\x00\x0b\x00\x04name\x00\r\x00\x08owner_id\x00\x02\x00\x15permission_overwrites\x00!\x00\x02\x000\x00\x10discord_channels\x00\x1cchannel_permission_overwrite\x00\x04\x00\x02id\x00\x02\x00\x04type\x00\t\x00\x06allow_\x00\t\x00\x04deny\x00\t\x00\x08position\x00\t\x00\nrecipients\x00!\x00\x02\x000\x00\x10discord_channels\x00\x11channel_recipient\x00\x01\x00\x04nick\x00\r\x00\x05topic\x00\r\x00\x04type\x00\x14\x00\nuser_limit\x00\t\x00\x00\x00\x01\x00\x00\x00\x08\x03\x8a\x19\x8e\xf8\x82\x00\x01\xff\xff\xff\xff\x00\x00\x00\x04\x00\x00\xfa\x00\x00\x00\x00\x08\x00\x00\xfa\x00\x00\xf8G\xc5\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8b\xc0\xb5nB\x00\x02\x00\x00\x00\x08G\xc5\xffI\x98\xc4\xb4(\x00\x00\x00\x03\x8b\xc0\xa8\xff\xff\xff\xff\x00\x00\x01<\x00\x00\x00\x06\x00\x00\x00\x08\x03\x81L\xea\xfc\x82\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x81L\xea\xfc\x82\x00\n\x00\x00\x00\x04\x00\x00\x00\x01\x00\x00\x00\x04\x00\x00\x08\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1e\xe6\x8b\x80\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1e\xe6\x8b\x80\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x040\x07\xf8Q\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1f\x1b{\x82\x00\x00\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1f\x1b{\x82\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x07\xf8Q\x00\x00\x00\x04\x10\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1fH6\x82\x00\x01\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1fH6\x82\x00\x01\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x05\xe8A\x00\x00\x00\x04\x10\x02\x00\x00\x00\x00\x00\x08\x03\x8a+=\xca\xc0\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a+=\xca\xc0\x00\n\x00\x00\x00\x0
4\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x08\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x8f\x979\x80\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x8f\x979\x80\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00 \x08\x01\x00\x00\x00\x04\xc4\xb4(\x00\xff\xff\xff\xff\x00\x00\x00O[f\x80Q\x07general\x05\xf8G\xc5\xffI\x98\xc4\xb4(\x00\xf8O[f\x80Q\x00\x00\x00\x02\x04\xf8O[f\x80Q\x00\xf8G\xc5\xffI\x98\x01\x00\x00\xf8O[f\x80Q\x00\x00\x00\x00\xf8G\xc5\xffI\x97\xc4\xb4(\x06\x00\xf8O\x7fe\x1fm\x08\x03\x00\x00\x00\x01\x00\x00\x00\x00\x04\x00\x00\x00\x00' {code} And then in cqlsh when trying to read the row we got this. {code:none} /usr/bin/cqlsh.py:632: DateOverFlowWarning: Some timestamps are larger than Python datetime can represent. Timestamps are displayed in milliseconds from epoch. Traceback (most recent call last): File "/usr/bin/cqlsh.py", line 1301, in perform_simple_statement result = future.result() File "/usr/share/cassandra/lib/cassandra-driver-internal-only-3.5.0.post0-d8d0456.zip/cassandra-driver-3.5.0.post0-d8d0456/cassandra/cluster.py", line 3650, in result raise self._final_exception UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 2: invalid start byte {code} We tried to read the data and it would refuse to read the name column (the UTF8 error) and the last_pin_timestamp column had an absurdly large value. We ended up rewriting the whole row as we had the data in another place and it fixed the problem. However there is clearly a race condition in the schema change sub-system. Any ideas? was: We had the following schema in production. {code:text} CREATE TYPE IF NOT EXISTS discord_channels.channel_recipient ( nick
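Incidentally, the fields in that log line (ver(4); flags(); stream(27); op(8); offset(9); len(887)) can be read straight off the first nine bytes of the dumped buffer. A minimal sketch, assuming the standard CQL native-protocol v4 frame header layout:

```python
import struct

# Reading the frame-header fields out of the dumped buffer. A v4 native-protocol
# frame starts with: 1 byte version, 1 byte flags, 2 bytes stream id, 1 byte
# opcode, 4 bytes body length (all big-endian), so the body begins at offset 9.
header = b"\x84\x00\x00\x1b\x08\x00\x00\x03w"  # first 9 bytes of the buffer above

version_byte, flags, stream, opcode, length = struct.unpack(">BBhBi", header)
assert version_byte & 0x80          # high bit set: this is a response frame
assert version_byte & 0x7F == 4     # protocol version   -> ver(4)
assert flags == 0                   # no flags set       -> flags()
assert stream == 27                 # stream id          -> stream(27)
assert opcode == 0x08               # RESULT opcode      -> op(8)
assert length == 887                # body length        -> len(887), body at offset(9)
```

The envelope checks out against the log, which suggests the frame itself is well-formed and the corruption sits inside the serialized row values rather than in the transport layer.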
[jira] [Updated] (CASSANDRA-13004) Corruption while adding a column to a table in production
[ https://issues.apache.org/jira/browse/CASSANDRA-13004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stanislav Vishnevskiy updated CASSANDRA-13004: -- Summary: Corruption while adding a column to a table in production (was: Corruption while add a column to a table in production) > Corruption while adding a column to a table in production > - > > Key: CASSANDRA-13004 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13004 > Project: Cassandra > Issue Type: Bug >Reporter: Stanislav Vishnevskiy > > We had the following schema in production. > {code:text} > CREATE TYPE IF NOT EXISTS discord_channels.channel_recipient ( > nick text > ); > CREATE TYPE IF NOT EXISTS discord_channels.channel_permission_overwrite ( > id bigint, > type int, > allow_ int, > deny int > ); > CREATE TABLE IF NOT EXISTS discord_channels.channels ( > id bigint, > guild_id bigint, > type tinyint, > name text, > topic text, > position int, > owner_id bigint, > icon_hash text, > recipients map<bigint, frozen<channel_recipient>>, > permission_overwrites map<bigint, frozen<channel_permission_overwrite>>, > bitrate int, > user_limit int, > last_pin_timestamp timestamp, > last_message_id bigint, > PRIMARY KEY (id) > ); > {code} > And then we executed the following alter. > {code:text} > ALTER TABLE discord_channels.channels ADD application_id bigint; > {code} > And one row (that we can tell) got corrupted at the same time and could no > longer be read from the Python driver. > {code:text} > [E 161206 01:56:58 geventreactor:141] Error decoding response from Cassandra. 
> ver(4); flags(); stream(27); op(8); offset(9); len(887); buffer: > '\x84\x00\x00\x1b\x08\x00\x00\x03w\x00\x00\x00\x02\x00\x00\x00\x01\x00\x00\x00\x0f\x00\x10discord_channels\x00\x08channels\x00\x02id\x00\x02\x00\x0eapplication_id\x00\x02\x00\x07bitrate\x00\t\x00\x08guild_id\x00\x02\x00\ticon_hash\x00\r\x00\x0flast_message_id\x00\x02\x00\x12last_pin_timestamp\x00\x0b\x00\x04name\x00\r\x00\x08owner_id\x00\x02\x00\x15permission_overwrites\x00!\x00\x02\x000\x00\x10discord_channels\x00\x1cchannel_permission_overwrite\x00\x04\x00\x02id\x00\x02\x00\x04type\x00\t\x00\x06allow_\x00\t\x00\x04deny\x00\t\x00\x08position\x00\t\x00\nrecipients\x00!\x00\x02\x000\x00\x10discord_channels\x00\x11channel_recipient\x00\x01\x00\x04nick\x00\r\x00\x05topic\x00\r\x00\x04type\x00\x14\x00\nuser_limit\x00\t\x00\x00\x00\x01\x00\x00\x00\x08\x03\x8a\x19\x8e\xf8\x82\x00\x01\xff\xff\xff\xff\x00\x00\x00\x04\x00\x00\xfa\x00\x00\x00\x00\x08\x00\x00\xfa\x00\x00\xf8G\xc5\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8b\xc0\xb5nB\x00\x02\x00\x00\x00\x08G\xc5\xffI\x98\xc4\xb4(\x00\x00\x00\x03\x8b\xc0\xa8\xff\xff\xff\xff\x00\x00\x01<\x00\x00\x00\x06\x00\x00\x00\x08\x03\x81L\xea\xfc\x82\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x81L\xea\xfc\x82\x00\n\x00\x00\x00\x04\x00\x00\x00\x01\x00\x00\x00\x04\x00\x00\x08\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1e\xe6\x8b\x80\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1e\xe6\x8b\x80\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x040\x07\xf8Q\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1f\x1b{\x82\x00\x00\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1f\x1b{\x82\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x07\xf8Q\x00\x00\x00\x04\x10\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1fH6\x82\x00\x01\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1fH6\x82\x00\x01\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x05\xe8A\x00\x00\x00\x04\x10\x02\x00\x00\x00\x00\x00\x08\x03\x8a+=\xca\xc0\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a+=\xca\xc0\x00\n\x00\x00\x0
0\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x08\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x8f\x979\x80\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x8f\x979\x80\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00 > > \x08\x01\x00\x00\x00\x04\xc4\xb4(\x00\xff\xff\xff\xff\x00\x00\x00O[f\x80Q\x07general\x05\xf8G\xc5\xffI\x98\xc4\xb4(\x00\xf8O[f\x80Q\x00\x00\x00\x02\x04\xf8O[f\x80Q\x00\xf8G\xc5\xffI\x98\x01\x00\x00\xf8O[f\x80Q\x00\x00\x00\x00\xf8G\xc5\xffI\x97\xc4\xb4(\x06\x00\xf8O\x7fe\x1fm\x08\x03\x00\x00\x00\x01\x00\x00\x00\x00\x04\x00\x00\x00\x00' > {code} > And then in cqlsh when trying to read the row we got this. > {code:text} > /usr/bin/cqlsh.py:632: DateOverFlowWarning: Some timestamps are larger than > Python datetime can represent. Timestamps are displayed in milliseconds from > epoch. > Traceback (most recent call last): > File "/usr/bin/cqlsh.py", line 1301, in perform_simple_statement > result = future.result() > File > "/usr/share/cassandra/lib/cassandra-driver-internal-only-3.5.0.post0-d8d0456.zip/cassandra-driver-3.5.0.post0-d8d0456/cassandra/cluster.py", > line 3650, in result > raise self._final_exception > UnicodeDe
[jira] [Updated] (CASSANDRA-13004) Corruption while add a column to a table in production
[ https://issues.apache.org/jira/browse/CASSANDRA-13004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stanislav Vishnevskiy updated CASSANDRA-13004: -- Description: We had the following schema in production. {code:cql} CREATE TYPE IF NOT EXISTS discord_channels.channel_recipient ( nick text ); CREATE TYPE IF NOT EXISTS discord_channels.channel_permission_overwrite ( id bigint, type int, allow_ int, deny int ); CREATE TABLE IF NOT EXISTS discord_channels.channels ( id bigint, guild_id bigint, type tinyint, name text, topic text, position int, owner_id bigint, icon_hash text, recipients map<bigint, frozen<channel_recipient>>, permission_overwrites map<bigint, frozen<channel_permission_overwrite>>, bitrate int, user_limit int, last_pin_timestamp timestamp, last_message_id bigint, PRIMARY KEY (id) ); {code} And then we executed the following alter. {code:cql} ALTER TABLE discord_channels.channels ADD application_id bigint; {code} And one row (that we can tell) got corrupted at the same time and could no longer be read from the Python driver. {code:text} [E 161206 01:56:58 geventreactor:141] Error decoding response from Cassandra. 
ver(4); flags(); stream(27); op(8); offset(9); len(887); buffer: '\x84\x00\x00\x1b\x08\x00\x00\x03w\x00\x00\x00\x02\x00\x00\x00\x01\x00\x00\x00\x0f\x00\x10discord_channels\x00\x08channels\x00\x02id\x00\x02\x00\x0eapplication_id\x00\x02\x00\x07bitrate\x00\t\x00\x08guild_id\x00\x02\x00\ticon_hash\x00\r\x00\x0flast_message_id\x00\x02\x00\x12last_pin_timestamp\x00\x0b\x00\x04name\x00\r\x00\x08owner_id\x00\x02\x00\x15permission_overwrites\x00!\x00\x02\x000\x00\x10discord_channels\x00\x1cchannel_permission_overwrite\x00\x04\x00\x02id\x00\x02\x00\x04type\x00\t\x00\x06allow_\x00\t\x00\x04deny\x00\t\x00\x08position\x00\t\x00\nrecipients\x00!\x00\x02\x000\x00\x10discord_channels\x00\x11channel_recipient\x00\x01\x00\x04nick\x00\r\x00\x05topic\x00\r\x00\x04type\x00\x14\x00\nuser_limit\x00\t\x00\x00\x00\x01\x00\x00\x00\x08\x03\x8a\x19\x8e\xf8\x82\x00\x01\xff\xff\xff\xff\x00\x00\x00\x04\x00\x00\xfa\x00\x00\x00\x00\x08\x00\x00\xfa\x00\x00\xf8G\xc5\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8b\xc0\xb5nB\x00\x02\x00\x00\x00\x08G\xc5\xffI\x98\xc4\xb4(\x00\x00\x00\x03\x8b\xc0\xa8\xff\xff\xff\xff\x00\x00\x01<\x00\x00\x00\x06\x00\x00\x00\x08\x03\x81L\xea\xfc\x82\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x81L\xea\xfc\x82\x00\n\x00\x00\x00\x04\x00\x00\x00\x01\x00\x00\x00\x04\x00\x00\x08\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1e\xe6\x8b\x80\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1e\xe6\x8b\x80\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x040\x07\xf8Q\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1f\x1b{\x82\x00\x00\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1f\x1b{\x82\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x07\xf8Q\x00\x00\x00\x04\x10\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1fH6\x82\x00\x01\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1fH6\x82\x00\x01\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x05\xe8A\x00\x00\x00\x04\x10\x02\x00\x00\x00\x00\x00\x08\x03\x8a+=\xca\xc0\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a+=\xca\xc0\x00\n\x00\x00\x00\x0
4\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x08\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x8f\x979\x80\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x8f\x979\x80\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00 \x08\x01\x00\x00\x00\x04\xc4\xb4(\x00\xff\xff\xff\xff\x00\x00\x00O[f\x80Q\x07general\x05\xf8G\xc5\xffI\x98\xc4\xb4(\x00\xf8O[f\x80Q\x00\x00\x00\x02\x04\xf8O[f\x80Q\x00\xf8G\xc5\xffI\x98\x01\x00\x00\xf8O[f\x80Q\x00\x00\x00\x00\xf8G\xc5\xffI\x97\xc4\xb4(\x06\x00\xf8O\x7fe\x1fm\x08\x03\x00\x00\x00\x01\x00\x00\x00\x00\x04\x00\x00\x00\x00' {code} And then in cqlsh when trying to read the row we got this. {code:text} /usr/bin/cqlsh.py:632: DateOverFlowWarning: Some timestamps are larger than Python datetime can represent. Timestamps are displayed in milliseconds from epoch. Traceback (most recent call last): File "/usr/bin/cqlsh.py", line 1301, in perform_simple_statement result = future.result() File "/usr/share/cassandra/lib/cassandra-driver-internal-only-3.5.0.post0-d8d0456.zip/cassandra-driver-3.5.0.post0-d8d0456/cassandra/cluster.py", line 3650, in result raise self._final_exception UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 2: invalid start byte {/code} We tried to read the data and it would refuse to read the name column (the UTF8 error) and the last_pin_timestamp column had an absurdly large value. We ended up rewriting the whole row as we had the data in another place and it fixed the problem. However there is clearly a race condition in the schema change sub-system. Any ideas? was: We had the following schema in production. {code:cql} CREATE TYPE IF NOT EXISTS discord_channels.channel_recipient ( nick
[jira] [Updated] (CASSANDRA-13004) Corruption while add a column to a table in production
[ https://issues.apache.org/jira/browse/CASSANDRA-13004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stanislav Vishnevskiy updated CASSANDRA-13004: -- Description: We had the following schema in production. {code:cql} CREATE TYPE IF NOT EXISTS discord_channels.channel_recipient ( nick text ); CREATE TYPE IF NOT EXISTS discord_channels.channel_permission_overwrite ( id bigint, type int, allow_ int, deny int ); CREATE TABLE IF NOT EXISTS discord_channels.channels ( id bigint, guild_id bigint, type tinyint, name text, topic text, position int, owner_id bigint, icon_hash text, recipients map<bigint, frozen<channel_recipient>>, permission_overwrites map<bigint, frozen<channel_permission_overwrite>>, bitrate int, user_limit int, last_pin_timestamp timestamp, last_message_id bigint, PRIMARY KEY (id) ); {code} And then we executed the following alter. {code:cql} ALTER TABLE discord_channels.channels ADD application_id bigint; {code} And one row (that we can tell) got corrupted at the same time and could no longer be read from the Python driver. {code:text} [E 161206 01:56:58 geventreactor:141] Error decoding response from Cassandra. 
ver(4); flags(); stream(27); op(8); offset(9); len(887); buffer: '\x84\x00\x00\x1b\x08\x00\x00\x03w\x00\x00\x00\x02\x00\x00\x00\x01\x00\x00\x00\x0f\x00\x10discord_channels\x00\x08channels\x00\x02id\x00\x02\x00\x0eapplication_id\x00\x02\x00\x07bitrate\x00\t\x00\x08guild_id\x00\x02\x00\ticon_hash\x00\r\x00\x0flast_message_id\x00\x02\x00\x12last_pin_timestamp\x00\x0b\x00\x04name\x00\r\x00\x08owner_id\x00\x02\x00\x15permission_overwrites\x00!\x00\x02\x000\x00\x10discord_channels\x00\x1cchannel_permission_overwrite\x00\x04\x00\x02id\x00\x02\x00\x04type\x00\t\x00\x06allow_\x00\t\x00\x04deny\x00\t\x00\x08position\x00\t\x00\nrecipients\x00!\x00\x02\x000\x00\x10discord_channels\x00\x11channel_recipient\x00\x01\x00\x04nick\x00\r\x00\x05topic\x00\r\x00\x04type\x00\x14\x00\nuser_limit\x00\t\x00\x00\x00\x01\x00\x00\x00\x08\x03\x8a\x19\x8e\xf8\x82\x00\x01\xff\xff\xff\xff\x00\x00\x00\x04\x00\x00\xfa\x00\x00\x00\x00\x08\x00\x00\xfa\x00\x00\xf8G\xc5\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8b\xc0\xb5nB\x00\x02\x00\x00\x00\x08G\xc5\xffI\x98\xc4\xb4(\x00\x00\x00\x03\x8b\xc0\xa8\xff\xff\xff\xff\x00\x00\x01<\x00\x00\x00\x06\x00\x00\x00\x08\x03\x81L\xea\xfc\x82\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x81L\xea\xfc\x82\x00\n\x00\x00\x00\x04\x00\x00\x00\x01\x00\x00\x00\x04\x00\x00\x08\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1e\xe6\x8b\x80\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1e\xe6\x8b\x80\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x040\x07\xf8Q\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1f\x1b{\x82\x00\x00\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1f\x1b{\x82\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x07\xf8Q\x00\x00\x00\x04\x10\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1fH6\x82\x00\x01\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1fH6\x82\x00\x01\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x05\xe8A\x00\x00\x00\x04\x10\x02\x00\x00\x00\x00\x00\x08\x03\x8a+=\xca\xc0\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a+=\xca\xc0\x00\n\x00\x00\x00\x0
4\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x08\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x8f\x979\x80\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x8f\x979\x80\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00 \x08\x01\x00\x00\x00\x04\xc4\xb4(\x00\xff\xff\xff\xff\x00\x00\x00O[f\x80Q\x07general\x05\xf8G\xc5\xffI\x98\xc4\xb4(\x00\xf8O[f\x80Q\x00\x00\x00\x02\x04\xf8O[f\x80Q\x00\xf8G\xc5\xffI\x98\x01\x00\x00\xf8O[f\x80Q\x00\x00\x00\x00\xf8G\xc5\xffI\x97\xc4\xb4(\x06\x00\xf8O\x7fe\x1fm\x08\x03\x00\x00\x00\x01\x00\x00\x00\x00\x04\x00\x00\x00\x00' {code} And then in cqlsh when trying to read the row we got this. {code:text} /usr/bin/cqlsh.py:632: DateOverFlowWarning: Some timestamps are larger than Python datetime can represent. Timestamps are displayed in milliseconds from epoch. Traceback (most recent call last): File "/usr/bin/cqlsh.py", line 1301, in perform_simple_statement result = future.result() File "/usr/share/cassandra/lib/cassandra-driver-internal-only-3.5.0.post0-d8d0456.zip/cassandra-driver-3.5.0.post0-d8d0456/cassandra/cluster.py", line 3650, in result raise self._final_exception UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 2: invalid start byte {code} We tried to read the data and it would refuse to read the name column (the UTF8 error) and the last_pin_timestamp column had an absurdly large value. We ended up rewriting the whole row as we had the data in another place and it fixed the problem. However there is clearly a race condition in the schema change sub-system. Any ideas? was: We had the following schema in production. {code:cql} CREATE TYPE IF NOT EXISTS discord_channels.channel_recipient ( nick t
[jira] [Created] (CASSANDRA-13004) Corruption while add a column to a table in production
Stanislav Vishnevskiy created CASSANDRA-13004: - Summary: Corruption while add a column to a table in production Key: CASSANDRA-13004 URL: https://issues.apache.org/jira/browse/CASSANDRA-13004 Project: Cassandra Issue Type: Bug Reporter: Stanislav Vishnevskiy We had the following schema in production. {code:cql} CREATE TYPE IF NOT EXISTS discord_channels.channel_recipient ( nick text ); CREATE TYPE IF NOT EXISTS discord_channels.channel_permission_overwrite ( id bigint, type int, allow_ int, deny int ); CREATE TABLE IF NOT EXISTS discord_channels.channels ( id bigint, guild_id bigint, type tinyint, name text, topic text, position int, owner_id bigint, icon_hash text, recipients map<bigint, frozen<channel_recipient>>, permission_overwrites map<bigint, frozen<channel_permission_overwrite>>, bitrate int, user_limit int, last_pin_timestamp timestamp, last_message_id bigint, PRIMARY KEY (id) ); {code} And then we executed the following alter. {code:cql} ALTER TABLE discord_channels.channels ADD application_id bigint; {code} And one row (that we can tell) got corrupted at the same time and could no longer be read from the Python driver. {code} [E 161206 01:56:58 geventreactor:141] Error decoding response from Cassandra. 
ver(4); flags(); stream(27); op(8); offset(9); len(887); buffer: '\x84\x00\x00\x1b\x08\x00\x00\x03w\x00\x00\x00\x02\x00\x00\x00\x01\x00\x00\x00\x0f\x00\x10discord_channels\x00\x08channels\x00\x02id\x00\x02\x00\x0eapplication_id\x00\x02\x00\x07bitrate\x00\t\x00\x08guild_id\x00\x02\x00\ticon_hash\x00\r\x00\x0flast_message_id\x00\x02\x00\x12last_pin_timestamp\x00\x0b\x00\x04name\x00\r\x00\x08owner_id\x00\x02\x00\x15permission_overwrites\x00!\x00\x02\x000\x00\x10discord_channels\x00\x1cchannel_permission_overwrite\x00\x04\x00\x02id\x00\x02\x00\x04type\x00\t\x00\x06allow_\x00\t\x00\x04deny\x00\t\x00\x08position\x00\t\x00\nrecipients\x00!\x00\x02\x000\x00\x10discord_channels\x00\x11channel_recipient\x00\x01\x00\x04nick\x00\r\x00\x05topic\x00\r\x00\x04type\x00\x14\x00\nuser_limit\x00\t\x00\x00\x00\x01\x00\x00\x00\x08\x03\x8a\x19\x8e\xf8\x82\x00\x01\xff\xff\xff\xff\x00\x00\x00\x04\x00\x00\xfa\x00\x00\x00\x00\x08\x00\x00\xfa\x00\x00\xf8G\xc5\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8b\xc0\xb5nB\x00\x02\x00\x00\x00\x08G\xc5\xffI\x98\xc4\xb4(\x00\x00\x00\x03\x8b\xc0\xa8\xff\xff\xff\xff\x00\x00\x01<\x00\x00\x00\x06\x00\x00\x00\x08\x03\x81L\xea\xfc\x82\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x81L\xea\xfc\x82\x00\n\x00\x00\x00\x04\x00\x00\x00\x01\x00\x00\x00\x04\x00\x00\x08\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1e\xe6\x8b\x80\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1e\xe6\x8b\x80\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x040\x07\xf8Q\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1f\x1b{\x82\x00\x00\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1f\x1b{\x82\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x07\xf8Q\x00\x00\x00\x04\x10\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1fH6\x82\x00\x01\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1fH6\x82\x00\x01\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x05\xe8A\x00\x00\x00\x04\x10\x02\x00\x00\x00\x00\x00\x08\x03\x8a+=\xca\xc0\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a+=\xca\xc0\x00\n\x00\x00\x00\x0
4\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x08\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x8f\x979\x80\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x8f\x979\x80\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00 \x08\x01\x00\x00\x00\x04\xc4\xb4(\x00\xff\xff\xff\xff\x00\x00\x00O[f\x80Q\x07general\x05\xf8G\xc5\xffI\x98\xc4\xb4(\x00\xf8O[f\x80Q\x00\x00\x00\x02\x04\xf8O[f\x80Q\x00\xf8G\xc5\xffI\x98\x01\x00\x00\xf8O[f\x80Q\x00\x00\x00\x00\xf8G\xc5\xffI\x97\xc4\xb4(\x06\x00\xf8O\x7fe\x1fm\x08\x03\x00\x00\x00\x01\x00\x00\x00\x00\x04\x00\x00\x00\x00' {code} And then in cqlsh when trying to read the row we got this. {code} /usr/bin/cqlsh.py:632: DateOverFlowWarning: Some timestamps are larger than Python datetime can represent. Timestamps are displayed in milliseconds from epoch. Traceback (most recent call last): File "/usr/bin/cqlsh.py", line 1301, in perform_simple_statement result = future.result() File "/usr/share/cassandra/lib/cassandra-driver-internal-only-3.5.0.post0-d8d0456.zip/cassandra-driver-3.5.0.post0-d8d0456/cassandra/cluster.py", line 3650, in result raise self._final_exception UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 2: invalid start byte {/code} We tried to read the data and it would refuse to read the name column (the UTF8 error) and the last_pin_timestamp column had an absurdly large value. We ended up rewriting the whole row as we had the data in another place and it fixed the problem. However there is clearly a race condition in the schema change sub-system. Any ideas
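Both client-side symptoms in the report above are reproducible in isolation. A small sketch with illustrative values (not the actual corrupted bytes):

```python
import datetime

# Symptom 1: the name column. 0x80 can never begin a UTF-8 sequence, so decoding
# fails just like the cqlsh traceback ("can't decode byte 0x80 in position 2").
corrupt_name = b"ge\x80neral"  # hypothetical bytes with a stray 0x80 at index 2
try:
    corrupt_name.decode("utf-8")
    raise AssertionError("decode unexpectedly succeeded")
except UnicodeDecodeError as exc:
    assert exc.start == 2  # invalid start byte at position 2

# Symptom 2: last_pin_timestamp. CQL timestamps are signed 64-bit milliseconds
# since the epoch; a corrupted value can exceed what Python's datetime can
# represent, which is what cqlsh's DateOverFlowWarning guards against.
absurd_ms = 2**62  # placeholder for the "absurdly large" value
max_ms = int(datetime.datetime.max.replace(tzinfo=datetime.timezone.utc).timestamp() * 1000)
assert absurd_ms > max_ms  # too large to convert to a Python datetime
```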
[jira] [Commented] (CASSANDRA-12144) Undeletable rows after upgrading from 2.2.4 to 3.0.7
[ https://issues.apache.org/jira/browse/CASSANDRA-12144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15365165#comment-15365165 ] Stanislav Vishnevskiy commented on CASSANDRA-12144: --- I sent you an email. > Undeletable rows after upgrading from 2.2.4 to 3.0.7 > > > Key: CASSANDRA-12144 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12144 > Project: Cassandra > Issue Type: Bug >Reporter: Stanislav Vishnevskiy >Assignee: Alex Petrov > > We upgraded our cluster today and now have a some rows that refuse to delete. > Here are some example traces. > https://gist.github.com/vishnevskiy/36aa18c468344ea22d14f9fb9b99171d > Even weirder. > Updating the row and querying it back results in 2 rows even though the id is > the clustering key. > {noformat} > user_id| id | since| type > ---++--+-- > 116138050710536192 | 153047019424972800 | null |0 > 116138050710536192 | 153047019424972800 | 2016-05-30 14:53:08+ |2 > {noformat} > And then deleting it again only removes the new one. > {noformat} > cqlsh:discord_relationships> DELETE FROM relationships WHERE user_id = > 116138050710536192 AND id = 153047019424972800; > cqlsh:discord_relationships> SELECT * FROM relationships WHERE user_id = > 116138050710536192 AND id = 153047019424972800; > user_id| id | since| type > ++--+-- > 116138050710536192 | 153047019424972800 | 2016-05-30 14:53:08+ |2 > {noformat} > We tried repairing, compacting, scrubbing. No Luck. > Not sure what to do. Is anyone aware of this? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12144) Undeletable rows after upgrading from 2.2.4 to 3.0.7
[ https://issues.apache.org/jira/browse/CASSANDRA-12144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15365044#comment-15365044 ] Stanislav Vishnevskiy commented on CASSANDRA-12144: --- Email sent.
[jira] [Commented] (CASSANDRA-12144) Undeletable rows after upgrading from 2.2.4 to 3.0.7
[ https://issues.apache.org/jira/browse/CASSANDRA-12144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15364833#comment-15364833 ] Stanislav Vishnevskiy commented on CASSANDRA-12144: --- Yup, somehow it manages to keep multiple rows for the same primary key. It seems to affect only old rows that existed prior to the upgrade. We followed the steps outlined in this document. https://docs.datastax.com/en/latest-upgrade/upgrade/cassandra/upgrdBestPractCassandra.html Which included running upgradesstables before and after the upgrade.
[jira] [Comment Edited] (CASSANDRA-12144) Undeletable rows after upgrading from 2.2.4 to 3.0.7
[ https://issues.apache.org/jira/browse/CASSANDRA-12144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15364659#comment-15364659 ] Stanislav Vishnevskiy edited comment on CASSANDRA-12144 at 7/6/16 5:16 PM: --- The writetime on the rows is in the GitHub gist. {noformat} cqlsh> SELECT WRITETIME(since), WRITETIME(type) FROM discord_relationships.relationships WHERE user_id = 116138050710536192 AND id = 153047019424972800; writetime(since) | writetime(type) --+-- 1464619988173052 | 1464619988173052 {noformat} So that was written on May 30th. This cluster is half a year old and exhibited zero issues until yesterday pretty much right after the upgrade finished. We also are noticing a weird key cache hit rate right from when the upgrade finished. http://i.imgur.com/JDihdGO.png The schema is as follows. {noformat} CREATE KEYSPACE discord_relationships WITH replication = {'class': 'NetworkTopologyStrategy', 'us-east1': '3'} AND durable_writes = true; CREATE TABLE discord_relationships.relationships ( user_id bigint, id bigint, since timestamp, type tinyint, PRIMARY KEY (user_id, id) ) WITH CLUSTERING ORDER BY (id ASC) AND bloom_filter_fp_chance = 0.01 AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'} AND comment = '' AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'} AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'} AND crc_check_chance = 1.0 AND dclocal_read_repair_chance = 0.1 AND default_time_to_live = 0 AND gc_grace_seconds = 864000 AND max_index_interval = 2048 AND memtable_flush_period_in_ms = 0 AND min_index_interval = 128 AND read_repair_chance = 0.0 AND speculative_retry = '99PERCENTILE'; {noformat} Thanks. 
> Undeletable rows after upgrading from 2.2.4 to 3.0.7
> ----------------------------------------------------
>
>     Key: CASSANDRA-12144
>     URL: https://issues.apache.org/jira/browse/CASSANDRA-12144
> Project: Cassandra
> Issue Type: Bug
> Reporter: Stanislav Vishnevskiy
>
> We upgraded our cluster today and now have some rows that refuse to delete.
> Here are some example traces.
> https://gist.github.com/vishnevskiy/36aa18c468344ea22d14f9fb9b99171d
> Even weirder: updating the row and querying it back returns 2 rows, even though id is the clustering key.
> {noformat}
>  user_id            | id                 | since                    | type
> --------------------+--------------------+--------------------------+------
>  116138050710536192 | 153047019424972800 |                     null |    0
>  116138050710536192 | 153047019424972800 | 2016-05-30 14:53:08+0000 |    2
> {noformat}
> And then deleting it again only removes the new one.
> {noformat}
> cqlsh:discord_relationships> DELETE FROM relationships WHERE user_id = 116138050710536192 AND id = 153047019424972800;
> cqlsh:discord_relationships> SELECT * FROM relationships WHERE user_id = 116138050710536192 AND id = 153047019424972800;
>  user_id            | id                 | since                    | type
> --------------------+--------------------+--------------------------+------
>  116138050710536192 | 153047019424972800 | 2016-05-30 14:53:08+0000 |    2
> {noformat}
> We tried repairing, compacting, and scrubbing. No luck.
> Not sure what to do. Is anyone aware of this?
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
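For context on why a row can appear undeletable at all: Cassandra reconciles cells against tombstones purely by write timestamp, so a delete issued at timestamp T only shadows cells whose writetime is <= T. The toy model below (plain Python, an illustration of that general rule with hypothetical timestamps; it is not the diagnosed root cause of this ticket) shows how a cell carrying a newer timestamp than the deletion survives the delete:

```python
# Toy model of Cassandra timestamp reconciliation: a row tombstone only
# shadows cells whose write timestamp is <= the tombstone's timestamp.
# Illustrative only; not the confirmed cause of CASSANDRA-12144.

def apply_tombstone(cells, tombstone_ts):
    """Return the cells that survive a row deletion issued at tombstone_ts."""
    return {name: (value, ts) for name, (value, ts) in cells.items() if ts > tombstone_ts}

# Two cells of one row; 'type' carries an anomalously high (future) timestamp.
cells = {
    "since": ("2016-05-30 14:53:08+0000", 1464619988173052),
    "type": (2, 9999999999999999),  # hypothetical future writetime
}

# A DELETE issued "now" (microseconds, mid-2016) cannot shadow the future cell,
# so the row keeps reappearing in SELECTs with that cell populated.
survivors = apply_tombstone(cells, tombstone_ts=1467825600000000)
print(survivors)  # {'type': (2, 9999999999999999)}
```

Under this rule a repair or compaction cannot help either, since reconciliation everywhere picks the higher timestamp.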
[jira] [Updated] (CASSANDRA-12144) Undeletable rows after upgrading from 2.2.4 to 3.0.7
[ https://issues.apache.org/jira/browse/CASSANDRA-12144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stanislav Vishnevskiy updated CASSANDRA-12144:
----------------------------------------------
Description:
We upgraded our cluster today and now have some rows that refuse to delete. Here are some example traces.
https://gist.github.com/vishnevskiy/36aa18c468344ea22d14f9fb9b99171d
Even weirder: updating the row and querying it back returns 2 rows, even though id is the clustering key.
{noformat}
 user_id            | id                 | since                    | type
--------------------+--------------------+--------------------------+------
 116138050710536192 | 153047019424972800 |                     null |    0
 116138050710536192 | 153047019424972800 | 2016-05-30 14:53:08+0000 |    2
{noformat}
And then deleting it again only removes the new one.
{noformat}
cqlsh:discord_relationships> DELETE FROM relationships WHERE user_id = 116138050710536192 AND id = 153047019424972800;
cqlsh:discord_relationships> SELECT * FROM relationships WHERE user_id = 116138050710536192 AND id = 153047019424972800;
 user_id            | id                 | since                    | type
--------------------+--------------------+--------------------------+------
 116138050710536192 | 153047019424972800 | 2016-05-30 14:53:08+0000 |    2
{noformat}
We tried repairing, compacting, and scrubbing. No luck. Not sure what to do. Is anyone aware of this?

was:
We upgraded our cluster today and now have some rows that refuse to delete. Here are some example traces.
https://gist.github.com/vishnevskiy/36aa18c468344ea22d14f9fb9b99171d
We tried repairing, compacting, and scrubbing. No luck. Not sure what to do. Is anyone aware of this?
[jira] [Created] (CASSANDRA-12144) Undeletable rows after upgrading from 2.2.4 to 3.0.7
Stanislav Vishnevskiy created CASSANDRA-12144:
----------------------------------------------
Summary: Undeletable rows after upgrading from 2.2.4 to 3.0.7
Key: CASSANDRA-12144
URL: https://issues.apache.org/jira/browse/CASSANDRA-12144
Project: Cassandra
Issue Type: Bug
Reporter: Stanislav Vishnevskiy

We upgraded our cluster today and now have some rows that refuse to delete. Here are some example traces.
https://gist.github.com/vishnevskiy/36aa18c468344ea22d14f9fb9b99171d
We tried repairing, compacting, and scrubbing. No luck. Not sure what to do. Is anyone aware of this?