[jira] [Commented] (CASSANDRA-13687) Abnormal heap growth and CPU usage during repair.

2017-07-24 Thread Stanislav Vishnevskiy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16099346#comment-16099346
 ] 

Stanislav Vishnevskiy commented on CASSANDRA-13687:
---

This cluster does not have any materialized views.

Over the last few nights the repair has finished successfully, but heap and 
CPU usage on that node are still higher than on the other nodes, and that now 
seems to be the norm.
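
For reference, a rough way to quantify the "higher than the other nodes" observation from the command line (illustrative commands only, not taken from the ticket; both subcommands exist in 3.0's nodetool):

{code:none}
# Run on the repair coordinator and on a peer, then compare the two nodes.
nodetool info | grep "Heap Memory"   # current/max heap usage in MB
nodetool gcstats                     # GC elapsed time, max/stdev pause, reclaimed bytes
{code}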

> Abnormal heap growth and CPU usage during repair.
> -
>
> Key: CASSANDRA-13687
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13687
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Stanislav Vishnevskiy
> Attachments: 3.0.14cpu.png, 3.0.14heap.png, 3.0.14.png, 
> 3.0.9heap.png, 3.0.9.png
>
>
> We recently upgraded from 3.0.9 to 3.0.14 to get the fix from CASSANDRA-13004.
> Sadly, 3 out of the last 7 nights we have had to wake up due to Cassandra 
> dying on us. We currently don't have any data to help reproduce this, but 
> since there aren't many commits between the 2 versions, the cause might be 
> obvious.
> Basically, we trigger a parallel incremental repair from a single node every 
> night at 1AM. That node will sometimes start allocating a lot, keeping the 
> heap maxed and triggering GC. Some of these GC pauses can last up to 2 
> minutes. This effectively destroys the whole cluster due to timeouts to this 
> node.
> The only solution we currently have is to drain the node and restart the 
> repair; it has worked fine the second time, every time.
> I attached heap charts from 3.0.9 and 3.0.14 during repair.






[jira] [Commented] (CASSANDRA-13687) Abnormal heap growth and CPU usage during repair.

2017-07-13 Thread Stanislav Vishnevskiy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16085351#comment-16085351
 ] 

Stanislav Vishnevskiy commented on CASSANDRA-13687:
---

Just adding to this ticket: today the repair finished safely again with the 
12GB heap, but it still used a lot of RAM. CPU usage is still higher, and 
repairs take about twice as long as they did on 3.0.9.

> Abnormal heap growth and CPU usage during repair.
> -
>
> Key: CASSANDRA-13687
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13687
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Stanislav Vishnevskiy
> Attachments: 3.0.14cpu.png, 3.0.14heap.png, 3.0.14.png, 
> 3.0.9heap.png, 3.0.9.png
>
>
> We recently upgraded from 3.0.9 to 3.0.14 to get the fix from CASSANDRA-13004.
> Sadly, 3 out of the last 7 nights we have had to wake up due to Cassandra 
> dying on us. We currently don't have any data to help reproduce this, but 
> since there aren't many commits between the 2 versions, the cause might be 
> obvious.
> Basically, we trigger a parallel incremental repair from a single node every 
> night at 1AM. That node will sometimes start allocating a lot, keeping the 
> heap maxed and triggering GC. Some of these GC pauses can last up to 2 
> minutes. This effectively destroys the whole cluster due to timeouts to this 
> node.
> The only solution we currently have is to drain the node and restart the 
> repair; it has worked fine the second time, every time.
> I attached heap charts from 3.0.9 and 3.0.14 during repair.






[jira] [Updated] (CASSANDRA-13687) Abnormal heap growth and CPU usage during repair.

2017-07-12 Thread Stanislav Vishnevskiy (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stanislav Vishnevskiy updated CASSANDRA-13687:
--
Summary: Abnormal heap growth and CPU usage during repair.  (was: Abnormal 
heap growth and long GC during repair.)

> Abnormal heap growth and CPU usage during repair.
> -
>
> Key: CASSANDRA-13687
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13687
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Stanislav Vishnevskiy
> Attachments: 3.0.14cpu.png, 3.0.14heap.png, 3.0.14.png, 
> 3.0.9heap.png, 3.0.9.png
>
>
> We recently upgraded from 3.0.9 to 3.0.14 to get the fix from CASSANDRA-13004.
> Sadly, 3 out of the last 7 nights we have had to wake up due to Cassandra 
> dying on us. We currently don't have any data to help reproduce this, but 
> since there aren't many commits between the 2 versions, the cause might be 
> obvious.
> Basically, we trigger a parallel incremental repair from a single node every 
> night at 1AM. That node will sometimes start allocating a lot, keeping the 
> heap maxed and triggering GC. Some of these GC pauses can last up to 2 
> minutes. This effectively destroys the whole cluster due to timeouts to this 
> node.
> The only solution we currently have is to drain the node and restart the 
> repair; it has worked fine the second time, every time.
> I attached heap charts from 3.0.9 and 3.0.14 during repair.






[jira] [Comment Edited] (CASSANDRA-13687) Abnormal heap growth and long GC during repair.

2017-07-12 Thread Stanislav Vishnevskiy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16083666#comment-16083666
 ] 

Stanislav Vishnevskiy edited comment on CASSANDRA-13687 at 7/12/17 8:58 AM:


We just had this happen again. I am attaching screenshots of a similar time 
range, again from before and after.

As you can see in the [^3.0.14heap.png] image, at 1PM the heap spikes to 6GB, 
and then we have to take down the node because it makes the cluster start 
failing. We then change MAX_HEAP_SIZE to 12GB, bring the node back up, and 
repair. This time the heap spikes to 8GB and sticks there through the whole 
repair. It then drops down to 600MB without a huge CMS, almost as if it were 
one big object. The node calling repair (1-1) is the only one with the heap 
growth. If you look at [^3.0.9heap.png], this did not occur during repair and 
all nodes looked similar.

Another interesting thing is CPU usage, as seen in [^3.0.14cpu.png]. The node 
performing the nodetool repair (in blue) is using far more CPU than the other 
nodes in the cluster. We compared this with 3.0.9 a week ago, and it was not 
the case then.

This feels like a bug in repair?



was (Author: stanislav):
We just had this happen again. I am attaching screenshots of a similar time 
range, again from before and after.

As you can see in the [^3.0.14heap.png] image, at 1PM the heap spikes to 6GB, 
and then we have to take down the node because it makes the cluster start 
failing. We then change MAX_HEAP_SIZE to 12GB, bring the node back up, and 
repair. This time the heap spikes to 8GB and sticks there through the whole 
repair. It then drops down to 600MB without a huge CMS, almost as if it were 
one big object. The node calling repair (1-1) is the only one with the heap 
growth. If you look at [^3.0.9heap.png], this did not occur during repair and 
all nodes looked similar.

Another interesting thing is CPU usage, as seen in [^3.0.14cpu.png]. The node 
performing the nodetool repair (in blue) is using far more CPU than the other 
node in the cluster. We compared this with 3.0.9 a week ago, and it was not 
the case then.

This feels like a bug in repair?


> Abnormal heap growth and long GC during repair.
> ---
>
> Key: CASSANDRA-13687
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13687
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Stanislav Vishnevskiy
> Attachments: 3.0.14cpu.png, 3.0.14heap.png, 3.0.14.png, 
> 3.0.9heap.png, 3.0.9.png
>
>
> We recently upgraded from 3.0.9 to 3.0.14 to get the fix from CASSANDRA-13004.
> Sadly, 3 out of the last 7 nights we have had to wake up due to Cassandra 
> dying on us. We currently don't have any data to help reproduce this, but 
> since there aren't many commits between the 2 versions, the cause might be 
> obvious.
> Basically, we trigger a parallel incremental repair from a single node every 
> night at 1AM. That node will sometimes start allocating a lot, keeping the 
> heap maxed and triggering GC. Some of these GC pauses can last up to 2 
> minutes. This effectively destroys the whole cluster due to timeouts to this 
> node.
> The only solution we currently have is to drain the node and restart the 
> repair; it has worked fine the second time, every time.
> I attached heap charts from 3.0.9 and 3.0.14 during repair.






[jira] [Updated] (CASSANDRA-13687) Abnormal heap growth and long GC during repair.

2017-07-12 Thread Stanislav Vishnevskiy (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stanislav Vishnevskiy updated CASSANDRA-13687:
--
Attachment: 3.0.9heap.png
3.0.14heap.png
3.0.14cpu.png

We just had this happen again. I am attaching screenshots of a similar time 
range, again from before and after.

As you can see in the [^3.0.14heap.png] image, at 1PM the heap spikes to 6GB, 
and then we have to take down the node because it makes the cluster start 
failing. We then change MAX_HEAP_SIZE to 12GB, bring the node back up, and 
repair. This time the heap spikes to 8GB and sticks there through the whole 
repair. It then drops down to 600MB without a huge CMS, almost as if it were 
one big object. The node calling repair (1-1) is the only one with the heap 
growth. If you look at [^3.0.9heap.png], this did not occur during repair and 
all nodes looked similar.

Another interesting thing is CPU usage, as seen in [^3.0.14cpu.png]. The node 
performing the nodetool repair (in blue) is using far more CPU than the other 
node in the cluster. We compared this with 3.0.9 a week ago, and it was not 
the case then.

This feels like a bug in repair?
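
For readers following along: MAX_HEAP_SIZE lives in conf/cassandra-env.sh. A minimal sketch of the override described above (the 12G value is from the comment; the HEAP_NEWSIZE value is an illustrative assumption, not from the ticket):

{code:none}
# conf/cassandra-env.sh -- sketch of the temporary heap bump described above
MAX_HEAP_SIZE="12G"
HEAP_NEWSIZE="2G"   # illustrative; cassandra-env.sh requires this to be set together with MAX_HEAP_SIZE
{code}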


> Abnormal heap growth and long GC during repair.
> ---
>
> Key: CASSANDRA-13687
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13687
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Stanislav Vishnevskiy
> Attachments: 3.0.14cpu.png, 3.0.14heap.png, 3.0.14.png, 
> 3.0.9heap.png, 3.0.9.png
>
>
> We recently upgraded from 3.0.9 to 3.0.14 to get the fix from CASSANDRA-13004.
> Sadly, 3 out of the last 7 nights we have had to wake up due to Cassandra 
> dying on us. We currently don't have any data to help reproduce this, but 
> since there aren't many commits between the 2 versions, the cause might be 
> obvious.
> Basically, we trigger a parallel incremental repair from a single node every 
> night at 1AM. That node will sometimes start allocating a lot, keeping the 
> heap maxed and triggering GC. Some of these GC pauses can last up to 2 
> minutes. This effectively destroys the whole cluster due to timeouts to this 
> node.
> The only solution we currently have is to drain the node and restart the 
> repair; it has worked fine the second time, every time.
> I attached heap charts from 3.0.9 and 3.0.14 during repair.






[jira] [Commented] (CASSANDRA-13687) Abnormal heap growth and long GC during repair.

2017-07-12 Thread Stanislav Vishnevskiy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16083554#comment-16083554
 ] 

Stanislav Vishnevskiy commented on CASSANDRA-13687:
---

I am assuming you were referring to "Compacted partition maximum bytes"; the 
largest one is 20MB. The good news is that one is probably going to be deleted 
later this week, because we figured out a better way to deal with outlier 
users. That said, 20MB is well below the recommended 100MB limit. I can't get 
anything off netstats currently; I'll probably have to wait until it happens 
again.

The question, though, is why this only happens on the node that is running the 
nodetool repair command. If this were a streaming issue, wouldn't other nodes 
also have it? Is there a specific bugfix that caused this behavior change? It 
seems really weird for a hotfix version bump to change behavior this way, and 
it is not documented anywhere.

We run incremental repairs every 24 hours, so it definitely was not behind.
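
For context, "Compacted partition maximum bytes" is a per-table statistic; a hedged sketch of how one would pull it, plus the streaming view referred to above (the keyspace/table names are placeholders):

{code:none}
# Largest compacted partition for a table (cfstats; newer nodetool also accepts "tablestats")
nodetool cfstats <keyspace>.<table> | grep "Compacted partition maximum bytes"

# Streaming activity -- only informative while a repair session is actually running
nodetool netstats
{code}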

> Abnormal heap growth and long GC during repair.
> ---
>
> Key: CASSANDRA-13687
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13687
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Stanislav Vishnevskiy
> Attachments: 3.0.14.png, 3.0.9.png
>
>
> We recently upgraded from 3.0.9 to 3.0.14 to get the fix from CASSANDRA-13004.
> Sadly, 3 out of the last 7 nights we have had to wake up due to Cassandra 
> dying on us. We currently don't have any data to help reproduce this, but 
> since there aren't many commits between the 2 versions, the cause might be 
> obvious.
> Basically, we trigger a parallel incremental repair from a single node every 
> night at 1AM. That node will sometimes start allocating a lot, keeping the 
> heap maxed and triggering GC. Some of these GC pauses can last up to 2 
> minutes. This effectively destroys the whole cluster due to timeouts to this 
> node.
> The only solution we currently have is to drain the node and restart the 
> repair; it has worked fine the second time, every time.
> I attached heap charts from 3.0.9 and 3.0.14 during repair.






[jira] [Updated] (CASSANDRA-13687) Abnormal heap growth and long GC during repair.

2017-07-11 Thread Stanislav Vishnevskiy (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stanislav Vishnevskiy updated CASSANDRA-13687:
--
Description: 
We recently upgraded from 3.0.9 to 3.0.14 to get the fix from CASSANDRA-13004.

Sadly, 3 out of the last 7 nights we have had to wake up due to Cassandra dying 
on us. We currently don't have any data to help reproduce this, but since there 
aren't many commits between the 2 versions, the cause might be obvious.

Basically, we trigger a parallel incremental repair from a single node every 
night at 1AM. That node will sometimes start allocating a lot, keeping the heap 
maxed and triggering GC. Some of these GC pauses can last up to 2 minutes. This 
effectively destroys the whole cluster due to timeouts to this node.

The only solution we currently have is to drain the node and restart the 
repair; it has worked fine the second time, every time.

I attached heap charts from 3.0.9 and 3.0.14 during repair.

  was:
We recently upgraded from 3.0.9 to 3.0.14 to get the fix from CASSANDRA-13004.

Sadly, 3 out of the last 7 nights we have had to wake up due to Cassandra dying 
on us. We currently don't have any data to help reproduce this, but since there 
aren't many commits between the 2 version, the cause might be obvious.

Basically, we trigger a parallel incremental repair from a single node every 
night at 1AM. That node will sometimes start allocating a lot, keeping the heap 
maxed and triggering GC. Some of these GC pauses can last up to 2 minutes. This 
effectively destroys the whole cluster due to timeouts to this node.

The only solution we currently have is to drain the node and restart the 
repair; it has worked fine the second time, every time.

I attached heap charts from 3.0.9 and 3.0.14 during repair.


> Abnormal heap growth and long GC during repair.
> ---
>
> Key: CASSANDRA-13687
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13687
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Stanislav Vishnevskiy
> Attachments: 3.0.14.png, 3.0.9.png
>
>
> We recently upgraded from 3.0.9 to 3.0.14 to get the fix from CASSANDRA-13004.
> Sadly, 3 out of the last 7 nights we have had to wake up due to Cassandra 
> dying on us. We currently don't have any data to help reproduce this, but 
> since there aren't many commits between the 2 versions, the cause might be 
> obvious.
> Basically, we trigger a parallel incremental repair from a single node every 
> night at 1AM. That node will sometimes start allocating a lot, keeping the 
> heap maxed and triggering GC. Some of these GC pauses can last up to 2 
> minutes. This effectively destroys the whole cluster due to timeouts to this 
> node.
> The only solution we currently have is to drain the node and restart the 
> repair; it has worked fine the second time, every time.
> I attached heap charts from 3.0.9 and 3.0.14 during repair.






[jira] [Updated] (CASSANDRA-13687) Abnormal heap growth and long GC during repair.

2017-07-11 Thread Stanislav Vishnevskiy (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stanislav Vishnevskiy updated CASSANDRA-13687:
--
Description: 
We recently upgraded from 3.0.9 to 3.0.14 to get the fix from CASSANDRA-13004.

Sadly, 3 out of the last 7 nights we have had to wake up due to Cassandra dying 
on us. We currently don't have any data to help reproduce this, but since there 
aren't many commits between the 2 versions, the cause might be obvious.

Basically, we trigger a parallel incremental repair from a single node every 
night at 1AM. That node will sometimes start allocating a lot, keeping the heap 
maxed and triggering GC. Some of these GC pauses can last up to 2 minutes. This 
effectively destroys the whole cluster due to timeouts to this node.

The only solution we currently have is to drain the node and restart the 
repair; it has worked fine the second time, every time.

I attached heap charts from 3.0.9 and 3.0.14 during repair.

  was:
We recently upgraded from 3.0.9 to 3.0.14 to get the fix from CASSANDRA-13004.

Sadly, 3 out of the last 7 nights we have had to wake up due to Cassandra dying 
on us. We currently don't have any data to help reproduce this, but since there 
aren't many commits between the 2 versions, the cause might be obvious.

Basically, we trigger a parallel incremental repair from a single node every 
night at 1AM. That node will sometimes start allocating a lot, keeping the heap 
maxed and triggering GC. Some of these GC pauses can last up to 2 minutes. This 
effectively destroys the whole cluster due to timeouts to this node.

The only solution we currently have is to drain the node and restart the 
repair; it has worked fine the second time, every time.

I attached heap charts from 3.0.9 and 3.0.14.


> Abnormal heap growth and long GC during repair.
> ---
>
> Key: CASSANDRA-13687
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13687
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Stanislav Vishnevskiy
> Attachments: 3.0.14.png, 3.0.9.png
>
>
> We recently upgraded from 3.0.9 to 3.0.14 to get the fix from CASSANDRA-13004.
> Sadly, 3 out of the last 7 nights we have had to wake up due to Cassandra 
> dying on us. We currently don't have any data to help reproduce this, but 
> since there aren't many commits between the 2 versions, the cause might be 
> obvious.
> Basically, we trigger a parallel incremental repair from a single node every 
> night at 1AM. That node will sometimes start allocating a lot, keeping the 
> heap maxed and triggering GC. Some of these GC pauses can last up to 2 
> minutes. This effectively destroys the whole cluster due to timeouts to this 
> node.
> The only solution we currently have is to drain the node and restart the 
> repair; it has worked fine the second time, every time.
> I attached heap charts from 3.0.9 and 3.0.14 during repair.






[jira] [Created] (CASSANDRA-13687) Abnormal heap growth and long GC during repair.

2017-07-11 Thread Stanislav Vishnevskiy (JIRA)
Stanislav Vishnevskiy created CASSANDRA-13687:
-

 Summary: Abnormal heap growth and long GC during repair.
 Key: CASSANDRA-13687
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13687
 Project: Cassandra
  Issue Type: Bug
Reporter: Stanislav Vishnevskiy
 Attachments: 3.0.14.png, 3.0.9.png

We recently upgraded from 3.0.9 to 3.0.14 to get the fix from CASSANDRA-13004.

Sadly, 3 out of the last 7 nights we have had to wake up due to Cassandra dying 
on us. We currently don't have any data to help reproduce this, but since there 
aren't many commits between the 2 versions, the cause might be obvious.

Basically, we trigger a parallel incremental repair from a single node every 
night at 1AM. That node will sometimes start allocating a lot, keeping the heap 
maxed and triggering GC. Some of these GC pauses can last up to 2 minutes. This 
effectively destroys the whole cluster due to timeouts to this node.

The only solution we currently have is to drain the node and restart the 
repair; it has worked fine the second time, every time.

I attached heap charts from 3.0.9 and 3.0.14.
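
The ticket does not show the exact invocation; as a rough sketch, a nightly cron job of this shape would match the description above (on 3.0, a plain "nodetool repair" is already incremental and parallel by default):

{code:none}
# Illustrative crontab entry only -- the real schedule/command is not included in the ticket.
# 01:00 every night: incremental, parallel repair across all keyspaces, run from one node.
0 1 * * * nodetool repair >> /var/log/cassandra/nightly-repair.log 2>&1
{code}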






[jira] [Commented] (CASSANDRA-13004) Corruption while adding a column to a table

2017-01-25 Thread Stanislav Vishnevskiy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15838695#comment-15838695
 ] 

Stanislav Vishnevskiy commented on CASSANDRA-13004:
---

[~jjirsa] We read and write at LOCAL_QUORUM. We don't remember if it corrupted 
the same way on each node, but we couldn't read it correctly with LOCAL_ONE 
when we tried. We have since fixed this data by rewriting it.
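
A quick way to repeat that comparison from cqlsh (sketch only; the channel id is a placeholder, not a value from the ticket):

{code:none}
-- read the same row at the two consistency levels mentioned above
CONSISTENCY LOCAL_ONE
SELECT * FROM discord_channels.channels WHERE id = <affected channel id>;
CONSISTENCY LOCAL_QUORUM
SELECT * FROM discord_channels.channels WHERE id = <affected channel id>;
{code}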

> Corruption while adding a column to a table
> ---
>
> Key: CASSANDRA-13004
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13004
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Stanislav Vishnevskiy
>
> We had the following schema in production. 
> {code:none}
> CREATE TYPE IF NOT EXISTS discord_channels.channel_recipient (
> nick text
> );
> CREATE TYPE IF NOT EXISTS discord_channels.channel_permission_overwrite (
> id bigint,
> type int,
> allow_ int,
> deny int
> );
> CREATE TABLE IF NOT EXISTS discord_channels.channels (
> id bigint,
> guild_id bigint,
> type tinyint,
> name text,
> topic text,
> position int,
> owner_id bigint,
> icon_hash text,
> recipients map<bigint, frozen<channel_recipient>>,
> permission_overwrites map<bigint, frozen<channel_permission_overwrite>>,
> bitrate int,
> user_limit int,
> last_pin_timestamp timestamp,
> last_message_id bigint,
> PRIMARY KEY (id)
> );
> {code}
> And then we executed the following alter.
> {code:none}
> ALTER TABLE discord_channels.channels ADD application_id bigint;
> {code}
> And one row (that we can tell) got corrupted at the same time and could no 
> longer be read from the Python driver. 
> {code:none}
> [E 161206 01:56:58 geventreactor:141] Error decoding response from Cassandra. 
> ver(4); flags(); stream(27); op(8); offset(9); len(887); buffer: 
> '\x84\x00\x00\x1b\x08\x00\x00\x03w\x00\x00\x00\x02\x00\x00\x00\x01\x00\x00\x00\x0f\x00\x10discord_channels\x00\x08channels\x00\x02id\x00\x02\x00\x0eapplication_id\x00\x02\x00\x07bitrate\x00\t\x00\x08guild_id\x00\x02\x00\ticon_hash\x00\r\x00\x0flast_message_id\x00\x02\x00\x12last_pin_timestamp\x00\x0b\x00\x04name\x00\r\x00\x08owner_id\x00\x02\x00\x15permission_overwrites\x00!\x00\x02\x000\x00\x10discord_channels\x00\x1cchannel_permission_overwrite\x00\x04\x00\x02id\x00\x02\x00\x04type\x00\t\x00\x06allow_\x00\t\x00\x04deny\x00\t\x00\x08position\x00\t\x00\nrecipients\x00!\x00\x02\x000\x00\x10discord_channels\x00\x11channel_recipient\x00\x01\x00\x04nick\x00\r\x00\x05topic\x00\r\x00\x04type\x00\x14\x00\nuser_limit\x00\t\x00\x00\x00\x01\x00\x00\x00\x08\x03\x8a\x19\x8e\xf8\x82\x00\x01\xff\xff\xff\xff\x00\x00\x00\x04\x00\x00\xfa\x00\x00\x00\x00\x08\x00\x00\xfa\x00\x00\xf8G\xc5\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8b\xc0\xb5nB\x00\x02\x00\x00\x00\x08G\xc5\xffI\x98\xc4\xb4(\x00\x00\x00\x03\x8b\xc0\xa8\xff\xff\xff\xff\x00\x00\x01<\x00\x00\x00\x06\x00\x00\x00\x08\x03\x81L\xea\xfc\x82\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x81L\xea\xfc\x82\x00\n\x00\x00\x00\x04\x00\x00\x00\x01\x00\x00\x00\x04\x00\x00\x08\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1e\xe6\x8b\x80\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1e\xe6\x8b\x80\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x040\x07\xf8Q\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1f\x1b{\x82\x00\x00\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1f\x1b{\x82\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x07\xf8Q\x00\x00\x00\x04\x10\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1fH6\x82\x00\x01\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1fH6\x82\x00\x01\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x05\xe8A\x00\x00\x00\x04\x10\x02\x00\x00\x00\x00\x00\x08\x03\x8a+=\xca\xc0\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a+=\xca\xc0\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x08\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x8f\x979\x80\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x8f\x979\x80\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00
>  
> \x08\x01\x00\x00\x00\x04\xc4\xb4(\x00\xff\xff\xff\xff\x00\x00\x00O[f\x80Q\x07general\x05\xf8G\xc5\xffI\x98\xc4\xb4(\x00\xf8O[f\x80Q\x00\x00\x00\x02\x04\xf8O[f\x80Q\x00\xf8G\xc5\xffI\x98\x01\x00\x00\xf8O[f\x80Q\x00\x00\x00\x00\xf8G\xc5\xffI\x97\xc4\xb4(\x06\x00\xf8O\x7fe\x1fm\x08\x03\x00\x00\x00\x01\x00\x00\x00\x00\x04\x00\x00\x00\x00'
> {code}
> And then in cqlsh when trying to read the row we got this. 
> {code:none}
> /usr/bin/cqlsh.py:632: DateOverFlowWarning: Some timestamps are larger than 
> Python datetime can represent. Timestamps are displayed in milliseconds from 
> epoch.
> Traceback (most recent call last):
>   File "/usr/bin/cqlsh.py", line 1301, in perform_simple_statement
> result = future.result()
>   File 
> "/usr/share/cassandra/lib/cassandra-driver-internal-only-3.5.0.post0-d8d0456.zip/cassandra-driver-

[jira] [Comment Edited] (CASSANDRA-13004) Corruption while adding a column to a table

2016-12-08 Thread Stanislav Vishnevskiy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15731690#comment-15731690
 ] 

Stanislav Vishnevskiy edited comment on CASSANDRA-13004 at 12/8/16 7:57 PM:


We just ran into this issue on an older 3.0.7 cluster with fairly low write 
velocity (70/sec).


was (Author: stanislav):
We just ran into this issue on an older 3.0.7 cluster with fairly low write 
velocity (500/sec).

> Corruption while adding a column to a table
> ---
>
> Key: CASSANDRA-13004
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13004
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Stanislav Vishnevskiy
>
> We had the following schema in production. 
> {code:none}
> CREATE TYPE IF NOT EXISTS discord_channels.channel_recipient (
> nick text
> );
> CREATE TYPE IF NOT EXISTS discord_channels.channel_permission_overwrite (
> id bigint,
> type int,
> allow_ int,
> deny int
> );
> CREATE TABLE IF NOT EXISTS discord_channels.channels (
> id bigint,
> guild_id bigint,
> type tinyint,
> name text,
> topic text,
> position int,
> owner_id bigint,
> icon_hash text,
> recipients map<bigint, frozen<channel_recipient>>,
> permission_overwrites map<bigint, frozen<channel_permission_overwrite>>,
> bitrate int,
> user_limit int,
> last_pin_timestamp timestamp,
> last_message_id bigint,
> PRIMARY KEY (id)
> );
> {code}
> And then we executed the following alter.
> {code:none}
> ALTER TABLE discord_channels.channels ADD application_id bigint;
> {code}
> And one row (that we can tell) got corrupted at the same time and could no 
> longer be read from the Python driver. 
> {code:none}
> [E 161206 01:56:58 geventreactor:141] Error decoding response from Cassandra. 
> ver(4); flags(); stream(27); op(8); offset(9); len(887); buffer: 
> '\x84\x00\x00\x1b\x08\x00\x00\x03w\x00\x00\x00\x02\x00\x00\x00\x01\x00\x00\x00\x0f\x00\x10discord_channels\x00\x08channels\x00\x02id\x00\x02\x00\x0eapplication_id\x00\x02\x00\x07bitrate\x00\t\x00\x08guild_id\x00\x02\x00\ticon_hash\x00\r\x00\x0flast_message_id\x00\x02\x00\x12last_pin_timestamp\x00\x0b\x00\x04name\x00\r\x00\x08owner_id\x00\x02\x00\x15permission_overwrites\x00!\x00\x02\x000\x00\x10discord_channels\x00\x1cchannel_permission_overwrite\x00\x04\x00\x02id\x00\x02\x00\x04type\x00\t\x00\x06allow_\x00\t\x00\x04deny\x00\t\x00\x08position\x00\t\x00\nrecipients\x00!\x00\x02\x000\x00\x10discord_channels\x00\x11channel_recipient\x00\x01\x00\x04nick\x00\r\x00\x05topic\x00\r\x00\x04type\x00\x14\x00\nuser_limit\x00\t\x00\x00\x00\x01\x00\x00\x00\x08\x03\x8a\x19\x8e\xf8\x82\x00\x01\xff\xff\xff\xff\x00\x00\x00\x04\x00\x00\xfa\x00\x00\x00\x00\x08\x00\x00\xfa\x00\x00\xf8G\xc5\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8b\xc0\xb5nB\x00\x02\x00\x00\x00\x08G\xc5\xffI\x98\xc4\xb4(\x00\x00\x00\x03\x8b\xc0\xa8\xff\xff\xff\xff\x00\x00\x01<\x00\x00\x00\x06\x00\x00\x00\x08\x03\x81L\xea\xfc\x82\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x81L\xea\xfc\x82\x00\n\x00\x00\x00\x04\x00\x00\x00\x01\x00\x00\x00\x04\x00\x00\x08\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1e\xe6\x8b\x80\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1e\xe6\x8b\x80\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x040\x07\xf8Q\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1f\x1b{\x82\x00\x00\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1f\x1b{\x82\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x07\xf8Q\x00\x00\x00\x04\x10\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1fH6\x82\x00\x01\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1fH6\x82\x00\x01\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x05\xe8A\x00\x00\x00\x04\x10\x02\x00\x00\x00\x00\x00\x08\x03\x8a+=\xca\xc0\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a+=\xca\xc0\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x08\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x8f\x979\x80\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x8f\x979\x80\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00
>  
> \x08\x01\x00\x00\x00\x04\xc4\xb4(\x00\xff\xff\xff\xff\x00\x00\x00O[f\x80Q\x07general\x05\xf8G\xc5\xffI\x98\xc4\xb4(\x00\xf8O[f\x80Q\x00\x00\x00\x02\x04\xf8O[f\x80Q\x00\xf8G\xc5\xffI\x98\x01\x00\x00\xf8O[f\x80Q\x00\x00\x00\x00\xf8G\xc5\xffI\x97\xc4\xb4(\x06\x00\xf8O\x7fe\x1fm\x08\x03\x00\x00\x00\x01\x00\x00\x00\x00\x04\x00\x00\x00\x00'
> {code}
> And then in cqlsh when trying to read the row we got this. 
> {code:none}
> /usr/bin/cqlsh.py:632: DateOverFlowWarning: Some timestamps are larger than 
> Python datetime can represent. Timestamps are displayed in milliseconds from 
> epoch.
> Traceback (most recent call last):
>   File "/usr/bin/cqlsh.py", line 1301, in perform_simple_statement
> result = future.result()
>   File 
> "/usr/share/cassandra/lib/cassandra-driver-inter

[jira] [Commented] (CASSANDRA-13004) Corruption while adding a column to a table

2016-12-08 Thread Stanislav Vishnevskiy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15731690#comment-15731690
 ] 

Stanislav Vishnevskiy commented on CASSANDRA-13004:
---

We just ran into this issue on an older 3.0.7 cluster with fairly low write 
velocity (500/sec).

> Corruption while adding a column to a table
> ---
>
> Key: CASSANDRA-13004
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13004
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Stanislav Vishnevskiy
>
> We had the following schema in production. 
> {code:none}
> CREATE TYPE IF NOT EXISTS discord_channels.channel_recipient (
> nick text
> );
> CREATE TYPE IF NOT EXISTS discord_channels.channel_permission_overwrite (
> id bigint,
> type int,
> allow_ int,
> deny int
> );
> CREATE TABLE IF NOT EXISTS discord_channels.channels (
> id bigint,
> guild_id bigint,
> type tinyint,
> name text,
> topic text,
> position int,
> owner_id bigint,
> icon_hash text,
> recipients map<bigint, frozen<channel_recipient>>,
> permission_overwrites map<bigint, frozen<channel_permission_overwrite>>,
> bitrate int,
> user_limit int,
> last_pin_timestamp timestamp,
> last_message_id bigint,
> PRIMARY KEY (id)
> );
> {code}
> And then we executed the following alter.
> {code:none}
> ALTER TABLE discord_channels.channels ADD application_id bigint;
> {code}
> And one row (that we can tell) got corrupted at the same time and could no 
> longer be read from the Python driver. 
> {code:none}
> [E 161206 01:56:58 geventreactor:141] Error decoding response from Cassandra. 
> ver(4); flags(); stream(27); op(8); offset(9); len(887); buffer: 
> '\x84\x00\x00\x1b\x08\x00\x00\x03w\x00\x00\x00\x02\x00\x00\x00\x01\x00\x00\x00\x0f\x00\x10discord_channels\x00\x08channels\x00\x02id\x00\x02\x00\x0eapplication_id\x00\x02\x00\x07bitrate\x00\t\x00\x08guild_id\x00\x02\x00\ticon_hash\x00\r\x00\x0flast_message_id\x00\x02\x00\x12last_pin_timestamp\x00\x0b\x00\x04name\x00\r\x00\x08owner_id\x00\x02\x00\x15permission_overwrites\x00!\x00\x02\x000\x00\x10discord_channels\x00\x1cchannel_permission_overwrite\x00\x04\x00\x02id\x00\x02\x00\x04type\x00\t\x00\x06allow_\x00\t\x00\x04deny\x00\t\x00\x08position\x00\t\x00\nrecipients\x00!\x00\x02\x000\x00\x10discord_channels\x00\x11channel_recipient\x00\x01\x00\x04nick\x00\r\x00\x05topic\x00\r\x00\x04type\x00\x14\x00\nuser_limit\x00\t\x00\x00\x00\x01\x00\x00\x00\x08\x03\x8a\x19\x8e\xf8\x82\x00\x01\xff\xff\xff\xff\x00\x00\x00\x04\x00\x00\xfa\x00\x00\x00\x00\x08\x00\x00\xfa\x00\x00\xf8G\xc5\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8b\xc0\xb5nB\x00\x02\x00\x00\x00\x08G\xc5\xffI\x98\xc4\xb4(\x00\x00\x00\x03\x8b\xc0\xa8\xff\xff\xff\xff\x00\x00\x01<\x00\x00\x00\x06\x00\x00\x00\x08\x03\x81L\xea\xfc\x82\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x81L\xea\xfc\x82\x00\n\x00\x00\x00\x04\x00\x00\x00\x01\x00\x00\x00\x04\x00\x00\x08\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1e\xe6\x8b\x80\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1e\xe6\x8b\x80\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x040\x07\xf8Q\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1f\x1b{\x82\x00\x00\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1f\x1b{\x82\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x07\xf8Q\x00\x00\x00\x04\x10\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1fH6\x82\x00\x01\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1fH6\x82\x00\x01\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x05\xe8A\x00\x00\x00\x04\x10\x02\x00\x00\x00\x00\x00\x08\x03\x8a+=\xca\xc0\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a+=\xca\xc0\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x08\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x8f\x979\x80\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x8f\x979\x80\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00
>  
> \x08\x01\x00\x00\x00\x04\xc4\xb4(\x00\xff\xff\xff\xff\x00\x00\x00O[f\x80Q\x07general\x05\xf8G\xc5\xffI\x98\xc4\xb4(\x00\xf8O[f\x80Q\x00\x00\x00\x02\x04\xf8O[f\x80Q\x00\xf8G\xc5\xffI\x98\x01\x00\x00\xf8O[f\x80Q\x00\x00\x00\x00\xf8G\xc5\xffI\x97\xc4\xb4(\x06\x00\xf8O\x7fe\x1fm\x08\x03\x00\x00\x00\x01\x00\x00\x00\x00\x04\x00\x00\x00\x00'
> {code}
> And then in cqlsh when trying to read the row we got this. 
> {code:none}
> /usr/bin/cqlsh.py:632: DateOverFlowWarning: Some timestamps are larger than 
> Python datetime can represent. Timestamps are displayed in milliseconds from 
> epoch.
> Traceback (most recent call last):
>   File "/usr/bin/cqlsh.py", line 1301, in perform_simple_statement
> result = future.result()
>   File 
> "/usr/share/cassandra/lib/cassandra-driver-internal-only-3.5.0.post0-d8d0456.zip/cassandra-driver-3.5.0.post0-d8d0456/cassandra/cluster.py",
>  line 3650, in result
> raise self._final_exception
> UnicodeDecodeError:

[jira] [Commented] (CASSANDRA-13004) Corruption while adding a column to a table

2016-12-06 Thread Stanislav Vishnevskiy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15725002#comment-15725002
 ] 

Stanislav Vishnevskiy commented on CASSANDRA-13004:
---

On another node we saw this.

{code:none}
ERROR [MessagingService-Incoming-/10.10.0.129] 2016-12-06 01:44:17,430 
CassandraDaemon.java:205 - Exception in thread 
Thread[MessagingService-Incoming-/10.10.0.129,5,main]
java.lang.RuntimeException: Unknown column application_id during deserialization
at 
org.apache.cassandra.db.Columns$Serializer.deserialize(Columns.java:432) 
~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.SerializationHeader$Serializer.deserializeForMessaging(SerializationHeader.java:427)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.deserializeHeader(UnfilteredRowIteratorSerializer.java:190)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize30(PartitionUpdate.java:661)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize(PartitionUpdate.java:635)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:334)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:353)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:290)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at org.apache.cassandra.net.MessageIn.read(MessageIn.java:98) 
~[apache-cassandra-3.0.9.jar:3.0.9]
{/code}

> Corruption while adding a column to a table
> ---
>
> Key: CASSANDRA-13004
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13004
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Stanislav Vishnevskiy
>
> We had the following schema in production. 
> {code:none}
> CREATE TYPE IF NOT EXISTS discord_channels.channel_recipient (
> nick text
> );
> CREATE TYPE IF NOT EXISTS discord_channels.channel_permission_overwrite (
> id bigint,
> type int,
> allow_ int,
> deny int
> );
> CREATE TABLE IF NOT EXISTS discord_channels.channels (
> id bigint,
> guild_id bigint,
> type tinyint,
> name text,
> topic text,
> position int,
> owner_id bigint,
> icon_hash text,
> recipients map<bigint, frozen<channel_recipient>>,
> permission_overwrites map<bigint, frozen<channel_permission_overwrite>>,
> bitrate int,
> user_limit int,
> last_pin_timestamp timestamp,
> last_message_id bigint,
> PRIMARY KEY (id)
> );
> {code}
> And then we executed the following alter.
> {code:none}
> ALTER TABLE discord_channels.channels ADD application_id bigint;
> {code}
> And one row (that we can tell) got corrupted at the same time and could no 
> longer be read from the Python driver. 
> {code:none}
> [E 161206 01:56:58 geventreactor:141] Error decoding response from Cassandra. 
> ver(4); flags(); stream(27); op(8); offset(9); len(887); buffer: 
> '\x84\x00\x00\x1b\x08\x00\x00\x03w\x00\x00\x00\x02\x00\x00\x00\x01\x00\x00\x00\x0f\x00\x10discord_channels\x00\x08channels\x00\x02id\x00\x02\x00\x0eapplication_id\x00\x02\x00\x07bitrate\x00\t\x00\x08guild_id\x00\x02\x00\ticon_hash\x00\r\x00\x0flast_message_id\x00\x02\x00\x12last_pin_timestamp\x00\x0b\x00\x04name\x00\r\x00\x08owner_id\x00\x02\x00\x15permission_overwrites\x00!\x00\x02\x000\x00\x10discord_channels\x00\x1cchannel_permission_overwrite\x00\x04\x00\x02id\x00\x02\x00\x04type\x00\t\x00\x06allow_\x00\t\x00\x04deny\x00\t\x00\x08position\x00\t\x00\nrecipients\x00!\x00\x02\x000\x00\x10discord_channels\x00\x11channel_recipient\x00\x01\x00\x04nick\x00\r\x00\x05topic\x00\r\x00\x04type\x00\x14\x00\nuser_limit\x00\t\x00\x00\x00\x01\x00\x00\x00\x08\x03\x8a\x19\x8e\xf8\x82\x00\x01\xff\xff\xff\xff\x00\x00\x00\x04\x00\x00\xfa\x00\x00\x00\x00\x08\x00\x00\xfa\x00\x00\xf8G\xc5\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8b\xc0\xb5nB\x00\x02\x00\x00\x00\x08G\xc5\xffI\x98\xc4\xb4(\x00\x00\x00\x03\x8b\xc0\xa8\xff\xff\xff\xff\x00\x00\x01<\x00\x00\x00\x06\x00\x00\x00\x08\x03\x81L\xea\xfc\x82\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x81L\xea\xfc\x82\x00\n\x00\x00\x00\x04\x00\x00\x00\x01\x00\x00\x00\x04\x00\x00\x08\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1e\xe6\x8b\x80\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1e\xe6\x8b\x80\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x040\x07\xf8Q\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1f\x1b{\x82\x00\x00\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1f\x1b{\x82\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x07\xf8Q\x00\x00\x00\x04\x10\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1fH6\x82\x0

[jira] [Comment Edited] (CASSANDRA-13004) Corruption while adding a column to a table

2016-12-06 Thread Stanislav Vishnevskiy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15725002#comment-15725002
 ] 

Stanislav Vishnevskiy edited comment on CASSANDRA-13004 at 12/6/16 10:07 AM:
-

On another node we saw this.

{code:none}
ERROR [MessagingService-Incoming-/10.10.0.129] 2016-12-06 01:44:17,430 
CassandraDaemon.java:205 - Exception in thread 
Thread[MessagingService-Incoming-/10.10.0.129,5,main]
java.lang.RuntimeException: Unknown column application_id during deserialization
at 
org.apache.cassandra.db.Columns$Serializer.deserialize(Columns.java:432) 
~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.SerializationHeader$Serializer.deserializeForMessaging(SerializationHeader.java:427)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.deserializeHeader(UnfilteredRowIteratorSerializer.java:190)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize30(PartitionUpdate.java:661)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize(PartitionUpdate.java:635)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:334)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:353)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:290)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at org.apache.cassandra.net.MessageIn.read(MessageIn.java:98) 
~[apache-cassandra-3.0.9.jar:3.0.9]
{code}


was (Author: stanislav):
On another node we saw this.

{code:none}
ERROR [MessagingService-Incoming-/10.10.0.129] 2016-12-06 01:44:17,430 
CassandraDaemon.java:205 - Exception in thread 
Thread[MessagingService-Incoming-/10.10.0.129,5,main]
java.lang.RuntimeException: Unknown column application_id during deserialization
at 
org.apache.cassandra.db.Columns$Serializer.deserialize(Columns.java:432) 
~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.SerializationHeader$Serializer.deserializeForMessaging(SerializationHeader.java:427)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.deserializeHeader(UnfilteredRowIteratorSerializer.java:190)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize30(PartitionUpdate.java:661)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize(PartitionUpdate.java:635)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:334)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:353)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:290)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at org.apache.cassandra.net.MessageIn.read(MessageIn.java:98) 
~[apache-cassandra-3.0.9.jar:3.0.9]
{/code}

> Corruption while adding a column to a table
> ---
>
> Key: CASSANDRA-13004
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13004
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Stanislav Vishnevskiy
>
> We had the following schema in production. 
> {code:none}
> CREATE TYPE IF NOT EXISTS discord_channels.channel_recipient (
> nick text
> );
> CREATE TYPE IF NOT EXISTS discord_channels.channel_permission_overwrite (
> id bigint,
> type int,
> allow_ int,
> deny int
> );
> CREATE TABLE IF NOT EXISTS discord_channels.channels (
> id bigint,
> guild_id bigint,
> type tinyint,
> name text,
> topic text,
> position int,
> owner_id bigint,
> icon_hash text,
> recipients map<bigint, frozen<channel_recipient>>,
> permission_overwrites map<bigint, frozen<channel_permission_overwrite>>,
> bitrate int,
> user_limit int,
> last_pin_timestamp timestamp,
> last_message_id bigint,
> PRIMARY KEY (id)
> );
> {code}
> And then we executed the following alter.
> {code:none}
> ALTER TABLE discord_channels.channels ADD application_id bigint;
> {code}
> And one row (that we can tell) got corrupted at the same time and could no 
> longer be read from the Python driver. 
> {code:none}
> [E 161206 01:56:58 geventreactor:141] Error decoding response from Cassandra. 
> ver(4); flags(); stream(27); op(8); offset(9); len(887); buffer: 
> '\x84\x00

[jira] [Comment Edited] (CASSANDRA-13004) Corruption while adding a column to a table

2016-12-05 Thread Stanislav Vishnevskiy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15724156#comment-15724156
 ] 

Stanislav Vishnevskiy edited comment on CASSANDRA-13004 at 12/6/16 3:01 AM:


We found this in the logs at the exact time this happened.

{code}
ERROR [SharedPool-Worker-11] 2016-12-06 01:44:16,971 Message.java:617 - 
Unexpected exception during request; channel = [id: 0xbd9a77e9, 
/10.10.0.48:38317 => /10.10.0.129:9042]
java.io.IOError: java.io.IOException: Corrupt value length 1485619006 
encountered, as it exceeds the maximum of 268435456, which is set via 
max_value_size_in_mb in cassandra.yaml
at 
org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:222)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:210)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:129) 
~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:369)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:189)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:158)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:509)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:369)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:129) 
~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.transform.FilteredRows.isEmpty(FilteredRows.java:50) 
~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.transform.Filter.closeIfEmpty(Filter.java:73) 
~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.transform.Filter.applyToPartition(Filter.java:43) 
~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.transform.Filter.applyToPartition(Filter.java:26) 
~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:96)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.cql3.statements.SelectStatement.process(SelectStatement.java:707)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.cql3.statements.SelectStatement.processResults(SelectStatement.java:400)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:353)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:227)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:76)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:206)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:487)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:464)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:130)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:513)
 [apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:407)
 [apache-cassandra-3.0.9.jar:3.0.9]
at 
io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32)
 [netty-all-4.0.23.Final.ja

[jira] [Commented] (CASSANDRA-13004) Corruption while adding a column to a table

2016-12-05 Thread Stanislav Vishnevskiy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15724156#comment-15724156
 ] 

Stanislav Vishnevskiy commented on CASSANDRA-13004:
---

We found this in the logs at the exact time this happened.

{code}
ERROR [SharedPool-Worker-11] 2016-12-06 01:44:16,971 Message.java:617 - 
Unexpected exception during request; channel = [id: 0xbd9a77e9, 
/10.10.0.48:38317 => /10.10.0.129:9042]
java.io.IOError: java.io.IOException: Corrupt value length 1485619006 
encountered, as it exceeds the maximum of 268435456, which is set via 
max_value_size_in_mb in cassandra.yaml
at 
org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:222)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:210)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:129) 
~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:369)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:189)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:158)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:509)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:369)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:129) 
~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.transform.FilteredRows.isEmpty(FilteredRows.java:50) 
~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.transform.Filter.closeIfEmpty(Filter.java:73) 
~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.transform.Filter.applyToPartition(Filter.java:43) 
~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.transform.Filter.applyToPartition(Filter.java:26) 
~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:96)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.cql3.statements.SelectStatement.process(SelectStatement.java:707)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.cql3.statements.SelectStatement.processResults(SelectStatement.java:400)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:353)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:227)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:76)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:206)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:487)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:464)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:130)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:513)
 [apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:407)
 [apache-cassandra-3.0.9.jar:3.0.9]
at 
io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
io.netty.channel.Abst

[jira] [Updated] (CASSANDRA-13004) Corruption while adding a column to a table

2016-12-05 Thread Stanislav Vishnevskiy (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stanislav Vishnevskiy updated CASSANDRA-13004:
--
Summary: Corruption while adding a column to a table  (was: Corruption 
while adding a column to a table in production)

> Corruption while adding a column to a table
> ---
>
> Key: CASSANDRA-13004
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13004
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Stanislav Vishnevskiy
>
> We had the following schema in production. 
> {code:text}
> CREATE TYPE IF NOT EXISTS discord_channels.channel_recipient (
> nick text
> );
> CREATE TYPE IF NOT EXISTS discord_channels.channel_permission_overwrite (
> id bigint,
> type int,
> allow_ int,
> deny int
> );
> CREATE TABLE IF NOT EXISTS discord_channels.channels (
> id bigint,
> guild_id bigint,
> type tinyint,
> name text,
> topic text,
> position int,
> owner_id bigint,
> icon_hash text,
> recipients map<bigint, frozen<channel_recipient>>,
> permission_overwrites map<bigint, frozen<channel_permission_overwrite>>,
> bitrate int,
> user_limit int,
> last_pin_timestamp timestamp,
> last_message_id bigint,
> PRIMARY KEY (id)
> );
> {/code}
> And then we executed the following alter.
> {code:text}
> ALTER TABLE discord_channels.channels ADD application_id bigint;
> {/code}
> And one row (that we can tell) got corrupted at the same time and could no 
> longer be read from the Python driver. 
> {code:text}
> [E 161206 01:56:58 geventreactor:141] Error decoding response from Cassandra. 
> ver(4); flags(); stream(27); op(8); offset(9); len(887); buffer: 
> '\x84\x00\x00\x1b\x08\x00\x00\x03w\x00\x00\x00\x02\x00\x00\x00\x01\x00\x00\x00\x0f\x00\x10discord_channels\x00\x08channels\x00\x02id\x00\x02\x00\x0eapplication_id\x00\x02\x00\x07bitrate\x00\t\x00\x08guild_id\x00\x02\x00\ticon_hash\x00\r\x00\x0flast_message_id\x00\x02\x00\x12last_pin_timestamp\x00\x0b\x00\x04name\x00\r\x00\x08owner_id\x00\x02\x00\x15permission_overwrites\x00!\x00\x02\x000\x00\x10discord_channels\x00\x1cchannel_permission_overwrite\x00\x04\x00\x02id\x00\x02\x00\x04type\x00\t\x00\x06allow_\x00\t\x00\x04deny\x00\t\x00\x08position\x00\t\x00\nrecipients\x00!\x00\x02\x000\x00\x10discord_channels\x00\x11channel_recipient\x00\x01\x00\x04nick\x00\r\x00\x05topic\x00\r\x00\x04type\x00\x14\x00\nuser_limit\x00\t\x00\x00\x00\x01\x00\x00\x00\x08\x03\x8a\x19\x8e\xf8\x82\x00\x01\xff\xff\xff\xff\x00\x00\x00\x04\x00\x00\xfa\x00\x00\x00\x00\x08\x00\x00\xfa\x00\x00\xf8G\xc5\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8b\xc0\xb5nB\x00\x02\x00\x00\x00\x08G\xc5\xffI\x98\xc4\xb4(\x00\x00\x00\x03\x8b\xc0\xa8\xff\xff\xff\xff\x00\x00\x01<\x00\x00\x00\x06\x00\x00\x00\x08\x03\x81L\xea\xfc\x82\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x81L\xea\xfc\x82\x00\n\x00\x00\x00\x04\x00\x00\x00\x01\x00\x00\x00\x04\x00\x00\x08\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1e\xe6\x8b\x80\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1e\xe6\x8b\x80\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x040\x07\xf8Q\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1f\x1b{\x82\x00\x00\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1f\x1b{\x82\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x07\xf8Q\x00\x00\x00\x04\x10\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1fH6\x82\x00\x01\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1fH6\x82\x00\x01\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x05\xe8A\x00\x00\x00\x04\x10\x02\x00\x00\x00\x00\x00\x08\x03\x8a+=\xca\xc0\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a+=\xca\xc0\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x08\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x8f\x979\x80\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x8f\x979\x80\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00
>  
> \x08\x01\x00\x00\x00\x04\xc4\xb4(\x00\xff\xff\xff\xff\x00\x00\x00O[f\x80Q\x07general\x05\xf8G\xc5\xffI\x98\xc4\xb4(\x00\xf8O[f\x80Q\x00\x00\x00\x02\x04\xf8O[f\x80Q\x00\xf8G\xc5\xffI\x98\x01\x00\x00\xf8O[f\x80Q\x00\x00\x00\x00\xf8G\xc5\xffI\x97\xc4\xb4(\x06\x00\xf8O\x7fe\x1fm\x08\x03\x00\x00\x00\x01\x00\x00\x00\x00\x04\x00\x00\x00\x00'
> {code}
> And then in cqlsh when trying to read the row we got this. 
> {code:text}
> /usr/bin/cqlsh.py:632: DateOverFlowWarning: Some timestamps are larger than 
> Python datetime can represent. Timestamps are displayed in milliseconds from 
> epoch.
> Traceback (most recent call last):
>   File "/usr/bin/cqlsh.py", line 1301, in perform_simple_statement
> result = future.result()
>   File 
> "/usr/share/cassandra/lib/cassandra-driver-internal-only-3.5.0.post0-d8d0456.zip/cassandra-driver-3.5.0.post0-d8d0456/cassandra/cluster.py",
>  line 3650, in result
> raise self._final_exception
> UnicodeDecodeError: 'utf8' codec can't decode by

[jira] [Updated] (CASSANDRA-13004) Corruption while add a column to a table in production

2016-12-05 Thread Stanislav Vishnevskiy (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stanislav Vishnevskiy updated CASSANDRA-13004:
--
Description: 
We had the following schema in production. 

{code:text}
CREATE TYPE IF NOT EXISTS discord_channels.channel_recipient (
nick text
);

CREATE TYPE IF NOT EXISTS discord_channels.channel_permission_overwrite (
id bigint,
type int,
allow_ int,
deny int
);

CREATE TABLE IF NOT EXISTS discord_channels.channels (
id bigint,
guild_id bigint,
type tinyint,
name text,
topic text,
position int,
owner_id bigint,
icon_hash text,
recipients map<bigint, frozen<channel_recipient>>,
permission_overwrites map<bigint, frozen<channel_permission_overwrite>>,
bitrate int,
user_limit int,
last_pin_timestamp timestamp,
last_message_id bigint,
PRIMARY KEY (id)
);
{/code}

And then we executed the following alter.

{code:text}
ALTER TABLE discord_channels.channels ADD application_id bigint;
{/code}

And one row (that we can tell) got corrupted at the same time and could no 
longer be read from the Python driver. 

{code:text}
[E 161206 01:56:58 geventreactor:141] Error decoding response from Cassandra. 
ver(4); flags(); stream(27); op(8); offset(9); len(887); buffer: 
'\x84\x00\x00\x1b\x08\x00\x00\x03w\x00\x00\x00\x02\x00\x00\x00\x01\x00\x00\x00\x0f\x00\x10discord_channels\x00\x08channels\x00\x02id\x00\x02\x00\x0eapplication_id\x00\x02\x00\x07bitrate\x00\t\x00\x08guild_id\x00\x02\x00\ticon_hash\x00\r\x00\x0flast_message_id\x00\x02\x00\x12last_pin_timestamp\x00\x0b\x00\x04name\x00\r\x00\x08owner_id\x00\x02\x00\x15permission_overwrites\x00!\x00\x02\x000\x00\x10discord_channels\x00\x1cchannel_permission_overwrite\x00\x04\x00\x02id\x00\x02\x00\x04type\x00\t\x00\x06allow_\x00\t\x00\x04deny\x00\t\x00\x08position\x00\t\x00\nrecipients\x00!\x00\x02\x000\x00\x10discord_channels\x00\x11channel_recipient\x00\x01\x00\x04nick\x00\r\x00\x05topic\x00\r\x00\x04type\x00\x14\x00\nuser_limit\x00\t\x00\x00\x00\x01\x00\x00\x00\x08\x03\x8a\x19\x8e\xf8\x82\x00\x01\xff\xff\xff\xff\x00\x00\x00\x04\x00\x00\xfa\x00\x00\x00\x00\x08\x00\x00\xfa\x00\x00\xf8G\xc5\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8b\xc0\xb5nB\x00\x02\x00\x00\x00\x08G\xc5\xffI\x98\xc4\xb4(\x00\x00\x00\x03\x8b\xc0\xa8\xff\xff\xff\xff\x00\x00\x01<\x00\x00\x00\x06\x00\x00\x00\x08\x03\x81L\xea\xfc\x82\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x81L\xea\xfc\x82\x00\n\x00\x00\x00\x04\x00\x00\x00\x01\x00\x00\x00\x04\x00\x00\x08\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1e\xe6\x8b\x80\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1e\xe6\x8b\x80\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x040\x07\xf8Q\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1f\x1b{\x82\x00\x00\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1f\x1b{\x82\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x07\xf8Q\x00\x00\x00\x04\x10\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1fH6\x82\x00\x01\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1fH6\x82\x00\x01\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x05\xe8A\x00\x00\x00\x04\x10\x02\x00\x00\x00\x00\x00\x08\x03\x8a+=\xca\xc0\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a+=\xca\xc0\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x08\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x8f\x979\x80\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x8f\x979\x80\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00
 
\x08\x01\x00\x00\x00\x04\xc4\xb4(\x00\xff\xff\xff\xff\x00\x00\x00O[f\x80Q\x07general\x05\xf8G\xc5\xffI\x98\xc4\xb4(\x00\xf8O[f\x80Q\x00\x00\x00\x02\x04\xf8O[f\x80Q\x00\xf8G\xc5\xffI\x98\x01\x00\x00\xf8O[f\x80Q\x00\x00\x00\x00\xf8G\xc5\xffI\x97\xc4\xb4(\x06\x00\xf8O\x7fe\x1fm\x08\x03\x00\x00\x00\x01\x00\x00\x00\x00\x04\x00\x00\x00\x00'
{code}
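
For what it's worth, the first nine bytes of that buffer are just the native-protocol
v4 frame header, and decoding them by hand reproduces the values in the log line
(stream 27, opcode 8 = RESULT, body length 887 starting at offset 9). A minimal
sketch using only the Python standard library, with the header bytes copied straight
from the dump above:

{code:python}
import struct

# First 9 bytes of the logged buffer: a CQL native protocol v4 frame header.
header = b'\x84\x00\x00\x1b\x08\x00\x00\x03w'

# version(1) | flags(1) | stream(int16) | opcode(1) | body length(int32)
version, flags, stream, opcode, length = struct.unpack('>BBhBI', header)

print(version & 0x7f)  # 4   -> protocol v4 (the high bit marks a response frame)
print(stream)          # 27  -> matches stream(27)
print(opcode)          # 8   -> RESULT, matches op(8)
print(length)          # 887 -> matches len(887); the body starts at offset 9
{code}

So the framing itself parses cleanly, which suggests the failure is in the column
values inside the RESULT body rather than in the frame envelope.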

And then in cqlsh when trying to read the row we got this. 

{code:text}
/usr/bin/cqlsh.py:632: DateOverFlowWarning: Some timestamps are larger than 
Python datetime can represent. Timestamps are displayed in milliseconds from 
epoch.
Traceback (most recent call last):
  File "/usr/bin/cqlsh.py", line 1301, in perform_simple_statement
result = future.result()
  File 
"/usr/share/cassandra/lib/cassandra-driver-internal-only-3.5.0.post0-d8d0456.zip/cassandra-driver-3.5.0.post0-d8d0456/cassandra/cluster.py",
 line 3650, in result
raise self._final_exception
UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 2: invalid 
start byte
{code}

We tried to read the data and it would refuse to read the name column (the UTF8 
error) and the last_pin_timestamp column had an absurdly large value.

We ended up rewriting the whole row as we had the data in another place and it 
fixed the problem. However there is clearly a race condition in the schema 
change sub-system.

Any ideas?
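
If it helps anyone reproduce or narrow this down: since cqlsh fails on the whole
row, selecting the columns one at a time makes it obvious exactly which values
refuse to deserialize. A rough sketch with the DataStax Python driver (the contact
point and channel id below are placeholders, not the real values):

{code:python}
from cassandra.cluster import Cluster

cluster = Cluster(['127.0.0.1'])   # placeholder contact point
session = cluster.connect('discord_channels')

channel_id = 0                     # put the affected channel id here

columns = ['name', 'topic', 'icon_hash', 'last_pin_timestamp',
           'recipients', 'permission_overwrites', 'application_id']

for col in columns:
    try:
        # The driver raises (e.g. UnicodeDecodeError for bad UTF-8) while
        # deserializing the response, so the failure surfaces per column here.
        rows = list(session.execute(
            'SELECT {} FROM channels WHERE id = %s'.format(col), (channel_id,)))
        print(col, '->', rows)
    except Exception as exc:
        print(col, '-> FAILED:', exc)
{code}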

  was:
We had the following schema in production. 

{code:cql}
CREATE TYPE IF NOT EXISTS discord_channels.channel_recipient (
nick

[jira] [Updated] (CASSANDRA-13004) Corruption while adding a column to a table

2016-12-05 Thread Stanislav Vishnevskiy (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stanislav Vishnevskiy updated CASSANDRA-13004:
--
Description: 
We had the following schema in production. 

{code:none}
CREATE TYPE IF NOT EXISTS discord_channels.channel_recipient (
nick text
);

CREATE TYPE IF NOT EXISTS discord_channels.channel_permission_overwrite (
id bigint,
type int,
allow_ int,
deny int
);

CREATE TABLE IF NOT EXISTS discord_channels.channels (
id bigint,
guild_id bigint,
type tinyint,
name text,
topic text,
position int,
owner_id bigint,
icon_hash text,
recipients map<bigint, frozen<channel_recipient>>,
permission_overwrites map<bigint, frozen<channel_permission_overwrite>>,
bitrate int,
user_limit int,
last_pin_timestamp timestamp,
last_message_id bigint,
PRIMARY KEY (id)
);
{code}

And then we executed the following alter.

{code:none}
ALTER TABLE discord_channels.channels ADD application_id bigint;
{code}

And one row (that we can tell) got corrupted at the same time and could no 
longer be read from the Python driver. 

{code:none}
[E 161206 01:56:58 geventreactor:141] Error decoding response from Cassandra. 
ver(4); flags(); stream(27); op(8); offset(9); len(887); buffer: 
'\x84\x00\x00\x1b\x08\x00\x00\x03w\x00\x00\x00\x02\x00\x00\x00\x01\x00\x00\x00\x0f\x00\x10discord_channels\x00\x08channels\x00\x02id\x00\x02\x00\x0eapplication_id\x00\x02\x00\x07bitrate\x00\t\x00\x08guild_id\x00\x02\x00\ticon_hash\x00\r\x00\x0flast_message_id\x00\x02\x00\x12last_pin_timestamp\x00\x0b\x00\x04name\x00\r\x00\x08owner_id\x00\x02\x00\x15permission_overwrites\x00!\x00\x02\x000\x00\x10discord_channels\x00\x1cchannel_permission_overwrite\x00\x04\x00\x02id\x00\x02\x00\x04type\x00\t\x00\x06allow_\x00\t\x00\x04deny\x00\t\x00\x08position\x00\t\x00\nrecipients\x00!\x00\x02\x000\x00\x10discord_channels\x00\x11channel_recipient\x00\x01\x00\x04nick\x00\r\x00\x05topic\x00\r\x00\x04type\x00\x14\x00\nuser_limit\x00\t\x00\x00\x00\x01\x00\x00\x00\x08\x03\x8a\x19\x8e\xf8\x82\x00\x01\xff\xff\xff\xff\x00\x00\x00\x04\x00\x00\xfa\x00\x00\x00\x00\x08\x00\x00\xfa\x00\x00\xf8G\xc5\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8b\xc0\xb5nB\x00\x02\x00\x00\x00\x08G\xc5\xffI\x98\xc4\xb4(\x00\x00\x00\x03\x8b\xc0\xa8\xff\xff\xff\xff\x00\x00\x01<\x00\x00\x00\x06\x00\x00\x00\x08\x03\x81L\xea\xfc\x82\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x81L\xea\xfc\x82\x00\n\x00\x00\x00\x04\x00\x00\x00\x01\x00\x00\x00\x04\x00\x00\x08\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1e\xe6\x8b\x80\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1e\xe6\x8b\x80\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x040\x07\xf8Q\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1f\x1b{\x82\x00\x00\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1f\x1b{\x82\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x07\xf8Q\x00\x00\x00\x04\x10\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1fH6\x82\x00\x01\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1fH6\x82\x00\x01\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x05\xe8A\x00\x00\x00\x04\x10\x02\x00\x00\x00\x00\x00\x08\x03\x8a+=\xca\xc0\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a+=\xca\xc0\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x08\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x8f\x979\x80\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x8f\x979\x80\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00
 
\x08\x01\x00\x00\x00\x04\xc4\xb4(\x00\xff\xff\xff\xff\x00\x00\x00O[f\x80Q\x07general\x05\xf8G\xc5\xffI\x98\xc4\xb4(\x00\xf8O[f\x80Q\x00\x00\x00\x02\x04\xf8O[f\x80Q\x00\xf8G\xc5\xffI\x98\x01\x00\x00\xf8O[f\x80Q\x00\x00\x00\x00\xf8G\xc5\xffI\x97\xc4\xb4(\x06\x00\xf8O\x7fe\x1fm\x08\x03\x00\x00\x00\x01\x00\x00\x00\x00\x04\x00\x00\x00\x00'
{code}

And then in cqlsh when trying to read the row we got this. 

{code:none}
/usr/bin/cqlsh.py:632: DateOverFlowWarning: Some timestamps are larger than 
Python datetime can represent. Timestamps are displayed in milliseconds from 
epoch.
Traceback (most recent call last):
  File "/usr/bin/cqlsh.py", line 1301, in perform_simple_statement
result = future.result()
  File 
"/usr/share/cassandra/lib/cassandra-driver-internal-only-3.5.0.post0-d8d0456.zip/cassandra-driver-3.5.0.post0-d8d0456/cassandra/cluster.py",
 line 3650, in result
raise self._final_exception
UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 2: invalid 
start byte
{code}

We tried to read the data and it would refuse to read the name column (the UTF8 
error) and the last_pin_timestamp column had an absurdly large value.

We ended up rewriting the whole row as we had the data in another place and it 
fixed the problem. However there is clearly a race condition in the schema 
change sub-system.

Any ideas?

  was:
We had the following schema in production. 

{code:text}
CREATE TYPE IF NOT EXISTS discord_channels.channel_recipient (
nick 

[jira] [Updated] (CASSANDRA-13004) Corruption while adding a column to a table in production

2016-12-05 Thread Stanislav Vishnevskiy (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stanislav Vishnevskiy updated CASSANDRA-13004:
--
Summary: Corruption while adding a column to a table in production  (was: 
Corruption while add a column to a table in production)

> Corruption while adding a column to a table in production
> -
>
> Key: CASSANDRA-13004
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13004
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Stanislav Vishnevskiy
>
> We had the following schema in production. 
> {code:text}
> CREATE TYPE IF NOT EXISTS discord_channels.channel_recipient (
> nick text
> );
> CREATE TYPE IF NOT EXISTS discord_channels.channel_permission_overwrite (
> id bigint,
> type int,
> allow_ int,
> deny int
> );
> CREATE TABLE IF NOT EXISTS discord_channels.channels (
> id bigint,
> guild_id bigint,
> type tinyint,
> name text,
> topic text,
> position int,
> owner_id bigint,
> icon_hash text,
> recipients map<bigint, frozen<channel_recipient>>,
> permission_overwrites map<bigint, frozen<channel_permission_overwrite>>,
> bitrate int,
> user_limit int,
> last_pin_timestamp timestamp,
> last_message_id bigint,
> PRIMARY KEY (id)
> );
> {/code}
> And then we executed the following alter.
> {code:text}
> ALTER TABLE discord_channels.channels ADD application_id bigint;
> {/code}
> And one row (that we can tell) got corrupted at the same time and could no 
> longer be read from the Python driver. 
> {code:text}
> [E 161206 01:56:58 geventreactor:141] Error decoding response from Cassandra. 
> ver(4); flags(); stream(27); op(8); offset(9); len(887); buffer: 
> '\x84\x00\x00\x1b\x08\x00\x00\x03w\x00\x00\x00\x02\x00\x00\x00\x01\x00\x00\x00\x0f\x00\x10discord_channels\x00\x08channels\x00\x02id\x00\x02\x00\x0eapplication_id\x00\x02\x00\x07bitrate\x00\t\x00\x08guild_id\x00\x02\x00\ticon_hash\x00\r\x00\x0flast_message_id\x00\x02\x00\x12last_pin_timestamp\x00\x0b\x00\x04name\x00\r\x00\x08owner_id\x00\x02\x00\x15permission_overwrites\x00!\x00\x02\x000\x00\x10discord_channels\x00\x1cchannel_permission_overwrite\x00\x04\x00\x02id\x00\x02\x00\x04type\x00\t\x00\x06allow_\x00\t\x00\x04deny\x00\t\x00\x08position\x00\t\x00\nrecipients\x00!\x00\x02\x000\x00\x10discord_channels\x00\x11channel_recipient\x00\x01\x00\x04nick\x00\r\x00\x05topic\x00\r\x00\x04type\x00\x14\x00\nuser_limit\x00\t\x00\x00\x00\x01\x00\x00\x00\x08\x03\x8a\x19\x8e\xf8\x82\x00\x01\xff\xff\xff\xff\x00\x00\x00\x04\x00\x00\xfa\x00\x00\x00\x00\x08\x00\x00\xfa\x00\x00\xf8G\xc5\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8b\xc0\xb5nB\x00\x02\x00\x00\x00\x08G\xc5\xffI\x98\xc4\xb4(\x00\x00\x00\x03\x8b\xc0\xa8\xff\xff\xff\xff\x00\x00\x01<\x00\x00\x00\x06\x00\x00\x00\x08\x03\x81L\xea\xfc\x82\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x81L\xea\xfc\x82\x00\n\x00\x00\x00\x04\x00\x00\x00\x01\x00\x00\x00\x04\x00\x00\x08\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1e\xe6\x8b\x80\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1e\xe6\x8b\x80\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x040\x07\xf8Q\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1f\x1b{\x82\x00\x00\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1f\x1b{\x82\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x07\xf8Q\x00\x00\x00\x04\x10\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1fH6\x82\x00\x01\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1fH6\x82\x00\x01\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x05\xe8A\x00\x00\x00\x04\x10\x02\x00\x00\x00\x00\x00\x08\x03\x8a+=\xca\xc0\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a+=\xca\xc0\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x08\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x8f\x979\x80\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x8f\x979\x80\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00
>  
> \x08\x01\x00\x00\x00\x04\xc4\xb4(\x00\xff\xff\xff\xff\x00\x00\x00O[f\x80Q\x07general\x05\xf8G\xc5\xffI\x98\xc4\xb4(\x00\xf8O[f\x80Q\x00\x00\x00\x02\x04\xf8O[f\x80Q\x00\xf8G\xc5\xffI\x98\x01\x00\x00\xf8O[f\x80Q\x00\x00\x00\x00\xf8G\xc5\xffI\x97\xc4\xb4(\x06\x00\xf8O\x7fe\x1fm\x08\x03\x00\x00\x00\x01\x00\x00\x00\x00\x04\x00\x00\x00\x00'
> {code}
> And then in cqlsh when trying to read the row we got this. 
> {code:text}
> /usr/bin/cqlsh.py:632: DateOverFlowWarning: Some timestamps are larger than 
> Python datetime can represent. Timestamps are displayed in milliseconds from 
> epoch.
> Traceback (most recent call last):
>   File "/usr/bin/cqlsh.py", line 1301, in perform_simple_statement
> result = future.result()
>   File 
> "/usr/share/cassandra/lib/cassandra-driver-internal-only-3.5.0.post0-d8d0456.zip/cassandra-driver-3.5.0.post0-d8d0456/cassandra/cluster.py",
>  line 3650, in result
> raise self._final_exception
> UnicodeDe

[jira] [Updated] (CASSANDRA-13004) Corruption while add a column to a table in production

2016-12-05 Thread Stanislav Vishnevskiy (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stanislav Vishnevskiy updated CASSANDRA-13004:
--
Description: 
We had the following schema in production. 

{code:cql}
CREATE TYPE IF NOT EXISTS discord_channels.channel_recipient (
nick text
);

CREATE TYPE IF NOT EXISTS discord_channels.channel_permission_overwrite (
id bigint,
type int,
allow_ int,
deny int
);

CREATE TABLE IF NOT EXISTS discord_channels.channels (
id bigint,
guild_id bigint,
type tinyint,
name text,
topic text,
position int,
owner_id bigint,
icon_hash text,
recipients map<bigint, frozen<channel_recipient>>,
permission_overwrites map<bigint, frozen<channel_permission_overwrite>>,
bitrate int,
user_limit int,
last_pin_timestamp timestamp,
last_message_id bigint,
PRIMARY KEY (id)
);
{/code}

And then we executed the following alter.

{code:cql}
ALTER TABLE discord_channels.channels ADD application_id bigint;
{/code}

And one row (that we can tell) got corrupted at the same time and could no 
longer be read from the Python driver. 

{code:text}
[E 161206 01:56:58 geventreactor:141] Error decoding response from Cassandra. 
ver(4); flags(); stream(27); op(8); offset(9); len(887); buffer: 
'\x84\x00\x00\x1b\x08\x00\x00\x03w\x00\x00\x00\x02\x00\x00\x00\x01\x00\x00\x00\x0f\x00\x10discord_channels\x00\x08channels\x00\x02id\x00\x02\x00\x0eapplication_id\x00\x02\x00\x07bitrate\x00\t\x00\x08guild_id\x00\x02\x00\ticon_hash\x00\r\x00\x0flast_message_id\x00\x02\x00\x12last_pin_timestamp\x00\x0b\x00\x04name\x00\r\x00\x08owner_id\x00\x02\x00\x15permission_overwrites\x00!\x00\x02\x000\x00\x10discord_channels\x00\x1cchannel_permission_overwrite\x00\x04\x00\x02id\x00\x02\x00\x04type\x00\t\x00\x06allow_\x00\t\x00\x04deny\x00\t\x00\x08position\x00\t\x00\nrecipients\x00!\x00\x02\x000\x00\x10discord_channels\x00\x11channel_recipient\x00\x01\x00\x04nick\x00\r\x00\x05topic\x00\r\x00\x04type\x00\x14\x00\nuser_limit\x00\t\x00\x00\x00\x01\x00\x00\x00\x08\x03\x8a\x19\x8e\xf8\x82\x00\x01\xff\xff\xff\xff\x00\x00\x00\x04\x00\x00\xfa\x00\x00\x00\x00\x08\x00\x00\xfa\x00\x00\xf8G\xc5\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8b\xc0\xb5nB\x00\x02\x00\x00\x00\x08G\xc5\xffI\x98\xc4\xb4(\x00\x00\x00\x03\x8b\xc0\xa8\xff\xff\xff\xff\x00\x00\x01<\x00\x00\x00\x06\x00\x00\x00\x08\x03\x81L\xea\xfc\x82\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x81L\xea\xfc\x82\x00\n\x00\x00\x00\x04\x00\x00\x00\x01\x00\x00\x00\x04\x00\x00\x08\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1e\xe6\x8b\x80\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1e\xe6\x8b\x80\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x040\x07\xf8Q\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1f\x1b{\x82\x00\x00\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1f\x1b{\x82\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x07\xf8Q\x00\x00\x00\x04\x10\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1fH6\x82\x00\x01\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1fH6\x82\x00\x01\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x05\xe8A\x00\x00\x00\x04\x10\x02\x00\x00\x00\x00\x00\x08\x03\x8a+=\xca\xc0\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a+=\xca\xc0\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x08\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x8f\x979\x80\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x8f\x979\x80\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00
 
\x08\x01\x00\x00\x00\x04\xc4\xb4(\x00\xff\xff\xff\xff\x00\x00\x00O[f\x80Q\x07general\x05\xf8G\xc5\xffI\x98\xc4\xb4(\x00\xf8O[f\x80Q\x00\x00\x00\x02\x04\xf8O[f\x80Q\x00\xf8G\xc5\xffI\x98\x01\x00\x00\xf8O[f\x80Q\x00\x00\x00\x00\xf8G\xc5\xffI\x97\xc4\xb4(\x06\x00\xf8O\x7fe\x1fm\x08\x03\x00\x00\x00\x01\x00\x00\x00\x00\x04\x00\x00\x00\x00'
{code}

And then in cqlsh when trying to read the row we got this. 

{code:text}
/usr/bin/cqlsh.py:632: DateOverFlowWarning: Some timestamps are larger than 
Python datetime can represent. Timestamps are displayed in milliseconds from 
epoch.
Traceback (most recent call last):
  File "/usr/bin/cqlsh.py", line 1301, in perform_simple_statement
result = future.result()
  File 
"/usr/share/cassandra/lib/cassandra-driver-internal-only-3.5.0.post0-d8d0456.zip/cassandra-driver-3.5.0.post0-d8d0456/cassandra/cluster.py",
 line 3650, in result
raise self._final_exception
UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 2: invalid 
start byte
{/code}

We tried to read the data and it would refuse to read the name column (the UTF8 
error) and the last_pin_timestamp column had an absurdly large value.

We ended up rewriting the whole row as we had the data in another place and it 
fixed the problem. However there is clearly a race condition in the schema 
change sub-system.

Any ideas?

  was:
We had the following schema in production. 

{code:cql}
CREATE TYPE IF NOT EXISTS discord_channels.channel_recipient (
nick 

[jira] [Updated] (CASSANDRA-13004) Corruption while add a column to a table in production

2016-12-05 Thread Stanislav Vishnevskiy (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stanislav Vishnevskiy updated CASSANDRA-13004:
--
Description: 
We had the following schema in production. 

{code:cql}
CREATE TYPE IF NOT EXISTS discord_channels.channel_recipient (
nick text
);

CREATE TYPE IF NOT EXISTS discord_channels.channel_permission_overwrite (
id bigint,
type int,
allow_ int,
deny int
);

CREATE TABLE IF NOT EXISTS discord_channels.channels (
id bigint,
guild_id bigint,
type tinyint,
name text,
topic text,
position int,
owner_id bigint,
icon_hash text,
recipients map<bigint, frozen<channel_recipient>>,
permission_overwrites map<bigint, frozen<channel_permission_overwrite>>,
bitrate int,
user_limit int,
last_pin_timestamp timestamp,
last_message_id bigint,
PRIMARY KEY (id)
);
{/code}

And then we executed the following alter.

{code:cql}
ALTER TABLE discord_channels.channels ADD application_id bigint;
{/code}

And one row (that we can tell) got corrupted at the same time and could no 
longer be read from the Python driver. 

{code:text}
[E 161206 01:56:58 geventreactor:141] Error decoding response from Cassandra. 
ver(4); flags(); stream(27); op(8); offset(9); len(887); buffer: 
'\x84\x00\x00\x1b\x08\x00\x00\x03w\x00\x00\x00\x02\x00\x00\x00\x01\x00\x00\x00\x0f\x00\x10discord_channels\x00\x08channels\x00\x02id\x00\x02\x00\x0eapplication_id\x00\x02\x00\x07bitrate\x00\t\x00\x08guild_id\x00\x02\x00\ticon_hash\x00\r\x00\x0flast_message_id\x00\x02\x00\x12last_pin_timestamp\x00\x0b\x00\x04name\x00\r\x00\x08owner_id\x00\x02\x00\x15permission_overwrites\x00!\x00\x02\x000\x00\x10discord_channels\x00\x1cchannel_permission_overwrite\x00\x04\x00\x02id\x00\x02\x00\x04type\x00\t\x00\x06allow_\x00\t\x00\x04deny\x00\t\x00\x08position\x00\t\x00\nrecipients\x00!\x00\x02\x000\x00\x10discord_channels\x00\x11channel_recipient\x00\x01\x00\x04nick\x00\r\x00\x05topic\x00\r\x00\x04type\x00\x14\x00\nuser_limit\x00\t\x00\x00\x00\x01\x00\x00\x00\x08\x03\x8a\x19\x8e\xf8\x82\x00\x01\xff\xff\xff\xff\x00\x00\x00\x04\x00\x00\xfa\x00\x00\x00\x00\x08\x00\x00\xfa\x00\x00\xf8G\xc5\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8b\xc0\xb5nB\x00\x02\x00\x00\x00\x08G\xc5\xffI\x98\xc4\xb4(\x00\x00\x00\x03\x8b\xc0\xa8\xff\xff\xff\xff\x00\x00\x01<\x00\x00\x00\x06\x00\x00\x00\x08\x03\x81L\xea\xfc\x82\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x81L\xea\xfc\x82\x00\n\x00\x00\x00\x04\x00\x00\x00\x01\x00\x00\x00\x04\x00\x00\x08\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1e\xe6\x8b\x80\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1e\xe6\x8b\x80\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x040\x07\xf8Q\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1f\x1b{\x82\x00\x00\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1f\x1b{\x82\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x07\xf8Q\x00\x00\x00\x04\x10\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1fH6\x82\x00\x01\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1fH6\x82\x00\x01\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x05\xe8A\x00\x00\x00\x04\x10\x02\x00\x00\x00\x00\x00\x08\x03\x8a+=\xca\xc0\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a+=\xca\xc0\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x08\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x8f\x979\x80\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x8f\x979\x80\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00
 
\x08\x01\x00\x00\x00\x04\xc4\xb4(\x00\xff\xff\xff\xff\x00\x00\x00O[f\x80Q\x07general\x05\xf8G\xc5\xffI\x98\xc4\xb4(\x00\xf8O[f\x80Q\x00\x00\x00\x02\x04\xf8O[f\x80Q\x00\xf8G\xc5\xffI\x98\x01\x00\x00\xf8O[f\x80Q\x00\x00\x00\x00\xf8G\xc5\xffI\x97\xc4\xb4(\x06\x00\xf8O\x7fe\x1fm\x08\x03\x00\x00\x00\x01\x00\x00\x00\x00\x04\x00\x00\x00\x00'
{code}

And then in cqlsh when trying to read the row we got this. 

{code:text}
/usr/bin/cqlsh.py:632: DateOverFlowWarning: Some timestamps are larger than 
Python datetime can represent. Timestamps are displayed in milliseconds from 
epoch.
Traceback (most recent call last):
  File "/usr/bin/cqlsh.py", line 1301, in perform_simple_statement
result = future.result()
  File 
"/usr/share/cassandra/lib/cassandra-driver-internal-only-3.5.0.post0-d8d0456.zip/cassandra-driver-3.5.0.post0-d8d0456/cassandra/cluster.py",
 line 3650, in result
raise self._final_exception
UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 2: invalid 
start byte
{code}

We tried to read the data and it would refuse to read the name column (the UTF8 
error) and the last_pin_timestamp column had an absurdly large value.

We ended up rewriting the whole row as we had the data in another place and it 
fixed the problem. However there is clearly a race condition in the schema 
change sub-system.

Any ideas?

  was:
We had the following schema in production. 

{code:cql}
CREATE TYPE IF NOT EXISTS discord_channels.channel_recipient (
nick t

[jira] [Created] (CASSANDRA-13004) Corruption while add a column to a table in production

2016-12-05 Thread Stanislav Vishnevskiy (JIRA)
Stanislav Vishnevskiy created CASSANDRA-13004:
-

 Summary: Corruption while add a column to a table in production
 Key: CASSANDRA-13004
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13004
 Project: Cassandra
  Issue Type: Bug
Reporter: Stanislav Vishnevskiy


We had the following schema in production. 

{code:cql}
CREATE TYPE IF NOT EXISTS discord_channels.channel_recipient (
nick text
);

CREATE TYPE IF NOT EXISTS discord_channels.channel_permission_overwrite (
id bigint,
type int,
allow_ int,
deny int
);

CREATE TABLE IF NOT EXISTS discord_channels.channels (
id bigint,
guild_id bigint,
type tinyint,
name text,
topic text,
position int,
owner_id bigint,
icon_hash text,
recipients map<bigint, frozen<channel_recipient>>,
permission_overwrites map<bigint, frozen<channel_permission_overwrite>>,
bitrate int,
user_limit int,
last_pin_timestamp timestamp,
last_message_id bigint,
PRIMARY KEY (id)
);
{/code}

And then we executed the following alter.

{code:cql}
ALTER TABLE discord_channels.channels ADD application_id bigint;
{/code}

And one row (that we can tell) got corrupted at the same time and could no 
longer be read from the Python driver. 

{code}
[E 161206 01:56:58 geventreactor:141] Error decoding response from Cassandra. 
ver(4); flags(); stream(27); op(8); offset(9); len(887); buffer: 
'\x84\x00\x00\x1b\x08\x00\x00\x03w\x00\x00\x00\x02\x00\x00\x00\x01\x00\x00\x00\x0f\x00\x10discord_channels\x00\x08channels\x00\x02id\x00\x02\x00\x0eapplication_id\x00\x02\x00\x07bitrate\x00\t\x00\x08guild_id\x00\x02\x00\ticon_hash\x00\r\x00\x0flast_message_id\x00\x02\x00\x12last_pin_timestamp\x00\x0b\x00\x04name\x00\r\x00\x08owner_id\x00\x02\x00\x15permission_overwrites\x00!\x00\x02\x000\x00\x10discord_channels\x00\x1cchannel_permission_overwrite\x00\x04\x00\x02id\x00\x02\x00\x04type\x00\t\x00\x06allow_\x00\t\x00\x04deny\x00\t\x00\x08position\x00\t\x00\nrecipients\x00!\x00\x02\x000\x00\x10discord_channels\x00\x11channel_recipient\x00\x01\x00\x04nick\x00\r\x00\x05topic\x00\r\x00\x04type\x00\x14\x00\nuser_limit\x00\t\x00\x00\x00\x01\x00\x00\x00\x08\x03\x8a\x19\x8e\xf8\x82\x00\x01\xff\xff\xff\xff\x00\x00\x00\x04\x00\x00\xfa\x00\x00\x00\x00\x08\x00\x00\xfa\x00\x00\xf8G\xc5\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8b\xc0\xb5nB\x00\x02\x00\x00\x00\x08G\xc5\xffI\x98\xc4\xb4(\x00\x00\x00\x03\x8b\xc0\xa8\xff\xff\xff\xff\x00\x00\x01<\x00\x00\x00\x06\x00\x00\x00\x08\x03\x81L\xea\xfc\x82\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x81L\xea\xfc\x82\x00\n\x00\x00\x00\x04\x00\x00\x00\x01\x00\x00\x00\x04\x00\x00\x08\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1e\xe6\x8b\x80\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1e\xe6\x8b\x80\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x040\x07\xf8Q\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1f\x1b{\x82\x00\x00\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1f\x1b{\x82\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x07\xf8Q\x00\x00\x00\x04\x10\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x1fH6\x82\x00\x01\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x1fH6\x82\x00\x01\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x05\xe8A\x00\x00\x00\x04\x10\x02\x00\x00\x00\x00\x00\x08\x03\x8a+=\xca\xc0\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a+=\xca\xc0\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x08\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x08\x03\x8a\x8f\x979\x80\x00\n\x00\x00\x00$\x00\x00\x00\x08\x03\x8a\x8f\x979\x80\x00\n\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x04\x00
 
\x08\x01\x00\x00\x00\x04\xc4\xb4(\x00\xff\xff\xff\xff\x00\x00\x00O[f\x80Q\x07general\x05\xf8G\xc5\xffI\x98\xc4\xb4(\x00\xf8O[f\x80Q\x00\x00\x00\x02\x04\xf8O[f\x80Q\x00\xf8G\xc5\xffI\x98\x01\x00\x00\xf8O[f\x80Q\x00\x00\x00\x00\xf8G\xc5\xffI\x97\xc4\xb4(\x06\x00\xf8O\x7fe\x1fm\x08\x03\x00\x00\x00\x01\x00\x00\x00\x00\x04\x00\x00\x00\x00'
{code}

And then in cqlsh when trying to read the row we got this. 

{code}
/usr/bin/cqlsh.py:632: DateOverFlowWarning: Some timestamps are larger than 
Python datetime can represent. Timestamps are displayed in milliseconds from 
epoch.
Traceback (most recent call last):
  File "/usr/bin/cqlsh.py", line 1301, in perform_simple_statement
result = future.result()
  File 
"/usr/share/cassandra/lib/cassandra-driver-internal-only-3.5.0.post0-d8d0456.zip/cassandra-driver-3.5.0.post0-d8d0456/cassandra/cluster.py",
 line 3650, in result
raise self._final_exception
UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 2: invalid 
start byte
{/code}

We tried to read the data and it would refuse to read the name column (the UTF8 
error) and the last_pin_timestamp column had an absurdly large value.

We ended up rewriting the whole row as we had the data in another place and it 
fixed the problem. However there is clearly a race condition in the schema 
change sub-system.

Any ideas

[jira] [Commented] (CASSANDRA-12144) Undeletable rows after upgrading from 2.2.4 to 3.0.7

2016-07-06 Thread Stanislav Vishnevskiy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15365165#comment-15365165
 ] 

Stanislav Vishnevskiy commented on CASSANDRA-12144:
---

I sent you an email.

> Undeletable rows after upgrading from 2.2.4 to 3.0.7
> 
>
> Key: CASSANDRA-12144
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12144
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Stanislav Vishnevskiy
>Assignee: Alex Petrov
>
> We upgraded our cluster today and now have some rows that refuse to delete.
> Here are some example traces.
> https://gist.github.com/vishnevskiy/36aa18c468344ea22d14f9fb9b99171d
> Even weirder.
> Updating the row and querying it back results in 2 rows even though the id is 
> the clustering key.
> {noformat}
>  user_id            | id                 | since                    | type
> --------------------+--------------------+--------------------------+------
>  116138050710536192 | 153047019424972800 |                     null |    0
>  116138050710536192 | 153047019424972800 | 2016-05-30 14:53:08+0000 |    2
> {noformat}
> And then deleting it again only removes the new one.
> {noformat}
> cqlsh:discord_relationships> DELETE FROM relationships WHERE user_id = 
> 116138050710536192 AND id = 153047019424972800;
> cqlsh:discord_relationships> SELECT * FROM relationships WHERE user_id = 
> 116138050710536192 AND id = 153047019424972800;
>  user_id            | id                 | since                    | type
> --------------------+--------------------+--------------------------+------
>  116138050710536192 | 153047019424972800 | 2016-05-30 14:53:08+0000 |    2
> {noformat}
> We tried repairing, compacting, scrubbing. No Luck.
> Not sure what to do. Is anyone aware of this?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12144) Undeletable rows after upgrading from 2.2.4 to 3.0.7

2016-07-06 Thread Stanislav Vishnevskiy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15365044#comment-15365044
 ] 

Stanislav Vishnevskiy commented on CASSANDRA-12144:
---

Email sent.

> Undeletable rows after upgrading from 2.2.4 to 3.0.7
> 
>
> Key: CASSANDRA-12144
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12144
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Stanislav Vishnevskiy
>Assignee: Alex Petrov
>
> We upgraded our cluster today and now have some rows that refuse to delete.
> Here are some example traces.
> https://gist.github.com/vishnevskiy/36aa18c468344ea22d14f9fb9b99171d
> Even weirder.
> Updating the row and querying it back results in 2 rows even though the id is 
> the clustering key.
> {noformat}
>  user_id            | id                 | since                    | type
> --------------------+--------------------+--------------------------+------
>  116138050710536192 | 153047019424972800 |                     null |    0
>  116138050710536192 | 153047019424972800 | 2016-05-30 14:53:08+0000 |    2
> {noformat}
> And then deleting it again only removes the new one.
> {noformat}
> cqlsh:discord_relationships> DELETE FROM relationships WHERE user_id = 
> 116138050710536192 AND id = 153047019424972800;
> cqlsh:discord_relationships> SELECT * FROM relationships WHERE user_id = 
> 116138050710536192 AND id = 153047019424972800;
>  user_id            | id                 | since                    | type
> --------------------+--------------------+--------------------------+------
>  116138050710536192 | 153047019424972800 | 2016-05-30 14:53:08+0000 |    2
> {noformat}
> We tried repairing, compacting, scrubbing. No Luck.
> Not sure what to do. Is anyone aware of this?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12144) Undeletable rows after upgrading from 2.2.4 to 3.0.7

2016-07-06 Thread Stanislav Vishnevskiy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15364833#comment-15364833
 ] 

Stanislav Vishnevskiy commented on CASSANDRA-12144:
---

Yup, for some partitions it somehow ends up with multiple rows for the same primary 
key. It seems to affect only old rows that existed prior to the upgrade.

We followed the steps outlined in this document.

https://docs.datastax.com/en/latest-upgrade/upgrade/cassandra/upgrdBestPractCassandra.html

Which included running upgradesstables before and after the upgrade.

> Undeletable rows after upgrading from 2.2.4 to 3.0.7
> 
>
> Key: CASSANDRA-12144
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12144
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Stanislav Vishnevskiy
>
> We upgraded our cluster today and now have some rows that refuse to delete.
> Here are some example traces.
> https://gist.github.com/vishnevskiy/36aa18c468344ea22d14f9fb9b99171d
> Even weirder.
> Updating the row and querying it back results in 2 rows even though the id is 
> the clustering key.
> {noformat}
>  user_id            | id                 | since                    | type
> --------------------+--------------------+--------------------------+------
>  116138050710536192 | 153047019424972800 |                     null |    0
>  116138050710536192 | 153047019424972800 | 2016-05-30 14:53:08+0000 |    2
> {noformat}
> And then deleting it again only removes the new one.
> {noformat}
> cqlsh:discord_relationships> DELETE FROM relationships WHERE user_id = 
> 116138050710536192 AND id = 153047019424972800;
> cqlsh:discord_relationships> SELECT * FROM relationships WHERE user_id = 
> 116138050710536192 AND id = 153047019424972800;
>  user_id            | id                 | since                    | type
> --------------------+--------------------+--------------------------+------
>  116138050710536192 | 153047019424972800 | 2016-05-30 14:53:08+0000 |    2
> {noformat}
> We tried repairing, compacting, scrubbing. No Luck.
> Not sure what to do. Is anyone aware of this?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-12144) Undeletable rows after upgrading from 2.2.4 to 3.0.7

2016-07-06 Thread Stanislav Vishnevskiy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15364659#comment-15364659
 ] 

Stanislav Vishnevskiy edited comment on CASSANDRA-12144 at 7/6/16 5:16 PM:
---

The writetime on the rows is in the GitHub gist.


{noformat}
cqlsh>  SELECT WRITETIME(since), WRITETIME(type) FROM 
discord_relationships.relationships WHERE  user_id = 116138050710536192 AND id 
= 153047019424972800;

 writetime(since) | writetime(type)
------------------+------------------
 1464619988173052 | 1464619988173052
{noformat}

So that was written on May 30th.
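
For reference, that writetime is microseconds since the Unix epoch, and converting 
it lands exactly on the timestamp shown in the since column:

{code:python}
from datetime import datetime, timezone

write_time_us = 1464619988173052  # WRITETIME(since) from the query above

# Cassandra writetimes are microseconds since the epoch.
print(datetime.fromtimestamp(write_time_us / 1_000_000, tz=timezone.utc))
# -> 2016-05-30 14:53:08.173052+00:00
{code}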

This cluster is half a year old and exhibited zero issues until yesterday 
pretty much right after the upgrade finished. We also are noticing a weird key 
cache hit rate right from when the upgrade finished.

http://i.imgur.com/JDihdGO.png

The schema is as follows.

{noformat}
CREATE KEYSPACE discord_relationships WITH replication = {'class': 
'NetworkTopologyStrategy', 'us-east1': '3'}  AND durable_writes = true;

CREATE TABLE discord_relationships.relationships (
user_id bigint,
id bigint,
since timestamp,
type tinyint,
PRIMARY KEY (user_id, id)
) WITH CLUSTERING ORDER BY (id ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class': 
'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';
{noformat}

Thanks.


was (Author: stanislav):
The writetime on the rows is in the GitHub


{noformat}
cqlsh>  SELECT WRITETIME(since), WRITETIME(type) FROM 
discord_relationships.relationships WHERE  user_id = 116138050710536192 AND id 
= 153047019424972800;

 writetime(since) | writetime(type)
------------------+------------------
 1464619988173052 | 1464619988173052
{noformat}

So that was written on May 30th.

This cluster is half a year old and exhibited zero issues until yesterday 
pretty much right after the upgrade finished. We also are noticing a weird key 
cache hit rate right from when the upgrade finished.

http://i.imgur.com/JDihdGO.png

The schema is as follows.

{noformat}
CREATE KEYSPACE discord_relationships WITH replication = {'class': 
'NetworkTopologyStrategy', 'us-east1': '3'}  AND durable_writes = true;

CREATE TABLE discord_relationships.relationships (
user_id bigint,
id bigint,
since timestamp,
type tinyint,
PRIMARY KEY (user_id, id)
) WITH CLUSTERING ORDER BY (id ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class': 
'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';
{noformat}

Thanks.

> Undeletable rows after upgrading from 2.2.4 to 3.0.7
> 
>
> Key: CASSANDRA-12144
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12144
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Stanislav Vishnevskiy
>
> We upgraded our cluster today and now have some rows that refuse to delete.
> Here are some example traces.
> https://gist.github.com/vishnevskiy/36aa18c468344ea22d14f9fb9b99171d
> Even weirder.
> Updating the row and querying it back results in 2 rows even though the id is 
> the clustering key.
> {noformat}
>  user_id            | id                 | since                    | type
> --------------------+--------------------+--------------------------+------
>  116138050710536192 | 153047019424972800 |                     null |    0
>  116138050710536192 | 153047019424972800 | 2016-05-30 14:53:08+0000 |    2
> {noformat}
> And then deleting it again only removes the new one.
> {noformat}
> cqlsh:discord_relationships> DELETE FROM relationships WHERE user_id = 
> 116138050710536192 AND id = 153047019424972800;
> cqlsh:discord_relationships> SELECT * FROM relationships WHERE 

[jira] [Commented] (CASSANDRA-12144) Undeletable rows after upgrading from 2.2.4 to 3.0.7

2016-07-06 Thread Stanislav Vishnevskiy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15364659#comment-15364659
 ] 

Stanislav Vishnevskiy commented on CASSANDRA-12144:
---

The writetime on the rows is in the GitHub


{noformat}
cqlsh>  SELECT WRITETIME(since), WRITETIME(type) FROM 
discord_relationships.relationships WHERE  user_id = 116138050710536192 AND id 
= 153047019424972800;

 writetime(since) | writetime(type)
------------------+------------------
 1464619988173052 | 1464619988173052
{noformat}

So that was written on May 30th.

This cluster is half a year old and exhibited zero issues until yesterday 
pretty much right after the upgrade finished. We also are noticing a weird key 
cache hit rate right from when the upgrade finished.

http://i.imgur.com/JDihdGO.png

The schema is as follows.

{noformat}
CREATE KEYSPACE discord_relationships WITH replication = {'class': 
'NetworkTopologyStrategy', 'us-east1': '3'}  AND durable_writes = true;

CREATE TABLE discord_relationships.relationships (
user_id bigint,
id bigint,
since timestamp,
type tinyint,
PRIMARY KEY (user_id, id)
) WITH CLUSTERING ORDER BY (id ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class': 
'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';
{noformat}

Thanks.

> Undeletable rows after upgrading from 2.2.4 to 3.0.7
> 
>
> Key: CASSANDRA-12144
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12144
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Stanislav Vishnevskiy
>
> We upgraded our cluster today and now have some rows that refuse to delete.
> Here are some example traces.
> https://gist.github.com/vishnevskiy/36aa18c468344ea22d14f9fb9b99171d
> Even weirder.
> Updating the row and querying it back results in 2 rows even though the id is 
> the clustering key.
> {noformat}
>  user_id            | id                 | since                    | type
> --------------------+--------------------+--------------------------+------
>  116138050710536192 | 153047019424972800 |                     null |    0
>  116138050710536192 | 153047019424972800 | 2016-05-30 14:53:08+0000 |    2
> {noformat}
> And then deleting it again only removes the new one.
> {noformat}
> cqlsh:discord_relationships> DELETE FROM relationships WHERE user_id = 
> 116138050710536192 AND id = 153047019424972800;
> cqlsh:discord_relationships> SELECT * FROM relationships WHERE user_id = 
> 116138050710536192 AND id = 153047019424972800;
>  user_id            | id                 | since                    | type
> --------------------+--------------------+--------------------------+------
>  116138050710536192 | 153047019424972800 | 2016-05-30 14:53:08+0000 |    2
> {noformat}
> We tried repairing, compacting, scrubbing. No Luck.
> Not sure what to do. Is anyone aware of this?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12144) Undeletable rows after upgrading from 2.2.4 to 3.0.7

2016-07-06 Thread Stanislav Vishnevskiy (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stanislav Vishnevskiy updated CASSANDRA-12144:
--
Description: 
We upgraded our cluster today and now have some rows that refuse to delete.

Here are some example traces.

https://gist.github.com/vishnevskiy/36aa18c468344ea22d14f9fb9b99171d

Even weirder.

Updating the row and querying it back results in 2 rows even though the id is 
the clustering key.

 user_id            | id                 | since                    | type
--------------------+--------------------+--------------------------+------
 116138050710536192 | 153047019424972800 |                     null |    0
 116138050710536192 | 153047019424972800 | 2016-05-30 14:53:08+0000 |    2

And then deleting it again only removes the new one.

cqlsh:discord_relationships> DELETE FROM relationships WHERE user_id = 
116138050710536192 AND id = 153047019424972800;
cqlsh:discord_relationships> SELECT * FROM relationships WHERE user_id = 
116138050710536192 AND id = 153047019424972800;

 user_id            | id                 | since                    | type
--------------------+--------------------+--------------------------+------
 116138050710536192 | 153047019424972800 | 2016-05-30 14:53:08+0000 |    2

We tried repairing, compacting, scrubbing. No Luck.

Not sure what to do. Is anyone aware of this?
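
If it is useful for debugging, a quick way to see the duplication from a client 
(outside cqlsh) is to page over the whole partition and count clustering keys, 
e.g. with the DataStax Python driver (contact point below is a placeholder):

{code:python}
from collections import Counter
from cassandra.cluster import Cluster

cluster = Cluster(['127.0.0.1'])   # placeholder contact point
session = cluster.connect('discord_relationships')

user_id = 116138050710536192

rows = session.execute(
    'SELECT id, since, type, WRITETIME(type) AS wt '
    'FROM relationships WHERE user_id = %s', (user_id,))

seen = Counter()
for row in rows:
    seen[row.id] += 1
    print(row.id, row.since, row.type, row.wt)

# A clustering key appearing more than once should be impossible.
print('duplicated ids:', [i for i, n in seen.items() if n > 1])
{code}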

  was:
We upgraded our cluster today and now have some rows that refuse to delete.

Here are some example traces.

https://gist.github.com/vishnevskiy/36aa18c468344ea22d14f9fb9b99171d

We tried repairing, compacting, scrubbing. No Luck.

Not sure what to do. Is anyone aware of this?


> Undeletable rows after upgrading from 2.2.4 to 3.0.7
> 
>
> Key: CASSANDRA-12144
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12144
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Stanislav Vishnevskiy
>
> We upgraded our cluster today and now have some rows that refuse to delete.
> Here are some example traces.
> https://gist.github.com/vishnevskiy/36aa18c468344ea22d14f9fb9b99171d
> Even weirder.
> Updating the row and querying it back results in 2 rows even though the id is 
> the clustering key.
>  user_id            | id                 | since                    | type
> --------------------+--------------------+--------------------------+------
>  116138050710536192 | 153047019424972800 |                     null |    0
>  116138050710536192 | 153047019424972800 | 2016-05-30 14:53:08+0000 |    2
> And then deleting it again only removes the new one.
> cqlsh:discord_relationships> DELETE FROM relationships WHERE user_id = 
> 116138050710536192 AND id = 153047019424972800;
> cqlsh:discord_relationships> SELECT * FROM relationships WHERE user_id = 
> 116138050710536192 AND id = 153047019424972800;
>  user_id            | id                 | since                    | type
> --------------------+--------------------+--------------------------+------
>  116138050710536192 | 153047019424972800 | 2016-05-30 14:53:08+0000 |    2
> We tried repairing, compacting, scrubbing. No Luck.
> Not sure what to do. Is anyone aware of this?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-12144) Undeletable rows after upgrading from 2.2.4 to 3.0.7

2016-07-06 Thread Stanislav Vishnevskiy (JIRA)
Stanislav Vishnevskiy created CASSANDRA-12144:
-

 Summary: Undeletable rows after upgrading from 2.2.4 to 3.0.7
 Key: CASSANDRA-12144
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12144
 Project: Cassandra
  Issue Type: Bug
Reporter: Stanislav Vishnevskiy


We upgraded our cluster today and now have some rows that refuse to delete.

Here are some example traces.

https://gist.github.com/vishnevskiy/36aa18c468344ea22d14f9fb9b99171d

We tried repairing, compacting, scrubbing. No Luck.

Not sure what to do. Is anyone aware of this?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)