[jira] [Comment Edited] (CASSANDRA-6596) Split out outgoing stream throughput within a DC and inter-DC

2014-02-16 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902842#comment-13902842
 ] 

Vijay edited comment on CASSANDRA-6596 at 2/16/14 9:43 PM:
---

Thanks Benedict, fixed!


was (Author: vijay2...@yahoo.com):
Hi Benedict, not sure where I missed it; the change was to add a multiplier 
while initializing the throughput...
{code}
double currentThroughput =
    ((double) DatabaseDescriptor.getStreamThroughputOutboundMegabitsPerSec()) * 1024 * 1024 * 8;
{code}

 Split out outgoing stream throughput within a DC and inter-DC
 -

 Key: CASSANDRA-6596
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6596
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jeremy Hanna
Assignee: Vijay
Priority: Minor
 Fix For: 2.1

 Attachments: 0001-CASSANDRA-6596.patch


 Currently the outgoing stream throughput setting doesn't differentiate 
 between when it goes to another node in the same DC and when it goes to 
 another DC across a potentially bandwidth limited link.  It would be nice to 
 have that split out so that it could be tuned for each type of link.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Issue Comment Deleted] (CASSANDRA-6596) Split out outgoing stream throughput within a DC and inter-DC

2014-02-16 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-6596:
-

Comment: was deleted

(was: Thanks Benedict, fixed!)

 Split out outgoing stream throughput within a DC and inter-DC
 -

 Key: CASSANDRA-6596
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6596
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jeremy Hanna
Assignee: Vijay
Priority: Minor
 Fix For: 2.1

 Attachments: 0001-CASSANDRA-6596.patch


 Currently the outgoing stream throughput setting doesn't differentiate 
 between when it goes to another node in the same DC and when it goes to 
 another DC across a potentially bandwidth limited link.  It would be nice to 
 have that split out so that it could be tuned for each type of link.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6596) Split out outgoing stream throughput within a DC and inter-DC

2014-02-16 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902849#comment-13902849
 ] 

Vijay commented on CASSANDRA-6596:
--

Thanks Benedict! Fixed it; race in the comments...

 Split out outgoing stream throughput within a DC and inter-DC
 -

 Key: CASSANDRA-6596
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6596
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jeremy Hanna
Assignee: Vijay
Priority: Minor
 Fix For: 2.1

 Attachments: 0001-CASSANDRA-6596.patch


 Currently the outgoing stream throughput setting doesn't differentiate 
 between when it goes to another node in the same DC and when it goes to 
 another DC across a potentially bandwidth limited link.  It would be nice to 
 have that split out so that it could be tuned for each type of link.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6712) Equals without hashcode in SpeculativeRetry

2014-02-14 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902139#comment-13902139
 ] 

Vijay commented on CASSANDRA-6712:
--

+1

 Equals without hashcode in SpeculativeRetry
 ---

 Key: CASSANDRA-6712
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6712
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
Priority: Trivial
 Fix For: 2.0.6

 Attachments: 6712.txt


 This could cause problems if we were to start using supposed-to-be-equal SR 
 objects in a Hashmap.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6590) Gossip does not heal after a temporary partition at startup

2014-02-08 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13895835#comment-13895835
 ] 

Vijay commented on CASSANDRA-6590:
--

Hi Brandon,
I was not able to reproduce the above issue... (below is the log after the
network partition)
{code}
 INFO [GossipTasks:1] 2014-02-09 05:29:10,259 Gossiper.java (line 862) InetAddress /17.198.227.155 is now DOWN
 INFO [HANDSHAKE-/17.198.227.155] 2014-02-09 05:29:18,023 OutboundTcpConnection.java (line 386) Handshaking version with /17.198.227.155
 INFO [RequestResponseStage:33] 2014-02-09 05:29:18,038 Gossiper.java (line 848) InetAddress /17.198.227.155 is now UP
{code}

{quote}
I think we'll need a separate yaml option
{quote}
Done

{quote}
I'm not sure why the block in handleMajorStateChange moved
{quote}
The message was wrong; Up doesn't happen until the echo completes. In any case,
I reverted that.

Rebased at https://github.com/Vijay2win/cassandra/tree/6590-v4




 Gossip does not heal after a temporary partition at startup
 ---

 Key: CASSANDRA-6590
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6590
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Brandon Williams
Assignee: Vijay
 Fix For: 2.0.6

 Attachments: 0001-CASSANDRA-6590.patch, 0001-logging-for-6590.patch, 
 6590_disable_echo.txt


 See CASSANDRA-6571 for background.  If a node is partitioned on startup when 
 the echo command is sent, but then the partition heals, the halves of the 
 partition will never mark each other up despite being able to communicate.  
 This stems from CASSANDRA-3533.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6590) Gossip does not heal after a temporary partition at startup

2014-02-03 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889826#comment-13889826
 ] 

Vijay commented on CASSANDRA-6590:
--

Sorry, I was sending a different message during startup; fixed and pushed to
https://github.com/Vijay2win/cassandra/tree/6590-v3. Thanks!



 Gossip does not heal after a temporary partition at startup
 ---

 Key: CASSANDRA-6590
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6590
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Brandon Williams
Assignee: Vijay
 Fix For: 2.0.6

 Attachments: 0001-CASSANDRA-6590.patch, 0001-logging-for-6590.patch, 
 6590_disable_echo.txt


 See CASSANDRA-6571 for background.  If a node is partitioned on startup when 
 the echo command is sent, but then the partition heals, the halves of the 
 partition will never mark each other up despite being able to communicate.  
 This stems from CASSANDRA-3533.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (CASSANDRA-6590) Gossip does not heal after a temporary partition at startup

2014-01-26 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-6590:
-

Attachment: 0001-logging-for-6590.patch

Hi Brandon, it looks like realMarkAlive is called multiple times, hence the
issue. I removed the localState.markDead() and it works fine for now (patch
attached). Let me know...

 Gossip does not heal after a temporary partition at startup
 ---

 Key: CASSANDRA-6590
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6590
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Brandon Williams
Assignee: Vijay
 Fix For: 2.0.5

 Attachments: 0001-CASSANDRA-6590.patch, 0001-logging-for-6590.patch, 
 6590_disable_echo.txt


 See CASSANDRA-6571 for background.  If a node is partitioned on startup when 
 the echo command is sent, but then the partition heals, the halves of the 
 partition will never mark each other up despite being able to communicate.  
 This stems from CASSANDRA-3533.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (CASSANDRA-6596) Split out outgoing stream throughput within a DC and inter-DC

2014-01-19 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-6596:
-

Attachment: 0001-CASSANDRA-6596.patch

The attached patch introduces inter_dc_stream_throughput_outbound_megabits_per_sec,
which is a subset of stream_throughput_outbound_megabits_per_sec.

Currently the node throttles all the traffic it streams (this doesn't change
with this patch). In addition, the patch adds a second throttle across DCs.

One more thing: there might be a bug (in trunk) where the throttle is applied
to bytes instead of bits... Since it's not related to this ticket, I have not
changed it.

{code}
int toTransfer = (int) Math.min(transferBuffer.length, length - bytesTransferred);
int minReadable = (int) Math.min(transferBuffer.length, reader.length() - reader.getFilePointer());

reader.readFully(transferBuffer, 0, minReadable);
if (validator != null)
    validator.validate(transferBuffer, 0, minReadable);

limiter.acquire(toTransfer);
{code}
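
For illustration, a minimal sketch of the intended two-level throttle (not the
attached patch), assuming Guava's RateLimiter; the class name and the crossesDC
flag are illustrative:
{code}
import com.google.common.util.concurrent.RateLimiter;

// Hypothetical sketch: inter-DC traffic pays both the global cap and the
// inter-DC cap, so the inter-DC setting acts as a subset of the global one.
public class DualStreamLimiter
{
    private final RateLimiter global;  // bits per second
    private final RateLimiter interDC; // bits per second

    public DualStreamLimiter(double globalMegabits, double interDCMegabits)
    {
        global = RateLimiter.create(globalMegabits * 1024 * 1024);
        interDC = RateLimiter.create(interDCMegabits * 1024 * 1024);
    }

    public void acquire(int bytes, boolean crossesDC)
    {
        global.acquire(bytes * 8);      // everything we stream is throttled
        if (crossesDC)
            interDC.acquire(bytes * 8); // additional throttle across DCs
    }
}
{code}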

 Split out outgoing stream throughput within a DC and inter-DC
 -

 Key: CASSANDRA-6596
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6596
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jeremy Hanna
Assignee: Vijay
Priority: Minor
 Fix For: 2.1

 Attachments: 0001-CASSANDRA-6596.patch


 Currently the outgoing stream throughput setting doesn't differentiate 
 between when it goes to another node in the same DC and when it goes to 
 another DC across a potentially bandwidth limited link.  It would be nice to 
 have that split out so that it could be tuned for each type of link.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (CASSANDRA-6590) Gossip does not heal after a temporary partition at startup

2014-01-19 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-6590:
-

Attachment: 0001-CASSANDRA-6590.patch

Nit: do_firewall_check is true by default in the yaml but false in the config
class.

The attached patch is on top of the original patch by Brandon. It sets the
hibernate (dead) state as step 1 in joinTokenRing, which is changed to normal
at the end of the method.
The main fix (IMHO) is in the OutboundTcpConnection, where we time out so we
can reconnect when the socket hangs and makes the connection unusable during a
temporary network partition.

Please note: this patch renames the streaming_socket_timeout_in_ms
configuration to socket_timeout_in_ms and reuses it.
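
A minimal sketch of the timeout-and-reconnect idea, using only java.net.Socket
(illustrative, not the patch itself): bound blocking I/O so a socket wedged by
a partition throws instead of hanging forever, after which the next message
opens a fresh connection.
{code}
import java.net.Socket;
import java.net.SocketTimeoutException;

// Sketch: bound the blocking read with the (renamed) socket timeout so a
// wedged connection is torn down and rebuilt instead of hanging forever.
void readWithTimeout(Socket socket, int socketTimeoutInMs) throws Exception
{
    socket.setSoTimeout(socketTimeoutInMs); // e.g. socket_timeout_in_ms
    try
    {
        socket.getInputStream().read();
    }
    catch (SocketTimeoutException e)
    {
        socket.close(); // next message re-handshakes on a fresh connection
    }
}
{code}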

 Gossip does not heal after a temporary partition at startup
 ---

 Key: CASSANDRA-6590
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6590
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Brandon Williams
Assignee: Vijay
 Fix For: 2.0.5

 Attachments: 0001-CASSANDRA-6590.patch, 6590_disable_echo.txt


 See CASSANDRA-6571 for background.  If a node is partitioned on startup when 
 the echo command is sent, but then the partition heals, the halves of the 
 partition will never mark each other up despite being able to communicate.  
 This stems from CASSANDRA-3533.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6571) Quickly restarted nodes can list others as down indefinitely

2014-01-11 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13868848#comment-13868848
 ] 

Vijay commented on CASSANDRA-6571:
--

We had this discussion on IRC; we need to test this before...

To clarify:
(1) is the same as in the description
{quote}
I tried to fix this by defaulting isAlive=false in the constructor of
EndpointState.
{quote}
(2) we need to recover the receiving node from the hung state (while writing to
the socket) by restarting the connections...

 Quickly restarted nodes can list others as down indefinitely
 

 Key: CASSANDRA-6571
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6571
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Richard Low
Assignee: sankalp kohli
  Labels: gossip
 Fix For: 2.0.5

 Attachments: 6571.txt


 In a healthy cluster, if a node is restarted quickly, it may list other nodes 
 as down when it comes back up and never list them as up.  I reproduced it on 
 a small cluster running in Docker containers.
 1. Have a healthy 5 node cluster:
 {quote}
 $ nodetool status
 Datacenter: datacenter1
 =======================
 Status=Up/Down
 |/ State=Normal/Leaving/Joining/Moving
 --  Address          Load       Tokens  Owns (effective)  Host ID                               Rack
 UN  192.168.100.1    40.88 KB   256     38.3%             92930ef6-1b29-49f0-a8cd-f962b55dca1b  rack1
 UN  192.168.100.254  80.63 KB   256     39.6%             ef15a717-9d60-48fb-80a9-e0973abdd55e  rack1
 UN  192.168.100.3    87.78 KB   256     40.8%             4e6765db-97ed-4429-a9f4-8e29de247f18  rack1
 UN  192.168.100.2    75.22 KB   256     40.6%             e89bc581-5345-4abd-88ba-7018371940fc  rack1
 UN  192.168.100.4    80.83 KB   256     40.8%             466a9798-d484-44f0-aae8-bb2b78d80331  rack1
 {quote}
 2. Kill a node and restart it quickly:
 bq. kill -9 <pid> && start-cassandra
 3. Wait for the node to come back and more often than not, it lists one or 
 more other nodes as down indefinitely:
 {quote}
 $ nodetool status
 Datacenter: datacenter1
 =======================
 Status=Up/Down
 |/ State=Normal/Leaving/Joining/Moving
 --  Address          Load       Tokens  Owns (effective)  Host ID                               Rack
 UN  192.168.100.1    40.88 KB   256     38.3%             92930ef6-1b29-49f0-a8cd-f962b55dca1b  rack1
 UN  192.168.100.254  80.63 KB   256     39.6%             ef15a717-9d60-48fb-80a9-e0973abdd55e  rack1
 DN  192.168.100.3    87.78 KB   256     40.8%             4e6765db-97ed-4429-a9f4-8e29de247f18  rack1
 DN  192.168.100.2    75.22 KB   256     40.6%             e89bc581-5345-4abd-88ba-7018371940fc  rack1
 DN  192.168.100.4    80.83 KB   256     40.8%             466a9798-d484-44f0-aae8-bb2b78d80331  rack1
 {quote}
 From trace logging, here's what I think is going on:
 1. The nodes are all happy gossiping
 2. Restart node X. When it comes back up it starts gossiping with the other 
 nodes.
 3. Before node X marks node Y as alive, X sends an echo message (introduced 
 in CASSANDRA-3533)
 4. The echo message is received by Y. To reply, Y attempts to reuse a 
 connection to X. The connection is dead, but the message is attempted anyway 
 and fails.
 5. X never receives the echo back, so Y isn't marked as alive.
 6. X gossips to Y again, but because the endpoint isAlive() returns true, it 
 never calls markAlive() to properly set Y as alive.
 I tried to fix this by defaulting isAlive=false in the constructor of 
 EndpointState. This made it less likely to mark a node as down but it still 
 happens.
 The workaround is to leave a node down for a while so the connections die on 
 the remaining nodes.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6571) Quickly restarted nodes can list others as down indefinitely

2014-01-10 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13868685#comment-13868685
 ] 

Vijay commented on CASSANDRA-6571:
--

Not sure if this will fix it, because the remote machine has not responded back
(the echo message response).
1) I think we need to always mark the nodes as dead, and mark a node up only
after we receive the echo response
2) I think we need to check or reset the socket on the receiving side; maybe we
need to markDead (or retry the message after x seconds?)

Maybe this issue shows up because we removed the hibernate during restarts?
(we are not restarting the states) ==> [~brandon.williams]
I think the hang is on the echo response (socket.write())
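
To illustrate (1), a minimal sketch of marking a node up only after the echo
round-trip completes; the method shape and plumbing are assumptions loosely
modeled on the gossip/messaging internals, not the committed change.
{code}
// Sketch: assume dead until the peer answers the ECHO; only the callback
// flips the endpoint to UP. markAliveAfterEcho is an illustrative name, and
// the verb payload is simplified (the real message carries an EchoMessage).
private void markAliveAfterEcho(final InetAddress addr, final EndpointState state)
{
    state.markDead();
    IAsyncCallback echoResponse = new IAsyncCallback()
    {
        public void response(MessageIn msg)
        {
            realMarkAlive(addr, state); // only now mark UP
        }

        public boolean isLatencyForSnitch()
        {
            return false;
        }
    };
    MessagingService.instance().sendRR(new MessageOut(MessagingService.Verb.ECHO), addr, echoResponse);
}
{code}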

 Quickly restarted nodes can list others as down indefinitely
 

 Key: CASSANDRA-6571
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6571
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Richard Low
Assignee: sankalp kohli
  Labels: gossip
 Fix For: 2.0.5


 In a healthy cluster, if a node is restarted quickly, it may list other nodes 
 as down when it comes back up and never list them as up.  I reproduced it on 
 a small cluster running in Docker containers.
 1. Have a healthy 5 node cluster:
 {quote}
 $ nodetool status
 Datacenter: datacenter1
 =======================
 Status=Up/Down
 |/ State=Normal/Leaving/Joining/Moving
 --  Address          Load       Tokens  Owns (effective)  Host ID                               Rack
 UN  192.168.100.1    40.88 KB   256     38.3%             92930ef6-1b29-49f0-a8cd-f962b55dca1b  rack1
 UN  192.168.100.254  80.63 KB   256     39.6%             ef15a717-9d60-48fb-80a9-e0973abdd55e  rack1
 UN  192.168.100.3    87.78 KB   256     40.8%             4e6765db-97ed-4429-a9f4-8e29de247f18  rack1
 UN  192.168.100.2    75.22 KB   256     40.6%             e89bc581-5345-4abd-88ba-7018371940fc  rack1
 UN  192.168.100.4    80.83 KB   256     40.8%             466a9798-d484-44f0-aae8-bb2b78d80331  rack1
 {quote}
 2. Kill a node and restart it quickly:
 bq. kill -9 <pid> && start-cassandra
 3. Wait for the node to come back and more often than not, it lists one or 
 more other nodes as down indefinitely:
 {quote}
 $ nodetool status
 Datacenter: datacenter1
 =======================
 Status=Up/Down
 |/ State=Normal/Leaving/Joining/Moving
 --  Address          Load       Tokens  Owns (effective)  Host ID                               Rack
 UN  192.168.100.1    40.88 KB   256     38.3%             92930ef6-1b29-49f0-a8cd-f962b55dca1b  rack1
 UN  192.168.100.254  80.63 KB   256     39.6%             ef15a717-9d60-48fb-80a9-e0973abdd55e  rack1
 DN  192.168.100.3    87.78 KB   256     40.8%             4e6765db-97ed-4429-a9f4-8e29de247f18  rack1
 DN  192.168.100.2    75.22 KB   256     40.6%             e89bc581-5345-4abd-88ba-7018371940fc  rack1
 DN  192.168.100.4    80.83 KB   256     40.8%             466a9798-d484-44f0-aae8-bb2b78d80331  rack1
 {quote}
 From trace logging, here's what I think is going on:
 1. The nodes are all happy gossiping
 2. Restart node X. When it comes back up it starts gossiping with the other 
 nodes.
 3. Before node X marks node Y as alive, X sends an echo message (introduced 
 in CASSANDRA-3533)
 4. The echo message is received by Y. To reply, Y attempts to reuse a 
 connection to X. The connection is dead, but the message is attempted anyway 
 and fails.
 5. X never receives the echo back, so Y isn't marked as alive.
 6. X gossips to Y again, but because the endpoint isAlive() returns true, it 
 never calls markAlive() to properly set Y as alive.
 I tried to fix this by defaulting isAlive=false in the constructor of 
 EndpointState. This made it less likely to mark a node as down but it still 
 happens.
 The workaround is to leave a node down for a while so the connections die on 
 the remaining nodes.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Comment Edited] (CASSANDRA-6571) Quickly restarted nodes can list others as down indefinitely

2014-01-10 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13868685#comment-13868685
 ] 

Vijay edited comment on CASSANDRA-6571 at 1/11/14 6:46 AM:
---

Not sure if this will fix it, because the remote machine has not responded back
(the echo message response).
1) I think we need to always mark the nodes as dead, and mark a node up only
after we receive the echo response
2) I think we need to check or reset the socket on the receiving side; maybe we
need to markDead (or retry the message after x seconds?)

Maybe this issue shows up because we removed the hibernate during restarts?
(we are not resetting the states) ==> [~brandon.williams]
I think the hang is on the echo response (socket.write())


was (Author: vijay2...@yahoo.com):
Not sure if this will fix it, because the remote machine has not responded back
(the echo message response).
1) I think we need to always mark the nodes as dead, and mark a node up only
after we receive the echo response
2) I think we need to check or reset the socket on the receiving side; maybe we
need to markDead (or retry the message after x seconds?)

Maybe this issue shows up because we removed the hibernate during restarts?
(we are not restarting the states) ==> [~brandon.williams]
I think the hang is on the echo response (socket.write())

 Quickly restarted nodes can list others as down indefinitely
 

 Key: CASSANDRA-6571
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6571
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Richard Low
Assignee: sankalp kohli
  Labels: gossip
 Fix For: 2.0.5


 In a healthy cluster, if a node is restarted quickly, it may list other nodes 
 as down when it comes back up and never list them as up.  I reproduced it on 
 a small cluster running in Docker containers.
 1. Have a healthy 5 node cluster:
 {quote}
 $ nodetool status
 Datacenter: datacenter1
 =======================
 Status=Up/Down
 |/ State=Normal/Leaving/Joining/Moving
 --  Address          Load       Tokens  Owns (effective)  Host ID                               Rack
 UN  192.168.100.1    40.88 KB   256     38.3%             92930ef6-1b29-49f0-a8cd-f962b55dca1b  rack1
 UN  192.168.100.254  80.63 KB   256     39.6%             ef15a717-9d60-48fb-80a9-e0973abdd55e  rack1
 UN  192.168.100.3    87.78 KB   256     40.8%             4e6765db-97ed-4429-a9f4-8e29de247f18  rack1
 UN  192.168.100.2    75.22 KB   256     40.6%             e89bc581-5345-4abd-88ba-7018371940fc  rack1
 UN  192.168.100.4    80.83 KB   256     40.8%             466a9798-d484-44f0-aae8-bb2b78d80331  rack1
 {quote}
 2. Kill a node and restart it quickly:
 bq. kill -9 <pid> && start-cassandra
 3. Wait for the node to come back and more often than not, it lists one or 
 more other nodes as down indefinitely:
 {quote}
 $ nodetool status
 Datacenter: datacenter1
 =======================
 Status=Up/Down
 |/ State=Normal/Leaving/Joining/Moving
 --  Address          Load       Tokens  Owns (effective)  Host ID                               Rack
 UN  192.168.100.1    40.88 KB   256     38.3%             92930ef6-1b29-49f0-a8cd-f962b55dca1b  rack1
 UN  192.168.100.254  80.63 KB   256     39.6%             ef15a717-9d60-48fb-80a9-e0973abdd55e  rack1
 DN  192.168.100.3    87.78 KB   256     40.8%             4e6765db-97ed-4429-a9f4-8e29de247f18  rack1
 DN  192.168.100.2    75.22 KB   256     40.6%             e89bc581-5345-4abd-88ba-7018371940fc  rack1
 DN  192.168.100.4    80.83 KB   256     40.8%             466a9798-d484-44f0-aae8-bb2b78d80331  rack1
 {quote}
 From trace logging, here's what I think is going on:
 1. The nodes are all happy gossiping
 2. Restart node X. When it comes back up it starts gossiping with the other 
 nodes.
 3. Before node X marks node Y as alive, X sends an echo message (introduced 
 in CASSANDRA-3533)
 4. The echo message is received by Y. To reply, Y attempts to reuse a 
 connection to X. The connection is dead, but the message is attempted anyway 
 and fails.
 5. X never receives the echo back, so Y isn't marked as alive.
 6. X gossips to Y again, but because the endpoint isAlive() returns true, it 
 never calls markAlive() to properly set Y as alive.
 I tried to fix this by defaulting isAlive=false in the constructor of 
 EndpointState. This made it less likely to mark a node as down but it still 
 happens.
 The workaround is to leave a node down for a while so the connections die on 
 the remaining nodes.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (CASSANDRA-4914) Aggregate functions in CQL

2014-01-07 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-4914:
-

Assignee: (was: Vijay)

 Aggregate functions in CQL
 --

 Key: CASSANDRA-4914
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4914
 Project: Cassandra
  Issue Type: New Feature
Reporter: Vijay
 Fix For: 2.1


 The requirement is to do aggregation of data in Cassandra (wide rows of column 
 values of int, double, float, etc.), with some basic aggregate functions like 
 AVG, SUM, MEAN, MIN, MAX, etc. (for the columns within a row).
 Example:
 SELECT * FROM emp WHERE empID IN (130) ORDER BY deptID DESC;

  empid | deptid | first_name | last_name | salary
 -------+--------+------------+-----------+--------
    130 |      3 | joe        | doe       |   10.1
    130 |      2 | joe        | doe       |    100
    130 |      1 | joe        | doe       |  1e+03

 SELECT sum(salary), empid FROM emp WHERE empID IN (130);

  sum(salary) | empid
 -------------+-------
       1110.1 |   130



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6544) Reduce GC activity during compaction

2014-01-07 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864746#comment-13864746
 ] 

Vijay commented on CASSANDRA-6544:
--

Hi Jonathan, yep; in addition, if we can create an off-heap slab allocator
(and reuse the slabs), it will help reduce memory fragmentation. Hope that
makes sense.
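
As a minimal sketch of what such an allocator could look like (illustrative
names and sizes, not the eventual implementation): a direct ByteBuffer carved
up with a CAS on the offset, recyclable between compactions.
{code}
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical off-heap slab: regions are handed out via CAS on an offset,
// and reset() recycles the slab so we avoid allocation/fragmentation churn.
public class OffHeapSlab
{
    private final ByteBuffer slab;
    private final AtomicInteger position = new AtomicInteger(0);

    public OffHeapSlab(int sizeInBytes)
    {
        slab = ByteBuffer.allocateDirect(sizeInBytes);
    }

    /** Reserve 'size' bytes, or return null when the slab is exhausted. */
    public ByteBuffer allocate(int size)
    {
        while (true)
        {
            int cur = position.get();
            if (cur + size > slab.capacity())
                return null;
            if (position.compareAndSet(cur, cur + size))
            {
                ByteBuffer region = slab.duplicate();
                region.position(cur);
                region.limit(cur + size);
                return region.slice();
            }
        }
    }

    /** Recycle the slab for the next compaction. */
    public void reset()
    {
        position.set(0);
    }
}
{code}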

 Reduce GC activity during compaction
 

 Key: CASSANDRA-6544
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6544
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Vijay
Assignee: Vijay
 Fix For: 2.1


 We are noticing an increase in P99 while compactions are running at full 
 stream. Most of it is because of the increased GC activity (followed by full 
 GC).
 The obvious solution/workaround is to throttle the compactions, but with 
 SSDs we can get more disk bandwidth for reads and compactions.
 It would be nice to move the compaction object allocations off heap. The first 
 thing to do might be to create an off-heap slab allocator sized to the 
 compaction's in-memory size, and recycle it.
 Also we might want to make it configurable so folks can disable it when they 
 don't have off-heap memory to reserve.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Comment Edited] (CASSANDRA-6544) Reduce GC activity during compaction

2014-01-07 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864746#comment-13864746
 ] 

Vijay edited comment on CASSANDRA-6544 at 1/7/14 10:05 PM:
---

Hi Jonathan, yep; in addition, if we can create an off-heap slab allocator
(and reuse the slabs), it will help reduce memory fragmentation. Hope that
makes sense.


was (Author: vijay2...@yahoo.com):
Hi Jonathan, yep; in addition, if we can create an off-heap slab allocator
(and reuse the slabs), it will help reduce memory fragmentation. Hope that
make sure.

 Reduce GC activity during compaction
 

 Key: CASSANDRA-6544
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6544
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Vijay
Assignee: Vijay
 Fix For: 2.1


 We are noticing an increase in P99 while compactions are running at full 
 stream. Most of it is because of the increased GC activity (followed by full 
 GC).
 The obvious solution/workaround is to throttle the compactions, but with 
 SSDs we can get more disk bandwidth for reads and compactions.
 It would be nice to move the compaction object allocations off heap. The first 
 thing to do might be to create an off-heap slab allocator sized to the 
 compaction's in-memory size, and recycle it.
 Also we might want to make it configurable so folks can disable it when they 
 don't have off-heap memory to reserve.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-6544) Reduce GC activity during compaction

2014-01-07 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13865011#comment-13865011
 ] 

Vijay commented on CASSANDRA-6544:
--

Sure, working on it. Thanks!

 Reduce GC activity during compaction
 

 Key: CASSANDRA-6544
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6544
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Vijay
Assignee: Vijay
 Fix For: 2.1


 We are noticing an increase in P99 while compactions are running at full 
 stream. Most of it is because of the increased GC activity (followed by full 
 GC).
 The obvious solution/workaround is to throttle the compactions, but with 
 SSDs we can get more disk bandwidth for reads and compactions.
 It would be nice to move the compaction object allocations off heap. The first 
 thing to do might be to create an off-heap slab allocator sized to the 
 compaction's in-memory size, and recycle it.
 Also we might want to make it configurable so folks can disable it when they 
 don't have off-heap memory to reserve.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (CASSANDRA-6544) Reduce GC activity during compaction

2014-01-02 Thread Vijay (JIRA)
Vijay created CASSANDRA-6544:


 Summary: Reduce GC activity during compaction
 Key: CASSANDRA-6544
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6544
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Vijay
Assignee: Vijay
 Fix For: 2.1


We are noticing an increase in P99 while compactions are running at full
stream. Most of it is because of the increased GC activity (followed by full
GC).

The obvious solution/workaround is to throttle the compactions, but with SSDs
we can get more disk bandwidth for reads and compactions.

It would be nice to move the compaction object allocations off heap. The first
thing to do might be to create an off-heap slab allocator sized to the
compaction's in-memory size, and recycle it.

Also we might want to make it configurable so folks can disable it when they
don't have off-heap memory to reserve.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2013-12-04 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13838796#comment-13838796
 ] 

Vijay commented on CASSANDRA-5549:
--

Well, it is not exactly a constant overhead; you might want to look at
o.a.c.u.ObjectSizes (CASSANDRA-4860)...

 Remove Table.switchLock
 ---

 Key: CASSANDRA-5549
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
 Project: Cassandra
  Issue Type: Bug
Reporter: Jonathan Ellis
Assignee: Benedict
  Labels: performance
 Fix For: 2.1

 Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png


 As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
 path.  ReentrantReadWriteLock is not lightweight, even if there is no 
 contention per se between readers and writers of the lock (in Cassandra, 
 memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2013-12-03 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13838627#comment-13838627
 ] 

Vijay commented on CASSANDRA-5549:
--

{quote}
Without switch lock, we won't have anything preventing writes coming through 
when we're over-burdened with memory use by memtables.
{quote}
I must be missing something: how does switching the RW lock to a kind of CAS
operation change these semantics?
Are we talking about additional requirements/enhancements for this ticket?

{quote}
 When we flush a memtable we release permits equal to the estimated size of 
each RM
{quote}
IMHO, that might not be good enough, since Java's memory overhead is not
considered. And calculating the object size is not cheap either.
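
For concreteness, a sketch of the permit scheme being discussed (sizes and
names are illustrative): writers acquire permits for an estimated mutation
size, and a flush releases them, which is exactly where an inaccurate or
expensive size estimate would hurt.
{code}
import java.util.concurrent.Semaphore;

// Hypothetical memory back-pressure via permits: one permit per byte of
// estimated memtable space, released when a flush frees that space.
public class MemtablePermits
{
    private final Semaphore permits = new Semaphore(256 * 1024 * 1024); // 256 MB budget

    /** Blocks writes when memtables are over-burdened. */
    public void beforeWrite(int estimatedMutationSize) throws InterruptedException
    {
        permits.acquire(estimatedMutationSize);
    }

    /** Called after a flush, with the bytes the flushed memtable held. */
    public void afterFlush(int flushedBytes)
    {
        permits.release(flushedBytes);
    }
}
{code}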

 Remove Table.switchLock
 ---

 Key: CASSANDRA-5549
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
 Project: Cassandra
  Issue Type: Bug
Reporter: Jonathan Ellis
Assignee: Vijay
  Labels: performance
 Fix For: 2.1

 Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png


 As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
 path.  ReentrantReadWriteLock is not lightweight, even if there is no 
 contention per se between readers and writers of the lock (in Cassandra, 
 memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-5911) Commit logs are not removed after nodetool flush or nodetool drain

2013-11-10 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13818534#comment-13818534
 ] 

Vijay commented on CASSANDRA-5911:
--

Pushed the changes to https://github.com/Vijay2win/cassandra/commits/5911-v3

Looks like we only flush/switch when auto snapshot is enabled (when truncate is
called)... Fixed.
Added a test case verifying that truncate forces a commit log switch.
Force flush and other commands are still a best-effort switch.
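
The test described above could take roughly this shape (illustrative names,
not the committed test; currentSegmentName() is an assumed helper):
{code}
// Sketch: truncate should switch away from the active commit log segment.
@Test
public void testTruncateForcesCommitLogSwitch() throws Exception
{
    String before = currentSegmentName(); // assumed helper: active segment's file name
    cfs.truncateBlocking();               // truncate the column family
    String after = currentSegmentName();
    assert !before.equals(after) : "truncate did not force a commit log switch";
}
{code}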


 Commit logs are not removed after nodetool flush or nodetool drain
 --

 Key: CASSANDRA-5911
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5911
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: J.B. Langston
Assignee: Vijay
Priority: Minor
 Fix For: 2.0.3

 Attachments: 0001-5911-v2.patch, 0001-5911-v3.patch, 
 0001-CASSANDRA-5911.patch, 6528_140171_knwmuqxe9bjv5re_system.log


 Commit logs are not removed after nodetool flush or nodetool drain. This can 
 lead to unnecessary commit log replay during startup.  I've reproduced this 
 on Apache Cassandra 1.2.8.  Usually this isn't much of an issue but on a 
 Solr-indexed column family in DSE, each replayed mutation has to be reindexed 
 which can make startup take a long time (on the order of 20-30 min).
 Reproduction follows:
 {code}
 jblangston:bin jblangston$ ./cassandra > /dev/null
 jblangston:bin jblangston$ ../tools/bin/cassandra-stress -n 2000 > /dev/null
 jblangston:bin jblangston$ du -h ../commitlog
 576M  ../commitlog
 jblangston:bin jblangston$ nodetool flush
 jblangston:bin jblangston$ du -h ../commitlog
 576M  ../commitlog
 jblangston:bin jblangston$ nodetool drain
 jblangston:bin jblangston$ du -h ../commitlog
 576M  ../commitlog
 jblangston:bin jblangston$ pkill java
 jblangston:bin jblangston$ du -h ../commitlog
 576M  ../commitlog
 jblangston:bin jblangston$ ./cassandra -f | grep Replaying
  INFO 10:03:42,915 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566761.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566762.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566763.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566764.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566765.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566766.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566767.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566768.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566769.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566770.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566771.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566772.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566773.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566774.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566775.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566776.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566777.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566778.log
  INFO 10:03:42,922 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566761.log
  INFO 10:03:43,907 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566762.log
  INFO 10:03:43,907 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566763.log
  INFO 10:03:43,907 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566764.log
  INFO 10:03:43,908 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566765.log
  INFO 10:03:43,908 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566766.log
  INFO 10:03:43,908 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566767.log
  INFO 10:03:43,909 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566768.log
  INFO 10:03:43,909 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566769.log
  INFO 10:03:43,909 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566770.log
  INFO 10:03:43,910 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566771.log
  INFO 10:03:43,910 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566772.log
  INFO 10:03:43,911 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566773.log
  INFO 10:03:43,911 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566774.log
  INFO 10:03:43,911 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566775.log
  INFO 10:03:43,912 Replaying 
 

[jira] [Commented] (CASSANDRA-6206) Thrift socket listen backlog

2013-11-10 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13818558#comment-13818558
 ] 

Vijay commented on CASSANDRA-6206:
--

Committed to trunk, but only partially. Talked to [~xedin]; once we have the
setting in HSHA we can resolve this ticket.
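
For reference, the knob in question is the listen backlog on the server
socket; a minimal sketch with plain java.net (port and backlog values are
illustrative):
{code}
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Sketch: the second argument to bind() is the listen backlog, i.e. how many
// pending connections the kernel queues before refusing new ones.
ServerSocket server = new ServerSocket();
server.bind(new InetSocketAddress("0.0.0.0", 9160), 1024); // backlog of 1024
{code}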

 Thrift socket listen backlog
 

 Key: CASSANDRA-6206
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6206
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Debian Linux, Java 7
Reporter: Nenad Merdanovic
 Fix For: 2.0.3

 Attachments: cassandra-v2.patch, cassandra.patch


 Although Thrift is a deprecated method of accessing Cassandra, the default 
 backlog is way too low on that socket. It shouldn't be a problem to implement, 
 and I am including a POC patch for this (sorry, really low on time with 
 limited Java knowledge, so just to give an idea).
 This is an old report which was never addressed and the bug remains to this 
 day, except in my case I have a much larger-scale application with 3rd-party 
 software which I cannot modify to include connection pooling:
 https://issues.apache.org/jira/browse/CASSANDRA-1663
 There is also a pending change in the Thrift itself which Cassandra should be 
 able to use for parts using TServerSocket (SSL):
 https://issues.apache.org/jira/browse/THRIFT-1868



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-3578) Multithreaded commitlog

2013-11-10 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13818685#comment-13818685
 ] 

Vijay commented on CASSANDRA-3578:
--

https://github.com/Vijay2win/cassandra/commits/3578-v2 addresses most of the
concerns here.

The only thing we have discussed here that has not been addressed yet is the
aggressive allocation and deallocation of commit logs, but I'm not sure it's
needed yet...

 Multithreaded commitlog
 ---

 Key: CASSANDRA-3578
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3578
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Vijay
Priority: Minor
  Labels: performance
 Attachments: 0001-CASSANDRA-3578.patch, ComitlogStress.java, 
 Current-CL.png, Multi-Threded-CL.png, parallel_commit_log_2.patch


 Brian Aker pointed out a while ago that allowing multiple threads to modify 
 the commitlog simultaneously (reserving space for each with a CAS first, the 
 way we do in the SlabAllocator.Region.allocate) can improve performance, 
 since you're not bottlenecking on a single thread to do all the copying and 
 CRC computation.
 Now that we use mmap'd CommitLog segments (CASSANDRA-3411) this becomes 
 doable.
 (moved from CASSANDRA-622, which was getting a bit muddled.)



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Comment Edited] (CASSANDRA-3578) Multithreaded commitlog

2013-11-07 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13816281#comment-13816281
 ] 

Vijay edited comment on CASSANDRA-3578 at 11/7/13 7:04 PM:
---

{quote}
You only call force when you think there is something dirty, not when the 
buffer does, 
{quote}
Aha, that might be an oversight; we can call buffer.force all the time and
let the OS decide if it has to sync the filesystem. If we do that then we just
need to stop during the recovery when we have corrupted columns (which are
there because the OS or the force didn't complete the fsync).

{quote}
How ugly would it get to either wait for previous (in CL order) mutations 
before syncing
{quote}
We can do that with another counter which holds the bytes written by all the
threads, comparing it with the allocated count. We don't need a lock in that case.
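
A sketch of that allocated-vs-written scheme (illustrative, assuming
per-segment counters; not the patch itself): writers reserve space with a CAS,
account for completed writes, and the syncer forces only when the two counters
agree.
{code}
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical per-segment counters: 'allocated' grows at reservation time,
// 'written' grows when a writer finishes copying, and force() is safe once
// every reserved byte has actually been written.
public class SegmentCounters
{
    private final AtomicInteger allocated = new AtomicInteger(0);
    private final AtomicInteger written = new AtomicInteger(0);

    /** Reserve 'size' bytes with a CAS; returns the offset, or -1 if full. */
    public int allocate(int size, int capacity)
    {
        while (true)
        {
            int cur = allocated.get();
            if (cur + size > capacity)
                return -1; // caller switches to the next segment
            if (allocated.compareAndSet(cur, cur + size))
                return cur;
        }
    }

    /** Called by a writer after its mutation is fully copied in. */
    public void markWritten(int size)
    {
        written.addAndGet(size);
    }

    /** True when no reserved write is still in flight. */
    public boolean quiesced()
    {
        return written.get() == allocated.get();
    }
}
{code}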


was (Author: vijay2...@yahoo.com):
{quote}
You only call force when you think there is something dirty, not when the 
buffer does, 
{quote}
Aha, that might be an oversight; we can call buffer.force all the time and
let the OS decide if it has to sync the filesystem. If we do that then we just
need to stop when we have corrupted columns (which are there because the OS or
the force didn't complete the fsync).

{quote}
How ugly would it get to either wait for previous (in CL order) mutations 
before syncing
{quote}
We can do that with another counter which holds the bytes written by all the
threads, comparing it with the allocated count. We don't need a lock in that case.

 Multithreaded commitlog
 ---

 Key: CASSANDRA-3578
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3578
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Vijay
Priority: Minor
  Labels: performance
 Attachments: 0001-CASSANDRA-3578.patch, ComitlogStress.java, 
 Current-CL.png, Multi-Threded-CL.png, parallel_commit_log_2.patch


 Brian Aker pointed out a while ago that allowing multiple threads to modify 
 the commitlog simultaneously (reserving space for each with a CAS first, the 
 way we do in the SlabAllocator.Region.allocate) can improve performance, 
 since you're not bottlenecking on a single thread to do all the copying and 
 CRC computation.
 Now that we use mmap'd CommitLog segments (CASSANDRA-3411) this becomes 
 doable.
 (moved from CASSANDRA-622, which was getting a bit muddled.)



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-3578) Multithreaded commitlog

2013-11-07 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13816281#comment-13816281
 ] 

Vijay commented on CASSANDRA-3578:
--

{quote}
You only call force when you think there is something dirty, not when the 
buffer does, 
{quote}
Aha, that might be an oversight; we can call buffer.force all the time and
let the OS decide if it has to sync the filesystem. If we do that then we just
need to stop when we have corrupted columns (which are there because the OS or
the force didn't complete the fsync).

{quote}
How ugly would it get to either wait for previous (in CL order) mutations 
before syncing
{quote}
We can do that with another counter which holds the bytes written by all the
threads, comparing it with the allocated count. We don't need a lock in that case.

 Multithreaded commitlog
 ---

 Key: CASSANDRA-3578
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3578
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Vijay
Priority: Minor
  Labels: performance
 Attachments: 0001-CASSANDRA-3578.patch, ComitlogStress.java, 
 Current-CL.png, Multi-Threded-CL.png, parallel_commit_log_2.patch


 Brian Aker pointed out a while ago that allowing multiple threads to modify 
 the commitlog simultaneously (reserving space for each with a CAS first, the 
 way we do in the SlabAllocator.Region.allocate) can improve performance, 
 since you're not bottlenecking on a single thread to do all the copying and 
 CRC computation.
 Now that we use mmap'd CommitLog segments (CASSANDRA-3411) this becomes 
 doable.
 (moved from CASSANDRA-622, which was getting a bit muddled.)



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Comment Edited] (CASSANDRA-3578) Multithreaded commitlog

2013-11-07 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13816281#comment-13816281
 ] 

Vijay edited comment on CASSANDRA-3578 at 11/7/13 7:05 PM:
---

{quote}
You only call force when you think there is something dirty, not when the 
buffer does, 
{quote}
Aha, that might be an oversight; we can call buffer.force all the time and
let the OS decide if it has to sync the filesystem. If we do that then we just
need to stop during the recovery/replay when we see corrupted columns (which
are there because the OS or the force didn't complete the fsync).

{quote}
How ugly would it get to either wait for previous (in CL order) mutations 
before syncing
{quote}
We can do that with another counter which holds the bytes written by all the
threads, comparing it with the allocated count. We don't need a lock in that case.


was (Author: vijay2...@yahoo.com):
{quote}
You only call force when you think there is something dirty, not when the 
buffer does, 
{quote}
Aha, that might be an oversight; we can call buffer.force all the time and
let the OS decide if it has to sync the filesystem. If we do that then we just
need to stop during the recovery when we have corrupted columns (which are
there because the OS or the force didn't complete the fsync).

{quote}
How ugly would it get to either wait for previous (in CL order) mutations 
before syncing
{quote}
We can do that with another counter which holds the bytes written by all the
threads, comparing it with the allocated count. We don't need a lock in that case.

 Multithreaded commitlog
 ---

 Key: CASSANDRA-3578
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3578
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Vijay
Priority: Minor
  Labels: performance
 Attachments: 0001-CASSANDRA-3578.patch, ComitlogStress.java, 
 Current-CL.png, Multi-Threded-CL.png, parallel_commit_log_2.patch


 Brian Aker pointed out a while ago that allowing multiple threads to modify 
 the commitlog simultaneously (reserving space for each with a CAS first, the 
 way we do in the SlabAllocator.Region.allocate) can improve performance, 
 since you're not bottlenecking on a single thread to do all the copying and 
 CRC computation.
 Now that we use mmap'd CommitLog segments (CASSANDRA-3411) this becomes 
 doable.
 (moved from CASSANDRA-622, which was getting a bit muddled.)



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-3578) Multithreaded commitlog

2013-11-07 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13816344#comment-13816344
 ] 

Vijay commented on CASSANDRA-3578:
--

{quote}
we could have A allocate, B begin sync, C allocate, C write, B see counters 
equal
{quote}
I am talking about counting everything allocated and written within a segment,
which is (A + B + C) != (A + B) (meaning C or someone else is still writing).

{quote}
we didn't know there were unfinished writes behind us
{quote}
That's fine, we will skip those; that's what the current implementation does
too. If you are writing in a sequence and the server stops... the commits which
were in the queue are not written. We are just moving that queue to the buffer.
Practically this is less of a concern because it is only a few nanos out of
sync.

Anyway, I should stop selling :)

 Multithreaded commitlog
 ---

 Key: CASSANDRA-3578
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3578
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Vijay
Priority: Minor
  Labels: performance
 Attachments: 0001-CASSANDRA-3578.patch, ComitlogStress.java, 
 Current-CL.png, Multi-Threded-CL.png, parallel_commit_log_2.patch


 Brian Aker pointed out a while ago that allowing multiple threads to modify 
 the commitlog simultaneously (reserving space for each with a CAS first, the 
 way we do in the SlabAllocator.Region.allocate) can improve performance, 
 since you're not bottlenecking on a single thread to do all the copying and 
 CRC computation.
 Now that we use mmap'd CommitLog segments (CASSANDRA-3411) this becomes 
 doable.
 (moved from CASSANDRA-622, which was getting a bit muddled.)



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-3578) Multithreaded commitlog

2013-11-07 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13816935#comment-13816935
 ] 

Vijay commented on CASSANDRA-3578:
--

:)

{quote}
The current implementation never ACKs a message it cannot later replay under 
batch
{quote}

We don't guarantee that in PeriodicCommitLogExecutorService; all this time I
was trying to optimize for the general case (PeriodicCommitLogExecutorService).
BatchCommitLogExecutorService in my patch
(https://github.com/Vijay2win/cassandra/commit/0d982e840145d466b8bcbc863d6218b24b0842ad#diff-05c1e4fd86fea19b8e0552b1f289be85L191)
ACKs only after we write (we wait for the sync after that write), and hence the
write and sync of that particular mutation happen before acking back.

 Multithreaded commitlog
 ---

 Key: CASSANDRA-3578
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3578
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Vijay
Priority: Minor
  Labels: performance
 Attachments: 0001-CASSANDRA-3578.patch, ComitlogStress.java, 
 Current-CL.png, Multi-Threded-CL.png, parallel_commit_log_2.patch


 Brian Aker pointed out a while ago that allowing multiple threads to modify 
 the commitlog simultaneously (reserving space for each with a CAS first, the 
 way we do in the SlabAllocator.Region.allocate) can improve performance, 
 since you're not bottlenecking on a single thread to do all the copying and 
 CRC computation.
 Now that we use mmap'd CommitLog segments (CASSANDRA-3411) this becomes 
 doable.
 (moved from CASSANDRA-622, which was getting a bit muddled.)



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-3578) Multithreaded commitlog

2013-11-06 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13815618#comment-13815618
 ] 

Vijay commented on CASSANDRA-3578:
--

Hi Benedict, archiver.maybeArchive(segment.getPath(), segment.getName()) is a
blocking call and will need a separate thread, since it might involve
user-defined archival.

{quote}
sync() would mark things as flushed to disk that weren't, which would result in
log messages never being persisted
{quote}
My understanding is that calling force will sync the dirty pages, and if we do
concurrent writes to the same page they will be marked as dirty and will be
synced in the next call; how will we lose the log messages?

I still like the original approach :) of creating files (it may be just me)
because of its simplicity, and we can be aggressive in allocator threads,
similar to your patch (creating empty files and deleting them).
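
As a small illustration of the force() semantics being debated (file name and
sizes are illustrative): pages dirtied concurrently with a force() are simply
re-dirtied and flushed by the next call, rather than lost.
{code}
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Sketch: force() flushes the currently dirty pages of the mapped segment;
// a concurrent write re-dirties its page and is picked up by the next force().
RandomAccessFile raf = new RandomAccessFile("commitlog-segment.log", "rw");
MappedByteBuffer buffer = raf.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, 32 * 1024 * 1024);
buffer.putLong(0, 42L); // a writer dirties a page...
buffer.force();         // ...and the syncer flushes whatever is dirty
{code}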


 Multithreaded commitlog
 ---

 Key: CASSANDRA-3578
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3578
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Vijay
Priority: Minor
  Labels: performance
 Attachments: 0001-CASSANDRA-3578.patch, ComitlogStress.java, 
 Current-CL.png, Multi-Threded-CL.png, parallel_commit_log_2.patch


 Brian Aker pointed out a while ago that allowing multiple threads to modify 
 the commitlog simultaneously (reserving space for each with a CAS first, the 
 way we do in the SlabAllocator.Region.allocate) can improve performance, 
 since you're not bottlenecking on a single thread to do all the copying and 
 CRC computation.
 Now that we use mmap'd CommitLog segments (CASSANDRA-3411) this becomes 
 doable.
 (moved from CASSANDRA-622, which was getting a bit muddled.)



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (CASSANDRA-5911) Commit logs are not removed after nodetool flush or nodetool drain

2013-11-04 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-5911:
-

Attachment: 0001-5911-v3.patch

v3 fixes the activateNextArchiveSegment issue in v2. I had to modify the test
cases to avoid initialization of the commit log before the recover method is
called. Hope that's OK.

 Commit logs are not removed after nodetool flush or nodetool drain
 --

 Key: CASSANDRA-5911
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5911
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: J.B. Langston
Assignee: Vijay
Priority: Minor
 Fix For: 2.0.3

 Attachments: 0001-5911-v2.patch, 0001-5911-v3.patch, 
 0001-CASSANDRA-5911.patch, 6528_140171_knwmuqxe9bjv5re_system.log


 Commit logs are not removed after nodetool flush or nodetool drain. This can 
 lead to unnecessary commit log replay during startup.  I've reproduced this 
 on Apache Cassandra 1.2.8.  Usually this isn't much of an issue but on a 
 Solr-indexed column family in DSE, each replayed mutation has to be reindexed 
 which can make startup take a long time (on the order of 20-30 min).
 Reproduction follows:
 {code}
 jblangston:bin jblangston$ ./cassandra > /dev/null
 jblangston:bin jblangston$ ../tools/bin/cassandra-stress -n 2000 > /dev/null
 jblangston:bin jblangston$ du -h ../commitlog
 576M  ../commitlog
 jblangston:bin jblangston$ nodetool flush
 jblangston:bin jblangston$ du -h ../commitlog
 576M  ../commitlog
 jblangston:bin jblangston$ nodetool drain
 jblangston:bin jblangston$ du -h ../commitlog
 576M  ../commitlog
 jblangston:bin jblangston$ pkill java
 jblangston:bin jblangston$ du -h ../commitlog
 576M  ../commitlog
 jblangston:bin jblangston$ ./cassandra -f | grep Replaying
  INFO 10:03:42,915 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566761.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566762.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566763.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566764.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566765.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566766.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566767.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566768.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566769.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566770.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566771.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566772.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566773.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566774.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566775.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566776.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566777.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566778.log
  INFO 10:03:42,922 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566761.log
  INFO 10:03:43,907 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566762.log
  INFO 10:03:43,907 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566763.log
  INFO 10:03:43,907 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566764.log
  INFO 10:03:43,908 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566765.log
  INFO 10:03:43,908 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566766.log
  INFO 10:03:43,908 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566767.log
  INFO 10:03:43,909 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566768.log
  INFO 10:03:43,909 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566769.log
  INFO 10:03:43,909 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566770.log
  INFO 10:03:43,910 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566771.log
  INFO 10:03:43,910 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566772.log
  INFO 10:03:43,911 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566773.log
  INFO 10:03:43,911 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566774.log
  INFO 10:03:43,911 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566775.log
  INFO 10:03:43,912 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566776.log
  INFO 10:03:43,912 Replaying 
 

[jira] [Commented] (CASSANDRA-5911) Commit logs are not removed after nodetool flush or nodetool drain

2013-10-31 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13811010#comment-13811010
 ] 

Vijay commented on CASSANDRA-5911:
--

Hi Jonathan, that was just an oversight... I missed that recover() sets that 
flag. Let me add another flag for unit tests.

 Commit logs are not removed after nodetool flush or nodetool drain
 --

 Key: CASSANDRA-5911
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5911
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: J.B. Langston
Assignee: Vijay
Priority: Minor
 Fix For: 2.0.3

 Attachments: 0001-5911-v2.patch, 0001-CASSANDRA-5911.patch, 
 6528_140171_knwmuqxe9bjv5re_system.log


 Commit logs are not removed after nodetool flush or nodetool drain. This can 
 lead to unnecessary commit log replay during startup.  I've reproduced this 
 on Apache Cassandra 1.2.8.  Usually this isn't much of an issue but on a 
 Solr-indexed column family in DSE, each replayed mutation has to be reindexed 
 which can make startup take a long time (on the order of 20-30 min).
 Reproduction follows:
 {code}
 jblangston:bin jblangston$ ./cassandra > /dev/null
 jblangston:bin jblangston$ ../tools/bin/cassandra-stress -n 2000 > /dev/null
 jblangston:bin jblangston$ du -h ../commitlog
 576M  ../commitlog
 jblangston:bin jblangston$ nodetool flush
 jblangston:bin jblangston$ du -h ../commitlog
 576M  ../commitlog
 jblangston:bin jblangston$ nodetool drain
 jblangston:bin jblangston$ du -h ../commitlog
 576M  ../commitlog
 jblangston:bin jblangston$ pkill java
 jblangston:bin jblangston$ du -h ../commitlog
 576M  ../commitlog
 jblangston:bin jblangston$ ./cassandra -f | grep Replaying
  INFO 10:03:42,915 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566761.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566762.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566763.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566764.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566765.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566766.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566767.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566768.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566769.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566770.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566771.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566772.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566773.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566774.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566775.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566776.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566777.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566778.log
  INFO 10:03:42,922 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566761.log
  INFO 10:03:43,907 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566762.log
  INFO 10:03:43,907 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566763.log
  INFO 10:03:43,907 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566764.log
  INFO 10:03:43,908 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566765.log
  INFO 10:03:43,908 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566766.log
  INFO 10:03:43,908 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566767.log
  INFO 10:03:43,909 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566768.log
  INFO 10:03:43,909 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566769.log
  INFO 10:03:43,909 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566770.log
  INFO 10:03:43,910 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566771.log
  INFO 10:03:43,910 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566772.log
  INFO 10:03:43,911 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566773.log
  INFO 10:03:43,911 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566774.log
  INFO 10:03:43,911 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566775.log
  INFO 10:03:43,912 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566776.log
  INFO 10:03:43,912 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566777.log
  INFO 10:03:43,912 Replaying 
 

[jira] [Updated] (CASSANDRA-5911) Commit logs are not removed after nodetool flush or nodetool drain

2013-10-28 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-5911:
-

Attachment: 0001-5911-v2.patch

The attached patch adds the warn message and fixes the test to do a blocking 
wait till the new segment arrives. It also adds a little more logic to make 
sure we really need to switch...

{code}
if (!activeSegment.isUnused() && activeSegment.id == context.segment)
{
    if (allocator.numSegmentsAvailable() > 0 || allocator.createReserveSegments)
        activateNextArchiveSegment();
    else
        logger.warn("no active commitlog to switch, additional mutations might be replayed if the node is restarted immediatly. See: CASSANDRA-5911");
}
{code}

 Commit logs are not removed after nodetool flush or nodetool drain
 --

 Key: CASSANDRA-5911
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5911
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: J.B. Langston
Assignee: Vijay
Priority: Minor
 Fix For: 2.0.3

 Attachments: 0001-5911-v2.patch, 0001-CASSANDRA-5911.patch, 
 6528_140171_knwmuqxe9bjv5re_system.log


 Commit logs are not removed after nodetool flush or nodetool drain. This can 
 lead to unnecessary commit log replay during startup.  I've reproduced this 
 on Apache Cassandra 1.2.8.  Usually this isn't much of an issue but on a 
 Solr-indexed column family in DSE, each replayed mutation has to be reindexed 
 which can make startup take a long time (on the order of 20-30 min).
 Reproduction follows:
 {code}
 jblangston:bin jblangston$ ./cassandra > /dev/null
 jblangston:bin jblangston$ ../tools/bin/cassandra-stress -n 2000 > /dev/null
 jblangston:bin jblangston$ du -h ../commitlog
 576M  ../commitlog
 jblangston:bin jblangston$ nodetool flush
 jblangston:bin jblangston$ du -h ../commitlog
 576M  ../commitlog
 jblangston:bin jblangston$ nodetool drain
 jblangston:bin jblangston$ du -h ../commitlog
 576M  ../commitlog
 jblangston:bin jblangston$ pkill java
 jblangston:bin jblangston$ du -h ../commitlog
 576M  ../commitlog
 jblangston:bin jblangston$ ./cassandra -f | grep Replaying
  INFO 10:03:42,915 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566761.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566762.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566763.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566764.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566765.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566766.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566767.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566768.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566769.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566770.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566771.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566772.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566773.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566774.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566775.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566776.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566777.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566778.log
  INFO 10:03:42,922 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566761.log
  INFO 10:03:43,907 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566762.log
  INFO 10:03:43,907 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566763.log
  INFO 10:03:43,907 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566764.log
  INFO 10:03:43,908 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566765.log
  INFO 10:03:43,908 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566766.log
  INFO 10:03:43,908 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566767.log
  INFO 10:03:43,909 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566768.log
  INFO 10:03:43,909 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566769.log
  INFO 10:03:43,909 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566770.log
  INFO 10:03:43,910 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566771.log
  INFO 10:03:43,910 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566772.log
  INFO 10:03:43,911 Replaying 
 

[jira] [Commented] (CASSANDRA-5911) Commit logs are not removed after nodetool flush or nodetool drain

2013-10-28 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13807262#comment-13807262
 ] 

Vijay commented on CASSANDRA-5911:
--

[~rcoli] I think the issue was there even before 1.0.

 Commit logs are not removed after nodetool flush or nodetool drain
 --

 Key: CASSANDRA-5911
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5911
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: J.B. Langston
Assignee: Vijay
Priority: Minor
 Fix For: 2.0.3

 Attachments: 0001-5911-v2.patch, 0001-CASSANDRA-5911.patch, 
 6528_140171_knwmuqxe9bjv5re_system.log


 Commit logs are not removed after nodetool flush or nodetool drain. This can 
 lead to unnecessary commit log replay during startup.  I've reproduced this 
 on Apache Cassandra 1.2.8.  Usually this isn't much of an issue but on a 
 Solr-indexed column family in DSE, each replayed mutation has to be reindexed 
 which can make startup take a long time (on the order of 20-30 min).
 Reproduction follows:
 {code}
 jblangston:bin jblangston$ ./cassandra > /dev/null
 jblangston:bin jblangston$ ../tools/bin/cassandra-stress -n 2000 > /dev/null
 jblangston:bin jblangston$ du -h ../commitlog
 576M  ../commitlog
 jblangston:bin jblangston$ nodetool flush
 jblangston:bin jblangston$ du -h ../commitlog
 576M  ../commitlog
 jblangston:bin jblangston$ nodetool drain
 jblangston:bin jblangston$ du -h ../commitlog
 576M  ../commitlog
 jblangston:bin jblangston$ pkill java
 jblangston:bin jblangston$ du -h ../commitlog
 576M  ../commitlog
 jblangston:bin jblangston$ ./cassandra -f | grep Replaying
  INFO 10:03:42,915 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566761.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566762.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566763.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566764.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566765.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566766.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566767.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566768.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566769.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566770.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566771.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566772.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566773.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566774.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566775.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566776.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566777.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566778.log
  INFO 10:03:42,922 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566761.log
  INFO 10:03:43,907 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566762.log
  INFO 10:03:43,907 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566763.log
  INFO 10:03:43,907 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566764.log
  INFO 10:03:43,908 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566765.log
  INFO 10:03:43,908 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566766.log
  INFO 10:03:43,908 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566767.log
  INFO 10:03:43,909 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566768.log
  INFO 10:03:43,909 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566769.log
  INFO 10:03:43,909 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566770.log
  INFO 10:03:43,910 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566771.log
  INFO 10:03:43,910 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566772.log
  INFO 10:03:43,911 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566773.log
  INFO 10:03:43,911 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566774.log
  INFO 10:03:43,911 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566775.log
  INFO 10:03:43,912 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566776.log
  INFO 10:03:43,912 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566777.log
  INFO 10:03:43,912 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566778.log
 {code}



--
This 

[jira] [Comment Edited] (CASSANDRA-5911) Commit logs are not removed after nodetool flush or nodetool drain

2013-10-28 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13807257#comment-13807257
 ] 

Vijay edited comment on CASSANDRA-5911 at 10/28/13 9:36 PM:


The attached patch adds the warn message and also makes the unit tests do a 
blocking wait till the new segment arrives. It also adds a little more logic 
to make sure we really need to switch...

{code}
if (!activeSegment.isUnused() && activeSegment.id == context.segment)
{
    if (allocator.numSegmentsAvailable() > 0 || allocator.createReserveSegments)
        activateNextArchiveSegment();
    else
        logger.warn("no active commitlog to switch, additional mutations might be replayed if the node is restarted immediatly. See: CASSANDRA-5911");
}
{code}


was (Author: vijay2...@yahoo.com):
The attached patch adds the warn message and fixes the test to do a blocking 
wait till the new segment arrives. It also adds a little more logic to make 
sure we really need to switch...

{code}
if (!activeSegment.isUnused() && activeSegment.id == context.segment)
{
    if (allocator.numSegmentsAvailable() > 0 || allocator.createReserveSegments)
        activateNextArchiveSegment();
    else
        logger.warn("no active commitlog to switch, additional mutations might be replayed if the node is restarted immediatly. See: CASSANDRA-5911");
}
{code}

 Commit logs are not removed after nodetool flush or nodetool drain
 --

 Key: CASSANDRA-5911
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5911
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: J.B. Langston
Assignee: Vijay
Priority: Minor
 Fix For: 2.0.3

 Attachments: 0001-5911-v2.patch, 0001-CASSANDRA-5911.patch, 
 6528_140171_knwmuqxe9bjv5re_system.log


 Commit logs are not removed after nodetool flush or nodetool drain. This can 
 lead to unnecessary commit log replay during startup.  I've reproduced this 
 on Apache Cassandra 1.2.8.  Usually this isn't much of an issue but on a 
 Solr-indexed column family in DSE, each replayed mutation has to be reindexed 
 which can make startup take a long time (on the order of 20-30 min).
 Reproduction follows:
 {code}
 jblangston:bin jblangston$ ./cassandra > /dev/null
 jblangston:bin jblangston$ ../tools/bin/cassandra-stress -n 2000 > /dev/null
 jblangston:bin jblangston$ du -h ../commitlog
 576M  ../commitlog
 jblangston:bin jblangston$ nodetool flush
 jblangston:bin jblangston$ du -h ../commitlog
 576M  ../commitlog
 jblangston:bin jblangston$ nodetool drain
 jblangston:bin jblangston$ du -h ../commitlog
 576M  ../commitlog
 jblangston:bin jblangston$ pkill java
 jblangston:bin jblangston$ du -h ../commitlog
 576M  ../commitlog
 jblangston:bin jblangston$ ./cassandra -f | grep Replaying
  INFO 10:03:42,915 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566761.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566762.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566763.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566764.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566765.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566766.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566767.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566768.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566769.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566770.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566771.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566772.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566773.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566774.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566775.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566776.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566777.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566778.log
  INFO 10:03:42,922 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566761.log
  INFO 10:03:43,907 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566762.log
  INFO 10:03:43,907 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566763.log
  INFO 10:03:43,907 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566764.log
  INFO 10:03:43,908 Replaying 
 

[jira] [Commented] (CASSANDRA-3578) Multithreaded commitlog

2013-10-27 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13806473#comment-13806473
 ] 

Vijay commented on CASSANDRA-3578:
--

Another option is to replace recycle with discard: we can always create new 
segments and never recycle (instead discard, in the sync thread)... then we 
can get rid of the header. We would still need to skip commit log recovery if 
a segment is corrupted or partially written...
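
Roughly the shape I mean (a sketch only, with made-up names, not from any patch):
{code}
import java.io.File;
import java.io.IOException;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Sketch only -- hypothetical names. The sync thread deletes fully-flushed
// segments outright and pre-allocates fresh ones, so a segment file is never
// reset/recycled and needs no header.
class DiscardingSegmentManager
{
    private final Queue<File> activeSegments = new ConcurrentLinkedQueue<File>();
    private final File directory;
    private long counter = 0; // only touched from the sync thread

    DiscardingSegmentManager(File directory)
    {
        this.directory = directory;
    }

    // Called from the sync thread after fsync.
    void discardCompletedSegments() throws IOException
    {
        for (File segment : activeSegments)
        {
            if (isFullyFlushed(segment))
            {
                activeSegments.remove(segment);
                segment.delete();                    // discard, don't recycle
                activeSegments.add(createSegment()); // always a brand-new file
            }
        }
    }

    private File createSegment() throws IOException
    {
        File f = new File(directory, "CommitLog-" + (counter++) + ".log");
        f.createNewFile();
        return f;
    }

    // Placeholder: the real check would consult the dirty column families.
    private boolean isFullyFlushed(File segment)
    {
        return false;
    }
}
{code}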

 Multithreaded commitlog
 ---

 Key: CASSANDRA-3578
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3578
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Vijay
Priority: Minor
  Labels: performance
 Attachments: 0001-CASSANDRA-3578.patch, ComitlogStress.java, 
 Current-CL.png, Multi-Threded-CL.png, parallel_commit_log_2.patch


 Brian Aker pointed out a while ago that allowing multiple threads to modify 
 the commitlog simultaneously (reserving space for each with a CAS first, the 
 way we do in the SlabAllocator.Region.allocate) can improve performance, 
 since you're not bottlenecking on a single thread to do all the copying and 
 CRC computation.
 Now that we use mmap'd CommitLog segments (CASSANDRA-3411) this becomes 
 doable.
 (moved from CASSANDRA-622, which was getting a bit muddled.)



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-3578) Multithreaded commitlog

2013-10-26 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13806193#comment-13806193
 ] 

Vijay commented on CASSANDRA-3578:
--

Hi Jonathan, 

{quote}
 I must be missing where this gets persisted back to disk
{quote}
It is the first 4 bytes at the beginning of the file; maybe we can get rid of 
it and stop replay when the size and checksum don't match?

But the header is pretty light, and only needs one additional seek every 10 
seconds (it just records the end of the file, at the beginning of the file, 
just before fsync).
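
To make that concrete, a minimal sketch of the header write (hypothetical names, not the actual patch):
{code}
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.util.zip.CRC32;

// Sketch only -- hypothetical names. Just before fsync, record the current
// end-of-data offset (plus a checksum of it) in the first 8 bytes of the
// segment; replay reads the header and stops at that offset.
class SegmentEndMarker
{
    static final int HEADER_SIZE = 8; // 4-byte end offset + 4-byte CRC

    static void markEndAndSync(MappedByteBuffer segment, int endOfData)
    {
        byte[] offsetBytes = ByteBuffer.allocate(4).putInt(endOfData).array();
        CRC32 crc = new CRC32();
        crc.update(offsetBytes, 0, offsetBytes.length);

        segment.putInt(0, endOfData);           // the "one additional seek"
        segment.putInt(4, (int) crc.getValue());
        segment.force();                        // fsync the mmapped segment
    }

    // Replay side: returns the trusted end offset, or -1 when the header
    // doesn't check out (partial write / corruption).
    static int readEnd(ByteBuffer segment)
    {
        int endOfData = segment.getInt(0);
        byte[] offsetBytes = ByteBuffer.allocate(4).putInt(endOfData).array();
        CRC32 crc = new CRC32();
        crc.update(offsetBytes, 0, offsetBytes.length);
        return (int) crc.getValue() == segment.getInt(4) ? endOfData : -1;
    }
}
{code}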

{quote}
 I think allocate needs to write the length to the segment before returning
{quote}
The first thing the thread does after allocation is write the size and its 
checksum. Are we talking about synchronization in the allocation, so that only 
one thread writes the size and the end marker (-1)? Currently the only atomic 
operation is on the AtomicLong (position).
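
To be concrete, the framing is roughly this shape right after the reservation (sketch only, hypothetical names):
{code}
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Sketch only -- hypothetical layout. Immediately after reserving
// [offset, offset + 8 + mutation.length) in the segment, the mutation
// thread writes the entry size, a checksum of the size, and then the
// payload. Replay stops as soon as a size/checksum pair doesn't match.
class EntryFraming
{
    static void writeEntry(ByteBuffer segment, int offset, byte[] mutation)
    {
        byte[] sizeBytes = ByteBuffer.allocate(4).putInt(mutation.length).array();
        CRC32 crc = new CRC32();
        crc.update(sizeBytes, 0, sizeBytes.length);

        ByteBuffer slot = segment.duplicate();  // private view per thread
        slot.position(offset);
        slot.putInt(mutation.length);           // size, written first
        slot.putInt((int) crc.getValue());      // checksum of the size
        slot.put(mutation);                     // then the actual payload
    }
}
{code}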

We might be able to do something similar to the current implementation, 
without headers, using a read-write lock: the write lock would ensure that we 
write the end marker (-1) just before fsync, with no one else overwriting it 
(though the OS can also write pages out before we force the buffers)... but 
that might not be desirable, since it could stall the system the way the 
current implementation does.

Not sure the header is that bad, though. Let me know what you think, 
thanks!

 Multithreaded commitlog
 ---

 Key: CASSANDRA-3578
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3578
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Vijay
Priority: Minor
  Labels: performance
 Attachments: 0001-CASSANDRA-3578.patch, ComitlogStress.java, 
 Current-CL.png, Multi-Threded-CL.png, parallel_commit_log_2.patch


 Brian Aker pointed out a while ago that allowing multiple threads to modify 
 the commitlog simultaneously (reserving space for each with a CAS first, the 
 way we do in the SlabAllocator.Region.allocate) can improve performance, 
 since you're not bottlenecking on a single thread to do all the copying and 
 CRC computation.
 Now that we use mmap'd CommitLog segments (CASSANDRA-3411) this becomes 
 doable.
 (moved from CASSANDRA-622, which was getting a bit muddled.)



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-3578) Multithreaded commitlog

2013-10-22 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13801536#comment-13801536
 ] 

Vijay commented on CASSANDRA-3578:
--

{quote}
It slows down the mutation thread by waiting for commitlog writing mutation is 
done
{quote}
Well, it depends on where you are bottlenecked; updating the mmap buffer is 
not that expensive, and it is usually CPU bound. On the other hand, it reduces 
the variability, as shown in the stress run.

 Multithreaded commitlog
 ---

 Key: CASSANDRA-3578
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3578
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Vijay
Priority: Minor
  Labels: performance
 Attachments: 0001-CASSANDRA-3578.patch, ComitlogStress.java, 
 Current-CL.png, Multi-Threded-CL.png, parallel_commit_log_2.patch


 Brian Aker pointed out a while ago that allowing multiple threads to modify 
 the commitlog simultaneously (reserving space for each with a CAS first, the 
 way we do in the SlabAllocator.Region.allocate) can improve performance, 
 since you're not bottlenecking on a single thread to do all the copying and 
 CRC computation.
 Now that we use mmap'd CommitLog segments (CASSANDRA-3411) this becomes 
 doable.
 (moved from CASSANDRA-622, which was getting a bit muddled.)



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (CASSANDRA-3578) Multithreaded commitlog

2013-10-21 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-3578:
-

Attachment: ComitlogStress.java

Micro-benchmark code attached; it tries to update the commit log as fast as 
possible (I chose a small mutation to avoid active-segment starvation; we are 
still creating ~1 commit log segment per second).

It was creating a commit log segment per second, so I am not sure this is a 
valid comparison to the real world at this time. But the good part is that the 
patch consumes less memory and has fewer swings.

http://pastebin.com/WeJ0QL8p
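
The loop is roughly this shape (a simplified sketch, not the attached ComitlogStress.java):
{code}
import java.util.concurrent.CountDownLatch;

// Simplified sketch of the benchmark shape, NOT the attached file.
// N threads append a small, fixed-size payload as fast as they can;
// throughput is total writes divided by elapsed time.
class CommitLogStressSketch
{
    static final int THREADS = 32;
    static final int WRITES_PER_THREAD = 1000000;
    static final byte[] SMALL_MUTATION = new byte[64]; // small, to avoid segment starvation

    public static void main(String[] args) throws InterruptedException
    {
        final CountDownLatch done = new CountDownLatch(THREADS);
        long start = System.nanoTime();
        for (int t = 0; t < THREADS; t++)
        {
            new Thread(new Runnable()
            {
                public void run()
                {
                    for (int i = 0; i < WRITES_PER_THREAD; i++)
                        append(SMALL_MUTATION); // stand-in for CommitLog.add(...)
                    done.countDown();
                }
            }).start();
        }
        done.await();
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.printf("%.0f writes/sec%n", THREADS * WRITES_PER_THREAD / seconds);
    }

    static void append(byte[] mutation)
    {
        // stand-in: the real benchmark calls into the commit log here
    }
}
{code}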

 Multithreaded commitlog
 ---

 Key: CASSANDRA-3578
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3578
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Vijay
Priority: Minor
  Labels: performance
 Attachments: 0001-CASSANDRA-3578.patch, ComitlogStress.java, 
 parallel_commit_log_2.patch


 Brian Aker pointed out a while ago that allowing multiple threads to modify 
 the commitlog simultaneously (reserving space for each with a CAS first, the 
 way we do in the SlabAllocator.Region.allocate) can improve performance, 
 since you're not bottlenecking on a single thread to do all the copying and 
 CRC computation.
 Now that we use mmap'd CommitLog segments (CASSANDRA-3411) this becomes 
 doable.
 (moved from CASSANDRA-622, which was getting a bit muddled.)



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (CASSANDRA-3578) Multithreaded commitlog

2013-10-21 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-3578:
-

Attachment: Multi-Threded-CL.png
Current-CL.png

Hi Jonathan, you can ignore those; I was experimenting with a few other 
things (like UUID.random locking, where the numbers were all bad, etc.) and 
hence added those metrics (didn't mean to confuse). But if you are interested 
in the GC profile, please see the attached.

 Multithreaded commitlog
 ---

 Key: CASSANDRA-3578
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3578
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Vijay
Priority: Minor
  Labels: performance
 Attachments: 0001-CASSANDRA-3578.patch, ComitlogStress.java, 
 Current-CL.png, Multi-Threded-CL.png, parallel_commit_log_2.patch


 Brian Aker pointed out a while ago that allowing multiple threads to modify 
 the commitlog simultaneously (reserving space for each with a CAS first, the 
 way we do in the SlabAllocator.Region.allocate) can improve performance, 
 since you're not bottlenecking on a single thread to do all the copying and 
 CRC computation.
 Now that we use mmap'd CommitLog segments (CASSANDRA-3411) this becomes 
 doable.
 (moved from CASSANDRA-622, which was getting a bit muddled.)



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-6206) Thrift socket listen backlog

2013-10-21 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13800956#comment-13800956
 ] 

Vijay commented on CASSANDRA-6206:
--

Nenad, can you change the default backlog config to be the Java default?
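
For reference, the listen backlog is just the second argument of the java.net.ServerSocket constructor, and a value <= 0 falls back to the JDK default (50); the port and address below are only examples:
{code}
import java.net.InetAddress;
import java.net.ServerSocket;

public class BacklogExample
{
    public static void main(String[] args) throws Exception
    {
        // backlog of 0 (or the two-argument constructor) means the JDK default
        ServerSocket socket = new ServerSocket(9160, 0, InetAddress.getByName("127.0.0.1"));
        System.out.println("listening on " + socket.getLocalSocketAddress());
        socket.close();
    }
}
{code}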

 Thrift socket listen backlog
 

 Key: CASSANDRA-6206
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6206
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Debian Linux, Java 7
Reporter: Nenad Merdanovic
 Fix For: 2.0.2

 Attachments: cassandra.patch


 Although Thrift is a deprecated method of accessing Cassandra, the default 
 backlog is way too low on that socket. It shouldn't be a problem to implement 
 it and I am including a POC patch for this (sorry, really low on time with 
 limited Java knowledge so just to give an idea).
 This is an old report which was never addressed and the bug remains till this 
 day, except in my case I have a much larger scale application with 3rd party 
 software which I cannot modify to include connection pooling:
 https://issues.apache.org/jira/browse/CASSANDRA-1663
 There is also a pending change in the Thrift itself which Cassandra should be 
 able to use for parts using TServerSocket (SSL):
 https://issues.apache.org/jira/browse/THRIFT-1868



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-6206) Thrift socket listen backlog

2013-10-21 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13800991#comment-13800991
 ] 

Vijay commented on CASSANDRA-6206:
--

Hi Nenad, Yep, thanks!

 Thrift socket listen backlog
 

 Key: CASSANDRA-6206
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6206
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Debian Linux, Java 7
Reporter: Nenad Merdanovic
 Fix For: 2.0.2

 Attachments: cassandra.patch


 Although Thrift is a deprecated method of accessing Cassandra, the default 
 backlog is way too low on that socket. It shouldn't be a problem to implement 
 it and I am including a POC patch for this (sorry, really low on time with 
 limited Java knowledge so just to give an idea).
 This is an old report which was never addressed and the bug remains till this 
 day, except in my case I have a much larger scale application with 3rd party 
 software which I cannot modify to include connection pooling:
 https://issues.apache.org/jira/browse/CASSANDRA-1663
 There is also a pending change in the Thrift itself which Cassandra should be 
 able to use for parts using TServerSocket (SSL):
 https://issues.apache.org/jira/browse/THRIFT-1868



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-3578) Multithreaded commitlog

2013-10-21 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13801163#comment-13801163
 ] 

Vijay commented on CASSANDRA-3578:
--

Yeah, we do CAS instead of queue.take() in http://goo.gl/JbNWM5 , but we do 
allocate new segments every second; not sure why the dip... will do more 
profiling on it.

 Multithreaded commitlog
 ---

 Key: CASSANDRA-3578
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3578
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Vijay
Priority: Minor
  Labels: performance
 Attachments: 0001-CASSANDRA-3578.patch, ComitlogStress.java, 
 Current-CL.png, Multi-Threded-CL.png, parallel_commit_log_2.patch


 Brian Aker pointed out a while ago that allowing multiple threads to modify 
 the commitlog simultaneously (reserving space for each with a CAS first, the 
 way we do in the SlabAllocator.Region.allocate) can improve performance, 
 since you're not bottlenecking on a single thread to do all the copying and 
 CRC computation.
 Now that we use mmap'd CommitLog segments (CASSANDRA-3411) this becomes 
 doable.
 (moved from CASSANDRA-622, which was getting a bit muddled.)



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-3578) Multithreaded commitlog

2013-10-21 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13801496#comment-13801496
 ] 

Vijay commented on CASSANDRA-3578:
--

Found the bottleneck in the current implementation!
It actually happens during buffer.force()... the CL.add queue is capped by 
commitlog_periodic_queue_size:
{code}
public int commitlog_periodic_queue_size = 1024 * 
FBUtilities.getAvailableProcessors();
{code}

Hence, till we flush() (called every 10 seconds), writes to the CL are 
blocked. Hope that makes sense...
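
A deliberately simplified sketch of the stall (not the real code path, just the shape of it):
{code}
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch of the stall, not the real code: writers put() into a bounded
// queue, but in periodic mode the entries are only drained around each
// sync, so once commitlog_periodic_queue_size entries pile up, every
// mutation thread blocks in put() until the next flush.
class PeriodicQueueStall
{
    static final BlockingQueue<byte[]> pending =
        new ArrayBlockingQueue<byte[]>(1024 * Runtime.getRuntime().availableProcessors());

    // mutation threads: blocks once the queue is full
    static void add(byte[] mutation) throws InterruptedException
    {
        pending.put(mutation);
    }

    // sync thread: drains and forces roughly every sync period
    static void syncLoop() throws InterruptedException
    {
        while (true)
        {
            Thread.sleep(10000); // commitlog_sync_period_in_ms
            for (byte[] m = pending.poll(); m != null; m = pending.poll())
            {
                // append m to the mmapped buffer ...
            }
            // ... buffer.force() here is where the time goes in the profile
        }
    }
}
{code}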

 Multithreaded commitlog
 ---

 Key: CASSANDRA-3578
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3578
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Vijay
Priority: Minor
  Labels: performance
 Attachments: 0001-CASSANDRA-3578.patch, ComitlogStress.java, 
 Current-CL.png, Multi-Threded-CL.png, parallel_commit_log_2.patch


 Brian Aker pointed out a while ago that allowing multiple threads to modify 
 the commitlog simultaneously (reserving space for each with a CAS first, the 
 way we do in the SlabAllocator.Region.allocate) can improve performance, 
 since you're not bottlenecking on a single thread to do all the copying and 
 CRC computation.
 Now that we use mmap'd CommitLog segments (CASSANDRA-3411) this becomes 
 doable.
 (moved from CASSANDRA-622, which was getting a bit muddled.)



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Comment Edited] (CASSANDRA-3578) Multithreaded commitlog

2013-10-21 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13801496#comment-13801496
 ] 

Vijay edited comment on CASSANDRA-3578 at 10/22/13 5:28 AM:


Found the bottleneck in the current implementation!
It actually happens during buffer.force()... the CL.add queue is capped by 
commitlog_periodic_queue_size:
{code}
public int commitlog_periodic_queue_size = 1024 * 
FBUtilities.getAvailableProcessors();
{code}

Hence, till we flush() (called every 10 seconds), writes to the CL are 
blocked. Hope that makes sense...


was (Author: vijay2...@yahoo.com):
Found the bottleneck in the current!
Actually this happens during buffer.force()... CL.add queue is capped by 
commitlog_periodic_queue_size
{code}
public int commitlog_periodic_queue_size = 1024 * 
FBUtilities.getAvailableProcessors();
{code}

Hence, till we flush() (is called every 10 seconds) the writes to the CL us 
blocked. Hope that makes sense...

 Multithreaded commitlog
 ---

 Key: CASSANDRA-3578
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3578
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Vijay
Priority: Minor
  Labels: performance
 Attachments: 0001-CASSANDRA-3578.patch, ComitlogStress.java, 
 Current-CL.png, Multi-Threded-CL.png, parallel_commit_log_2.patch


 Brian Aker pointed out a while ago that allowing multiple threads to modify 
 the commitlog simultaneously (reserving space for each with a CAS first, the 
 way we do in the SlabAllocator.Region.allocate) can improve performance, 
 since you're not bottlenecking on a single thread to do all the copying and 
 CRC computation.
 Now that we use mmap'd CommitLog segments (CASSANDRA-3411) this becomes 
 doable.
 (moved from CASSANDRA-622, which was getting a bit muddled.)



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-6206) Thrift socket listen backlog

2013-10-21 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13801518#comment-13801518
 ] 

Vijay commented on CASSANDRA-6206:
--

v2 doesn't apply cleanly (in the future, maybe use a git patch). I will also 
change SSLFactory.getServerSocket to use this configuration.

Ping [~xedin] for HSHA for THRIFT-1868.
Should we leave this ticket open till THRIFT-1868 gets resolved and/or till 
2.1 (which changes the yaml configuration)?

 Thrift socket listen backlog
 

 Key: CASSANDRA-6206
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6206
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Debian Linux, Java 7
Reporter: Nenad Merdanovic
 Fix For: 2.0.2

 Attachments: cassandra.patch, cassandra-v2.patch


 Although Thrift is a deprecated method of accessing Cassandra, the default 
 backlog is way too low on that socket. It shouldn't be a problem to implement 
 it and I am including a POC patch for this (sorry, really low on time with 
 limited Java knowledge so just to give an idea).
 This is an old report which was never addressed and the bug remains till this 
 day, except in my case I have a much larger scale application with 3rd party 
 software which I cannot modify to include connection pooling:
 https://issues.apache.org/jira/browse/CASSANDRA-1663
 There is also a pending change in the Thrift itself which Cassandra should be 
 able to use for parts using TServerSocket (SSL):
 https://issues.apache.org/jira/browse/THRIFT-1868



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Comment Edited] (CASSANDRA-6206) Thrift socket listen backlog

2013-10-21 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13801518#comment-13801518
 ] 

Vijay edited comment on CASSANDRA-6206 at 10/22/13 5:54 AM:


v2 doesn't apply cleanly (in the future, maybe use a git patch). I will also 
change SSLFactory.getServerSocket to use this configuration.

Ping [~xedin] for the HSHA change after THRIFT-1868.
Should we leave this ticket open till THRIFT-1868 gets resolved and/or till 
2.1 (which changes the yaml configuration)?


was (Author: vijay2...@yahoo.com):
v2 doesn't apply cleanly (in the future, maybe use a git patch). I will also 
change SSLFactory.getServerSocket to use this configuration.

Ping [~xedin] for HSHA for THRIFT-1868.
Should we leave this ticket open till THRIFT-1868 gets resolved and/or till 
2.1 (which changes the yaml configuration)?

 Thrift socket listen backlog
 

 Key: CASSANDRA-6206
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6206
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Debian Linux, Java 7
Reporter: Nenad Merdanovic
 Fix For: 2.0.2

 Attachments: cassandra.patch, cassandra-v2.patch


 Although Thrift is a deprecated method of accessing Cassandra, the default 
 backlog is way too low on that socket. It shouldn't be a problem to implement 
 it and I am including a POC patch for this (sorry, really low on time with 
 limited Java knowledge so just to give an idea).
 This is an old report which was never addressed and the bug remains till this 
 day, except in my case I have a much larger scale application with 3rd party 
 software which I cannot modify to include connection pooling:
 https://issues.apache.org/jira/browse/CASSANDRA-1663
 There is also a pending change in the Thrift itself which Cassandra should be 
 able to use for parts using TServerSocket (SSL):
 https://issues.apache.org/jira/browse/THRIFT-1868



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-6218) Reduce WAN traffic while doing repairs

2013-10-19 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13799818#comment-13799818
 ] 

Vijay commented on CASSANDRA-6218:
--

Won't it be simpler to just forward (similar to write forwarding) the 
difference of (A, B) and (C, D) to each other (after the initial repair), 
rather than initiating another repair between (A, B) and (C, D) in step 3?

Another possible option: 
Consider (DC1: A, B, C and DC2: X, Y, Z).
Start the Merkle tree comparison between all the nodes; once the differences 
are identified, stream within the DC and then across the DC using a picked 
proxy or forwarder node: (A, B, C to X) and then (X, Y, Z to A). Now both DCs 
have all the inconsistent data, hence they can stream the ranges which were 
identified as inconsistent.

 Reduce WAN traffic while doing repairs
 --

 Key: CASSANDRA-6218
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6218
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: sankalp kohli
Priority: Minor

 The way we send out data that does not match over WAN can be improved. 
 Example: Say there are four nodes (A, B, C, D) which are replicas of a range we 
 are repairing. A and B are in DC1 and C and D are in DC2. If A does not have the data 
 which other replicas have, then we will have the following streams:
 1) A to B and back
 2) A to C and back(Goes over WAN)
 3) A to D and back(Goes over WAN)
 One of the ways of reducing WAN traffic is this.
 1) Repair A and B only with each other and C and D with each other starting 
 at same time t. 
 2) Once these repairs have finished, A,B and C,D are in sync with respect to 
 time t. 
 3) Now run a repair between A and C, the streams which are exchanged as a 
 result of the diff will also be streamed to B and D via A and C(C and D 
 behaves like a proxy to the streams).
 For a replication of DC1:2,DC2:2, the WAN traffic will get reduced by 50% and 
 even more for higher replication factors. 
  



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Comment Edited] (CASSANDRA-6218) Reduce WAN traffic while doing repairs

2013-10-19 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13799818#comment-13799818
 ] 

Vijay edited comment on CASSANDRA-6218 at 10/19/13 7:04 AM:


Won't it be simpler to just forward (similar to write forwarding) the 
difference of (A, B) and (C, D) to each other (after the initial repair), 
rather than initiating another repair between (A, B) and (C, D) in step 3?

Another possible option: 
Consider (DC1: A, B, C and DC2: X, Y, Z).
Start the Merkle tree comparison between all the nodes; once the differences 
are identified, stream within the DC and then across the DC using a picked 
proxy or forwarder node: (A, B, C to X) and then (X, Y, Z to A). Now both DCs 
have consistent data, hence the proxy/forwarder can stream the ranges which 
were identified as inconsistent in the Merkle comparison.


was (Author: vijay2...@yahoo.com):
Won't it be simpler to just forward (similar to write forwarding) the 
difference of (A, B) and (C, D) to each other (after the initial repair), 
rather than initiating another repair between (A, B) and (C, D) in step 3?

Another possible option: 
Consider (DC1: A, B, C and DC2: X, Y, Z).
Start the Merkle tree comparison between all the nodes; once the differences 
are identified, stream within the DC and then across the DC using a picked 
proxy or forwarder node: (A, B, C to X) and then (X, Y, Z to A). Now both DCs 
have all the inconsistent data, hence they can stream the ranges which were 
identified as inconsistent.

 Reduce WAN traffic while doing repairs
 --

 Key: CASSANDRA-6218
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6218
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: sankalp kohli
Priority: Minor

 The way we send out data that does not match over WAN can be improved. 
 Example: Say there are four nodes (A, B, C, D) which are replicas of a range we 
 are repairing. A and B are in DC1 and C and D are in DC2. If A does not have the data 
 which other replicas have, then we will have the following streams:
 1) A to B and back
 2) A to C and back(Goes over WAN)
 3) A to D and back(Goes over WAN)
 One of the ways of reducing WAN traffic is this.
 1) Repair A and B only with each other and C and D with each other starting 
 at same time t. 
 2) Once these repairs have finished, A,B and C,D are in sync with respect to 
 time t. 
 3) Now run a repair between A and C, the streams which are exchanged as a 
 result of the diff will also be streamed to B and D via A and C(C and D 
 behaves like a proxy to the streams).
 For a replication of DC1:2,DC2:2, the WAN traffic will get reduced by 50% and 
 even more for higher replication factors. 
  



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-3578) Multithreaded commitlog

2013-10-18 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13799612#comment-13799612
 ] 

Vijay commented on CASSANDRA-3578:
--

Ohhh great idea, will give it a shot...

 Multithreaded commitlog
 ---

 Key: CASSANDRA-3578
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3578
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Vijay
Priority: Minor
  Labels: performance
 Attachments: 0001-CASSANDRA-3578.patch, parallel_commit_log_2.patch


 Brian Aker pointed out a while ago that allowing multiple threads to modify 
 the commitlog simultaneously (reserving space for each with a CAS first, the 
 way we do in the SlabAllocator.Region.allocate) can improve performance, 
 since you're not bottlenecking on a single thread to do all the copying and 
 CRC computation.
 Now that we use mmap'd CommitLog segments (CASSANDRA-3411) this becomes 
 doable.
 (moved from CASSANDRA-622, which was getting a bit muddled.)



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-3578) Multithreaded commitlog

2013-10-17 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13798746#comment-13798746
 ] 

Vijay commented on CASSANDRA-3578:
--

Pushed my changes to https://github.com/Vijay2win/cassandra/commits/3578

* The above takes a different approach: we update the commit log as part of 
the mutation thread, with no extra threads to deal with serialization. A CAS 
operation reserves a block of bytes in the mmapped segment (similar to the 
slab allocator) and activates segments; see the sketch below.
* Sync is still managed in a separate thread.
* There is no end-of-segment marker on each mutation; we just have a header 
which holds the end.

We could clean it up a little more if it looks good.
Performance tests show a slight improvement... Maybe once we remove other 
bottlenecks the improvement will be more visible (we also have to test on 
spinning drives).
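
A minimal sketch of that reservation (hypothetical names; the real code is in the branch above):
{code}
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch only -- hypothetical names. Mutation threads reserve a region of
// the segment with a CAS on the tail position (the same idea as
// SlabAllocator.Region.allocate); serialization then happens on the
// mutation thread itself, and only the fsync stays on a separate thread.
class CasSegmentAllocator
{
    private final ByteBuffer segment;
    private final AtomicInteger position = new AtomicInteger(0);

    CasSegmentAllocator(int capacity)
    {
        this.segment = ByteBuffer.allocateDirect(capacity); // stand-in for the mmapped file
    }

    // Returns a private slice the caller serializes into, or null when the
    // segment is full and the next segment must be activated.
    ByteBuffer allocate(int size)
    {
        while (true)
        {
            int current = position.get();
            if (current + size > segment.capacity())
                return null;
            if (position.compareAndSet(current, current + size))
            {
                ByteBuffer slice = segment.duplicate();
                slice.position(current);
                slice.limit(current + size);
                return slice.slice();
            }
        }
    }
}
{code}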



 Multithreaded commitlog
 ---

 Key: CASSANDRA-3578
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3578
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Vijay
Priority: Minor
  Labels: performance
 Attachments: 0001-CASSANDRA-3578.patch, parallel_commit_log_2.patch


 Brian Aker pointed out a while ago that allowing multiple threads to modify 
 the commitlog simultaneously (reserving space for each with a CAS first, the 
 way we do in the SlabAllocator.Region.allocate) can improve performance, 
 since you're not bottlenecking on a single thread to do all the copying and 
 CRC computation.
 Now that we use mmap'd CommitLog segments (CASSANDRA-3411) this becomes 
 doable.
 (moved from CASSANDRA-622, which was getting a bit muddled.)



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Assigned] (CASSANDRA-3578) Multithreaded commitlog

2013-10-10 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay reassigned CASSANDRA-3578:


Assignee: Vijay

 Multithreaded commitlog
 ---

 Key: CASSANDRA-3578
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3578
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Vijay
Priority: Minor
  Labels: performance
 Attachments: 0001-CASSANDRA-3578.patch, parallel_commit_log_2.patch


 Brian Aker pointed out a while ago that allowing multiple threads to modify 
 the commitlog simultaneously (reserving space for each with a CAS first, the 
 way we do in the SlabAllocator.Region.allocate) can improve performance, 
 since you're not bottlenecking on a single thread to do all the copying and 
 CRC computation.
 Now that we use mmap'd CommitLog segments (CASSANDRA-3411) this becomes 
 doable.
 (moved from CASSANDRA-622, which was getting a bit muddled.)



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (CASSANDRA-5911) Commit logs are not removed after nodetool flush or nodetool drain

2013-10-07 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-5911:
-

Attachment: 0001-CASSANDRA-5911.patch

Hi Jonathan, yeah, it will currently replay the active segment (2. above).
Please see the attached, which provides an alternative: we can just switch to 
a different segment on flush, before it is full.

 Commit logs are not removed after nodetool flush or nodetool drain
 --

 Key: CASSANDRA-5911
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5911
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: J.B. Langston
Assignee: Vijay
Priority: Minor
 Fix For: 2.0.2

 Attachments: 0001-CASSANDRA-5911.patch, 
 6528_140171_knwmuqxe9bjv5re_system.log


 Commit logs are not removed after nodetool flush or nodetool drain. This can 
 lead to unnecessary commit log replay during startup.  I've reproduced this 
 on Apache Cassandra 1.2.8.  Usually this isn't much of an issue but on a 
 Solr-indexed column family in DSE, each replayed mutation has to be reindexed 
 which can make startup take a long time (on the order of 20-30 min).
 Reproduction follows:
 {code}
 jblangston:bin jblangston$ ./cassandra > /dev/null
 jblangston:bin jblangston$ ../tools/bin/cassandra-stress -n 2000 > /dev/null
 jblangston:bin jblangston$ du -h ../commitlog
 576M  ../commitlog
 jblangston:bin jblangston$ nodetool flush
 jblangston:bin jblangston$ du -h ../commitlog
 576M  ../commitlog
 jblangston:bin jblangston$ nodetool drain
 jblangston:bin jblangston$ du -h ../commitlog
 576M  ../commitlog
 jblangston:bin jblangston$ pkill java
 jblangston:bin jblangston$ du -h ../commitlog
 576M  ../commitlog
 jblangston:bin jblangston$ ./cassandra -f | grep Replaying
  INFO 10:03:42,915 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566761.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566762.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566763.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566764.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566765.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566766.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566767.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566768.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566769.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566770.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566771.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566772.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566773.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566774.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566775.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566776.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566777.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566778.log
  INFO 10:03:42,922 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566761.log
  INFO 10:03:43,907 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566762.log
  INFO 10:03:43,907 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566763.log
  INFO 10:03:43,907 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566764.log
  INFO 10:03:43,908 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566765.log
  INFO 10:03:43,908 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566766.log
  INFO 10:03:43,908 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566767.log
  INFO 10:03:43,909 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566768.log
  INFO 10:03:43,909 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566769.log
  INFO 10:03:43,909 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566770.log
  INFO 10:03:43,910 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566771.log
  INFO 10:03:43,910 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566772.log
  INFO 10:03:43,911 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566773.log
  INFO 10:03:43,911 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566774.log
  INFO 10:03:43,911 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566775.log
  INFO 10:03:43,912 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566776.log
  INFO 10:03:43,912 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566777.log
  INFO 

[jira] [Commented] (CASSANDRA-4681) SlabAllocator spends a lot of time in Thread.yield

2013-09-30 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13782230#comment-13782230
 ] 

Vijay commented on CASSANDRA-4681:
--

Hi Jonathan, it doesn't show any change in TPS; please see the attached 
(tried with stress at 50/200 threads and concurrent_writes at 32/256... all 
the runs are attached): http://pastebin.com/JDFqgcZN

 SlabAllocator spends a lot of time in Thread.yield
 --

 Key: CASSANDRA-4681
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4681
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.5
 Environment: OEL Linux
Reporter: Oleg Kibirev
Assignee: Jonathan Ellis
Priority: Minor
  Labels: performance
 Attachments: 4681-v3.txt, 4691-short-circuit.txt, 
 4691-v3-rebased.txt, SlabAllocator.java, SlabAllocator.java.list, 
 slab-list.patch


 When profiling high volume inserts into Cassandra running on a host with fast 
 SSD and CPU, Thread.yield() invoked by SlabAllocator appeared as the top item 
 in CPU samples. The fix is to return a regular byte buffer if current slab is 
 being initialized by another thread. So instead of:
 {code}
 if (oldOffset == UNINITIALIZED)
 {
     // The region doesn't have its data allocated yet.
     // Since we found this in currentRegion, we know that whoever
     // CAS-ed it there is allocating it right now. So spin-loop
     // shouldn't spin long!
     Thread.yield();
     continue;
 }
 {code}
 do:
 {code}
 if (oldOffset == UNINITIALIZED)
     return ByteBuffer.allocate(size);
 {code}
 I achieved 4x speed up in my (admittedly specialized) benchmark by using an 
 optimized version of SlabAllocator attached. Since this code is in the 
 critical path, even doing excessive atomic instructions or allocating 
 unneeded extra ByteBuffer instances has a measurable effect on performance



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-5549) Remove Table.switchLock

2013-09-28 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13781214#comment-13781214
 ] 

Vijay commented on CASSANDRA-5549:
--

Hi Ryan, can you give 
https://github.com/Vijay2win/cassandra/commits/5549-v2 a shot, on at least 
10M keys?

I rebased [~jbellis]'s branch, moved the CommitLogAllocator forceFlush back to 
a separate thread, and removed the isDirty boolean (since isClean is called in 
a separate thread, it shouldn't help performance on writes); the rest is all 
Jonathan's...

My benchmark on a 32-physical-core machine shows better performance than 
earlier: ~72 vs ~84.
http://pastebin.com/GRPMUcSB


 Remove Table.switchLock
 ---

 Key: CASSANDRA-5549
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
 Project: Cassandra
  Issue Type: Bug
Reporter: Jonathan Ellis
Assignee: Vijay
  Labels: performance
 Fix For: 2.1

 Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png


 As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
 path.  ReentrantReadWriteLock is not lightweight, even if there is no 
 contention per se between readers and writers of the lock (in Cassandra, 
 memtable updates and switches).



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-5357) Query cache

2013-09-24 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13777088#comment-13777088
 ] 

Vijay commented on CASSANDRA-5357:
--

{quote}
So the cost is quite high vs having live filters
{quote}
Some synthetic tests show very low overhead on the filter deserialization: 
http://pastebin.com/VNREA8fG. 

IMHO... the existence check might not be that bad, since 99% (that's an assumption) 
of the queries will have the same query filters on them. For those queries which 
are discrete and present in the cache (survived the LRU), I think it is fairer to 
take a hit than to let it live in the JVM. 
Filters may be big in some cases (like named filters, or filters with long 
string names), and even in the optimal case of empty strings we still need a 
minimum of two ByteBuffers, a count, and the data structures in memory. Hence 
compact off-heap storage might be good.

One other option, which we were discussing a little earlier, is to optimize the 
filters in the cache by trying to find the optimal cache filter entry by 
merging similar and overlapping queries; that would help the above.

{quote}
I'm not concerned about that so much as, do we keep within our total memory 
budget? 
{quote}
Ahaa, got it, so we need an additional parameter for the cache which says how 
much memory is available in the JVM for the cached keys... I will add it to the 
next revision.

 Query cache
 ---

 Key: CASSANDRA-5357
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5357
 Project: Cassandra
  Issue Type: Bug
Reporter: Jonathan Ellis
Assignee: Vijay

 I think that most people expect the row cache to act like a query cache, 
 because that's a reasonable model.  Caching the entire partition is, in 
 retrospect, not really reasonable, so it's not surprising that it catches 
 people off guard, especially given the confusion we've inflicted on ourselves 
 as to what a row constitutes.
 I propose replacing it with a true query cache.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5357) Query cache

2013-09-23 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13775967#comment-13775967
 ] 

Vijay commented on CASSANDRA-5357:
--

{quote}
I'm saying just shove the ColumnFamily payload off-heap but leave the rest 
live.
{quote}
Sure, but that can cause more memory pressure in the JVM. IMHO (cost vs. benefit) 
it's not that bad to deserialize the filters, at least in the stress tests I did.

{quote}
I'm not sure I understand exactly how the problem happens here.
{quote}
The problem is when a whole-row (let's say multiple MBs of) column family is 
cached: instead of deserializing the whole column family at once, we can 
deserialize it during filtering in CFS.filterColumnFamily, hence the QC should 
return an iterator instead of a CF... Makes sense?

 Query cache
 ---

 Key: CASSANDRA-5357
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5357
 Project: Cassandra
  Issue Type: Bug
Reporter: Jonathan Ellis
Assignee: Vijay

 I think that most people expect the row cache to act like a query cache, 
 because that's a reasonable model.  Caching the entire partition is, in 
 retrospect, not really reasonable, so it's not surprising that it catches 
 people off guard, especially given the confusion we've inflicted on ourselves 
 as to what a row constitutes.
 I propose replacing it with a true query cache.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5357) Query cache

2013-09-22 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13774291#comment-13774291
 ] 

Vijay commented on CASSANDRA-5357:
--

Hi Jonathan, I have pushed a version with a sentinel (might have made it a little 
hacky, but it works): 
https://github.com/Vijay2win/cassandra/commits/query_cache_v2.

{quote}
Serializing the entire QueryCacheValue for each lookup is going to kill 
performance on hot partitions.
{quote}
It is required because we need to know the query which populated the cache. For 
example, a named query for columns A, Z can be followed by a slice query from 
A to Z, and we might not return the right response since B to Y is not in the 
cache. 

In a separate ticket we can also optimize the above case (and more) in how cached 
queries are stored, if that's OK. Example: if the slice with 250 is stored, why 
also store the slice with 50 in the same range; we can also merge overlapping 
slices, etc.

{quote}
if there's room, that's fine, but exceeding the configured memory budget is Bad
{quote}
Can we do that in a separate ticket? I believe we can achieve this by 
implementing an Iterator, similar to SSTableIterator, to stream the 
columns rather than constructing the ColumnFamily at once.

Thanks!

 Query cache
 ---

 Key: CASSANDRA-5357
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5357
 Project: Cassandra
  Issue Type: Bug
Reporter: Jonathan Ellis
Assignee: Vijay

 I think that most people expect the row cache to act like a query cache, 
 because that's a reasonable model.  Caching the entire partition is, in 
 retrospect, not really reasonable, so it's not surprising that it catches 
 people off guard, especially given the confusion we've inflicted on ourselves 
 as to what a row constitutes.
 I propose replacing it with a true query cache.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Comment Edited] (CASSANDRA-5357) Query cache

2013-09-22 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13774291#comment-13774291
 ] 

Vijay edited comment on CASSANDRA-5357 at 9/23/13 4:48 AM:
---

Hi Jonathan, I have pushed a version with a sentinel (might have made it a little 
hacky, but it works): 
https://github.com/Vijay2win/cassandra/commits/query_cache_v2.

{quote}
Serializing the entire QueryCacheValue for each lookup is going to kill 
performance on hot partitions.
{quote}
It is required because we need to know the query which populated the cache. For 
example, a named query for columns A, Z can be followed by a slice query from 
A to Z, and we might not return the right response since B to Y is not in the 
cache. 

In a separate ticket we can also optimize the above case (and more) in how cached 
queries are stored, if that's OK. Example: if the slice with a count of 250 is 
stored, we might not need to store the slice with a count of 50 for the same 
range; we can also merge overlapping slices, etc.

{quote}
if there's room, that's fine, but exceeding the configured memory budget is Bad
{quote}
Can we do that in a separate ticket? I believe we can achieve this by 
implementing an Iterator, similar to SSTableIterator, to stream the 
columns rather than constructing the ColumnFamily at once.

Thanks!

  was (Author: vijay2...@yahoo.com):
Hi Jonathan, I have pushed a version with a sentinel (might have made it 
a little hacky, but it works): 
https://github.com/Vijay2win/cassandra/commits/query_cache_v2.

{quote}
Serializing the entire QueryCacheValue for each lookup is going to kill 
performance on hot partitions.
{quote}
It is required because we need to know the query which populated the cache. For 
example, a named query for columns A, Z can be followed by a slice query from 
A to Z, and we might not return the right response since B to Y is not in the 
cache. 

In a separate ticket we can also optimize the above case (and more) in how cached 
queries are stored, if that's OK. Example: if the slice with 250 is stored, why 
also store the slice with 50 in the same range; we can also merge overlapping 
slices, etc.

{quote}
if there's room, that's fine, but exceeding the configured memory budget is Bad
{quote}
Can we do that in a separate ticket? I believe we can achieve this by 
implementing an Iterator, similar to SSTableIterator, to stream the 
columns rather than constructing the ColumnFamily at once.

Thanks!
  
 Query cache
 ---

 Key: CASSANDRA-5357
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5357
 Project: Cassandra
  Issue Type: Bug
Reporter: Jonathan Ellis
Assignee: Vijay

 I think that most people expect the row cache to act like a query cache, 
 because that's a reasonable model.  Caching the entire partition is, in 
 retrospect, not really reasonable, so it's not surprising that it catches 
 people off guard, especially given the confusion we've inflicted on ourselves 
 as to what a row constitutes.
 I propose replacing it with a true query cache.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (CASSANDRA-1956) Convert row cache to row+filter cache

2013-09-20 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay resolved CASSANDRA-1956.
--

Resolution: Duplicate

Yep, closing this as a duplicate of CASSANDRA-5357.

 Convert row cache to row+filter cache
 -

 Key: CASSANDRA-1956
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1956
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Stu Hood
Assignee: Vijay
Priority: Minor
 Fix For: 2.1

 Attachments: 0001-1956-cache-updates-v0.patch, 
 0001-commiting-block-cache.patch, 0001-re-factor-row-cache.patch, 
 0001-row-cache-filter.patch, 0002-1956-updates-to-thrift-and-avro-v0.patch, 
 0002-add-query-cache.patch


 Changing the row cache to a row+filter cache would make it much more useful. 
 We currently have to warn against using the row cache with wide rows, where 
 the read pattern is typically a peek at the head, but this use case would be 
 perfectly supported by a cache that stored only columns matching the filter.
 Possible implementations:
 * (copout) Cache a single filter per row, and leave the cache key as is
 * Cache a list of filters per row, leaving the cache key as is: this is 
 likely to have some gotchas for weird usage patterns, and it requires the 
 list overhead
 * Change the cache key to rowkey+filterid: basically ideal, but you need a 
 secondary index to look up cache entries by rowkey so that you can keep them 
 in sync with the memtable
 * others?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-4165) Generate Digest file for compressed SSTables

2013-09-20 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13773543#comment-13773543
 ] 

Vijay commented on CASSANDRA-4165:
--

Hi Jonathan, 3648 actually adds block-level CRCs for uncompressed files, writes 
them to a separate file (CRC.db), and uses them during the streaming parts of 
the file to validate before streaming (not during normal reads). Hence we need 
two checksums during the flush: one for the blocks and the MD5 for the whole file.
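
For illustration, a rough sketch of maintaining both checksums during a flush 
(hypothetical class and helper names, following the comment above rather than 
the actual writer code):

{code}
import java.io.DataOutputStream;
import java.io.FileOutputStream;
import java.security.MessageDigest;
import java.util.zip.CRC32;

// Hypothetical sketch, not Cassandra's actual flush path.
class ChecksummedWriter
{
    private final CRC32 blockCrc = new CRC32();  // reset per block, written to CRC.db
    private final MessageDigest fileDigest;      // covers the whole file
    private final DataOutputStream crcOut;

    ChecksummedWriter(String crcPath) throws Exception
    {
        fileDigest = MessageDigest.getInstance("MD5");
        crcOut = new DataOutputStream(new FileOutputStream(crcPath));
    }

    void append(byte[] block) throws Exception
    {
        blockCrc.reset();
        blockCrc.update(block, 0, block.length);  // per-block checksum, validated before streaming
        crcOut.writeInt((int) blockCrc.getValue());
        fileDigest.update(block);                 // whole-file digest for backup verification
    }
}
{code}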

 Generate Digest file for compressed SSTables
 

 Key: CASSANDRA-4165
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4165
 Project: Cassandra
  Issue Type: Improvement
Reporter: Marcus Eriksson
Assignee: Marcus Eriksson
Priority: Minor
 Attachments: 0001-Generate-digest-for-compressed-files-as-well.patch, 
 4165-rebased.txt


 We use the generated *Digest.sha1 files to verify backups; it would be nice if 
 they were generated for compressed sstables as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-6031) Remove code to load pre-1.2 caches

2013-09-13 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13767119#comment-13767119
 ] 

Vijay commented on CASSANDRA-6031:
--

+1

 Remove code to load pre-1.2 caches
 --

 Key: CASSANDRA-6031
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6031
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
 Attachments: remove-deprecated-cache-load-method.txt


 AutoSavingCache has been deprecated since 1.2 and exists to read 
 pre-CASSANDRA-3762 caches.  It is thus safe to remove in 2.0.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5933) 2.0 read performance is slower than 1.2

2013-09-02 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13756323#comment-13756323
 ] 

Vijay commented on CASSANDRA-5933:
--

Ryan, do you mind testing the custom setting with 5 to 10 ms... 
I am thinking we might need enough samples for the percentiles to make sense 
(if confirmed, we might want to wait till the samples arrive, etc.).

 2.0 read performance is slower than 1.2
 ---

 Key: CASSANDRA-5933
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5933
 Project: Cassandra
  Issue Type: Bug
Reporter: Ryan McGuire
 Attachments: 1.2-faster-than-2.0.png, 1.2-faster-than-2.0-stats.png


 Over the course of several tests I have observed that 2.0 read performance is 
 noticeably slower than 1.2
 Example:
 Blue line is 1.2, the rest are various forms of 2.0 rc1 (I've also seen this 
 on rc2, just don't have a good graph handy)
 !1.2-faster-than-2.0.png!
 !1.2-faster-than-2.0-stats.png!
 [See test data 
 here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.eager_retry.node_killed.jsonmetric=interval_op_rateoperation=stress-readsmoothing=1]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5933) 2.0 read performance is slower than 1.2

2013-09-02 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13756329#comment-13756329
 ] 

Vijay commented on CASSANDRA-5933:
--

Hi Ryan, you can set a custom speculative execution like the one below...
{code}
update column family Standard1 with speculative_retry=10ms;
{code}

 2.0 read performance is slower than 1.2
 ---

 Key: CASSANDRA-5933
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5933
 Project: Cassandra
  Issue Type: Bug
Reporter: Ryan McGuire
 Attachments: 1.2-faster-than-2.0.png, 1.2-faster-than-2.0-stats.png


 Over the course of several tests I have observed that 2.0 read performance is 
 noticeably slower than 1.2
 Example:
 Blue line is 1.2, the rest are various forms of 2.0 rc1 (I've also seen this 
 on rc2, just don't have a good graph handy)
 !1.2-faster-than-2.0.png!
 !1.2-faster-than-2.0-stats.png!
 [See test data 
 here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.eager_retry.node_killed.jsonmetric=interval_op_rateoperation=stress-readsmoothing=1]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5952) report compression ratio via nodetool cfstats

2013-08-31 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-5952:
-

Reviewer: jbellis

 report compression ratio via nodetool cfstats
 -

 Key: CASSANDRA-5952
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5952
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Robert Coli
Assignee: Vijay
Priority: Trivial
 Fix For: 1.2.10, 2.0.1

 Attachments: 0001-CASSANDRA-5952.patch


 CASSANDRA-3393 adds a getCompressionRatio JMX call, and was originally 
 supposed to also expose this value per CF via nodetool cfstats.
 However, the nodetool cfstats part was not done in CASSANDRA-3393. This 
 ticket serves as a request to expose this valuable data about compression via 
 nodetool cfstats.
 (cc: [~vijay2...@yahoo.com], who did the CASSANDRA-3393 work)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5952) report compression ratio via nodetool cfstats

2013-08-30 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-5952:
-

Attachment: 0001-CASSANDRA-5952.patch

A one-liner change to expose it via nodetool. Thanks!

 report compression ratio via nodetool cfstats
 -

 Key: CASSANDRA-5952
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5952
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Robert Coli
Assignee: Vijay
Priority: Trivial
 Attachments: 0001-CASSANDRA-5952.patch


 CASSANDRA-3393 adds a getCompressionRatio JMX call, and was originally 
 supposed to also expose this value per CF via nodetool cfstats.
 However, the nodetool cfstats part was not done in CASSANDRA-3393. This 
 ticket serves as a request to expose this valuable data about compression via 
 nodetool cfstats.
 (cc: [~vijay2...@yahoo.com], who did the CASSANDRA-3393 work)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (CASSANDRA-5952) report compression ratio via nodetool cfstats

2013-08-29 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay reassigned CASSANDRA-5952:


Assignee: Vijay

 report compression ratio via nodetool cfstats
 -

 Key: CASSANDRA-5952
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5952
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Robert Coli
Assignee: Vijay
Priority: Trivial

 CASSANDRA-3393 adds a getCompressionRatio JMX call, and was originally 
 supposed to also expose this value per CF via nodetool cfstats.
 However, the nodetool cfstats part was not done in CASSANDRA-3393. This 
 ticket serves as a request to expose this valuable data about compression via 
 nodetool cfstats.
 (cc: [~vijay2...@yahoo.com], who did the CASSANDRA-3393 work)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5939) Cache Providers calculate very different row sizes

2013-08-29 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13754387#comment-13754387
 ] 

Vijay commented on CASSANDRA-5939:
--

{quote}
While java has overhead, it's not...
{quote}

Well, try the following code in CacheProviderTest:

{code}
@Test
public void testCompareSizes() throws IOException
{
    RowCacheKey key = new RowCacheKey(UUID.randomUUID(), ByteBufferUtil.bytes("test"));
    ColumnFamily cf = createCF();
    System.out.println("size:" + (key.memorySize() + cf.memorySize()));
    System.out.println("key size:" + key.memorySize());
    System.out.println("value size:" + cf.memorySize());
    RowCacheSerializer serializer = new RowCacheSerializer();
    DataOutputBuffer out = new DataOutputBuffer();
    serializer.serialize(cf, out);
    System.out.println("ser size:" + out.getLength());

    IRowCacheEntry cf2 = serializer.deserialize(new DataInputStream(new ByteArrayInputStream(out.getData())));
    Assert.assertEquals(cf, cf2);
}
{code}

output (the value/CF overhead; memorySize uses JAMM's measureDeep())

{code}
size:74120
key size:48
value size:74072
ser size:66
{code}

I am just trying to figure out if there is any bug I am missing/overlooking. I 
agree that we need a configuration for the key size in the JVM heap to 
contain OOMs etc. 
We can use this ticket to solve that issue. I do understand we have removed 
CLHM in 2.0, so we can concentrate on getting a better configuration for SC.

 Cache Providers calculate very different row sizes
 --

 Key: CASSANDRA-5939
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5939
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: 1.2.8
Reporter: Chris Burroughs
Assignee: Vijay

 Took the same production node and bounced it 4 times comparing version and 
 cache provider.  ConcurrentLinkedHashCacheProvider and 
 SerializingCacheProvider produce very different results resulting in an order 
 of magnitude difference in rows cached.  In all cases the row cache size was 
 2048 MB.  Hit rate is provided for color, but entries & size are the 
 important part.
 1.2.8 ConcurrentLinkedHashCacheProvider:
  * entries: 23,217
  * hit rate: 43%
  * size: 2,147,398,344
 1.2.8 about 20 minutes of SerializingCacheProvider:
  * entries: 221,709
  * hit rate: 68%
  * size: 18,417,254
 1.2.5 ConcurrentLinkedHashCacheProvider:
  * entries: 25,967
  * hit rate: ~ 50%
  * size:  2,147,421,704
 1.2.5 about 20 minutes of SerializingCacheProvider:
  * entries: 228,457
  * hit rate: ~ 70%
  * size: 19,070,315
 A related(?) problem is that the ConcurrentLinkedHashCacheProvider sizes seem 
 to be highly variable.  Digging up the values for 5 different nodes in the 
 cluster using ConcurrentLinkedHashCacheProvider shows a wide variance in 
 number of entries:
  * 12k
  * 444k
  * 10k
  * 25k
  * 25k

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5939) Cache Providers calculate very different row sizes

2013-08-28 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13753061#comment-13753061
 ] 

Vijay commented on CASSANDRA-5939:
--

Chris, not sure if I understand the question/issue right... 

If the question is what's the difference between SC and CLHM in terms of memory 
overhead:

For CLHM the entry's (key and value) weight is calculated, whereas for SC we only 
weigh the values (which are off-heap) and we don't weigh the size of the keys in 
the heap (since it is a kind of hybrid footprint).
CLHM has Java's object overhead (see 
https://issues.apache.org/jira/browse/CASSANDRA-4860?focusedCommentId=13632991page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13632991);
 for SC we encode the bytes, hence the overhead of values in memory is considerably 
lower. Your mileage may also vary depending on the size of the columns.
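
For illustration, the difference in what the two providers weigh (a sketch 
assuming JAMM's MemoryMeter and some off-heap handle exposing a size; names 
other than measureDeep are placeholders):

{code}
import org.github.jamm.MemoryMeter;

MemoryMeter meter = new MemoryMeter();

// CLHM-style weight: the live on-heap object graph of both key and value.
long clhmWeight = meter.measureDeep(key) + meter.measureDeep(columnFamily);

// SerializingCache-style weight: only the serialized off-heap value bytes;
// the on-heap key is not counted at all.
long scWeight = serializedValue.size();
{code}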

 Cache Providers calculate very different row sizes
 --

 Key: CASSANDRA-5939
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5939
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: 1.2.8
Reporter: Chris Burroughs
Assignee: Vijay

 Took the same production node and bounced it 4 times comparing version and 
 cache provider.  ConcurrentLinkedHashCacheProvider and 
 SerializingCacheProvider produce very different results resulting in an order 
 of magnitude difference in rows cached.  In all cases the row cache size was 
 2048 MB.  Hit rate is provided for color, but entries & size are the 
 important part.
 1.2.8 ConcurrentLinkedHashCacheProvider:
  * entries: 23,217
  * hit rate: 43%
  * size: 2,147,398,344
 1.2.8 about 20 minutes of SerializingCacheProvider:
  * entries: 221,709
  * hit rate: 68%
  * size: 18,417,254
 1.2.5 ConcurrentLinkedHashCacheProvider:
  * entries: 25,967
  * hit rate: ~ 50%
  * size:  2,147,421,704
 1.2.5 about 20 minutes of SerializingCacheProvider:
  * entries: 228,457
  * hit rate: ~ 70%
  * size: 19,070,315
 A related(?) problem is that the ConcurrentLinkedHashCacheProvider sizes seem 
 to be highly variable.  Digging up the values for 5 different nodes in the 
 cluster using ConcurrentLinkedHashCacheProvider shows a wide variance in 
 number of entries:
  * 12k
  * 444k
  * 10k
  * 25k
  * 25k

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5909) CommitLogReplayer date time issue

2013-08-27 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-5909:
-

Attachment: 0001-CASSANDRA-5909.patch

Attached a patch and a test case as a fix to add precision. Thanks!

 CommitLogReplayer date time issue 
 --

 Key: CASSANDRA-5909
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5909
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Artur Kronenberg
Assignee: Vijay
Priority: Minor
 Fix For: 1.2.10

 Attachments: 0001-CASSANDRA-5909.patch


 Hi,
 First off I am sorry if the component is not right for this. 
 I am trying to get the point-in-time backup to work. And I ran into the 
 following issues: 
 1. The documentation in the commitlog_archiving.properties seems to be out of 
 date, as the example date format is no more valid and can't be parsed. 
 2. 
 The restore_point_in_time property seems to differ from the actual 
 maxTimeStamp. I added additional logging to the codebase in the class 
 CommitLogReplayer like that: 
 protected boolean pointInTimeExceeded(RowMutation frm)
 {
     long restoreTarget = CommitLog.instance.archiver.restorePointInTime;
     logger.info(String.valueOf(restoreTarget));
     for (ColumnFamily families : frm.getColumnFamilies())
     {
         logger.info(String.valueOf(families.maxTimestamp()));
         if (families.maxTimestamp() > restoreTarget)
             return true;
     }
     return false;
 }
 The following output can be seen: 
 The restoreTarget timestamp is: 1377015783000
 This has been correctly parsed as I added this date to the properties: 
 2013:08:20 17:23:03
 the value for families.maxTimestamp() is: 1377009021033000
 This date corresponds to: Mon 45605-09-05 10:50:33 BST (44 millennia from now)
 It seems like the timestamp has 3 additional zeros. This also means that the 
 code can never return false on the call, as the restoreTarget will always be 
 smaller than the maxTimestamp(). Therefore the Replayer can never replay any 
 of my commitlog files. 
 The timestamp minus the 3 zeros corresponds to Tue 2013-08-20 15:30:21 BST 
 (23 hours ago) which makes more sense and would allow for the replay to 
 work. 
 My config: 
 Cassandra-1.2.4
 Java 1.6
 Ubuntu 12.04 64bit 
 If you need any more information let me know and I'll be happy to supply 
 whatever info I can. 
 -- artur 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5909) CommitLogReplayer date time issue

2013-08-26 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13750875#comment-13750875
 ] 

Vijay commented on CASSANDRA-5909:
--

Ahaa, looks like we need a configuration for millisecond/microsecond precision. 
Users should not mix those in a cluster if they want reliable deletes and updates, 
so it should be fine. The other option is to write an additional long field while 
storing the RM in the commit log.
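
For illustration, a minimal sketch of comparing in a single unit (hypothetical 
shape, hard-coding the millisecond-to-microsecond scaling that a precision 
configuration would control; not the attached patch):

{code}
protected boolean pointInTimeExceeded(RowMutation frm)
{
    // restorePointInTime is parsed from a date string, i.e. milliseconds;
    // column timestamps are conventionally microseconds, so scale before comparing.
    long restoreTargetMicros = CommitLog.instance.archiver.restorePointInTime * 1000;
    for (ColumnFamily families : frm.getColumnFamilies())
    {
        if (families.maxTimestamp() > restoreTargetMicros)
            return true; // mutation is newer than the restore target
    }
    return false;
}
{code}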

 CommitLogReplayer date time issue 
 --

 Key: CASSANDRA-5909
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5909
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Artur Kronenberg
Assignee: Vijay
Priority: Minor
 Fix For: 1.2.10


 Hi,
 First off I am sorry if the component is not right for this. 
 I am trying to get the point-in-time backup to work. And I ran into the 
 following issues: 
 1. The documentation in the commitlog_archiving.properties seems to be out of 
 date, as the example date format is no more valid and can't be parsed. 
 2. 
 The restore_point_in_time property seems to differ from the actual 
 maxTimeStamp. I added additional logging to the codebase in the class 
 CommitLogReplayer like that: 
 protected boolean pointInTimeExceeded(RowMutation frm)
 {
     long restoreTarget = CommitLog.instance.archiver.restorePointInTime;
     logger.info(String.valueOf(restoreTarget));
     for (ColumnFamily families : frm.getColumnFamilies())
     {
         logger.info(String.valueOf(families.maxTimestamp()));
         if (families.maxTimestamp() > restoreTarget)
             return true;
     }
     return false;
 }
 The following output can be seen: 
 The restoreTarget timestamp is: 1377015783000
 This has been correctly parsed as I added this date to the properties: 
 2013:08:20 17:23:03
 the value for families.maxTimestamp() is: 1377009021033000
 This date corresponds to: Mon 45605-09-05 10:50:33 BST (44 millennia from now)
 It seems like the timestamp has 3 additional zeros. This also means that the 
 code can never return false on the call, as the restoreTarget will always be 
 smaller than the maxTimestamp(). Therefore the Replayer can never replay any 
 of my commitlog files. 
 The timestamp minus the 3 zeros corresponds to Tue 2013-08-20 15:30:21 BST 
 (23 hours ago) which makes more sense and would allow for the replay to 
 work. 
 My config: 
 Cassandra-1.2.4
 Java 1.6
 Ubuntu 12.04 64bit 
 If you need any more information let me know and I'll be happy to supply 
 whatever info I can. 
 -- artur 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5911) Commit logs are not removed after nodetool flush or nodetool drain

2013-08-23 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13748365#comment-13748365
 ] 

Vijay commented on CASSANDRA-5911:
--

1) Even though the logs show "Replaying", we will not replay anything, since 
END_OF_SEGMENT_MARKER is placed at the beginning of the file. 

- We can improve the logging so we don't print a CL that we are skipping after 
reading its first 4 bytes. 

2) Only the active segment is replayed, even if we flush the CL, since we 
have not recycled it. 

- One way I can think of to avoid replaying the active segment, at a 
performance cost, is to have a metadata file holding info on dirty CF 
writes, if any (similar to CommitLogSegment#cfLastWrite: write on the 
first write to a segment, remove on flush). 
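
As a sketch of the suggested logging improvement (assumed reader/file/logger 
variables; not a patch):

{code}
// Peek at the first four bytes of the segment: if it opens with the
// end-of-segment marker, nothing was ever written to it, so skip the
// noisy "Replaying ..." line for this file.
int firstMarker = reader.readInt();
if (firstMarker == CommitLog.END_OF_SEGMENT_MARKER)
{
    logger.debug("Commit log segment {} is empty, skipping replay", file.getName());
    return;
}
{code}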

 Commit logs are not removed after nodetool flush or nodetool drain
 --

 Key: CASSANDRA-5911
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5911
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: J.B. Langston
Assignee: Vijay
Priority: Minor
 Fix For: 2.0.1


 Commit logs are not removed after nodetool flush or nodetool drain. This can 
 lead to unnecessary commit log replay during startup.  I've reproduced this 
 on Apache Cassandra 1.2.8.  Usually this isn't much of an issue but on a 
 Solr-indexed column family in DSE, each replayed mutation has to be reindexed 
 which can make startup take a long time (on the order of 20-30 min).
 Reproduction follows:
 {code}
 jblangston:bin jblangston$ ./cassandra > /dev/null
 jblangston:bin jblangston$ ../tools/bin/cassandra-stress -n 2000 > /dev/null
 jblangston:bin jblangston$ du -h ../commitlog
 576M  ../commitlog
 jblangston:bin jblangston$ nodetool flush
 jblangston:bin jblangston$ du -h ../commitlog
 576M  ../commitlog
 jblangston:bin jblangston$ nodetool drain
 jblangston:bin jblangston$ du -h ../commitlog
 576M  ../commitlog
 jblangston:bin jblangston$ pkill java
 jblangston:bin jblangston$ du -h ../commitlog
 576M  ../commitlog
 jblangston:bin jblangston$ ./cassandra -f | grep Replaying
  INFO 10:03:42,915 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566761.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566762.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566763.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566764.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566765.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566766.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566767.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566768.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566769.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566770.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566771.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566772.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566773.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566774.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566775.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566776.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566777.log, 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566778.log
  INFO 10:03:42,922 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566761.log
  INFO 10:03:43,907 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566762.log
  INFO 10:03:43,907 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566763.log
  INFO 10:03:43,907 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566764.log
  INFO 10:03:43,908 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566765.log
  INFO 10:03:43,908 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566766.log
  INFO 10:03:43,908 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566767.log
  INFO 10:03:43,909 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566768.log
  INFO 10:03:43,909 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566769.log
  INFO 10:03:43,909 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566770.log
  INFO 10:03:43,910 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566771.log
  INFO 10:03:43,910 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566772.log
  INFO 10:03:43,911 Replaying 
 /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566773.log
  INFO 10:03:43,911 Replaying 
 

[jira] [Commented] (CASSANDRA-5903) Integer overflow in OffHeapBitSet when bloomfilter > 2GB

2013-08-21 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13746598#comment-13746598
 ] 

Vijay commented on CASSANDRA-5903:
--

Thanks Taylan, I will write up a test case for it... The patch on 1.2 (0002) 
should handle up to 2GB * 8; beyond that we might want to serialize and 
deserialize as a long for 2.1.
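
For illustration, a minimal sketch of the overflow fix discussed here (computing 
the count in 64-bit arithmetic; not necessarily the committed patch verbatim):

{code}
public OffHeapBitSet(long numBits)
{
    // Multiply as a long so bits2words(numBits) * 8 cannot wrap around
    // for bitsets larger than 2GB.
    long byteCount = OpenBitSet.bits2words(numBits) * 8L;
    bytes = RefCountedMemory.allocate(byteCount);
    // flush/clear the existing memory.
    clear();
}
{code}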

 Integer overflow in OffHeapBitSet when bloomfilter > 2GB
 

 Key: CASSANDRA-5903
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5903
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Taylan Develioglu
Assignee: Vijay
  Labels: patch
 Fix For: 1.2.9

 Attachments: 0001-CASSANDRA-5903.patch, 0002-CASSANDRA-5903.patch


 In org.apache.cassandra.utils.obs.OffHeapBitSet,
 byteCount overflows and causes an IllegalArgumentException in 
 Memory.allocate when the bloomfilter is > 2GB.
 Suggest changing byteCount to long.
 {code:title=OffHeapBitSet.java}
 public OffHeapBitSet(long numBits)
 {
     // OpenBitSet.bits2words calculation is there for backward compatibility.
     int byteCount = OpenBitSet.bits2words(numBits) * 8;
     bytes = RefCountedMemory.allocate(byteCount);
     // flush/clear the existing memory.
     clear();
 }
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5903) Integer overflow in OffHeapBitSet when bloomfilter > 2GB

2013-08-21 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-5903:
-

Attachment: 0001-CASSANDRA-5903-check.patch

Not sure if we still need this patch, attaching it just in case :)

 Integer overflow in OffHeapBitSet when bloomfilter > 2GB
 

 Key: CASSANDRA-5903
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5903
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Taylan Develioglu
Assignee: Vijay
  Labels: patch
 Fix For: 1.2.9

 Attachments: 0001-CASSANDRA-5903-check.patch, 
 0001-CASSANDRA-5903.patch, 0002-CASSANDRA-5903.patch


 In org.apache.cassandra.utils.obs.OffHeapBitSet,
 byteCount overflows and causes an IllegalArgumentException in 
 Memory.allocate when the bloomfilter is > 2GB.
 Suggest changing byteCount to long.
 {code:title=OffHeapBitSet.java}
 public OffHeapBitSet(long numBits)
 {
     // OpenBitSet.bits2words calculation is there for backward compatibility.
     int byteCount = OpenBitSet.bits2words(numBits) * 8;
     bytes = RefCountedMemory.allocate(byteCount);
     // flush/clear the existing memory.
     clear();
 }
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5903) Integer overflow in OffHeapBitSet when bloomfilter > 2GB

2013-08-21 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-5903:
-

Attachment: 0001-CASSANDRA-5903-check.patch

 Integer overflow in OffHeapBitSet when bloomfilter > 2GB
 

 Key: CASSANDRA-5903
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5903
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Taylan Develioglu
Assignee: Vijay
  Labels: patch
 Fix For: 1.2.9

 Attachments: 0001-CASSANDRA-5903-check.patch, 
 0001-CASSANDRA-5903.patch, 0002-CASSANDRA-5903.patch


 In org.apache.cassandra.utils.obs.OffHeapBitSet,
 byteCount overflows and causes an IllegalArgumentException in 
 Memory.allocate when the bloomfilter is > 2GB.
 Suggest changing byteCount to long.
 {code:title=OffHeapBitSet.java}
 public OffHeapBitSet(long numBits)
 {
     // OpenBitSet.bits2words calculation is there for backward compatibility.
     int byteCount = OpenBitSet.bits2words(numBits) * 8;
     bytes = RefCountedMemory.allocate(byteCount);
     // flush/clear the existing memory.
     clear();
 }
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5903) Integer overflow in OffHeapBitSet when bloomfilter > 2GB

2013-08-21 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-5903:
-

Attachment: (was: 0001-CASSANDRA-5903-check.patch)

 Integer overflow in OffHeapBitSet when bloomfilter > 2GB
 

 Key: CASSANDRA-5903
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5903
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Taylan Develioglu
Assignee: Vijay
  Labels: patch
 Fix For: 1.2.9

 Attachments: 0001-CASSANDRA-5903-check.patch, 
 0001-CASSANDRA-5903.patch, 0002-CASSANDRA-5903.patch


 In org.apache.cassandra.utils.obs.OffHeapBitSet,
 byteCount overflows and causes an IllegalArgumentException in 
 Memory.allocate when the bloomfilter is > 2GB.
 Suggest changing byteCount to long.
 {code:title=OffHeapBitSet.java}
 public OffHeapBitSet(long numBits)
 {
     // OpenBitSet.bits2words calculation is there for backward compatibility.
     int byteCount = OpenBitSet.bits2words(numBits) * 8;
     bytes = RefCountedMemory.allocate(byteCount);
     // flush/clear the existing memory.
     clear();
 }
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Comment Edited] (CASSANDRA-5903) Integer overflow in OffHeapBitSet when bloomfilter > 2GB

2013-08-21 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13746836#comment-13746836
 ] 

Vijay edited comment on CASSANDRA-5903 at 8/21/13 8:59 PM:
---

Not sure if we still need this patch, attaching it just in case :) Ignored the 
test since we need 4 GB to test its function.

  was (Author: vijay2...@yahoo.com):
Not sure if we still need this patch, attaching it just in case :)
  
 Integer overflow in OffHeapBitSet when bloomfilter > 2GB
 

 Key: CASSANDRA-5903
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5903
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Taylan Develioglu
Assignee: Vijay
  Labels: patch
 Fix For: 1.2.9

 Attachments: 0001-CASSANDRA-5903-check.patch, 
 0001-CASSANDRA-5903.patch, 0002-CASSANDRA-5903.patch


 In org.apache.cassandra.utils.obs.OffHeapBitSet,
 byteCount overflows and causes an IllegalArgumentException in 
 Memory.allocate when the bloomfilter is > 2GB.
 Suggest changing byteCount to long.
 {code:title=OffHeapBitSet.java}
 public OffHeapBitSet(long numBits)
 {
     // OpenBitSet.bits2words calculation is there for backward compatibility.
     int byteCount = OpenBitSet.bits2words(numBits) * 8;
     bytes = RefCountedMemory.allocate(byteCount);
     // flush/clear the existing memory.
     clear();
 }
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5903) Integer overflow in OffHeapBitSet when bloomfilter > 2GB

2013-08-21 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13747164#comment-13747164
 ] 

Vijay commented on CASSANDRA-5903:
--

Done, Thanks!

 Integer overflow in OffHeapBitSet when bloomfilter > 2GB
 

 Key: CASSANDRA-5903
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5903
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Taylan Develioglu
Assignee: Vijay
  Labels: patch
 Fix For: 1.2.9

 Attachments: 0001-CASSANDRA-5903-check.patch, 
 0001-CASSANDRA-5903.patch, 0002-CASSANDRA-5903.patch


 In org.apache.cassandra.utils.obs.OffHeapBitSet,
 byteCount overflows and causes an IllegalArgumentException in 
 Memory.allocate when the bloomfilter is > 2GB.
 Suggest changing byteCount to long.
 {code:title=OffHeapBitSet.java}
 public OffHeapBitSet(long numBits)
 {
     // OpenBitSet.bits2words calculation is there for backward compatibility.
     int byteCount = OpenBitSet.bits2words(numBits) * 8;
     bytes = RefCountedMemory.allocate(byteCount);
     // flush/clear the existing memory.
     clear();
 }
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5903) Integer overflow in OffHeapBitSet when bloomfilter > 2GB

2013-08-20 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13745244#comment-13745244
 ] 

Vijay commented on CASSANDRA-5903:
--

I can change the byte count to long.

As a side note, I am not sure if we are addressing the right issue. From the 
stack trace the byteCount should be 228805104, which is 228 MB 
(OpenBitSet.bits2words(1830440832L) * 8L), and should fit in an integer.
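
A quick sanity check of that arithmetic in plain Java (assuming bits2words(n) 
is ((n - 1) >>> 6) + 1, as in OpenBitSet):

{code}
long numBits = 1830440832L;
long words = ((numBits - 1) >>> 6) + 1; // 28,600,638 64-bit words
long byteCount = words * 8;             // 228,805,104 bytes, ~228 MB
System.out.println(byteCount);          // well under Integer.MAX_VALUE
{code}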


 Integer overflow in OffHeapBitSet when bloomfilter > 2GB
 

 Key: CASSANDRA-5903
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5903
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Taylan Develioglu
Assignee: Vijay
 Fix For: 1.2.9


 In org.apache.cassandra.utils.obs.OffHeapBitSet,
 byteCount overflows and causes an IllegalArgumentException in 
 Memory.allocate when the bloomfilter is > 2GB.
 Suggest changing byteCount to long.
 {code:title=OffHeapBitSet.java}
 public OffHeapBitSet(long numBits)
 {
     // OpenBitSet.bits2words calculation is there for backward compatibility.
     int byteCount = OpenBitSet.bits2words(numBits) * 8;
     bytes = RefCountedMemory.allocate(byteCount);
     // flush/clear the existing memory.
     clear();
 }
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Comment Edited] (CASSANDRA-5903) Integer overflow in OffHeapBitSet when bloomfilter > 2GB

2013-08-20 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13745244#comment-13745244
 ] 

Vijay edited comment on CASSANDRA-5903 at 8/20/13 6:31 PM:
---

I can change the byte count to long.

As a side note, I am not sure if we are addressing the right issue. From the 
stack trace the byteCount should be 228805104, which is 228 MB 
(OpenBitSet.bits2words(1830440832L) * 8L, i.e. (1830440832L/64) * 8), and should 
fit in an integer.

  was (Author: vijay2...@yahoo.com):
I can change the byte count to long.

As a side note, I am not sure if we are addressing the right issue. From the 
stack trace the byteCount should be 228805104, which is 228 MB 
(OpenBitSet.bits2words(1830440832L) * 8L), and should fit in an integer.

  
 Integer overflow in OffHeapBitSet when bloomfilter > 2GB
 

 Key: CASSANDRA-5903
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5903
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Taylan Develioglu
Assignee: Vijay
 Fix For: 1.2.9


 In org.apache.cassandra.utils.obs.OffHeapBitSet,
 byteCount overflows and causes an IllegalArgumentException in 
 Memory.allocate when the bloomfilter is > 2GB.
 Suggest changing byteCount to long.
 {code:title=OffHeapBitSet.java}
 public OffHeapBitSet(long numBits)
 {
     // OpenBitSet.bits2words calculation is there for backward compatibility.
     int byteCount = OpenBitSet.bits2words(numBits) * 8;
     bytes = RefCountedMemory.allocate(byteCount);
     // flush/clear the existing memory.
     clear();
 }
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5903) Integer overflow in OffHeapBitSet when bloomfilter > 2GB

2013-08-20 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13745320#comment-13745320
 ] 

Vijay commented on CASSANDRA-5903:
--

Not sure yet, still trying to figure it out (since I am curious)... A 
simple test shows it might run out after 17B to 18B keys in a single SSTable 
(that's a giant SST) :)

{code}
for (int i = 0; i < 30; i++) {
    long items = (i * 1000000000L);
    System.out.println("Items: " + items + " byteCount: " + (OpenBitSet.bits2words(items) * 8));
}
{code}

{noformat}
Items: 0 byteCount: 0
Items: 1000000000 byteCount: 125000000
Items: 2000000000 byteCount: 250000000
Items: 3000000000 byteCount: 375000000
Items: 4000000000 byteCount: 500000000
Items: 5000000000 byteCount: 625000000
Items: 6000000000 byteCount: 750000000
Items: 7000000000 byteCount: 875000000
Items: 8000000000 byteCount: 1000000000
Items: 9000000000 byteCount: 1125000000
Items: 10000000000 byteCount: 1250000000
Items: 11000000000 byteCount: 1375000000
Items: 12000000000 byteCount: 1500000000
Items: 13000000000 byteCount: 1625000000
Items: 14000000000 byteCount: 1750000000
Items: 15000000000 byteCount: 1875000000
Items: 16000000000 byteCount: 2000000000
Items: 17000000000 byteCount: 2125000000
Items: 18000000000 byteCount: -2044967296
Items: 19000000000 byteCount: -1919967296
Items: 20000000000 byteCount: -1794967296
...
{noformat}

 Integer overflow in OffHeapBitSet when bloomfilter > 2GB
 

 Key: CASSANDRA-5903
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5903
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Taylan Develioglu
Assignee: Vijay
 Fix For: 1.2.9


 In org.apache.cassandra.utils.obs.OffHeapBitSet,
 byteCount overflows and causes an IllegalArgumentException in 
 Memory.allocate when the bloomfilter is > 2GB.
 Suggest changing byteCount to long.
 {code:title=OffHeapBitSet.java}
 public OffHeapBitSet(long numBits)
 {
     // OpenBitSet.bits2words calculation is there for backward compatibility.
     int byteCount = OpenBitSet.bits2words(numBits) * 8;
     bytes = RefCountedMemory.allocate(byteCount);
     // flush/clear the existing memory.
     clear();
 }
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5903) Integer overflow in OffHeapBitSet when bloomfilter > 2GB

2013-08-20 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13745328#comment-13745328
 ] 

Vijay commented on CASSANDRA-5903:
--

Actually my calculations were wrong, it does use 2 GB for 1830440832:

long numElements = 1830440832L;
FilterFactory.getFilter(numElements, 0.01d, true);

Fixing it.

 Integer overflow in OffHeapBitSet when bloomfilter > 2GB
 

 Key: CASSANDRA-5903
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5903
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Taylan Develioglu
Assignee: Vijay
 Fix For: 1.2.9


 In org.apache.cassandra.utils.obs.OffHeapBitSet,
 byteCount overflows and causes an IllegalArgumentException in 
 Memory.allocate when the bloomfilter is > 2GB.
 Suggest changing byteCount to long.
 {code:title=OffHeapBitSet.java}
 public OffHeapBitSet(long numBits)
 {
     // OpenBitSet.bits2words calculation is there for backward compatibility.
     int byteCount = OpenBitSet.bits2words(numBits) * 8;
     bytes = RefCountedMemory.allocate(byteCount);
     // flush/clear the existing memory.
     clear();
 }
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5903) Integer overflow in OffHeapBitSet when bloomfilter > 2GB

2013-08-20 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-5903:
-

Attachment: 0001-CASSANDRA-5903.patch

Simple fix for 1.2; it also catches a native OOM (I am neutral, I can also 
remove it so we fail fast) and throws an RTE to pause the compaction etc.

 Integer overflow in OffHeapBitSet when bloomfilter > 2GB
 

 Key: CASSANDRA-5903
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5903
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Taylan Develioglu
Assignee: Vijay
 Fix For: 1.2.9

 Attachments: 0001-CASSANDRA-5903.patch


 In org.apache.cassandra.utils.obs.OffHeapBitSet,
 byteCount overflows and causes an IllegalArgumentException in 
 Memory.allocate when the bloomfilter is > 2GB.
 Suggest changing byteCount to long.
 {code:title=OffHeapBitSet.java}
 public OffHeapBitSet(long numBits)
 {
     // OpenBitSet.bits2words calculation is there for backward compatibility.
     int byteCount = OpenBitSet.bits2words(numBits) * 8;
     bytes = RefCountedMemory.allocate(byteCount);
     // flush/clear the existing memory.
     clear();
 }
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (CASSANDRA-5903) Integer overflow in OffHeapBitSet when bloomfilter > 2GB

2013-08-20 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay resolved CASSANDRA-5903.
--

Resolution: Fixed
  Reviewer: jbellis

Committed to 1.2 and merged into 2.0.0 -> 2.0 -> trunk. Thanks!

 Integer overflow in OffHeapBitSet when bloomfilter > 2GB
 

 Key: CASSANDRA-5903
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5903
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Taylan Develioglu
Assignee: Vijay
 Fix For: 1.2.9

 Attachments: 0001-CASSANDRA-5903.patch


 In org.apache.cassandra.utils.obs.OffHeapBitSet,
 byteCount overflows and causes an IllegalArgumentException in 
 Memory.allocate when the bloomfilter is > 2GB.
 Suggest changing byteCount to long.
 {code:title=OffHeapBitSet.java}
 public OffHeapBitSet(long numBits)
 {
     // OpenBitSet.bits2words calculation is there for backward compatibility.
     int byteCount = OpenBitSet.bits2words(numBits) * 8;
     bytes = RefCountedMemory.allocate(byteCount);
     // flush/clear the existing memory.
     clear();
 }
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5903) Integer overflow in OffHeapBitSet when bloomfilter > 2GB

2013-08-20 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13745622#comment-13745622
 ] 

Vijay commented on CASSANDRA-5903:
--

Done! Thanks.

 Integer overflow in OffHeapBitSet when bloomfilter > 2GB
 

 Key: CASSANDRA-5903
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5903
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Taylan Develioglu
Assignee: Vijay
 Fix For: 1.2.9

 Attachments: 0001-CASSANDRA-5903.patch


 In org.apache.cassandra.utils.obs.OffHeapBitSet,
 byteCount overflows and causes an IllegalArgumentException in 
 Memory.allocate when the bloomfilter is > 2GB.
 Suggest changing byteCount to long.
 {code:title=OffHeapBitSet.java}
 public OffHeapBitSet(long numBits)
 {
     // OpenBitSet.bits2words calculation is there for backward compatibility.
     int byteCount = OpenBitSet.bits2words(numBits) * 8;
     bytes = RefCountedMemory.allocate(byteCount);
     // flush/clear the existing memory.
     clear();
 }
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5357) Query cache

2013-08-07 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13733151#comment-13733151
 ] 

Vijay commented on CASSANDRA-5357:
--

Hi Jonathan, the idea in the current implementation is as follows:

The QueryCache<QueryFilter, CF> is implemented on top of SerializedCache. It 
stores the Map's key as a RowCacheKey<RowKey, CFID> (same as the earlier 
RowCache), and the Map's value as a composite QueryCacheValue<[Query, ], 
ColumnFamily>.

For every new query that enters the system, we generate a RowCacheKey from 
the QueryFilter and fetch the QueryCacheValue, to check if the IFilter 
exists. If it does, we return the cached CF; else we get the QueryCacheValue 
(if a QCV exists; else we create a new one), add the IFilter to the QCV, and 
merge the results with the existing ColumnFamily (also in the QCV), which 
will in turn be serialized.

Advantages: 
1) Queries can overlap; there can be any number of queries, but the data 
will not be repeated across them.
2) When we want to invalidate, we just invalidate the RowKey and all the 
cached QueryCacheValues go away (avoids another Map for bookkeeping, and 
hence is a little more memory efficient).
3) There is a property the user can enable to cache the whole row no matter 
what the query is (though currently the patch adds the overhead of 
deserializing the identity filter, which can be fixed).

Of course there are disadvantages: 
1) The LRU algorithm is no longer really accurate. When a single query is 
hot we have no way of invalidating the other queries on the same row, since 
they all have the same hit rate (which is no worse than what we have 
currently).
2) With multiple types of queries on the same row (which is kind of an edge 
case) we might pull the whole row into memory (which could be mitigated by 
loading it incrementally or by holding an index in the filter; neither 
exists in the current patch).

There could be more that I overlooked... A toy sketch of the flow is below.
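
To make the merge-on-miss flow concrete, here is a self-contained toy 
sketch; every name and signature in it is an illustrative assumption, not 
the actual API in the patch:

{code}
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model: one cache entry per row, holding the set of filters already
// answered plus a single merged result, so overlapping queries share data.
class QueryCacheSketch<K, F, CF>
{
    static class Value<F, CF>
    {
        final Set<F> filters = new HashSet<>();
        CF data;
    }

    interface Merger<CF> { CF merge(CF existing, CF fresh); }

    private final Map<K, Value<F, CF>> cache = new HashMap<>();

    // hit only if this exact filter was cached before for this row
    CF get(K rowKey, F filter)
    {
        Value<F, CF> v = cache.get(rowKey);
        return (v != null && v.filters.contains(filter)) ? v.data : null;
    }

    // on a miss: record the filter and merge the freshly read data into
    // the existing ColumnFamily, so nothing is stored twice
    void put(K rowKey, F filter, CF freshlyRead, Merger<CF> merger)
    {
        Value<F, CF> v = cache.computeIfAbsent(rowKey, k -> new Value<>());
        v.filters.add(filter);
        v.data = (v.data == null) ? freshlyRead : merger.merge(v.data, freshlyRead);
    }

    // invalidating the row key drops every cached query for that row at once
    void invalidate(K rowKey)
    {
        cache.remove(rowKey);
    }
}
{code}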

 Query cache
 ---

 Key: CASSANDRA-5357
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5357
 Project: Cassandra
  Issue Type: Bug
Reporter: Jonathan Ellis
Assignee: Vijay

 I think that most people expect the row cache to act like a query cache, 
 because that's a reasonable model.  Caching the entire partition is, in 
 retrospect, not really reasonable, so it's not surprising that it catches 
 people off guard, especially given the confusion we've inflicted on ourselves 
 as to what a row constitutes.
 I propose replacing it with a true query cache.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5357) Query cache

2013-08-05 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13729343#comment-13729343
 ] 

Vijay commented on CASSANDRA-5357:
--

Hi Jonathan, 

I pushed a basic version of the query cache to 
https://github.com/Vijay2win/cassandra/commits/query_cache . I am not sure 
if we still need RowCacheSentinel, but the attached removes it. The attached 
patch also has an option query_cache: true (if set to false, the whole row 
will always be cached). It would be nice to have a fully off-heap Map/Cache 
(including the keys), but I am thinking of addressing that with a separate 
github project/patch (though IMHO, CHM may have contention in the segments 
for big caches).

Let me know what you think about the patch; it might need some more cleanup.

 Query cache
 ---

 Key: CASSANDRA-5357
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5357
 Project: Cassandra
  Issue Type: Bug
Reporter: Jonathan Ellis
Assignee: Vijay

 I think that most people expect the row cache to act like a query cache, 
 because that's a reasonable model.  Caching the entire partition is, in 
 retrospect, not really reasonable, so it's not surprising that it catches 
 people off guard, especially given the confusion we've inflicted on ourselves 
 as to what a row constitutes.
 I propose replacing it with a true query cache.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5826) Fix trigger directory detection code

2013-08-05 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13729782#comment-13729782
 ] 

Vijay commented on CASSANDRA-5826:
--

Hi Brandon, oops... isn't the directory found in conf? I can remove the RTE 
and make it log if the directory is not found.

 Fix trigger directory detection code
 

 Key: CASSANDRA-5826
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5826
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 2.0 beta 2
 Environment: OS X
Reporter: Aleksey Yeschenko
Assignee: Vijay
  Labels: triggers
 Fix For: 2.0 rc1

 Attachments: 0001-5826.patch


 At least when building from source, Cassandra determines the trigger 
 directory wrong. C* calculates the trigger directory as 'build/triggers' 
 instead of 'triggers'.
 FBUtilities.cassandraHomeDir() is to blame, and should be replaced with 
 something more robust.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5826) Fix trigger directory detection code

2013-08-05 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-5826:
-

Attachment: 0001-handle-trigger-non-existance.patch

Hi Brandon, attached; this handles an unreachable trigger directory.

 Fix trigger directory detection code
 

 Key: CASSANDRA-5826
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5826
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 2.0 beta 2
 Environment: OS X
Reporter: Aleksey Yeschenko
Assignee: Vijay
  Labels: triggers
 Fix For: 2.0 rc1

 Attachments: 0001-5826.patch, 0001-handle-trigger-non-existance.patch


 At least when building from source, Cassandra determines the trigger 
 directory wrong. C* calculates the trigger directory as 'build/triggers' 
 instead of 'triggers'.
 FBUtilities.cassandraHomeDir() is to blame, and should be replaced with 
 something more robust.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5826) Fix trigger directory detection code

2013-08-05 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13729907#comment-13729907
 ] 

Vijay commented on CASSANDRA-5826:
--

Hi Brandon, did the patch apply cleanly? 

{code}
File triggerDirectory = FBUtilities.cassandraTriggerDir();
if (triggerDirectory == null)
    return;
{code}

should avoid an NPE. I did test it and it worked fine for me.

 Fix trigger directory detection code
 

 Key: CASSANDRA-5826
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5826
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 2.0 beta 2
 Environment: OS X
Reporter: Aleksey Yeschenko
Assignee: Vijay
  Labels: triggers
 Fix For: 2.0 rc1

 Attachments: 0001-5826.patch, 0001-handle-trigger-non-existance.patch


 At least when building from source, Cassandra determines the trigger 
 directory wrong. C* calculates the trigger directory as 'build/triggers' 
 instead of 'triggers'.
 FBUtilities.cassandraHomeDir() is to blame, and should be replaced with 
 something more robust.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5826) Fix trigger directory detection code

2013-08-05 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-5826:
-

Attachment: 0001-handle-trigger-non-existance-v2.patch

Hi Brandon, fixed in v2.

 Fix trigger directory detection code
 

 Key: CASSANDRA-5826
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5826
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 2.0 beta 2
 Environment: OS X
Reporter: Aleksey Yeschenko
Assignee: Vijay
  Labels: triggers
 Fix For: 2.0 rc1

 Attachments: 0001-5826.patch, 
 0001-handle-trigger-non-existance.patch, 
 0001-handle-trigger-non-existance-v2.patch


 At least when building from source, Cassandra determines the trigger 
 directory wrong. C* calculates the trigger directory as 'build/triggers' 
 instead of 'triggers'.
 FBUtilities.cassandraHomeDir() is to blame, and should be replaced with 
 something more robust.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (CASSANDRA-5826) Fix trigger directory detection code

2013-08-05 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay resolved CASSANDRA-5826.
--

Resolution: Fixed

Committed, with nit. Thanks!

 Fix trigger directory detection code
 

 Key: CASSANDRA-5826
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5826
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 2.0 beta 2
 Environment: OS X
Reporter: Aleksey Yeschenko
Assignee: Vijay
  Labels: triggers
 Fix For: 2.0 rc1

 Attachments: 0001-5826.patch, 
 0001-handle-trigger-non-existance.patch, 
 0001-handle-trigger-non-existance-v2.patch


 At least when building from source, Cassandra determines the trigger 
 directory wrong. C* calculates the trigger directory as 'build/triggers' 
 instead of 'triggers'.
 FBUtilities.cassandraHomeDir() is to blame, and should be replaced with 
 something more robust.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5826) Fix trigger directory detection code

2013-08-05 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13730309#comment-13730309
 ] 

Vijay commented on CASSANDRA-5826:
--

Done, sorry for all the mess on a simple patch.

 Fix trigger directory detection code
 

 Key: CASSANDRA-5826
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5826
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 2.0 beta 2
 Environment: OS X
Reporter: Aleksey Yeschenko
Assignee: Vijay
  Labels: triggers
 Fix For: 2.0 rc1

 Attachments: 0001-5826.patch, 
 0001-handle-trigger-non-existance.patch, 
 0001-handle-trigger-non-existance-v2.patch


 At least when building from source, Cassandra determines the trigger 
 directory wrong. C* calculates the trigger directory as 'build/triggers' 
 instead of 'triggers'.
 FBUtilities.cassandraHomeDir() is to blame, and should be replaced with 
 something more robust.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5826) Fix trigger directory detection code

2013-08-02 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13727960#comment-13727960
 ] 

Vijay commented on CASSANDRA-5826:
--

{quote}
As long as we are not trying to isolate classloaders or anything 
{quote}

Actually we do that with triggers, similar to what Solr does for Tokenizer 
code etc (but not the same). 

For the record: you can place all of your dependencies in the trigger 
directory, except anything that Cassandra itself depends on. If the user 
builds with maven, all they need to do is mark cassandra-all as provided and 
place the jars in the trigger directory:

{code}
<dependency>
  <groupId>org.apache.cassandra</groupId>
  <artifactId>cassandra-all</artifactId>
  <version>2.0.0-beta2</version>
  <scope>provided</scope>
</dependency>
{code}

My understanding is that Java doesn't do nested classpath scanning on 
subdirectories, hence loading from conf was OK to do. But I understand it is 
kind of scary if someone places jars in conf instead.

 Fix trigger directory detection code
 

 Key: CASSANDRA-5826
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5826
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 2.0 beta 2
 Environment: OS X
Reporter: Aleksey Yeschenko
Assignee: Vijay
  Labels: triggers
 Fix For: 2.0 rc1

 Attachments: 0001-5826.patch


 At least when building from source, Cassandra determines the trigger 
 directory wrong. C* calculates the trigger directory as 'build/triggers' 
 instead of 'triggers'.
 FBUtilities.cassandraHomeDir() is to blame, and should be replaced with 
 something more robust.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5826) Fix trigger directory detection code

2013-07-31 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-5826:
-

Attachment: 0001-5826.patch

Attached a small patch moves the trigger directory into conf directory, hope it 
is fine. that way we can just search for the triggers directory in the class 
path (which is Conf). Thanks!

 Fix trigger directory detection code
 

 Key: CASSANDRA-5826
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5826
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 2.0 beta 2
 Environment: OS X
Reporter: Aleksey Yeschenko
Assignee: Vijay
  Labels: triggers
 Fix For: 2.0 rc1

 Attachments: 0001-5826.patch


 At least when building from source, Cassandra determines the trigger 
 directory wrong. C* calculates the trigger directory as 'build/triggers' 
 instead of 'triggers'.
 FBUtilities.cassandraHomeDir() is to blame, and should be replaced with 
 something more robust.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5826) Fix trigger directory detection code

2013-07-30 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13724240#comment-13724240
 ] 

Vijay commented on CASSANDRA-5826:
--

We probably have to change build.xml to copy the trigger directory into 
build, like what we do with the conf directory? I will add the above, and 
maybe also add it to the Debian package (in addition to adding a property to 
override the trigger directory's absolute path).
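
Something along these lines in build.xml, perhaps (the property and 
directory names here are guesses, not the final patch):

{code}
<!-- hypothetical build.xml fragment: mirror what is done for conf -->
<copy todir="${build.dir}/triggers">
    <fileset dir="triggers"/>
</copy>
{code}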

 Fix trigger directory detection code
 

 Key: CASSANDRA-5826
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5826
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 2.0 beta 2
 Environment: OS X
Reporter: Aleksey Yeschenko
Assignee: Vijay
  Labels: triggers

 At least when building from source, Cassandra determines the trigger 
 directory wrong. C* calculates the trigger directory as 'build/triggers' 
 instead of 'triggers'.
 FBUtilities.cassandraHomeDir() is to blame, and should be replaced with 
 something more robust.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5175) Unbounded (?) thread growth connecting to a removed node

2013-07-26 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721497#comment-13721497
 ] 

Vijay commented on CASSANDRA-5175:
--

Yes, there was another commit on top of the attached patch to fix the test 
cases, and yes, the logic has changed, since calling close() is now the only 
time we need to stop the thread.

Current code in the repo:
{code}
if (m == CLOSE_SENTINEL)
{
    disconnect();
    if (isStopped)
        break;
    continue;
}
{code}

 Unbounded (?) thread growth connecting to a removed node
 -

 Key: CASSANDRA-5175
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5175
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.8
 Environment: EC2, JDK 7u9, Ubuntu 12.04.1 LTS
Reporter: Janne Jalkanen
Assignee: Vijay
Priority: Minor
 Fix For: 1.1.10, 1.2.1

 Attachments: 0001-CASSANDRA-5175.patch


 The following lines started repeating every minute in the log file
 {noformat}
  INFO [GossipStage:1] 2013-01-19 19:35:43,929 Gossiper.java (line 831) 
 InetAddress /10.238.x.y is now dead.
  INFO [GossipStage:1] 2013-01-19 19:35:43,930 StorageService.java (line 1291) 
 Removing token 170141183460469231731687303715884105718 for /10.238.x.y
 {noformat}
 Also, I got about 3000 threads which all look like this:
 {noformat}
 Name: WRITE-/10.238.x.y
 State: WAITING on 
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@1bb65c0f
 Total blocked: 0  Total waited: 3
 Stack trace: 
  sun.misc.Unsafe.park(Native Method)
 java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
 java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
 org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:104)
 {noformat}
 A new thread seems to be created every minute, and they never go away.
 The endpoint in question had been a part of the cluster weeks ago, and the 
 node exhibiting the thread growth was added yesterday.
 Anyway, assassinating the endpoint in question stopped thread growth (but 
 kept the existing threads running), so this isn't a huge issue.  But I don't 
 think the thread count is supposed to be increasing like this...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-4573) HSHA doesn't handle large messages gracefully

2013-07-20 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13714546#comment-13714546
 ] 

Vijay commented on CASSANDRA-4573:
--

Peter, it looks like your issue is the client timing out because it didn't 
receive a response for 10 sec. Time to tune the heap or add more nodes.

Tyler, is this ticket still valid?

 HSHA doesn't handle large messages gracefully
 -

 Key: CASSANDRA-4573
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4573
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Tyler Hobbs
Assignee: Vijay
 Attachments: repro.py


 HSHA doesn't seem to enforce any kind of max message length, and when 
 messages are too large, it doesn't fail gracefully.
 With debug logs enabled, you'll see this:
 {{DEBUG 13:13:31,805 Unexpected state 16}}
 Which seems to mean that there's a SelectionKey that's valid, but isn't ready 
 for reading, writing, or accepting.
 Client-side, you'll get this thrift error (while trying to read a frame as 
 part of {{recv_batch_mutate}}):
 {{TTransportException: TSocket read 0 bytes}}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5574) Add trigger examples

2013-07-11 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-5574:
-

Affects Version/s: 2.0 beta 1
Fix Version/s: 2.0

 Add trigger examples 
 -

 Key: CASSANDRA-5574
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5574
 Project: Cassandra
  Issue Type: Test
Affects Versions: 2.0 beta 1
Reporter: Vijay
Assignee: Vijay
Priority: Trivial
 Fix For: 2.0

 Attachments: 0001-CASSANDRA-5574.patch


 Since 1311 is committed, we need some example code to show the power and 
 usage of triggers, similar to the ones in the examples directory.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5171) Save EC2Snitch topology information in system table

2013-07-10 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13704261#comment-13704261
 ] 

Vijay commented on CASSANDRA-5171:
--

PS: I only committed to 2.0 to be safe; let me know if you think otherwise. 
Thanks!

 Save EC2Snitch topology information in system table
 ---

 Key: CASSANDRA-5171
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5171
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.1
 Environment: EC2
Reporter: Vijay
Assignee: Vijay
Priority: Critical
 Fix For: 2.0

 Attachments: 0001-CASSANDRA-5171.patch, 0001-CASSANDRA-5171-v2.patch


 EC2Snitch currently waits for the Gossip information to learn the cluster 
 topology every time we restart. It would be nice to use the already 
 available system table info, similar to GPFS.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5171) Save EC2Snitch topology information in system table

2013-07-10 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-5171:
-

Fix Version/s: (was: 1.2.7)
   2.0

 Save EC2Snitch topology information in system table
 ---

 Key: CASSANDRA-5171
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5171
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.1
 Environment: EC2
Reporter: Vijay
Assignee: Vijay
Priority: Critical
 Fix For: 2.0

 Attachments: 0001-CASSANDRA-5171.patch, 0001-CASSANDRA-5171-v2.patch


 EC2Snitch currently waits for the Gossip information to learn the cluster 
 topology every time we restart. It would be nice to use the already 
 available system table info, similar to GPFS.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5171) Save EC2Snitch topology information in system table

2013-07-10 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13704784#comment-13704784
 ] 

Vijay commented on CASSANDRA-5171:
--

Thanks Jason!

 Save EC2Snitch topology information in system table
 ---

 Key: CASSANDRA-5171
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5171
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.1
 Environment: EC2
Reporter: Vijay
Assignee: Vijay
Priority: Critical
 Fix For: 2.0

 Attachments: 0001-CASSANDRA-5171.patch, 0001-CASSANDRA-5171-v2.patch


 EC2Snitch currently waits for the Gossip information to learn the cluster 
 topology every time we restart. It would be nice to use the already 
 available system table info, similar to GPFS.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

