[jira] [Issue Comment Edited] (CASSANDRA-4138) Add varint encoding to Serializing Cache

2012-04-16 Thread Vijay (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255193#comment-13255193
 ] 

Vijay edited comment on CASSANDRA-4138 at 4/16/12 11:57 PM:


Hi Pavel, attached patch has recommended changes except 

{quote}
I think the DBContants class is now should be changed to only share 
sizeof(type) methods and become something like DBContants.{native, 
vint}.sizeof(type)
{quote}

I will mark it private once the parent ticket (Messaging and SSTable formats) is 
complete; currently we have it called in other places too.
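For readers unfamiliar with the vint idea being discussed, here is a minimal, hypothetical 
sizeof-style helper (invented names, not the patch's DBConstants API): the encoded size of a 
value depends on its magnitude, so the many small lengths and timestamps in a cached row 
shrink from 4 or 8 bytes down to 1 or 2.

{code}
// Hypothetical sizeof() helper illustrating the vint idea (names are invented,
// not the patch's DBConstants code).
public final class VIntSizes
{
    // Number of bytes a 7-bits-per-byte encoding of value would occupy.
    public static int sizeofVLong(long value)
    {
        int size = 1;
        while ((value & ~0x7FL) != 0)
        {
            size++;
            value >>>= 7;
        }
        return size;
    }

    public static void main(String[] args)
    {
        System.out.println(sizeofVLong(100));     // 1 byte instead of 8
        System.out.println(sizeofVLong(100000));  // 3 bytes instead of 8
    }
}
{code}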

  was (Author: vijay2...@yahoo.com):
Hi Pavel, attached patch has recommended changes except 

{comment}
I think the DBContants class is now should be changed to only share 
sizeof(type) methods and become something like DBContants.{native, 
vint}.sizeof(type)
{comment}

I will mark it private once parent ticket is complete (Messaging and SSTable 
formats), currently we have it called in other places too.
  
 Add varint encoding to Serializing Cache
 

 Key: CASSANDRA-4138
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4138
 Project: Cassandra
  Issue Type: Sub-task
  Components: Core
Affects Versions: 1.2
Reporter: Vijay
Assignee: Vijay
Priority: Minor
 Fix For: 1.2

 Attachments: 0001-CASSANDRA-4138-Take1.patch, 
 0001-CASSANDRA-4138-V2.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-4138) Add varint encoding to Serializing Cache

2012-04-16 Thread Vijay (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255193#comment-13255193
 ] 

Vijay edited comment on CASSANDRA-4138 at 4/16/12 11:58 PM:


Hi Pavel, attached patch has recommended changes except 

 I think the DBContants class is now should be changed to only share 
 sizeof(type) methods and become something like DBContants.{native, 
 vint}.sizeof(type)

I will mark it private once the parent ticket (Messaging and SSTable formats) is 
complete; currently we have it called in other places too.

  was (Author: vijay2...@yahoo.com):
Hi Pavel, attached patch has recommended changes except 

{quote}
I think the DBContants class is now should be changed to only share 
sizeof(type) methods and become something like DBContants.{native, 
vint}.sizeof(type)
{quote}

I will mark it private once parent ticket is complete (Messaging and SSTable 
formats), currently we have it called in other places too.
  
 Add varint encoding to Serializing Cache
 

 Key: CASSANDRA-4138
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4138
 Project: Cassandra
  Issue Type: Sub-task
  Components: Core
Affects Versions: 1.2
Reporter: Vijay
Assignee: Vijay
Priority: Minor
 Fix For: 1.2

 Attachments: 0001-CASSANDRA-4138-Take1.patch, 
 0001-CASSANDRA-4138-V2.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-4140) Build stress classes in a location that allows tools/stress/bin/stress to find them

2012-04-13 Thread Vijay (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253918#comment-13253918
 ] 

Vijay edited comment on CASSANDRA-4140 at 4/14/12 1:06 AM:
---

Done!

  was (Author: vijay2...@yahoo.com):
Done and tested!
  
 Build stress classes in a location that allows tools/stress/bin/stress to 
 find them
 ---

 Key: CASSANDRA-4140
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4140
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Affects Versions: 1.2
Reporter: Nick Bailey
Assignee: Vijay
Priority: Trivial
 Fix For: 1.2

 Attachments: 0001-CASSANDRA-4140-v2.patch, 0001-CASSANDRA-4140.patch


 Right now its hard to run stress from a checkout of trunk. You need to do 
 'ant artifacts' and then run the stress tool in the generated artifacts.
 A discussion on irc came up with the proposal to just move stress to the main 
 jar, but the stress/stressd bash scripts in bin/, and drop the tools 
 directory altogether. It will be easier for users to find that way and will 
 make running stress from a checkout much easier.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-2635) make cache skipping optional

2012-04-12 Thread Vijay (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252775#comment-13252775
 ] 

Vijay edited comment on CASSANDRA-2635 at 4/12/12 7:49 PM:
---

+1 for the rest of the patch, 

how about the following name and comments?

{code}
# The following setting populates the page cache on memtable flush and 
compaction
# WARNING: Enable this setting only node's data-size can fit in memory.
populate_cache_on_flush: false
{code}

  was (Author: vijay2...@yahoo.com):
+1 for the rest of the patch, 

how about the following name and comments?

# The following setting populates the page cache on memtable flush and 
compaction
# WARNING: Enable this setting only node's data-size can fit in memory.
populate_cache_on_flush: false
  
 make cache skipping optional
 

 Key: CASSANDRA-2635
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2635
 Project: Cassandra
  Issue Type: Improvement
Reporter: Peter Schuller
Assignee: Harish Doddi
Priority: Minor
 Attachments: CASSANDRA-2635-075.txt, CASSANDRA-2635-trunk-1.txt, 
 CASSANDRA-2635-trunk.txt


 We've applied this patch locally in order to turn off page skipping; not 
 completely but only for compaction/repair situations where it can be directly 
 detrimental in the sense of causing data to become cold even though your 
 entire data set fits in memory.
 It's better than completely disabling DONTNEED because the cache skipping 
 does make sense and has no relevant (that I can see) detrimental effects in 
 some cases, like when dumping caches.
 The patch is against 0.7.5 right now but if the change is desired I can make 
 a patch for trunk. Also, the name of the configuration option is dubious 
 since saying 'false' does not actually turn it off completely. I wasn't able 
 to figure out a good name that conveyed the functionality in a short brief 
 name however.
 A related concern as discussed in CASSANDRA-1902 is that the cache skipping 
 isn't fsync:ing and so won't work reliably on writes. If the feature is to be 
 retained that's something to fix in a different ticket.
 A question is also whether to retain the default to true or change it to 
 false. I'm kinda leaning to false since it's detrimental in the easy cases 
 of little data. In big cases with lots of data people will have to think 
 and tweak anyway, so better to put the burden on that end.
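As a rough illustration of what such a flag would control, here is a hedged sketch (not the 
attached patch; it assumes JNA and the Linux value of POSIX_FADV_DONTNEED): when the option 
is off, the pages just written by a flush or compaction are hinted out of the OS page cache 
so hot read data is not evicted.

{code}
// Hedged sketch of gating page-cache skipping behind a config flag
// (illustrative only; not the attached patch).
import com.sun.jna.Library;
import com.sun.jna.Native;

public class CacheSkippingSketch
{
    public interface CLib extends Library
    {
        CLib INSTANCE = (CLib) Native.loadLibrary("c", CLib.class);
        int posix_fadvise(int fd, long offset, long len, int advice);
    }

    private static final int POSIX_FADV_DONTNEED = 4; // Linux-specific value

    // Called after a flush/compaction finishes writing a file.
    public static void maybeSkipCache(int fd, long length, boolean populateCacheOnFlush)
    {
        if (populateCacheOnFlush)
            return; // keep the newly written pages in the OS page cache
        // Otherwise hint the kernel to drop them so hot read data is not evicted.
        CLib.INSTANCE.posix_fadvise(fd, 0, length, POSIX_FADV_DONTNEED);
    }
}
{code}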

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-2635) make cache skipping optional

2012-04-12 Thread Vijay (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252775#comment-13252775
 ] 

Vijay edited comment on CASSANDRA-2635 at 4/12/12 7:52 PM:
---

+1 for the rest of the patch, 

how about the following name and comments?

{code}
# The following setting populates the page cache on memtable flush and 
compaction
# WARNING: Enable this setting only when the node's data fits in memory.
populate_cache_on_flush: false
{code}

  was (Author: vijay2...@yahoo.com):
+1 for the rest of the patch, 

how about the following name and comments?

{code}
# The following setting populates the page cache on memtable flush and 
compaction
# WARNING: Enable this setting only node's data-size can fit in memory.
populate_cache_on_flush: false
{code}
  
 make cache skipping optional
 

 Key: CASSANDRA-2635
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2635
 Project: Cassandra
  Issue Type: Improvement
Reporter: Peter Schuller
Assignee: Harish Doddi
Priority: Minor
 Attachments: CASSANDRA-2635-075.txt, CASSANDRA-2635-trunk-1.txt, 
 CASSANDRA-2635-trunk.txt


 We've applied this patch locally in order to turn off page skipping; not 
 completely but only for compaction/repair situations where it can be directly 
 detrimental in the sense of causing data to become cold even though your 
 entire data set fits in memory.
 It's better than completely disabling DONTNEED because the cache skipping 
 does make sense and has no relevant (that I can see) detrimental effects in 
 some cases, like when dumping caches.
 The patch is against 0.7.5 right now but if the change is desired I can make 
 a patch for trunk. Also, the name of the configuration option is dubious 
 since saying 'false' does not actually turn it off completely. I wasn't able 
 to figure out a good name that conveyed the functionality in a short brief 
 name however.
 A related concern as discussed in CASSANDRA-1902 is that the cache skipping 
 isn't fsync:ing and so won't work reliably on writes. If the feature is to be 
 retained that's something to fix in a different ticket.
 A question is also whether to retain the default to true or change it to 
 false. I'm kinda leaning to false since it's detrimental in the easy cases 
 of little data. In big cases with lots of data people will have to think 
 and tweak anyway, so better to put the burden on that end.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-4138) Add varint encoding to Serializing Cache

2012-04-12 Thread Vijay (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253028#comment-13253028
 ] 

Vijay edited comment on CASSANDRA-4138 at 4/13/12 1:04 AM:
---

The attached patch is the first attempt to add VarInt encoding to Cassandra. 

It saves us around 10% of the memory compared to the normal DataInputStream 
(based on a simple test via the stress tool).
Once this gets committed I will work on the rest of the pieces.
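For context, here is a minimal sketch of a variable-length long encoding (a generic 
7-bits-per-byte scheme; the patch's actual wire format may differ): small values occupy one 
or two bytes instead of a fixed eight, which is where the memory savings come from.

{code}
// Generic varint-style encoding sketch (illustrative; not the patch's exact format).
import java.io.*;

public class VLongCodecSketch
{
    // Writes the value 7 bits at a time; the high bit of each byte means "more follows".
    public static void writeVLong(long value, DataOutput out) throws IOException
    {
        while ((value & ~0x7FL) != 0)
        {
            out.writeByte((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.writeByte((int) value);
    }

    public static long readVLong(DataInput in) throws IOException
    {
        long value = 0;
        int shift = 0;
        byte b;
        do
        {
            b = in.readByte();
            value |= (long) (b & 0x7F) << shift;
            shift += 7;
        }
        while ((b & 0x80) != 0);
        return value;
    }

    public static void main(String[] args) throws IOException
    {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        writeVLong(300, new DataOutputStream(baos));
        System.out.println("300 encodes to " + baos.size() + " bytes instead of 8");
    }
}
{code}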

  was (Author: vijay2...@yahoo.com):
Attached patch is the first attempt to add VarInt Encoding to cassandra. 

It save's us around 10% of the memory compared to normal DataInputStream.

Once this gets committed i will work on the rest of the pieces.
  
 Add varint encoding to Serializing Cache
 

 Key: CASSANDRA-4138
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4138
 Project: Cassandra
  Issue Type: Sub-task
  Components: Core
Affects Versions: 1.2
Reporter: Vijay
Assignee: Vijay
Priority: Minor
 Fix For: 1.2

 Attachments: 0001-CASSANDRA-4138-Take1.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-4100) Make scrub and cleanup operations throttled

2012-04-04 Thread Vijay (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13246370#comment-13246370
 ] 

Vijay edited comment on CASSANDRA-4100 at 4/4/12 3:41 PM:
--

 I think Throttle object in CompcationController should be non-static since 
 compactions may run in parallel.
Exactly, that's why static is better. Parallel compaction is not a problem per 
se (ParallelCompactionIterable.getReduced() will take care of it), but 
compactions running one after the other (a lot of small compactions) are.

Let me know if everything else is OK; I will rebase to 1.0.10 and move away from 
static (I am OK either way), if needed. Thanks!

  was (Author: vijay2...@yahoo.com):
 I think Throttle object in CompcationController should be non-static 
since compactions may run in parallel.
Exactly thats why static is better, Parallel compaction is not a problem per 
say (ParallelCompactionIterable.getReduced() will take care of it), but 
compaction running one after the other (lot of small compactions).

Let me know if everything else is ok i will rebase to 1.0.10, if needed. Thanks!
  
 Make scrub and cleanup operations throttled
 ---

 Key: CASSANDRA-4100
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4100
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Vijay
Assignee: Vijay
Priority: Minor
  Labels: compaction
 Fix For: 1.0.10

 Attachments: 0001-CASSANDRA-4100.patch


 Looks like scrub and cleanup operations are not throttled; it would be nice 
 to throttle them, else we are likely to run into IO issues while running them 
 on a live cluster.
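As a rough sketch of the kind of throttling being asked for (hypothetical, using Guava's 
RateLimiter purely for brevity; the comments above discuss reusing the existing compaction 
Throttle instead): every scrub/cleanup/compaction thread acquires from one shared limiter, 
so the configured MB/s bound applies to their combined IO whether they run in parallel or 
back to back.

{code}
// Hedged illustration of a shared byte-rate throttle (not the patch's Throttle class).
import com.google.common.util.concurrent.RateLimiter;

public class SharedIoThrottle
{
    // One shared limiter, e.g. 16 MB/s, across all scrub/cleanup/compaction threads.
    private static final RateLimiter LIMITER = RateLimiter.create(16d * 1024 * 1024);

    // Call after reading/writing a chunk; blocks until the shared budget allows it.
    public static void throttle(int bytesJustProcessed)
    {
        if (bytesJustProcessed > 0)
            LIMITER.acquire(bytesJustProcessed);
    }
}
{code}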

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-3997) Make SerializingCache Memory Pluggable

2012-04-02 Thread Vijay (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244479#comment-13244479
 ] 

Vijay edited comment on CASSANDRA-3997 at 4/2/12 7:20 PM:
--

Segfaults happen in multiple places (opening a file, accessing malloc, while 
calling free, and in a lot of unrelated cases)... 
Unless we open the JDK source code and figure out how it is structured, it is hard 
to say exactly when it can fail (let me know if you want to take a look at the 
hs_err*.log). 

On the bright side, at least we can isolate this by calling via JNI, and we don't 
see the issue when loading JEMalloc via LD_LIBRARY_PATH. In v2 I removed the 
synchronization; I have also attached it here (please note the yaml setting is not 
included, just to hide it for now). Thanks!
Note: the jemalloc 2.2.5 release works fine, and so does the git/dev branch.

  was (Author: vijay2...@yahoo.com):
Segfaults happen in multiple places (opening a file, accessing malloc, 
while calling free, and in a lot of unrelated cases)... 
Unless we open JDK source code and figure out how it is structured it is hard 
to say when exactly it can fails (Let me know if you want to take a look at the 
hs_err*.log). 

In the bright side at least we can isolate this by calling via JNI. In v2 I 
removed the synchronization, i have also attached it here (Plz note the yaml 
setting is not included just to hide it for now). Thanks!
Note: jemalloc 2.2.5 release works fine and so as the git/dev branch.
  
 Make SerializingCache Memory Pluggable
 --

 Key: CASSANDRA-3997
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3997
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Vijay
Assignee: Vijay
Priority: Minor
  Labels: cache
 Fix For: 1.2

 Attachments: 0001-CASSANDRA-3997-v2.patch, 0001-CASSANDRA-3997.patch, 
 jna.zip


 Serializing cache uses native malloc and free by making FM pluggable, users 
 will have a choice of gcc malloc, TCMalloc or JEMalloc as needed. 
 Initial tests shows less fragmentation in JEMalloc but the only issue with it 
 is that (both TCMalloc and JEMalloc) are kind of single threaded (at-least 
 they crash in my test otherwise).
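For readers who have not seen the JNA approach, here is a hedged sketch of what a pluggable 
native-allocator binding could look like (invented names; not the patch's API). Which 
allocator actually backs malloc/free depends on what is loaded via LD_LIBRARY_PATH or 
LD_PRELOAD.

{code}
// Hedged sketch of a native allocator binding via JNA (illustrative only).
import com.sun.jna.Library;
import com.sun.jna.Native;
import com.sun.jna.Pointer;

public class NativeAllocatorSketch
{
    public interface AllocLib extends Library
    {
        Pointer malloc(long size);
        void free(Pointer ptr);
    }

    public static void main(String[] args)
    {
        // Binds against libc symbols; jemalloc/TCMalloc take over if preloaded.
        AllocLib lib = (AllocLib) Native.loadLibrary("c", AllocLib.class);
        Pointer p = lib.malloc(1024);
        try
        {
            p.setByte(0, (byte) 42); // use the off-heap region
        }
        finally
        {
            lib.free(p);             // must free explicitly; the GC will not
        }
    }
}
{code}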

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-4099) IncomingTCPConnection recognizes from by doing socket.getInetAddress() instead of BroadCastAddress

2012-03-28 Thread Vijay (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13240813#comment-13240813
 ] 

Vijay edited comment on CASSANDRA-4099 at 3/28/12 11:07 PM:


Thanks Brandon, CASSANDRA-4101 looks like a better solution, but it is not only 
streaming that sets the version; gossip or any communication sets it too. The 
following does it:

{code}
from = msg.getFrom(); // why? see = CASSANDRA-4099
if (version > MessagingService.current_version)
{
    // save the endpoint so gossip will reconnect to it
    Gossiper.instance.addSavedEndpoint(from);
    logger.info("Received " + (isStream ? "streaming " : "") +
                "connection from newer protocol version. Ignoring");
}
else if (msg != null)
{
    Gossiper.instance.setVersion(from, version);
    logger.debug("set version for {} to {}", from, version);
}
{code}

  was (Author: vijay2...@yahoo.com):
Thanks Brandon, CASSANDRA-4101 looks like a better solution but not only 
does the Streaming sets the version Gossip or any connunication does set it, 
the following does it

code
from = msg.getFrom(); // why? see = CASSANDRA-4099
if (version > MessagingService.current_version)
{
    // save the endpoint so gossip will reconnect to it
    Gossiper.instance.addSavedEndpoint(from);
    logger.info("Received " + (isStream ? "streaming " : "") +
                "connection from newer protocol version. Ignoring");
}
else if (msg != null)
{
    Gossiper.instance.setVersion(from, version);
    logger.debug("set version for {} to {}", from, version);
}
/code
  
 IncomingTCPConnection recognizes from by doing socket.getInetAddress() 
 instead of BroadCastAddress
 --

 Key: CASSANDRA-4099
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4099
 Project: Cassandra
  Issue Type: Bug
Reporter: Vijay
Assignee: Vijay
Priority: Minor
 Attachments: 0001-CASSANDRA-4099.patch


 Change this.from = socket.getInetAddress() to understand the broadcast IP. The 
 problem is we don't know it until the first packet is received; this ticket is 
 to work around the problem until it reads the first packet.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-3997) Make SerializingCache Memory Pluggable

2012-03-24 Thread Vijay (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237722#comment-13237722
 ] 

Vijay edited comment on CASSANDRA-3997 at 3/24/12 11:15 PM:


Have an update:

Jason Evans says: LD_PRELOAD'ing jemalloc should be okay as long as the JVM 
doesn't statically link a different malloc implementation.  I expect that if it 
isn't safe, you'll experience crashes quite early on, so give it a try and see 
what happens.

I have also confirmed that Unsafe isn't statically linked to the native malloc by 
adding a printf to the malloc C code which basically counts the number of 
times it is called. Looks like PRELOAD is a better option. I am running a 
long-running test and will close this ticket once it is successful. Thanks!



  was (Author: vijay2...@yahoo.com):
Have an update:

Jason Evan's says: LD_PRELOAD'ing jemalloc should be okay as long as the JVM 
doesn't statically link a different malloc implementation.  I expect that if it 
isn't safe, you'll experience crashes quite early on, so give it a try and see 
what happens.

I have also conformed the unsafe isn't statically linked to native Malloc by 
adding a printf in the malloc c code which basically count's the number of 
times it is called. Looks like PRELOAD is a better option. I am running a long 
running test and will close this ticket once it is successful. Thanks!


  
 Make SerializingCache Memory Pluggable
 --

 Key: CASSANDRA-3997
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3997
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Vijay
Assignee: Vijay
Priority: Minor
  Labels: cache
 Fix For: 1.2

 Attachments: 0001-CASSANDRA-3997.patch, jna.zip


 Serializing cache uses native malloc and free by making FM pluggable, users 
 will have a choice of gcc malloc, TCMalloc or JEMalloc as needed. 
 Initial tests shows less fragmentation in JEMalloc but the only issue with it 
 is that (both TCMalloc and JEMalloc) are kind of single threaded (at-least 
 they crash in my test otherwise).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-3690) Streaming CommitLog backup

2012-03-08 Thread Vijay (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13225656#comment-13225656
 ] 

Vijay edited comment on CASSANDRA-3690 at 3/8/12 11:12 PM:
---

Hi Jonathan,

The attached patch does exactly what we discussed here. It's almost the same as 
PostgreSQL :) 

In addition we can start the node with -Dcassandra.join_ring=false and then use 
JMX to restore files one by one.

Please let me know.

  was (Author: vijay2...@yahoo.com):
Hi Jonathan,

Attached patch does exactly what we discussed here. Its almost the same as 
Postgress :) 

In addition we can start the node with -Dcassandra.join_ring=false and then use 
JMX to restore files one by one via JMX.

Plz let me know.
  
 Streaming CommitLog backup
 --

 Key: CASSANDRA-3690
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3690
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Vijay
Assignee: Vijay
Priority: Minor
 Fix For: 1.1.1

 Attachments: 0001-CASSANDRA-3690-v2.patch, 
 0001-Make-commitlog-recycle-configurable.patch, 
 0002-support-commit-log-listener.patch, 0003-helper-jmx-methods.patch, 
 0004-external-commitlog-with-sockets.patch, 
 0005-cmmiting-comments-to-yaml.patch


 Problems with the current SST backups
 1) The current backup doesn't allow us to restore point in time (within a SST)
 2) Current SST implementation needs the backup to read from the filesystem 
 and hence additional IO during the normal operational Disks
 3) in 1.0 we have removed the flush interval and size when the flush will be 
 triggered per CF, 
   For some use cases where there is less writes it becomes 
 increasingly difficult to time it right.
 4) Use cases which needs BI which are external (Non cassandra), needs the 
 data in regular intervals than waiting for longer or unpredictable intervals.
 Disadvantages of the new solution
 1) Over head in processing the mutations during the recover phase.
 2) More complicated solution than just copying the file to the archive.
 Additional advantages:
 Online and offline restore.
 Close to live incremental backup.
 Note: If the listener agent gets restarted, it is the agents responsibility 
 to Stream the files missed or incomplete.
 There are 3 Options in the initial implementation:
 1) Backup - Once a socket is connected we will switch the commit log and 
 send new updates via the socket.
 2) Stream - will take the absolute path of the file and will read the file 
 and send the updates via the socket.
 3) Restore - this will get the serialized bytes and apply the mutation.
 Side NOTE: (Not related to this patch as such) The agent which will take 
 incremental backup is planned to be open sourced soon (Name: Priam).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-3997) Make SerializingCache Memory Pluggable

2012-03-07 Thread Vijay (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13224533#comment-13224533
 ] 

Vijay edited comment on CASSANDRA-3997 at 3/7/12 5:40 PM:
--

Ohhh, sorry for the confusion. 
JEMalloc's case: malloc/free should be done by any one thread at a time. The 
test had 100 threads doing malloc/free, but only one will actually malloc/free 
at a time, and the time taken shows the raw speed.
TCMalloc's case: only one thread should be doing malloc and free. (Even after 
this it was crashing randomly because of illegal memory access, hence I said 
JEMalloc hasn't crashed.)

The test code does exactly the above. The implementation should deal with it 
and avoid contending for malloc and free with multiple threads. Once we deal 
with that, it works well.
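A hedged sketch of the workaround described above (illustrative; not the attached test or 
patch): funnel every malloc/free through one lock so the native allocator never sees 
concurrent calls, even though many cache threads allocate.

{code}
// Hedged sketch: serialize native malloc/free behind a single lock.
import com.sun.jna.Library;
import com.sun.jna.Native;
import com.sun.jna.Pointer;
import java.util.concurrent.locks.ReentrantLock;

public class SerializedAllocatorSketch
{
    public interface AllocLib extends Library
    {
        Pointer malloc(long size);
        void free(Pointer ptr);
    }

    private static final AllocLib LIB = (AllocLib) Native.loadLibrary("c", AllocLib.class);
    private static final ReentrantLock LOCK = new ReentrantLock();

    // All threads funnel through one lock, so the native allocator never sees
    // concurrent malloc/free calls even though many cache threads allocate.
    public static Pointer allocate(long size)
    {
        LOCK.lock();
        try { return LIB.malloc(size); }
        finally { LOCK.unlock(); }
    }

    public static void release(Pointer ptr)
    {
        LOCK.lock();
        try { LIB.free(ptr); }
        finally { LOCK.unlock(); }
    }
}
{code}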

  was (Author: vijay2...@yahoo.com):
Ohhh sorry for the confusion. 
JEMAlloc's case: The Malloc/Free should be one only be done by any one thread 
at a time. The test had 100 Threads doing malloc/free but only one will 
actually malloc/free at a time and the Time taken shows the raw speed.
TCMalloc's case: One thread should be malloc and doing free. (Even making this 
single threaded it was crashing randomly because of illegal memory access 
errors, hence i said JEMalloc hasnt crashed).

The test code does exactly the above The implementation should deal with it 
and avoid contending for malloc and free with multiple threads. Once we deal 
with it, it works well.
  
 Make SerializingCache Memory Pluggable
 --

 Key: CASSANDRA-3997
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3997
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Vijay
Assignee: Vijay
Priority: Minor
  Labels: cache
 Fix For: 1.2

 Attachments: jna.zip


 Serializing cache uses native malloc and free by making FM pluggable, users 
 will have a choice of gcc malloc, TCMalloc or JEMalloc as needed. 
 Initial tests shows less fragmentation in JEMalloc but the only issue with it 
 is that (both TCMalloc and JEMalloc) are kind of single threaded (at-least 
 they crash in my test otherwise).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-3997) Make SerializingCache Memory Pluggable

2012-03-04 Thread Vijay (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13222039#comment-13222039
 ] 

Vijay edited comment on CASSANDRA-3997 at 3/4/12 10:26 PM:
---

Attached are the test classes used for the test.

Results on CentOS:

nowiki
[vijay_tcasstest@vijay_tcass-i-a91ee8cd ~]$ 
/etc/alternatives/jre_1.7.0/bin/java -Djava.library.path=/usr/local/lib/ -cp 
jna.jar:/apps/nfcassandra_server/lib/*:. com.sun.jna.MallocAllocator 5 
200
 total   used   free sharedbuffers cached
Mem:  71688220   26049380   45638840  0 169116 996172
-/+ buffers/cache:   24884092   46804128
Swap:0  0  0
 Starting Test! 
Total bytes read: 101422934016
Time taken: 25407
 total   used   free sharedbuffers cached
Mem:  71688220   31981924   39706296  0 169116 996312
-/+ buffers/cache:   30816496   40871724
Swap:0  0  0
 ending Test! 
[vijay_tcasstest@vijay_tcass-i-a91ee8cd ~]$ export 
LD_LIBRARY_PATH=/usr/local/lib/
[vijay_tcasstest@vijay_tcass-i-a91ee8cd ~]$ 
/etc/alternatives/jre_1.7.0/bin/java -Djava.library.path=/usr/local/lib/ -cp 
jna.jar:/apps/nfcassandra_server/lib/*:. com.sun.jna.TCMallocAllocator 5 
200
 total   used   free sharedbuffers cached
Mem:  71688220   26054620   45633600  0 169128 996228
-/+ buffers/cache:   24889264   46798956
Swap:0  0  0
 Starting Test! 
Total bytes read: 101304894464
Time taken: 46387
 total   used   free sharedbuffers cached
Mem:  71688220   28535136   43153084  0 169128 996436
-/+ buffers/cache:   27369572   44318648
Swap:0  0  0
 ending Test! 
[vijay_tcasstest@vijay_tcass-i-a91ee8cd ~]$ export 
LD_LIBRARY_PATH=~/jemalloc-2.2.5/lib/ 
[vijay_tcasstest@vijay_tcass-i-a91ee8cd ~]$ 
/etc/alternatives/jre_1.7.0/bin/java -Djava.library.path=~/jemalloc-2.2.5/lib/ 
-cp jna.jar:/apps/nfcassandra_server/lib/*:. com.sun.jna.JEMallocAllocator 
5 200
 total   used   free sharedbuffers cached
Mem:  71688220   26060604   45627616  0 169128 996300
-/+ buffers/cache:   24895176   46793044
Swap:0  0  0
 Starting Test! 
Total bytes read: 101321734144
Time taken: 29937
 total   used   free sharedbuffers cached
Mem:  71688220   28472436   43215784  0 169128 996440
-/+ buffers/cache:   27306868   44381352
Swap:0  0  0
 ending Test! 
[vijay_tcasstest@vijay_tcass-i-a91ee8cd ~]$ 

/nowiki

  was (Author: vijay2...@yahoo.com):
Attached is the test classes used for the test.

Results on CentOS:

[vijay_tcasstest@vijay_tcass-i-a91ee8cd ~]$ 
/etc/alternatives/jre_1.7.0/bin/java -Djava.library.path=/usr/local/lib/ -cp 
jna.jar:/apps/nfcassandra_server/lib/*:. com.sun.jna.MallocAllocator 5 
200
 total   used   free sharedbuffers cached
Mem:  71688220   26049380   45638840  0 169116 996172
-/+ buffers/cache:   24884092   46804128
Swap:0  0  0
 Starting Test! 
Total bytes read: 101422934016
Time taken: 25407
 total   used   free sharedbuffers cached
Mem:  71688220   31981924   39706296  0 169116 996312
-/+ buffers/cache:   30816496   40871724
Swap:0  0  0
 ending Test! 
[vijay_tcasstest@vijay_tcass-i-a91ee8cd ~]$ export 
LD_LIBRARY_PATH=/usr/local/lib/
[vijay_tcasstest@vijay_tcass-i-a91ee8cd ~]$ 
/etc/alternatives/jre_1.7.0/bin/java -Djava.library.path=/usr/local/lib/ -cp 
jna.jar:/apps/nfcassandra_server/lib/*:. com.sun.jna.TCMallocAllocator 5 
200
 total   used   free sharedbuffers cached
Mem:  71688220   26054620   45633600  0 169128 996228
-/+ buffers/cache:   24889264   46798956
Swap:0  0  0
 Starting Test! 
Total bytes read: 101304894464
Time taken: 46387
 total   used   free sharedbuffers cached
Mem:  71688220   28535136   43153084  0 169128 996436
-/+ buffers/cache:   27369572   44318648
Swap:0  0  0
 ending Test! 
[vijay_tcasstest@vijay_tcass-i-a91ee8cd ~]$ export 
LD_LIBRARY_PATH=~/jemalloc-2.2.5/lib/ 
[vijay_tcasstest@vijay_tcass-i-a91ee8cd ~]$ 
/etc/alternatives/jre_1.7.0/bin/java -Djava.library.path=~/jemalloc-2.2.5/lib/ 
-cp jna.jar:/apps/nfcassandra_server/lib/*:. com.sun.jna.JEMallocAllocator 
5 200
 total   used   free   

[jira] [Issue Comment Edited] (CASSANDRA-3997) Make SerializingCache Memory Pluggable

2012-03-04 Thread Vijay (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13222039#comment-13222039
 ] 

Vijay edited comment on CASSANDRA-3997 at 3/4/12 10:28 PM:
---

Attached are the test classes used for the test.

Results on CentOS:

{noformat}
[vijay_tcasstest@vijay_tcass-i-a91ee8cd ~]$ 
/etc/alternatives/jre_1.7.0/bin/java -Djava.library.path=/usr/local/lib/ -cp 
jna.jar:/apps/nfcassandra_server/lib/*:. com.sun.jna.MallocAllocator 5 
200
 total   used   free sharedbuffers cached
Mem:  71688220   26049380   45638840  0 169116 996172
-/+ buffers/cache:   24884092   46804128
Swap:0  0  0
 Starting Test! 
Total bytes read: 101422934016
Time taken: 25407
 total   used   free sharedbuffers cached
Mem:  71688220   31981924   39706296  0 169116 996312
-/+ buffers/cache:   30816496   40871724
Swap:0  0  0
 ending Test! 
[vijay_tcasstest@vijay_tcass-i-a91ee8cd ~]$ export 
LD_LIBRARY_PATH=/usr/local/lib/
[vijay_tcasstest@vijay_tcass-i-a91ee8cd ~]$ 
/etc/alternatives/jre_1.7.0/bin/java -Djava.library.path=/usr/local/lib/ -cp 
jna.jar:/apps/nfcassandra_server/lib/*:. com.sun.jna.TCMallocAllocator 5 
200
 total   used   free sharedbuffers cached
Mem:  71688220   26054620   45633600  0 169128 996228
-/+ buffers/cache:   24889264   46798956
Swap:0  0  0
 Starting Test! 
Total bytes read: 101304894464
Time taken: 46387
 total   used   free sharedbuffers cached
Mem:  71688220   28535136   43153084  0 169128 996436
-/+ buffers/cache:   27369572   44318648
Swap:0  0  0
 ending Test! 
[vijay_tcasstest@vijay_tcass-i-a91ee8cd ~]$ export 
LD_LIBRARY_PATH=~/jemalloc-2.2.5/lib/ 
[vijay_tcasstest@vijay_tcass-i-a91ee8cd ~]$ 
/etc/alternatives/jre_1.7.0/bin/java -Djava.library.path=~/jemalloc-2.2.5/lib/ 
-cp jna.jar:/apps/nfcassandra_server/lib/*:. com.sun.jna.JEMallocAllocator 
5 200
 total   used   free sharedbuffers cached
Mem:  71688220   26060604   45627616  0 169128 996300
-/+ buffers/cache:   24895176   46793044
Swap:0  0  0
 Starting Test! 
Total bytes read: 101321734144
Time taken: 29937
 total   used   free sharedbuffers cached
Mem:  71688220   28472436   43215784  0 169128 996440
-/+ buffers/cache:   27306868   44381352
Swap:0  0  0
 ending Test! 
[vijay_tcasstest@vijay_tcass-i-a91ee8cd ~]$ 

{noformat}

  was (Author: vijay2...@yahoo.com):
Attached is the test classes used for the test.

Results on CentOS:

nowiki
[vijay_tcasstest@vijay_tcass-i-a91ee8cd ~]$ 
/etc/alternatives/jre_1.7.0/bin/java -Djava.library.path=/usr/local/lib/ -cp 
jna.jar:/apps/nfcassandra_server/lib/*:. com.sun.jna.MallocAllocator 5 
200
 total   used   free sharedbuffers cached
Mem:  71688220   26049380   45638840  0 169116 996172
-/+ buffers/cache:   24884092   46804128
Swap:0  0  0
 Starting Test! 
Total bytes read: 101422934016
Time taken: 25407
 total   used   free sharedbuffers cached
Mem:  71688220   31981924   39706296  0 169116 996312
-/+ buffers/cache:   30816496   40871724
Swap:0  0  0
 ending Test! 
[vijay_tcasstest@vijay_tcass-i-a91ee8cd ~]$ export 
LD_LIBRARY_PATH=/usr/local/lib/
[vijay_tcasstest@vijay_tcass-i-a91ee8cd ~]$ 
/etc/alternatives/jre_1.7.0/bin/java -Djava.library.path=/usr/local/lib/ -cp 
jna.jar:/apps/nfcassandra_server/lib/*:. com.sun.jna.TCMallocAllocator 5 
200
 total   used   free sharedbuffers cached
Mem:  71688220   26054620   45633600  0 169128 996228
-/+ buffers/cache:   24889264   46798956
Swap:0  0  0
 Starting Test! 
Total bytes read: 101304894464
Time taken: 46387
 total   used   free sharedbuffers cached
Mem:  71688220   28535136   43153084  0 169128 996436
-/+ buffers/cache:   27369572   44318648
Swap:0  0  0
 ending Test! 
[vijay_tcasstest@vijay_tcass-i-a91ee8cd ~]$ export 
LD_LIBRARY_PATH=~/jemalloc-2.2.5/lib/ 
[vijay_tcasstest@vijay_tcass-i-a91ee8cd ~]$ 
/etc/alternatives/jre_1.7.0/bin/java -Djava.library.path=~/jemalloc-2.2.5/lib/ 
-cp jna.jar:/apps/nfcassandra_server/lib/*:. com.sun.jna.JEMallocAllocator 
5 200
 total   

[jira] [Issue Comment Edited] (CASSANDRA-3997) Make SerializingCache Memory Pluggable

2012-03-04 Thread Vijay (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13222039#comment-13222039
 ] 

Vijay edited comment on CASSANDRA-3997 at 3/4/12 10:30 PM:
---

Attached are the test classes used for the test.

Results on CentOS:

{noformat}
[vijay_tcasstest@vijay_tcass-i-a91ee8cd ~]$ 
/etc/alternatives/jre_1.7.0/bin/java -Djava.library.path=/usr/local/lib/ -cp 
jna.jar:/apps/nfcassandra_server/lib/*:. com.sun.jna.MallocAllocator 5 
200
 total   used   free sharedbuffers cached
Mem:  71688220   26049380   45638840  0 169116 996172
-/+ buffers/cache:   24884092   46804128
Swap:0  0  0
 Starting Test! 
Total bytes read: 101422934016
Time taken: 25407
 total   used   free sharedbuffers cached
Mem:  71688220   31981924   39706296  0 169116 996312
-/+ buffers/cache:   30816496   40871724
Swap:0  0  0
 ending Test! 
[vijay_tcasstest@vijay_tcass-i-a91ee8cd ~]$ export 
LD_LIBRARY_PATH=/usr/local/lib/
[vijay_tcasstest@vijay_tcass-i-a91ee8cd ~]$ 
/etc/alternatives/jre_1.7.0/bin/java -Djava.library.path=/usr/local/lib/ -cp 
jna.jar:/apps/nfcassandra_server/lib/*:. com.sun.jna.TCMallocAllocator 5 
200
 total   used   free sharedbuffers cached
Mem:  71688220   26054620   45633600  0 169128 996228
-/+ buffers/cache:   24889264   46798956
Swap:0  0  0
 Starting Test! 
Total bytes read: 101304894464
Time taken: 46387
 total   used   free sharedbuffers cached
Mem:  71688220   28535136   43153084  0 169128 996436
-/+ buffers/cache:   27369572   44318648
Swap:0  0  0
 ending Test! 
[vijay_tcasstest@vijay_tcass-i-a91ee8cd ~]$ export 
LD_LIBRARY_PATH=~/jemalloc-2.2.5/lib/ 
[vijay_tcasstest@vijay_tcass-i-a91ee8cd ~]$ 
/etc/alternatives/jre_1.7.0/bin/java -Djava.library.path=~/jemalloc-2.2.5/lib/ 
-cp jna.jar:/apps/nfcassandra_server/lib/*:. com.sun.jna.JEMallocAllocator 
5 200
 total   used   free sharedbuffers cached
Mem:  71688220   26060604   45627616  0 169128 996300
-/+ buffers/cache:   24895176   46793044
Swap:0  0  0
 Starting Test! 
Total bytes read: 101321734144
Time taken: 29937
 total   used   free sharedbuffers cached
Mem:  71688220   28472436   43215784  0 169128 996440
-/+ buffers/cache:   27306868   44381352
Swap:0  0  0
 ending Test! 
[vijay_tcasstest@vijay_tcass-i-a91ee8cd ~]$ 

{noformat}

The test shows around 4 GB of savings. The test was on 101321734144 bytes (101 GB 
each). The test uses CLHM to hold on to the objects and release them when the 
capacity is reached (5K).
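For readers who have not looked at the attached test classes, here is a hedged sketch of the 
test shape described (invented names; not the attached code): a ConcurrentLinkedHashMap holds 
the native allocations and an eviction listener frees them once the capacity is reached.

{code}
// Hedged sketch of the CLHM-based test harness (illustrative only).
import com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap;
import com.googlecode.concurrentlinkedhashmap.EvictionListener;
import com.sun.jna.Pointer;

public class AllocatorStressSketch
{
    // Placeholder for the JNA-bound allocator used in the earlier sketches.
    public interface AllocLib { Pointer malloc(long size); void free(Pointer ptr); }

    public static ConcurrentLinkedHashMap<Long, Pointer> buildCache(final AllocLib lib, int capacity)
    {
        return new ConcurrentLinkedHashMap.Builder<Long, Pointer>()
                .maximumWeightedCapacity(capacity)        // e.g. 5000 entries
                .listener(new EvictionListener<Long, Pointer>()
                {
                    public void onEviction(Long key, Pointer value)
                    {
                        lib.free(value);                  // release native memory on eviction
                    }
                })
                .build();
    }
}
{code}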

  was (Author: vijay2...@yahoo.com):
Attached is the test classes used for the test.

Results on CentOS:

{noformat}
[vijay_tcasstest@vijay_tcass-i-a91ee8cd ~]$ 
/etc/alternatives/jre_1.7.0/bin/java -Djava.library.path=/usr/local/lib/ -cp 
jna.jar:/apps/nfcassandra_server/lib/*:. com.sun.jna.MallocAllocator 5 
200
 total   used   free sharedbuffers cached
Mem:  71688220   26049380   45638840  0 169116 996172
-/+ buffers/cache:   24884092   46804128
Swap:0  0  0
 Starting Test! 
Total bytes read: 101422934016
Time taken: 25407
 total   used   free sharedbuffers cached
Mem:  71688220   31981924   39706296  0 169116 996312
-/+ buffers/cache:   30816496   40871724
Swap:0  0  0
 ending Test! 
[vijay_tcasstest@vijay_tcass-i-a91ee8cd ~]$ export 
LD_LIBRARY_PATH=/usr/local/lib/
[vijay_tcasstest@vijay_tcass-i-a91ee8cd ~]$ 
/etc/alternatives/jre_1.7.0/bin/java -Djava.library.path=/usr/local/lib/ -cp 
jna.jar:/apps/nfcassandra_server/lib/*:. com.sun.jna.TCMallocAllocator 5 
200
 total   used   free sharedbuffers cached
Mem:  71688220   26054620   45633600  0 169128 996228
-/+ buffers/cache:   24889264   46798956
Swap:0  0  0
 Starting Test! 
Total bytes read: 101304894464
Time taken: 46387
 total   used   free sharedbuffers cached
Mem:  71688220   28535136   43153084  0 169128 996436
-/+ buffers/cache:   27369572   44318648
Swap:0  0  0
 ending Test! 
[vijay_tcasstest@vijay_tcass-i-a91ee8cd ~]$ export 
LD_LIBRARY_PATH=~/jemalloc-2.2.5/lib/ 
[vijay_tcasstest@vijay_tcass-i-a91ee8cd ~]$ 

[jira] [Issue Comment Edited] (CASSANDRA-3853) lower impact on old-gen promotion of slow nodes or connections

2012-02-19 Thread Vijay (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13211501#comment-13211501
 ] 

Vijay edited comment on CASSANDRA-3853 at 2/19/12 7:33 PM:
---

Another thing we can do is drop the references in the co-ordinator once we have 
executed/sent the query to the remote node, so we can avoid promotion of those 
objects. We don't use this command for retries etc.; it is done by the client, 
hence we don't need to hold these references. Once the references to the query 
are dropped, even if the rpc timeout is 10 seconds we hold references to only a 
few objects.

Bonus: if we convert the table name and column family name to byte buffers and 
reuse references then we will save some there too. 
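A hedged sketch of the "bonus" idea (invented names; not a patch): intern each table/column 
family name once, so every in-flight request shares one buffer reference instead of carrying 
its own copy into old gen.

{code}
// Hedged sketch: share one ByteBuffer per table/CF name across in-flight requests.
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class NameInternerSketch
{
    private static final Charset UTF8 = Charset.forName("UTF-8");
    private static final ConcurrentMap<String, ByteBuffer> CACHE =
            new ConcurrentHashMap<String, ByteBuffer>();

    public static ByteBuffer intern(String name)
    {
        ByteBuffer existing = CACHE.get(name);
        if (existing == null)
        {
            ByteBuffer fresh = ByteBuffer.wrap(name.getBytes(UTF8)).asReadOnlyBuffer();
            existing = CACHE.putIfAbsent(name, fresh);
            if (existing == null)
                existing = fresh;
        }
        // duplicate() shares the backing bytes; only the small position/limit view is new.
        return existing.duplicate();
    }
}
{code}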

  was (Author: vijay2...@yahoo.com):
Other thing we can do is that we can drop the references in the 
co-ordinator once we have executed/sent the query to the remote node, this we 
can avoid promotion of those objects at least we dont use this command for 
retry etc... it is done by the client hence we dont need to hold the 
references. Once the references to the query is dropped even if the rpc timeout 
is 10 seconds we have reference to very little objects.

Bonus: if we convert table name and column family name to byte buffers and use 
references then we will save some there too. 
  
 lower impact on old-gen promotion of slow nodes or connections
 --

 Key: CASSANDRA-3853
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3853
 Project: Cassandra
  Issue Type: Improvement
Reporter: Peter Schuller
Assignee: Peter Schuller

 Cassandra has the unfortunate behavior that when things are slow (nodes 
 overloaded, etc) there is a tendency for cascading failure if the system is 
 overall under high load. This is generally true of most systems, but one way 
 in which it is worse than desired is the way we queue up things between 
 stages and outgoing requests.
 First off, I use the following premises:
 * The node is not running Azul ;)
 * The total cost of ownership (in terms of allocation+collection) of an 
 object that dies in old-gen is *much* higher than that of an object that dies 
 in young gen.
 * When CMS fails (concurrent mode failure or promotion failure), the 
 resulting full GC is *serial* and does not use all cores, and is a 
 stop-the-world pause.
 Here is how this very effectively leads to cascading failure of the fallen 
 and can't get up kind:
 * Some node has a problem and is slow, even if just for a little while.
 * Other nodes, especially neighbors in the replica set, start queueing up 
 outgoing requests to the node for {{rpc_timeout}} milliseconds.
 * You have a high (let's say write) throughput of 50 thousand or so requests 
 per second per node.
 * Because you want writes to be highly available and you are okay with high 
 latency, you have an {{rpc_timeout}} of 60 seconds.
 * The total amount of memory used for 60 * 50 000 requests is freaking high.
 * The young gen GC pauses happen *much* more frequently than every 60 seconds.
 * The result is that when a node goes down, other nodes in the replica set 
 start *massively* increasing their promotion rate into old gen. A cluster 
 whose nodes are normally completely fine, with slow nice promotion into 
 old-gen, will now exhibit vastly different behavior than normal: While the 
 total allocation rate doesn't change (or not very much, perhaps a little if 
 clients are doing re-tries), the promotion rate into old-gen increases 
 massively.
 * This increases the total cost of ownership, and thus demand for CPU 
 resources.
 * You will *very* easily see CMS' sweeping phase not stand a chance to sweep 
 up fast enough to keep up with the incoming request rate, even with a hugely 
 inflated heap (CMS sweeping is not parallel, even though marking is).
 * This leads to promotion failure/conc mode failure, and you fall into full 
 GC.
 * But now, your full GC is effectively stealing CPU resources since you are 
 forcing all cores but one to be completely idle on your system.
 * Once you go out of GC, you now have a huge backlog of work to do that you 
 get bombarded with from other nodes that thought it was a good idea to retain 
 30 seconds worth of messages in *their* heap. So you're now being instantly 
 shot down again by your neighbors, falling into the next full GC cycle even 
 easier than originally.
 * Meanwhile, the fact that you are in full gc, is causing your neighbors to 
 enter the same predicament.
 The solution to this in production is to rapidly restart all nodes in the 
 replica set. Doing a live-change of RPC timeouts to something very very low 
 might also do the trick.
 This is a specific instance of the overall problem that we 

[jira] [Issue Comment Edited] (CASSANDRA-3412) make nodetool ring ownership smarter

2012-02-16 Thread Vijay (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13210055#comment-13210055
 ] 

Vijay edited comment on CASSANDRA-3412 at 2/17/12 5:02 AM:
---

I tried to make this generic enough; attached is a simple patch to display 
what we discussed. If we cannot figure out a keyspace to display, the default is 
the same as today.

Default:
Warning: Output contains ownership information which does not include 
replication factor.
Warning: Use nodetool ring keyspace to specify a keyspace. 
Address DC  RackStatus State   LoadOwns 
   Token   

   141784319550391026443072753096942836216 
107.21.183.168  us-east 1c  Up Normal  38.57 KB27.78%   
   18904575940052136859076367081351254013  
79.125.30.58eu-west 1c  Up Normal  36.44 KB22.22%   
   56713727820156410577229101239000783353  
50.16.117.152   us-east 1c  Up Normal  52.03 KB11.11%   
   75618303760208547436305468319979289255  
50.19.163.142   us-east 1c  Up Normal  51.59 KB33.33%   
   132332031580364958013534569558607324497 
46.51.157.33eu-west 1c  Up Normal  31.64 KB5.56%
   141784319550391026443072753096942836216 

Effective nt ring: ('org.apache.cassandra.locator.NetworkTopologyStrategy' and 
strategy_options={us-east:2,eu-west:1})
[vijay_tcasstest@vijay_tcass-i-a6643ac3 ~]$ nt ring
Address DC  RackStatus State   Load
Effective-Owership  Token   

   141784319550391026443072753096942836216 
107.21.183.168  us-east 1c  Up Normal  27.23 KB66.67%   
   18904575940052136859076367081351254013  
79.125.30.58eu-west 1c  Up Normal  31.51 KB50.00%   
   56713727820156410577229101239000783353  
50.16.117.152   us-east 1c  Up Normal  47.1 KB 66.67%   
   75618303760208547436305468319979289255  
50.19.163.142   us-east 1c  Up Normal  42.52 KB66.67%   
   132332031580364958013534569558607324497 
46.51.157.33eu-west 1c  Up Normal  36.32 KB50.00%   
   141784319550391026443072753096942836216 
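For clarity, the "effective" numbers are simply ownership with replication counted in: each 
node's figure is the fraction of the ring it replicates for the given keyspace, so with a 
replication factor above 1 the column sums to more than 100%. A hedged sketch of the 
computation (invented names; not the attached patch):

{code}
// Hedged sketch of effective ownership: sum, per endpoint, the ring fraction it replicates.
import java.net.InetAddress;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class EffectiveOwnershipSketch
{
    // rangeFractions.get(i) = fraction of the ring covered by range i;
    // replicasPerRange.get(i) = endpoints replicating range i under the keyspace's strategy.
    public static Map<InetAddress, Double> effectiveOwnership(List<Double> rangeFractions,
                                                              List<List<InetAddress>> replicasPerRange)
    {
        Map<InetAddress, Double> ownership = new HashMap<InetAddress, Double>();
        for (int i = 0; i < rangeFractions.size(); i++)
        {
            for (InetAddress replica : replicasPerRange.get(i))
            {
                Double current = ownership.get(replica);
                ownership.put(replica, (current == null ? 0d : current) + rangeFractions.get(i));
            }
        }
        return ownership;
    }
}
{code}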



  was (Author: vijay2...@yahoo.com):
I tried to make this generic enough and attached is a simple patch to 
display what we discussed. If we cannot figure out a keyspace to display the 
default is the same as today.

Default:
Warning: Output contains ownership information which does not include 
replication factor.
Warning: Use nodetool ring keyspace to specify a keyspace. 
Address DC  RackStatus State   LoadOwns 
   Token   

   141784319550391026443072753096942836216 
107.21.183.168  us-east 1c  Up Normal  38.57 KB27.78%   
   18904575940052136859076367081351254013  
79.125.30.58eu-west 1c  Up Normal  36.44 KB22.22%   
   56713727820156410577229101239000783353  
50.16.117.152   us-east 1c  Up Normal  52.03 KB11.11%   
   75618303760208547436305468319979289255  
50.19.163.142   us-east 1c  Up Normal  51.59 KB33.33%   
   132332031580364958013534569558607324497 
46.51.157.33eu-west 1c  Up Normal  31.64 KB5.56%
   141784319550391026443072753096942836216 

Effective nt ring:
[vijay_tcasstest@vijay_tcass-i-a6643ac3 ~]$ nt ring
Address DC  RackStatus State   Load
Effective-Owership  Token   

   141784319550391026443072753096942836216 
107.21.183.168  us-east 1c  Up Normal  27.23 KB66.67%   
   18904575940052136859076367081351254013  
79.125.30.58eu-west 1c  Up Normal  31.51 KB50.00%   
   56713727820156410577229101239000783353  
50.16.117.152   us-east 1c  Up Normal  47.1 KB 66.67%   
   75618303760208547436305468319979289255  
50.19.163.142   us-east 1c  Up Normal  42.52 KB66.67%   
   132332031580364958013534569558607324497 
46.51.157.33 

[jira] [Issue Comment Edited] (CASSANDRA-3772) Evaluate Murmur3-based partitioner

2012-02-13 Thread Vijay (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13207436#comment-13207436
 ] 

Vijay edited comment on CASSANDRA-3772 at 2/14/12 2:26 AM:
---

If CASSANDRA-2975 gets committed you should be able to use that.

Edit: you can use MurmurHash.hash3_x64_128 function from 2975.
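As a rough illustration of what a Murmur3-based partitioner boils down to (a hedged sketch 
using Guava's murmur3_128 purely for brevity; the comment above points at the 
MurmurHash.hash3_x64_128 method from CASSANDRA-2975): the partition key is hashed with a 
cheap, well-distributed, non-cryptographic function and the result becomes the token.

{code}
// Hedged sketch of a Murmur3-based token (illustrative; uses Guava, not the 2975 code).
import com.google.common.hash.Hashing;
import java.nio.charset.Charset;

public class Murmur3TokenSketch
{
    public static long token(byte[] partitionKey)
    {
        // No cryptographic strength needed, just a good output distribution.
        return Hashing.murmur3_128().hashBytes(partitionKey).asLong();
    }

    public static void main(String[] args)
    {
        System.out.println(token("some row key".getBytes(Charset.forName("UTF-8"))));
    }
}
{code}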

  was (Author: vijay2...@yahoo.com):
If CASSANDRA-2975 gets committed you should be able to use that.
  
 Evaluate Murmur3-based partitioner
 --

 Key: CASSANDRA-3772
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3772
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Dave Brosius
 Fix For: 1.2

 Attachments: try_murmur3.diff


 MD5 is a relatively heavyweight hash to use when we don't need cryptographic 
 qualities, just a good output distribution.  Let's see how much overhead we 
 can save by using Murmur3 instead.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-1956) Convert row cache to row+filter cache

2012-02-08 Thread Vijay (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13204329#comment-13204329
 ] 

Vijay edited comment on CASSANDRA-1956 at 2/9/12 7:30 AM:
--

This patch is not complete yet; I just wanted to show it and see what you guys 
think. The patch is something like a block cache: it caches blocks of columns, 
where the user can choose the block size. If the query is within a block we are 
good by just pulling that block into memory; otherwise we scan through the 
blocks and get the required ones. Updates can also scan through the blocks and 
update them. The good part here is that this should have a lower memory 
footprint than a query cache, but it should also solve the problems which we are 
discussing in this ticket. It does not support Super columns and I don't plan to 
do so. Let me know, thanks! Again, there is more logic/cases to be handled; just 
a prototype for now.
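A hedged sketch of the block-cache shape described above (invented names; not the attached 
prototype): columns are grouped into fixed-size blocks and cached per (row key, block index), 
so a narrow slice query only pulls the blocks it actually touches.

{code}
// Hedged sketch of a (row key, block index) cache key for a column block cache.
import java.nio.ByteBuffer;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class ColumnBlockCacheSketch
{
    static final class BlockKey
    {
        final ByteBuffer rowKey;
        final int blockIndex; // columnOffset / blockSize

        BlockKey(ByteBuffer rowKey, int blockIndex)
        {
            this.rowKey = rowKey;
            this.blockIndex = blockIndex;
        }

        @Override
        public boolean equals(Object o)
        {
            if (!(o instanceof BlockKey))
                return false;
            BlockKey other = (BlockKey) o;
            return blockIndex == other.blockIndex && rowKey.equals(other.rowKey);
        }

        @Override
        public int hashCode()
        {
            return 31 * rowKey.hashCode() + blockIndex;
        }
    }

    // value = the serialized columns for that block
    final ConcurrentMap<BlockKey, ByteBuffer> cache = new ConcurrentHashMap<BlockKey, ByteBuffer>();
}
{code}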

  was (Author: vijay2...@yahoo.com):
This patch is not complete yet i just wanted to show and see what you guys 
think about this... This patch is something like a block cache where it will 
cache blocks of columns where the user can choose the block size and if the 
query is within the block we are good by just pulling the block into memory 
else we will scan through the blocks and get the required blocks. Updates can 
also scan through the blocks and update them... The good part here is this 
should have lower memory foot print than Query cache but it should also solve 
the problems which we are discussing in this ticket. Let me know thanks! Again 
there is more logic/cases to be handled, Just a prototype for now.
  
 Convert row cache to row+filter cache
 -

 Key: CASSANDRA-1956
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1956
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Stu Hood
Assignee: Vijay
Priority: Minor
 Fix For: 1.2

 Attachments: 0001-1956-cache-updates-v0.patch, 
 0001-commiting-block-cache.patch, 0001-re-factor-row-cache.patch, 
 0001-row-cache-filter.patch, 0002-1956-updates-to-thrift-and-avro-v0.patch, 
 0002-add-query-cache.patch


 Changing the row cache to a row+filter cache would make it much more useful. 
 We currently have to warn against using the row cache with wide rows, where 
 the read pattern is typically a peek at the head, but this use case would be 
 perfectly supported by a cache that stored only columns matching the filter.
 Possible implementations:
 * (copout) Cache a single filter per row, and leave the cache key as is
 * Cache a list of filters per row, leaving the cache key as is: this is 
 likely to have some gotchas for weird usage patterns, and it requires the 
 list overhead
 * Change the cache key to rowkey+filterid: basically ideal, but you need a 
 secondary index to lookup cache entries by rowkey so that you can keep them 
 in sync with the memtable
 * others?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-3690) Streaming CommitLog backup

2012-01-23 Thread Vijay (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191899#comment-13191899
 ] 

Vijay edited comment on CASSANDRA-3690 at 1/24/12 5:43 AM:
---

0001 = Adds a configuration so we can avoid recycling, in case someone wants 
to copy the files across to another location such as an archive.
0002 = Adds CommitLogListener; an implementation can receive the updates to the 
commit logs (see the sketch after this list).
0003 = Helper JMX methods in case the user wants to query the active CLs.
0004 = This can go to the tools folder; we don't need to commit it to the core.
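A hedged sketch of what such a listener hook could look like (invented names; not the 
attached patch): the commit log calls out as data is appended and as segments are switched 
or become obsolete, and a backup agent streams the bytes elsewhere.

{code}
// Hedged sketch of a commit-log listener hook (illustrative only).
import java.nio.ByteBuffer;

public interface CommitLogListenerSketch
{
    // A mutation (or batch) was appended to the active segment.
    void onAppend(ByteBuffer serializedMutations);

    // The active segment was switched; the previous one is complete and safe to archive.
    void onSegmentSwitch(String completedSegmentPath);

    // A segment is no longer needed for recovery and may be recycled or deleted.
    void onSegmentObsolete(String segmentPath);
}
{code}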

  was (Author: vijay2...@yahoo.com):
0001 = Adds CommitLogListener, implementation can recive the updates to 
the commitlogs. This also adds a configuration so we can avoid recycling in 
case some one wants to copy the files across to another location like a archive 
logs
0002 = helper JMX in case the user wants to query the active CL's
0003 = this can go to the tools folder/we dont need to commit it to the core.
  
 Streaming CommitLog backup
 --

 Key: CASSANDRA-3690
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3690
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Affects Versions: 1.1
Reporter: Vijay
Assignee: Vijay
Priority: Minor
 Fix For: 1.1

 Attachments: 0001-Make-commitlog-recycle-configurable.patch, 
 0002-support-commit-log-listener.patch, 0003-helper-jmx-methods.patch, 
 0004-external-commitlog-with-sockets.patch, 
 0005-cmmiting-comments-to-yaml.patch


 Problems with the current SST backups
 1) The current backup doesn't allow us to restore point in time (within a SST)
 2) Current SST implementation needs the backup to read from the filesystem 
 and hence additional IO during the normal operational Disks
 3) in 1.0 we have removed the flush interval and size when the flush will be 
 triggered per CF, 
   For some use cases where there is less writes it becomes 
 increasingly difficult to time it right.
 4) Use cases which needs BI which are external (Non cassandra), needs the 
 data in regular intervals than waiting for longer or unpredictable intervals.
 Disadvantages of the new solution
 1) Over head in processing the mutations during the recover phase.
 2) More complicated solution than just copying the file to the archive.
 Additional advantages:
 Online and offline restore.
 Close to live incremental backup.
 Note: If the listener agent gets restarted, it is the agents responsibility 
 to Stream the files missed or incomplete.
 There are 3 Options in the initial implementation:
 1) Backup - Once a socket is connected we will switch the commit log and 
 send new updates via the socket.
 2) Stream - will take the absolute path of the file and will read the file 
 and send the updates via the socket.
 3) Restore - this will get the serialized bytes and apply the mutation.
 Side NOTE: (Not related to this patch as such) The agent which will take 
 incremental backup is planned to be open sourced soon (Name: Priam).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-3723) Include await for the queues in tpstats

2012-01-16 Thread Vijay (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187169#comment-13187169
 ] 

Vijay edited comment on CASSANDRA-3723 at 1/16/12 8:42 PM:
---

 latency = await + this task's processing time, right?
Currently it is just the processing time... What I am proposing is to change 
the latency number to something like await (wait time in the queue + processing 
time), if that makes sense.

 not sure why we'd remove the latency though
I am not saying we have to remove it... just an enhancement, instead of adding 
one more metric to glance at :)
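
As a rough illustration of the proposal (not code from any patch), the reported 
number would cover queue wait plus execution rather than execution alone:

{code}
import java.util.concurrent.TimeUnit;

// Hedged sketch: measure "await" for a task as (time spent waiting in the
// executor queue) + (time spent executing), mirroring iostat's await.
public final class AwaitTimedTask implements Runnable
{
    private final Runnable task;
    private final long enqueuedAtNanos = System.nanoTime();

    public AwaitTimedTask(Runnable task)
    {
        this.task = task;
    }

    public void run()
    {
        long startedAtNanos = System.nanoTime();
        long queueWaitMs = TimeUnit.NANOSECONDS.toMillis(startedAtNanos - enqueuedAtNanos);
        task.run();
        long serviceMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startedAtNanos);
        long awaitMs = queueWaitMs + serviceMs; // the number tpstats would report
        record(awaitMs);
    }

    private void record(long awaitMs)
    {
        // hook into whatever latency tracker tpstats reads from
    }
}
{code}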

  was (Author: vijay2...@yahoo.com):
 latency = await + this task's processing time, right?
Currently it is just the processing time... what i am proposing is to change 
the latency number to something like await. (If that makes sense).

 not sure why we'd remove the latency though
I am not saying we have to remove it... Just like a enhancement instead of 
adding one more metric to glance :)
  
 Include await for the queues in tpstats
 ---

 Key: CASSANDRA-3723
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3723
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Affects Versions: 1.2
Reporter: Vijay
Assignee: Vijay
Priority: Minor

 Something similar to iostat's await. There is an additional overhead; not sure 
 if we have to make an exception for this, but I think it is a huge plus while 
 troubleshooting.
 await:
 The average time (in milliseconds) for I/O requests issued to the device to 
 be served. This includes the time spent by the requests in queue and the time 
 spent servicing them.
 Or we could also report a simple average of the time spent in the queue before 
 being served.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-3723) Include await for the queues in tpstats

2012-01-16 Thread Vijay (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187169#comment-13187169
 ] 

Vijay edited comment on CASSANDRA-3723 at 1/16/12 8:43 PM:
---

 latency = await + this task's processing time, right?
Currently it is just the processing time... What I am proposing is to change 
the latency number to something like await (wait time in the queue + processing 
time), if that makes sense.

 not sure why we'd remove the latency though
I am not saying we have to remove it... just an enhancement, instead of adding 
one more metric to glance at :)

  was (Author: vijay2...@yahoo.com):
 latency = await + this task's processing time, right?
Currently it is just the processing time... what i am proposing is to change 
the latency number to something like await (await in the queue + processing 
time). If that makes sense.

 not sure why we'd remove the latency though
I am not saying we have to remove it... Just like a enhancement instead of 
adding one more metric to glance :)
  
 Include await for the queues in tpstats
 ---

 Key: CASSANDRA-3723
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3723
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Affects Versions: 1.2
Reporter: Vijay
Assignee: Vijay
Priority: Minor

 Something similar to iostat's await. There is an additional overhead; not sure 
 if we have to make an exception for this, but I think it is a huge plus while 
 troubleshooting.
 await:
 The average time (in milliseconds) for I/O requests issued to the device to 
 be served. This includes the time spent by the requests in queue and the time 
 spent servicing them.
 Or we could also report a simple average of the time spent in the queue before 
 being served.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-3590) Use multiple connection to share the OutboutTCPConnection

2012-01-16 Thread Vijay (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187466#comment-13187466
 ] 

Vijay edited comment on CASSANDRA-3590 at 1/17/12 6:06 AM:
---

Finally had a chance to do this benchmark.

Configuration: M2.4xl (AWS)
Traffic between: US and EU
OpenJDK, CentOS 5.6

3 tests were done where the active queue is the limiting factor for the traffic 
going across the nodes. Latency is the metric we are trying to measure in this 
test (with 1 connection the latency is high because of the delay over the 
public internet in an AWS multi-region setup).

Code for the benchmark is attached to this ticket.

Server A (US): java -jar Listener.jar 7103
Server B (EU): java -jar RunTest.jar 1 107.22.50.61 7103 500

Server C (US): java -jar Listener.jar 7103
Server D (EU): java -jar RunTest.jar 2 107.22.50.61 7103 500

Data is collected at a 1-second interval (please see the code for details).
The code for IncomingTcpConnection and OutboundTcpConnection was modified a 
little to work independently of other Cassandra services (please see the code 
for details).
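
For context, the listener side of such a harness is roughly shaped like this 
(a hedged outline only; the real code is in the attached TCPTest.zip):

{code}
import java.io.DataInputStream;
import java.net.ServerSocket;
import java.net.Socket;

// Hedged outline of the listener side of the benchmark: accept connections
// and drain messages so the sending side can measure how quickly its queue
// drains. Not the code in TCPTest.zip.
public final class EchoListener
{
    public static void main(String[] args) throws Exception
    {
        int port = Integer.parseInt(args[0]); // e.g. 7103
        ServerSocket server = new ServerSocket(port);
        while (true)
        {
            final Socket socket = server.accept();
            new Thread(new Runnable()
            {
                public void run()
                {
                    try
                    {
                        DataInputStream in = new DataInputStream(socket.getInputStream());
                        byte[] buffer = new byte[4096];
                        while (in.read(buffer) != -1)
                        {
                            // consume the payload; the sender records per-message latency
                        }
                    }
                    catch (Exception e)
                    {
                        // connection closed
                    }
                }
            }).start();
        }
    }
}
{code}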


  was (Author: vijay2...@yahoo.com):
Finally had a chance to do this bench mark. 

Configuration: M2.4xl (AWS)
Traffic Between: US and EU
Open JDK, CentOS 5.6

3 Tests where done where the active queue is the limiting factor for the 
traffic to go across the nodes. Latency is the metric which we are trying to 
measure in this test (With 1 connection the latency is high, because of the 
Delay over the public internet in a AWS multi region setup). 

Code for the benchmark is attached with this ticket. 

Server A (US): java -jar Listener.jar 7103
Server B (EU): java -jar RunTest.jar 1 107.22.50.61 7103 500

Server C (US): java -jar Listener.jar 7103
Server D (EU): java -jar RunTest.jar 2 107.22.50.61 7103 500

Data is collected with 1 Second interval (plz see code for details).

  
 Use multiple connection to share the OutboutTCPConnection
 -

 Key: CASSANDRA-3590
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3590
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.1
Reporter: Vijay
Assignee: Vijay
Priority: Minor
 Fix For: 1.2

 Attachments: TCPTest.xlsx, TCPTest.zip


 Currently there is one connection between any given pair of hosts in the 
 cluster; the problems with this are:
 1) It can become a bottleneck in cases where the latencies are higher.
 2) When a connection is dropped we also drop the queue and recreate a new one, 
 and hence messages can be lost (currently hints will take care of it, and 
 clients can also retry).
 By making the number of connections configurable and sharing the queue across 
 those connections, the above 2 issues can be resolved (a sketch follows below).
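
 A hedged sketch of that idea (shared outbound queue drained by N connections; 
 names are illustrative, not the OutboundTcpConnection code):

{code}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hedged sketch: one shared outbound queue drained by N connection threads,
// so a slow or dropped connection neither blocks nor loses the whole queue.
public final class SharedOutboundQueue
{
    private final BlockingQueue<byte[]> messages = new LinkedBlockingQueue<byte[]>();

    public void start(int connectionsPerHost)
    {
        for (int i = 0; i < connectionsPerHost; i++)
        {
            new Thread(new Runnable()
            {
                public void run()
                {
                    while (true)
                    {
                        try
                        {
                            byte[] message = messages.take();
                            send(message); // write to this thread's own socket
                        }
                        catch (InterruptedException e)
                        {
                            return;
                        }
                    }
                }
            }).start();
        }
    }

    public void enqueue(byte[] message)
    {
        messages.offer(message);
    }

    private void send(byte[] message)
    {
        // socket write; on failure, re-enqueue or rely on hints as today
    }
}
{code}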

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-3695) Hibernating nodes that die never go away

2012-01-04 Thread Vijay (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13179827#comment-13179827
 ] 

Vijay edited comment on CASSANDRA-3695 at 1/4/12 8:08 PM:
--

Or can we make aVeryLongTime more of a configuration: only for nodes that don't 
have any state (fat clients won't have state, currently), nodes in the dead 
state could be removed much sooner, something like an hour or so after we last 
got gossip from them?
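
A rough sketch of the rule being proposed (method names, fields, and the 
timeouts are assumptions for illustration, not Gossiper's actual API):

{code}
import java.util.concurrent.TimeUnit;

// Hedged sketch of the proposed eviction rule: dead, state-less endpoints
// (e.g. fat clients) expire after a short configurable window instead of
// waiting out aVeryLongTime.
public final class DeadStateEviction
{
    private static final long A_VERY_LONG_TIME_MS = TimeUnit.DAYS.toMillis(3);  // assumed default
    private static final long STATELESS_EXPIRY_MS = TimeUnit.HOURS.toMillis(1); // proposed, configurable

    public boolean shouldEvict(boolean isDeadState, boolean hasPersistentState, long msSinceLastGossip)
    {
        if (!isDeadState)
            return false;
        long expiry = hasPersistentState ? A_VERY_LONG_TIME_MS : STATELESS_EXPIRY_MS;
        return msSinceLastGossip > expiry;
    }
}
{code}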

  was (Author: vijay2...@yahoo.com):
Or can we have AVeryLongTime more like configuration only if they dont have 
any state and the nodes with the dead state can be removed much more often 
something like an hour or so since we last got the ghossip? 
  
 Hibernating nodes that die never go away
 

 Key: CASSANDRA-3695
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3695
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.0.0
Reporter: Brandon Williams
Assignee: Brandon Williams

 Title says it all.  We should be able to monitor these via the gossip 
 heartbeat like other nodes, but it's tricky since it's a dead state.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment

2011-12-26 Thread Vijay (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176071#comment-13176071
 ] 

Vijay edited comment on CASSANDRA-3623 at 12/27/11 3:22 AM:


Alright, I think I found the missing pieces:
1) Please reapply v2 from CASSANDRA-3611 (which also depends on CASSANDRA-3610).
2) Please reapply v3, which has the mark() (this seems to be used by range 
slices, and the stress tool does them).
3) Please set the CRC chance to 0.0 via an update - we need to do this before 
the SSTables are created, otherwise it won't take effect (the update statements 
I used are in the attached *.doc; a sketch of this gating follows below).
You might not see any difference if it is not set, because that is a big 
bottleneck.
4) I used the Sun JDK for the test.

The test results are attached; let me know in case of any questions... the 
performance seems to be better.

I used the stress test so we are on the same page, and when the column size or 
the range of columns to be fetched increases, the performance gets better 
(rebuffers).
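
On point 3, the effect of setting the CRC chance to 0.0 is roughly the 
following (a hedged sketch of the gating that crc_check_chance from 
CASSANDRA-3611 controls; class and field names are illustrative):

{code}
import java.util.Random;
import java.util.zip.CRC32;

// Hedged sketch: with a crcCheckChance of 0.0 the checksum branch is never
// taken. The chance is assumed to be captured with the SSTable's compression
// parameters when it is written, which is why the setting must be applied
// before the SSTables are created for the benchmark to show a difference.
public final class ChecksumGate
{
    private final Random random = new Random();
    private final double crcCheckChance;

    public ChecksumGate(double crcCheckChance)
    {
        this.crcCheckChance = crcCheckChance;
    }

    public void maybeVerify(byte[] compressedChunk, long expectedChecksum)
    {
        if (crcCheckChance > 0 && random.nextDouble() <= crcCheckChance)
        {
            CRC32 checksum = new CRC32();
            checksum.update(compressedChunk, 0, compressedChunk.length);
            if (checksum.getValue() != expectedChecksum)
                throw new IllegalStateException("corrupted chunk");
        }
    }
}
{code}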

  was (Author: vijay2...@yahoo.com):
Alright i think i found the the missing peace:
1) Plz reapply v2 from CASSANDRA-3611 (which also depends on CASSANDRA-3610)
2) Plz reapply v3 which has the mark() (this seem to be used by range slice and 
Stress tool does it).
3) Plz set the CRC chance to 0.0 by update chance - We need to do this before 
the SST's are created otherwise it wont take into effect. (update statements i 
used is in the *.doc attached)
4) I used SunJDK for the test.

The Test Results are attached, let me know in case of any questions... the 
performance seem to be better.

I Used stress test so we are in the same page, and when the Column size or the 
range of columns to be fetched increases the performance gets better (rebuffers)
  
 use MMapedBuffer in CompressedSegmentedFile.getSegment
 --

 Key: CASSANDRA-3623
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3623
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.1
Reporter: Vijay
Assignee: Vijay
  Labels: compression
 Fix For: 1.1

 Attachments: 0001-MMaped-Compression-segmented-file-v2.patch, 
 0001-MMaped-Compression-segmented-file-v3.patch, 
 0001-MMaped-Compression-segmented-file.patch, 
 0002-tests-for-MMaped-Compression-segmented-file-v2.patch, 
 0002-tests-for-MMaped-Compression-segmented-file-v3.patch, CRC+MMapIO.xlsx, 
 MMappedIO-Performance.docx


 CompressedSegmentedFile.getSegment seems to open a new file and doesn't seem 
 to use mmap, hence higher CPU on the nodes and higher latencies on reads.
 This ticket is to implement the TODO mentioned in CompressedRandomAccessReader:
 // TODO refactor this to separate concept of buffer to avoid lots of read() 
 syscalls and compression buffer
 but I think a separate class for the buffer will be better.
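
 A hedged sketch of the direction (map the compressed file once and hand out 
 per-chunk views instead of open()+read() per segment; class and method names 
 are illustrative, not the attached patches):

{code}
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Hedged sketch: map the compressed data file once and slice per-chunk views
// out of the mapping, avoiding a fresh file handle and read() syscalls on
// every getSegment() call. A real implementation would map in <2GB segments.
public final class MappedCompressedFile
{
    private final MappedByteBuffer mapped;

    public MappedCompressedFile(String path) throws IOException
    {
        RandomAccessFile file = new RandomAccessFile(path, "r");
        try
        {
            mapped = file.getChannel().map(FileChannel.MapMode.READ_ONLY, 0, file.length());
        }
        finally
        {
            file.close();
        }
    }

    public byte[] readChunk(int offset, int compressedLength)
    {
        ByteBuffer view = mapped.duplicate();
        view.position(offset);
        byte[] compressed = new byte[compressedLength];
        view.get(compressed);
        return compressed; // caller decompresses (e.g. with Snappy) into its own buffer
    }
}
{code}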

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment

2011-12-26 Thread Vijay (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176071#comment-13176071
 ] 

Vijay edited comment on CASSANDRA-3623 at 12/27/11 3:20 AM:


Alright, I think I found the missing pieces:
1) Please reapply v2 from CASSANDRA-3611 (which also depends on CASSANDRA-3610).
2) Please reapply v3, which has the mark() (this seems to be used by range 
slices, and the stress tool does them).
3) Please set the CRC chance to 0.0 via an update - we need to do this before 
the SSTables are created, otherwise it won't take effect (the update statements 
I used are in the attached *.doc).
4) I used the Sun JDK for the test.

The test results are attached; let me know in case of any questions... the 
performance seems to be better.

I used the stress test so we are on the same page, and when the column size or 
the range of columns to be fetched increases, the performance gets better 
(rebuffers).

  was (Author: vijay2...@yahoo.com):
Alright i think i found the the missing peace:
1) Plz reapply v2 from CASSANDRA-3611 (which also depends on CASSANDRA-3610)
2) Plz reapply v3 which has the mark() (this seem to be used by range slice and 
Stress tool does it).

The Test Results are attached, let me know in case of any questions... the 
performance seem to be better.

I Used stress test so we are in the same page, and when the Column size or the 
range of columns to be fetched increases the performance gets better (rebuffers)
  
 use MMapedBuffer in CompressedSegmentedFile.getSegment
 --

 Key: CASSANDRA-3623
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3623
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.1
Reporter: Vijay
Assignee: Vijay
  Labels: compression
 Fix For: 1.1

 Attachments: 0001-MMaped-Compression-segmented-file-v2.patch, 
 0001-MMaped-Compression-segmented-file-v3.patch, 
 0001-MMaped-Compression-segmented-file.patch, 
 0002-tests-for-MMaped-Compression-segmented-file-v2.patch, 
 0002-tests-for-MMaped-Compression-segmented-file-v3.patch, CRC+MMapIO.xlsx, 
 MMappedIO-Performance.docx


 CompressedSegmentedFile.getSegment seems to open a new file and doesn't seem 
 to use mmap, hence higher CPU on the nodes and higher latencies on reads.
 This ticket is to implement the TODO mentioned in CompressedRandomAccessReader:
 // TODO refactor this to separate concept of buffer to avoid lots of read() 
 syscalls and compression buffer
 but I think a separate class for the buffer will be better.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment

2011-12-26 Thread Vijay (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176071#comment-13176071
 ] 

Vijay edited comment on CASSANDRA-3623 at 12/27/11 3:23 AM:


Alright, I think I found the missing pieces:
1) Please reapply v2 from CASSANDRA-3611 (which also depends on CASSANDRA-3610).
2) Please reapply v3, which has the mark() (this seems to be used by range 
slices, and the stress tool does them).
3) Please set the CRC chance to 0.0 via an update - we need to do this before 
the SSTables are created, otherwise it won't take effect (the update statements 
I used are in the attached *.doc).
You might not see any difference if it is not set, because that is a big 
bottleneck.
4) I used the Sun JDK for the test.

The test results are attached; let me know in case of any questions... the 
performance seems to be better.

I used the stress test so we are on the same page, and when the column size or 
the range of columns to be fetched increases, the performance gets better 
(rebuffers).

  was (Author: vijay2...@yahoo.com):
Alright i think i found the the missing peace:
1) Plz reapply v2 from CASSANDRA-3611 (which also depends on CASSANDRA-3610)
2) Plz reapply v3 which has the mark() (this seem to be used by range slice and 
Stress tool does it).
3) Plz set the CRC chance to 0.0 by update chance - We need to do this before 
the SST's are created otherwise it wont take into effect. (update statements i 
used is in the *.doc attached)
You might not see any diffrence if it is not set, because thats a big 
bottleneck.
4) I used SunJDK for the test.

The Test Results are attached, let me know in case of any questions... the 
performance seem to be better.

I Used stress test so we are in the same page, and when the Column size or the 
range of columns to be fetched increases the performance gets better (rebuffers)
  
 use MMapedBuffer in CompressedSegmentedFile.getSegment
 --

 Key: CASSANDRA-3623
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3623
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.1
Reporter: Vijay
Assignee: Vijay
  Labels: compression
 Fix For: 1.1

 Attachments: 0001-MMaped-Compression-segmented-file-v2.patch, 
 0001-MMaped-Compression-segmented-file-v3.patch, 
 0001-MMaped-Compression-segmented-file.patch, 
 0002-tests-for-MMaped-Compression-segmented-file-v2.patch, 
 0002-tests-for-MMaped-Compression-segmented-file-v3.patch, CRC+MMapIO.xlsx, 
 MMappedIO-Performance.docx


 CompressedSegmentedFile.getSegment seems to open a new file and doesn't seem 
 to use mmap, hence higher CPU on the nodes and higher latencies on reads.
 This ticket is to implement the TODO mentioned in CompressedRandomAccessReader:
 // TODO refactor this to separate concept of buffer to avoid lots of read() 
 syscalls and compression buffer
 but I think a separate class for the buffer will be better.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment

2011-12-23 Thread Vijay (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175595#comment-13175595
 ] 

Vijay edited comment on CASSANDRA-3623 at 12/23/11 10:30 PM:
-

Hot Methods before the patch (trunk, without any patch):
Excl. User CPUName

   sec.  %
1480.474 100.00   Total
756.717  51.11   crc32
387.767  26.19   static@0x54999 (snappy-1.0.4.1-libsnappyjava.so)
 54.814   3.70   
org.apache.cassandra.io.compress.CompressedRandomAccessReader.init(java.lang.String,
 org.apache.cassandra.io.compress.CompressionMetadata, boolean)
 46.676   3.15   
org.apache.cassandra.io.util.RandomAccessReader.init(java.io.File, int, 
boolean)
 45.697   3.09   Copy::pd_disjoint_words(HeapWord*, HeapWord*, unsigned long)
 39.417   2.66   memcpy
 36.931   2.49   static@0xd8e9 (libpthread-2.5.so)
 23.272   1.57   CompactibleFreeListSpace::block_size(const HeapWord*) const
 22.766   1.54   SpinPause
 12.593   0.85   BlockOffsetArrayNonContigSpace::block_start_unsafe(const 
void*) const
  9.304   0.63   CardTableModRefBSForCTRS::card_will_be_scanned(signed char)
  8.468   0.57   CardTableModRefBS::non_clean_card_iterate_work(MemRegion, 
MemRegionClosure*, bool)
  8.051   0.54   
ParallelTaskTerminator::offer_termination(TerminatorTerminator*)
  5.400   0.36   madvise
  4.619   0.31   CardTableModRefBS::process_chunk_boundaries(Space*, 
DirtyCardToOopClosure*, MemRegion, MemRegion, signed char**, unsigned long, 
unsigned long)
  1.584   0.11   CardTableModRefBS::dirty_card_range_after_reset(MemRegion, 
bool, int)
  1.551   0.10   SweepClosure::do_blk_careful(HeapWord*)


Hot Methods After the patch:
sec.  %
537.681 100.00   Total
529.719  98.52   static@0x54999 (snappy-1.0.4.1-libsnappyjava.so)
4.168   0.78   memcpy
0.143   0.03   Unknown
0.121   0.02   send
0.121   0.02   sun.misc.Unsafe.park(boolean, long)
0.110   0.02   sun.misc.Unsafe.unpark(java.lang.Object)
0.088   0.02   Interpreter
0.077   0.01   org.apache.cassandra.utils.EstimatedHistogram.max()
0.077   0.01   recv
0.066   0.01   SpinPause
0.055   0.01   org.apache.cassandra.utils.EstimatedHistogram.mean()
0.044   0.01   java.lang.Object.wait(long)
0.044   0.01   org.apache.cassandra.utils.EstimatedHistogram.min()
0.044   0.01   __pthread_cond_signal
0.044   0.01   vtable stub
0.033   0.01   java.lang.Object.notify()
0.033   0.01   
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(java.lang.Runnable)
0.033   0.01   
org.apache.cassandra.io.compress.CompressedMappedFileDataInput.read()
0.033   0.01   PhaseLive::compute(unsigned)
0.033   0.01   poll
0.022   0.00   Arena::contains(const void*) const
0.022   0.00   CompactibleFreeListSpace::free() const
0.022   0.00   I2C/C2I adapters
0.022   0.00   IndexSetIterator::advance_and_next()
0.022   0.00   java.lang.Class.forName0(java.lang.String, boolean, 
java.lang.ClassLoader)
0.022   0.00   java.lang.Long.getChars(long, int, char[])
0.022   0.00   java.nio.Bits.swap(int)



Before this patch response times (With crc chance set to 0):
Epoch   Rds/s   RdLat   Wrts/s  WrtLat %user   %sys  %idle  
 %iowait %steal  md0r/s  w/s rMB/s   wMB/s   NetRxKb NetTxKb Percentiles
 ReadWrite   Compacts
1324587443  15  186.305 00.000   27.85  0.0271.83   
0.24  0.053.890.000.120.0041  45  99th 
545.791 ms 95th 454.826 ms 99th 0.00 ms95th 0.00 msPen/0
1324587455  15  1142.712   00.000   39.55  0.1357.61
   2.50  0.21118.30  0.302.200.0034  36  99th 
8409.007 ms95th 8409.007 ms99th 0.00 ms95th 0.00 msPen/0
1324587467  10  171.808 00.000   23.83  0.0476.05   
0.04   0.054.800.000.140.00127 33  99th 
454.826 ms 95th 315.852 ms 99th 0.00 ms95th 0.00 msPen/0
1324587478  10  182.775 00.000   20.43  0.0479.47   
0.01  0.051.600.400.040.0030  37  99th 
379.022 ms 95th 379.022 ms 99th 0.00 ms95th 0.00 msPen/0
1324587490  13  190.893 00.000   27.58  0.0372.20   
0.14  0.063.200.500.090.0039  42  99th 
545.791 ms 95th 379.022 ms 99th 0.00 ms95th 0.00 msPen/0
1324587503  28  358.719 00.000   52.24  0.0846.20   
1.40  0.09159.40  0.003.160.00196 71  99th 
3379.391 ms95th 943.127 ms 99th 0.00 ms95th 0.00 msPen/0
1324587517  13  194.281 00.000   16.68  0.0283.23   
0.04  0.022.400.300.070.0038  41  99th 
785.939 ms 95th 545.791 ms 99th 0.00 ms95th 0.00 msPen/0
1324587535  36  662.410 00.000   58.34  0.08

[jira] [Issue Comment Edited] (CASSANDRA-3610) Checksum improvement for CompressedRandomAccessReader

2011-12-22 Thread Vijay (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175035#comment-13175035
 ] 

Vijay edited comment on CASSANDRA-3610 at 12/22/11 8:29 PM:


Oops, I pasted the wrong data; the above data is without any heap settings, 
hence GC becomes a bottleneck... Please see below :)


/usr/java/latest/jre/bin/java -XX:+UseThreadPriorities 
-XX:ThreadPriorityPolicy=42 -Xms8G -Xmx8G -Xmn2G 
-XX:+HeapDumpOnOutOfMemoryError -Xss128k -XX:+UseParNewGC 
-XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 
-XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -jar 
TestCRC32Performance.jar 


||bytes||PureJava MB/sec||Native MB/sec||Random PureJava MB/sec||Native MB/sec||
| 1       |121.124|11.866 |
| 2       |161.981|23.851 |
| 4       |204.718|45.486 |
| 8       |297.229|76.296 |
| 16      |379.268|117.326|
| 32      |440.153|157.711|
| 64      |468.143|193.304|| PureJava |0-64 |272.921 MB/sec|| Native |0-64 |145.289 MB/sec|
| 128     |500.006|219.657|| PureJava |0-128 |367.816 MB/sec|| Native |0-128 |186.861 MB/sec|
| 256     |511.572|234.052|| PureJava |0-256 |432.433 MB/sec|| Native |0-256 |214.047 MB/sec|
| 512     |517.550|242.634|| PureJava |0-512 |474.074 MB/sec|| Native |0-512 |231.047 MB/sec|
| 1024    |516.994|246.424|| PureJava |0-1024 |498.055 MB/sec|| Native |0-1024 |241.056 MB/sec|
| 2048    |518.095|248.529|| PureJava |0-2048 |509.960 MB/sec|| Native |0-2048 |245.683 MB/sec|
| 4096    |522.002|249.755|| PureJava |0-4096 |518.226 MB/sec|| Native |0-4096 |248.062 MB/sec|
| 8192    |522.795|250.316|| PureJava |0-8192 |520.326 MB/sec|| Native |0-8192 |249.519 MB/sec|
| 16384   |522.521|250.484|| PureJava |0-16384 |522.480 MB/sec|| Native |0-16384 |250.002 MB/sec|
| 32768   |521.098|250.604|| PureJava |0-32768 |520.349 MB/sec|| Native |0-32768 |250.494 MB/sec|
| 65536   |520.973|250.837|| PureJava |0-65536 |520.392 MB/sec|| Native |0-65536 |249.063 MB/sec|
| 131072  |510.129|248.949|| PureJava |0-131072 |516.246 MB/sec|| Native |0-131072 |249.535 MB/sec|
| 262144  |513.534|249.506|| PureJava |0-262144 |514.407 MB/sec|| Native |0-262144 |250.617 MB/sec|
| 524288  |519.554|250.696|| PureJava |0-524288 |520.402 MB/sec|| Native |0-524288 |251.048 MB/sec|
| 1048576 |519.559|250.557|| PureJava |0-1048576 |520.403 MB/sec|| Native |0-1048576 |250.734 MB/sec|
| 2097152 |519.259|250.456|| PureJava |0-2097152 |519.337 MB/sec|| Native |0-2097152 |250.299 MB/sec|
| 4194304 |518.649|250.470|| PureJava |0-4194304 |518.495 MB/sec|| Native |0-4194304 |250.523 MB/sec|
| 8388608 |501.986|248.044|| PureJava |0-8388608 |509.521 MB/sec|| Native |0-8388608 |248.626 MB/sec|
| 16777216|508.201|247.587|| PureJava |0-16777216 |505.258 MB/sec|| Native |0-16777216 |249.558 MB/sec|


[vijay_tcasstest@vijay_tcass--1a-i-aad629c8 ~]$ /usr/java/latest/jre/bin/java 
-version
java version 1.6.0_27
Java(TM) SE Runtime Environment (build 1.6.0_27-b07)
Java HotSpot(TM) 64-Bit Server VM (build 20.2-b06, mixed mode)
[vijay_tcasstest@vijay_tcass--1a-i-aad629c8 ~]$ 
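
For reference, the shape of such a throughput measurement is roughly this (a 
hedged sketch, not the attached TestCRC32Performance code; the attached jar is 
also assumed to compare a pure-Java CRC32 implementation against 
java.util.zip.CRC32):

{code}
import java.util.Random;
import java.util.zip.CRC32;

// Hedged sketch of a CRC32 throughput loop: checksum `bytes`-sized buffers
// repeatedly and report MB/sec for that buffer size.
public final class Crc32Throughput
{
    public static void main(String[] args)
    {
        int bytes = Integer.parseInt(args[0]);
        long totalBytes = 1L << 30; // checksum ~1GB in `bytes`-sized chunks
        byte[] buffer = new byte[bytes];
        new Random(42).nextBytes(buffer);

        long start = System.nanoTime();
        CRC32 checksum = new CRC32();
        for (long done = 0; done < totalBytes; done += bytes)
        {
            checksum.reset();
            checksum.update(buffer, 0, buffer.length);
        }
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.printf("%d bytes: %.3f MB/sec%n", bytes, (totalBytes / (1024.0 * 1024.0)) / seconds);
    }
}
{code}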



  was (Author: vijay2...@yahoo.com):
Ooops pasted the wrong data the above data is without any Heap settings 
hence GC becomes a bottleneck... Plz see the below :)


/usr/java/latest/jre/bin/java -XX:+UseThreadPriorities 
-XX:ThreadPriorityPolicy=42 -Xms8G -Xmx8G -Xmn2G 
-XX:+HeapDumpOnOutOfMemoryError -Xss128k -XX:+UseParNewGC 
-XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 
-XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -jar 
TestCRC32Performance.jar 


||bytes||PureJava MB/sec||Native MB/sec||Random PureJava MB/sec||Native MB/sec||
| 1 |121.124|11.866 |
| 2 |161.981|23.851 |
| 4 |204.718|45.486 |
| 8 |297.229|76.296 |
| 16|379.268|117.326|
| 32|440.153|157.711|
| 64|468.143|193.304|| PureJava |0-64   |272.921 
MB/sec|| Native|0-64   |145.289 MB/sec|
| 128   |500.006|219.657|| PureJava |0-128  

[jira] [Issue Comment Edited] (CASSANDRA-3610) Checksum improvement for CompressedRandomAccessReader

2011-12-22 Thread Vijay (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175035#comment-13175035
 ] 

Vijay edited comment on CASSANDRA-3610 at 12/22/11 8:30 PM:


Oops, I pasted the wrong data; the above data is without any heap settings and 
on OpenJDK... Please see below :)


/usr/java/latest/jre/bin/java -XX:+UseThreadPriorities 
-XX:ThreadPriorityPolicy=42 -Xms8G -Xmx8G -Xmn2G 
-XX:+HeapDumpOnOutOfMemoryError -Xss128k -XX:+UseParNewGC 
-XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 
-XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -jar 
TestCRC32Performance.jar 


||bytes||PureJava MB/sec||Native MB/sec||Random PureJava MB/sec||Native MB/sec||
| 1       |121.124|11.866 |
| 2       |161.981|23.851 |
| 4       |204.718|45.486 |
| 8       |297.229|76.296 |
| 16      |379.268|117.326|
| 32      |440.153|157.711|
| 64      |468.143|193.304|| PureJava |0-64 |272.921 MB/sec|| Native |0-64 |145.289 MB/sec|
| 128     |500.006|219.657|| PureJava |0-128 |367.816 MB/sec|| Native |0-128 |186.861 MB/sec|
| 256     |511.572|234.052|| PureJava |0-256 |432.433 MB/sec|| Native |0-256 |214.047 MB/sec|
| 512     |517.550|242.634|| PureJava |0-512 |474.074 MB/sec|| Native |0-512 |231.047 MB/sec|
| 1024    |516.994|246.424|| PureJava |0-1024 |498.055 MB/sec|| Native |0-1024 |241.056 MB/sec|
| 2048    |518.095|248.529|| PureJava |0-2048 |509.960 MB/sec|| Native |0-2048 |245.683 MB/sec|
| 4096    |522.002|249.755|| PureJava |0-4096 |518.226 MB/sec|| Native |0-4096 |248.062 MB/sec|
| 8192    |522.795|250.316|| PureJava |0-8192 |520.326 MB/sec|| Native |0-8192 |249.519 MB/sec|
| 16384   |522.521|250.484|| PureJava |0-16384 |522.480 MB/sec|| Native |0-16384 |250.002 MB/sec|
| 32768   |521.098|250.604|| PureJava |0-32768 |520.349 MB/sec|| Native |0-32768 |250.494 MB/sec|
| 65536   |520.973|250.837|| PureJava |0-65536 |520.392 MB/sec|| Native |0-65536 |249.063 MB/sec|
| 131072  |510.129|248.949|| PureJava |0-131072 |516.246 MB/sec|| Native |0-131072 |249.535 MB/sec|
| 262144  |513.534|249.506|| PureJava |0-262144 |514.407 MB/sec|| Native |0-262144 |250.617 MB/sec|
| 524288  |519.554|250.696|| PureJava |0-524288 |520.402 MB/sec|| Native |0-524288 |251.048 MB/sec|
| 1048576 |519.559|250.557|| PureJava |0-1048576 |520.403 MB/sec|| Native |0-1048576 |250.734 MB/sec|
| 2097152 |519.259|250.456|| PureJava |0-2097152 |519.337 MB/sec|| Native |0-2097152 |250.299 MB/sec|
| 4194304 |518.649|250.470|| PureJava |0-4194304 |518.495 MB/sec|| Native |0-4194304 |250.523 MB/sec|
| 8388608 |501.986|248.044|| PureJava |0-8388608 |509.521 MB/sec|| Native |0-8388608 |248.626 MB/sec|
| 16777216|508.201|247.587|| PureJava |0-16777216 |505.258 MB/sec|| Native |0-16777216 |249.558 MB/sec|


[vijay_tcasstest@vijay_tcass--1a-i-aad629c8 ~]$ /usr/java/latest/jre/bin/java 
-version
java version 1.6.0_27
Java(TM) SE Runtime Environment (build 1.6.0_27-b07)
Java HotSpot(TM) 64-Bit Server VM (build 20.2-b06, mixed mode)
[vijay_tcasstest@vijay_tcass--1a-i-aad629c8 ~]$ 



  was (Author: vijay2...@yahoo.com):
Ooops pasted the wrong data the above data is without any Heap settings 
hence GC becomes a bottleneck... Plz see the below :)


/usr/java/latest/jre/bin/java -XX:+UseThreadPriorities 
-XX:ThreadPriorityPolicy=42 -Xms8G -Xmx8G -Xmn2G 
-XX:+HeapDumpOnOutOfMemoryError -Xss128k -XX:+UseParNewGC 
-XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 
-XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -jar 
TestCRC32Performance.jar 


||bytes||PureJava MB/sec||Native MB/sec||Random PureJava MB/sec||Native MB/sec||
| 1 |121.124|11.866 |
| 2 |161.981|23.851 |
| 4 |204.718|45.486 |
| 8 |297.229|76.296 |
| 16|379.268|117.326|
| 32|440.153|157.711|
| 64|468.143|193.304|| PureJava |0-64   |272.921 
MB/sec|| Native|0-64   |145.289 MB/sec|
| 128   |500.006|219.657|| PureJava |0-128  |367.816 
MB/sec|| 

[jira] [Issue Comment Edited] (CASSANDRA-3112) Make repair fail when an unexpected error occurs

2011-12-01 Thread Vijay (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13161327#comment-13161327
 ] 

Vijay edited comment on CASSANDRA-3112 at 12/2/11 12:22 AM:


Hi Sylvain,

I have seen the following issues in repairs, especially in AWS multi-DC 
deployments...
1) The stream session or the stream doesn't make any progress (read timeout/rpc 
timeout - a socket timeout might help).
2) Validation compaction completed and the result tree was sent, but it was 
never received.
3) A repair request was sent but the receiving node didn't receive it.
4) When we have a big repair which runs for hours, it would be better to retry 
the failed part rather than doing a full retry.

Do you think it is worth addressing these in a separate ticket? Otherwise I 
will close CASSANDRA-3487.


  was (Author: vijay2...@yahoo.com):
Hi Sylvain,

I have seen the following issues in the Repairs specially in AWS Multi DC 
deployments...
1) Stream session or the stream doesn't have any progress (Read Timeout/rpc 
timeout - Socket timeout might help)
2) Validation compaction completed but the result tree is sent but not received?
3) Repair request is sent but the receiving node didn't receive it?
4) When we have a big repair which runs for hours it will be better to retry 
the failed part rather than full retry.

Do you think it is worth to address this in a separate ticket? else i will 
close CASSANDRA-3487.

  
 Make repair fail when an unexpected error occurs
 

 Key: CASSANDRA-3112
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3112
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Minor
  Labels: repair
 Fix For: 1.0.6

 Attachments: 0003-Report-streaming-errors-back-to-repair-v4.patch, 
 0004-Reports-validation-compaction-errors-back-to-repair-v4.patch


 CASSANDRA-2433 makes it so that nodetool repair will fail if a node 
 participating in the repair dies before completing its part of the repair. This 
 handles most of the situations where repair was previously hanging, but repair 
 can still hang if an unexpected error occurs during either the merkle tree 
 creation (say an on-disk corruption triggers an IOError) or during streaming 
 (though I'm not sure what could make streaming fail outside of 'one of the 
 nodes died' (besides a bug)).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-1740) Nodetool commands to query and stop compaction, repair, cleanup and scrub

2011-11-24 Thread Vijay (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13156810#comment-13156810
 ] 

Vijay edited comment on CASSANDRA-1740 at 11/24/11 5:27 PM:


Yeah, I did try it and saw the exception; I was not sure what was wrapping it 
again in an RTE.

To be more clear:
The additional RTE wrap is because of WrappedRunnable, which catches all 
exceptions just to wrap them in an RTE (please check 
http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/lang/RuntimeException.java).
 Hope this helps :)
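
For context, the wrapping in question looks roughly like this (a hedged 
paraphrase, not a verbatim copy of the class):

{code}
// Hedged paraphrase of the wrapping described above: the runnable catches
// any checked exception from the task body and rethrows it as a
// RuntimeException, which is the extra RTE layer seen in the stack trace.
public abstract class WrappedRunnableSketch implements Runnable
{
    public final void run()
    {
        try
        {
            runMayThrow();
        }
        catch (Exception e)
        {
            throw new RuntimeException(e);
        }
    }

    protected abstract void runMayThrow() throws Exception;
}
{code}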


  was (Author: vijay2...@yahoo.com):
Yeah I did try it and saw the exception, I was not sure who is wrapping it 
again with rte.

Sent from my iPhone



  
 Nodetool commands to query and stop compaction, repair, cleanup and scrub
 -

 Key: CASSANDRA-1740
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1740
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Chip Salzenberg
Assignee: Vijay
Priority: Minor
  Labels: compaction
 Fix For: 1.0.4

 Attachments: 0001-Patch-to-Stop-compactions-v2.patch, 
 0001-Patch-to-Stop-compactions-v3.patch, 
 0001-Patch-to-Stop-compactions-v4.patch, 
 0001-Patch-to-Stop-compactions-v5.patch, 
 0001-Patch-to-Stop-compactions.patch, CASSANDRA-1740.patch

   Original Estimate: 24h
  Remaining Estimate: 24h

 The only way to stop compaction, repair, cleanup, or scrub in progress is to 
 stop and restart the entire Cassandra server.  Please provide nodetool 
 commands to query whether such things are running, and stop them if they are.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-3502) Repair of one CF streams data of the other CF's

2011-11-17 Thread Vijay (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13152274#comment-13152274
 ] 

Vijay edited comment on CASSANDRA-3502 at 11/17/11 7:39 PM:


Was actually looking at the 0.8.8 patch and thinking of fixing it. Seems like 
it is fixed in 1.0.0, thanks!

  was (Author: vijay2...@yahoo.com):
Was actually looking at 0.8.8 patch and thinking of fixing it. seems like 
it is fixed in 1.0.0
  
 Repair of one CF streams data of the other CF's
 ---

 Key: CASSANDRA-3502
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3502
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1
 Environment: JVM
Reporter: Vijay
Assignee: Vijay
Priority: Minor
 Fix For: 1.1


 Currently a manual repair on a CF with inconsistent data will stream those 
 ranges for all the CFs in the keyspace.
 StreamIn.requestRanges() just takes the table name as an argument and requests 
 the inconsistent ranges for all the CFs within the keyspace.
 The expected behaviour is to stream only that CF's ranges.
 This triggers a lot more compaction of the CFs which are not inconsistent, and 
 a lot more traffic between the nodes.
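
 A hedged sketch of the fix implied by the description (narrowing the stream 
 request to specific column families; types and names are simplified stand-ins, 
 not the real streaming API):

{code}
import java.util.Collection;

// Hedged sketch: the streaming request carries the column families being
// repaired, so only their ranges are transferred instead of every CF in the
// keyspace. Strings stand in for the real Range/InetAddress types.
public final class StreamRequest
{
    public final String keyspace;
    public final Collection<String> columnFamilies; // only the CFs under repair
    public final Collection<String> ranges;         // the inconsistent ranges

    public StreamRequest(String keyspace, Collection<String> columnFamilies, Collection<String> ranges)
    {
        this.keyspace = keyspace;
        this.columnFamilies = columnFamilies;
        this.ranges = ranges;
    }
}
{code}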

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-3502) Repair of one CF streams data of the other CF's

2011-11-17 Thread Vijay (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13152274#comment-13152274
 ] 

Vijay edited comment on CASSANDRA-3502 at 11/17/11 7:38 PM:


Was actually looking at the 0.8.8 patch and thinking of fixing it. Seems like 
it is fixed in 1.0.0.

  was (Author: vijay2...@yahoo.com):
Was actually looking at 0.8.6 patch and thinking of fixing it. seems like 
it is fixed in 1.0.0
  
 Repair of one CF streams data of the other CF's
 ---

 Key: CASSANDRA-3502
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3502
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1
 Environment: JVM
Reporter: Vijay
Assignee: Vijay
Priority: Minor
 Fix For: 1.1


 Currently a manual repair on a CF with inconsistent data will stream those 
 ranges for all the CFs in the keyspace.
 StreamIn.requestRanges() just takes the table name as an argument and requests 
 the inconsistent ranges for all the CFs within the keyspace.
 The expected behaviour is to stream only that CF's ranges.
 This triggers a lot more compaction of the CFs which are not inconsistent, and 
 a lot more traffic between the nodes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira