Running multiple compactions concurrently

2011-02-25 Thread Daniel Josefsson
We experienced a java.lang.NegativeArraySizeException when upgrading to
0.7.2 in staging. The proposed solution (running a major compaction) seems
to have solved it. However, it took a lot of time to run.

Is it safe to invoke a major compaction on all of the machines concurrently?

I can't see a reason why it wouldn't be, but I want to be sure :)
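
My understanding is that a major compaction is local to each node
(nodetool just talks to that one node over JMX), so running them
concurrently would just mean issuing the command to each host, something
like this (node1..node3 are placeholder hostnames):

    for h in node1 node2 node3; do
        nodetool -h $h compact &
    done
    wait

Though since a major compaction is I/O-heavy and needs enough free disk
space to rewrite its SSTables, staggering the nodes might be gentler on
the cluster.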

Thanks,
Daniel


Re: java.io.IOException in CompactionExecutor

2011-02-21 Thread Daniel Josefsson
There is no antivirus program or similar running on that machine, I
guess?

That could definitely lock the file if Cassandra creates the .tmp file
and then tries to rename it fairly shortly afterwards.

/Daniel

On Mon, 2011-02-21 at 11:34 +, Aaron Morton wrote:
 The code creates a new .tmp file in the saved_caches directory and
 then renames it to a non-.tmp file name, so there is nothing else with
 a handle open. The rename is to an existing file, though.
 
 
 Ruslan, can you please raise a bug against 0.7.2 for this and include
 the platform.
 
 
 Thanks
 Aaron
 
 
 On 22 Feb, 2011, at 12:22 AM, Norman Maurer nor...@apache.org wrote:
 
 
  The problem on Windows is that it is a bit more strict about renaming
  a file while a handle to it is still open.

  So maybe some stream on the file was not closed.
  
  Bye,
  Norman
  
  
  2011/2/21 Aaron Morton aa...@thelastpickle.com:
   From the F:\ path I assume you are on Windows? What version?
   I just did a quick test on Ubuntu 10.04 and it works, but the
   File.renameTo() function used has different behavior depending on
   the host OS. There may be some issues on Windows:
   http://stackoverflow.com/questions/1000183/reliable-file-renameto-alternative-on-windows
   Aaron
  
  
   On 21 Feb, 2011, at 11:43 PM, ruslan usifov ruslan.usi...@gmail.com wrote:
  
   I launched a clean Cassandra 0.7.2 installation, and after a few days
   I see the following error in system.log (more than 10 times):
  
  
   ERROR [CompactionExecutor:1] 2011-02-19 02:56:17,965
   AbstractCassandraDaemon.java (line 114) Fatal exception in thread
   Thread[CompactionExecutor:1,1,main]
   java.lang.RuntimeException: java.io.IOException: Unable to rename cache to
   F:\Cassandra\7.2\saved_caches\system-LocationInfo-KeyCache
       at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
       at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
       at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
       at java.util.concurrent.FutureTask.run(FutureTask.java:138)
       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
       at java.lang.Thread.run(Thread.java:662)
   Caused by: java.io.IOException: Unable to rename cache to
   F:\Cassandra\7.2\saved_caches\system-LocationInfo-KeyCache
       at org.apache.cassandra.io.sstable.CacheWriter.saveCache(CacheWriter.java:85)
       at org.apache.cassandra.db.CompactionManager$9.runMayThrow(CompactionManager.java:746)
       at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
       ... 6 more
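
The StackOverflow thread Aaron linked suggests working around this by
deleting the existing target and retrying the rename. A minimal
illustrative sketch of that idea (a hypothetical helper, not Cassandra's
actual code):

    import java.io.File;

    public final class SafeRename {
        // On Windows, File.renameTo() fails if the target exists or a
        // handle is still open, so remove the target and retry briefly.
        public static boolean renameWithRetry(File tmp, File target)
                throws InterruptedException {
            for (int attempt = 0; attempt < 5; attempt++) {
                if (target.exists() && !target.delete()) {
                    Thread.sleep(100); // something still holds a handle
                    continue;
                }
                if (tmp.renameTo(target)) {
                    return true;
                }
                Thread.sleep(100); // transient lock (e.g. a virus scan)
            }
            return false;
        }
    }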
  
  
  


Re: Upgrading from 0.6 to 0.7.0

2011-01-25 Thread Daniel Josefsson
Yes, it should be possible to try.

We have not yet quite decided which way to go; I think operations won't
be happy with upgrading both the server and the client at the same time.

Either we upgrade to 0.7.0 (which currently does not look very likely), or
we go to 0.6.9 and patch it with TTL support. I'm not too sure what a
possible future upgrade would look like if we use the TTL patch, though.

/Daniel

2011/1/21 Aaron Morton aa...@thelastpickle.com

 Yup, you can use diff ports and you can give them different cluster names
 and different seed lists.

 After you upgrade the second cluster partition, the data should repair
 across, either via read repair (RR) or the hinted handoffs (HH) that were
 stored while the first partition was down. The easiest thing would be to
 run nodetool repair, then a cleanup to remove any leftover data.

 AFAIK the file formats are compatible, but drain the nodes before
 upgrading to clear the commit log.
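
 Roughly, with <host> as a placeholder for each node, the corresponding
 commands would be:

     nodetool -h <host> drain    # before upgrade: flush memtables, empty the commit log
     nodetool -h <host> repair   # after upgrade: pull missed writes from the replicas
     nodetool -h <host> cleanup  # remove data the node no longer owns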

 Can you test this on a non-production system?

 Aaron
 (we really need to write some upgrade docs :))

 On 21/01/2011, at 10:42 PM, Dave Gardner dave.gard...@imagini.net wrote:

 What about executing writes against both clusters during the changeover?
 Interested in this topic because we're currently thinking about the same
 thing - how to upgrade to 0.7 without any interruption.

 Dave

 On 21 January 2011 09:20, Daniel Josefsson jid...@gmail.com wrote:

 No, what I'm thinking of is having two clusters (0.6 and 0.7) running on
 different ports so they can't find each other. Or isn't that configurable?

 Then, when I have the two clusters, I could upgrade all of the clients to
 run against the new cluster, and finally upgrade the rest of the Cassandra
 nodes.

 I don't know how the new cluster would cope with the new data in the old
 cluster when those nodes are upgraded, though.

 /Daniel

 2011/1/20 Aaron Morton aa...@thelastpickle.com

 I'm not sure if you're suggesting running a mixed-mode cluster there, but
 AFAIK the changes to the internode protocol prohibit this. The nodes will
 probably see each other via gossip, but the way the messages define their
 purpose (their verb handler) has been changed.

 Out of interest, which is more painful: stopping the cluster and upgrading
 it, or upgrading your client code?

 Aaron

 On 21/01/2011, at 12:35 AM, Daniel Josefsson jid...@gmail.com wrote:

 In our case our replication factor is more than half the number of nodes
 in the cluster.

 Would it be possible to do the following:

- Upgrade half of them
- Change Thrift Port and inter-server port (is this the
storage_port?)
- Start them up
- Upgrade clients one by one
    - Upgrade the rest of the servers

 Or might we get some kind of data collision when still writing to the old
 cluster as the new storage is being used?

 /Daniel






Re: Upgrading from 0.6 to 0.7.0

2011-01-21 Thread Daniel Josefsson
No, what I'm thinking of is having two clusters (0.6 and 0.7) running on
different ports so they can't find each other. Or isn't that configurable?
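
Presumably the new 0.7 cluster's cassandra.yaml would then look something
like this (just a sketch; the cluster name, seed address and port numbers
are made-up examples):

    cluster_name: 'NewCluster'
    seeds:
        - 10.0.0.11        # list only other 0.7 nodes here
    storage_port: 7100     # inter-node port, default 7000
    rpc_port: 9170         # Thrift port, default 9160

With a different cluster_name, seed list and ports, the 0.6 and 0.7 nodes
should not be able to find each other via gossip.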

Then, when I have the two clusters, I could upgrade all of the clients to
run against the new cluster, and finally upgrade the rest of the Cassandra
nodes.

I don't know how the new cluster would cope with the new data in the old
cluster when those nodes are upgraded, though.

/Daniel

2011/1/20 Aaron Morton aa...@thelastpickle.com

 I'm not sure if you're suggesting running a mixed-mode cluster there, but
 AFAIK the changes to the internode protocol prohibit this. The nodes will
 probably see each other via gossip, but the way the messages define their
 purpose (their verb handler) has been changed.

 Out of interest, which is more painful: stopping the cluster and upgrading
 it, or upgrading your client code?

 Aaron

 On 21/01/2011, at 12:35 AM, Daniel Josefsson jid...@gmail.com wrote:

 In our case our replication factor is more than half the number of nodes in
 the cluster.

 Would it be possible to do the following:

- Upgrade half of them
- Change Thrift Port and inter-server port (is this the storage_port?)
- Start them up
- Upgrade clients one by one
 - Upgrade the rest of the servers

 Or might we get some kind of data collision when still writing to the old
 cluster as the new storage is being used?

 /Daniel




Re: Upgrading from 0.6 to 0.7.0

2011-01-20 Thread Daniel Josefsson
In our case our replication factor is more than half the number of nodes in
the cluster.

Would it be possible to do the following:

   - Upgrade half of them
   - Change Thrift Port and inter-server port (is this the storage_port?)
   - Start them up
   - Upgrade clients one by one
   - Upgrade the rest of the servers

Or might we get some kind of data collision when still writing to the old
cluster as the new storage is being used?

/Daniel


Upgrading from 0.6 to 0.7.0

2011-01-19 Thread Daniel Josefsson
Hi,

I've been looking around for how to upgrade from 0.6 to 0.7, and it looks
like you need to shut down the whole cluster, plus upgrade the clients at
the same time.

Our live Cassandra instances are currently running 0.6.4 with an
ever-growing database, and we need the new TTL feature available in 0.7.
For the client we use Pelops.
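
As I understand it, in 0.7 the TTL is just an extra field on the Thrift
Column, which Pelops wraps in its own Mutator API. A rough sketch against
the raw 0.7 Thrift interface (the keyspace, column family, and key names
here are made up):

    import java.nio.ByteBuffer;
    import org.apache.cassandra.thrift.Cassandra;
    import org.apache.cassandra.thrift.Column;
    import org.apache.cassandra.thrift.ColumnParent;
    import org.apache.cassandra.thrift.ConsistencyLevel;

    public class TtlSketch {
        // Insert a column that Cassandra expires automatically.
        static void insertWithTtl(Cassandra.Client client) throws Exception {
            client.set_keyspace("Keyspace1"); // 0.7: keyspace is set per connection

            Column col = new Column(
                    ByteBuffer.wrap("name".getBytes("UTF-8")),
                    ByteBuffer.wrap("value".getBytes("UTF-8")),
                    System.currentTimeMillis() * 1000); // microseconds by convention
            col.setTtl(86400); // seconds; the column disappears a day after the write

            client.insert(ByteBuffer.wrap("row1".getBytes("UTF-8")),
                    new ColumnParent("Standard1"),
                    col,
                    ConsistencyLevel.QUORUM);
        }
    }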

Has anyone done a similar upgrade of a live cluster? How did you go about
it?

Is there at least a way to avoid having to upgrade both server side and
client side simultaneously?

Thanks,
Daniel