about validity of recipe A node join using external data copy methods

2013-01-08 Thread DE VITO Dominique
Hi,

Edward Capriolo described in his Cassandra book a faster way [1] to start new 
nodes when the cluster size doubles, from N to 2*N.

The idea is to split each token range in two, so that after the split it is 
handled by 2 nodes: the existing one and a new one. To start a new node, one 
needs to:
- copy the data records from the corresponding node (without the system 
records)
- start the new node with auto_bootstrap: false

This raises 2 questions:

A) is this recipe still valid with v1.1 and v1.2?

B) do we still need to start the new node with auto_bootstrap: false?
My guess is yes, as the fact that the bootstrap phase has happened is not 
recorded in the data records.

Thanks.

Dominique

[1] see the recipe "A node join using external data copy methods", page 165


Re: Astyanax

2013-01-08 Thread Markus Klems
The wiki? https://github.com/Netflix/astyanax/wiki


On Tue, Jan 8, 2013 at 2:44 PM, Everton Lima peitin.inu...@gmail.com wrote:

 Hi,
 Can someone indicate a good tutorial or book to learn Astyanax?

 Thanks

 --
 Everton Lima Aleixo
 Master's student in Computer Science at UFG
 Programmer at LUPA




Re: about validity of recipe A node join using external data copy methods

2013-01-08 Thread Edward Capriolo
Basically this recipe is from the old days when we had anti-compaction. Now
streaming is very efficient, rarely fails, and there is no need to do it this
way anymore. This recipe will be dropped from the second edition. It still
likely works, except when using counters.

Edward

On Tue, Jan 8, 2013 at 7:27 AM, DE VITO Dominique 
dominique.dev...@thalesgroup.com wrote:

   Hi,



 Edward Capriolo described in his Cassandra book a faster way [1] to start
 new nodes if the cluster size doubles, from N to 2 *N.



 It's about splitting in 2 parts each token range taken in charge, after
 the split, with 2 nodes: the existing one, and a new one. And for starting
 a new node, one needs to:

 - copy the data records from the corresponding node (without the system
 records)

 - start the new node with auto_bootstrap: false



 This raises 2 questions:



 A) is this recipe still valid with v1.1 and v1.2 ?



 B) do we still need to start the new node with auto_bootstrap: false ?

 My guess is yes as the happening of the bootstrap phase is not recorded
 into the data records.



 Thanks.



 Dominique



 [1] see recipe A node join using external data copy methods, page 165



Re: help tuning compaction..hours of run to get 0% compaction....

2013-01-08 Thread Jim Cistaro
One metric to watch is pending compactions (via nodetool compactionstats).  
This count will give you some idea of whether you are falling behind with 
compactions.  The other measure is how long you are compacting after your 
inserts have stopped.

If I understand correctly, since you never update the data, that would explain 
why the compaction logging shows 100% of orig.  With size-tiered, you are 
flushing small files, compacting when you get 4 of like size, etc.  Since you 
have no updates, the compaction will not shrink the data.

As Aaron said, use iostat -x (or dstat) to see if you are taxing the disks.  If
so, then leveled compaction may be your option (for reasons already stated).  
If not taxing the disks, then you might want to increase your compaction 
throughput, as you suggested.
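
For reference, a minimal set of commands for the checks above (the host and the
throughput value are only examples to adapt to your environment):

    # watch the backlog of pending compactions on a node
    nodetool -h 127.0.0.1 compactionstats

    # per-CF statistics, including local write latency and sstable counts
    nodetool -h 127.0.0.1 cfstats

    # check whether the disks are saturated while compactions run
    iostat -x 5

    # raise the compaction throughput cap at runtime, in MB/s (the yaml default,
    # compaction_throughput_mb_per_sec, is 16; 0 disables throttling)
    nodetool -h 127.0.0.1 setcompactionthroughput 32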

Depending on what version you are using, another thing to possibly tune is the 
size of sstables when flushed to disk.  In your case of insert only, the 
smaller the flush size, the more times each row is going to be rewritten during 
compaction (hence increased I/O).

jc

From: Edward Capriolo edlinuxg...@gmail.com
Reply-To: user@cassandra.apache.org
Date: Monday, January 7, 2013 2:33 PM
To: user@cassandra.apache.org
Subject: Re: help turning compaction..hours of run to get 0% compaction

There is some point where you simply need more machines.

On Mon, Jan 7, 2013 at 5:02 PM, Michael Kjellman 
mkjell...@barracuda.com wrote:
Right, I guess I'm saying that you should try loading your data with leveled 
compaction and see how your compaction load is.

Your work load sounds like leveled will fit much better than size tiered.

From: Brian Tarbox tar...@cabotresearch.com
Reply-To: user@cassandra.apache.org
Date: Monday, January 7, 2013 1:58 PM
To: user@cassandra.apache.org
Subject: Re: help turning compaction..hours of run to get 0% compaction

The problem I see is that it already takes me more than 24 hours just to load 
my data...during which time the logs say I'm spending tons of time doing 
compaction.  For example, in the last 72 hours I've consumed 20 hours per machine 
on compaction.

Can I conclude from that that I should be (perhaps drastically) increasing my 
compaction_mb_per_sec on the theory that I'm getting behind?

The fact that it takes me 3 days or more to run a test means it's hard to just 
play with values and see what works best, so I'm trying to understand the 
behavior in detail.

Thanks.

Brian


On Mon, Jan 7, 2013 at 4:13 PM, Michael Kjellman 
mkjell...@barracuda.com wrote:
http://www.datastax.com/dev/blog/when-to-use-leveled-compaction

If you perform at least twice as many reads as you do writes, leveled 
compaction may actually save you disk I/O, despite consuming more I/O for 
compaction. This is especially true if your reads are fairly random and don’t 
focus on a single, hot dataset.

From: Brian Tarbox tar...@cabotresearch.com
Reply-To: user@cassandra.apache.org
Date: Monday, January 7, 2013 12:56 PM
To: user@cassandra.apache.org
Subject: Re: help turning compaction..hours of run to get 0% compaction

I have not specified leveled compaction so I guess I'm defaulting to size 
tiered?  My data (in the column family causing the trouble) is insert-once, 
read-many, update-never.

Brian


On Mon, Jan 7, 2013 at 3:13 PM, Michael Kjellman 
mkjell...@barracuda.com wrote:
Size tiered or leveled compaction?

From: Brian Tarbox tar...@cabotresearch.com
Reply-To: user@cassandra.apache.org
Date: Monday, January 7, 2013 12:03 PM
To: user@cassandra.apache.org
Subject: help turning compaction..hours of run to get 0% compaction

I have a column family where I'm doing 500 inserts/sec for 12 hours or so at a 
time.  At some point my performance falls off a cliff due to time spent doing 
compactions.

I'm seeing row after row of logs saying that after 1 or 2 hours of compacting 
it reduced to 100% or 99% of the original.

I'm trying to understand what direction this data points me to in terms of 
configuration changes.

   a) increase my compaction_throughput_mb_per_sec 

Re: Astyanax

2013-01-08 Thread Radek Gruchalski
Hi,

We are using Astyanax and we have found that the GitHub wiki, together with 
Stack Overflow, is the most comprehensive set of documentation.

Do you have any specific questions?

Kind regards,
Radek Gruchalski

On 8 Jan 2013, at 15:46, Everton Lima peitin.inu...@gmail.com wrote:

 I was studying from there, but I would like to know if anyone knows of other sources.
 
 2013/1/8 Markus Klems markuskl...@gmail.com
 The wiki? https://github.com/Netflix/astyanax/wiki
 
 
 On Tue, Jan 8, 2013 at 2:44 PM, Everton Lima peitin.inu...@gmail.com wrote:
 Hi,
 Someone has or could indicate some good tutorial or book to learn Astyanax?
 
 Thanks
 
 -- 
 Everton Lima Aleixo
 Master's student in Computer Science at UFG
 Programmer at LUPA
 
 
 
 -- 
 Everton Lima Aleixo
 Bachelor's degree in Computer Science from UFG
 Master's student in Computer Science at UFG
 Programmer at LUPA
 


RE: about validity of recipe A node join using external data copy methods

2013-01-08 Thread DE VITO Dominique
 Now streaming is very efficient rarely fails and there is no need to do it 
this way anymore

I guess it's true in v1.2.
Is it also true in v1.1?

Thanks.

Dominique


From: Edward Capriolo [mailto:edlinuxg...@gmail.com]
Sent: Tuesday, January 8, 2013 4:01 PM
To: user@cassandra.apache.org
Subject: Re: about validity of recipe A node join using external data copy 
methods

Basically this recipe is from the old days when we had anti-compaction. Now 
streaming is very efficient rarely fails and there is no need to do it this way 
anymore. This recipe will be abolished from the second edition. It still likely 
works except when using counters.

Edward

On Tue, Jan 8, 2013 at 7:27 AM, DE VITO Dominique 
dominique.dev...@thalesgroup.com wrote:
Hi,

Edward Capriolo described in his Cassandra book a faster way [1] to start new 
nodes if the cluster size doubles, from N to 2 *N.

It's about splitting in 2 parts each token range taken in charge, after the 
split, with 2 nodes: the existing one, and a new one. And for starting a new 
node, one needs to:
- copy the data records from the corresponding node (without the system 
records)
- start the new node with auto_bootstrap: false

This raises 2 questions:

A) is this recipe still valid with v1.1 and v1.2 ?

B) do we still need to start the new node with auto_bootstrap: false ?
My guess is yes as the happening of the bootstrap phase is not recorded into 
the data records.

Thanks.

Dominique

[1] see recipe A node join using external data copy methods, page 165



Re: CQL3 Frame Length

2013-01-08 Thread Sylvain Lebresne
Mostly this is because having the frame length is convenient in practice.

Without pretending that there is only one way to write a server, it is common
to separate the phase "read a frame from the network" from the phase "decode
the frame", which is often simpler if you can read the frame upfront. Also, if
you don't have the frame size, you need to decode the whole frame
before being able to decode the next one, and so you can't parallelize the
decoding.

It is true, however, that on the write side this means you need to either
pre-compute the frame body size or serialize the body in memory first. That's
a trade-off for making it easier on the read side. But if you want my opinion,
on the write side too it's probably worth parallelizing the message encoding
(which requires you to encode it in memory first), since it's an asynchronous
protocol and there will likely be multiple writers working simultaneously.
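
(For what it's worth, here is a minimal JDK-only sketch of the buffer-first
approach on the write side. The header layout used, i.e. version, flags, stream
id and opcode bytes followed by a 4-byte body length, is my reading of the v1
native protocol spec; double-check it against the spec before relying on it.)

    import java.io.ByteArrayOutputStream;
    import java.io.DataOutputStream;
    import java.io.IOException;
    import java.io.OutputStream;

    public final class FrameWriter {

        // Encode a body element in memory first (here: a [long string]),
        // so its total size is known before the header is written.
        public static byte[] encodeLongString(String s) throws IOException {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            byte[] utf8 = s.getBytes("UTF-8");
            out.writeInt(utf8.length);
            out.write(utf8);
            return buf.toByteArray();
        }

        // Write one frame: fixed header, 4-byte length, then the pre-encoded body.
        public static void writeFrame(OutputStream socketOut, byte version, byte flags,
                                      byte streamId, byte opcode, byte[] body) throws IOException {
            DataOutputStream out = new DataOutputStream(socketOut);
            out.writeByte(version);
            out.writeByte(flags);
            out.writeByte(streamId);
            out.writeByte(opcode);
            out.writeInt(body.length); // known up front because the body is already serialized
            out.write(body);
            out.flush();
        }
    }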

--
Sylvain



On Tue, Jan 8, 2013 at 12:48 PM, Ben Hood 0x6e6...@gmail.com wrote:

 Hi,

 I've read the CQL wire specification and naively, I can't see how the
 frame length header is used.

 To me, it looks like on the read side, you know which type of structures
 to expect based on the opcode and each structure is TLV encoded.

 On the write side, you need to encode TLV structures as well, but you
 don't know the overall frame length until you've encoded it. So it would
 seem that you either need to pre-calculate the cumulative TLV size before
 you serialize the frame body, or you serialize the frame body to a buffer
 which you can then get the size of and then write to the socket, after
 having first written the count out.

 Is there potentially an implicit assumption that the reader will want to
 pre-buffer the entire frame before decoding it?

 Cheers,

 Ben



Re: CQL3 Frame Length

2013-01-08 Thread Ben Hood
Hey Sylvain,

Thanks for explaining the rationale. When you look at it from the perspective
of the use cases you mention, it makes sense to be able to supply the
reader with the frame size up front.

I've opted to go for serializing the frame into a buffer. Although this
could materialize an arbitrarily large amount of memory, ultimately the
driving application has control of the degree to which this can occur, so
in the grander scheme of things, you can still maintain streaming semantics.

Thanks for the heads up.

Cheers,

Ben


On Tue, Jan 8, 2013 at 4:08 PM, Sylvain Lebresne sylv...@datastax.com wrote:

 Mostly this is because having the frame length is convenient to have in
 practice.

 Without pretending that there is only one way to write a server, it is
 common
 to separate the phase read a frame from the network from the phase
 decode
 the frame which is often simpler if you can read the frame upfront. Also,
 if
 you don't have the frame size, it means you need to decode the whole frame
 before being able to decode the next one, and so you can't parallelize the
 decoding.

 It is true however that it means for the write side that you need to
 either be
 able to either pre-compute the frame body size or to serialize it in memory
 first. That's a trade of for making it easier on the read side. But if you
 want
 my opinion, on the write side too it's probably worth parallelizing the
 message
 encoding (which require you encode it in memory first) since it's an
 asynchronous protocol and so there will likely be multiple writer
 simultaneously.

 --
 Sylvain



 On Tue, Jan 8, 2013 at 12:48 PM, Ben Hood 0x6e6...@gmail.com wrote:

 Hi,

 I've read the CQL wire specification and naively, I can't see how the
 frame length header is used.

 To me, it looks like on the read side, you know which type of structures
 to expect based on the opcode and each structure is TLV encoded.

 On the write side, you need to encode TLV structures as well, but you
 don't know the overall frame length until you've encoded it. So it would
 seem that you either need to pre-calculate the cumulative TLV size before
 you serialize the frame body, or you serialize the frame body to a buffer
 which you can then get the size of and then write to the socket, after
 having first written the count out.

 Is there potentially an implicit assumption that the reader will want to
 pre-buffer the entire frame before decoding it?

 Cheers,

 Ben





Re: Astyanax

2013-01-08 Thread Brian O'Neill
Not sure where you are on the learning curve, but I've put a couple of
getting-started projects out on GitHub:
https://github.com/boneill42/astyanax-quickstart

And the latest from the webinar is here:
https://github.com/boneill42/naughty-or-nice
http://brianoneill.blogspot.com/2013/01/creating-your-frist-java-application-w.html

-brian

---
Brian O'Neill
Lead Architect, Software Development
Health Market Science
The Science of Better Results
2700 Horizon Drive • King of Prussia, PA • 19406
M: 215.588.6024 • @boneill42 http://www.twitter.com/boneill42 •
healthmarketscience.com




From:  Radek Gruchalski radek.gruchal...@portico.io
Reply-To:  user@cassandra.apache.org
Date:  Tuesday, January 8, 2013 10:17 AM
To:  user@cassandra.apache.org user@cassandra.apache.org
Cc:  user@cassandra.apache.org user@cassandra.apache.org
Subject:  Re: Astyanax

Hi,

We are using astyanax and we found out that github wiki with stackoverflow
is the most comprehensive set of documentation.

Do you have any specific questions?

Kind regards,
Radek Gruchalski

On 8 Jan 2013, at 15:46, Everton Lima peitin.inu...@gmail.com wrote:

 I was studying from there, but I would like to know if anyone knows of other sources.
 
 2013/1/8 Markus Klems markuskl...@gmail.com
 The wiki? https://github.com/Netflix/astyanax/wiki
 
 
 On Tue, Jan 8, 2013 at 2:44 PM, Everton Lima peitin.inu...@gmail.com wrote:
 Hi,
 Someone has or could indicate some good tutorial or book to learn Astyanax?
 
 Thanks
 
 -- 
 Everton Lima Aleixo
 Master's student in Computer Science at UFG
 Programmer at LUPA
 
 
 
 
 
 -- 
 Everton Lima Aleixo
 Bachelor's degree in Computer Science from UFG
 Master's student in Computer Science at UFG
 Programmer at LUPA
 




Re: about validity of recipe A node join using external data copy methods

2013-01-08 Thread Edward Capriolo
It has been true since about 0.8. In the old days ANTI-COMPACTION stunk and
many weird errors would cause node joins to have to be retried N times.

Now node moves/joins seem to work near 100% of the time (in 1.0.7); they are
also very fast and efficient.

If you want to move a node to new hardware you can do it with rsync, but I
would not use the technique for growing the cluster. It is error prone, and
ends up being more work.
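
For a hardware move, a rough sketch of the rsync approach (hostnames, paths and
the yaml details below are illustrative, and you would typically rsync once
while the old node is live and once more after stopping it to pick up the
delta):

    # copy the data files to the replacement box, skipping the system keyspace
    rsync -av --exclude 'system/' /var/lib/cassandra/data/ newhost:/var/lib/cassandra/data/

    # on the replacement node, in cassandra.yaml, keep the old node's token and
    # skip bootstrap since the data is already in place:
    #   initial_token: <same token as the node being replaced>
    #   auto_bootstrap: false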

On Tue, Jan 8, 2013 at 10:57 AM, DE VITO Dominique 
dominique.dev...@thalesgroup.com wrote:

Now streaming is very efficient rarely fails and there is no need to
 do it this way anymore



 I guess it's true in v1.2.

 Is it true also in v1.1 ?



 Thanks.



 Dominique





 From: Edward Capriolo [mailto:edlinuxg...@gmail.com]
 Sent: Tuesday, January 8, 2013 4:01 PM
 To: user@cassandra.apache.org
 Subject: Re: about validity of recipe A node join using external data
 copy methods



 Basically this recipe is from the old days when we had anti-compaction.
 Now streaming is very efficient rarely fails and there is no need to do it
 this way anymore. This recipe will be abolished from the second edition. It
 still likely works except when using counters.



 Edward



 On Tue, Jan 8, 2013 at 7:27 AM, DE VITO Dominique 
 dominique.dev...@thalesgroup.com wrote:

 Hi,



 Edward Capriolo described in his Cassandra book a faster way [1] to start
 new nodes if the cluster size doubles, from N to 2 *N.



 It's about splitting in 2 parts each token range taken in charge, after
 the split, with 2 nodes: the existing one, and a new one. And for starting
 a new node, one needs to:

 - copy the data records from the corresponding node (without the system
 records)

 - start the new node with auto_bootstrap: false



 This raises 2 questions:



 A) is this recipe still valid with v1.1 and v1.2 ?



 B) do we still need to start the new node with auto_bootstrap: false ?

 My guess is yes as the happening of the bootstrap phase is not recorded
 into the data records.



 Thanks.



 Dominique



 [1] see recipe A node join using external data copy methods, page 165





Script to load sstables from v1.0.x to v 1.1.x

2013-01-08 Thread Todd Nine
Hi all,
  I have recently been trying to restore backups from a v1.0.x cluster we
have into a 1.1.7 cluster.  This has not been as trivial as I expected, and
I've had a lot of help from the IRC channel in tackling this problem.  As a
way of saying thanks, I'd like to contribute the updated ruby script I was
originally given for accomplishing this task.  Here it is.

https://gist.github.com/1c161edab88a4e4aea06


It takes a keyspace directory as the input, then creates symlinks in the
output directory with the 1.1 structure pointing to the 1.0 sstables.  If
you've specified a host, it will then invoke the sstableloader for each of
the Keyspaces and CFs it discovers in the output directory.  I hope this is
helpful to someone else.  I'll keep the gist updated as I update the script.

Todd


Date Index?

2013-01-08 Thread Stephen.M.Thompson
Hi folks -

Question about secondary indexes.  How are people doing date indexes?  I have 
a date column in my tables in RDBMS that we use frequently, such as looking at all 
records recorded in the last month.  What is the best practice for being able 
to do such a query?  It seems like there could be an advantage to adding a 
couple of columns like this:

{timestamp=2013/01/08 12:32:01 -0500}
{month=201301}
{day=08}

And then I could put a secondary index on the month and day columns?  Would that 
be the best way to do something like this?  Is there any accepted best 
practice on this yet?

Thanks!
Steve


Re: help tuning compaction..hours of run to get 0% compaction....

2013-01-08 Thread B. Todd Burruss
I'll second Edward's comment.  Cassandra is designed to scale horizontally,
so if disk I/O is slowing you down then you must scale out.


On Tue, Jan 8, 2013 at 7:10 AM, Jim Cistaro jcist...@netflix.com wrote:

  One metric to watch is pending compactions (via nodetool
 compactionstats).  This count will give you some idea of whether you are
 falling behind with compactions.  The other measure is how long you are
 compacting after your inserts have stopped.

  If I understand correctly, since you never update the data, that would
 explain why the compaction logging shows 100% of orig.  With size-tiered,
 you are flushing small files, compacting when you get 4 of like size, etc.
  Since you have no updates, the compaction will not shrink the data.

  As Aaron said, use iostat –x (or dstat) to see if you are taxing the
 disks.  If so, then leveled compaction may be your option (for reasons
 already stated).  If not taxing the disks, then you might want to increase
 your compaction throughput, as you suggested.

  Depending on what version you are using, another thing to possibly tune
 is the size of sstables when flushed to disk.  In your case of insert only,
 the smaller the flush size, the more times that row is going to be
 rewritten during a compaction (hence increase I/O).

  jc

   From: Edward Capriolo edlinuxg...@gmail.com
 Reply-To: user@cassandra.apache.org user@cassandra.apache.org
 Date: Monday, January 7, 2013 2:33 PM

 To: user@cassandra.apache.org user@cassandra.apache.org
 Subject: Re: help turning compaction..hours of run to get 0%
 compaction

  There is some point where you simply need more machines.

 On Mon, Jan 7, 2013 at 5:02 PM, Michael Kjellman 
 mkjell...@barracuda.comwrote:

  Right, I guess I'm saying that you should try loading your data with
 leveled compaction and see how your compaction load is.

  Your work load sounds like leveled will fit much better than size
 tiered.

   From: Brian Tarbox tar...@cabotresearch.com
 Reply-To: user@cassandra.apache.org user@cassandra.apache.org
 Date: Monday, January 7, 2013 1:58 PM
 To: user@cassandra.apache.org user@cassandra.apache.org
 Subject: Re: help turning compaction..hours of run to get 0%
 compaction

  The problem I see is that it already takes me more than 24 hours just
 to load my data...during which time the logs say I'm spending tons of time
  doing compaction.  For example in the last 72 hours I'm consumed 20
  hours per machine on compaction.

  Can I conclude from that than I should be (perhaps drastically)
 increasing my compaction_mb_per_sec on the theory that I'm getting behind?

  The fact that it takes me 3 days or more to run a test means its hard
 to just play with values and see what works best, so I'm trying to
 understand the behavior in detail.

  Thanks.

  Brain


 On Mon, Jan 7, 2013 at 4:13 PM, Michael Kjellman mkjell...@barracuda.com
  wrote:

  http://www.datastax.com/dev/blog/when-to-use-leveled-compaction

  If you perform at least twice as many reads as you do writes, leveled
 compaction may actually save you disk I/O, despite consuming more I/O for
 compaction. This is especially true if your reads are fairly random and
 don’t focus on a single, hot dataset.

   From: Brian Tarbox tar...@cabotresearch.com
 Reply-To: user@cassandra.apache.org user@cassandra.apache.org
  Date: Monday, January 7, 2013 12:56 PM
 To: user@cassandra.apache.org user@cassandra.apache.org
 Subject: Re: help turning compaction..hours of run to get 0%
 compaction

  I have not specified leveled compaction so I guess I'm defaulting to
 size tiered?  My data (in the column family causing the trouble) insert
 once, ready many, update-never.

  Brian


 On Mon, Jan 7, 2013 at 3:13 PM, Michael Kjellman 
 mkjell...@barracuda.com wrote:

  Size tiered or leveled compaction?

   From: Brian Tarbox tar...@cabotresearch.com
 Reply-To: user@cassandra.apache.org user@cassandra.apache.org
 Date: Monday, January 7, 2013 12:03 PM
 To: user@cassandra.apache.org user@cassandra.apache.org
 Subject: help turning compaction..hours of run to get 0% compaction

  I have a column family where I'm doing 500 inserts/sec for 12 hours
 or so at time.  At some point my performance falls off a cliff due to time
 spent doing compactions.

  I'm seeing row after row of logs saying that after 1 or 2 hours of
 compactiing it reduced to 100% of 99% of the original.

  I'm trying to understand what direction this data points me to in
 term of configuration change.

 a) increase my compaction_throughput_mb_per_sec because I'm
 falling behind (am I falling behind?)

 b) enable multi-threaded compaction?

  Any help is appreciated.

  Brian


Re: Script to load sstables from v1.0.x to v 1.1.x

2013-01-08 Thread Rob Coli
On Tue, Jan 8, 2013 at 8:41 AM, Todd Nine todd.n...@gmail.com wrote:
   I have recently been trying to restore backups from a v1.0.x cluster we
 have into a 1.1.7 cluster.  This has not been as trivial as I expected, and
 I've had a lot of help from the IRC channel in tackling this problem.  As a
 way of saying thanks, I'd like to contribute the updated ruby script I was
 originally given for accomplishing this task.  Here it is.

While I laud your contribution, I am still not fully understanding why
this is not working automagically, as it should:

http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-1-flexible-data-file-placement

What about upgrading?

Do you need to manually move all pre-1.1 data files to the new
directory structure before upgrading to 1.1? No. Immediately after
Cassandra 1.1 starts, it checks to see whether it has old directory
structure and migrates all data files (including backups and
snapshots) to the new directory structure if needed. So, just upgrade
as you always do (don’t forget to read NEWS.txt first), and you will
get more control over data files for free.


Is it possible that, for example, the installation of the debian
package results in your 1.1.x node starting up before you intend it
to.. and then when you start it again with the 1.0 paths, it doesn't
try to change the paths?

 * To check if sstables needs migration, we look at the System
directory. If it contains a directory for the status cf, we'll attempt
a sstable migrating. 

This quote from Directories.java (thx driftx!) suggests that any
starting of a 1.1 node, which would result in a Status columnfamily
being created, would make sstablesNeedsMigration return false.

If this is your case due to the use of the debian package or similar
which auto-starts, your input is welcomed at :

https://issues.apache.org/jira/browse/CASSANDRA-2356

=Rob

-- 
=Robert Coli
AIMGTALK - rc...@palominodb.com
YAHOO - rcoli.palominob
SKYPE - rcoli_palominodb


Re: Script to load sstables from v1.0.x to v 1.1.x

2013-01-08 Thread Michael Kjellman
I thought this was to load between separate clusters, not to upgrade within the 
same cluster. No?

On Jan 8, 2013, at 11:29 AM, Rob Coli rc...@palominodb.com wrote:

 On Tue, Jan 8, 2013 at 8:41 AM, Todd Nine todd.n...@gmail.com wrote:
  I have recently been trying to restore backups from a v1.0.x cluster we
 have into a 1.1.7 cluster.  This has not been as trivial as I expected, and
 I've had a lot of help from the IRC channel in tackling this problem.  As a
 way of saying thanks, I'd like to contribute the updated ruby script I was
 originally given for accomplishing this task.  Here it is.
 
 While I laud your contribution, I am still not fully understanding why
 this is not working automagically, as it should :
 
 http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-1-flexible-data-file-placement
 
 What about upgrading?
 
 Do you need to manually move all pre-1.1 data files to the new
 directory structure before upgrading to 1.1? No. Immediately after
 Cassandra 1.1 starts, it checks to see whether it has old directory
 structure and migrates all data files (including backups and
 snapshots) to the new directory structure if needed. So, just upgrade
 as you always do (don’t forget to read NEWS.txt first), and you will
 get more control over data files for free.
 
 
 Is it possible that, for example, the installation of the debian
 package results in your 1.1.x node starting up before you intend it
 to.. and then when you start it again with the 1.0 paths, it doesn't
 try to change the paths?
 
  * To check if sstables needs migration, we look at the System
 directory. If it contains a directory for the status cf, we'll attempt
 a sstable migrating. 
 
 This quote from Directories.java (thx driftx!) suggests that any
 starting of a 1.1 node, which would result in a Status columnfamily
 being created, would make sstablesNeedsMigration return false.
 
 If this is your case due to the use of the debian package or similar
 which auto-starts, your input is welcomed at :
 
 https://issues.apache.org/jira/browse/CASSANDRA-2356
 
 =Rob
 
 -- 
 =Robert Coli
 AIMGTALK - rc...@palominodb.com
 YAHOO - rcoli.palominob
 SKYPE - rcoli_palominodb



Re: Script to load sstables from v1.0.x to v 1.1.x

2013-01-08 Thread Todd Nine
Our use case is for testing migrations in our data, as well as stress testing 
outside our production environment.  To do this, we load our backups into a 
fresh cluster, then perform our testing.  Our current production cluster is 
still on 1.0.x, so we can either fire up a 1.0.x cluster, then upgrade every 
node to accomplish this, or just use the script. We also have a different 
number of nodes in stage vs production, so we'd still need to run a repair if 
we did a straight sstable copy.   The script is a lot faster and easier for us 
than going through the upgrade process, then running repair to ensure the data 
is distributed correctly in the ring.



--  
Todd Nine


On Tuesday, January 8, 2013 at 12:32 PM, Michael Kjellman wrote:

 I thought this was to load between separate clusters not to upgrade within 
 the same cluster. No?
  
 On Jan 8, 2013, at 11:29 AM, Rob Coli rc...@palominodb.com 
 (mailto:rc...@palominodb.com) wrote:
  
  On Tue, Jan 8, 2013 at 8:41 AM, Todd Nine todd.n...@gmail.com 
  (mailto:todd.n...@gmail.com) wrote:
   I have recently been trying to restore backups from a v1.0.x cluster we
   have into a 1.1.7 cluster. This has not been as trivial as I expected, and
   I've had a lot of help from the IRC channel in tackling this problem. As a
   way of saying thanks, I'd like to contribute the updated ruby script I was
   originally given for accomplishing this task. Here it is.

   
   
  While I laud your contribution, I am still not fully understanding why
  this is not working automagically, as it should :
   
  http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-1-flexible-data-file-placement
  
  What about upgrading?
   
  Do you need to manually move all pre-1.1 data files to the new
  directory structure before upgrading to 1.1? No. Immediately after
  Cassandra 1.1 starts, it checks to see whether it has old directory
  structure and migrates all data files (including backups and
  snapshots) to the new directory structure if needed. So, just upgrade
  as you always do (don’t forget to read NEWS.txt first), and you will
  get more control over data files for free.
  
   
  Is it possible that, for example, the installation of the debian
  package results in your 1.1.x node starting up before you intend it
  to.. and then when you start it again with the 1.0 paths, it doesn't
  try to change the paths?
   
   * To check if sstables needs migration, we look at the System
  directory. If it contains a directory for the status cf, we'll attempt
  a sstable migrating. 
   
  This quote from Directories.java (thx driftx!) suggests that any
  starting of a 1.1 node, which would result in a Status columnfamily
  being created, would make sstablesNeedsMigration return false.
   
  If this is your case due to the use of the debian package or similar
  which auto-starts, your input is welcomed at :
   
  https://issues.apache.org/jira/browse/CASSANDRA-2356
   
  =Rob
   
  --  
  =Robert Coli
  AIMGTALK - rc...@palominodb.com (mailto:rc...@palominodb.com)
  YAHOO - rcoli.palominob
  SKYPE - rcoli_palominodb
   
  
  



Re: Script to load sstables from v1.0.x to v 1.1.x

2013-01-08 Thread Rob Coli
On Tue, Jan 8, 2013 at 11:56 AM, Todd Nine todd.n...@gmail.com wrote:
 Our current production
 cluster is still on 1.0.x, so we can either fire up a 1.0.x cluster, then
 upgrade every node to accomplish this, or just use the script.

No 1.0 cluster is required to restore 1.0 directory structure to a 1.1
cluster and have the tables be migrated by Cassandra. The 1.1 node
should look at the 1.0 directory structure you just restored and
migrate it automagically.

 We also have
 a different number of nodes in stage vs production, so we'd still need to
 run a repair if we did a straight sstable copy.

This is a compelling reason to bulk load. My commentary merely points
out that if you *aren't* changing cluster size/topology, Cassandra 1.1
should be migrating the sstables for you. :)

=Rob

-- 
=Robert Coli
AIMGTALK - rc...@palominodb.com
YAHOO - rcoli.palominob
SKYPE - rcoli_palominodb


Re: inconsistent hadoop/cassandra results

2013-01-08 Thread aaron morton
Assuming there were no further writes, running repair or using CL ALL should 
have fixed it. 
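
For reference (and from memory, so please check the ConfigHelper shipped with
your Cassandra version), the job can also pin the read consistency level
explicitly in the same place where setInputColumnFamily() is called:

    import org.apache.cassandra.hadoop.ConfigHelper;
    import org.apache.hadoop.conf.Configuration;

    public final class JobSetup {
        // Ask ColumnFamilyInputFormat to read at CL ALL for the test runs.
        static void pinReadConsistency(Configuration conf) {
            ConfigHelper.setReadConsistencyLevel(conf, "ALL");
        }
    }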

Can you describe the inconsistency between runs? 

Cheers

-
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 8/01/2013, at 2:16 AM, Brian Jeltema brian.jelt...@digitalenvoy.net wrote:

 I need some help understanding unexpected behavior I saw in some recent 
 experiments with Cassandra 1.1.5 and Hadoop 1.0.3:
 
 I've written a small map/reduce job that simply counts the number of columns 
 in each row of a static CF (call it Foo) 
 and generates a list of every row and column count. A relatively small 
 fraction of the rows have a large number
 of columns; worst case is approximately 36 million. So when I set up the job, 
 I used wide-row support:
 
 ConfigHelper.setInputColumnFamily(job.getConfiguration(), fooKS, Foo, 
 WIDE_ROWS); // where WIDE_ROWS == true
 
 When I ran this job using the default CL (1) I noticed that the results 
 varied from run to run, which I attributed to inconsistent
 replicas, since Foo was generated with CL == 1 and the RF == 3. 
 
 So I ran repair for that CF on every node. The cassandra log on every node 
 contains lines similar to:
 
   INFO [AntiEntropyStage:1] 2013-01-05 20:38:48,605 AntiEntropyService.java 
 (line 778) [repair #e4a1d7f0-579d-11e2--d64e0a75e6df] Foo is fully synced
 
 However, repeated runs were still inconsistent. Then I set CL to ALL, which I 
 presumed would always result in identical
 output, but repeated runs initially continued to be inconsistent. However, I 
 noticed that the results seemed to
 be converging, and after several runs (somewhere between 4 and 6) I finally 
 was producing identical results on every run.
 Then I set CL to QUORUM, and again generated inconsistent results.
 
 Does this behavior make sense?
 
 Brian



JIRA for native IAuthorizer and IAuthenticator ?

2013-01-08 Thread Frank Hsueh
I am very interested in the native IAuthorizer and IAuthenticator
implementation. However, I can't find a JIRA entry to follow in the 1.2.1
[1] or 1.2.2 [2] issues pages.

Does anybody know about it?

thanks !

[1]
https://issues.apache.org/jira/issues/?jql=fixVersion%20%3D%20%221.2.1%22%20AND%20project%20%3D%20CASSANDRA
[2]
https://issues.apache.org/jira/issues/?jql=fixVersion%20%3D%20%221.2.2%22%20AND%20project%20%3D%20CASSANDRA




On Wed, Jan 2, 2013 at 7:00 AM, Sylvain Lebresne sylv...@datastax.com wrote:

 The Cassandra team wishes you a very happy new year 2013, and is very
 pleased
 to announce the release of Apache Cassandra version 1.2.0. Cassandra 1.2.0
 is a
 new major release for the Apache Cassandra distributed database. This
 version
 adds numerous improvements[1,2] including (but not restricted to):
 - Virtual nodes[4]
 - The final version of CQL3 (featuring many improvements)
 - Atomic batches[5]
 - Request tracing[6]
 - Numerous performance improvements[7]
 - A new binary protocol for CQL3[8]
 - Improved configuration options[9]
 - And much more...

 Please make sure to carefully read the release notes[2] before upgrading.

 Both source and binary distributions of Cassandra 1.2.0 can be downloaded
 at:

  http://cassandra.apache.org/download/

 Or you can use the debian package available from the project APT
 repository[3]
 (you will need to use the 12x series).

 The Cassandra Team

 [1]: http://goo.gl/JmKp3 (CHANGES.txt)
 [2]: http://goo.gl/47bFz (NEWS.txt)
 [3]: http://wiki.apache.org/cassandra/DebianPackaging
 [4]: http://www.datastax.com/dev/blog/virtual-nodes-in-cassandra-1-2
 [5]: http://www.datastax.com/dev/blog/atomic-batches-in-cassandra-1-2
 [6]: http://www.datastax.com/dev/blog/tracing-in-cassandra-1-2
 [7]:
 http://www.datastax.com/dev/blog/performance-improvements-in-cassandra-1-2
 [8]: http://www.datastax.com/dev/blog/binary-protocol
 [9]:
 http://www.datastax.com/dev/blog/configuration-changes-in-cassandra-1-2




-- 
Frank Hsueh | frank.hs...@gmail.com


Re: How long does it take for a write to actually happen?

2013-01-08 Thread aaron morton
 EC2 m1.large node
You will have a much happier time if you use a m1.xlarge. 

 We set MAX_HEAP_SIZE=6G and HEAP_NEWSIZE=400M  
That's a pretty low new heap size.

 checks for new entries (in Entries CF, with indexed column status=1), 
 processes them, and sets the status to 2, when done
This is not the best data model. 
You may be better off having one CF for the unprocessed entries and one for the processed. 
Or, if you really need a queue, use something like Kafka.
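
A minimal CQL3-style sketch of the separate-CF idea (table and column names are
made up for illustration): keep only unprocessed work in its own table, spread
over a few shards, and delete entries once handled instead of flipping an
indexed status column:

    CREATE TABLE unprocessed_entries (
        shard    int,       -- small fixed number of partitions to spread load
        entry_id timeuuid,  -- time-ordered within a shard
        PRIMARY KEY (shard, entry_id)
    );

    -- producer side
    INSERT INTO unprocessed_entries (shard, entry_id) VALUES (3, now());

    -- consumer side: scan a shard, process, then remove the row
    SELECT entry_id FROM unprocessed_entries WHERE shard = 3 LIMIT 100;
    DELETE FROM unprocessed_entries WHERE shard = 3 AND entry_id = ?;  -- bind the id returned above

The same layout maps onto a Thrift CF (shard as the row key, timeuuid column
names) if you are staying on Astyanax/Thrift.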

 I will appreciate any advice on how to speed the writes up,
Writes are instantly available for reading. 
The first thing I would do is see where the delay is. Use nodetool cfstats 
to see the local write latency, or track the write latency from the client 
perspective. 

If you are looking for near real time / continuous computation style processing, 
take a look at http://storm-project.net/ and register for this talk from Brian 
O'Neill, one of my fellow DataStax MVPs: 
http://learn.datastax.com/WebinarCEPDistributedProcessingonCassandrawithStorm_Registration.html

Cheers
  
-
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 9/01/2013, at 5:48 AM, Vitaly Sourikov vitaly.souri...@gmail.com wrote:

 Hi,
 we are currently at an early stage of our project and have only one Cassandra 
 1.1.7 node hosted on EC2 m1.large node, where the data is written to the 
 ephemeral disk, and /var/lib/cassandra/data is just a soft link to it. Commit 
 logs and caches are still on /var/lib/cassandra/. We set MAX_HEAP_SIZE=6G 
 and HEAP_NEWSIZE=400M  
 
 On the client-side, we use Astyanax 1.56.18 to access the data.  We have a 
 processing server that writes to Cassandra, and an online server that reads 
 from it. The former wakes up every 0.5-5sec., checks for new entries (in 
 Entries CF, with indexed column status=1), processes them, and sets the 
 status to 2, when done. The online server checks once a second if an entry 
 that should be processed got the status 2 and sends it to its client side for 
 display. Processing takes 5-10 seconds and updates various columns in the 
 Entries CF few times on the way. One of these columns may contain ~12KB of 
 textual data, others are just short strings or numbers.
 
 Now, our problem is that it takes 20-40 seconds before the online server 
 actually sees the change - and it is way too long, this process is supposed 
 to be nearly real-time. Moreover, in cqlsh, if I perform a similar update, it 
 is immediately seen in the following select results, but the updates from the 
 back-end server also do not appear for 20-40 seconds. 
 
 I tried switching the row caches for that table and in yaml on and of. I 
 tried commitlog_sync: batch with commitlog_sync_batch_window_in_ms: 50. 
 Nothing helped. 
 
 I will appreciate any advice on how to speed the writes up, or at least an 
 explanation why this happens.
 
 thanks,
 Vitaly



Re: Date Index?

2013-01-08 Thread aaron morton
There has to be one equality clause in there, and that's the thing Cassandra 
uses to select off disk. The others are in-memory filters. 

So if you have one on the year+month you can have a simple select clause and it 
limits the amount of data that has to be read. 

If you have many 10s to 100s of millions of things in the same month you 
may want to do some performance testing. There can still be times when you want 
to support common read paths by using custom / hand-rolled indexes.
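
In CQL3 terms, a minimal sketch of the month-bucket idea (table, column and
index names are illustrative):

    CREATE TABLE records (
        id      uuid PRIMARY KEY,
        ts      timestamp,
        month   text,   -- e.g. '201301'; the equality clause the index can use
        day     text,   -- e.g. '08'
        payload text
    );

    CREATE INDEX records_month_idx ON records (month);

    -- the indexed equality selects off disk; finer-grained time filtering
    -- happens in memory or on the client
    SELECT * FROM records WHERE month = '201301';

A hand-rolled alternative is a table keyed by the month bucket with a timestamp
clustering column, which avoids the secondary index entirely for this read path.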

Cheers

-
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 9/01/2013, at 6:05 AM, stephen.m.thomp...@wellsfargo.com wrote:

 Hi folks –
  
 Question about secondary indexes.  How are people doing date indexes?I 
 have a date column in my tables in RDBMS that we use frequently, such as look 
 at all records recorded in the last month.  What is the best practice for 
 being able to do such a query?  It seems like there could be an advantage to 
 adding a couple of columns like this:
  
 {timestamp=2013/01/08 12:32:01 -0500}
 {month=201301}
 {day=08}
  
 And then I could do secondary index on the month and day columns?  Would that 
 be the best way to do something like this?  Is there any accepted “best 
 practice” on this yet?
  
 Thanks!
 Steve