Re: large range read in Cassandra

2015-02-02 Thread Dan Kinder
For the benefit of others, I ended up finding out that the CQL library I
was using (https://github.com/gocql/gocql) at the time leaves the page
size defaulted to no paging, so Cassandra was trying to pull all rows of
the partition into memory at once. Setting the page size to a reasonable
number seems to have done the trick.
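
As a concrete sketch of the fix (the host, keyspace, table, and page size
below are placeholders, and defaults may vary by gocql version):

package main

import (
    "fmt"
    "log"

    "github.com/gocql/gocql"
)

func main() {
    cluster := gocql.NewCluster("127.0.0.1")
    cluster.Keyspace = "mykeyspace"

    session, err := cluster.CreateSession()
    if err != nil {
        log.Fatal(err)
    }
    defer session.Close()

    // With a non-zero PageSize the driver fetches the partition in
    // pages instead of materializing every row at once.
    iter := session.Query(`SELECT col FROM mytable WHERE pk = ?`, "key").
        PageSize(5000).Iter()

    var col string
    for iter.Scan(&col) {
        fmt.Println(col)
    }
    if err := iter.Close(); err != nil {
        log.Fatal(err)
    }
}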

On Tue, Nov 25, 2014 at 2:54 PM, Dan Kinder dkin...@turnitin.com wrote:

 Thanks, very helpful Rob, I'll watch for that.

 On Tue, Nov 25, 2014 at 11:45 AM, Robert Coli rc...@eventbrite.com
 wrote:

 On Tue, Nov 25, 2014 at 10:45 AM, Dan Kinder dkin...@turnitin.com
 wrote:

 To be clear, I expect this range query to take a long time and perform
 relatively heavy I/O. What I expected Cassandra to do was use auto-paging (
 https://issues.apache.org/jira/browse/CASSANDRA-4415,
 http://stackoverflow.com/questions/17664438/iterating-through-cassandra-wide-row-with-cql3)
 so that we aren't literally pulling the entire thing in. Am I
 misunderstanding this use case? Could you clarify why exactly it would slow
 way down? It seems like with each read it should be doing a simple range
 read from one or two sstables.


 If you're paging through a single partition, that's likely to be fine.
 When you said "range reads ... over rows", my impression was that you were
 talking about attempting to page through millions of partitions.

 With that confusion cleared up, the likely explanation for lack of
 availability in your case is heap pressure/GC time. Look for GCs around
 that time. Also, if you're using authentication, make sure that your
 authentication keyspace has a replication factor greater than 1.
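
 For example (a sketch; in recent releases the authentication keyspace is
 system_auth, and the strategy and DC name below are placeholders for your
 topology):

 ALTER KEYSPACE system_auth WITH replication =
   {'class': 'NetworkTopologyStrategy', 'DC1': 3};

 followed by nodetool repair system_auth on each node to populate the new
 replicas.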

 =Rob





 --
 Dan Kinder
 Senior Software Engineer
 Turnitin – www.turnitin.com
 dkin...@turnitin.com




-- 
Dan Kinder
Senior Software Engineer
Turnitin – www.turnitin.com
dkin...@turnitin.com


Re: Upgrading from 1.2 to 2.1 questions

2015-02-02 Thread Kai Wang
I would not use 2.1.2 in production yet. It doesn't seem stable enough
based on the feedback I see here. The newest 2.0.12 may be a better option.
On Feb 2, 2015 8:43 AM, Sibbald, Charles charles.sibb...@bskyb.com
wrote:

 Hi Oleg,

 What is the minor version of 1.2? I am looking to do the same for 1.2.14
 in a very large cluster.

 Regards

 Charles


 On 02/02/2015 13:33, Oleg Dulin oleg.du...@gmail.com wrote:

 Dear Distinguished Colleagues:
 
 We'd like to upgrade our cluster from 1.2 to 2.0 and then to 2.1.
 
 We are using the Pelops Thrift client, which has long been abandoned by its
 authors. I've read that 2.x has changes to the Thrift protocol making
 it incompatible with 1.2 (and of course the link to that site now
 eludes me). If that is true, we need to first upgrade our Thrift client
 and then upgrade Cassandra.
 
 Let's start by confirming whether that is indeed the case -- if it is,
 I have my work cut out for me.
 
 Does anyone know for sure?
 
 Regards,
 Oleg
 
 




Re: Upgrading from 1.2 to 2.1 questions

2015-02-02 Thread Oleg Dulin

Sure but the question is really about going from 1.2 to 2.0 ...

On 2015-02-02 13:59:27 +, Kai Wang said:

I would not use 2.1.2 in production yet. It doesn't seem stable enough
based on the feedback I see here. The newest 2.0.12 may be a better
option.

On Feb 2, 2015 8:43 AM, Sibbald, Charles charles.sibb...@bskyb.com wrote:
Hi Oleg,

What is the minor version of 1.2? I am looking to do the same for 1.2.14
in a very large cluster.

Regards

Charles


On 02/02/2015 13:33, Oleg Dulin oleg.du...@gmail.com wrote:

Dear Distinguished Colleagues:

We'd like to upgrade our cluster from 1.2 to 2.0 and then to 2.1.

We are using the Pelops Thrift client, which has long been abandoned by its
authors. I've read that 2.x has changes to the Thrift protocol making
it incompatible with 1.2 (and of course the link to that site now
eludes me). If that is true, we need to first upgrade our Thrift client
and then upgrade Cassandra.

Let's start by confirming whether that is indeed the case -- if it is,
I have my work cut out for me.

Does anyone know for sure?

Regards,
Oleg









Re: Any problem mounting a keyspace directory in ram memory?

2015-02-02 Thread Gabriel Menegatti
Hi Colin,

Yes, we don't want to use the C* in-memory option; we just want to mount the
keyspace data directory on RAM instead of leaving it on the spinning disks.

My question is more about the technical side of mounting the keyspace data
folder on RAM than about whether Cassandra has some in-memory feature.

My intention is to understand if mounting a keyspace data directory on RAM
could cause any technical problems... From our point of view it shouldn't, as
we are just moving the directory to be stored on RAM instead of on the
spinning disks.

Thanks so much.
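
For concreteness, a tmpfs mount of a keyspace data directory might look like
this (the path and size are placeholders for your setup):

mount -t tmpfs -o size=32g tmpfs /var/lib/cassandra/data/mykeyspace

or, to persist it across reboots, an /etc/fstab entry such as:

tmpfs /var/lib/cassandra/data/mykeyspace tmpfs size=32g 0 0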
 

 On 02/02/2015, at 05:15, Colin co...@clark.ws wrote:
 
 Until the in-memory option stores data off heap, I would strongly recommend 
 staying away from this option.  This was a marketing driven hack in my 
 opinion.
 
 --
 Colin Clark 
 +1 612 859 6129
 Skype colin.p.clark
 
 On Feb 2, 2015, at 5:31 AM, Jan cne...@yahoo.com wrote:
 
 HI Gabriel; 
 
 I don't think Apache Cassandra supports in-memory keyspaces. 
 However Datastax Enterprise does support it. 
 
 Quoting from Datastax: 
 DataStax Enterprise includes the in-memory option for storing data to and 
 accessing data from memory exclusively. No disk I/O occurs. Consider using 
 the in-memory option for storing a modest amount of data, mostly composed of 
 overwrites, such as an application for mirroring stock exchange data. Only 
 the prices fluctuate greatly while the keys for the data remain relatively 
 constant. Generally, the table you design for use in-memory should have the 
 following characteristics:
 Store a small amount of data
 Experience a workload that is mostly overwrites
 Be heavily trafficked
 Using the in-memory option | DataStax Enterprise 4.0 Documentation
  
  
 
 hope this helps
 Jan
 
 C* Architect
 
 
 
 On Sunday, February 1, 2015 1:32 PM, Gabriel Menegatti 
 gabr...@s1mbi0se.com.br wrote:
 
 
 Hi guys,
 
 Please, has anyone here already mounted a specific keyspace directory on
 RAM using tmpfs?
 
 Do you see any problem doing so, except for the fact that the data can be
 lost?
 
 Thanks in advance.
 
 Regards,
 Gabriel.
 
 


Re: Upgrading from 1.2 to 2.1 questions

2015-02-02 Thread Oleg Dulin

Our minor version is 1.2.15 ...

I am not looking forward to the experience, and would like to gather as 
much information as possible.


This presents an opportunity to also review the data structures we use 
and possibly move them out of Cassandra.


Oleg

On 2015-02-02 13:42:52 +, Sibbald, Charles said:


Hi Oleg,

What is the minor version of 1.2? I am looking to do the same for 1.2.14
in a very large cluster.

Regards

Charles


On 02/02/2015 13:33, Oleg Dulin oleg.du...@gmail.com wrote:


Dear Distinguished Colleagues:

We'd like to upgrade our cluster from 1.2 to 2.0 and then to 2.1.

We are using the Pelops Thrift client, which has long been abandoned by its
authors. I've read that 2.x has changes to the Thrift protocol making
it incompatible with 1.2 (and of course the link to that site now
eludes me). If that is true, we need to first upgrade our Thrift client
and then upgrade Cassandra.

Let's start by confirming whether that is indeed the case -- if it is,
I have my work cut out for me.

Does anyone know for sure?

Regards,
Oleg










Upgrading from 1.2 to 2.1 questions

2015-02-02 Thread Oleg Dulin

Dear Distinguished Colleagues:

We'd like to upgrade our cluster from 1.2 to 2.0 and then to 2.1.

We are using the Pelops Thrift client, which has long been abandoned by its
authors. I've read that 2.x has changes to the Thrift protocol making
it incompatible with 1.2 (and of course the link to that site now
eludes me). If that is true, we need to first upgrade our Thrift client
and then upgrade Cassandra.

Let's start by confirming whether that is indeed the case -- if it is,
I have my work cut out for me.

Does anyone know for sure?

Regards,
Oleg




Re: Upgrading from 1.2 to 2.1 questions

2015-02-02 Thread Sibbald, Charles
Hi Oleg,

What is the minor version of 1.2? I am looking to do the same for 1.2.14
in a very large cluster.

Regards

Charles


On 02/02/2015 13:33, Oleg Dulin oleg.du...@gmail.com wrote:

Dear Distinguished Colleagues:

We'd like to upgrade our cluster from 1.2 to 2.0 and then to 2.1.

We are using the Pelops Thrift client, which has long been abandoned by its
authors. I've read that 2.x has changes to the Thrift protocol making
it incompatible with 1.2 (and of course the link to that site now
eludes me). If that is true, we need to first upgrade our Thrift client
and then upgrade Cassandra.

Let's start by confirming whether that is indeed the case -- if it is,
I have my work cut out for me.

Does anyone know for sure?

Regards,
Oleg





RE: FW: How to use cqlsh to access Cassandra DB if the client_encryption_options is enabled

2015-02-02 Thread Lu, Boying
Hi, Holmberg,

I tried your suggestion and ran the following command:

keytool -exportcert -keystore <path-to-my-keystore-file> -storepass <my-keystore-password> -storetype JKS -file <path-to-output-file>

and I got the following error:

keytool error: java.lang.Exception: Alias <mykey> does not exist

Do you know how to fix this issue?
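
For reference: when -alias is not given, keytool falls back to the default
alias mykey, which is what the error above is complaining about. A sketch of
how to list the aliases the keystore actually contains (paths and password
are placeholders):

keytool -list -keystore <path-to-my-keystore-file> -storepass <my-keystore-password>

and then pass the listed alias explicitly via -alias <alias>.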

Thanks

Boying

From: Adam Holmberg [mailto:adam.holmb...@datastax.com]
Sent: January 31, 2015 1:12
To: user@cassandra.apache.org
Subject: Re: FW: How to use cqlsh to access Cassandra DB if the 
client_encryption_options is enabled

Assuming the truststore you are referencing is the same one the server is 
using, it's probably in the wrong format. You will need to export the cert into 
a PEM format for use in the (Python) cqlsh client. If exporting from the java 
keystore format, use

keytool -exportcert <source keystore, pass, etc.> -rfc -file <output file>

If you have the crt file, you should be able to accomplish the same using 
openssl:

openssl x509 -in <in crt> -inform DER -out <output file> -outform PEM

Then, you should refer to that PEM file in your command. Alternatively, you can 
specify a path to the file (along with other options) in your cqlshrc file.
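
A minimal cqlshrc sketch along those lines (the paths are placeholders, and
the option names follow the 2.1 cqlshrc.sample linked below):

[connection]
factory = cqlshlib.ssl.ssl_transport_factory

[ssl]
certfile = /path/to/cert.pem
validate = true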

References:
How cqlsh picks up ssl options:
https://github.com/apache/cassandra/blob/cassandra-2.1/pylib/cqlshlib/sslhandling.py
Example cqlshrc file:
https://github.com/apache/cassandra/blob/cassandra-2.1/conf/cqlshrc.sample

Adam Holmberg

On Wed, Jan 28, 2015 at 1:08 AM, Lu, Boying 
boying...@emc.com wrote:
Hi, All,

Does anyone know the answer?

Thanks a lot

Boying


From: Lu, Boying
Sent: January 6, 2015 11:21
To: user@cassandra.apache.org
Subject: How to use cqlsh to access Cassandra DB if the 
client_encryption_options is enabled

Hi, All,

I turned on the client_encryption_options like this:
client_encryption_options:
    enabled: true
    keystore: <path-to-my-keystore-file>
    keystore_password: <my-keystore-password>
    truststore: <path-to-my-truststore-file>
    truststore_password: <my-truststore-password>
    ...

I can use the following cassandra-cli command to access the DB:
cassandra-cli -ts <path-to-my-truststore-file> -tspw <my-truststore-password> -tf
org.apache.cassandra.thrift.SSLTransportFactory

But when I tried to access the DB by cqlsh like this:
SSL_CERTFILE=<path-to-my-truststore> cqlsh -t cqlshlib.ssl.ssl_transport_factory

I got the following error:
Connection error: Could not connect to localhost:9160: [Errno 0] _ssl.c:332:
error::lib(0):func(0):reason(0)

I guess the reason may be that I didn't provide the truststore password. But
cqlsh doesn't provide such an option.

Does anyone know how to resolve this issue?

Thanks

Boying




Re: Any problem mounting a keyspace directory in ram memory?

2015-02-02 Thread Gabriel Menegatti
Hi Jan,

Thanks for your reply, but C* in-memory only supports 1 GB keyspaces at the
moment, which is not enough for us.

My question is more about the technical side of mounting the keyspace data
folder on RAM than about whether Cassandra has some in-memory feature.

My intention is to understand if mounting a keyspace data directory on RAM
could cause any technical problems... From our point of view it shouldn't, as
we are just moving the directory to be stored on RAM instead of on spinning
disks.

Thanks so much.



 On 02/02/2015, at 02:31, Jan cne...@yahoo.com wrote:
 
 HI Gabriel; 
 
 I don't think Apache Cassandra supports in-memory keyspaces. 
 However Datastax Enterprise does support it. 
 
 Quoting from Datastax: 
 DataStax Enterprise includes the in-memory option for storing data to and 
 accessing data from memory exclusively. No disk I/O occurs. Consider using 
 the in-memory option for storing a modest amount of data, mostly composed of 
 overwrites, such as an application for mirroring stock exchange data. Only 
 the prices fluctuate greatly while the keys for the data remain relatively 
 constant. Generally, the table you design for use in-memory should have the 
 following characteristics:
 Store a small amount of data
 Experience a workload that is mostly overwrites
 Be heavily trafficked
 Using the in-memory option | DataStax Enterprise 4.0 Documentation
 
 hope this helps
 Jan
 
 C* Architect
 
 
 
 On Sunday, February 1, 2015 1:32 PM, Gabriel Menegatti 
 gabr...@s1mbi0se.com.br wrote:
 
 
 Hi guys,
 
 Please, has anyone here already mounted a specific keyspace directory on
 RAM using tmpfs?
 
 Do you see any problem doing so, except for the fact that the data can be lost?
 
 Thanks in advance.
 
 Regards,
 Gabriel.
 
 


Re: Any problem mounting a keyspace directory in ram memory?

2015-02-02 Thread Hannu Kröger
At least I cannot think of any reason why it wouldn't work. As you said, you 
might lose the data but if you can live with that then why not.

Hannu

 On 02.02.2015, at 14:21 , Gabriel Menegatti gabr...@s1mbi0se.com.br wrote:
 
 Hi Colin,
 
 Yes, we don't want to use the C* in-memory option; we just want to mount the
 keyspace data directory on RAM instead of leaving it on the spinning disks.
 
 My question is more about the technical side of mounting the keyspace data
 folder on RAM than about whether Cassandra has some in-memory feature.
 
 My intention is to understand if mounting a keyspace data directory on RAM
 could cause any technical problems... From our point of view it shouldn't,
 as we are just moving the directory to be stored on RAM instead of the
 spinning disks.
 
 Thanks so much.
  
 
 On 02/02/2015, at 05:15, Colin co...@clark.ws wrote:
 
 Until the in-memory option stores data off heap, I would strongly recommend 
 staying away from this option.  This was a marketing driven hack in my 
 opinion.
 
 --
 Colin Clark 
 +1 612 859 6129
 Skype colin.p.clark
 
 On Feb 2, 2015, at 5:31 AM, Jan cne...@yahoo.com wrote:
 
 HI Gabriel; 
 
 I don't think Apache Cassandra supports in-memory keyspaces. 
 However Datastax Enterprise does support it. 
 
 Quoting from Datastax: 
 DataStax Enterprise includes the in-memory option for storing data to and 
 accessing data from memory exclusively. No disk I/O occurs. Consider using 
 the in-memory option for storing a modest amount of data, mostly composed 
 of overwrites, such as an application for mirroring stock exchange data. 
 Only the prices fluctuate greatly while the keys for the data remain 
 relatively constant. Generally, the table you design for use in-memory 
 should have the following characteristics:
 Store a small amount of data
 Experience a workload that is mostly overwrites
 Be heavily trafficked
 Using the in-memory option | DataStax Enterprise 4.0 Documentation
 http://www.datastax.com/documentation/datastax_enterprise/4.0/datastax_enterprise/inMemory.html
 
 hope this helps
 Jan
 
 C* Architect
 
 
 
 On Sunday, February 1, 2015 1:32 PM, Gabriel Menegatti 
 gabr...@s1mbi0se.com.br wrote:
 
 
 Hi guys,
 
 Please, has anyone here already mounted a specific keyspace directory on
 RAM using tmpfs?
 
 Do you see any problem doing so, except for the fact that the data can be
 lost?
 
 Thanks in advance.
 
 Regards,
 Gabriel.
 
 



Re: How to deal with too many sstables

2015-02-02 Thread 曹志富
Just run nodetool repair.

The nodes which have many sstables are the newest in my cluster. Before adding
these nodes to the cluster, the cluster had never compacted automatically,
because it is a write-only cluster.

thanks.

--
曹志富
Mobile: 18611121927
Email: caozf.zh...@gmail.com
Weibo: http://weibo.com/boliza/

2015-02-03 12:16 GMT+08:00 Flavien Charlon flavien.char...@gmail.com:

 Did you run incremental repair? Incremental repair is broken in 2.1 and
 tends to create way too many SSTables.

 On 2 February 2015 at 18:05, 曹志富 cao.zh...@gmail.com wrote:

 Hi, all:
 I have an 18-node C* cluster running Cassandra 2.1.2. Some nodes have about
 40,000+ sstables.
 
 My compaction strategy is STCS.
 
 Could someone give me a solution to deal with this situation?
 
 Thanks.
 --
 曹志富
 Mobile: 18611121927
 Email: caozf.zh...@gmail.com
 Weibo: http://weibo.com/boliza/





Re: How to deal with too many sstables

2015-02-02 Thread 曹志富
You are right. I have already changed cold_reads_to_omit to 0.0.

--
曹志富
Mobile: 18611121927
Email: caozf.zh...@gmail.com
Weibo: http://weibo.com/boliza/

2015-02-03 14:15 GMT+08:00 Roland Etzenhammer r.etzenham...@t-online.de:

  Hi,

 maybe you are running into an issue that I also had on my test cluster.
 Since there were almost no reads on it, Cassandra did not run any minor
 compactions at all. The solution for me (in this case) was:
 
 ALTER TABLE tablename WITH compaction = {'class':
 'SizeTieredCompactionStrategy', 'min_threshold': '4', 'max_threshold':
 '32', 'cold_reads_to_omit': 0.0};
 
 where cold_reads_to_omit is the trick.
 
 Anyway, as Eric and Marcus among others suggest, do not run 2.1.2 in
 production as it has many issues. I'm looking forward to testing 2.1.3
 when it arrives.

 Cheers,
 Roland


 On 03.02.2015 at 03:05, 曹志富 wrote:

 Hi, all:
 I have an 18-node C* cluster running Cassandra 2.1.2. Some nodes have about
 40,000+ sstables.
 
 My compaction strategy is STCS.
 
 Could someone give me a solution to deal with this situation?
 
 Thanks.
  --
 曹志富
 Mobile: 18611121927
 Email: caozf.zh...@gmail.com
 Weibo: http://weibo.com/boliza/





Re: How to deal with too many sstables

2015-02-02 Thread Marcus Eriksson
https://issues.apache.org/jira/browse/CASSANDRA-8635

On Tue, Feb 3, 2015 at 5:47 AM, 曹志富 cao.zh...@gmail.com wrote:

 Just run nodetool repair.

 The nodes which have many sstables are the newest in my cluster. Before
 adding these nodes to the cluster, the cluster had never compacted
 automatically, because it is a write-only cluster.

 thanks.

 --
 曹志富
 Mobile: 18611121927
 Email: caozf.zh...@gmail.com
 Weibo: http://weibo.com/boliza/

 2015-02-03 12:16 GMT+08:00 Flavien Charlon flavien.char...@gmail.com:

 Did you run incremental repair? Incremental repair is broken in 2.1 and
 tends to create way too many SSTables.

 On 2 February 2015 at 18:05, 曹志富 cao.zh...@gmail.com wrote:

 Hi, all:
 I have an 18-node C* cluster running Cassandra 2.1.2. Some nodes have about
 40,000+ sstables.
 
 My compaction strategy is STCS.
 
 Could someone give me a solution to deal with this situation?
 
 Thanks.
 --
 曹志富
 Mobile: 18611121927
 Email: caozf.zh...@gmail.com
 Weibo: http://weibo.com/boliza/






Re: How to deal with too many sstables

2015-02-02 Thread Flavien Charlon
Did you run incremental repair? Incremental repair is broken in 2.1 and
tends to create way too many SSTables.

On 2 February 2015 at 18:05, 曹志富 cao.zh...@gmail.com wrote:

 Hi, all:
 I have an 18-node C* cluster running Cassandra 2.1.2. Some nodes have about
 40,000+ sstables.
 
 My compaction strategy is STCS.
 
 Could someone give me a solution to deal with this situation?
 
 Thanks.
 --
 曹志富
 Mobile: 18611121927
 Email: caozf.zh...@gmail.com
 Weibo: http://weibo.com/boliza/



Re: Upgrading from 1.2 to 2.1 questions

2015-02-02 Thread Carlos Rolo
Using Pycassa (https://github.com/pycassa/pycassa) I had no trouble with the
clients writing/reading from 1.2.x to 2.0.x (I can't recall the minor
versions off the top of my head right now).

Regards,

Carlos Juzarte Rolo
Cassandra Consultant

Pythian - Love your data

rolo@pythian | Twitter: cjrolo | LinkedIn: linkedin.com/in/carlosjuzarterolo
Tel: 1649
www.pythian.com

On Mon, Feb 2, 2015 at 3:21 PM, Oleg Dulin oleg.du...@gmail.com wrote:

  Sure but the question is really about going from 1.2 to 2.0 ...


 On 2015-02-02 13:59:27 +, Kai Wang said:


 I would not use 2.1.2 in production yet. It doesn't seem stable enough
 based on the feedback I see here. The newest 2.0.12 may be a better option.

 On Feb 2, 2015 8:43 AM, Sibbald, Charles charles.sibb...@bskyb.com
 wrote:

 Hi Oleg,


 What is the minor version of 1.2? I am looking to do the same for 1.2.14

 in a very large cluster.


 Regards


 Charles



 On 02/02/2015 13:33, Oleg Dulin oleg.du...@gmail.com wrote:


 Dear Distinguished Colleagues:
 
 We'd like to upgrade our cluster from 1.2 to 2.0 and then to 2.1.
 
 We are using the Pelops Thrift client, which has long been abandoned by its
 authors. I've read that 2.x has changes to the Thrift protocol making
 it incompatible with 1.2 (and of course the link to that site now
 eludes me). If that is true, we need to first upgrade our Thrift client
 and then upgrade Cassandra.
 
 Let's start by confirming whether that is indeed the case -- if it is,
 I have my work cut out for me.
 
 Does anyone know for sure?
 
 Regards,
 Oleg














Re: Upgrading from 1.2 to 2.1 questions

2015-02-02 Thread Oleg Dulin
What about Java clients that were built for 1.2, and how do they work with 2.0?


On 2015-02-02 14:32:53 +, Carlos Rolo said:

Using Pycassa (https://github.com/pycassa/pycassa) I had no trouble with
the clients writing/reading from 1.2.x to 2.0.x (I can't recall the minor
versions off the top of my head right now).


Regards,

Carlos Juzarte Rolo
Cassandra Consultant
 
Pythian - Love your data

rolo@pythian | Twitter: cjrolo | Linkedin: linkedin.com/in/carlosjuzarterolo
Tel: 1649
www.pythian.com

On Mon, Feb 2, 2015 at 3:21 PM, Oleg Dulin oleg.du...@gmail.com wrote:
Sure but the question is really about going from 1.2 to 2.0 ...


On 2015-02-02 13:59:27 +, Kai Wang said:


I would not use 2.1.2 in production yet. It doesn't seem stable enough
based on the feedback I see here. The newest 2.0.12 may be a better
option.

On Feb 2, 2015 8:43 AM, Sibbald, Charles charles.sibb...@bskyb.com wrote:
Hi Oleg,


What is the minor version of 1.2? I am looking to do the same for 1.2.14
in a very large cluster.


Regards


Charles




On 02/02/2015 13:33, Oleg Dulin oleg.du...@gmail.com wrote:


Dear Distinguished Colleagues:

We'd like to upgrade our cluster from 1.2 to 2.0 and then to 2.1.

We are using the Pelops Thrift client, which has long been abandoned by its
authors. I've read that 2.x has changes to the Thrift protocol making
it incompatible with 1.2 (and of course the link to that site now
eludes me). If that is true, we need to first upgrade our Thrift client
and then upgrade Cassandra.

Let's start by confirming whether that is indeed the case -- if it is,
I have my work cut out for me.

Does anyone know for sure?

Regards,
Oleg





















--
Regards,
Oleg Dulin
http://www.olegdulin.com 

Help on modeling a table

2015-02-02 Thread Asit KAUSHIK
HI All

We are working on an application logging project, and this is one of the
search tables, as below:


CREATE TABLE logentries (
logentrytimestamputcguid timeuuid PRIMARY KEY,
context text,
date_to_hour bigint,
durationinseconds float,
eventtimestamputc timestamp,
ipaddress inet,
logentrytimestamputc timestamp,
loglevel int,
logmessagestring text,
logsequence int,
message text,
modulename text,
productname text,
searchitems map<text, text>,
servername text,
sessionname text,
stacktrace text,
threadname text,
timefinishutc timestamp,
timestartutc timestamp,
urihostname text,
uripathvalue text,
uriquerystring text,
useragentstring text,
username text
);

I have some queries on the design of this table:

1) Is a timeuuid a good candidate for the partition key, given that we would
be querying other fields with the stargate-core full-text project?

This table is actually used for searches such as username like '*john', and
using the present model the performance is very slow.

Please advise

Regards
Asit


Re: Help on modeling a table

2015-02-02 Thread Jack Krupansky
A leading wildcard is one of the slowest things you can do with Lucene, and
not a recommended practice, so either accept that it is slow or don't do it.

That said, there is a trick you can do with a reverse wildcard filter
(indexing each term reversed, so that a leading wildcard becomes a trailing
one), but that's an expert-level feature and not recommended for average
developers.

-- Jack Krupansky

On Mon, Feb 2, 2015 at 10:33 AM, Asit KAUSHIK asitkaushikno...@gmail.com
wrote:

 HI All

 We are working on an application logging project, and this is one of the
 search tables, as below:


 CREATE TABLE logentries (
 logentrytimestamputcguid timeuuid PRIMARY KEY,
 context text,
 date_to_hour bigint,
 durationinseconds float,
 eventtimestamputc timestamp,
 ipaddress inet,
 logentrytimestamputc timestamp,
 loglevel int,
 logmessagestring text,
 logsequence int,
 message text,
 modulename text,
 productname text,
 searchitems map<text, text>,
 servername text,
 sessionname text,
 stacktrace text,
 threadname text,
 timefinishutc timestamp,
 timestartutc timestamp,
 urihostname text,
 uripathvalue text,
 uriquerystring text,
 useragentstring text,
 username text
 );

 I have some queries on the design of this table:
 
 1) Is a timeuuid a good candidate for the partition key, given that we
 would be querying other fields with the stargate-core full-text project?
 
 This table is actually used for searches such as username like '*john', and
 using the present model the performance is very slow.

 Please advise

 Regards
 Asit








Re: Help on modeling a table

2015-02-02 Thread Asit KAUSHIK
I'll try your recommendations and would update on the same
Thanks so much
Cheers
Asit

On Mon, Feb 2, 2015, 9:56 PM Eric Stevens migh...@gmail.com wrote:

 Just a minor observation: those field names are extremely long.  You store
 a copy of every field name with every value, with only a couple of
 exceptions:
 http://www.datastax.com/documentation/cassandra/1.2/cassandra/architecture/architecturePlanningUserData_t.html
 
 Your partition key column name (logentrytimestamputcguid) is just kept in
 the schema, so the length of that name doesn't impact your storage costs.
 Also, for clustering keys (you have none) you pay to store the *value* (not
 the name) before all other non-clustering columns.
 
 Generally it's a good idea to prefer short column names over long ones.
 It increases application and diagnostic complexity, but we try to keep our
 column names under 4 bytes.  This storage overhead for column names is
 reduced if you use sstable compression, but at the cost of an increase in
 CPU time.

 On Mon, Feb 2, 2015 at 8:33 AM, Asit KAUSHIK asitkaushikno...@gmail.com
 wrote:

 HI All

 We are working on an application logging project, and this is one of the
 search tables, as below:


 CREATE TABLE logentries (
 logentrytimestamputcguid timeuuid PRIMARY KEY,
 context text,
 date_to_hour bigint,
 durationinseconds float,
 eventtimestamputc timestamp,
 ipaddress inet,
 logentrytimestamputc timestamp,
 loglevel int,
 logmessagestring text,
 logsequence int,
 message text,
 modulename text,
 productname text,
 searchitems map<text, text>,
 servername text,
 sessionname text,
 stacktrace text,
 threadname text,
 timefinishutc timestamp,
 timestartutc timestamp,
 urihostname text,
 uripathvalue text,
 uriquerystring text,
 useragentstring text,
 username text
 );

 I have some queries on the design of this table:
 
 1) Is a timeuuid a good candidate for the partition key, given that we would
 be querying other fields with the stargate-core full-text project?
 
 This table is actually used for searches such as username like '*john', and
 using the present model the performance is very slow.

 Please advise

 Regards
 Asit









Re: Help on modeling a table

2015-02-02 Thread Eric Stevens
Just a minor observation: those field names are extremely long.  You store
a copy of every field name with every value, with only a couple of
exceptions:
http://www.datastax.com/documentation/cassandra/1.2/cassandra/architecture/architecturePlanningUserData_t.html

Your partition key column name (logentrytimestamputcguid) is just kept in
the schema, so the length of that name doesn't impact your storage costs.
Also, for clustering keys (you have none) you pay to store the *value* (not
the name) before all other non-clustering columns.

Generally it's a good idea to prefer short column names over long ones.  It
increases application and diagnostic complexity, but we try to keep our
column names under 4 bytes.  This storage overhead for column names is
reduced if you use sstable compression, but at the cost of an increase in
CPU time.
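
For illustration, a sketch of the same table with abbreviated names (the
short names are arbitrary placeholders, and only a few columns are shown):

CREATE TABLE logentries (
    id timeuuid PRIMARY KEY,  -- was logentrytimestamputcguid
    ctx text,                 -- was context
    lvl int,                  -- was loglevel
    msg text,                 -- was message
    usr text                  -- was username
);

Since every non-key cell on disk carries its column name, trimming names like
logmessagestring down to a few bytes can noticeably shrink raw storage for
wide tables.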

On Mon, Feb 2, 2015 at 8:33 AM, Asit KAUSHIK asitkaushikno...@gmail.com
wrote:

 HI All

 We are working on an application logging project, and this is one of the
 search tables, as below:


 CREATE TABLE logentries (
 logentrytimestamputcguid timeuuid PRIMARY KEY,
 context text,
 date_to_hour bigint,
 durationinseconds float,
 eventtimestamputc timestamp,
 ipaddress inet,
 logentrytimestamputc timestamp,
 loglevel int,
 logmessagestring text,
 logsequence int,
 message text,
 modulename text,
 productname text,
 searchitems map<text, text>,
 servername text,
 sessionname text,
 stacktrace text,
 threadname text,
 timefinishutc timestamp,
 timestartutc timestamp,
 urihostname text,
 uripathvalue text,
 uriquerystring text,
 useragentstring text,
 username text
 );

 I have some queries on the design of this table:
 
 1) Is a timeuuid a good candidate for the partition key, given that we would
 be querying other fields with the stargate-core full-text project?
 
 This table is actually used for searches such as username like '*john', and
 using the present model the performance is very slow.

 Please advise

 Regards
 Asit








Re: Cassandra on Ceph

2015-02-02 Thread Eric Stevens
Colin, I'm not familiar with Ceph, but it sounds like it's a more
sophisticated version of a SAN.

Be aware that running Cassandra on absolutely anything other than local
disks is an anti-pattern.  It will have a profound negative impact on
performance, scalability, and reliability of your cluster.

On Sun, Feb 1, 2015 at 8:13 PM, Colin Taylor colin.tay...@gmail.com wrote:

 Oops -  Nonetheless in on my environments  -  Nonetheless in *one of* my
 environments

 On 2 February 2015 at 16:12, Colin Taylor colin.tay...@gmail.com wrote:

 Thanks all for you input.

 I'm aware of the overlap, I'm aware I need to turn Ceph replication off,
 I'm aware this isn't ideal. Nonetheless in on my environments instead of
 raw disk to install C* on, I'm likely to just have Ceph storage. This is a
 fully managed environment (excepting for C*) and that's their standard.

 cheers
 Colin

 On 2 February 2015 at 14:42, Daniel Compton 
 daniel.compton.li...@gmail.com wrote:

 As Jan has already mentioned, Ceph and Cassandra do almost all of the
 same things. Replicated self healing data storage on commodity hardware
 without a SPOF describes both of these systems. If you did manage to get
 it running it would be a nightmare to reason about what's happening at the
 disk and network level.

 You're going to get write amplification by your replication factor of
 both Cassandra, and Ceph unless you turn one of them down. This impacts
 disk I/O, disk space, CPU, and network bandwidth. If you turned down Ceph
 replication I think it would be possible for all of the replicated data for
 some chunk to be stored on one node and be at risk of loss. E.g. 1x Ceph,
 3x Cassandra replication could store all 3 Cassandra replicas on the same
 Ceph node. 3x Ceph, 1x Cassandra would be safer, but presumably slower.

 Lastly Cassandra is designed around running against local disks, you
 will lose a lot of the advantages of this running it on Ceph.

 Daniel.

 On Mon, 2 Feb 2015 at 1:11 am Baskar Duraikannu 
 baskar.duraika...@outlook.com wrote:

  What is the reason for running Cassandra on Ceph? I have both running
 in my environment but doing different things - Cassandra as transactional
 store and Ceph as block storage for storing files.
  --
 From: Jan cne...@yahoo.com
Sent: 2/1/2015 2:53 AM
 To: user@cassandra.apache.org
 Subject: Re: Cassandra on Ceph

   Colin;

  Ceph is a block-based storage architecture based on RADOS.
 It comes with its own replication & rebalancing along with a map of the
 storage layer.
 
  Some more details & similarities:
  a) Ceph stores a client's data as objects within storage pools
 (think of C* partitions).
  b) Using the CRUSH algorithm, Ceph calculates which placement group
 should contain the object (C* primary keys & vnode data distribution),
  c) and further calculates which Ceph OSD Daemon should store the
 placement group (C* node locality).
  d) The CRUSH algorithm enables the Ceph Storage Cluster to scale,
 rebalance, and recover dynamically (C* big table storage architecture).
 
 Summary:
 C* comes with everything that Ceph provides (with the exception of
 block storage).
  There is no value-add that Ceph brings to the table that C* does not
 already provide.
  I seriously doubt if C* could even work out of the box with yet
 another level of replication & rebalancing.

  Hope this helps
  Jan/

  C* Architect






   On Saturday, January 31, 2015 7:28 PM, Colin Taylor 
 colin.tay...@gmail.com wrote:


 I may be forced to run Cassandra on top of Ceph. Does anyone have
 experience / tips with this? Or alternatively, strong reasons why this
 won't work?

  cheers
 Colin







How to deal with too many sstables

2015-02-02 Thread 曹志富
Hi, all:
I have an 18-node C* cluster running Cassandra 2.1.2. Some nodes have about
40,000+ sstables.

My compaction strategy is STCS.

Could someone give me a solution to deal with this situation?

Thanks.
--
曹志富
Mobile: 18611121927
Email: caozf.zh...@gmail.com
Weibo: http://weibo.com/boliza/


Re: How to deal with too many sstables

2015-02-02 Thread Roland Etzenhammer

Hi,

maybe you are running into an issue that I also had on my test cluster.
Since there were almost no reads on it, Cassandra did not run any minor
compactions at all. The solution for me (in this case) was:

ALTER TABLE tablename WITH compaction = {'class':
'SizeTieredCompactionStrategy', 'min_threshold': '4', 'max_threshold':
'32', 'cold_reads_to_omit': 0.0};

where cold_reads_to_omit is the trick.

Anyway, as Eric and Marcus among others suggest, do not run 2.1.2 in
production as it has many issues. I'm looking forward to testing 2.1.3 when
it arrives.


Cheers,
Roland


On 03.02.2015 at 03:05, 曹志富 wrote:

Hi, all:
I have an 18-node C* cluster running Cassandra 2.1.2. Some nodes have about
40,000+ sstables.

My compaction strategy is STCS.

Could someone give me a solution to deal with this situation?

Thanks.
--
曹志富
Mobile: 18611121927
Email: caozf.zh...@gmail.com
Weibo: http://weibo.com/boliza/




Re: Question about use scenario with fulltext search

2015-02-02 Thread Asit KAUSHIK
Yes, the stargate-core project is using native Lucene libraries, but it
would be dependent on the stargate-core developer.

I find it very easy and am doing more analysis on it.

Regards
Asit

On Mon, Feb 2, 2015 at 12:50 PM, Colin colpcl...@gmail.com wrote:

 I use solr and cassandra but not together.  I write what I want indexed
 into solr (and only unstructured data), and related data into either
 cassandra or oracle.  I use the same key across all three db's.

 When I need full text search etc, I read the data from solr, grab the
 keys, and go get the data from the other db's.

 This avoids conflation of concerns, isolates failures, but is dependent
 upon multiple writes.  I use a message bus and services based approach.

 In my experience, at scale this approach works better and avoids vendor
 lock in.

 --
 Colin Clark
 +1 612 859 6129
 Skype colin.p.clark

 On Feb 2, 2015, at 7:25 AM, Asit KAUSHIK asitkaushikno...@gmail.com
 wrote:

 I tried elasticsearch, but pulling the data from Cassandra is a big
 pain. The river pulls all the data every time, with no incremental
 approach. It's a great product, but I had to change my writing approach,
 which I am just doing in Cassandra from the .NET client.
 Also you have to create a separate infrastructure for elasticsearch.
 Again, this is what I found with limited analysis of elasticsearch.

 Regards
 Asit


 On Mon, Feb 2, 2015 at 11:43 AM, Asit KAUSHIK asitkaushikno...@gmail.com
 wrote:

 Also there is a project called Stargate-Core which gives the ability to
 query with wildcard characters.
 The location is
 https://github.com/tuplejump/stargate-core/releases/tag/0.9.9
 
 It supports the 2.0.11 version of Cassandra.
 
 Also elasticsearch is another product, but pumping the data from Cassandra
 is a bad option in elasticsearch. You have to design your writes such that
 you write to both.
 But I am using Stargate-Core personally; it's very easy to implement
 and use.
 
 Hope this adds a cent to your evaluation of this topic.

 Regards
 Asit





 On Sun, Feb 1, 2015 at 10:45 PM, Mark Reddy mark.l.re...@gmail.com
 wrote:

 If you have a list of usernames stored in your cassandra database,
 how could you find all usernames starting with 'Jo'?


 Cassandra does not support full text search on its own; if you are
 looking into DataStax Enterprise Cassandra, there is an integration with
 Solr that gives you this functionality.

 Personally for projects I work on that use Cassandra and require full
 text search, the necessary data is indexed into Elasticsearch.

 Or ... if this is not possible,
 what are you using cassandra for?


 If you are looking for use cases here is a comprehensive set from
 companies spanning many industries:
 http://planetcassandra.org/apache-cassandra-use-cases/


 Regards,
 Mark

 On 1 February 2015 at 16:05, anton anto...@gmx.de wrote:

 Hi,

 I was just reading about cassandra and playing a little
 with it (using django www.djangoproject.com on the web server).

 One thing that I realized now is that fulltext search
 as in a normal sql statement (example):

   select name from users where name like 'Jo%';

 simply does not work, because this functionality does not exist.
 After reading and googling and reading ...
 I still do not understand how I could use a db without this
 functionality (if I do not want to restrict myself to numerical data).

 So my question is:

 If you have a list of usernames stored in your cassandra database,
 how could you find all usernames starting with 'Jo'?


 Or ... if this is not possible,
 what are you using cassandra for?

 Actually I still did not get the point of how I could use cassandra :-(

 Anton







Re: Question about use scenario with fulltext search

2015-02-02 Thread Andres de la Peña
You can also try Stratio Cassandra, which is based on Cassandra 2.1.2, the
latest version of Apache Cassandra:
https://github.com/Stratio/stratio-cassandra

It provides an open sourced implementation of the secondary indexes of
Cassandra, which allows you to perform full-text queries, distributed
relevance search, etc.

It was presented in the last Cassandra Summit Europe:
http://www.slideshare.net/dhiguero/advanced-search-and-topk-queries-in-cassandracassandrasummiteurope2014
https://www.youtube.com/watch?v=Hg5s-hXy_-M



2015-02-02 9:04 GMT+01:00 Asit KAUSHIK asitkaushikno...@gmail.com:

 Yes, the stargate-core project is using native Lucene libraries, but it
 would be dependent on the stargate-core developer.
 
 I find it very easy and am doing more analysis on it.

 Regards
 Asit

 On Mon, Feb 2, 2015 at 12:50 PM, Colin colpcl...@gmail.com wrote:

 I use solr and cassandra but not together.  I write what I want indexed
 into solr (and only unstructured data), and related data into either
 cassandra or oracle.  I use the same key across all three db's.

 When I need full text search etc, I read the data from solr, grab the
 keys, and go get the data from the other db's.

 This avoids conflation of concerns, isolates failures, but is dependent
 upon multiple writes.  I use a message bus and services based approach.

 In my experience, at scale this approach works better and avoids vendor
 lock in.

 --
 Colin Clark
 +1 612 859 6129
 Skype colin.p.clark

 On Feb 2, 2015, at 7:25 AM, Asit KAUSHIK asitkaushikno...@gmail.com
 wrote:

 I tried elasticsearch, but pulling the data from Cassandra is a big
 pain. The river pulls all the data every time, with no incremental
 approach. It's a great product, but I had to change my writing approach,
 which I am just doing in Cassandra from the .NET client.
 Also you have to create a separate infrastructure for elasticsearch.
 Again, this is what I found with limited analysis of elasticsearch.

 Regards
 Asit


 On Mon, Feb 2, 2015 at 11:43 AM, Asit KAUSHIK asitkaushikno...@gmail.com
  wrote:

 Also there is a project called Stargate-Core which gives the ability to
 query with wildcard characters.
 The location is
 https://github.com/tuplejump/stargate-core/releases/tag/0.9.9
 
 It supports the 2.0.11 version of Cassandra.
 
 Also elasticsearch is another product, but pumping the data from Cassandra
 is a bad option in elasticsearch. You have to design your writes such that
 you write to both.
 But I am using Stargate-Core personally; it's very easy to implement
 and use.
 
 Hope this adds a cent to your evaluation of this topic.

 Regards
 Asit





 On Sun, Feb 1, 2015 at 10:45 PM, Mark Reddy mark.l.re...@gmail.com
 wrote:

 If you have a list of usernames stored in your cassandra database,
 how could you find all usernames starting with 'Jo'?


 Cassandra does not support full text search on its own; if you are
 looking into DataStax Enterprise Cassandra, there is an integration with
 Solr that gives you this functionality.

 Personally for projects I work on that use Cassandra and require full
 text search, the necessary data is indexed into Elasticsearch.

 Or ... if this is not possible,
 what are you using cassandra for?


 If you are looking for use cases here is a comprehensive set from
 companies spanning many industries:
 http://planetcassandra.org/apache-cassandra-use-cases/


 Regards,
 Mark

 On 1 February 2015 at 16:05, anton anto...@gmx.de wrote:

 Hi,

 I was just reading about cassandra and playing a little
 with it (using django www.djangoproject.com on the web server).

 One thing that I realized now is that fulltext search
 as in a normal sql statement (example):

   select name from users where name like 'Jo%';

 simply does not work, because this functionality does not exist.
 After reading and googling and reading ...
 I still do not understand how I could use a db without this
 functionality (if I do not want to restrict myself to numerical data).

 So my question is:

 If you have a list of usernames stored in your cassandra database,
 how could you find all usernames starting with 'Jo'?


 Or ... if this is not possible,
 what are you using cassandra for?

 Actually I still did not get the point of how I could use cassandra :-(

 Anton








-- 

Andrés de la Peña


http://www.stratio.com/
Avenida de Europa, 26. Ática 5. 3ª Planta
28224 Pozuelo de Alarcón, Madrid
Tel: +34 91 352 59 42 // @stratiobd https://twitter.com/StratioBD


RE: FW: How to use cqlsh to access Cassandra DB if the client_encryption_options is enabled

2015-02-02 Thread Lu, Boying
Thanks a lot ;)

I’ll try your suggestions.

From: Adam Holmberg [mailto:adam.holmb...@datastax.com]
Sent: January 31, 2015 1:12
To: user@cassandra.apache.org
Subject: Re: FW: How to use cqlsh to access Cassandra DB if the 
client_encryption_options is enabled

Assuming the truststore you are referencing is the same one the server is 
using, it's probably in the wrong format. You will need to export the cert into 
a PEM format for use in the (Python) cqlsh client. If exporting from the java 
keystore format, use

keytool -exportcert <source keystore, pass, etc.> -rfc -file <output file>

If you have the crt file, you should be able to accomplish the same using 
openssl:

openssl x509 -in <in crt> -inform DER -out <output file> -outform PEM

Then, you should refer to that PEM file in your command. Alternatively, you can 
specify a path to the file (along with other options) in your cqlshrc file.

References:
How cqlsh picks up ssl options:
https://github.com/apache/cassandra/blob/cassandra-2.1/pylib/cqlshlib/sslhandling.py
Example cqlshrc file:
https://github.com/apache/cassandra/blob/cassandra-2.1/conf/cqlshrc.sample

Adam Holmberg

On Wed, Jan 28, 2015 at 1:08 AM, Lu, Boying 
boying...@emc.com wrote:
Hi, All,

Does anyone know the answer?

Thanks a lot

Boying


From: Lu, Boying
Sent: January 6, 2015 11:21
To: user@cassandra.apache.org
Subject: How to use cqlsh to access Cassandra DB if the 
client_encryption_options is enabled

Hi, All,

I turned on the client_encryption_options like this:
client_encryption_options:
    enabled: true
    keystore: <path-to-my-keystore-file>
    keystore_password: <my-keystore-password>
    truststore: <path-to-my-truststore-file>
    truststore_password: <my-truststore-password>
    ...

I can use the following cassandra-cli command to access the DB:
cassandra-cli -ts <path-to-my-truststore-file> -tspw <my-truststore-password> -tf
org.apache.cassandra.thrift.SSLTransportFactory

But when I tried to access the DB by cqlsh like this:
SSL_CERTFILE=<path-to-my-truststore> cqlsh -t cqlshlib.ssl.ssl_transport_factory

I got the following error:
Connection error: Could not connect to localhost:9160: [Errno 0] _ssl.c:332:
error::lib(0):func(0):reason(0)

I guess the reason may be that I didn't provide the truststore password. But
cqlsh doesn't provide such an option.

Does anyone know how to resolve this issue?

Thanks

Boying