Issue with Cassandra client to node security using SSL

2015-10-27 Thread Vishwajeet Singh
Hi,

I am using Cassandra 2.1. My goal is to set up Cassandra client-to-node
security using SSL with my self-signed CA.

My self-signed CA gives me the following files:
1. ca.crt
2. ca.key
3. client.csr
4. client.crt
5. client.key
6. client.p12

I am creating a .jks file (client.jks) from client.p12 using the command below so
that I can use client.jks as the keystore and truststore in the cassandra.yaml
file. (We can't use the .p12 file as a keystore and truststore because it's not
in X.509 format.)

"keytool -importkeystore -srckeystore client.p12 -srcstoretype pkcs12
-destkeystore client.jks -deststoretype jks -deststorepass "
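
For context, the client_encryption_options section of cassandra.yaml that this
keystore feeds into has roughly the following shape (a sketch only; the paths,
passwords and the require_client_auth value are illustrative placeholders, not
values taken from this message):

client_encryption_options:
    enabled: true
    keystore: /path/to/client.jks
    keystore_password: <keystore password>
    require_client_auth: true
    truststore: /path/to/client.jks
    truststore_password: <truststore password>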

I need to connect to Cassandra using cqlsh.

I am creating a cqlshrc file in the .cassandra directory, and I am putting
client.crt as the certfile and usercert, and client.key as the userkey.
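
As a reference for that setup, a minimal cqlshrc of the kind described would look
roughly like this (a sketch; the [connection]/[ssl] layout follows the standard
cqlsh SSL configuration, and the file paths are placeholders):

[connection]
factory = cqlshlib.ssl.ssl_transport_factory

[ssl]
certfile = ~/.cassandra/client.crt
validate = true
userkey = ~/.cassandra/client.key
usercert = ~/.cassandra/client.crt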

When I run "cqlsh --ssl", I get the following error:

"Connection error: ('Unable to connect to any servers', {'127.0.0.1':
error(1, u"Tried connecting to [('127.0.0.1', 9042)]. Last error: [SSL:
CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:590)")})"

The second thing is that when I use ca.key and ca.crt to again create a client
certificate (node_cer_user1.pem), a client key (node_key_user1.pem) and a
keystore (node.keystore) using the commands below, it works.

1. keytool -importcert -alias  -file ca.crt -keystore
 -storepass 
2. keytool -genkeypair -alias  -keyalg RSA -keysize 2048 -keypass
 -keystore  -storepass  -validity 365
3. keytool -keystore  -alias  -certreq -file
 -storepass  -keypass 
4. openssl x509 -req -CA ca.crt -CAkey ca.key -in  -out
 -days 365 -CAcreateserial
5. keytool -keystore  -storepass  -alias 
-import -file ca.crt -noprompt
6. keytool -keystore  -storepass  -alias 
-import -file  -keypass 
7. keytool -importkeystore -srckeystore  -destkeystore
 -deststoretype PKCS12
8. openssl pkcs12 -in  -nokeys -out 
-passin pass:
9. openssl pkcs12 -in  -nodes -nocerts -out
 -passin pass:



The self-signed CA already gives me client.crt, client.key and a keystore, so why
is it not working? Why do I have to create the certificates again using ca.crt and
ca.key?


Thanks,
Vishwajeet


Re: C* Table Changed and Data Migration with new primary key

2015-10-27 Thread qihuang.zheng
I tried the spark-cassandra-connector. Our data in the source table has a TTL, and
saveToCassandra does not insert a TTL by default.
Fortunately we have a timestamp field that indicates the insert time, but
TTLOption.perRow has to be based on a column; it does not support querying that
column and then doing a calculation to set the TTL. So before saveToCassandra,
I map to a new case class with a ttl field (the last one) so that it can be used
directly in TTLOption.perRow:


case class Velocity(attribute: String, partner_code: String,
  app_name: String, attr_type: String, timestamp: Long, ttl: Int)

def localTest(tbl: String): Unit = {
  val velocitySrcTbl = sc.cassandraTable("ks", "velocity")
    .filter(row => (row.getLong("timestamp"): java.lang.Long) != null)
  val nowlong = System.currentTimeMillis()
  val now = (nowlong / 1000).toInt
  val velocityRDD = velocitySrcTbl.map(row => {
    val ts = (row.getLong("timestamp") / 1000).toInt
    Velocity(
      row.getString("attribute"),
      row.getString("partner_code"),
      row.getString("app_name"),
      row.getString("type"),
      row.getLong("timestamp"),
      90 * 86400 - (now - ts) // calculate the TTL here so it can be used directly by TTLOption.perRow()
    )
  })
  velocityRDD.saveToCassandra("forseti", tbl,
    SomeColumns("attribute", "partner_code", "app_name", "type" as "attr_type",
      "timestamp"),
    writeConf = WriteConf(ttl = TTLOption.perRow("ttl")))
}
But something goes wrong here:


WARN scheduler.TaskSetManager: Lost task 1.3 in stage 16.0 (TID 87, 
192.168.6.53): java.lang.NullPointerException: Unexpected null value of column 
5. Use get[Option[...]] to receive null values.


I already filter out rows where column 5 (the timestamp field) is null, so why does
this exception happen?
I also tried getLongOption, but the exception still occurs.
https://github.com/datastax/spark-cassandra-connector/blob/master/doc/5_saving.md
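
For reference, the Option-based read that the error message suggests would look
roughly like this when applied to the mapping above (a sketch built on the code
above; only the get[Option[...]] accessor is taken from the error text itself, and
dropping null-timestamp rows via flatMap is an illustrative choice):

  val velocityRDD = velocitySrcTbl.flatMap { row =>
    // get[Option[...]] turns a null cell into None instead of throwing
    row.get[Option[Long]]("timestamp").map { tsMillis =>
      val ts = (tsMillis / 1000).toInt
      Velocity(
        row.getString("attribute"),
        row.getString("partner_code"),
        row.getString("app_name"),
        row.getString("type"),
        tsMillis,
        90 * 86400 - (now - ts))
    }
  }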


At first I wanted to open an issue on the spark-cassandra-connector project, but
issues are not enabled there, so I am asking here.


Tks, qihuang.zheng




Original Message
From: DuyHai Doan doanduy...@gmail.com
To: user u...@cassandra.apache.org
Sent: Thursday, 22 October 2015, 19:50
Subject: Re: C* Table Changed and Data Migration with new primary key


Use Spark to distribute the job of copying data all over the cluster and help
accelerate the migration. The Spark connector does auto-paging in the background
with the Java driver.
On 22 Oct 2015 at 11:03, "qihuang.zheng" qihuang.zh...@fraudmetrix.cn wrote:

I tried using the Java driver with auto-paging queries (setFetchSize) instead of the
token function, since Cassandra already has this feature. Reference:
http://www.datastax.com/dev/blog/client-side-improvements-in-cassandra-2-0
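
For illustration, driver-side paging with setFetchSize looks roughly like this (a
sketch using the DataStax Java driver from Scala; the contact point, table name and
fetch size are placeholders):

import scala.collection.JavaConverters._
import com.datastax.driver.core.{Cluster, SimpleStatement}

val cluster = Cluster.builder().addContactPoint("127.0.0.1").build()
val session = cluster.connect()

// ask the driver to fetch 1000 rows per page; paging happens transparently while iterating
val stmt = new SimpleStatement("SELECT * FROM ks.test1").setFetchSize(1000)
for (row <- session.execute(stmt).asScala) {
  // read the row and write it into the split tables here
}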


But when I tried it in a test environment with only 1 million rows read and then
inserted into the 3 tables, it was too slow.
After running for 20 minutes, an exception like NoHostAvailableException occurred,
and of course the data did not finish syncing.
And our production environment has nearly 25 billion rows, which makes this
unacceptable for our case. Are there other ways?



Thanks & Regards,
qihuang.zheng


Original Message
From: Jeff Jirsa jeff.ji...@crowdstrike.com
To: user@cassandra.apache.org u...@cassandra.apache.org
Sent: Thursday, 22 October 2015, 13:52
Subject: Re: C* Table Changed and Data Migration with new primary key


Because the data format has changed, you’ll need to read it out and write it 
back in again.


This means using either a driver (java, python, c++, etc), or something like 
spark.


In either case, split up the token range so you can parallelize it for 
significant speed improvements.
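
As an illustration of that token-range split, each worker could scan one slice of
the ring with a query of roughly this shape (a sketch; the attribute partition key
comes from the table below, and the literal token bounds are placeholders):

select * from test1
where token(attribute) > -9223372036854775808
  and token(attribute) <= -4611686018427387904;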






From: "qihuang.zheng"
Reply-To: "user@cassandra.apache.org"
Date: Wednesday, October 21, 2015 at 6:18 PM
To: user
Subject: C* Table Changed and Data Migration with new primary key



Hi All:
 We have a table defined with only one partition key and some clustering keys.
CREATE TABLE test1 (
 attribute text,   
 partner text,  
 app text,
 "timestamp" bigint, 
 event text, 
 PRIMARY KEY ((attribute), partner, app, "timestamp")
)
And now we want to split the original test1 table into 3 tables like this:
test_global : PRIMARY KEY ((attribute), "timestamp")
test_partner: PRIMARY KEY ((attribute, partner), "timestamp")
test_app:     PRIMARY KEY ((attribute, partner, app), "timestamp")
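
For concreteness, the first of these might be defined like this (a sketch inferred
from the original column list; the DESC clustering order is an assumption based on
the query described below):

CREATE TABLE test_global (
 attribute text,
 partner text,
 app text,
 "timestamp" bigint,
 event text,
 PRIMARY KEY ((attribute), "timestamp")
) WITH CLUSTERING ORDER BY ("timestamp" DESC)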


The reason we split the original table is that querying global data by timestamp
desc like this:
select * from test1 where attribute=? order by timestamp desc
is not supported in Cassandra, since Cassandra's ORDER BY has to use the clustering
keys in their defined order. And a query like this:
select * from test1 where attribute=? order by partner desc, app desc, timestamp desc
can't return the right global data ordered by timestamp desc.
After splitting the table we can query global data correctly: select * from
test_global where attribute=? order by timestamp desc.


Now we have a problem of data migration.
As far as I know, sstableloader is the easiest way, but it can't deal with a
different table name. (Am I right?)
And the COPY command in cqlsh can't fit our situation because our data is too
large (10 nodes, one node has 400G of data).
I also tried the Java API, querying the original table and then inserting into the
3 different split tables, but it seems too slow.


Any solution for quick data migration?

unsubscribe

2015-10-27 Thread Brian Tarbox
unsubscribe

-- 
http://about.me/BrianTarbox


2.1 counters and CL=ONE

2015-10-27 Thread Robert Wille
I’m planning an upgrade from 2.0 to 2.1, and was reading about counters, and 
ended up with a question. I read that in 2.0, counters are implemented by 
storing deltas, and in 2.1, read-before-write is used to store totals instead. 
What does this mean for the following scenario?

Suppose we have a cluster with two nodes, RF=2 and CL=ONE. With node 2 down, a 
previously nonexistent counter is incremented twice. With node 1 down, the 
counter is incremented once. When both nodes are up, repair is run.
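
In CQL terms the scenario is roughly the following (a sketch; the table and column
names are placeholders, and CL=ONE would be set per request by the client):

CREATE TABLE counts (id text PRIMARY KEY, c counter);

-- with node 2 down:
UPDATE counts SET c = c + 1 WHERE id = 'x';
UPDATE counts SET c = c + 1 WHERE id = 'x';

-- with node 1 down:
UPDATE counts SET c = c + 1 WHERE id = 'x';

-- then both nodes come back up and repair is run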

Does this mean that 2.0 would repair the counter by replicating the missing 
deltas so that both nodes have all three increments, and 2.1 would repair the 
counter by replicating node 2’s total to node 1? With 2.0, the count would end 
up 3, and with 2.1 the count would end up 1?

I assume that the implementation isn’t that naive, but I need to make sure.

Thanks

Robert



Re: unsubscribe

2015-10-27 Thread Steve Robenalt
Hi Brian,

You can't unsubscribe using the mailing list email. There's a separate
email address for unsubscribing. You can find the unsubscribe email address
using the "Unsubscribe" link at the bottom of the page at
http://cassandra.apache.org

Steve


On Tue, Oct 27, 2015 at 4:44 AM, Brian Tarbox  wrote:

> unsubscribe
>
> --
> http://about.me/BrianTarbox
>



-- 
Steve Robenalt
Software Architect
sroben...@highwire.org 
(office/cell): 916-505-1785

HighWire Press, Inc.
425 Broadway St, Redwood City, CA 94063
www.highwire.org

Technology for Scholarly Communication


memtable flush size with LCS

2015-10-27 Thread Dan Kinder
Hi all,

The docs indicate that memtables are triggered to flush when data in the
commitlog is expiring or based on memtable_flush_period_in_ms.

But LCS has a specified sstable size; when using LCS are memtables flushed
when they hit the desired sstable size (default 160MB) or could L0 sstables
be much larger than that?

I'm wondering because I have an overwrite workload where larger memtables would
be helpful, and whether I need to increase my LCS sstable size in order to allow
for that.

-dan


Re: Downtime-Limit for a node in Network-Topology-Replication-Cluster?

2015-10-27 Thread Vasileios Vlachos
Rob,

Would you mind elaborating further on this? I am a little concerned that
my understanding (nodetool repair is *not* the only way one can achieve
consistency) is not correct. I understand that if people use CL < QUORUM,
nodetool repair is the only way to go, but I just cannot see how that can
be the only way irrespective of everything else.

Thanks in advance for your input!

On Sat, Oct 24, 2015 at 10:02 PM, Vasileios Vlachos <
vasileiosvlac...@gmail.com> wrote:

>
>> All other means of repair are optimizations which require a certain
>> amount of luck to happen to result in consistency.
>>
>
> Is that true regardless of the CL one uses? So, for example if writing
> QUORUM and reading QUORUM, wouldn't an increased read_repair_chance
> probability be sufficient? If not, is there a case where nodetool repair
> wouldn't be required (given consistency is a requirement)?
>
> Thanks
>


Re: Downtime-Limit for a node in Network-Topology-Replication-Cluster?

2015-10-27 Thread Robert Coli
On Sat, Oct 24, 2015 at 2:02 PM, Vasileios Vlachos <
vasileiosvlac...@gmail.com> wrote:

>
>> All other means of repair are optimizations which require a certain
>> amount of luck to happen to result in consistency.
>>
>
> Is that true regardless of the CL one uses? So, for example if writing
> QUORUM and reading QUORUM, wouldn't an increased read_repair_chance
> probability be sufficient? If not, is there a case where nodetool repair
> wouldn't be required (given consistency is a requirement)?
>

The only thing which guarantees consistency[1] is repair.

It's likely true that if the following conditions are met:

1) you read and write with QUORUM or ALL
2) you repair periodically
3) you have not dropped any mutations or had a crashed node since you last
repaired
4) you have not discarded any hints which happened to be stored for
whatever reason
5) you have not failed to store any hints due to hint backpressure

then you have a system with the property of consistency.

However no step other than 2) provides a *guarantee* of consistency. And it
only provides that guarantee for data that exists when the repair starts.

In a related concept, no step other than 2) *guarantees* that all data is
repaired within gc_grace_seconds, which is essential to consistency.

=Rob
[1] Durability and consistency are commingled in Cassandra, you are more
likely to fully achieve the former than the latter in the typical RF=3 case.


Re: memtable flush size with LCS

2015-10-27 Thread Dan Kinder
Thanks, I am using most of the suggested parameters to tune compactions. To
clarify, when you say "The sstable_size_in_mb can be thought of as a target
for the compaction process moving the file beyond L0." do you mean that
this property is ignored at memtable flush time, and so memtables are
already allowed to be much larger than sstable_size_in_mb?

On Tue, Oct 27, 2015 at 2:57 PM, Nate McCall  wrote:

> The sstable_size_in_mb can be thought of as a target for the compaction
> process moving the file beyond L0.
>
> Note: If there are more than 32 SSTables in L0, it will switch over to
> doing STCS for L0 (you can disable this behavior by passing
> -Dcassandra.disable_stcs_in_l0=true as a system property).
>
> With a lot of overwrites, the settings you want to tune will be
> gc_grace_seconds in combination with tombstone_threshold,
> tombstone_compaction_interval and maybe unchecked_tombstone_compaction
> (there are different opinions about this last one, YMMV). Making these more
> aggressive and increasing your sstable_size_in_mb will allow for
> potentially capturing more overwrites in a level which will lead to less
> fragmentation. However, making the size too large will keep compaction from
> triggering on further out levels which can then exacerbate problems
> particularly if you have long-lived TTLs.
>
> In general, it is very workload specific, but monitoring the histogram for
> the number of sstables used in a read (via
> org.apache.cassandra.metrics.ColumnFamily.$KEYSPACE.$TABLE.SSTablesPerReadHistogram.95percentile
> or shown manually in nodetool cfhistograms output) after any change will
> help you narrow in a good setting.
>
> See
> http://docs.datastax.com/en/cql/3.1/cql/cql_reference/compactSubprop.html?scroll=compactSubprop__compactionSubpropertiesLCS
> for more details.
>
> On Tue, Oct 27, 2015 at 3:42 PM, Dan Kinder  wrote:
> >
> > Hi all,
> >
> > The docs indicate that memtables are triggered to flush when data in the
> commitlog is expiring or based on memtable_flush_period_in_ms.
> >
> > But LCS has a specified sstable size; when using LCS are memtables
> flushed when they hit the desired sstable size (default 160MB) or could L0
> sstables be much larger than that?
> >
> > Wondering because I have an overwrite workload where larger memtables
> would be helpful, and if I need to increase my LCS sstable size in order to
> allow for that.
> >
> > -dan
>
>
>
>
> --
> -
> Nate McCall
> Austin, TX
> @zznate
>
> Co-Founder & Sr. Technical Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>



-- 
Dan Kinder
Senior Software Engineer
Turnitin – www.turnitin.com
dkin...@turnitin.com


Re: memtable flush size with LCS

2015-10-27 Thread Nate McCall
The sstable_size_in_mb can be thought of as a target for the compaction
process moving the file beyond L0.

Note: If there are more than 32 SSTables in L0, it will switch over to
doing STCS for L0 (you can disable this behavior by passing
-Dcassandra.disable_stcs_in_l0=true as a system property).

With a lot of overwrites, the settings you want to tune will be
gc_grace_seconds in combination with tombstone_threshold,
tombstone_compaction_interval and maybe unchecked_tombstone_compaction
(there are different opinions about this last one, YMMV). Making these more
aggressive and increasing your sstable_size_in_mb will allow for
potentially capturing more overwrites in a level which will lead to less
fragmentation. However, making the size too large will keep compaction from
triggering on further out levels which can then exacerbate problems
particularly if you have long-lived TTLs.
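
As a rough illustration of adjusting those knobs together (a sketch only; the
keyspace/table names and every value shown are placeholders to be tuned per
workload, not recommendations):

ALTER TABLE my_ks.my_table
WITH compaction = {
    'class': 'LeveledCompactionStrategy',
    'sstable_size_in_mb': '320',
    'tombstone_threshold': '0.2',
    'tombstone_compaction_interval': '86400',
    'unchecked_tombstone_compaction': 'true'
}
AND gc_grace_seconds = 86400;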

In general, it is very workload specific, but monitoring the histogram for
the number of sstables used in a read (via
org.apache.cassandra.metrics.ColumnFamily.$KEYSPACE.$TABLE.SSTablesPerReadHistogram.95percentile
or shown manually in nodetool cfhistograms output) after any change will
help you narrow in a good setting.

See
http://docs.datastax.com/en/cql/3.1/cql/cql_reference/compactSubprop.html?scroll=compactSubprop__compactionSubpropertiesLCS
for more details.

On Tue, Oct 27, 2015 at 3:42 PM, Dan Kinder  wrote:
>
> Hi all,
>
> The docs indicate that memtables are triggered to flush when data in the
commitlog is expiring or based on memtable_flush_period_in_ms.
>
> But LCS has a specified sstable size; when using LCS are memtables
flushed when they hit the desired sstable size (default 160MB) or could L0
sstables be much larger than that?
>
> Wondering because I have an overwrite workload where larger memtables
would be helpful, and if I need to increase my LCS sstable size in order to
allow for that.
>
> -dan




--
-
Nate McCall
Austin, TX
@zznate

Co-Founder & Sr. Technical Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com


Re: decommission too slow

2015-10-27 Thread Robert Coli
On Mon, Oct 26, 2015 at 7:58 PM, qihuang.zheng  wrote:

> Recently we want to delete some c* nodes for data migration.  C*
> version: 2.0.15
>
> we use *nodetool decommission* with *nohup: nohup nodetool decommission
> -h xxx *
>
> After executing for 3 days already, it seems this process hasn't finished yet!
>

Your streams are probably hanging indefinitely.

IOW, the problem is not "decommission is slow" but "decommission will never
complete."

https://issues.apache.org/jira/browse/CASSANDRA-8611

in 2.1.10 gives a non-infinite timeout for such streams... until then your
only option is to restart the node and restart the decommission.
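
A rough sketch of that recovery path (the netstats check is a suggestion for
confirming that the streams are stuck, not part of the original advice; the host
placeholder comes from the quoted command):

# check whether any streams are still making progress
nodetool -h xxx netstats

# if they are stuck, restart the Cassandra process on that node, then re-run:
nodetool -h xxx decommission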

FWIW, you don't really need nohup, because whether nodetool stays connected to
JMX has no effect on whether the decommission completes or not...

=Rob


Re: Cassandra 2.2.1 stuck at 100% on Windows

2015-10-27 Thread Alaa Zubaidi (PDF)
It was a process installed by IT that triggered this; when we disabled that
process, everything went back to normal.

Thanks.
Alaa

On Fri, Oct 16, 2015 at 11:32 AM, Alaa Zubaidi (PDF) 
wrote:

> Thanks guys,
> I will look into this more, and put an update here, if I find anything
>
> On Fri, Oct 16, 2015 at 10:40 AM, Josh McKenzie 
> wrote:
>
>> One option: use process explorer to find out the TID's of the java
>> process (instructions
>> ),
>> screen cap that, then also run jstack against the running cassandra process
>> out to a file a few times (instructions
>> 
>> ).
>>
>> We should be able to at least link up the TID to the hex thread # in the
>> jstack output to figure out who/what is spinning on there.
>>
>> On Fri, Oct 16, 2015 at 1:28 PM, Michael Shuler 
>> wrote:
>>
>>> On 10/16/2015 12:02 PM, Alaa Zubaidi (PDF) wrote:
>>>
 No OOM in any of the log files, and NO long GC at that time.
 I attached the last 2 minutes before it hangs until we restart cassandra
 after an hour and a half.

>>>
>>> Your logs show gossip issues with some seed nodes. `nodetool gossipinfo`
>>> on all nodes might be an interesting place to start.
>>>
>>> --
>>> Michael
>>>
>>
>>
>
>
> --
>
> Alaa Zubaidi
> PDF Solutions, Inc.
> 333 West San Carlos Street, Suite 1000
> San Jose, CA 95110  USA
> Tel: 408-283-5639
> fax: 408-938-6479
> email: alaa.zuba...@pdf.com
>
>


-- 

Alaa Zubaidi
PDF Solutions, Inc.
333 West San Carlos Street, Suite 1000
San Jose, CA 95110  USA
Tel: 408-283-5639
fax: 408-938-6479
email: alaa.zuba...@pdf.com



Cassandra 2.2.1 on Windows

2015-10-27 Thread Alaa Zubaidi (PDF)
Hi,
We have Cassandra 2.2.1 on Windows 2008R2-64-bit.

We are noticing that during compaction Cassandra consumes all the available
memory for the VM and brings the VM to a crawl. Compaction settings are the
defaults, and we are using size-tiered compaction.

When investigating this, we find through VMMAP that the memory is consumed
by mapped files.
These are almost all the sstables (index and data) for all keyspaces,
mapped into memory (system and user-defined keyspaces).

Is this normal, or should these files not be mapped into memory after
compaction?

Regards,
Alaa



Need company to support Cassandra on Windows

2015-10-27 Thread Troy Collinsworth
Searching for a well-established company that can provide consulting and
operations support for a private multi-DC production Cassandra cluster on
Windows OS. New project. The OS is a hosting mandate.

Troy Collinsworth
585-576-8761