Re: Securing Cassandra database

2014-04-04 Thread Mark Reddy
Ok so you want to enable auth on Cassandra itself. You will want to look into the authentication and authorisation functionality then. Here is a quick overview: http://www.datastax.com/dev/blog/a-quick-tour-of-internal-authentication-and-authorization-security-in-datastax-enterprise-and-apache-cas

Re: Securing Cassandra database

2014-04-04 Thread Check Peck
Just to add, nobody should be able to read or write to our Cassandra database through any API *or any CQL client*; only our team should be able to do that. On Fri, Apr 4, 2014 at 11:29 PM, Check Peck wrote: > Thanks Mark. But what about Cassandra database? I don't want anybody to > re

Re: Securing Cassandra database

2014-04-04 Thread Check Peck
Thanks Mark. But what about the Cassandra database? I don't want anybody to read or write to our Cassandra database through any API; only our team should be able to do that. We are using CQL-based tables, so the data doesn't get shown in OpsCenter. In our case, we would like to secure database

Re: Securing Cassandra database

2014-04-04 Thread Mark Reddy
Hi, If you want to just secure OpsCenter itself take a look here: http://www.datastax.com/documentation/opscenter/4.1/opsc/configure/opscAssigningAccessRoles_t.html If you want to enable internal authentication and still allow OpsCenter access, you can create an OpsCenter user and once you have

Securing Cassandra database

2014-04-04 Thread Check Peck
Hi All, We would like to secure our Cassandra database. We don't want anybody other than our team members to read from or write to it. We are using Cassandra 1.2.9 in production and we have a 36-node Cassandra cluster, 12 nodes in each colo, as we have three datacenters. But we would lik

Re: Using C* and CAS to coordinate workers

2014-04-04 Thread Jan Algermissen
Hi Duy Hai, On 04 Apr 2014, at 20:48, DuyHai Doan wrote: > @Jan > > Your use-case is different than what I thought. So basically you have only > one data source (the feed) and many consumers (the workers) > > Only one worker is allowed to consume the feed at a time. > > This can be model

Re: Read performance in map data type

2014-04-04 Thread Tyler Hobbs
http://www.datastax.com/documentation/developer/java-driver/2.0/java-driver/tracing_t.html On Fri, Apr 4, 2014 at 11:34 AM, Apoorva Gaurav wrote: > > > On Fri, Apr 4, 2014 at 9:37 PM, Tyler Hobbs wrote: > >> >> On Fri, Apr 4, 2014 at 12:41 AM, Apoorva Gaurav < >> apoorva.gau...@myntra.com> wrot

Re: Using C* and CAS to coordinate workers

2014-04-04 Thread DuyHai Doan
@Jan Your use-case is different than what I thought. So basically you have only one data source (the feed) and many consumers (the workers). Only one worker is allowed to consume the feed at a time. This can be modeled very easily using a distributed lock with C.A.S *CREATE TABLE feed_lock (*
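
A minimal in-memory sketch of the compare-and-set idea behind such a lock, for illustration only. The `feed_lock` table definition is cut off in this archive, so nothing here reproduces it; the class `InMemoryLockTable` and its method names are hypothetical. On a real cluster the same pattern would rely on Cassandra's conditional writes (lightweight transactions such as `INSERT ... IF NOT EXISTS`, available from Cassandra 2.0) rather than a local mutex.

```python
import threading


class InMemoryLockTable:
    """Hypothetical stand-in for a lock table with compare-and-set semantics."""

    def __init__(self):
        self._owners = {}               # lock name -> worker currently holding it
        self._guard = threading.Lock()  # makes each check-and-write step atomic

    def acquire(self, name, worker):
        """Insert-if-absent: succeeds only when nobody holds the lock."""
        with self._guard:
            if name in self._owners:
                return False
            self._owners[name] = worker
            return True

    def release(self, name, worker):
        """Delete-if-owner: only the current holder may release the lock."""
        with self._guard:
            if self._owners.get(name) != worker:
                return False
            del self._owners[name]
            return True


if __name__ == "__main__":
    locks = InMemoryLockTable()
    print(locks.acquire("feed", "worker-1"))  # first worker takes the lock
    print(locks.acquire("feed", "worker-2"))  # second worker is refused
    print(locks.release("feed", "worker-1"))  # holder releases
    print(locks.acquire("feed", "worker-2"))  # now the second worker succeeds
```

A production lock would also need an expiry (for example a TTL on the lock row) so that a crashed worker does not hold the feed forever; that part is left out of this sketch.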

Re: using hadoop + cassandra for CF mutations (delete)

2014-04-04 Thread William Oberman
Looking at the code, cassandra.input.split.size==Pig URL split_size, right? But, in cassandra 1.2.15 I'm wondering if there is a bug that would make the hadoop conf setting cassandra.input.split.size not be used unless you manually set the URI to splitSize=0 (because the abstract class defaults th

Re: using hadoop + cassandra for CF mutations (delete)

2014-04-04 Thread Paulo Ricardo Motta Gomes
You said you have tried the Pig URL split_size, but have you actually tried decreasing the value of cassandra.input.split.size hadoop property? The default is 65536, so you may want to decrease that to see if the number of mappers increases. But at some point, even if you lower that value it will st
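
A rough sketch of the arithmetic behind the split-size suggestion above. It assumes that the split size counts rows per split and that Hadoop schedules about one mapper per split; the function name `estimated_mappers` and the row count in the usage lines are hypothetical, and real split counts also depend on how rows are spread across token ranges.

```python
DEFAULT_SPLIT_SIZE = 65536  # default of cassandra.input.split.size cited above


def estimated_mappers(total_rows, split_size=DEFAULT_SPLIT_SIZE):
    """Estimate the number of input splits (and so mappers) for a row count.

    Assumption: one mapper per split, each split holding up to split_size rows.
    """
    if total_rows < 0 or split_size <= 0:
        raise ValueError("total_rows must be >= 0 and split_size must be > 0")
    # Integer ceiling division, avoiding floating-point rounding.
    return -(-total_rows // split_size)


if __name__ == "__main__":
    rows = 10_000_000  # hypothetical column family size
    print(estimated_mappers(rows))          # with the default split size
    print(estimated_mappers(rows, 16384))   # with a smaller split size
```

Under these assumptions, quartering the split size roughly quadruples the number of mappers for the same data.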

Re: Using C* and CAS to coordinate workers

2014-04-04 Thread Jan Algermissen
Hi DuyHai, On 04 Apr 2014, at 13:58, DuyHai Doan wrote: > @Jan > > This subject of distributed workers & queues has been discussed in the > mailing list many times. Sorry + thanks. Unfortunately, I do not want to use C* as a queue, but to coordinate workers that page through an (XML) data

Re: Read performance in map data type

2014-04-04 Thread Apoorva Gaurav
On Fri, Apr 4, 2014 at 9:37 PM, Tyler Hobbs wrote: > > On Fri, Apr 4, 2014 at 12:41 AM, Apoorva Gaurav > wrote: > >> If we store the same data as a json using text data type i.e (studentID >> int, subjectMarksJson text) we are getting a latency of ~10ms from the same >> client for even bigger. I

using hadoop + cassandra for CF mutations (delete)

2014-04-04 Thread William Oberman
Hi, I have some history with cassandra + hadoop: 1.) Single DC + integrated hadoop = Was "ok" until I needed steady performance (the single DC was used in a production environment) 2.) Two DCs + integrated hadoop on 1 of 2 DCs = Was "ok" until my data grew and in AWS compute is expensive compared

Re: Read performance in map data type

2014-04-04 Thread Tyler Hobbs
On Fri, Apr 4, 2014 at 12:41 AM, Apoorva Gaurav wrote: > If we store the same data as a json using text data type i.e (studentID > int, subjectMarksJson text) we are getting a latency of ~10ms from the same > client for even bigger. I understand that json is not the preferred storage > for cassand

Re: Using C* and CAS to coordinate workers

2014-04-04 Thread DuyHai Doan
@Jan This subject of distributed workers & queues has been discussed in the mailing list many times. Basically one implementation can be: 1) *p* data providers, *c* data consumers 2) create partitions (physical rows) of arbitrary number of columns (let's say 10 000, not too big though). Partitio

Re: Using C* and CAS to coordinate workers

2014-04-04 Thread prem yadav
Oh ok. I thought you did not have a cassandra cluster already. Sorry about that. On Fri, Apr 4, 2014 at 11:42 AM, Jan Algermissen wrote: > > On 04 Apr 2014, at 11:18, prem yadav wrote: > > Though cassandra can work but to me it looks like you could use a > persistent queue for example (rabbitM

Re: Using C* and CAS to coordinate workers

2014-04-04 Thread Jan Algermissen
On 04 Apr 2014, at 11:18, prem yadav wrote: > Though cassandra can work but to me it looks like you could use a persistent > queue for example (rabbitMQ) to implement this. All your workers can > subscribe to a queue. > In fact, why not just MySQL? Hey, I have got a C* cluster that can (poten

Re: Using C* and CAS to coordinate workers

2014-04-04 Thread prem yadav
Though Cassandra can work, to me it looks like you could use a persistent queue (for example, RabbitMQ) to implement this. All your workers can subscribe to a queue. In fact, why not just MySQL? On Thu, Apr 3, 2014 at 11:44 PM, Jan Algermissen wrote: > Hi, > > maybe someone knows a nice solut