date:20140519

Re: Cassandra token range support for Hadoop (ColumnFamilyInputFormat)

2014-05-19 Thread Aaron Morton

>> “between 1.2.6 and 2.0.6 the setInputRange(startToken, endToken) is not >> working” > Can you confirm or disprove? My reading of the code is that it will consider the part of a token range (from vnodes or initial tokens) that overlap with the provided token range. > I’ve already got one co

RE: Cassandra token range support for Hadoop (ColumnFamilyInputFormat)

2014-05-19 Thread Anton Brazhnyk

Hi Aaron, I've seen the code that you describe (working with splits and intersections), but that range is derived from keys and works only for ordered partitioners (in 1.2.15). I've already got one confirmation that in the C* version I use (1.2.15) setting limits with setInputRange(startToken, endTo
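
The "splits and intersections" idea discussed in this thread can be pictured with a small sketch. This is not the ColumnFamilyInputFormat code; `intersect_range` is a hypothetical helper, tokens are modelled as plain integers, and the sketch assumes non-wrapping ranges of the form (start, end].

```python
def intersect_range(split, requested):
    """Return the overlap of two (start, end] token ranges, or None.

    Assumption: neither range wraps around the ring, so start < end.
    """
    start = max(split[0], requested[0])
    end = min(split[1], requested[1])
    # An empty or inverted result means the ranges do not overlap.
    return (start, end) if start < end else None


# A split owned by one node, and a narrower range requested by the job.
overlap = intersect_range((0, 100), (50, 150))
disjoint = intersect_range((0, 10), (20, 30))
```

Under this model only the overlapping part of each split would be read, and splits with no overlap would be skipped.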

Re: Multi-dc cassandra keyspace

2014-05-19 Thread Nate McCall

We did something similar with a split cloud/physical hardware deployment. There was a weird requirement that app authentication data (fortunately in its own keyspace already) could not "live on the cloud" (shrug). This ended up being a simple configuration change in the schema just like your exam

Re: CQL 3 and wide rows

2014-05-19 Thread Maciej Miklas

Hi James, Clustering is based on rows. I think that you meant not clustering columns, but compound columns. Still, all columns belong to a single table and are stored within a single folder on one computer. And it looks to me (but I’m not sure) that the CQL 3 driver loads all column names into memory -

Re: CQL 3 and wide rows

2014-05-19 Thread Maciej Miklas

Hello Jack, You have given a perfect example of a wide row. Each reading from a sensor creates a new column within a row. It was also possible with Hector/CLI to have millions of columns within a single row. According to this page http://wiki.apache.org/cassandra/CassandraLimitations a single row can

Re: Ec2 Network I/O

2014-05-19 Thread Nate McCall

It's a good idea to increase phi_convict_threshold to at least 12 on EC2. Using placement groups and single-tenant systems will certainly help. Another optimization would be dedicating an Enhanced Network Interface ( http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-eni.html) specifically f

Re: Filtering on Collections

2014-05-19 Thread Patricia Gorla

I'm not sure about that — allowing collections as a primary key would be a much different implementation than setting up a secondary index. The primary key in CQL3 is actually the partition key which determines which token the row is assigned, so you would still need to have one partition key. Als

Re: Filtering on Collections

2014-05-19 Thread Eric Plowe

Ah, that is interesting, Patricia. Since they can be a secondary index, it's not too far off for them being able to be a primary key, no? On Mon, May 19, 2014 at 1:54 PM, Patricia Gorla wrote: > Raj, > > Secondary indexes across CQL3 collections were introduced into 2.1 beta1, > so will be avail

Re: Index with same Name but different keyspace

2014-05-19 Thread Bryan Talbot

On Mon, May 19, 2014 at 6:39 AM, mahesh rajamani wrote: > Sorry I just realized the table name in 2 schema are slightly different, > but still i am not sure why i should not use same index name across > different schema. Below is the instruction to reproduce. > > > Created 2 keyspace using cassand

Re: Best partition type for Cassandra with JBOD

2014-05-19 Thread Bryan Talbot

For XFS, using noatime and nodiratime isn't really useful either. http://xfs.org/index.php/XFS_FAQ#Q:_Is_using_noatime_or.2Fand_nodiratime_at_mount_time_giving_any_performance_benefits_in_xfs_.28or_not_using_them_performance_decrease.29.3F On Sat, May 17, 2014 at 7:52 AM, James Campbell < ja...

Re: Query first 1 columns for each partitioning keys in CQL?

2014-05-19 Thread Bryan Talbot

I think there are several issues in your schema and queries. First, the schema can't efficiently return the single newest post for every author. It can efficiently return the newest N posts for a particular author. On Fri, May 16, 2014 at 11:53 PM, 後藤泰陽 wrote: > > But I consider LIMIT to be a

Ec2 Network I/O

2014-05-19 Thread Phil Burress

Has anyone experienced network i/o issues with ec2? We are seeing a lot of these in our logs: HintedHandOffManager.java (line 477) Timed out replaying hints to /10.0.x.xxx; aborting (15 delivered) and these... Cannot handshake version with /10.0.x.xxx and these... java.io.IOException: Cannot p

Re: Filtering on Collections

2014-05-19 Thread Raj Janakarajan

Thanks Eric for the information. It looks like it will be supported in future versions. Raj On Mon, May 19, 2014 at 10:03 AM, Eric Plowe wrote: > Collection types cannot be used for filtering (as part of the where > statement). > They cannot be used as a primary key or part of a primary key.

Re: Filtering on Collections

2014-05-19 Thread Raj Janakarajan

Thank you Patricia. This is helpful. Raj On Mon, May 19, 2014 at 10:54 AM, Patricia Gorla wrote: > Raj, > > Secondary indexes across CQL3 collections were introduced into 2.1 beta1, > so will be available in future versions. See > https://issues.apache.org/jira/browse/CASSANDRA-4511 > > If yo

Re: Filtering on Collections

2014-05-19 Thread Patricia Gorla

Raj, Secondary indexes across CQL3 collections were introduced into 2.1 beta1, so will be available in future versions. See https://issues.apache.org/jira/browse/CASSANDRA-4511 If your main concern is performance then you should find another way to model the data: each collection is read entirely

Re: Filtering on Collections

2014-05-19 Thread Eric Plowe

Collection types cannot be used for filtering (as part of the where statement). They cannot be used as a primary key or part of a primary key. Secondary indexes are not supported as well. On Mon, May 19, 2014 at 12:50 PM, Raj Janakarajan wrote: > Hello all, > > I am using Cassandra version 2.0.7

Filtering on Collections

2014-05-19 Thread Raj Janakarajan

Hello all, I am using Cassandra version 2.0.7. I am wondering if "collections" is efficient for filtering. We are thinking of using "collections" to maintain a list for a customer row but we have to be able to filter on the collection values. Select UUID from customer where eligibility_state IN

Re: CQL 3 and wide rows

2014-05-19 Thread Jack Krupansky

You might want to review this blog post on supporting dynamic columns in CQL3, which points out that “the way to model dynamic cells in CQL is with a compound primary key.” See: http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows -- Jack Krupansky From: Maciej Miklas S

RE: CQL 3 and wide rows

2014-05-19 Thread James Campbell

Maciej, In CQL3 "wide rows" are expected to be created using clustering columns. So while the schema will have a relatively smaller number of named columns, the effect is a wide row. For example: CREATE TABLE keyspace.widerow ( row_key text, wide_row_column text, data_column text, PRIMA

CQL 3 and wide rows

2014-05-19 Thread Maciej Miklas

Hi *, I’ve checked the DataStax driver code for CQL 3, and it looks like the column names for a particular table are fully loaded into memory. Is this true? Cassandra should support wide rows, meaning tables with millions of columns. Knowing that, I would expect some kind of iterator for column names. Am I

Re: Cyclop - CQL web based editor has been released!

2014-05-19 Thread Maciej Miklas

thanks - I've fixed it. Regards, Maciej On Mon, May 12, 2014 at 2:50 AM, graham sanderson wrote: > Looks cool - giving it a try now (note FYI when building, > TestDataConverter.java line 46 assumes a specific time zone) > > On May 11, 2014, at 12:41 AM, Maciej Miklas wrote: > > Hi everybody,

Re: Index with same Name but different keyspace

2014-05-19 Thread mahesh rajamani

Sorry I just realized the table name in 2 schema are slightly different, but still i am not sure why i should not use same index name across different schema. Below is the instruction to reproduce. Created 2 keyspace using cassandra-cli [default@unknown] create keyspace keyspace1 with placement

Can SSTables overlap with SizeTieredCompactionStrategy?

2014-05-19 Thread Phil Luckhurst

We have a table defined using SizeTieredCompactionStrategy that is used to store time series data. Over a period of a few days we wrote approximately 200,000 unique time based entries for each of 700 identifiers, i.e. 700 wide rows with 200,000 entries in each. The table was empty when we started

Changing default_time_to_live

2014-05-19 Thread Keith Wright

Hi all, we are using C* 2.0.6 and have set the default_time_to_live parameter on a number of our LCS column families. I was wondering what would happen if we were to decrease this value via a table alter. Would subsequent compactions of data written before that alter honor the new value and re

Re: idempotent counters

2014-05-19 Thread Jabbar Azam

Thanks Aaron. I've mitigated this by removing the dependency on idempotent counters. But it's good to know the limitations of counters. Thanks Jabbar Azam On 19 May 2014 08:36, "Aaron Morton" wrote: > Does anybody else use another technique for achieving this idempotency > with counters? > > > T
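
The counter limitation discussed in this thread (an increment that times out and is then retried may be applied twice) can be illustrated with a toy simulation. This is only a sketch: `ToyCounter` and `increment_with_retry` are hypothetical names, and the "lost acknowledgement" flag stands in for a client-side timeout. No Cassandra API is used.

```python
class ToyCounter:
    """In-memory stand-in for a counter column."""

    def __init__(self):
        self.value = 0

    def increment(self, delta):
        self.value += delta


def increment_with_retry(counter, delta, ack_lost):
    """Apply an increment; retry once if the acknowledgement was lost.

    When ack_lost is True the first write DID reach the counter, but the
    client only saw a timeout and cannot tell, so it sends the write again.
    """
    counter.increment(delta)
    if ack_lost:
        counter.increment(delta)  # the retry is applied a second time


clean = ToyCounter()
increment_with_retry(clean, 1, ack_lost=False)

retried = ToyCounter()
increment_with_retry(retried, 1, ack_lost=True)
```

Because an increment is relative to the current value, the retried write cannot be told apart from a new one. Writing an absolute value would be safe to repeat.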

Re: Effect of number of keyspaces on write-throughput....

2014-05-19 Thread Krishna Chaitanya

Thank you for making these issues clear. Currently, in my data model, I have the current second (seconds-from-epoch) as the row key and the microsecond with the client number as the column key. Hence, all the packets received during a particular second on all the clients are stored in t

Re: Cassandra token range support for Hadoop (ColumnFamilyInputFormat)

2014-05-19 Thread Aaron Morton

> The limit is just ignored and the entire column family is scanned. Which limit ? > 1. Am I right that there is no way to get some data limited by token range > with ColumnFamilyInputFormat? From what I understand setting the input range is used when calculating the splits. The token ranges in

Re: Question about READS in a multi DC environment.

2014-05-19 Thread Aaron Morton

In this case I was not thinking about what was happening synchronous to client request, only that the request was hitting all nodes. You are right, when reading at LOCAL_ONE the coordinator will only be blocking for one response (the data response). Cheers Aaron - Aaron Morton

Re: Cassandra counter column family performance

2014-05-19 Thread Aaron Morton

> I get a lot of TExceptions What are the exceptions ? In general counters are slower than writes, but that does not lead them to fail like that. Check the logs for errors and/or messages from the GCInspector saying the garbage collection is going on. Cheers A - Aaron Morton

Re: Datacenter understanding question

2014-05-19 Thread Aaron Morton

Depends on how you have set up the replication. If you are using SimpleStrategy with RF 1, then there will be a single copy of each row in the cluster. If you are using the NetworkTopologyStrategy with RF 1 in each DC then there will be two copies of each row in the cluster, one in each DC.
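
The arithmetic in this reply can be written out as a short sketch. `copies_per_row` is a hypothetical helper, not part of any Cassandra tooling, and the data center names `DC1` and `DC2` are assumptions. The reply implies a cluster with two data centers.

```python
def copies_per_row(strategy, replication):
    """Return how many copies of each row exist across the whole cluster.

    replication is a single replication factor for SimpleStrategy, or a
    mapping of data center name to replication factor for
    NetworkTopologyStrategy.
    """
    if strategy == "SimpleStrategy":
        return int(replication)
    if strategy == "NetworkTopologyStrategy":
        # Each data center keeps its own replicas, so the totals add up.
        return sum(replication.values())
    raise ValueError("unknown strategy: " + strategy)


simple_copies = copies_per_row("SimpleStrategy", 1)
nts_copies = copies_per_row("NetworkTopologyStrategy", {"DC1": 1, "DC2": 1})
```

With these inputs the sketch gives one copy for SimpleStrategy and two for NetworkTopologyStrategy, which matches the reply.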

Re: Query returns incomplete result

2014-05-19 Thread Aaron Morton

Calling execute the second time runs the query a second time, and it looks like the query mutates instance state during the pagination. What happens if you only call execute() once ? Cheers Aaron - Aaron Morton New Zealand @aaronmorton Co-Founder & Principal Consultant Apache

Re: Schema errors when bootstrapping / restarting node

2014-05-19 Thread Aaron Morton

> I am able to fix this error by clearing out the schema_columns system table > on disk. After that, a node can boot successfully. > > Does anyone have a clue what's going on here? Something has become corrupted in the system tables as you say. A less aggressive way to reset the local schema is

Re: Effect of number of keyspaces on write-throughput....

2014-05-19 Thread Aaron Morton

> Each client is writing to a separate keyspace simultaneously. Hence, is there > a lot of switching of keyspaces? > > I would think not. If the client app is using one keyspace per connection there should be no reason for the driver to change keyspaces. > But, I observed that when using a

Re: idempotent counters

2014-05-19 Thread Aaron Morton

> Does anybody else use another technique for achieving this idempotency with > counters? The idempotency problem with counters has to do with what will happen when you get a timeout. If you replay the write there is a chance of the increment being applied twice. This is inherent in the current d

34 matches
