Re: Indexes Fragmentation

2014-10-03 Thread Arthur Zubarev
Original Message From: Robert Coli To: user Sent: Mon, Sep 29, 2014 8:01 pm Subject: Re: Indexes Fragmentation On Sun, Sep 28, 2014 at 9:49 AM, Arthur Zubarev wrote: There are 200+ times more updates and 50x inserts than analytical loads. In Cassandra to just be able to query (in CQL

Re: Indexes Fragmentation

2014-09-28 Thread Arthur Zubarev
: Arthur Zubarev Sent: Sunday, September 28, 2014 11:19 AM To: user@cassandra.apache.org Subject: Re: Indexes Fragmentation Thank you Jack, But I am afraid it may be an overhead. Added complexity. /Arthur Original Message From: Jack Krupansky To: user Sent: Sun, Sep 28, 2014

Re: Indexes Fragmentation

2014-09-28 Thread Arthur Zubarev
e, then it would be easier to answer your question. Hannu -- Forwarded message ------ From: Arthur Zubarev Date: 2014-09-28 17:55 GMT+03:00 Subject: Indexes Fragmentation To: user@cassandra.apache.org Hi all: A client on a RDBMS faces quick index fragmentations, statist

Re: Indexes Fragmentation

2014-09-28 Thread Arthur Zubarev
of Cassandra data. -- Jack Krupansky From: Arthur Zubarev Sent: Sunday, September 28, 2014 10:55 AM To: user@cassandra.apache.org Subject: Indexes Fragmentation Hi all: A client on a RDBMS faces quick index fragmentations, statistics become inaccurate. Many within 4 hours (fast

Indexes Fragmentation

2014-09-28 Thread Arthur Zubarev
Hi all: A client on a RDBMS faces quick index fragmentations, statistics become inaccurate. Many within 4 hours (fast updates + writes, but mostly updates). I am looking into replacing the RDBMS with Cassandra. Will I face the same issue with indexes with Cassandra? Thank you! Regards, Art

Re: Why "select count("*) from .." hangs ?

2014-03-26 Thread Arthur Zubarev
I faced the same issue in my early days with C*: specifically, I got RPC timeouts when selecting data from CFs larger than 300 GB. The typical remedy is to implement paging, so instead of using the CLI, resort to a custom-built client app. Regards, Arthur From: shahab Sent: Wednesday, March 26
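
The paging remedy mentioned in this message can be sketched roughly as follows. This is a minimal illustration, not the poster's client: `TABLE` and `fetch_page` are hypothetical stand-ins for a real column family and for a bounded query (for example, one that selects rows past the last key seen with a LIMIT), so the sketch runs without a cluster.

```python
from typing import Iterator, List, Optional, Tuple

Row = Tuple[int, str]

# Hypothetical in-memory stand-in for a large column family, sorted by key.
TABLE: List[Row] = [(i, f"value-{i}") for i in range(1, 26)]


def fetch_page(last_key: Optional[int], page_size: int) -> List[Row]:
    """Return up to page_size rows whose key is greater than last_key.

    Assumption: in a real client this would be one bounded query,
    so no single request has to scan the whole column family.
    """
    rows = [r for r in TABLE if last_key is None or r[0] > last_key]
    return rows[:page_size]


def iter_rows(page_size: int = 10) -> Iterator[Row]:
    """Walk the whole table one page at a time."""
    last_key: Optional[int] = None
    while True:
        page = fetch_page(last_key, page_size)
        if not page:
            return  # no rows left past last_key
        yield from page
        last_key = page[-1][0]  # resume after the last row seen


def count_rows(page_size: int = 10) -> int:
    """Count rows client-side instead of issuing one unbounded count."""
    return sum(1 for _ in iter_rows(page_size))
```

Each request touches at most `page_size` rows, so the cost of a single call stays bounded regardless of the table size; the total count is accumulated on the client.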

Re: RPC timeout error while exporting data from CQL

2013-09-18 Thread Arthur Zubarev
Hello Pradeep, Let me try to help you; I faced a similar issue, too. The thing is, I was told that selecting all the records at once is not an ideal approach. No matter how strong the hardware is, an arbitrarily raised RPC timeout would not help: whatever value you give it, the ‘SELECT *’ query

Re: Deleting column data from Cassandra without setting its TTL

2013-08-16 Thread Arthur Zubarev
I understand the intent is to remove select columns, then the simple DELETE would do. The other way of doing things is probably having empty placeholders for the dates/data. /Arthur From: Suruchi Deodhar Sent: Friday, August 16, 2013 2:38 PM To: user@cassandra.apache.org ; Arthur Zubarev

Re: Deleting column data from Cassandra without setting its TTL

2013-08-16 Thread Arthur Zubarev
Not sure what client you use to interact with C*. In your scenario I assume it is CQL. So what can go wrong with TRUNCATE? Regards Arthur From: Suruchi Deodhar Sent: Friday, August 16, 2013 12:23 PM To: user@cassandra.apache.org Subject: Deleting column data from Cassandra without setting i

Re: Configuring ephemeral only column family

2013-08-16 Thread Arthur Zubarev
What about compactions, how often do you run them? -Original Message- From: Todd Nine Sent: Friday, August 16, 2013 1:43 PM To: user@cassandra.apache.org Subject: Configuring ephemeral only column family Hi guys, We're using expiring columns as a means for locking. All of this data sh

RE: Handling quorum writies fails

2013-08-11 Thread Arthur Zubarev
Hello Mikhail, The bullet 1 implies consistency, but at later time. And you don't lose the transaction. By the way, RF 3 to support financials is too low. #2, if the entire disk (that had no parity) fails you lost this write, but the 3rd node would have the write. Again, having a greater CF is
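
For context on the replication-factor remark, the usual quorum arithmetic can be sketched as below. This is only an illustration of the standard formula (quorum = floor(RF / 2) + 1); the function names are hypothetical, and it says nothing about what RF the poster would actually recommend.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must acknowledge a QUORUM operation."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """Replicas that can be down while QUORUM operations still succeed."""
    return replication_factor - quorum(replication_factor)


# Compare a replication factor of 3 with a larger one.
for rf in (3, 5):
    print(f"RF={rf}: quorum={quorum(rf)}, tolerated failures={tolerated_failures(rf)}")
```

With RF 3, a quorum is 2 replicas and one replica may be unavailable; with RF 5, a quorum is 3 and two may be unavailable.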

Re: How often to run `nodetool repair`

2013-08-01 Thread Arthur Zubarev
: Thursday, August 01, 2013 3:03 PM To: user@cassandra.apache.org ; Arthur Zubarev Subject: Re: How often to run `nodetool repair` Arthur, Yes, my use case for this Cassandra cluster is analytics. I am building a google dapper (application tracing) like system. I collect application traces and write

Re: Adding my first node to another one...

2013-08-01 Thread Arthur Zubarev
Hi Morgan, The scaling out depends on several factors. The most intricate is perhaps calculating the tokens. Also the Cassandra version is important. At this point in time I suggest you read section Adding Capacity to an Existing Cluster at http://www.datastax.com/docs/1.0/operations/cluste

Re: How often to run `nodetool repair`

2013-08-01 Thread Arthur Zubarev
Hi Carl, The ‘repair’ is for data reads. Compaction will take care of the expired data. The fact a repair runs long makes me think the nodes receive unbalanced amounts of writes rather. Regards, Arthur From: Carl Lerche Sent: Thursday, August 01, 2013 12:35 PM To: user@cassandra.apache.org

Re: Creating an "Index" column...

2013-06-26 Thread Arthur Zubarev
better. Just some thoughts on the matter. -Tony -------- *From:* Arthur Zubarev *To:* Tony Anecito ; Robert Coli ; Users-Cassandra *Sent:* Wednesday, June 26, 2013 3:08 PM *Subject:* Re: Creating an "Index" column... Ton

Re: Creating an "Index" column...

2013-06-26 Thread Arthur Zubarev
Tony hi, Yes, in some scenarios (e.g. a DW), e.g. absence of proper PKs or indexes (just too hard to envision, you need to think of future queries 1st) getting thru large volumes of data makes NoSQL IMHO hard to fit in. But you have other choices: 1) pagination or 2) slice queries. Both of th

Re: copy data between clusters

2013-06-25 Thread Arthur Zubarev
This is the best reference I have seen so far http://www.datastax.com/dev/blog/bulk-loading But I must tell it is not updated to match the most recent changes in C*. I suggest you read thru comments, too. From: S C Sent: Tuesday, June 25, 2013 10:23 PM To: user@cassandra.apache.org Subject: RE

Re: copy data between clusters

2013-06-25 Thread Arthur Zubarev
Hello SC, whilst most of the sstableloader errors stem from incorrect setups I suspect this time you merely have a connectivity issue e.g. a firewall blocking traffic. From: S C Sent: Tuesday, June 25, 2013 5:28 PM To: user@cassandra.apache.org Subject: RE: copy data between clusters Bob and

Re: copy data between clusters

2013-06-24 Thread Arthur Zubarev
On 06/24/2013 11:35 PM, S C wrote: I have a scenario here. I have a cluster A and cluster B running on cassandra 1.1. I need to copy data from Cluster A to Cluster B. Cluster A has few keyspaces that I need to copy over to Cluster B. What are my options? Thanks, SC I am thinking of SSTABLELOA

Re: How to do a CAS UPDATE on single column CF?

2013-06-24 Thread Arthur Zubarev
On 06/24/2013 11:23 PM, Blair Zajac wrote: CAS UPDATE Since when does C* have IF NOT EXISTS in the DML part of CQL? -- Regards, Arthur

Re: Counter value becomes incorrect after several dozen reads & writes

2013-06-24 Thread Arthur Zubarev
Hi Josh, are you looking at the read counter produced by cfstats? If so it is not for a CF, but the entire KS and not tied to a specific operation, but rather per the entire lifetime of JVM. Just in case, some supporting info: http://stackoverflow.com/questions/9431590/cassandra-cfstats-and-me

Re: Dropped mutation messages

2013-06-18 Thread Arthur Zubarev
Cem hi, as per http://wiki.apache.org/cassandra/FAQ#dropped_messages Internode messages which are received by a node, but do not get to be processed within rpc_timeout, are dropped rather than processed, as the coordinator node will no longer be waiting for a response. If the Coordinator no

Re: Unable to count records of a column family with 210 columns x 500K rows

2013-06-11 Thread Arthur Zubarev
y", line 293, in flush self.__trans.write(buf) File "/usr/share/cassandra/lib/thrift-python-internal-only-0.7.0.zip/thrift/transport/TSocket.py", line 117, in write plus = self.handle.send(buff) error: [Errno 32] Broken pipe I then lose C* and need to restart its service to

Unable to count records of a column family with 210 columns x 500K rows

2013-06-11 Thread Arthur Zubarev
Hello, I am unable to count records using cqlsh (e.g. select count(*) from MyCF limit 5;) I have a column family with 210 columns x 500K rows. The row length is 40K chars. The same issue is with any other large CF.

Re: Cassandra (1.2.5) + Pig (0.11.1) Errors with large column families

2013-06-08 Thread Arthur Zubarev
On 06/07/2013 06:02 PM, Mark Lewandowski wrote: I'm currently trying to get Cassandra (1.2.5) and Pig (0.11.1) to play nice together. I'm running a basic script: rows = LOAD 'cassandra://keyspace/colfam' USING CassandraStorage(); dump rows; This fails for my column family which has ~100,000 r

Re: Bulk loader with Cassandra 1.2.5

2013-06-08 Thread Arthur Zubarev
I am interested to know if the compaction directive is the key because I have the same symptoms on Ubuntu Server 12.04 64 bit C* 1.2.4 with a CF of ~ > half mil records 6,000 chars each. I can only get back max 6,000 records read in cqlsh, so, if I query SELECT COUNT(*) FROM A_CF LIMIT 6000; I

Getting error Request did not complete within rpc_timeout.

2013-06-04 Thread Arthur Zubarev
Hello, I am on a 64 Bit Ubuntu server 12.04 with 50% of memory free, 2 core CPU idling (it is running on old hardware). C* 1.2.4 has 1 local node. All worked until I bulk-loaded ~3K rows of 210 columns. I created the data and index files off a CSV file as per more or less http://www.datastax.

Re: Creating namespace and column family from multiple nodes concurrently

2013-05-27 Thread Arthur Zubarev
applied. Good Luck! From: Emalayan Vairavanathan Sent: Friday, May 24, 2013 1:14 AM To: user@cassandra.apache.org ; Arthur Zubarev Subject: Re: Creating namespace and column family from multiple nodes concurrently I am sorry if I was not clear. I was using nodes to refer machines (or vice versa

Re: Creating namespace and column family from multiple nodes concurrently

2013-05-23 Thread Arthur Zubarev
so where the multiple nodes are? I am just puzzled From: Emalayan Vairavanathan Sent: Thursday, May 23, 2013 3:43 PM To: Arthur Zubarev ; user@cassandra.apache.org Subject: Re: Creating namespace and column family from multiple nodes concurrently "Would each device/machine have it

Re: Creating namespace and column family from multiple nodes concurrently

2013-05-23 Thread Arthur Zubarev
manual intervention ? - Or will this result in race conditions ? - Or some other issues e.g: memory/ cpu /network bottlenecks ? Thank you Emalayan From: Arthur Zubarev To: user@cassandra.apache.org

Re: Creating namespace and column family from multiple nodes concurrently

2013-05-22 Thread Arthur Zubarev
I am assuming here you want to sync all the 100s of nodes once the application is airborne. I suspect this would flood the network and even potentially affect the machine itself memory-wise. How are you going to maintain the nodes (compaction+repair)? Regards, Arthur -Original

Re: Unable to start Cassandra

2013-05-20 Thread Arthur Zubarev
specifics and where I found the correct fix is https://issues.apache.org/jira/browse/CASSANDRA-4058#comment-13662604 Regards, Arthur Original Message From: Arthur Zubarev To: user Sent: Mon, May 20, 2013 9:46 pm Subject: Re: Unable to start Cassandra Copied the contents of

Re: Unable to start Cassandra

2013-05-20 Thread Arthur Zubarev
ities -XX:ThreadPriorityPolicy=42 -Xms1841M -Xmx1841M -Xmn400M -XX:+HeapDumpOnOutOfMemoryError -Xss180k Regards, Arthur Original Message From: Arthur Zubarev To: user Sent: Mon, May 20, 2013 4:37 pm Subject: Re: Unable to start Cassandra I have no such evidence: Start

Re: Unable to start Cassandra

2013-05-20 Thread Arthur Zubarev
there and own them? Hm... that's weird -- try looking in the dpkg logs to find out if they were somehow removed. You might try reinstalling the package to have it reinstall the files. Faraaz On Mon, May 20, 2013 at 01:15:05PM -0700, Arthur Zubarev wrote: > Hi Faraaz, > > yes, this t

Re: Unable to start Cassandra

2013-05-20 Thread Arthur Zubarev
them should have the log4j configuration in it, I presume? Faraaz On Mon, May 20, 2013 at 09:17:11AM -0700, Arthur Zubarev wrote: > > > Hello All: > > I had Cassandra 1.2.4 running up until yesterday, then after I took an Ubuntu > update (may be related or may not), I sud

Unable to start Cassandra

2013-05-20 Thread Arthur Zubarev
Hello All: I had Cassandra 1.2.4 running up until yesterday, then after I took an Ubuntu update (may be related or may not), I suddenly wasn't able to connect any more. The error was related to lack of memory and suggested a modification to its config, I kept increa

Re: Failing to install C* error

2013-05-02 Thread Arthur Zubarev
Indeed, I figured this post is outdated. The proper location of the packages is deb http://www.apache.org/dist/cassandra/debian 12x main Arthur Original Message From: Arthur Zubarev To: user Sent: Thu, May 2, 2013 10:59 pm Subject: Failing to install C* error Hello: I am