Original Message
From: Robert Coli
To: user
Sent: Mon, Sep 29, 2014 8:01 pm
Subject: Re: Indexes Fragmentation
On Sun, Sep 28, 2014 at 9:49 AM, Arthur Zubarev wrote:
There are 200+ times more updates and 50x more inserts than analytical loads.
In Cassandra to just be able to query (in CQL
From: Arthur Zubarev
Sent: Sunday, September 28, 2014 11:19 AM
To: user@cassandra.apache.org
Subject: Re: Indexes Fragmentation
Thank you Jack,
But I am afraid it may be an overhead. Added complexity.
/Arthur
Original Message
From: Jack Krupansky
To: user
Sent: Sun, Sep 28, 2014
e, then it would be easier to answer
your question.
Hannu
---------- Forwarded message ----------
From: Arthur Zubarev
Date: 2014-09-28 17:55 GMT+03:00
Subject: Indexes Fragmentation
To: user@cassandra.apache.org
of Cassandra data.
-- Jack Krupansky
From: Arthur Zubarev
Sent: Sunday, September 28, 2014 10:55 AM
To: user@cassandra.apache.org
Subject: Indexes Fragmentation
Hi all:
A client on an RDBMS faces quick index fragmentation; statistics become
inaccurate, many within 4 hours (fast updates + writes, but mostly updates).
I am looking into replacing the RDBMS with Cassandra.
Will I face the same issue with indexes with Cassandra?
Thank you!
Regards,
Art
I faced the same issue in my early days with C*; specifically, I got RPC
timeouts when selecting data from CFs larger than 300 GB.
The typical remedy is to implement paging, so instead of using the CLI, resort
to a custom-built client app.
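The paging remedy above can be sketched as follows: a minimal Python illustration over an in-memory list standing in for a column family. `fetch_page`, `iterate_all`, and the page size are hypothetical names for illustration, not a real driver API.

```python
# Client-side paging sketch: fetch rows in fixed-size pages instead of one
# huge SELECT *, so no single request can outlive the RPC timeout.
# The list `rows` stands in for a Cassandra column family.

def fetch_page(rows, start, page_size):
    """Return one page of rows beginning at offset `start`."""
    return rows[start:start + page_size]

def iterate_all(rows, page_size=100):
    """Yield every row, one bounded page at a time."""
    start = 0
    while True:
        page = fetch_page(rows, start, page_size)
        if not page:
            break  # no rows left; stop paging
        for row in page:
            yield row
        start += len(page)

if __name__ == "__main__":
    data = [{"key": i} for i in range(1050)]
    fetched = list(iterate_all(data, page_size=100))
    print(len(fetched))  # every row retrieved in pages of at most 100
```

In a real client the offset would be replaced by the last seen row key (or the driver's paging state), but the bounded-request loop is the same idea.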
Regards,
Arthur
From: shahab
Sent: Wednesday, March 26
Hello Pradeep,
Let me try to help you; I faced a similar issue, too.
The thing is, selecting all the records at once is not an ideal approach.
No matter how strong the hardware is, an arbitrarily increased RPC timeout
would not help; whatever value you give it, the ‘SELECT *’ query
I understand the intent is to remove select columns; then a simple DELETE
would do.
The other way of doing things is probably having empty placeholders for the
dates/data.
/Arthur
From: Suruchi Deodhar
Sent: Friday, August 16, 2013 2:38 PM
To: user@cassandra.apache.org ; Arthur Zubarev
Not sure what client you use to interact with C*. In your scenario I assume it
is CQL.
So what can go wrong with TRUNCATE?
Regards
Arthur
From: Suruchi Deodhar
Sent: Friday, August 16, 2013 12:23 PM
To: user@cassandra.apache.org
Subject: Deleting column data from Cassandra without setting i
What about compactions, how often do you run them?
-Original Message-
From: Todd Nine
Sent: Friday, August 16, 2013 1:43 PM
To: user@cassandra.apache.org
Subject: Configuring ephemeral only column family
Hi guys,
We're using expiring columns as a means for locking. All of this
data sh
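A rough Python sketch of the expiring-columns-as-locks idea above: a column written with a TTL acts as a lock that vanishes on its own if its holder dies. A dict of expiry timestamps stands in for the column family; `TTLLockTable` and `try_acquire` are invented names for illustration only.

```python
import time

class TTLLockTable:
    """Dict-backed stand-in for a CF whose columns expire after a TTL."""

    def __init__(self):
        self._locks = {}  # lock name -> expiry timestamp

    def try_acquire(self, name, ttl_seconds, now=None):
        """Write the lock column only if no live (unexpired) one exists."""
        now = time.time() if now is None else now
        expiry = self._locks.get(name)
        if expiry is not None and expiry > now:
            return False          # live lock column still present
        self._locks[name] = now + ttl_seconds
        return True               # wrote the expiring column

lock = TTLLockTable()
print(lock.try_acquire("job-42", ttl_seconds=30, now=100.0))  # True: acquired
print(lock.try_acquire("job-42", ttl_seconds=30, now=110.0))  # False: still live
print(lock.try_acquire("job-42", ttl_seconds=30, now=200.0))  # True: expired
```

The appeal over explicit unlock is exactly the TTL: a crashed holder's lock disappears on its own instead of requiring manual cleanup.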
Hello Mikhail,
Bullet 1 implies consistency, but at a later time, and you don't lose the
transaction. By the way, RF 3 to support financials is too low.
On #2: if the entire disk (which had no parity) fails, you lose this write, but
the 3rd node would still have the write.
Again, having a greater CF is
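The replica arithmetic in the reply above (RF 3, one failed disk, the write surviving on another node) can be sketched; `quorum` and `read_succeeds` are illustrative helpers, not Cassandra code.

```python
# With RF = 3, a QUORUM operation needs 2 of 3 replicas, so losing one
# replica's disk leaves both the data and quorum availability intact.

def quorum(rf):
    """Replicas required for a QUORUM operation: floor(rf / 2) + 1."""
    return rf // 2 + 1

def read_succeeds(rf, replicas_alive, consistency="QUORUM"):
    """Whether enough replicas remain to satisfy the consistency level."""
    needed = quorum(rf) if consistency == "QUORUM" else rf
    return replicas_alive >= needed

print(quorum(3))            # 2
print(read_succeeds(3, 2))  # True: one disk lost, quorum intact
print(read_succeeds(3, 1))  # False: two replicas gone
```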
: Thursday, August 01, 2013 3:03 PM
To: user@cassandra.apache.org ; Arthur Zubarev
Subject: Re: How often to run `nodetool repair`
Arthur,
Yes, my use case for this Cassandra cluster is analytics. I am building a
google dapper (application tracing) like system. I collect application traces
and write
Hi Morgan,
Scaling out depends on several factors. The most intricate part is perhaps
calculating the tokens.
Also the Cassandra version is important.
At this point in time I suggest you read section Adding Capacity to an
Existing Cluster at
http://www.datastax.com/docs/1.0/operations/cluste
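The token calculation mentioned above can be illustrated for the RandomPartitioner's 0..2^127 token space; `balanced_tokens` is a sketch of the even-spacing formula from the era's DataStax docs, not an official tool.

```python
# Evenly spaced initial tokens for a balanced ring under the
# RandomPartitioner: node i of N gets token i * (2**127 // N).

def balanced_tokens(node_count):
    """Initial tokens that split the 0..2**127 token range evenly."""
    step = 2 ** 127 // node_count
    return [i * step for i in range(node_count)]

print(balanced_tokens(4))  # four tokens, one quarter of the ring apart
```

When adding capacity, the same formula is recomputed for the new node count and existing nodes are moved (`nodetool move`) to their new tokens.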
Hi Carl,
The ‘repair’ is for data reads; compaction will take care of the expired data.
The fact that a repair runs long makes me think the nodes receive unbalanced
amounts of writes instead.
Regards,
Arthur
From: Carl Lerche
Sent: Thursday, August 01, 2013 12:35 PM
To: user@cassandra.apache.org
better.
Just some thoughts on the matter.
-Tony
--------
*From:* Arthur Zubarev
*To:* Tony Anecito ; Robert Coli
; Users-Cassandra
*Sent:* Wednesday, June 26, 2013 3:08 PM
*Subject:* Re: Creating an "Index" column...
Ton
Tony hi,
Yes, in some scenarios (e.g. a DW), in the absence of proper PKs or indexes
(just too hard to envision; you need to think of future queries first), getting
through large volumes of data makes NoSQL, IMHO, hard to fit in.
But you have other choices:
1) pagination or
2) slice queries.
Both of th
This is the best reference I have seen so far:
http://www.datastax.com/dev/blog/bulk-loading But I must say it has not been
updated to match the most recent changes in C*. I suggest you read through the
comments, too.
From: S C
Sent: Tuesday, June 25, 2013 10:23 PM
To: user@cassandra.apache.org
Subject: RE
Hello SC,
whilst most sstableloader errors stem from incorrect setups, I suspect this
time you merely have a connectivity issue, e.g. a firewall blocking traffic.
From: S C
Sent: Tuesday, June 25, 2013 5:28 PM
To: user@cassandra.apache.org
Subject: RE: copy data between clusters
Bob and
On 06/24/2013 11:35 PM, S C wrote:
I have a scenario here. I have a cluster A and cluster B running on
cassandra 1.1. I need to copy data from Cluster A to Cluster B.
Cluster A has few keyspaces that I need to copy over to Cluster B.
What are my options?
Thanks,
SC
I am thinking of SSTABLELOA
On 06/24/2013 11:23 PM, Blair Zajac wrote:
CAS UPDATE
Since when does C* have IF NOT EXISTS in the DML part of CQL?
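For context, IF NOT EXISTS arrived with CQL3 lightweight transactions in Cassandra 2.0. Its compare-and-set semantics can be sketched in Python against a plain dict standing in for a table; `insert_if_not_exists` is an illustrative name, and the boolean mimics the `[applied]` column Cassandra returns.

```python
# Semantics of CQL's INSERT ... IF NOT EXISTS: the write happens only when
# no row with that key exists, and the result reports whether it applied.

def insert_if_not_exists(table, key, row):
    """Insert `row` under `key` only if absent; mimic the [applied] result."""
    if key in table:
        return False, table[key]   # not applied; existing row comes back
    table[key] = row
    return True, row               # applied

users = {}
print(insert_if_not_exists(users, "arthur", {"name": "Arthur"}))  # applied
print(insert_if_not_exists(users, "arthur", {"name": "Other"}))   # rejected
```

In real Cassandra this check-then-write is coordinated via Paxos across replicas, which is why LWT operations cost noticeably more than plain writes.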
--
Regards,
Arthur
Hi Josh,
are you looking at the read counter produced by cfstats?
If so, it is not for a CF but for the entire KS, and it is not tied to a
specific operation but rather accumulates over the entire lifetime of the JVM.
Just in case, some supporting info:
http://stackoverflow.com/questions/9431590/cassandra-cfstats-and-me
Cem hi,
as per http://wiki.apache.org/cassandra/FAQ#dropped_messages
Internode messages which are received by a node but do not get to be
processed within rpc_timeout are dropped rather than processed, as the
coordinator node will no longer be waiting for a response. If the Coordinator
no
y", line 293, in flush
    self.__trans.write(buf)
  File "/usr/share/cassandra/lib/thrift-python-internal-only-0.7.0.zip/thrift/transport/TSocket.py", line 117, in write
    plus = self.handle.send(buff)
error: [Errno 32] Broken pipe
I then lose C* and need to restart its service to
Hello,
I am unable to count records using cqlsh (e.g. select count(*) from MyCF limit
5;)
I have a column family with 210 columns x 500K rows. The row length is 40K
chars.
The same issue is with any other large CF.
On 06/07/2013 06:02 PM, Mark Lewandowski wrote:
I'm currently trying to get Cassandra (1.2.5) and Pig (0.11.1) to play
nice together. I'm running a basic script:
rows = LOAD 'cassandra://keyspace/colfam' USING CassandraStorage();
dump rows;
This fails for my column family which has ~100,000 r
I am interested to know if the compaction directive is the key, because I
have the same symptoms on Ubuntu Server 12.04 64-bit, C* 1.2.4, with a CF
of more than half a million records, 6,000 chars each.
I can only get back a maximum of 6,000 records in cqlsh, so if I query
SELECT COUNT(*) FROM A_CF LIMIT 6000; I
Hello,
I am on a 64 Bit Ubuntu server 12.04 with 50% of memory free, 2 core CPU idling
(it is running on old hardware).
C* 1.2.4 has 1 local node. All worked until I bulk-loaded ~3K rows of 210
columns.
I created the data and index files off a CSV file as per more or less
http://www.datastax.
applied.
Good Luck!
From: Emalayan Vairavanathan
Sent: Friday, May 24, 2013 1:14 AM
To: user@cassandra.apache.org ; Arthur Zubarev
Subject: Re: Creating namespace and column family from multiple nodes
concurrently
I am sorry if I was not clear. I was using nodes to refer to machines (or vice
versa
So where are the multiple nodes? I am just puzzled.
From: Emalayan Vairavanathan
Sent: Thursday, May 23, 2013 3:43 PM
To: Arthur Zubarev ; user@cassandra.apache.org
Subject: Re: Creating namespace and column family from multiple nodes
concurrently
"Would each device/machine have it
manual
intervention ?
- Or will this result in race conditions ?
- Or some other issues e.g: memory/ cpu /network bottlenecks ?
Thank you
Emalayan
From: Arthur Zubarev
To: user@cassandra.apache.org
I am assuming here you want to sync all the hundreds of nodes once the
application is airborne. I suspect this would flood the network and even
potentially affect the machines themselves memory-wise. How are you going to
maintain the nodes (compaction + repair)?
Regards,
Arthur
-Original
specifics and where I found the correct fix is
https://issues.apache.org/jira/browse/CASSANDRA-4058#comment-13662604
Regards,
Arthur
Original Message
From: Arthur Zubarev
To: user
Sent: Mon, May 20, 2013 9:46 pm
Subject: Re: Unable to start Cassandra
Copied the contents of
ities -XX:ThreadPriorityPolicy=42 -Xms1841M -Xmx1841M
-Xmn400M -XX:+HeapDumpOnOutOfMemoryError -Xss180k
Regards,
Arthur
Original Message
From: Arthur Zubarev
To: user
Sent: Mon, May 20, 2013 4:37 pm
Subject: Re: Unable to start Cassandra
I have no such evidence:
Start
there and own them? Hm... that's weird -- try looking in the dpkg logs to
find out if they were somehow removed. You might try reinstalling the package to
have it reinstall the files.
Faraaz
On Mon, May 20, 2013 at 01:15:05PM -0700, Arthur Zubarev wrote:
> Hi Faraaz,
>
> yes, this t
them should have the log4j configuration in it, I presume?
Faraaz
On Mon, May 20, 2013 at 09:17:11AM -0700, Arthur Zubarev wrote:
>
>
> Hello All:
>
> I had Cassandra 1.2.4 running up until yesterday, then after I took an Ubuntu
> update (may be related or may not), I sud
Hello All:
I had Cassandra 1.2.4 running up until yesterday; then, after I took an
Ubuntu update (may be related or may not), I suddenly wasn't able to connect
any more.
The error was related to lack of memory and suggested a modification to
its config. I kept increa
Indeed, I figured this post is outdated.
The proper location of the packages is
deb http://www.apache.org/dist/cassandra/debian 12x main
Arthur
Original Message
From: Arthur Zubarev
To: user
Sent: Thu, May 2, 2013 10:59 pm
Subject: Failing to install C* error
Hello:
I am