data ends up on one disk. If you need the additional I/O, you will want
RAID 0; simply listing multiple DataFileDirectories will not work.
-Anthony
On Wed, Mar 10, 2010 at 02:08:13AM -0600, Stu Hood wrote:
> You can list multiple DataFileDirectories, and Cassandra will scatter files
> ac
Anyone can edit any page once they have an account: click the "Login" link at
the top right next to the search box to create an account.
Thanks,
Stu
-Original Message-
From: "Eric Rosenberry"
Sent: Wednesday, March 10, 2010 2:52am
To: cassandra-user@incubator.apache.org
Subject: Cassand
You can list multiple DataFileDirectories, and Cassandra will scatter files
across all of them. Use 1 disk for the commitlog, and 3 disks for data
directories.
See http://wiki.apache.org/cassandra/CassandraHardware#Disk
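As a sketch, the layout Stu describes might look roughly like this in storage-conf.xml (element names follow the 0.5/0.6-era config format; the mount points are illustrative):

```xml
<!-- Commit log on its own dedicated disk for sequential writes -->
<CommitLogDirectory>/disk1/cassandra/commitlog</CommitLogDirectory>

<!-- Data files scattered across the remaining three disks -->
<DataFileDirectories>
    <DataFileDirectory>/disk2/cassandra/data</DataFileDirectory>
    <DataFileDirectory>/disk3/cassandra/data</DataFileDirectory>
    <DataFileDirectory>/disk4/cassandra/data</DataFileDirectory>
</DataFileDirectories>
```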
Thanks,
Stu
-Original Message-
From: "Eric Rosenberry"
Sent: Wedn
Definitely on board!
-Original Message-
From: "Dan Di Spaltro"
Sent: Tuesday, March 9, 2010 8:05pm
To: cassandra-user@incubator.apache.org
Subject: Re: Hackathon?!?
Alright guys, we have settled on a date for the Cassandra meetup on...
April 15th, better known as Tax Day!
We can host
Run `ant clean` before building. A few files moved around.
-Original Message-
From: "Cool BSD"
Sent: Monday, March 8, 2010 5:18pm
To: "cassandra-user"
Subject: Latest check-in to trunk/ is broken
version info:
$ svn info
Path: .
URL: https://svn.apache.org/repos/asf/incubator/cassandra/
But rather than switching, you should definitely try the 'loadbalance' approach
first, and see whether OrderPP works out for you.
-Original Message-
From: "Chris Goffinet"
Sent: Friday, March 5, 2010 1:43pm
To: cassandra-user@incubator.apache.org
Subject: Re: Dynamically Switching from O
You are probably in the portion of bootstrap where data to be transferred is
split out to disk, which can take a while: see
https://issues.apache.org/jira/browse/CASSANDRA-579
Look for a 'streaming' subdirectory in your data directories to confirm.
-Original Message-
From: "Brian Frank
> In HBase you have table:row:family:key:val:version, which some people
> might consider richer
Cassandra is actually table:family:row:key:val[:subval], where subvals are the
columns stored in a supercolumn (which can be easily arranged by timestamp to
give the versioned approach).
-Origina
`nodetool cleanup` is a very expensive process: it performs a major compaction,
and should not be done that frequently.
-Original Message-
From: "shiv shivaji"
Sent: Sunday, February 28, 2010 3:34pm
To: cassandra-user@incubator.apache.org
Subject: Re: Anti-compaction Diskspace issue even
Ran,
There are bounds to how large your data directory will grow, relative to the
actual data. Please read up on compaction:
http://wiki.apache.org/cassandra/MemtableSSTable , and if you have a
significant number of deletes occurring, also read
http://wiki.apache.org/cassandra/DistributedDelete
> After I ran "nodeprobe compact" on node B its read latency went up to 150ms.
The compaction process can take a while to finish... in 0.5 you need to watch
the logs to figure out when it has actually finished, and then you should start
seeing the improvement in read latency.
> Is there any way
The combination of 'too many open files' and lots of memtable flushes could
mean you have tons and tons of sstables on disk. This can make reads especially
slow.
If you are seeing the timeouts on reads a lot more often than on writes, then
this explanation might make sense, and you should watch
PS: If this turns out to actually be the problem, I'll open a ticket for it.
Thanks,
Stu
-Original Message-
From: "Stu Hood"
Sent: Sunday, December 13, 2009 12:28pm
To: cassandra-user@incubator.apache.org
Subject: Re: OOM Exception
With 248G per box, you probably hav
With 248G per box, you probably have slightly more than 1/2 billion items?
One current implementation detail in Cassandra is that it loads 1/128th of the
index into memory for faster lookups. This means you might have something like
4.5 million keys in memory at the moment.
The '128' value is a c
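Back-of-the-envelope, the 4.5 million figure follows from the 1-in-128 index sampling (the 576 million key count is an assumed figure consistent with "slightly more than 1/2 billion"):

```python
total_keys = 576_000_000    # assumed: "slightly more than 1/2 billion items"
sample_ratio = 128          # Cassandra keeps every 128th index entry in memory
keys_in_memory = total_keys // sample_ratio
print(keys_in_memory)       # 4_500_000, i.e. ~4.5 million keys resident
```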
> JR> After chatting with some Facebook guys, we realized that one potential
> JR> benefit from using HDFS is that the recovery from losing partial data in a
> JR> node is more efficient. Suppose that one lost a single disk at a node.
> JR> HDFS can quickly rebuild the blocks on the failed disk
You need a quorum relative to your replication factor. You mentioned in the
first e-mail that you have RF=2, so you need a quorum of 2. If you use RF=3,
then you need a quorum of 2 as well.
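The quorum rule being applied here is simply "a majority of the replication factor"; a minimal sketch:

```python
def quorum(replication_factor: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    return replication_factor // 2 + 1

# Matches the cases above: RF=2 and RF=3 both require a quorum of 2.
assert quorum(2) == 2
assert quorum(3) == 2
```

Note the practical consequence: RF=2 gives no quorum-availability benefit over RF=3, since both tolerate the same single-replica majority requirement while RF=3 survives one replica being down.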
-Original Message-
From: "B. Todd Burruss"
Sent: Friday, November 20, 2009 4:14pm
To: cassandra-u
Hey Ted,
Would you mind creating a ticket for this issue in JIRA? A lot of discussion
has gone on, and a place to collect the design and feedback would be a good
start.
Thanks,
Stu
-Original Message-
From: "Ted Zlatanov"
Sent: Wednesday, November 11, 2009 3:28pm
To: cassandra-user@inc
This type of problem is one of the primary examples of something that should be
handled by pluggable/client-side conflict resolution in an eventually
consistent system. Currently, all conflicts in Cassandra are handled with
"highest timestamp wins"
Rather than attempting to add atomic operation
There is no such thing as a column or supercolumn that is not contained in a
ColumnFamily. The ColumnFamily is the structure that is stored together on disk.
A supercolumn is not what you think it is: supercolumns are like regular
columns, except they contain other columns, and you can have an a