Re: Does DateTieredCompactionStrategy work with a compound clustering key?

2015-03-07 Thread mck

 I believe, that the DateTieredCompactionStrategy would work for PRIMARY
 KEY (timeblock, timestamp) -- but does it also work for PRIMARY KEY
 (timeblock, timestamp, hash) ?


Yes.

 (sure you don't want to be using a timeuuid instead?)

~mck


Re: best practices for time-series data with massive amounts of records

2015-03-03 Thread mck

 Here partition is a random number from 0 to (N*M) 
 where N=nodes in cluster, and M=arbitrary number.


Hopefully it was obvious, but here (unless you've got hot partitions),
you don't need N.
~mck


Re: best practices for time-series data with massive amounts of records

2015-03-03 Thread mck
Clint,

 CREATE TABLE events (
   id text,
   date text, // Could also use year+month here or year+week or something else
   event_time timestamp,
   event blob,
   PRIMARY KEY ((id, date), event_time))
 WITH CLUSTERING ORDER BY (event_time DESC);
 
 The downside of this approach is that we can no longer do a simple
 continuous scan to get all of the events for a given user.  Some users
 may log lots and lots of interactions every day, while others may interact
 with our application infrequently, so I'd like a quick way to get the most
 recent interaction for a given user.
 
 Has anyone used different approaches for this problem?


One idea is to provide additional manual partitioning like…

CREATE TABLE events (
  user_id text,
  partition int,
  event_time timeuuid,
  event_json text,
  PRIMARY KEY ((user_id, partition), event_time)
) WITH
  CLUSTERING ORDER BY (event_time DESC) AND
  compaction={'class': 'DateTieredCompactionStrategy'};


Here partition is a random number from 0 to (N*M)
where N=nodes in cluster, and M=arbitrary number.

Read performance is going to suffer a little because you need to query
N*M times as many partition keys for each read, but it should be constant
enough that it comes down to increasing the cluster's hardware and
scaling out as need be.

The multikey reads you can do with a SELECT…IN query, or better yet
with parallel reads (less pressure on the coordinator at the expense of
extra network calls).
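
For illustration, a minimal sketch of the parallel-read side of that,
assuming the DataStax java driver 2.x and the events table above (M, the
class name and the method name are just placeholders):

import com.datastax.driver.core.ResultSetFuture;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import java.util.ArrayList;
import java.util.List;

public class LatestEvent {

  static final int M = 1;  // current number of buckets per user

  // fire one async read per bucket, then keep the newest row client-side
  static Row latestFor(Session session, String userId) {
    List<ResultSetFuture> futures = new ArrayList<>();
    for (int p = 0; p < M; p++) {
      futures.add(session.executeAsync(
          "SELECT event_time, event_json FROM events"
        + " WHERE user_id = ? AND partition = ? LIMIT 1", userId, p));
    }
    Row newest = null;
    for (ResultSetFuture f : futures) {
      Row r = f.getUninterruptibly().one();  // LIMIT 1 + DESC clustering = newest in that bucket
      if (r != null && (newest == null
          || r.getUUID("event_time").timestamp()
             > newest.getUUID("event_time").timestamp())) {
        newest = r;
      }
    }
    return newest;
  }
}

The same fan-out works for scanning a user's full history; with a
SELECT…IN you'd instead bind all the partition values in one statement
and let the coordinator do the merging.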

Starting with M=1, you have the option to increase it over time if the
number of rows in any user's partitions gets too high.
(We do¹ something similar for storing all raw events in our enterprise
platform, but because the data is not user-centric the initial partition
key is a minute-by-minute timebucket, and M has remained at 1 the whole
time).

This approach is better than using an order-preserving partitioner
(really don't do that).

I would also consider replacing event blob with event text, choosing
json instead of any binary serialisation. We've learnt the hard way the
value of data transparency, and i'm guessing the storage cost is small
given c* compression.

Otherwise the advice here is largely repeating what Jens has already
said.

~mck

  ¹ slide 19+20 from
  
https://prezi.com/vt98oob9fvo4/cassandra-summit-cassandra-and-hadoop-at-finnno/


Re: how to scan all rows of cassandra using multiple threads

2015-02-26 Thread mck

  Can I get data owned by a particular node and this way generate sum
  on different nodes by iterating over data from virtual nodes and later
 generate total sum by doing sum of data from all virtual nodes.
 


You're pretty much describing a map/reduce job using CqlInputFormat. 
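
A rough sketch of the job wiring, in case it helps (2.0-era hadoop
classes; the contact point, keyspace/table names and the missing
mapper/reducer are placeholders):

import org.apache.cassandra.hadoop.ConfigHelper;
import org.apache.cassandra.hadoop.cql3.CqlConfigHelper;
import org.apache.cassandra.hadoop.cql3.CqlInputFormat;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class SumJob {
  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "per-node-sum");
    job.setJarByClass(SumJob.class);
    job.setInputFormatClass(CqlInputFormat.class);

    Configuration conf = job.getConfiguration();
    ConfigHelper.setInputInitialAddress(conf, "localhost");    // any node in the cluster
    ConfigHelper.setInputRpcPort(conf, "9160");
    ConfigHelper.setInputPartitioner(conf, "Murmur3Partitioner");
    ConfigHelper.setInputColumnFamily(conf, "my_keyspace", "my_table");
    CqlConfigHelper.setInputCQLPageRowSize(conf, "1000");      // cql rows fetched per page

    // job.setMapperClass(..); job.setReducerClass(..); output format etc.
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Each input split covers a token range, so with tasktrackers co-located
with cassandra the mappers sum their data node-locally, and a single
reducer then adds up the per-range partial sums.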


Re: Node stuck in joining the ring

2015-02-26 Thread mck
Any errors in your log file?

We saw something similar when bootstrap crashed while rebuilding
secondary indexes.

See CASSANDRA-8798

~mck


Re: Why no virtual nodes for Cassandra on EC2?

2015-02-23 Thread mck

 … my understanding was that
 performance of Hadoop jobs on C* clusters with vnodes was poor because a
 given Hadoop input split has to run many individual scans (one for each
 vnode) rather than just a single scan.  I've run C* and Hadoop in
 production with a custom input format that used vnodes (and just combined
 multiple vnodes in a single input split) and didn't have any issues (the
 jobs had many other performance bottlenecks besides starting multiple
 scans from C*).

You've described the ticket, and how it has been solved :-)

 This is one of the videos where I recall an off-hand mention of the Spark
 connector working with vnodes:
 https://www.youtube.com/watch?v=1NtnrdIUlg0

Thanks.

~mck


Re: Why no virtual nodes for Cassandra on EC2?

2015-02-21 Thread mck
At least the problem of hadoop and vnodes described in CASSANDRA-6091
doesn't apply to spark.
 (Spark already allows multiple token ranges per split).

If this is the reason why DSE hasn't enabled vnodes then fingers crossed
that'll change soon.


 Some of the DataStax videos that I watched discussed how the Cassandra Spark 
 connecter has 
 optimizations to deal with vnodes.


Are these videos public? If so, got a link to them?

~mck 


Re: How to speed up SELECT * query in Cassandra

2015-02-16 Thread mck

 Could you please share how much data you store on the cluster and what
 is HW configuration of the nodes? 


These nodes are dedicated HW, 24 CPUs and 50GB RAM.
Each node has a few TBs of data (you don't want to go over this) in
raid50 (we're migrating over to JBOD).
Each c* node is running 2.0.11 and configured to use an 8GB heap, a 2GB
new gen, and jdk1.7.0_55.

Hadoop (2.2.0) tasktrackers and dfs run on these nodes as well; in total
they use up to 12GB RAM, leaving ~30GB RAM for kernel and page cache.
Data-locality is an important goal: in the worst-case scenarios we've
seen it mean a fourfold throughput benefit.

HDFS, being a volatile hadoop-internals space for us, is on SSDs,
providing strong m/r performance.
 (The commitlog of course is also on SSD – we made the mistake of putting
 it on the same SSD to begin with. Don't do that; the commitlog gets its
 own SSD.)


 I am really impressed that you are
 able to read 100M records in ~4minutes on 4 nodes. It makes something
 like 100k reads per node, which is something we are quite far away from.


These are not individual reads and not the number of partition keys, but
m/r records (or cql rows).
But yes, the performance of spark against cassandra is impressive.


 It leads me to question, whether reading from Spark goes through
 Cassandra's JVM and thus go through normal read path, or if it reads the
 sstables directly from disks sequentially and possibly filters out
 old/tombstone values by itself?


Both the Hadoop-Cassandra integration and the Spark-Cassandra connector
go through the normal read path, like all cql read queries.

With our m/r jobs each task works with just one partition key, doing
repeated column slice reads through that partition key according to the
ConfigHelper.rangeBatchSize setting, which we have set to 100. These
hadoop jobs use a custom-written CqlInputFormat due to the poor
performance CqlInputFormat has today against a vnodes setup; the
customisation we have is pretty much the same as the patch on offer in
CASSANDRA-6091.
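
For reference, that batch size is just a one-liner in the job setup
(a sketch; the Job object comes from wherever you build the hadoop job):

import org.apache.cassandra.hadoop.ConfigHelper;
import org.apache.hadoop.mapreduce.Job;

public class SliceTuning {
  static void tune(Job job) {
    // number of columns (cql rows) fetched per slice read within a partition key
    ConfigHelper.setRangeBatchSize(job.getConfiguration(), 100);
  }
}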

We haven't experienced this vnodes problem with the spark connector.
I presume that, like the hadoop integration, spark also bulk reads
(column slices) from each partition key.

Otherwise this is useful reading
http://wiki.apache.org/cassandra/HadoopSupport#Troubleshooting


 This is also a cluster that serves requests to web applications that
 need low latency.

Let it be said this isn't something i'd recommend, just the path we had
to take because of our small initial dedicated-HW cluster.
(You really want to separate online and offline datacenters, so that you
can maximise the offline clusters for the heavy batch reads).

~mck


Re: How to speed up SELECT * query in Cassandra

2015-02-14 Thread mck
Jirka,

 But I am really interested how it can work well with Spark/Hadoop where
 you basically needs to read all the data as well (as far as I understand
 that).


I can't give you any benchmarking between technologies (nor am i
particularly interested in getting involved in such a discussion) but i
can share our experiences with Cassandra, Hadoop, and Spark, over the
past 4+ years, and hopefully assure you that Cassandra+Spark is a smart
choice.

On a four node cluster we were running 5000+ small hadoop jobs each day,
each finishing within two minutes, often within one minute, resulting in
(give or take) a billion records read and 150 million records written
from and to c*.
These small jobs incrementally process a limited set of partition keys
each time. They primarily read from a raw events store that has a ttl of
3 months and 22+GB of tombstones a day (reads over old partition keys
are rare).

We also run full-table-scan jobs and have never come across any issues
particular to that. There are hadoop map/reduce settings to increase
durability if you have tables with troublesome partition keys.

This is also a cluster that serves requests to web applications that
need low latency.

We recently wrote a spark job that does full table scans over 100
million+ rows, involves a handful of stages (two tables, 9 maps, 4
reduces, and 2 joins), and writes 5 million rows back to a new table.
This job runs in ~260 seconds.

Spark is becoming a natural complement to schema evolution for
cassandra, something you'll want to do to keep your schema optimised
against your read request patterns, even for little things like
switching clustering keys around.

With any new technology, hitting some hurdles (especially if you go
wandering outside recommended practices) will of course be part of the
game, but that said I've only had positive experiences with this
community's ability to help out (and do so quickly).

Starting from scratch i'd use Spark (on scala) over Hadoop, no questions
asked.
Otherwise Cassandra has always been our 'big data' platform,
hadoop/spark is just an extra tool on top.
We've never kept data in hdfs and are very grateful for having made that
choice.

~mck

ref
https://prezi.com/vt98oob9fvo4/cassandra-summit-cassandra-and-hadoop-at-finnno/


Re: cqlinputformat and retired cqlpagingingputformat creates lots of connections to query the server

2015-01-28 Thread mck
Shenghua,
 
 The problem is the user might only want all the data via a select *
 like statement. It seems that 257 connections to query the rows are necessary.
 However, is there any way to prohibit 257 concurrent connections?


Your reasoning is correct.
The number of connections should be tunable via the
cassandra.input.split.size property. See
ConfigHelper.setInputSplitSize(..)
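
For example (a sketch; the value here is illustrative, three times the
default of 64k rows per split iirc):

import org.apache.cassandra.hadoop.ConfigHelper;
import org.apache.hadoop.mapreduce.Job;

public class SplitTuning {
  static void tune(Job job) {
    // target number of rows per input split (cassandra.input.split.size)
    ConfigHelper.setInputSplitSize(job.getConfiguration(), 3 * 64 * 1024);
  }
}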

The problem is that vnodes completely trash this, since the splits
returned don't span across vnodes.
There's an issue out for this –
https://issues.apache.org/jira/browse/CASSANDRA-6091
 but part of the problem is that the thrift stuff involved here is
 getting rewritten¹ to be pure cql.

In the meantime you can override the CqlInputFormat and manually
re-merge splits where their location sets match, so as to better honour
inputSplitSize and return a more reasonable number of connections.
We do this, using code similar to this patch
https://github.com/michaelsembwever/cassandra/pull/2/files

~mck

¹ https://issues.apache.org/jira/browse/CASSANDRA-8358


Re: Which Topology fits best ?

2015-01-26 Thread mck

 However I guess it can be easily changed ? 


 that's correct.


Re: Which Topology fits best ?

2015-01-25 Thread mck
NetworkTopologyStrategy gives you a better horizon and more flexibility
as you scale out, at least once you've gone past small cluster problems
like wanting RF=3 in a 4 node two dc cluster.

IMO I'd go with DC1:1,DC2:1.
~mck



Re: Why does C* repeatedly compact the same tables over and over?

2015-01-08 Thread mck

 Are you using Leveled compaction strategy? 


And if you're using Date Tiered compaction strategy on a table that
isn't time-series data, for example where deletes happen, you'll find it
compacting over and over.

~mck


Re: Storing large files for later processing through hadoop

2015-01-02 Thread mck
 1) The FAQ … informs that I can have only files of around 64 MB …

See http://wiki.apache.org/cassandra/CassandraLimitations
 A single column value may not be larger than 2GB; in practice, single
 digits of MB is a more reasonable limit, since there is no streaming
 or random access of blob values.

CASSANDRA-16  only covers pushing those objects through compaction.
Getting the objects in and out of the heap during normal requests is
still a problem.

You could manually chunk them down to 64MB pieces.
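
A minimal sketch of such client-side chunking (the constant and names
are only for illustration; in practice you'd store a chunk index and
total count alongside each piece so the blob can be re-assembled on
read):

import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class BlobChunker {

  static final int CHUNK_SIZE = 64 * 1024 * 1024;  // the FAQ's 64MB figure; smaller is kinder to the heap

  // split a large payload into CHUNK_SIZE pieces, each to be written as its own column/row
  static List<ByteBuffer> chunk(byte[] blob) {
    List<ByteBuffer> chunks = new ArrayList<>();
    for (int offset = 0; offset < blob.length; offset += CHUNK_SIZE) {
      int length = Math.min(CHUNK_SIZE, blob.length - offset);
      chunks.add(ByteBuffer.wrap(blob, offset, length).slice());
    }
    return chunks;
  }
}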


 2) Can I replace HDFS with Cassandra so that I don't have to sync/fetch
 the file from cassandra to HDFS when I want to process it in hadoop cluster?


We¹ keep HDFS as a volatile filesystem simply for hadoop internals. No
need for backups of it, no need to upgrade data, and we're free to wipe
it whenever hadoop has been stopped. 

Otherwise all our hadoop jobs still read from and write to Cassandra.
Cassandra is our big data platform, with hadoop/spark just providing
additional aggregation abilities. I think this is the more effective
way, rather than trying to completely gut out HDFS.

There was previously a datastax project for replacing HDFS with
Cassandra, but i don't think it's alive anymore.

~mck


Re: Storing large files for later processing through hadoop

2015-01-02 Thread mck
 Since the hadoop MR streaming job requires the file to be processed to be 
 present in HDFS,
  I was thinking whether can it get directly from mongodb instead of me 
 manually fetching it 
 and placing it in a directory before submitting the hadoop job?


Hadoop M/R can get data directly from Cassandra. See CqlInputFormat.

~mck


Re: 2.0.10 to 2.0.11 upgrade and immediate ParNew and CMS GC storm

2014-12-29 Thread mck
  Should I stick to 2048 or try
  with something closer to 128 or even something else ?


2048 worked fine for us.


  About HSHA,
 
 I anti-recommend hsha, serious apparently unresolved problems exist with
 it.


We saw an improvement when we switched to HSHA, particularly for our
offline (hadoop/spark) nodes.
Sorry i don't have the data anymore to support that statement, although
i can say that improvement paled in comparison to the one from
cross_node_timeout, which we enabled shortly afterwards.

~mck


Re: 2.0.10 to 2.0.11 upgrade and immediate ParNew and CMS GC storm

2014-12-29 Thread mck

 Perf is better, correctness seems less so. I value latter more than
 former.


Yeah no doubt.
Especially in CASSANDRA-6285 i see some scary stuff went down.

But there are no outstanding bugs that we know of, are there? 
 (CASSANDRA-6815 remains just a wrap up of how options are to be
 presented in cassandra.yaml?)

~mck


Can initial_token be decimal or hexadecimal format?

2011-09-13 Thread Mck
And does it matter when using different partitioners?

In the config it seems only strings are used.
In RP it parses this string into a BigInteger so it needs to be in
decimal format,
but for ByteOrderPartitioner it uses FBUtilities.hexToBytes(..) when
translating a string to a token (BytesToken).

More to the point...
For a 3 node cluster using BOP where my largest token will be
0x8000 (coincidentally 2**127)
should i write out initial_tokens like

node0: 0
node1: 2AAA
node2: 5554

or like

node0: 0
node1: 56713727820156410577229101238628035242
node2: 113427455640312821154458202477256070484


If it is the former there's some important documentation missing.

~mck


ps CASSANDRA-1006 seems to be of some relation.




Task's map reading more records than CFIF's inputSplitSize

2011-09-07 Thread Mck
Cassandra-0.8.4 w/ ByteOrderedPartitioner

CFIF's inputSplitSize=196608

3 map tasks (out of 4013) are still running after reading 25 million rows.

Can this be a bug in StorageService.getSplits(..) ?

With this data I've had a general headache using tokens that are
longer than usual (and trying to move nodes around to balance the ring).

 nodetool ring gives

Address         Status State   Load        Owns    Token
                                                    Token(bytes[76118303760208547436305468318170713656])
152.90.241.22   Up     Normal  270.46 GB   33.33%  Token(bytes[30303030303031333131313739353337303038d4e7f72db2ed11e09d7c68b59973a5d8])
152.90.241.24   Up     Normal  247.89 GB   33.33%  Token(bytes[303030303030313331323631393735313231381778518cc00711e0acb968b59973a5d8])
152.90.241.23   Up     Normal  1.1 TB      33.33%  Token(bytes[76118303760208547436305468318170713656])


~mck



Re: RF=1 w/ hadoop jobs

2011-09-01 Thread Mck
On Thu, 2011-08-18 at 08:54 +0200, Patrik Modesto wrote:
 But there is the another problem with Hadoop-Cassandra, if there is no
 node available for a range of keys, it fails on RuntimeError. For
 example having a keyspace with RF=1 and a node is down all MapReduce
 tasks fail. 

CASSANDRA-2388 is related but not the same.

Before 0.8.4 the behaviour was that if the local cassandra node didn't
have the split's data, the tasktracker would connect to another
cassandra node where the split's data could be found.

So even on 0.8.4, with RF=1 you would have your hadoop job fail.

Although I've reopened CASSANDRA-2388 (and reverted the code locally)
because the new behaviour in 0.8.4 leads to abysmal tasktracker
throughput (for me task allocation doesn't seem to honour data-locality
according to split.getLocations()).

 I've reworked my previous patch, that was addressing this
 issue and now there are ConfigHelper methods for enable/disable
 ignoring unavailable ranges.
 It's available here: http://pastebin.com/hhrr8m9P (for version 0.7.8) 

I'm interested in this patch and see its usefulness, but no one will act
until you attach it to an issue. (I think a new issue is appropriate
here.)

~mck



IOException: Unable to create hard link ... /snapshots/ ... (errno 17)

2011-05-03 Thread Mck
Running a 3 node cluster with cassandra-0.8.0-beta1 

I'm seeing the first node logging lines like the following many
(thousands of) times:


Caused by: java.io.IOException: Unable to create hard link
  from /iad/finn/countstatistics/cassandra-data/countstatisticsCount/thrift_no_finntech_countstats_count_Count_1299479381593068337-f-5504-Data.db
  to /iad/finn/countstatistics/cassandra-data/countstatisticsCount/snapshots/compact-thrift_no_finntech_countstats_count_Count_1299479381593068337/thrift_no_finntech_countstats_count_Count_1299479381593068337-f-5504-Data.db (errno 17)


This seems to happen for all column families (including system).
It happens a lot during startup.

The hardlinks do exist. Stopping, deleting the hardlinks, and starting
again does not help.

But i haven't seen it once on the other nodes...

~mck


ps the stacktrace


java.io.IOError: java.io.IOException: Unable to create hard link
  from /iad/finn/countstatistics/cassandra-data/countstatisticsCount/thrift_no_finntech_countstats_count_Count_1299479381593068337-f-3875-Data.db
  to /iad/finn/countstatistics/cassandra-data/countstatisticsCount/snapshots/compact-thrift_no_finntech_countstats_count_Count_1299479381593068337/thrift_no_finntech_countstats_count_Count_1299479381593068337-f-3875-Data.db (errno 17)
        at org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1629)
        at org.apache.cassandra.db.ColumnFamilyStore.snapshot(ColumnFamilyStore.java:1654)
        at org.apache.cassandra.db.Table.snapshot(Table.java:198)
        at org.apache.cassandra.db.CompactionManager.doCompaction(CompactionManager.java:504)
        at org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:146)
        at org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:112)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Unable to create hard link
  from /iad/finn/countstatistics/cassandra-data/countstatisticsCount/thrift_no_finntech_countstats_count_Count_1299479381593068337-f-3875-Data.db
  to /iad/finn/countstatistics/cassandra-data/countstatisticsCount/snapshots/compact-thrift_no_finntech_countstats_count_Count_1299479381593068337/thrift_no_finntech_countstats_count_Count_1299479381593068337-f-3875-Data.db (errno 17)
        at org.apache.cassandra.utils.CLibrary.createHardLink(CLibrary.java:155)
        at org.apache.cassandra.io.sstable.SSTableReader.createLinks(SSTableReader.java:713)
        at org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1622)
        ... 10 more





Re: IOException: Unable to create hard link ... /snapshots/ ... (errno 17)

2011-05-03 Thread Mck
On Tue, 2011-05-03 at 16:52 +0200, Mck wrote:
 Running a 3 node cluster with cassandra-0.8.0-beta1 
 
 I'm seeing the first node logging many (thousands) times 

The only special thing about this first node is that it receives all the
writes from our sybase-cassandra import job.
This process migrates an existing 60 million rows into cassandra (before
the cluster is /turned on/ for normal operations). The import job runs
over ~20 minutes.

I wiped everything and started from scratch, this time running the
import job with cassandra configured instead with:

incremental_backups: false
snapshot_before_compaction: false

This time the problem just appeared on another node instead.
So changing to these settings on all nodes and running the import again
fixed it: no more "Unable to create hard link"...

After the import i could turn both incremental_backups and
snapshot_before_compaction back on again without problems so far.

To me this says something is broken with incremental_backups and
snapshot_before_compaction under heavy write load?

~mck




Re: IOException: Unable to create hard link ... /snapshots/ ... (errno 17)

2011-05-03 Thread Mck
On Tue, 2011-05-03 at 13:52 -0500, Jonathan Ellis wrote:
 you should probably look to see what errno 17 means for the link
 system call on your system. 

That the file already exists.
It seems cassandra is trying to make the same hard link in parallel
(under heavy write load)?

I see now i can also reproduce the problem with hadoop and
ColumnFamilyOutputFormat. 
Turning off snapshot_before_compaction seems to be enough to prevent
it. 

~mck




Re: IOException: Unable to create hard link ... /snapshots/ ... (errno 17)

2011-05-03 Thread Mck
On Tue, 2011-05-03 at 14:22 -0500, Jonathan Ellis wrote:
 Can you create a ticket?

CASSANDRA-2598



Re: [RELEASE] Apache Cassandra 0.8.0 beta1

2011-04-26 Thread Mck
On Fri, 2011-04-22 at 16:49 -0500, Eric Evans wrote:
 I am pleased to announce the release of Apache Cassandra 0.8.0 beta1.


*Truly Awesome!*  
  CQL rocks in so many ways. 


Is 0.8.0-beta1 available in apache's maven repository?
  And if not, why not? 

~mck





Re: [RELEASE] Apache Cassandra 0.8.0 beta1

2011-04-26 Thread Mck
On Tue, 2011-04-26 at 12:53 +0100, Stephen Connolly wrote:
 (or did you want 20million unneeded deps for the
 client jars?) 

Yes that's a good reason :-)
Is there anything i can help with?

Will beta versions be available under the releases repository?


~mck



Re: [mapreduce] ColumnFamilyRecordWriter hidden reuse

2011-01-26 Thread Mck
On Wed, 2011-01-26 at 12:13 +0100, Patrik Modesto wrote:
 BTW how to get current time in microseconds in Java?

I'm using HFactory.clock() (from hector).
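
If you don't want the hector dependency, a minimal sketch that gives
microsecond *units* (not true microsecond precision):

public final class Micros {
  // millisecond clock scaled to microsecond units; two calls within the
  // same millisecond return the same value, so dedupe if that matters
  static long timestamp() {
    return System.currentTimeMillis() * 1000L;
  }
}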

  As far as moving the clone(..) into ColumnFamilyRecordWriter.write(..)
  won't this hurt performance? 
 
 The size of the queue is computed at runtime:
 ColumnFamilyOutputFormat.QUEUE_SIZE, 32 *
 Runtime.getRuntime().availableProcessors()
 So the queue is not too large so I'd say the performance shouldn't get hurt. 

This is only the default.
I'm running w/ 8. Testing has given this the best throughput for me
when processing 25+ million rows...

In the end it is still 25+ million .clone(..) calls. 

 The key isn't the only potential live byte[]. You also have names and
 values in all the columns (and supercolumns) for all the mutations.

Now make that over a billion .clone(..) calls... :-(

byte[] copies are relatively quick and cheap, but still i am seeing
degradation in m/r reduce performance when cloning keys.
It's not that you don't have my vote here, i'm just stating my
uncertainty about what the correct API should be.

~mck




Re: [mapreduce] ColumnFamilyRecordWriter hidden reuse

2011-01-25 Thread Mck

   is d.timestamp = System.currentTimeMillis(); ok?
 
 You are correct that microseconds would be better but for the test it
 doesn't matter that much. 

Have you tried? I'm very new to cassandra as well, and always uncertain
as to what to expect...


 ByteBuffer bbKey = ByteBufferUtil.clone(ByteBuffer.wrap(key.getBytes(), 0, 
 key.getLength())); 

An alternative approach to your client-side cloning is 

  ByteBuffer bbKey = ByteBuffer.wrap(key.toString().getBytes(UTF_8)); 

Here at least it is obvious you are passing in the bytes from an immutable 
object.

As far as moving the clone(..) into ColumnFamilyRecordWriter.write(..),
won't this hurt performance? Normally i would _always_ agree that a
defensive copy of an array/collection argument be stored, but has this
intentionally not been done (or should it be) because of large reduce
jobs (millions of records) and the performance impact there?

The key isn't the only potential live byte[]. You also have names and
values in all the columns (and supercolumns) for all the mutations.


~mck



Should nodetool ring give equal load ?

2011-01-12 Thread mck
I'm using 0.7.0-rc3, 3 nodes, RF=3, and ByteOrderedPartitioner.

When i run nodetool ring it reports

 Address         Status State   Load      Owns    Token
                                                   Token(bytes[ff034355152567a5b2d962b55990e692])
 152.90.242.91   Up     Normal  12.26 GB  33.33%  Token(bytes[01cecd88847283229a3dc88292deff86])
 152.90.242.93   Up     Normal  6.13 GB   33.33%  Token(bytes[d4a4de25c0dad34749e99219e227d896])
 152.90.242.92   Up     Normal  6.13 GB   33.33%  Token(bytes[ff034355152567a5b2d962b55990e692])

why would the first node have double the Load?
is this expected or is something wrong?

The number of files in data_file_directories for the keyspace is roughly
the same. But each Index and Filter file is double the size on the first
node (regardless of the cf they belong to).

cleanup didn't help. compact only took away 2GB. Otherwise there is a lot 
here i don't understand.


~mck

-- 
The turtle only makes progress when it's neck is stuck out Rollo May |
www.semb.wever.org | www.sesat.no | www.finn.no |
http://xss-http-filter.sf.net




Re: Timeout Errors while running Hadoop over Cassandra

2011-01-12 Thread mck
On Wed, 2011-01-12 at 18:40 +, Jairam Chandar wrote:
 Caused by: TimedOutException()

What is the exception in the cassandra logs?

~mck

-- 
Don't use Outlook. Outlook is really just a security hole with a small
e-mail client attached to it. Brian Trosko | www.semb.wever.org |
www.sesat.no | www.finn.no | http://xss-http-filter.sf.net




Re: Should nodetool ring give equal load ?

2011-01-12 Thread mck

 You're using an ordered partitioner and your nodes are evenly spread
 around the ring, but your data probably isn't evenly distributed. 

This load number seems equal to `du -hs data_file_directories`, and
since i've got N == RF shouldn't the data size always be the same on
every node?

~mck

-- 
Traveller, there are no paths. Paths are made by walking. Australian
Aboriginal saying | www.semb.wever.org | www.sesat.no | www.finn.no |
http://xss-http-filter.sf.net




Re: Timeout Errors while running Hadoop over Cassandra

2011-01-12 Thread mck
On Wed, 2011-01-12 at 23:04 +0100, mck wrote:
  Caused by: TimedOutException()
 
 What is the exception in the cassandra logs? 

Or have you tried increasing rpc_timeout_in_ms?

~mck

-- 
When there is no enemy within, the enemies outside can't hurt you.
African proverb | www.semb.wever.org | www.sesat.no | www.finn.no |
http://xss-http-filter.sf.net




Re: Should nodetool ring give equal load ?

2011-01-12 Thread mck
On Wed, 2011-01-12 at 14:21 -0800, Ryan King wrote:
 What consistency level did you use to write the
 data? 

R=1,W=1 (reads happen a long time afterwards).

~mck

-- 
It is now quite lawful for a Catholic woman to avoid pregnancy by a
resort to mathematics, though she is still forbidden to resort to
physics and chemistry. H.L. Mencken | www.semb.wever.org | www.sesat.no
| www.finn.no | http://xss-http-filter.sf.net




Re: Hadoop Integration doesn't work when one node is down

2011-01-02 Thread mck

 Is this a bug or feature or a misuse? 

i can confirm this bug,
on a 3 node cluster testing environment with RF=3.
(And no issue exists for it AFAIK.)

~mck


-- 
Simplicity is the ultimate sophistication Leonardo Da Vinci's (William
of Ockham) 
| www.semb.wever.org | www.sesat.no 
| www.finn.no| http://xss-http-filter.sf.net




Re: nodetool can't jmx authenticate...

2010-12-30 Thread mck
On Thu, 2010-12-30 at 08:03 -0600, Jonathan Ellis wrote:
 We don't have any explicit code for enabling that, no.

https://issues.apache.org/jira/browse/CASSANDRA-1921

the patch was simple (NodeCmd and NodeProbe). just testing it now...

~mck


-- 
I'm not one of those who think Bill Gates is the devil. I simply
suspect that if Microsoft ever met up with the devil, it wouldn't need
an interpreter. Nicholas Petreley | www.semb.wever.org | www.sesat.no |
www.finn.no | http://xss-http-filter.sf.net




Re: (newbie) ColumnFamilyOutputFormat only writes one column (per key)

2010-11-24 Thread Mck

 I then went to write a m/r job that deserialises the thrift objects and
 aggregates the data accordingly into a new column family. But what i've
 found is that ColumnFamilyOutputFormat will only write out one column
 per key.

I've entered a bug for this:
 https://issues.apache.org/jira/browse/CASSANDRA-1774

~mck





(newbie) ColumnFamilyOutputFormat only writes one column (per key)

2010-11-21 Thread mck
(I'm new here so forgive any mistakes or mis-presumptions...)

I've set up a cassandra-0.7.0-beta3 and populated it with
thrift-serialised objects via a scribe server. This seems a great way to
get thrift beans out of the application asap and have them sitting in
cassandra for later processing.

I then went to write a m/r job that deserialises the thrift objects and
aggregates the data accordingly into a new column family. But what i've
found is that ColumnFamilyOutputFormat will only write out one column
per key.

Alex Burkoff also reported this nearly two months ago, but nobody ever
replied...
 http://article.gmane.org/gmane.comp.db.cassandra.user/9325

Has anyone any ideas?
Should it be possible to write multiple columns out?

This is very easy to reproduce. Use the contrib/wordcount example, with
OUTPUT_REDUCER=cassandra and in WordCount.java add at line 132

  results.add(getMutation(key, sum));
 +results.add(getMutation(new Text(doubled), sum*2));

Only the last mutation for any key seems to be written.


~mck

-- 
echo '[q]sa[ln0=aln256%
Pln256/snlbx]sb3135071790101768542287578439snlbxq'|dc 

| www.semb.wever.org | www.sesat.no 
| www.finn.no| http://xss-http-filter.sf.net


