Thrift - CQL

2014-03-26 Thread rubbish me
Hi all,
 
We have been using Cassandra for more than 3 years and now have a cluster in 
production, still running on 1.1.x, that contains dynamic-column column families, 
with Hector as the client. 

We are trying to upgrade to the latest 1.2.x and are considering the DataStax 
driver in order to make use of its round-robin / failover goodness.
 
We bumped into a few walls, however, when converting our Thrift-based client 
code to CQL.  We read through the docs + DataStax dev blog entries like: this 
and this.  However, they mostly focus on reading from an existing dynamic 
cf, running some ALTER TABLE statements, and reading it again.
Very little about how to insert / update.
 
So there comes my questions:
-  Is there any way to do an insert / update at all on a good old wide cf using 
CQL?   Based on what we read back out, we have tried:

INSERT INTO cf_name (key, column1, value) VALUES ('key1', 'columnName1', 'columnValue2')

But we ended up with "Unknown identifier column1".
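
Our understanding from the blog posts above - and this is an assumption on our part, not something we have verified - is that a Thrift dynamic cf with no column metadata is exposed to CQL3 as a COMPACT STORAGE table whose columns default to key / column1 / value, in which case an insert of this shape should be accepted. A minimal sketch of that mapping (all-UTF8 validators; the names are the CQL3 defaults, not anything we chose):

CREATE TABLE cf_name (
  key text,
  column1 text,
  value text,
  PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE;

INSERT INTO cf_name (key, column1, value)
VALUES ('key1', 'columnName1', 'columnValue1');

If that is right, the "Unknown identifier column1" presumably means this cf is not being exposed to CQL3 with those default names, but that is speculation on our part.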
 
-  About reads - one of our cfs is defined with a secondary index, so the 
schema looks something like:
 
create column family cf_with_index
  with column_type = 'Standard'
  and comparator = 'UTF8Type'
  and default_validation_class = 'UTF8Type'
  and key_validation_class = 'UTF8Type'
  and column_metadata = [
{column_name : 'indexed_column',
validation_class : UTF8Type,
index_name : 'column_idx',
index_type : 0}];
 
When reading via the cli, we see all the columns and data as expected:
-------------------
RowKey: rowkey1
=> (name=c1, value=v1, timestamp=xxx, ttl=604800)
=> (name=c2, value=v2, timestamp=xxx, ttl=604800)
=> (name=c3, value=v3, timestamp=xxx, ttl=604800)
=> (name=indexed_column, value=value1, timestamp=xxx, ttl=604800)
-------------------
 
However, when we query via CQL, we only get the indexed column:
SELECT * FROM cf_with_index WHERE key = 'rowkey1';
 
 key     | indexed_column
---------+----------------
 rowkey1 | value1
 
Any way to get the rest?
 
-  Obtaining TTL and writetime on these wide rows  - we tried:
SELECT key, column1, value, writetime(value), ttl(value) FROM cf LIMIT 1;
It works, but it is a bit clumsy.  Is there a better way?
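
If the cf really is exposed with the default key / column1 / value projection mentioned above, the least clumsy variant we have found is to restrict the query to a single cell rather than using LIMIT; a sketch with illustrative names, assuming that projection:

SELECT column1, value, writetime(value), ttl(value)
FROM cf
WHERE key = 'key1' AND column1 = 'columnName1';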
 
-  We can live with Thrift.  Is there any way / plan to let us execute 
Thrift calls through the DataStax driver?  Hector does not seem to be active anymore.
 
Many thanks in advance,
 
A



Re: Commit log periodic sync?

2012-08-28 Thread rubbish me
Thanks again Aaron.

 In that case I would not expect to see data loss. If you are still in a test 
 scenario can you try to reproduce the problem? If possible, can you reproduce 
 it with a single node?

We will try that later this week. 


We did the same exercise this week; this time we did a flush and a snapshot 
before the DR actually happened, as an attempt to identify whether the commit log 
fsync was the problem. 
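
The flush and snapshot were nothing special - roughly the standard nodetool invocations below (host and keyspace name are placeholders):

nodetool -h localhost flush our_keyspace
nodetool -h localhost snapshot our_keyspace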

We can clearly see sstables were created for the flush command. 
And those sstables were loaded in when the nodes started up again after the DR 
exercise. 

At this point we believed all the nodes had all the data, so we let them serve 
client requests while we ran repair on the nodes. 

Data created before the last flush was still missing, according to the client 
that talked to DC1 (the disaster DC). 

We had a look at the log of one of the DC1 nodes. The suspicious thing was that the 
latest sstable was being compacted during the streaming sessions of the repair, but 
no error was reported. 

Here come my questions:
- If an sstable that was about to be streamed out was being compacted during the 
streaming session, would we see an error in the log?
- Could this lead to data not being found?
- Is it safe to let a node serve read/write requests while repair is running?

Many thanks again. 

- A




aaron morton aa...@thelastpickle.com wrote on 27 Aug 2012 at 09:08:

 Brutally. kill -9.
 That's fine. I was thinking of reboot -f -n.
 
 We are wondering if the fsync of the commit log was working.
 I would say yes, only because there are no other reported problems. 
 
 In that case I would not expect to see data loss. If you are still in a test 
 scenario can you try to reproduce the problem? If possible, can you reproduce 
 it with a single node?
 
 Cheers
 
 
 -
 Aaron Morton
 Freelance Developer
 @aaronmorton
 http://www.thelastpickle.com
 
 On 25/08/2012, at 11:00 AM, rubbish me rubbish...@googlemail.com wrote:
 
 Thanks, Aaron, for your reply - please see inline.
 
 
 On 24 Aug 2012, at 11:04, aaron morton wrote:
 
 - we are running on production linux VMs (not ideal but this is out of our 
 hands)
 Is the VM doing anything wacky with the IO ?
 
 Could be.  But we thought we would ask here first.  This is a bit difficult 
 to prove because we don't have control over these VMs.
 
  
 
 As part of a DR exercise, we killed all 6 nodes in DC1,
 Nice disaster. Out of interest, what was the shutdown process ?
 
 Brutally. kill -9.
 
 
 
 We noticed that data that was written an hour before the exercise, around 
 the last memtables being flushed, was not found in DC1. 
 To confirm, data was written to DC 1 at CL LOCAL_QUORUM before the DR 
 exercise. 
 
 Was the missing data written before or after the memtable flush ? I'm 
 trying to understand if the data should have been in the commit log or the 
 memtables. 
 
 The missing data was written after the last flush.  This data was 
 retrievable before the DR exercise.
 
 
 Can you provide some more info on how you are detecting it is not found in 
 DC 1?
 
 
 We tried Hector with consistency level LOCAL_QUORUM.  We saw missing columns, or 
 sometimes the whole row was missing.  
 
 We tried cassandra-cli on the DC1 nodes - same result.
 
 However, once we ran the same query on DC2, C* must have done a 
 read-repair: that particular piece of data would then appear in DC1 again.
 
 
 If we understand correctly, commit logs are written first and then synced 
 to disk every 10s. 
 Writes are put into a bounded queue and processed as fast as the IO can 
 keep up. Every 10s a sync message is added to the queue. Note that the 
 commit log segment may rotate at any time, which requires a sync. 
 
 A loss of data across all nodes in a DC seems odd. If you can provide some 
 more information we may be able to help. 
 
 
 We are wondering if the fsync of the commit log was working.  But we saw no 
 errors / warnings in the logs.  We are wondering if there is a way to verify this.
 
 
 
 Cheers
 
 -
 Aaron Morton
 Freelance Developer
 @aaronmorton
 http://www.thelastpickle.com
 
 On 24/08/2012, at 6:01 AM, rubbish me rubbish...@googlemail.com wrote:
 
 Hi all
 
 First off, let's introduce the setup. 
 
 - 6 x C* 1.1.2 in active DC (DC1), another 6 in another (DC2)
 - keyspace's RF=3 in each DC
 - Hector as client.
 - client talks only to DC1 unless DC1 can't serve the request, in which 
 case it talks only to DC2
 - commit log was synced periodically with the default setting of 10s. 
 - consistency policy = LOCAL QUORUM for both read and write. 
 - we are running on production linux VMs (not ideal but this is out of our 
 hands)
 -
 As part of a DR exercise, we killed all 6 nodes in DC1; Hector started 
 talking to DC2, all the data was still there, and everything continued to work 
 perfectly. 
 
 Then we brought all the nodes in DC1 up, one by one. We saw a message saying 
 all the commit logs were replayed. No errors reported.  We didn't run 
 repair at this time. 
 
 We noticed that data

commit log to disk with periodic mode

2012-08-23 Thread rubbish me
Hi all

First off, please let me introduce the setup.


- a balanced ring of 6 x C* 1.1.2 in the active DC (DC1), 6 in another (DC2); 
- keyspace's RF=3 in each DC;
- client talks only to DC1 unless DC1 can't serve the request, in which case 
talks only to DC2;
- commit log is synced periodically with the default setting of 10s.
- consistency policy = LOCAL QUORUM for both read and write.
- we are running on production linux VMs (not ideal but this is out of our 
hands)
-

As part of a DR exercise, we brutally killed all 6 nodes in DC1, client started 
talking to DC2. All data survived, everything continued to work perfectly.

Then we brought all the nodes in DC1 up, one by one. On each we saw a message saying 
the commit logs were all replayed. No errors were reported.  We didn't run repair at 
this time.

However, DC1 lost data that was written an hour before the DR exercise.  It 
seemed everything after the last memtable-flush was gone.

If we understand correctly, commit logs are written first and then synced 
to disk every 10s. At worst we would have lost the last 10s of data. 
But it seemed as if the periodic sync didn't happen.  What could be the cause of 
this behaviour?
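
For reference, we have not changed anything in cassandra.yaml around this; the nodes are on the stock periodic settings, which as far as we know are:

commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000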

With the blessing of C* we could recover all this data from DC2. But we 
would like to understand the possible cause.

Many thanks in advance.

- A

Re: BulkLoading sstables from v1.0.3 to v1.1.1

2012-07-10 Thread rubbish me
Thanks Ivo. 

We are quite close to releasing, so we'd hope to understand what is causing the 
error and try to avoid it where possible. As said, it seems to work OK the 
first time round. 

The problem you were referring to in your last mail - was it restricted to bulk 
loading, or not?

Thanks

-A

Ivo Meißner i...@overtronic.com wrote on 10 Jul 2012 at 07:20:

 Hi,
 
 there are some problems in version 1.1.1 with secondary indexes and key 
 caches that are fixed in 1.1.2. 
 I would try to upgrade to 1.1.2 and see if the error still occurs. 
 
 Ivo
 
 
 
 
 
 Hi 
 
 As part of the continuous development of a system migration, we have a test 
 build that takes a snapshot of a keyspace from Cassandra v1.0.3 and bulk loads 
 it into a cluster running 1.1.1 using the sstableloader.sh.  Not sure if relevant, 
 but one of the cfs contains a secondary index. 
 
 The build basically does: 
 Drop the destination keyspace if it exists 
 Add the destination keyspace and wait for schema agreement 
 Run sstableloader (rough command below) 
 Do some validation of the streamed data 
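 
 The loader step boils down to something like the command below (host and path are placeholders, and the exact flags / directory layout expected vary between versions, so treat this as a rough sketch):
 
 sstableloader -d 10.0.0.1 /path/to/snapshot/MyKeyspace/MyColumnFamily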
 
 The keyspace / column family schemas are basically the same, apart from the fact 
 that on the v1.1.1 cluster we have compression and the key cache switched on 
 (roughly as in the cli statement below). 
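 
 For completeness, switching those on in 1.1 is roughly the following cassandra-cli statement (the cf name, compressor and chunk size here are illustrative, not necessarily exactly what we use):
 
 update column family our_cf
   with compression_options = {sstable_compression: SnappyCompressor, chunk_length_kb: 64}
   and caching = 'keys_only';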
 
 On a clean cluster (empty data, commit log and saved-cache dirs) the sstables 
 loaded beautifully. 
 
 But subsequent builds failed with: 
 -- 
 [21:02:02][exec] progress: [snip ip_addresses]... [total: 0 - 0MB/s (avg: 0MB/s)]
 ERROR 21:02:02,811 Error in ThreadPoolExecutor
 java.lang.RuntimeException: java.net.SocketException: Connection reset