Re: about collections limit

2014-11-14 Thread DuyHai Doan
No, collections and maps are fetched entirely server-side and returned to
the client as a whole. Updates & deletes can be done on individual elements,
though.
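For instance, a quick sketch with the DataStax Java driver 2.x (hypothetical
"users" table with a set<text> column named "tags"; names are just for
illustration):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import java.util.Set;
import java.util.UUID;

public class SetElements {
    public static void main(String[] args) {
        // Assumed schema: CREATE TABLE users (id uuid PRIMARY KEY, tags set<text>);
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("mykeyspace");
        UUID userId = UUID.randomUUID();
        // Element-level writes work fine...
        session.execute("UPDATE users SET tags = tags + {'new'} WHERE id = ?", userId);
        session.execute("UPDATE users SET tags = tags - {'old'} WHERE id = ?", userId);
        // ...but a read always returns the whole collection; there is no
        // LIMIT or paging inside a single collection
        Row row = session.execute("SELECT tags FROM users WHERE id = ?", userId).one();
        Set<String> tags = row.getSet("tags", String.class);
        System.out.println(tags);
        cluster.close();
    }
}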


On Fri, Nov 14, 2014 at 7:47 AM, diwayou  wrote:

> Can I scan a collection (list, set) paged by LIMIT?
>


Re: Question about node repair

2014-11-14 Thread DuyHai Doan
By checking the source code:

StorageService:
public void forceTerminateAllRepairSessions() {
    // Delegates to this node's local ActiveRepairService singleton
    ActiveRepairService.instance.terminateSessions();
}


ActiveRepairService:
public void terminateSessions()
{
    // Force-shuts every repair session this node is coordinating
    for (RepairSession session : sessions.values())
    {
        session.forceShutdown();
    }
    parentRepairSessions.clear();
}

RepairSession:
public void forceShutdown()
{
    // shutdownNow() interrupts the running repair tasks; it does not wait
    taskExecutor.shutdownNow();
    // Wake any threads blocked waiting for this session to complete
    differencingDone.signalAll();
    completed.signalAll();
}
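So, reading this, the call looks node-local: it goes through that node's
ActiveRepairService and only shuts down the sessions that node is
coordinating. It also looks asynchronous in the sense that shutdownNow()
merely interrupts the running tasks; it does not wait for them to finish.
A rough sketch of invoking it remotely (assuming the standard
org.apache.cassandra.db:type=StorageService MBean name and the default JMX
port 7199):

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class TerminateRepairs {
    public static void main(String[] args) throws Exception {
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://127.0.0.1:7199/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            ObjectName ss = new ObjectName("org.apache.cassandra.db:type=StorageService");
            // Terminates only the repair sessions coordinated by this one node
            mbs.invoke(ss, "forceTerminateAllRepairSessions",
                       new Object[0], new String[0]);
        }
    }
}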

On Fri, Nov 14, 2014 at 4:01 AM, Di, Jieming  wrote:

>  Hi There,
>
>
>
> I have a question about Cassandra node repair. There is a function called
> “forceTerminateAllRepairSessions();”. Will the function terminate all the
> repair sessions on only one node, or all the sessions in the ring? And when
> it terminates the repair sessions, does it terminate them immediately, or
> does it just send a terminate signal and do the real termination later?
> Thanks a lot.
>
>
>
> Regards,
>
> -Jieming-
>
>
>


Re: Cassandra communication between 2 datacenter

2014-11-14 Thread Adil
Thank you Eric, the problem in fact was that the ports were open in only
one direction.
Now it is working.

2014-11-13 22:38 GMT+01:00 Eric Plowe :

> Are you sure that both DC's can communicate with each other over the
> necessary ports?
>
> On Thu, Nov 13, 2014 at 3:46 PM, Adil  wrote:
>
> >> Yes, we started the nodes one at a time. My doubt is whether we should
> >> also configure cassandra-topology.properties or not? We left it with default values.
>>
>> 2014-11-13 21:05 GMT+01:00 Robert Coli :
>>
>>> On Thu, Nov 13, 2014 at 10:26 AM, Adil  wrote:
>>>
 Hi,
 we have two datacenters with this info:

 Cassandra version 2.1.0
 DC1 with 5 nodes
 DC2 with 5 nodes

 we set the snitch to GossipingPropertyFileSnitch and in
 cassandra-rackdc.properties we put:
 in DC1:
 dc=DC1
 rack=RAC1

 in DC2:
 dc=DC2
 rack=RAC1

 and in every node's cassandra.yaml we define two seeds of DC1 and two
 seeds of DC2.

>>>
>>> Do you start the nodes one at a time, and then consult nodetool ring
>>> (etc.) to see if the cluster coalesces in the way you expect?
>>>
>>> If so, a Keyspace created in one should very quickly be created in the
>>> other.
>>>
>>> =Rob
>>> http://twitter.com/rcolidba
>>>
>>
>>
>


Cassandra default consistency level on multi datacenter

2014-11-14 Thread Adil
Hi,
We are using two datacenters and we want to set the default consistency
level to LOCAL_ONE instead of ONE, but we don't know how to configure it.
We set LOCAL_QUORUM via the CQL driver for the queries that need it, but we
don't want to do the same for the default one.

Thanks in advance

Adil


Which JMX item can I use to see total cluster (or data center) Read and Write volumes?

2014-11-14 Thread Bob Nilsen
Hi all,

Within DataStax OpsCenter I can see metrics that show total traffic volume
for a cluster and each data center.

How can I find these same numbers amongst all the JMX items?

Thanks,

-- 
Bob Nilsen
rwnils...@gmail.com


Repair completes successfully but data is still inconsistent

2014-11-14 Thread André Cruz
Hello.

So, I have detected a data inconsistency between my nodes:

(Consistency level is ONE)
[disco@Disco] get 
NamespaceFile2['45fc8996-41bc-429b-a382-5da9294eb59c:/XXXDIRXXX']['XXXFILEXXX'];
Value was not found
Elapsed time: 48 msec(s).
[disco@Disco] get 
NamespaceFile2['45fc8996-41bc-429b-a382-5da9294eb59c:/XXXDIRXXX']['XXXFILEXXX'];
=> (name=XXXFILEXXX, value=a96a7f54-49d4-11e4-8185-e0db55018fa4, 
timestamp=1412213845215797)
Elapsed time: 7.45 msec(s).
[disco@Disco]

I have a replication factor of 3, and if I read with a consistency level of
QUORUM, the result converges to the column being present. The strange thing is
that I deleted it on 2014-11-01 03:39:25 and my writes use QUORUM. My
gc_grace_period is 10 days; if I had not been running repairs during this
period this could be explained, but in fact repairs have been running every
day with no sign of problems. First, some data:

Cassandra version: 1.2.19 (we upgraded from 1.2.16 on 2014-10-22)
NamespaceFile2 compaction strategy: LeveledCompactionStrategy
CFStats:
Read Count: 3376397
Read Latency: 0.24978254956392865 ms.
Write Count: 21254817
Write Latency: 0.04713691710448507 ms.
Pending Tasks: 0
Column Family: NamespaceFile2
SSTable count: 28
SSTables in each level: [1, 10, 17, 0, 0, 0, 0, 0, 0]
Space used (live): 4961751613
Space used (total): 5021967044
SSTable Compression Ratio: 0.5773464014899713
Number of Keys (estimate): 9369856
Memtable Columns Count: 82901
Memtable Data Size: 19352636
Memtable Switch Count: 283
Read Count: 3376397
Read Latency: NaN ms.
Write Count: 21254817
Write Latency: 0.049 ms.
Pending Tasks: 0
Bloom Filter False Positives: 21904
Bloom Filter False Ratio: 0.0
Bloom Filter Space Used: 7405728
Compacted row minimum size: 61
Compacted row maximum size: 74975550
Compacted row mean size: 795
Average live cells per slice (last five minutes): 1025.0
Average tombstones per slice (last five minutes): 0.0

This particular row has 1.1M columns, 70934390 bytes, and the offending key 
encoded by the partitioner is 
001045fc899641bc429ba3825da9294eb59c072f646f7267656d00.

First I checked which nodes were responsible for this particular row:

$ nodetool -h XXX.XXX.XXX.112 getendpoints Disco NamespaceFile2 
'45fc8996-41bc-429b-a382-5da9294eb59c:/XXXDIRXXX'
XXX.XXX.XXX.17
XXX.XXX.XXX.14
XXX.XXX.XXX.18


This is the repair log of these particular nodes:

--XXX.XXX.XXX.14--
INFO [AntiEntropySessions:366] 2014-11-14 04:34:55,740 AntiEntropyService.java 
(line 651) [repair #956244b0-6bb7-11e4-8eec-f5962c02982e] new session: will 
sync /XXX.XXX.XXX.14, /XXX.XXX.XXX.12, /XXX.XXX.XXX.18 on range 
(14178431955039102644307275309657008810,28356863910078205288614550619314017621] 
for Disco.[NamespaceFile2]
...
INFO [AntiEntropyStage:1] 2014-11-14 04:36:51,125 AntiEntropyService.java (line 
764) [repair #956244b0-6bb7-11e4-8eec-f5962c02982e] NamespaceFile2 is fully 
synced
INFO [AntiEntropySessions:366] 2014-11-14 04:36:51,126 AntiEntropyService.java 
(line 698) [repair #956244b0-6bb7-11e4-8eec-f5962c02982e] session completed 
successfully
...
INFO [AntiEntropySessions:367] 2014-11-14 04:36:51,103 AntiEntropyService.java 
(line 651) [repair #da2543e0-6bb7-11e4-8eec-f5962c02982e] new session: will 
sync /XXX.XXX.XXX.14, /XXX.XXX.XXX.11, /XXX.XXX.XXX.17 on range 
(155962751505430129087380028406227096910,0] for Disco.[NamespaceFile2]
...
INFO [AntiEntropyStage:1] 2014-11-14 04:38:20,949 AntiEntropyService.java (line 
764) [repair #da2543e0-6bb7-11e4-8eec-f5962c02982e] NamespaceFile2 is fully 
synced
INFO [AntiEntropySessions:367] 2014-11-14 04:38:20,949 AntiEntropyService.java 
(line 698) [repair #da2543e0-6bb7-11e4-8eec-f5962c02982e] session completed 
successfully
...
INFO [AntiEntropySessions:368] 2014-11-14 04:38:20,930 AntiEntropyService.java 
(line 651) [repair #0fafc710-6bb8-11e4-8eec-f5962c02982e] new session: will 
sync /XXX.XXX.XXX.14, /XXX.XXX.XXX.17, /XXX.XXX.XXX.18 on range 
(0,14178431955039102644307275309657008810] for Disco.[NamespaceFile2]
...
INFO [AntiEntropyStage:1] 2014-11-14 04:40:34,237 AntiEntropyService.java (line 
764) [repair #0fafc710-6bb8-11e4-8eec-f5962c02982e] NamespaceFile2 is fully 
synced
INFO [AntiEntropySessions:368] 2014-11-14 04:40:34,237 AntiEntropyService.java 
(line 698) [repair #0fafc710-6bb8-11e4-8eec-f5962c02982e] session completed 
successfully


--XXX.XXX.XXX.17--
INFO [AntiEntropySessions:362] 2014-11-13 04:33:01,974 AntiEntropyService.java 
(line 651) [repair #27293450-6aee-11e4-aabc-9b69569c95c3] new session: will 
sync /XXX.XXX.XXX.17, 

Re: Repair completes successfully but data is still inconsistent

2014-11-14 Thread André Cruz
Some extra info. I checked the backups and on the 8th of November, all 3 
replicas had the tombstone of the deleted column. So:

1 November - column is deleted - gc_grace_period is 10 days
8 November - all 3 replicas have tombstone
13/14 November - column/tombstone is gone on 2 replicas, 3rd replica has the 
original value (!), with the original timestamp…

Is there a logical explanation for this behaviour?

Thank you,
André Cruz

Re: Repair completes successfully but data is still inconsistent

2014-11-14 Thread Michael Shuler

On 11/14/2014 12:12 PM, André Cruz wrote:

Some extra info. I checked the backups and on the 8th of November, all 3 
replicas had the tombstone of the deleted column. So:

1 November - column is deleted - gc_grace_period is 10 days
8 November - all 3 replicas have tombstone
13/14 November - column/tombstone is gone on 2 replicas, 3rd replica has the 
original value (!), with the original timestamp…


After seeing your first post, this is helpful info. I'm curious what the 
logs show between the 8th-13th, particularly around the 10th-11th :)



Is there a logical explanation for this behaviour?


Not yet!

--
Kind regards,
Michael


Re: Repair completes successfully but data is still inconsistent

2014-11-14 Thread André Cruz
On 14 Nov 2014, at 18:29, Michael Shuler  wrote:
> 
> On 11/14/2014 12:12 PM, André Cruz wrote:
>> Some extra info. I checked the backups and on the 8th of November, all 3 
>> replicas had the tombstone of the deleted column. So:
>> 
>> 1 November - column is deleted - gc_grace_period is 10 days
>> 8 November - all 3 replicas have tombstone
>> 13/14 November - column/tombstone is gone on 2 replicas, 3rd replica has the 
>> original value (!), with the original timestamp…
> 
> After seeing your first post, this is helpful info. I'm curious what the logs 
> show between the 8th-13th, particularly around the 10th-11th :)

Which logs in particular, just the ones from the 3rd machine which has the 
zombie column? What should I be looking for? :)

Thanks for the help,
André

Re: Question about node repair

2014-11-14 Thread Robert Coli
On Thu, Nov 13, 2014 at 7:01 PM, Di, Jieming  wrote:

> I have a question about Cassandra node repair. There is a function called
> “forceTerminateAllRepairSessions();”. Will the function terminate all the
> repair sessions on only one node, or all the sessions in the ring? And when
> it terminates the repair sessions, does it terminate them immediately, or
> does it just send a terminate signal and do the real termination later?
> Thanks a lot.
>

https://issues.apache.org/jira/browse/CASSANDRA-3486

=Rob


Reading the write time of each value in a set?

2014-11-14 Thread Kevin Burton
I’m trying to build a histograph in CQL for various records. I’d like to
keep a max of ten items, or items with a TTL. But if there are too many
items, I’d like to trim it so the max number of records is about 20.

So if I exceed 20, I need to remove the oldest records.

I’m using a set append so each member of the set has a different write time
and TTL.

But I can’t figure out how to read the writetime() of each set member,
since CQL's writetime() only takes a column reference.

Any advice?  Seems like I’m an edge case.

Plan B is to upgrade everything to 2.1 and I can use custom datatypes and
just store the write times myself, but that takes a while.

-- 

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile




Re: Working with legacy data via CQL

2014-11-14 Thread Tyler Hobbs
What version of Cassandra did you originally create the column family in?
Have you made any schema changes to it through CQL or cassandra-cli, or has
it always been exactly the same?

On Wed, Nov 12, 2014 at 2:06 AM, Erik Forsberg  wrote:

> On 2014-11-11 19:40, Alex Popescu wrote:
> > On Tuesday, November 11, 2014, Erik Forsberg wrote:
> >
> >
> > You'll have a better chance of getting an answer about the Python driver
> > on its own mailing list:
> https://groups.google.com/a/lists.datastax.com/forum/#!forum/python-driver-user
>
> As I said, this also happens when using cqlsh:
>
> cqlsh:test> SELECT column1,value from "Users" where key =
> a6b07340-047c-4d4c-9a02-1b59eabf611c and column1 = 'date_created';
>
>  column1  | value
> --+--
>  date_created | '\x00\x00\x00\x00Ta\xf3\xe0'
>
> (1 rows)
>
> Failed to decode value '\x00\x00\x00\x00Ta\xf3\xe0' (for column 'value')
> as text: 'utf8' codec can't decode byte 0xf3 in position 6: unexpected
> end of data
>
> So let me rephrase: How do I work with data where the table has metadata
> that makes some columns differ from the main validation class? From
> cqlsh, or the python driver, or any driver?
>
> Thanks,
> \EF
>



-- 
Tyler Hobbs
DataStax 


Re: Which JMX item can I use to see total cluster (or data center) Read and Write volumes?

2014-11-14 Thread Tyler Hobbs
OpsCenter aggregates the per-node metrics across the whole datacenter (or
cluster).  The individual metrics are in
org.apache.cassandra.metrics.ClientRequest.Read.Latency.count and
Write.Latency.count.
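For example, a rough sketch of reading those counters over JMX from Java
(assuming the metrics MBean names Cassandra exposes from 1.2 onward and the
default JMX port 7199); the attribute is cumulative since node start, so you
would poll each node yourself and sum (or diff between polls) to get
cluster-wide volume:

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class RequestCounts {
    public static void main(String[] args) throws Exception {
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://127.0.0.1:7199/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            ObjectName reads = new ObjectName(
                    "org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Latency");
            ObjectName writes = new ObjectName(
                    "org.apache.cassandra.metrics:type=ClientRequest,scope=Write,name=Latency");
            // "Count" is the total number of requests this node has coordinated
            long readCount = (Long) mbs.getAttribute(reads, "Count");
            long writeCount = (Long) mbs.getAttribute(writes, "Count");
            System.out.println("reads=" + readCount + " writes=" + writeCount);
        }
    }
}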

On Fri, Nov 14, 2014 at 10:04 AM, Bob Nilsen  wrote:

> Hi all,
>
> Within DataStax OpsCenter I can see metrics that show total traffic volume
> for a cluster and each data center.
>
> How can I find these same numbers amongst all the JMX items?
>
> Thanks,
>
> --
> Bob Nilsen
> rwnils...@gmail.com
>



-- 
Tyler Hobbs
DataStax 


Re: Cassandra default consistency level on multi datacenter

2014-11-14 Thread Tyler Hobbs
Cassandra itself does not have default consistency levels.  These are only
configured in the driver.
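For example, with the DataStax Java driver 2.x the default is set on the
Cluster via QueryOptions (a sketch; other drivers and versions expose this
differently):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.QueryOptions;
import com.datastax.driver.core.Session;

public class LocalOneDefault {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder()
                .addContactPoint("127.0.0.1")
                // Any statement that does not set its own consistency level
                // will now use LOCAL_ONE instead of the driver default (ONE)
                .withQueryOptions(new QueryOptions()
                        .setConsistencyLevel(ConsistencyLevel.LOCAL_ONE))
                .build();
        Session session = cluster.connect();
        session.execute("SELECT release_version FROM system.local");
        // Per-statement overrides (e.g. LOCAL_QUORUM) still take precedence
        cluster.close();
    }
}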

On Fri, Nov 14, 2014 at 8:54 AM, Adil  wrote:

> Hi,
> We are using two datacenters and we want to set the default consistency
> level to LOCAL_ONE instead of ONE, but we don't know how to configure it.
> We set LOCAL_QUORUM via the CQL driver for the queries that need it, but we
> don't want to do the same for the default one.
>
> Thanks in advance
>
> Adil
>



-- 
Tyler Hobbs
DataStax 


RE: Question about node repair

2014-11-14 Thread Di, Jieming
Thanks Rob.

From: Robert Coli [mailto:rc...@eventbrite.com]
Sent: November 15, 2014 2:50
To: user@cassandra.apache.org
Subject: Re: Question about node repair



On Thu, Nov 13, 2014 at 7:01 PM, Di, Jieming <jieming...@emc.com> wrote:
I have a question about Cassandra node repair. There is a function called
“forceTerminateAllRepairSessions();”. Will the function terminate all the
repair sessions on only one node, or all the sessions in the ring? And when
it terminates the repair sessions, does it terminate them immediately, or
does it just send a terminate signal and do the real termination later?
Thanks a lot.

https://issues.apache.org/jira/browse/CASSANDRA-3486

=Rob



RE: Question about node repair

2014-11-14 Thread Di, Jieming
Thanks DuyHai.

From: DuyHai Doan [mailto:doanduy...@gmail.com]
Sent: November 14, 2014 21:55
To: user@cassandra.apache.org
Subject: Re: Question about node repair

By checking the source code:

StorageService:
public void forceTerminateAllRepairSessions() {
ActiveRepairService.instance.terminateSessions();
}


ActiveRepairService:
public void terminateSessions()
{
for (RepairSession session : sessions.values())
{
session.forceShutdown();
}
parentRepairSessions.clear();
}

RepairSession:
public void forceShutdown()
{
taskExecutor.shutdownNow();
differencingDone.signalAll();
completed.signalAll();
}

On Fri, Nov 14, 2014 at 4:01 AM, Di, Jieming <jieming...@emc.com> wrote:
Hi There,

I have a question about Cassandra node repair. There is a function called
“forceTerminateAllRepairSessions();”. Will the function terminate all the
repair sessions on only one node, or all the sessions in the ring? And when
it terminates the repair sessions, does it terminate them immediately, or
does it just send a terminate signal and do the real termination later?
Thanks a lot.

Regards,
-Jieming-




Re: Reading the write time of each value in a set?

2014-11-14 Thread DuyHai Doan
Why don't you use a map, storing the write time as the value and the data as
the key?
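A minimal sketch of that idea with the DataStax Java driver (hypothetical
"history" table and column names; assumes Cassandra 2.0+ for bind markers on
map elements):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import java.util.Map;
import java.util.UUID;

public class WriteTimeMap {
    public static void main(String[] args) {
        // Assumed schema:
        // CREATE TABLE history (id uuid PRIMARY KEY, items map<text, bigint>);
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("mykeyspace");
        UUID id = UUID.randomUUID();
        // Store the write time yourself (microseconds, like writetime()) as the
        // map value, keyed by the item; a TTL can still expire old entries
        session.execute(
                "UPDATE history USING TTL 86400 SET items[?] = ? WHERE id = ?",
                "some-item", System.currentTimeMillis() * 1000L, id);
        // Read the whole map back and trim the oldest entries client-side
        Row row = session.execute("SELECT items FROM history WHERE id = ?", id).one();
        Map<String, Long> items = row.getMap("items", String.class, Long.class);
        System.out.println(items);
        cluster.close();
    }
}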
On 15 Nov 2014 00:24, "Kevin Burton" wrote:

> I’m trying to build a histograph in CQL for various records. I’d like to
> keep a max of ten items, or items with a TTL. But if there are too many
> items, I’d like to trim it so the max number of records is about 20.
>
> So if I exceed 20, I need to remove the oldest records.
>
> I’m using a set append so each member of the set has a different write
> time and TTL.
>
> But I can’t figure out how to read the writetime() of each set member,
> since CQL's writetime() only takes a column reference.
>
> Any advice?  Seems like I’m an edge case.
>
> Plan B is to upgrade everything to 2.1 and I can use custom datatypes and
> just store the write times myself, but that takes a while.
>
> --
>
> Founder/CEO Spinn3r.com
> Location: *San Francisco, CA*
> blog: http://burtonator.wordpress.com
> … or check out my Google+ profile
> 
> 
>
>