Re: Out of memory on wide row read

2015-05-19 Thread Antoine Blanchet
The issue has been closed by Jonathan Ellis. The limit is unnecessary in CQL
because of the automatic paging feature, which is cool. But this feature will
not be added to the Thrift API. Subject closed :).
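For anyone hitting the same OOM from CQL, the automatic paging mentioned above is what keeps client memory bounded. Here is a pure-Python sketch of the idea (a simulation, not the real driver; with the DataStax Python driver the equivalent knob is roughly the statement's fetch_size):

```python
# Sketch: paging bounds memory when reading a wide row, because at most
# one page of cells is materialized at a time instead of the whole row.

def read_wide_row(cells, fetch_size=5000):
    """Yield pages of at most fetch_size cells (simulating CQL paging)."""
    for start in range(0, len(cells), fetch_size):
        yield cells[start:start + fetch_size]

wide_row = list(range(100_000))          # stand-in for a wide partition
total, max_page = 0, 0
for page in read_wide_row(wide_row):
    total += len(page)
    max_page = max(max_page, len(page))  # peak memory is one page

print(total, max_page)  # 100000 5000
```

The point is that peak memory is proportional to the page size, not the partition size, which is exactly why a server-side LIMIT became unnecessary.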

On Mon, May 18, 2015 at 6:05 PM, Antoine Blanchet <
a.blanc...@abc-arbitrage.com> wrote:

> Done, https://issues.apache.org/jira/browse/CASSANDRA-9413 . Feel free to
> improve the description, I've only copy/pasted the first message from Kévin.
>
> Thanks.
>
> On Fri, May 15, 2015 at 9:56 PM, Alprema  wrote:
>
>> I will file a jira for that, thanks
>> On May 12, 2015 10:15 PM, "Jack Krupansky" 
>> wrote:
>>
>>> Sounds like it's worth a Jira - Cassandra should protect itself from
>>> innocent mistakes or excessive requests from clients. Maybe there should be
>>> a timeout or result size (bytes in addition to count) limit. Something.
>>> Anything. But OOM seems a tad unfriendly for an innocent mistake. In this
>>> particular case, maybe Cassandra could detect the total row size/slice
>>> being read and error out on a configurable limit.
>>>
>>> -- Jack Krupansky
>>>
>>> On Tue, May 12, 2015 at 1:57 PM, Robert Coli 
>>> wrote:
>>>
 On Tue, May 12, 2015 at 8:43 AM, Kévin LOVATO 
 wrote:

> My question is the following: Is it possible to prevent Cassandra from
> OOM'ing when a client does this kind of requests? I'd rather have an error
> thrown to the client than a multi-server crash.
>

 You can provide a default LIMIT clause, but this is based on number of
 results and not size.

 Other than that, there are not really great options.

 =Rob


>>>
>>>
>
>
> --
> Antoine Blanchet
> ABC Arbitrage Asset Management
> http://www.abc-arbitrage.com/
>



-- 
Antoine Blanchet
ABC Arbitrage Asset Management
http://www.abc-arbitrage.com/

-- 

*ABC arbitrage, partenaire officiel du skipper Jean-Pierre Dick // ABC 
arbitrage, official partner of skipper Jean-Pierre Dick // www.jpdick.com 
*
Please consider your environmental responsibility before printing this email
*
Ce message peut contenir des informations confidentielles. Les idées et 
opinions presentées dans ce message sont celles de son auteur, et ne 
représentent pas nécessairement celles du groupe ABC arbitrage.
Au cas où il ne vous serait pas destiné,merci d'en aviser l'expéditeur 
immédiatement et de le supprimer.

This message may contain confidential information. Any views or opinions 
presented are solely those of its author and do not necessarily represent 
those of ABC arbitrage. 
If you are not the intended recipient, please notify the sender immediately 
and delete it.
*



Re: Batch isolation within a single partition

2015-05-19 Thread DuyHai Doan
If RF > 1, a consistency level of QUORUM cannot guarantee strict
isolation (for a normal mutation or a batch). If you look at this slide:
http://www.slideshare.net/doanduyhai/cassandra-introduction-apache-con-2014-budapest/25,
you can see that the mutation is sent by the coordinator, in parallel, to
all replicas.

Now it is very possible that, due to network latency, the mutation is
applied on the first replica but only applied with some delay (which can
be on the order of microseconds) on the other replicas.

Theoretically, one client can read the updated value on the first replica
while another reads the old value on the other replicas, even at QUORUM.

I think that reading at ALL may guarantee "read isolation" in practice,
but I'm not sure it can be considered "isolated" under the theoretical
definition of isolation.
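For the case where the QUORUM write has already completed, the usual quorum-overlap argument applies; a small sketch (a simulation of the replica sets, nothing Cassandra-specific):

```python
# Sketch: with RF=3, any write quorum (2 replicas) and any read quorum
# (2 replicas) must share at least one node, since 2 + 2 > 3. A read
# that starts after a write has completed therefore contacts at least
# one replica that already holds the new value.
from itertools import combinations

replicas = ("A", "B", "C")  # RF = 3
quorum = 2                  # QUORUM for RF = 3

overlaps = all(
    set(w) & set(r)
    for w in combinations(replicas, quorum)
    for r in combinations(replicas, quorum)
)
print(overlaps)  # True
```

Note this only covers reads that start after the write completes; the in-flight case described above is a separate question.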


On Tue, May 19, 2015 at 7:57 AM, Martin Krasser 
wrote:

>  Hi DuyHai,
>
> thanks for your answer. What if I set RF > 1 and the consistency level for
> reads and writes to QUORUM? Would that isolate the single-partition batch
> update from reads? (I do not consider node failures here between the write
> and the read(s)).
>
>
> On 19.05.15 07:50, DuyHai Doan wrote:
>
> Hello Martin
>
>  If, and only if, you have RF=1, single-partition mutations (including
> batches) are isolated.
>
>  Otherwise, with RF > 1, even a simple UPDATE is not isolated, because one
> client can read the updated value on one replica while another client reads
> the old value on another replica.
>
>
>
> On Mon, May 18, 2015 at 12:32 PM, Martin Krasser 
> wrote:
>
>>  Hello,
>>
>> I have an application that inserts multiple rows within a single
>> partition (= all rows share the same partition key) using a BATCH
>> statement. Is it possible that other clients can partially read that batch,
>> or is the batch application isolated, i.e. other clients can read either all
>> rows of that batch or none of them?
>>
>> I understand that a BATCH update to multiple partitions is not isolated
>> but I'm not sure if this is also the case for a single partition:
>>
>> - The article Atomic batches in Cassandra 1.2
>>  says
>> that *"... we mean atomic in the database sense that if any part of the
>> batch succeeds, all of it will. No other guarantees are implied; in
>> particular, there is no isolation"*.
>>
>> - On the other hand, the CQL BATCH
>>  docs at
>> cassandra.apache.org mention that *"* *... the [batch] operations are
>> still only isolated within a single partition"* which is a clear
>> statement but doesn't it contradict the previous and the next one?
>>
>> - The CQL BATCH
>> 
>> docs at docs.datastax.com mention that *"... there is no batch
>> isolation. Clients are able to read the first updated rows from the batch,
>> while other rows are still being updated on the server. However,
>> transactional row updates within a partition key are isolated: clients
>> cannot read a partial update"*. Also, what does *"transactional row
>> updates"* mean in this context? A lightweight transaction? Something
>> else?
>>
>> Thanks for any help,
>> Martin
>>
>>
>
> --
> Martin Krasser
>
> blog:http://krasserm.github.io
> code:http://github.com/krasserm
> twitter: http://twitter.com/mrt1nz
>
>


cqlsh ValueError: Don't know how to parse type string

2015-05-19 Thread Kaushal Shriyan
Hi,

I am running cassandra version 1.2.19 and cqlsh version 3.1.8 in my setup

cqlsh:apprepo> select * from test_proxy_revisions_r21 limit 10 ;
Traceback (most recent call last):
  File "./cqlsh", line 1039, in perform_statement_untraced
self.cursor.execute(statement, decoder=decoder)
  File "./../lib/cql-internal-only-1.4.1.zip/cql-1.4.1/cql/cursor.py",
line 81, in execute
return self.process_execution_results(response, decoder=decoder)
  File "./../lib/cql-internal-only-1.4.1.zip/cql-1.4.1/cql/thrifteries.py",
line 116, in process_execution_results
self.get_metadata_info(self.result[0])
  File "./../lib/cql-internal-only-1.4.1.zip/cql-1.4.1/cql/cursor.py",
line 97, in get_metadata_info
name, nbytes, vtype, ctype = self.get_column_metadata(colid)
  File "./../lib/cql-internal-only-1.4.1.zip/cql-1.4.1/cql/cursor.py",
line 104, in get_column_metadata
return self.decoder.decode_metadata_and_type(column_id)
  File "./../lib/cql-internal-only-1.4.1.zip/cql-1.4.1/cql/decoders.py",
line 40, in decode_metadata_and_type
valdtype = cqltypes.lookup_casstype(validator)
  File "./../lib/cql-internal-only-1.4.1.zip/cql-1.4.1/cql/cqltypes.py",
line 145, in lookup_casstype
raise ValueError("Don't know how to parse type string %r: %s" %
(casstype, e))
ValueError: Don't know how to parse type string
'org.apache.cassandra.db.marshal.DynamicCompositeType(s=>org.apache.cassandra.db.marshal.UTF8Type)':
weird characters '=>org.apache.cassandra.db.marshal.UTF8Type)' at end

Any clue?

Regards,

Kaushal


Re: cqlsh ValueError: Don't know how to parse type string

2015-05-19 Thread DuyHai Doan
Hello Kaushal

Hmm, your schema is using the ancient DynamicCompositeType. Can you
send a describe of the table from cassandra-cli?
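For context on what the parser is rejecting: the type string uses DynamicCompositeType's alias syntax (s=>...), which the bundled cql-1.4.x library does not understand. A hypothetical sketch of splitting that syntax apart (illustration only, not the library's actual parser):

```python
# Sketch (illustration only): split a DynamicCompositeType declaration
# into the outer class name and its alias map.
import re

type_str = ("org.apache.cassandra.db.marshal.DynamicCompositeType"
            "(s=>org.apache.cassandra.db.marshal.UTF8Type)")

m = re.match(r"([\w.]+)\((.*)\)$", type_str)
outer, inner = m.group(1), m.group(2)
# Each "alias=>class" pair maps a one-character alias to a comparator.
aliases = dict(pair.split("=>") for pair in inner.split(","))

print(outer.rsplit(".", 1)[-1])  # DynamicCompositeType
print(aliases["s"])              # org.apache.cassandra.db.marshal.UTF8Type
```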



Re: Batch isolation within a single partition

2015-05-19 Thread Sylvain Lebresne
On Tue, May 19, 2015 at 9:42 AM, DuyHai Doan  wrote:

> If RF > 1, the consistency level at QUORUM cannot guarantee strict
> isolation (for normal mutation or batch). If you look at this slide:
> http://www.slideshare.net/doanduyhai/cassandra-introduction-apache-con-2014-budapest/25,
> you can see that the mutation is sent by the coordinator, in parallel, to
> all replicas.
>
>  Now it is very possible that due to network latency, the mutation is
> applied on the first replica and is applied with "some delay" (which can be
> at the order of microseconds) on other replicas.
>
>  Theoretically, one client can read updated value on first replica and old
> value on the other replicas, even at QUORUM.
>

Unfortunately, different people tend to have different definitions of
isolation, and I don't seem to share yours, but still, I don't understand
what you're describing. Of course replicas might not get a mutation at the
same time, and yes, a read at QUORUM may thus not see the most up-to-date
value from all replicas. But the coordinator resolves all responses
together and returns only the most recent one, so that doesn't matter to
the client, and I don't see how it has anything to do with isolation from
the client's perspective.

My response to the original question is that if by isolation you mean "can
a reader observe a write only partially applied", then for single-partition
writes, Cassandra does offer isolation. One caveat, however, is that if two
writes conflict, they are resolved using their timestamps, and if the
timestamps are the same, resolution is based on values, which is not
necessarily intuitive and may make it look like the writes were not applied
in isolation (even though technically they were); see
https://issues.apache.org/jira/browse/CASSANDRA-6123 for details on that
latter problem. I'll note that my definition of isolation does not mean you
can't read stale data, which you can indeed if you use weak consistency
levels.
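The timestamp-tie behavior can be sketched with a toy reconciliation function (a simplified model under my own assumptions, not Cassandra's actual code; CASSANDRA-6123 has the real details):

```python
# Sketch: last-write-wins reconciliation of two versions of a cell.
# Higher timestamp wins; on a timestamp tie the greater value wins,
# which is the counter-intuitive case discussed above.

def reconcile(cell_a, cell_b):
    """Each cell is a (timestamp, value) pair; return the survivor."""
    if cell_a[0] != cell_b[0]:
        return cell_a if cell_a[0] > cell_b[0] else cell_b
    # Timestamps equal: break the tie on the value itself.
    return cell_a if cell_a[1] >= cell_b[1] else cell_b

print(reconcile((10, "x"), (11, "y")))  # (11, 'y'): later timestamp wins
print(reconcile((10, "b"), (10, "a")))  # (10, 'b'): tie, larger value wins
```

In the tie case both writes were applied in isolation; it is only the deterministic value-based resolution that makes one of them appear lost.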

If you mean something else by isolation, then I think agreeing first on the
definition would be wise.

--
Sylvain


Re: Batch isolation within a single partition

2015-05-19 Thread Stefan Podkowinski
Multiple inserts for the same partition key within a batch will be consolidated
into a single row update operation (since 2.0.6, #6737). I.e., you get the
same row-level isolation guarantees as any single write operation on that key.
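A toy model of that consolidation step (hypothetical keys and columns, not the actual Cassandra code):

```python
# Sketch: statements in a batch that target the same partition key are
# merged into one mutation, so the partition sees a single atomic write.
from collections import defaultdict

batch = [
    ("user:42", {"name": "a"}),   # hypothetical partition keys/columns
    ("user:42", {"email": "b"}),
    ("user:99", {"name": "c"}),
]

mutations = defaultdict(dict)
for partition_key, columns in batch:
    mutations[partition_key].update(columns)  # consolidate per partition

print(len(mutations))        # 2 distinct partitions, 2 mutations
print(mutations["user:42"])  # both columns land in one atomic update
```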


From: Martin Krasser [mailto:krass...@googlemail.com]
Sent: Monday, May 18, 2015 12:32
To: user@cassandra.apache.org
Subject: Batch isolation within a single partition



Re: Batch isolation within a single partition

2015-05-19 Thread Martin Krasser



On 19.05.15 10:04, Sylvain Lebresne wrote:

Unfortunately, different people tend to have different definitions
of isolation, and I don't seem to share yours, but still, I don't
understand what you're describing. Of course replicas might not get a
mutation at the same time, and yes, a read at QUORUM may thus not see
the most up-to-date value from all replicas.


If I understand correctly, this can only happen if a QUORUM read started 
*before* the QUORUM write completed. If, on the other hand, a QUORUM 
read follows a *completed* QUORUM write, shouldn't the read always 
return the most recent value?


For example, with RF = 3 and QUORUM write + read, we have nodes_written 
+ nodes_read > RF (with nodes_written = nodes_read = 2) which guarantees 
consistency, or am I missing something?


But the coordinator resolves all responses together and returns only
the most recent one, so that doesn't matter to the client, and I don't
see how it has anything to do with isolation from the client's
perspective.


+1



My response to the original question is that if by isolation you mean
"can a reader observe a write only partially applied", then for
single-partition writes, Cassandra does offer isolation.


Yes, this is exactly what I mean (and what I need for batch writes to a 
single partition).


One caveat, however, is that if two writes conflict, they are resolved
using their timestamps, and if the timestamps are the same, resolution
is based on values, which is not necessarily intuitive and may make it
look like the writes were not applied in isolation (even though
technically they were); see
https://issues.apache.org/jira/browse/CASSANDRA-6123 for details on
that latter problem. I'll note that my definition of isolation does not
mean you can't read stale data, which you can indeed if you use weak
consistency levels.


I completely share your view/definition of isolation - it's not about 
staleness, it's only about that a reader cannot observe partial writes.


Regarding staleness/consistency: if I want to read the most recent
batch write, a QUORUM read must follow the completed QUORUM (batch)
write, right?


Thanks for your clarifications,
Martin



If you mean something else by isolation, then I think agreeing first 
on the definition would be wise.


--
Sylvain


--
Martin Krasser

blog:http://krasserm.github.io
code:http://github.com/krasserm
twitter: http://twitter.com/mrt1nz



Re: Batch isolation within a single partition

2015-05-19 Thread Martin Krasser



On 19.05.15 10:38, Stefan Podkowinski wrote:


Multiple inserts for the same partition key within a batch will be
consolidated into a single row update operation (since 2.0.6, #6737).
I.e., you get the same row-level isolation guarantees as any single
write operation on that key.




This is exactly what I need, thanks for clarifying and sharing the link(s).





--
Martin Krasser

blog:http://krasserm.github.io
code:http://github.com/krasserm
twitter: http://twitter.com/mrt1nz



Re: cqlsh ValueError: Don't know how to parse type string

2015-05-19 Thread Kaushal Shriyan
Hi DuyHai Doan,

Please find the below details

/opt/test4/share/apache-cassandra/bin/cassandra-cli -h 0
Connected to: “test” on 0/9160
Welcome to Cassandra CLI version 1.2.19

Type 'help;' or '?' for help.
Type 'quit;' or 'exit;' to quit.

[default@unknown] use testrepo;
Authenticated to keyspace: testrepo
[default@testrepo] describe test_proxy_revisions_r21;

WARNING: CQL3 tables are intentionally omitted from 'describe' output.
See https://issues.apache.org/jira/browse/CASSANDRA-4377 for details.

ColumnFamily: test_proxy_revisions_r21
  Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
  Default column value validator:
org.apache.cassandra.db.marshal.BytesType
  Cells sorted by:
org.apache.cassandra.db.marshal.DynamicCompositeType(s=>org.apache.cassandra.db.marshal.UTF8Type)
  GC grace seconds: 864000
  Compaction min/max thresholds: 4/32
  Read repair chance: 0.1
  DC Local Read repair chance: 0.0
  Populate IO Cache on flush: false
  Replicate on write: true
  Caching: KEYS_ONLY
  Bloom Filter FP chance: default
  Built indexes: []
  Compaction Strategy:
org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
  Compression Options:
sstable_compression:
org.apache.cassandra.io.compress.SnappyCompressor
[default@testrepo]


Please do let me know if you need any additional information.

Regards,

Kaushal

On Tue, May 19, 2015 at 1:22 PM, DuyHai Doan  wrote:

> Hello Kaushal
>
> Humm your schema is using the ancient DynamicCompositeType. Can you just
> send a describe of the table in cassandra-cli ?


Re: Consistency Issues

2015-05-19 Thread Jared Rodriguez
It looks like NTP was the problem.  Thanks for the solution!!!
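A toy model of how clock skew produces this symptom (my own simplified sketch; Cassandra timestamps are microseconds since the epoch and conflicts are resolved last-write-wins):

```python
# Sketch: writes are resolved by timestamp, so if the node (or client)
# stamping the *second* write has a lagging clock, the earlier write
# silently wins and the update appears lost.

def apply_write(stored, incoming):
    """Keep the cell with the higher timestamp (last-write-wins)."""
    return incoming if incoming[0] > stored[0] else stored

first_write = (1_000_500, "v1")  # stamped by a node with a correct clock
second_write = (500, "v2")       # wall-clock later, but clock lags by 1s

cell = apply_write(first_write, second_write)
print(cell)  # (1000500, 'v1'): the newer update was discarded
```

Running NTP on every node keeps the stamps monotonic across the cluster, which is why it fixed the issue here.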

On Wed, May 13, 2015 at 9:20 AM, Robert Wille  wrote:

>  Timestamps have millisecond granularity. If you make multiple writes
> within the same millisecond, then the outcome is not deterministic.
>
>  Also, make sure you are running ntp. Clock skew will manifest itself
> similarly.
>
>  On May 13, 2015, at 3:47 AM, Jared Rodriguez 
> wrote:
>
>  Thanks for the feedback.  We have dug in deeper and upgraded to
> Cassandra 2.0.14 and are seeing the same issue.  What appears to be
> happening is that if a record is initially written, then the first read is
> fine.  But if we immediately update that record with a second write, then
> the second read is problematic.
>
>  We have a 4-node cluster and a replication factor of 2.  What seems to
> be happening is that on the initial write the record is sent to nodes A and B.  If
> a secondary write (update) of the record occurs while the record is in the
> memtable and not yet written to the sstable of A or B, that the next read
> returns nothing.
>
>  We are continuing to dig in and get as much detail as possible before
> opening this as a JIRA.
>
> On Tue, May 12, 2015 at 6:51 PM, Robert Coli  wrote:
>
>>  On Tue, May 12, 2015 at 12:35 PM, Michael Shuler wrote:
>>>  This is a 4 node cluster running Cassandra 2.0.6

>>>
>>> Can you reproduce the same issue on 2.0.14? (or better yet, the
>>> cassandra-2.0 branch HEAD, which will soon ship 2.0.15) If you get the same
>>> results, please, open a JIRA with the reproduction steps.
>>
>>
>>  And if you do file such a JIRA, please let the list know the JIRA URL,
>> to close the loop!
>>
>>  =Rob
>>
>>
>
>
>
>  --
> Jared Rodriguez
>
>
>


-- 
Jared Rodriguez


Re: cqlsh ValueError: Don't know how to parse type string

2015-05-19 Thread Kaushal Shriyan
Hi DuyHai Doan,

I am looking forward to your reply; please let me know if you need any
additional information.

Regards,

Kaushal



Re: Fail to add a node to a cluster - Unknown keyspace system_traces

2015-05-19 Thread Tzach Livyatan
More finding on the problem:
1. the problem presents itself when using nodetool status

$ nodetool status
error: Unknown keyspace system_traces
-- StackTrace --
java.lang.AssertionError: Unknown keyspace system_traces
at org.apache.cassandra.db.Keyspace.(Keyspace.java:270)
at org.apache.cassandra.db.Keyspace.open(Keyspace.java:119)
at org.apache.cassandra.db.Keyspace.open(Keyspace.java:96)
...

2. the problem disappears when I create an empty keyspace

cqlsh> create keyspace temp WITH REPLICATION = { 'class' :
'SimpleStrategy', 'replication_factor' : 2 };

$ nodetool status
Datacenter: datacenter1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load   Tokens  OwnsHost ID
Rack
UN  172.31.44.38  118.31 KB  256 ?
e5e97978-b048-46d8-936b-88b544459856  rack1
UN  172.31.44.39  106.48 KB  256 ?
6b021cd9-2fd0-44f4-afea-d01f0b64c45c  rack1

My guess is that system_traces initialization completes only after some
data is inserted. Until then, any attempt to read from it (whether from
nodetool, cqlsh, or streaming to a new node) will fail.


On Mon, May 18, 2015 at 3:33 PM, Tzach Livyatan 
wrote:

> I have a dev cluster of two Cassandra 2.1.2 servers on EC2
> When adding a new server, I get a
> "Streaming error occurred java.lang.AssertionError: Unknown keyspace
> system_traces"
> exception on the cluster (not the new) server (full log below).
>
> Indeed, when I cqlsh to the cluster server, I see the following:
> cqlsh> DESCRIBE KEYSPACES;
>
> system_traces  system
>
> cqlsh> use system_traces;
> code=2200 [Invalid query] message="Keyspace 'system_traces' does not exist"
>
> While
> cqlsh> DESCRIBE KEYSPACE system_traces;
> Do works!
>
> Is it a bug? feature?
>
> Thanks
> Tzach
>
>
>
> Full log from adding a node:
> INFO  [STREAM-INIT-/172.31.19.130:48054] 2015-05-18 11:36:17,734
> StreamResultFuture.java:109 - [Stream #18f25dc0-fd52-11e4-a00a-c1b81a3b68f5
> ID#0] Creating new streaming plan for Bootst
> rap
> INFO  [STREAM-INIT-/172.31.19.130:48054] 2015-05-18 11:36:17,735
> StreamResultFuture.java:116 - [Stream
> #18f25dc0-fd52-11e4-a00a-c1b81a3b68f5, ID#0] Received streaming plan for
> Bootstrap
> INFO  [STREAM-INIT-/172.31.19.130:48055] 2015-05-18 11:36:17,736
> StreamResultFuture.java:116 - [Stream
> #18f25dc0-fd52-11e4-a00a-c1b81a3b68f5, ID#0] Received streaming plan for
> Bootstrap
> ERROR [STREAM-IN-/172.31.19.130] 2015-05-18 11:36:17,777
> StreamSession.java:472 - [Stream #18f25dc0-fd52-11e4-a00a-c1b81a3b68f5]
> Streaming error occurred
> java.lang.AssertionError: Unknown keyspace system_traces
> at org.apache.cassandra.db.Keyspace.(Keyspace.java:273)
> ~[apache-cassandra-2.1.2.jar:2.1.2]
> at org.apache.cassandra.db.Keyspace.open(Keyspace.java:122)
> ~[apache-cassandra-2.1.2.jar:2.1.2]
> at org.apache.cassandra.db.Keyspace.open(Keyspace.java:99)
> ~[apache-cassandra-2.1.2.jar:2.1.2]
> at
> org.apache.cassandra.streaming.StreamSession.getColumnFamilyStores(StreamSession.java:280)
> ~[apache-cassandra-2.1.2.jar:2.1.2]
> at
> org.apache.cassandra.streaming.StreamSession.addTransferRanges(StreamSession.java:257)
> ~[apache-cassandra-2.1.2.jar:2.1.2]
> at
> org.apache.cassandra.streaming.StreamSession.prepare(StreamSession.java:488)
> ~[apache-cassandra-2.1.2.jar:2.1.2]
> at
> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:420)
> ~[apache-cassandra-2.1.2.jar:2.1.2]
> at
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:251)
> ~[apache-cassandra-2.1.2.jar:2.1.2]
>


Re: Out of memory on wide row read

2015-05-19 Thread Jack Krupansky
Shame on me for not noticing that you uttered the magic anti-pattern word -
Thrift. Yeah, the standard response to any inquiry concerning Thrift is
always that you should be migrating to CQL3.

-- Jack Krupansky

On Tue, May 19, 2015 at 3:13 AM, Antoine Blanchet <
a.blanc...@abc-arbitrage.com> wrote:

> The issue has been closed by Jonathan Ellis. The limit is useless in CQL
> because of the automatic paging feature
> ,
> that's cool. But this feature will not be add to the Thrift API. Subject
> closed :).
>
> On Mon, May 18, 2015 at 6:05 PM, Antoine Blanchet <
> a.blanc...@abc-arbitrage.com> wrote:
>
>> Done, https://issues.apache.org/jira/browse/CASSANDRA-9413 . Feel free
>> to improve the description; I've only copy/pasted the first message from
>> Kévin.
>>
>> Thanks.
>>
>> On Fri, May 15, 2015 at 9:56 PM, Alprema  wrote:
>>
>>> I will file a jira for that, thanks
>>> On May 12, 2015 10:15 PM, "Jack Krupansky" 
>>> wrote:
>>>
 Sounds like it's worth a Jira - Cassandra should protect itself from
 innocent mistakes or excessive requests from clients. Maybe there should be
 a timeout or a result size limit (bytes in addition to count). Something.
 Anything. But OOM seems a tad unfriendly for an innocent mistake. In this
 particular case, maybe Cassandra could detect the total row size/slice
 being read and error out on a configurable limit.

 -- Jack Krupansky

 On Tue, May 12, 2015 at 1:57 PM, Robert Coli 
 wrote:

> On Tue, May 12, 2015 at 8:43 AM, Kévin LOVATO 
> wrote:
>
>> My question is the following: Is it possible to prevent Cassandra
>> from OOM'ing when a client makes this kind of request? I'd rather have an
>> error thrown to the client than a multi-server crash.
>>
>
> You can provide a default LIMIT clause, but this is based on number of
> results and not size.
>
> Other than that, there are not really great options.
>
> =Rob
>
>


>>
>>
>> --
>> Antoine Blanchet
>> ABC Arbitrage Asset Management
>> http://www.abc-arbitrage.com/
>>
>
>
>
> --
> Antoine Blanchet
> ABC Arbitrage Asset Management
> http://www.abc-arbitrage.com/
>
>
>
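The safeguard Jack proposes in this thread (a byte budget in addition to a row-count limit) can be sketched on the client side while waiting for a server-side feature. A minimal illustration with hypothetical names, not an actual Cassandra option:

```python
class ResultTooLargeError(Exception):
    """Raised instead of letting an oversized read exhaust memory."""

def bounded_collect(rows, max_rows=10000, max_bytes=1 << 20, sizeof=len):
    """Accumulate rows, erroring out once either budget is exceeded.

    rows      -- any iterable of result rows (e.g. from a paged query)
    max_rows  -- count limit, what a LIMIT clause already gives you
    max_bytes -- size limit, the part a LIMIT clause does not cover
    sizeof    -- estimator for a row's in-memory size
    """
    out, total = [], 0
    for row in rows:
        total += sizeof(row)
        if len(out) >= max_rows or total > max_bytes:
            raise ResultTooLargeError(
                "read aborted after %d rows / %d bytes" % (len(out), total))
        out.append(row)
    return out

# A 100-byte budget trips on the third 40-byte row;
# a count limit alone would have let all five through.
try:
    bounded_collect(["x" * 40] * 5, max_rows=1000, max_bytes=100)
    tripped = False
except ResultTooLargeError:
    tripped = True
```

This is exactly the distinction Rob points out above: a LIMIT bounds the number of results, but only a byte budget bounds their size.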


spout storm cassandra

2015-05-19 Thread Vanessa Gligor
Is there any way to consume inputs from Cassandra using a Storm spout?

Thank you.


[BETA-RELEASE] Apache Cassandra 2.2.0-beta1 released

2015-05-19 Thread Jake Luciani
The Cassandra team is pleased to announce the release of Apache Cassandra
version 2.2.0-beta1.

This release is *not* production ready. We are looking for testing of
existing and new features. If you encounter any problem please let us know
[1].

Cassandra 2.2 features major enhancements such as:

* Resume-able Bootstrapping
* JSON Support [4]
* User Defined Functions [5]
* Server-side Aggregation [6]
* Role based access control

Read [2] and [3] to learn about all the new features.

Downloads of source and binary distributions are listed in our download
section:

http://cassandra.apache.org/download/

Enjoy!

-The Cassandra Team

[1]: https://issues.apache.org/jira/browse/CASSANDRA
[2]: http://goo.gl/MyOEib (NEWS.txt)
[3]: http://goo.gl/MBJd1S (CHANGES.txt)
[4]: http://cassandra.apache.org/doc/cql3/CQL-2.2.html#json
[5]: http://cassandra.apache.org/doc/cql3/CQL-2.2.html#udfs
[6]: http://cassandra.apache.org/doc/cql3/CQL-2.2.html#udas


Re: spout storm cassandra

2015-05-19 Thread Manoj Khangaonkar
Hi,

Storm spouts are supposed to read from somewhere - preferably streams.

In theory, you could write a spout that queries cassandra and makes data
available to storm.

But remember that Cassandra's data model is based on partitioning by key
and the use of wide rows. So if your data model is such that the spout can
query Cassandra appropriately, it can work.

But architecturally, you might ask yourself if there are better ways of
doing this in your environment. The data is produced somewhere. It might be
possible to consume it from the source, rather than write to Cassandra and
consume it from there.

regards


On Tue, May 19, 2015 at 6:36 AM, Vanessa Gligor 
wrote:

> Is there any way to consume inputs from Cassandra using a Storm spout?
>
> Thank you.
>



-- 
http://khangaonkar.blogspot.com/

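The "spout that queries Cassandra" idea Manoj describes can be sketched without Storm or a live cluster. A minimal poll-based shape in plain Python, where `query_since` is a hypothetical stand-in for a CQL slice query (e.g. on a time-bucketed partition); a real Storm spout would do this inside nextTuple() and handle ack/fail:

```python
class CassandraPollingSpout:
    """Minimal shape of a spout that polls a store for new rows.

    query_since(cursor) stands in for a driver call returning
    (rows, new_cursor); here we only model the polling loop and the
    cursor bookkeeping, not Storm's tuple lifecycle.
    """

    def __init__(self, query_since):
        self.query_since = query_since
        self.cursor = None
        self.pending = []

    def next_tuple(self):
        """Return the next row, fetching a fresh batch when drained."""
        if not self.pending:
            rows, self.cursor = self.query_since(self.cursor)
            self.pending = list(rows)
        return self.pending.pop(0) if self.pending else None

# Simulated table: batches of rows keyed by an advancing cursor.
batches = {None: ([1, 2], "t1"), "t1": ([3], "t2"), "t2": ([], "t2")}
spout = CassandraPollingSpout(lambda cur: batches[cur])
emitted = [spout.next_tuple() for _ in range(3)]
```

The cursor is what makes or breaks this design: as Manoj notes, the data model has to support an efficient "give me what's new since X" query, which usually means clustering rows by time within a partition.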

Does Cassandra CQL supports 'Create Table as Select'?

2015-05-19 Thread amit tewari
Hi

We would like the ability to create new tables from existing tables (on
the fly), but with a new/different partition key.

Can this be done from CQL?

Thanks
Amit


Re: Does Cassandra CQL supports 'Create Table as Select'?

2015-05-19 Thread Jonathan Haddad
It's not built into Cassandra.  You'll probably want to take a look at
Apache Spark & the DataStax connector.

https://github.com/datastax/spark-cassandra-connector

Jon

On Tue, May 19, 2015 at 10:29 PM amit tewari  wrote:

> Hi
>
> We would like the ability to create new tables from existing tables (on
> the fly), but with a new/different partition key.
>
> Can this be done from CQL?
>
> Thanks
> Amit
>


Re: Does Cassandra CQL supports 'Create Table as Select'?

2015-05-19 Thread Jonathan Haddad
Here's a simple example I did a little while ago that might be helpful:
https://github.com/rustyrazorblade/spark-data-migration

On Tue, May 19, 2015 at 10:53 PM Jonathan Haddad  wrote:

> It's not built into Cassandra.  You'll probably want to take a look at
> Apache Spark & the DataStax connector.
>
> https://github.com/datastax/spark-cassandra-connector
>
> Jon
>
> On Tue, May 19, 2015 at 10:29 PM amit tewari 
> wrote:
>
>> Hi
>>
>> We would like the ability to create new tables from existing tables (on
>> the fly), but with a new/different partition key.
>>
>> Can this be done from CQL?
>>
>> Thanks
>> Amit
>>
>
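What the suggested Spark job effectively does (scan every row, then rewrite it under a new partition key into a new table) can be illustrated in miniature. A plain-Python sketch with hypothetical column names, not the connector's API:

```python
from collections import defaultdict

def repartition(rows, new_key):
    """Re-key a table scan: group full rows under a different partition key.

    rows    -- iterable of dicts, one per source row (a full-table scan)
    new_key -- column name to use as the target table's partition key
    This is the essence of the migration: scan, re-key, write out.
    """
    target = defaultdict(list)
    for row in rows:
        target[row[new_key]].append(row)
    return dict(target)

# Source table keyed by user_id, re-keyed by country for the new table.
rows = [
    {"user_id": 1, "country": "FR", "name": "a"},
    {"user_id": 2, "country": "FR", "name": "b"},
    {"user_id": 3, "country": "US", "name": "c"},
]
by_country = repartition(rows, "country")
```

Spark's value here is doing this scan-and-regroup in parallel across token ranges; the per-row transformation itself is this simple, which is why the linked migration example is so short.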