Re: Returning a UDT from a user-defined function (UDF)

2016-04-07 Thread Henry M
What I wanted to do does not seem to be possible (probably a limitation
or a bug)... I do see a way to get the KeyspaceMetadata and, from that, the
UserType instance (code snippets 1 & 2 below).

1.)

org.apache.cassandra.schema.KeyspaceMetadata ksm =
org.apache.cassandra.config.Schema.instance.getKSMetaData("test_ks");


2.)

com.datastax.driver.core.UserType myUdt = ksm.types.get("my_other_udt").get();


But this fails with the error below because Schema is not a whitelisted
package. It probably should not be whitelisted, but there should still be a
way to create and return a user-defined type.

:88:InvalidRequest: code=2200 [Invalid query] message="Could not
compile function 'test_ks.transform_udt' from Java source:
org.apache.cassandra.exceptions.InvalidRequestException: Java source
compilation failed:
Line 4: org.apache.cassandra.schema.KeyspaceMetadata cannot be resolved to
a type
Line 4: org.apache.cassandra.config.Schema.instance cannot be resolved to a
type
"
:90:InvalidRequest: code=2200 [Invalid query] message="Unknown
function 'extract_text_field_sample_udt'"

My updated UDF, for complete context:

CREATE OR REPLACE FUNCTION test_ks.transform_udt (val my_udt)
 RETURNS NULL ON NULL INPUT
 RETURNS my_other_udt
 LANGUAGE java
  AS '
String fieldA = val.getString("field_a");

org.apache.cassandra.schema.KeyspaceMetadata ksm =
org.apache.cassandra.config.Schema.instance.getKSMetaData("test_ks");

com.datastax.driver.core.UserType myUdt =
ksm.types.get("my_other_udt").get();

com.datastax.driver.core.UDTValue transformedValue = myUdt.newValue();

transformedValue.setUUID("id", java.util.UUID.randomUUID());
transformedValue.setString("field_a", fieldA);
transformedValue.setString("field_b", "value b");

return transformedValue;
  ';
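
For what it's worth, later Cassandra releases added a supported hook for
exactly this: CASSANDRA-10818 (shipped in 3.6, if I read the ticket right)
injects a udfContext object into UDF bodies that can build UDTValue instances
without touching internal classes. A minimal sketch of the same function on
that API (untested here, so treat it as an assumption):

CREATE OR REPLACE FUNCTION test_ks.transform_udt (val my_udt)
 RETURNS NULL ON NULL INPUT
 RETURNS my_other_udt
 LANGUAGE java
  AS '
// udfContext is provided by Cassandra 3.6+ (CASSANDRA-10818);
// newReturnUDTValue() creates an empty value of the declared
// return type, my_other_udt, without any schema lookups.
com.datastax.driver.core.UDTValue transformedValue = udfContext.newReturnUDTValue();
transformedValue.setUUID("id", java.util.UUID.randomUUID());
transformedValue.setString("field_a", val.getString("field_a"));
transformedValue.setString("field_b", "value b");
return transformedValue;
  ';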


On Thu, Apr 7, 2016 at 7:40 PM Henry M  wrote:

> I was wondering if it is possible to create a UDT and return it within a
> user-defined function.
>
> I looked at this documentation
> http://docs.datastax.com/en/cql/3.3/cql/cql_using/useCreateUDF.html but
> the examples are only for basic types.
>
> This is the pseudo-code I came up with... the part I think I am missing is
> how to get an instance of the UserType so that I can invoke newValue to
> create a UDTValue.
>
> Has anyone done this? Do you know how to get the keyspace in order to call
> getUserType, or of an alternate approach?
>
> CREATE OR REPLACE FUNCTION test_ks.transform_udt (val my_udt)
>  RETURNS NULL ON NULL INPUT
>  RETURNS my_other_udt
>  LANGUAGE java
>   AS '
> String fieldA = val.getString("field_a");
>
> // How do you get a reference to the user type?
> UserType myUdt = ?keyspace?.getUserType("my_other_udt");
>
> UDTValue transformedValue = myUdt.newValue();
>
> transformedValue.setUUID("id", UUID.randomUUID());
> transformedValue.setString("field_a", fieldA);
> transformedValue.setString("field_b", "value b");
>
> return transformedValue;
>   ';
>
>
> Thank you,
> Henry
>
>
> P.S. This is the setup for my sample table and types.
>
> drop keyspace test_ks;
>
> create keyspace test_ks WITH REPLICATION = { 'class' : 'SimpleStrategy', 
> 'replication_factor' : 1 };
>
> use test_ks;
>
> CREATE TYPE IF NOT EXISTS test_ks.my_udt (field_a text, field_b text);
> CREATE TYPE IF NOT EXISTS test_ks.my_other_udt (id uuid, field_a text, 
> field_b text);
>
> CREATE TABLE IF NOT EXISTS test_ks.sample_table(id uuid primary key, col_a
> frozen<my_udt>);
>
> INSERT INTO sample_table(id, col_a) VALUES ( now() , { field_a: 'value 1', 
> field_b: 'value 2'} );
> INSERT INTO sample_table(id) VALUES ( now() );


Re: Mapping a continuous range to a discrete value

2016-04-07 Thread Henry M
I had to do something similar (in my case it was an IN query)... I ended
up writing a hack in Java to create a custom Expression and inject it into
the RowFilter of a dummy secondary index (not advisable, and very short-term,
but it keeps my application code clean). I am keeping my eyes open for the
evolution of SASI indexes (starting with Cassandra 3.4,
https://github.com/apache/cassandra/blob/trunk/doc/SASI.md), which should do
what you are looking for.
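
For reference, SASI is declared as a CQL custom index. A minimal sketch,
assuming a variant of the table from this thread in which upper is a regular
column rather than a clustering column (I have not verified how the early
SASI releases handle clustering columns, so treat this as an assumption):

CREATE TABLE range_mapping2 (k int, lower int, upper int, mapped_value int,
    PRIMARY KEY (k, lower));

-- SPARSE mode is described in the SASI docs as suited to numeric columns
-- where each indexed value maps to few rows, which fits a range bound:
CREATE CUSTOM INDEX range_mapping2_upper_idx ON range_mapping2 (upper)
    USING 'org.apache.cassandra.index.sasi.SASIIndex'
    WITH OPTIONS = { 'mode': 'SPARSE' };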



On Thu, Apr 7, 2016 at 11:06 AM Mitch Gitman  wrote:

> I just happened to run into a similar situation myself and I can see it's
> due to a bad schema design (and query design) on my part. What I wanted to
> do was narrow down by the range on one clustering column and then by
> another range on the next clustering column. Failing to adequately think
> through how Cassandra stores its sorted rows on disk, I just figured, hey,
> why not?
>
> The result? The same error message you got. But then, going back over some
> old notes from a DataStax CQL webinar, I came across this (my words):
>
> "You can do selects with combinations of the different primary keys
> including ranges on individual columns. The range will only work if you've
> narrowed things down already by equality on all the prior columns.
> Cassandra creates a composite type to store the column name."
>
> My new solution in response is to create two tables: one that's sorted by (in
> my situation) a high timestamp, the other that's sorted by (in my
> situation) a low timestamp. What had been two clustering columns gets
> broken up into one clustering column each in two different tables. Then I
> do two queries, one with the one range, the other with the other, and I
> programmatically merge the results.
>
> The funny thing is, that was my original design which my most recent, and
> failed, design is replacing. My new solution goes back to my old solution.
>
> On Thu, Apr 7, 2016 at 1:37 AM, Peer, Oded  wrote:
>
>> I have a table mapping continuous ranges to discrete values.
>>
>>
>>
>> CREATE TABLE range_mapping (k int, lower int, upper int, mapped_value
>> int, PRIMARY KEY (k, lower, upper));
>>
>> INSERT INTO range_mapping (k, lower, upper, mapped_value) VALUES (0, 0,
>> 99, 0);
>>
>> INSERT INTO range_mapping (k, lower, upper, mapped_value) VALUES (0, 100,
>> 199, 100);
>>
>> INSERT INTO range_mapping (k, lower, upper, mapped_value) VALUES (0, 200,
>> 299, 200);
>>
>>
>>
>> I then want to query this table to find the mapping of a specific value.
>>
>> In SQL I would use: *select mapped_value from range_mapping where k=0
>> and ? between lower and upper*
>>
>>
>>
>> If the variable is bound to the value 150 then the mapped_value returned
>> is 100.
>>
>>
>>
>> I can’t use the same type of query in CQL.
>>
>> Using the query “*select * from range_mapping where k = 0 and lower <=
>> 150 and upper >= 150;*” returns an error "Clustering column "upper"
>> cannot be restricted (preceding column "lower" is restricted by a non-EQ
>> relation)"
>>
>>
>>
>> I thought of using multi-column restrictions but they don’t work as I
>> expected as the following query returns two rows instead of the one I
>> expected:
>>
>>
>>
>> *select * from range_mapping where k = 0 and (lower,upper) <= (150,999)
>> and (lower,upper) >= (-999,150);*
>>
>>
>>
>> k | lower | upper | mapped_value
>> ---+-------+-------+--------------
>> 0 |     0 |    99 |            0
>> 0 |   100 |   199 |          100
>>
>>
>>
>> I’d appreciate any thoughts on the subject.
>>
>>
>>
>
>


Returning a UDT from a user-defined function (UDF)

2016-04-07 Thread Henry M
I was wondering if it is possible to create a UDT and return it within a
user-defined function.

I looked at this documentation
http://docs.datastax.com/en/cql/3.3/cql/cql_using/useCreateUDF.html but the
examples are only for basic types.

This is the pseudo-code I came up with... the part I think I am missing is
how to get an instance of the UserType so that I can invoke newValue to
create a UDTValue.

Has anyone done this? Do you know how to get the keyspace in order to call
getUserType, or of an alternate approach?

CREATE OR REPLACE FUNCTION test_ks.transform_udt (val my_udt)
 RETURNS NULL ON NULL INPUT
 RETURNS my_other_udt
 LANGUAGE java
  AS '
String fieldA = val.getString("field_a");

// How do you get a reference to the user type?
UserType myUdt = ?keyspace?.getUserType("my_other_udt");

UDTValue transformedValue = myUdt.newValue();

transformedValue.setUUID("id", UUID.randomUUID());
transformedValue.setString("field_a", fieldA);
transformedValue.setString("field_b", "value b");

return transformedValue;
  ';


Thank you,
Henry


P.S. This is the setup for my sample table and types.

drop keyspace test_ks;

create keyspace test_ks WITH REPLICATION = { 'class' :
'SimpleStrategy', 'replication_factor' : 1 };

use test_ks;

CREATE TYPE IF NOT EXISTS test_ks.my_udt (field_a text, field_b text);
CREATE TYPE IF NOT EXISTS test_ks.my_other_udt (id uuid, field_a text,
field_b text);

CREATE TABLE IF NOT EXISTS test_ks.sample_table(id uuid primary key,
col_a frozen<my_udt>);

INSERT INTO sample_table(id, col_a) VALUES ( now() , { field_a: 'value
1', field_b: 'value 2'} );
INSERT INTO sample_table(id) VALUES ( now() );


Re: Mapping a continuous range to a discrete value

2016-04-07 Thread Mitch Gitman
I just happened to run into a similar situation myself and I can see it's
due to a bad schema design (and query design) on my part. What I wanted to
do was narrow down by the range on one clustering column and then by
another range on the next clustering column. Failing to adequately think
through how Cassandra stores its sorted rows on disk, I just figured, hey,
why not?

The result? The same error message you got. But then, going back over some
old notes from a DataStax CQL webinar, I came across this (my words):

"You can do selects with combinations of the different primary keys
including ranges on individual columns. The range will only work if you've
narrowed things down already by equality on all the prior columns.
Cassandra creates a composite type to store the column name."

My new solution in response is to create two tables: one that's sorted by (in my
situation) a high timestamp, the other that's sorted by (in my situation) a
low timestamp. What had been two clustering columns gets broken up into one
clustering column each in two different tables. Then I do two queries, one
with the one range, the other with the other, and I programmatically merge
the results.

The funny thing is, that was my original design which my most recent, and
failed, design is replacing. My new solution goes back to my old solution.
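
Concretely, the two-table shape for the range_mapping example in this thread
might look like the following sketch (the table names are made up):

CREATE TABLE range_by_lower (k int, lower int, upper int, mapped_value int,
    PRIMARY KEY (k, lower));

CREATE TABLE range_by_upper (k int, upper int, lower int, mapped_value int,
    PRIMARY KEY (k, upper));

-- One single-column range per table, then intersect the results client-side:
SELECT * FROM range_by_lower WHERE k = 0 AND lower <= 150;
SELECT * FROM range_by_upper WHERE k = 0 AND upper >= 150;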

On Thu, Apr 7, 2016 at 1:37 AM, Peer, Oded  wrote:

> I have a table mapping continuous ranges to discrete values.
>
>
>
> CREATE TABLE range_mapping (k int, lower int, upper int, mapped_value int,
> PRIMARY KEY (k, lower, upper));
>
> INSERT INTO range_mapping (k, lower, upper, mapped_value) VALUES (0, 0,
> 99, 0);
>
> INSERT INTO range_mapping (k, lower, upper, mapped_value) VALUES (0, 100,
> 199, 100);
>
> INSERT INTO range_mapping (k, lower, upper, mapped_value) VALUES (0, 200,
> 299, 200);
>
>
>
> I then want to query this table to find the mapping of a specific value.
>
> In SQL I would use: *select mapped_value from range_mapping where k=0 and
> ? between lower and upper*
>
>
>
> If the variable is bound to the value 150 then the mapped_value returned
> is 100.
>
>
>
> I can’t use the same type of query in CQL.
>
> Using the query “*select * from range_mapping where k = 0 and lower <=
> 150 and upper >= 150;*” returns an error "Clustering column "upper"
> cannot be restricted (preceding column "lower" is restricted by a non-EQ
> relation)"
>
>
>
> I thought of using multi-column restrictions but they don’t work as I
> expected as the following query returns two rows instead of the one I
> expected:
>
>
>
> *select * from range_mapping where k = 0 and (lower,upper) <= (150,999)
> and (lower,upper) >= (-999,150);*
>
>
>
> k | lower | upper | mapped_value
> ---+-------+-------+--------------
> 0 |     0 |    99 |            0
> 0 |   100 |   199 |          100
>
>
>
> I’d appreciate any thoughts on the subject.
>
>
>


RE: Removing a DC

2016-04-07 Thread Anubhav Kale
Yes, that was it. Thanks a lot !!

From: Joel Knighton [mailto:joel.knigh...@datastax.com]
Sent: Thursday, April 7, 2016 10:02 AM
To: user@cassandra.apache.org
Subject: Re: Removing a DC

This sounds most like 
https://issues.apache.org/jira/browse/CASSANDRA-10371.

Are you on a version that could be affected by this issue?

Best,
Joel

On Thu, Apr 7, 2016 at 11:51 AM, Anubhav Kale <anubhav.k...@microsoft.com> wrote:
Hello,

We removed a DC using instructions from 
https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_decomission_dc_t.html

After all nodes were gone,


1.   System.peers doesn’t have an entry for the nodes that were removed
(confirmed via a cqlsh query with consistency ALL).

2.   Nodetool describecluster doesn’t show them.

3.   Nodetool gossipinfo shows them as “LEFT”.

However, the logs continue to spew the lines below, and restarting the node
doesn’t get rid of this. I am thinking a rolling restart of all nodes might
fix it, but I am curious as to where this information is still held. I don’t
think this is causing any badness to the cluster, but I would like to get rid
of this if possible.

INFO  [GossipStage:83] 2016-04-07 16:38:07,859  Gossiper.java:998 - InetAddress /10.1.200.14 is now DOWN
INFO  [GossipStage:83] 2016-04-07 16:38:07,861  StorageService.java:1914 - Removing tokens[BLAH] for /10.1.200.14

Thanks !



--

Joel Knighton
Cassandra Developer | joel.knigh...@datastax.com

Re: Efficiently filtering results directly in CS

2016-04-07 Thread Jonathan Haddad
What is CS?

On Thu, Apr 7, 2016 at 10:03 AM Kevin Burton  wrote:

> I have a paging model whereby we stream data from CS by fetching 'pages'
> thereby reading (sequentially) entire datasets.
>
> We're using the bucket approach where we write data for 5 minutes, then we
> can just fetch the bucket for that range.
>
> Our app now has TONS of data and we have a piece of middleware that
> filters it based on the client requests.
>
> So if they only want english they just get english and filter away about
> 60% of our data.
>
> But it doesn't support condition pushdown.  So ALL this data has to be
> sent from our CS boxes to our middleware and filtered there (wasting a lot
> of network IO).
>
> Is there a way (including refactoring the code) that I could push this
> into CS?  Maybe some way I could discover the CS topology and put daemons
> on each of our CS boxes and fetch from CS directly (doing the filtering
> there).
>
> Thoughts?
>
> --
>
> We’re hiring if you know of any awesome Java Devops or Linux Operations
> Engineers!
>
> Founder/CEO Spinn3r.com
> Location: *San Francisco, CA*
> blog: http://burtonator.wordpress.com
> … or check out my Google+ profile
> 
>
>


Efficiently filtering results directly in CS

2016-04-07 Thread Kevin Burton
I have a paging model whereby we stream data from CS by fetching 'pages'
thereby reading (sequentially) entire datasets.

We're using the bucket approach where we write data for 5 minutes, then we
can just fetch the bucket for that range.

Our app now has TONS of data and we have a piece of middleware that filters
it based on the client requests.

So if they only want english they just get english and filter away about
60% of our data.

But it doesn't support condition pushdown.  So ALL this data has to be sent
from our CS boxes to our middleware and filtered there (wasting a lot of
network IO).

Is there a way (including refactoring the code) that I could push this
into CS?  Maybe some way I could discover the CS topology and put daemons
on each of our CS boxes and fetch from CS directly (doing the filtering
there).

Thoughts?
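
As a starting point for the topology-discovery part, the DataStax Java driver
exposes token ranges and their replicas through cluster metadata. A minimal
sketch (assuming Java driver 2.1+; the contact point and keyspace name are
placeholders, not real values from this setup):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Host;
import com.datastax.driver.core.Metadata;
import com.datastax.driver.core.TokenRange;
import java.util.Set;

public class TopologyProbe {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder()
                .addContactPoint("127.0.0.1").build()) {
            Metadata metadata = cluster.getMetadata();
            // Each token range maps to the replicas that own it; a filtering
            // daemon co-located on each node could restrict itself to the
            // ranges its own host owns and fetch/filter only local data.
            for (TokenRange range : metadata.getTokenRanges()) {
                Set<Host> replicas = metadata.getReplicas("my_keyspace", range);
                System.out.println(range + " -> " + replicas);
            }
        }
    }
}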

-- 

We’re hiring if you know of any awesome Java Devops or Linux Operations
Engineers!

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile



Re: Removing a DC

2016-04-07 Thread Joel Knighton
This sounds most like https://issues.apache.org/jira/browse/CASSANDRA-10371.

Are you on a version that could be affected by this issue?

Best,
Joel

On Thu, Apr 7, 2016 at 11:51 AM, Anubhav Kale wrote:

> Hello,
>
>
>
> We removed a DC using instructions from
> https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_decomission_dc_t.html
>
>
>
> After all nodes were gone,
>
>
>
> 1.   System.peers doesn’t have an entry for the nodes that were
> removed (confirmed via a cqlsh query with consistency ALL).
>
> 2.   Nodetool describecluster doesn’t show them.
>
> 3.   Nodetool gossipinfo shows them as “LEFT”.
>
>
>
> However, the logs continue to spew the lines below, and restarting the node
> doesn’t get rid of this. I am thinking a rolling restart of all nodes might
> fix it, but I am curious as to where this information is still held. I don’t
> think this is causing any badness to the cluster, but I would like to get
> rid of this if possible.
>
>
>
> INFO  [GossipStage:83] 2016-04-07 16:38:07,859  Gossiper.java:998 -
> InetAddress /10.1.200.14 is now DOWN
>
> INFO  [GossipStage:83] 2016-04-07 16:38:07,861  StorageService.java:1914 -
> Removing tokens[*BLAH*] for /10.1.200.14
>
>
>
> Thanks !
>



-- 



Joel Knighton
Cassandra Developer | joel.knigh...@datastax.com


Removing a DC

2016-04-07 Thread Anubhav Kale
Hello,

We removed a DC using instructions from 
https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_decomission_dc_t.html

After all nodes were gone,


1.   System.peers doesn't have an entry for the nodes that were removed
(confirmed via a cqlsh query with consistency ALL).

2.   Nodetool describecluster doesn't show them.

3.   Nodetool gossipinfo shows them as "LEFT".

However, the logs continue to spew the lines below, and restarting the node
doesn't get rid of this. I am thinking a rolling restart of all nodes might
fix it, but I am curious as to where this information is still held. I don't
think this is causing any badness to the cluster, but I would like to get rid
of this if possible.

INFO  [GossipStage:83] 2016-04-07 16:38:07,859  Gossiper.java:998 - InetAddress /10.1.200.14 is now DOWN
INFO  [GossipStage:83] 2016-04-07 16:38:07,861  StorageService.java:1914 - Removing tokens[BLAH] for /10.1.200.14

Thanks !
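
If the stale gossip state never ages out on its own, one cleanup option I am
aware of (use with care: it forcibly purges a dead endpoint from gossip) is
assassinate, available as a nodetool command in 2.2+ and as the
Gossiper.unsafeAssassinateEndpoint JMX operation on earlier versions:

nodetool assassinate 10.1.200.14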


Cassandra nodes using internal network to try and talk externally

2016-04-07 Thread Chris Elsmore
Hi,

I have a Cassandra 2.2.5 cluster with a datacenter DC03 with 5 nodes in a ring 
and I have DC04 with one node. 

Setup by default with all nodes talking on the external interfaces works well, 
no problems, all nodes in each DC can see and talk to each other.

I’m trying to follow the instructions here
http://docs.datastax.com/en/cassandra/2.2/cassandra/configuration/configMultiNetworks.html
for the node in DC04, in preparation for adding a new node.

When I follow the instructions to set listen_address to the internal
address, broadcast_address to the external address, and
listen_on_broadcast_address to true, the nodes in DC03 can connect but do not
handshake with the node in DC04. The output of ‘lsof -i -P | grep 7000’ shows
that the node in DC04 is trying to connect to the IPs of the nodes in DC03
over the internal network, which obviously doesn’t work.

Any clues? I’m at a loss!
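
For concreteness, the relevant cassandra.yaml settings on the DC04 node would
look roughly like this (a sketch; the addresses are illustrative, not mine):

# cassandra.yaml
listen_address: 10.0.0.5           # internal interface this node binds
broadcast_address: 203.0.113.5     # external address advertised to other DCs
listen_on_broadcast_address: true  # also bind the external address

With GossipingPropertyFileSnitch, cassandra-rackdc.properties can also set
prefer_local=true so that same-DC traffic stays on the internal addresses.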


Chris



Re: Cassandra Single Node Setup Questions

2016-04-07 Thread Jack Krupansky
Not that we aren't enthusiastic about you moving to Cassandra, but it needs
to be for the right reasons, and for Cassandra the right reasons are
scaling and HA.

In case it's not obvious, I would make a really lousy used-car or
real-estate/time-share salesman!

-- Jack Krupansky

On Thu, Apr 7, 2016 at 10:13 AM, Eric Evans  wrote:

> On Wed, Apr 6, 2016 at 9:15 AM, Bhupendra Baraiya
>  wrote:
> >
> > The main reason we want to migrate to Cassandra is we have a
> denormalized data structure in an MS SQL Server database and we want to
> move to an open-source database...
>
>
> If it all boils down to this, then you might want to consider MySQL or
> Postgres.
>
>
> --
> Eric Evans
> eev...@wikimedia.org
>


Re: Cassandra Single Node Setup Questions

2016-04-07 Thread Eric Evans
On Wed, Apr 6, 2016 at 9:15 AM, Bhupendra Baraiya wrote:
>
> The main reason we want to migrate to Cassandra is we have a denormalized
> data structure in an MS SQL Server database and we want to move to an
> open-source database...


If it all boils down to this, then you might want to consider MySQL or Postgres.


-- 
Eric Evans
eev...@wikimedia.org


Re: secondary index queries with thrift in cassandra 3.x supported ?

2016-04-07 Thread Sam Tunnicliffe
That certainly looks like a bug, would you mind opening a ticket at
https://issues.apache.org/jira/browse/CASSANDRA please?

Thanks,
Sam

On Thu, Apr 7, 2016 at 2:19 PM, Ivan Georgiev  wrote:

> Hi, are secondary index queries with thrift supported in Cassandra 3.x ?
> Asking as I am not able to get them working.
>
> I am doing a get_range_slices call with row_filter set in the KeyRange
> property, but I am getting an exception in the server with the following
> trace:
>
>
>
> INFO   | jvm 1| 2016/04/07 14:56:35 | 14:56:35.403 [Thrift:16] DEBUG
> o.a.cassandra.service.ReadCallback - Failed; received 0 of 1 responses
>
> INFO   | jvm 1| 2016/04/07 14:56:35 | 14:56:35.404
> [SharedPool-Worker-1] WARN  o.a.c.c.AbstractLocalAwareExecutorService -
> Uncaught exception on thread Thread[SharedPool-Worker-1,5,main]: {}
>
> INFO   | jvm 1| 2016/04/07 14:56:35 | java.lang.RuntimeException:
> java.lang.NullPointerException
>
> INFO   | jvm 1| 2016/04/07 14:56:35 |   at
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2450)
> ~[apache-cassandra-3.0.4.jar:3.0.4]
>
> INFO   | jvm 1| 2016/04/07 14:56:35 |   at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> ~[na:1.8.0_72]
>
> INFO   | jvm 1| 2016/04/07 14:56:35 |   at
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
> ~[apache-cassandra-3.0.4.jar:3.0.4]
>
> INFO   | jvm 1| 2016/04/07 14:56:35 |   at
> org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105)
> [apache-cassandra-3.0.4.jar:3.0.4]
>
> INFO   | jvm 1| 2016/04/07 14:56:35 |   at
> java.lang.Thread.run(Thread.java:745) [na:1.8.0_72]
>
> INFO   | jvm 1| 2016/04/07 14:56:35 | Caused by:
> java.lang.NullPointerException: null
>
> INFO   | jvm 1| 2016/04/07 14:56:35 |   at
> org.apache.cassandra.index.internal.keys.KeysSearcher.filterIfStale(KeysSearcher.java:155)
> ~[apache-cassandra-3.0.4.jar:3.0.4]
>
> INFO   | jvm 1| 2016/04/07 14:56:35 |   at
> org.apache.cassandra.index.internal.keys.KeysSearcher.access$300(KeysSearcher.java:36)
> ~[apache-cassandra-3.0.4.jar:3.0.4]
>
> INFO   | jvm 1| 2016/04/07 14:56:35 |   at
> org.apache.cassandra.index.internal.keys.KeysSearcher$1.prepareNext(KeysSearcher.java:104)
> ~[apache-cassandra-3.0.4.jar:3.0.4]
>
> INFO   | jvm 1| 2016/04/07 14:56:35 |   at
> org.apache.cassandra.index.internal.keys.KeysSearcher$1.hasNext(KeysSearcher.java:70)
> ~[apache-cassandra-3.0.4.jar:3.0.4]
>
> INFO   | jvm 1| 2016/04/07 14:56:35 |   at
> org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:72)
> ~[apache-cassandra-3.0.4.jar:3.0.4]
>
> INFO   | jvm 1| 2016/04/07 14:56:35 |   at
> org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:295)
> ~[apache-cassandra-3.0.4.jar:3.0.4]
>
> INFO   | jvm 1| 2016/04/07 14:56:35 |   at
> org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:134)
> ~[apache-cassandra-3.0.4.jar:3.0.4]
>
> INFO   | jvm 1| 2016/04/07 14:56:35 |   at
> org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:127)
> ~[apache-cassandra-3.0.4.jar:3.0.4]
>
> INFO   | jvm 1| 2016/04/07 14:56:35 |   at
> org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:123)
> ~[apache-cassandra-3.0.4.jar:3.0.4]
>
> INFO   | jvm 1| 2016/04/07 14:56:35 |   at
> org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:65)
> ~[apache-cassandra-3.0.4.jar:3.0.4]
>
> INFO   | jvm 1| 2016/04/07 14:56:35 |   at
> org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:289)
> ~[apache-cassandra-3.0.4.jar:3.0.4]
>
> INFO   | jvm 1| 2016/04/07 14:56:35 |   at
> org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1792)
> ~[apache-cassandra-3.0.4.jar:3.0.4]
>
> INFO   | jvm 1| 2016/04/07 14:56:35 |   at
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2446)
> ~[apache-cassandra-3.0.4.jar:3.0.4]
>
> INFO   | jvm 1| 2016/04/07 14:56:35 |   ... 4 common
> frames omitted
>
>
>
> Are we still able to do thrift secondary index queries? Using Cassandra
> 3.0.4. The same call works fine with Cassandra 2.2.5.
>
>
>
> Regards:
>
> Ivan
>


secondary index queries with thrift in cassandra 3.x supported ?

2016-04-07 Thread Ivan Georgiev
Hi, are secondary index queries with thrift supported in Cassandra 3.x ?
Asking as I am not able to get them working.

I am doing a get_range_slices call with row_filter set in the KeyRange
property, but I am getting an exception in the server with the following
trace:

 

INFO   | jvm 1| 2016/04/07 14:56:35 | 14:56:35.403 [Thrift:16] DEBUG o.a.cassandra.service.ReadCallback - Failed; received 0 of 1 responses

INFO   | jvm 1| 2016/04/07 14:56:35 | 14:56:35.404 [SharedPool-Worker-1] WARN  o.a.c.c.AbstractLocalAwareExecutorService - Uncaught exception on thread Thread[SharedPool-Worker-1,5,main]: {}

INFO   | jvm 1| 2016/04/07 14:56:35 | java.lang.RuntimeException: java.lang.NullPointerException

INFO   | jvm 1| 2016/04/07 14:56:35 |   at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2450) ~[apache-cassandra-3.0.4.jar:3.0.4]

INFO   | jvm 1| 2016/04/07 14:56:35 |   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_72]

INFO   | jvm 1| 2016/04/07 14:56:35 |   at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) ~[apache-cassandra-3.0.4.jar:3.0.4]

INFO   | jvm 1| 2016/04/07 14:56:35 |   at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [apache-cassandra-3.0.4.jar:3.0.4]

INFO   | jvm 1| 2016/04/07 14:56:35 |   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_72]

INFO   | jvm 1| 2016/04/07 14:56:35 | Caused by: java.lang.NullPointerException: null

INFO   | jvm 1| 2016/04/07 14:56:35 |   at org.apache.cassandra.index.internal.keys.KeysSearcher.filterIfStale(KeysSearcher.java:155) ~[apache-cassandra-3.0.4.jar:3.0.4]

INFO   | jvm 1| 2016/04/07 14:56:35 |   at org.apache.cassandra.index.internal.keys.KeysSearcher.access$300(KeysSearcher.java:36) ~[apache-cassandra-3.0.4.jar:3.0.4]

INFO   | jvm 1| 2016/04/07 14:56:35 |   at org.apache.cassandra.index.internal.keys.KeysSearcher$1.prepareNext(KeysSearcher.java:104) ~[apache-cassandra-3.0.4.jar:3.0.4]

INFO   | jvm 1| 2016/04/07 14:56:35 |   at org.apache.cassandra.index.internal.keys.KeysSearcher$1.hasNext(KeysSearcher.java:70) ~[apache-cassandra-3.0.4.jar:3.0.4]

INFO   | jvm 1| 2016/04/07 14:56:35 |   at org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:72) ~[apache-cassandra-3.0.4.jar:3.0.4]

INFO   | jvm 1| 2016/04/07 14:56:35 |   at org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:295) ~[apache-cassandra-3.0.4.jar:3.0.4]

INFO   | jvm 1| 2016/04/07 14:56:35 |   at org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:134) ~[apache-cassandra-3.0.4.jar:3.0.4]

INFO   | jvm 1| 2016/04/07 14:56:35 |   at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:127) ~[apache-cassandra-3.0.4.jar:3.0.4]

INFO   | jvm 1| 2016/04/07 14:56:35 |   at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:123) ~[apache-cassandra-3.0.4.jar:3.0.4]

INFO   | jvm 1| 2016/04/07 14:56:35 |   at org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:65) ~[apache-cassandra-3.0.4.jar:3.0.4]

INFO   | jvm 1| 2016/04/07 14:56:35 |   at org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:289) ~[apache-cassandra-3.0.4.jar:3.0.4]

INFO   | jvm 1| 2016/04/07 14:56:35 |   at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1792) ~[apache-cassandra-3.0.4.jar:3.0.4]

INFO   | jvm 1| 2016/04/07 14:56:35 |   at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2446) ~[apache-cassandra-3.0.4.jar:3.0.4]

INFO   | jvm 1| 2016/04/07 14:56:35 |   ... 4 common frames omitted

 

Are we still able to do thrift secondary index queries? Using Cassandra
3.0.4. The same call works fine with Cassandra 2.2.5.

 

Regards:

Ivan



Cassandra experts/consulting in Russia

2016-04-07 Thread Roman Skvazh
Hello guys!
Can you suggest a consulting company or an Apache Cassandra specialist in
Russia? We need expert support/consulting for our production clusters.

Thank you!

———
Roman Skvazh



Mapping a continuous range to a discrete value

2016-04-07 Thread Peer, Oded
I have a table mapping continuous ranges to discrete values.

CREATE TABLE range_mapping (k int, lower int, upper int, mapped_value int, 
PRIMARY KEY (k, lower, upper));
INSERT INTO range_mapping (k, lower, upper, mapped_value) VALUES (0, 0, 99, 0);
INSERT INTO range_mapping (k, lower, upper, mapped_value) VALUES (0, 100, 199, 
100);
INSERT INTO range_mapping (k, lower, upper, mapped_value) VALUES (0, 200, 299, 
200);

I then want to query this table to find the mapping of a specific value.
In SQL I would use: select mapped_value from range_mapping where k=0 and ? 
between lower and upper

If the variable is bound to the value 150 then the mapped_value returned is 100.

I can't use the same type of query in CQL.
Using the query "select * from range_mapping where k = 0 and lower <= 150 and 
upper >= 150;" returns an error "Clustering column "upper" cannot be restricted 
(preceding column "lower" is restricted by a non-EQ relation)"

I thought of using multi-column restrictions but they don't work as I expected 
as the following query returns two rows instead of the one I expected:

select * from range_mapping where k = 0 and (lower,upper) <= (150,999) and 
(lower,upper) >= (-999,150);

k | lower | upper | mapped_value
---+-------+-------+--------------
0 |     0 |    99 |            0
0 |   100 |   199 |          100

I'd appreciate any thoughts on the subject.
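
One client-side workaround, assuming the ranges are non-overlapping as in the
sample data above: restrict only the first clustering column, take the
nearest row at or below the probe value, and verify the upper bound in
application code. A sketch:

SELECT lower, upper, mapped_value FROM range_mapping
WHERE k = 0 AND lower <= 150
ORDER BY lower DESC LIMIT 1;

-- Returns (100, 199, 100); the application then checks upper >= 150 to
-- confirm that 150 really falls inside the returned range.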



RE: all the nodes are not reachable when running massive deletes

2016-04-07 Thread Paco Trujillo
Well, then you could try to replace this node as soon as you have more nodes
available. I would use this procedure, as I believe it is the most efficient
one:
http://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_replace_node_t.html

It is not always the same node; it is always one node from the seven in the
cluster that has the high load, but not always the same one.

Regarding the hardware questions (output from one of the nodes; all of them
have the same configuration):

Disk:


-  We use SSD disks.

-  Output from iostat -mx 5 100:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1,00    0,00    0,40    0,03    0,00   98,57

Device:  rrqm/s  wrqm/s    r/s    w/s   rMB/s   wMB/s avgrq-sz avgqu-sz  await  svctm  %util
sda        0,00    0,00   0,00   0,20    0,00    0,00     8,00     0,00   0,00   0,00   0,00
sdb        0,00    0,00   0,00   0,00    0,00    0,00     0,00     0,00   0,00   0,00   0,00
sdc        0,00    0,00   0,00   0,00    0,00    0,00     0,00     0,00   0,00   0,00   0,00
sdd        0,00    0,20   0,00   0,40    0,00    0,00    12,00     0,00   2,50   2,50   0,10


-  Logs: I do not see anything in the messages log except this:

Apr  3 03:07:01 GT-cassandra7 rsyslogd: [origin software="rsyslogd"
swVersion="5.8.10" x-pid="1504" x-info="http://www.rsyslog.com"] rsyslogd was HUPed
Apr  3 18:24:55 GT-cassandra7 ntpd[1847]: 0.0.0.0 06a8 08 no_sys_peer
Apr  4 06:56:18 GT-cassandra7 ntpd[1847]: 0.0.0.0 06b8 08 no_sys_peer
Apr  3 18:24:55 GT-cassandra7 ntpd[1847]: 0.0.0.0 06a8 08 no_sys_peer
Apr  4 06:56:18 GT-cassandra7 ntpd[1847]: 0.0.0.0 06b8 08 no_sys_peer

CPU:


-  General use: 1–4%

-  Worst case: 98%. This is when the problem appears: running massive
deletes (even on a different machine from the one receiving the deletes) or
running a repair.

RAM:


-  We are using CMS.

-  Each node has 16GB, and we dedicate to Cassandra:

o   MAX_HEAP_SIZE="10G"

o   HEAP_NEWSIZE="800M"


Regarding the rest of the questions you mention:


-  Clients: we use the DataStax Java driver with this configuration:

//Get contact points
String[] contactPoints = this.environment.getRequiredProperty(CASSANDRA_CLUSTER_URL).split(",");
cluster = com.datastax.driver.core.Cluster.builder()
        .addContactPoints(contactPoints)
        //.addContactPoint(this.environment.getRequiredProperty(CASSANDRA_CLUSTER_URL))
        .withCredentials(
                this.environment.getRequiredProperty(CASSANDRA_CLUSTER_USERNAME),
                this.environment.getRequiredProperty(CASSANDRA_CLUSTER_PASSWORD))
        .withQueryOptions(new QueryOptions()
                .setConsistencyLevel(ConsistencyLevel.QUORUM))
        //.withLoadBalancingPolicy(new TokenAwarePolicy(new DCAwareRoundRobinPolicy(CASSANDRA_PRIMARY_CLUSTER)))
        .withLoadBalancingPolicy(new TokenAwarePolicy(new RoundRobinPolicy()))
        //.withLoadBalancingPolicy(new TokenAwarePolicy((LoadBalancingPolicy) new RoundRobinBalancingPolicy()))
        .withRetryPolicy(new LoggingRetryPolicy(DowngradingConsistencyRetryPolicy.INSTANCE))
        .withPort(Integer.parseInt(this.environment.getRequiredProperty(CASSANDRA_CLUSTER_PORT)))
        .build();

So requests should be evenly distributed.


-  Deletes are contained in a CQL file, and I am using cqlsh to execute
them. I will try to run the deletes in small batches and against separate
nodes (see the sketch below), but the same problem appears when running
repairs.
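
A minimal way to pace the deletes from the shell (a sketch: the file name,
chunk size, and sleep are illustrative, and it assumes one DELETE statement
per line of the file):

# split deletes.cql into 500-statement chunks and run them with pauses
split -l 500 deletes.cql delete_chunk_
for f in delete_chunk_*; do
    cqlsh some-node -f "$f"
    sleep 5   # give compaction and GC some breathing room between chunks
done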

I think the problem is related to one specific column family:

CREATE TABLE snpaware.snpsearch (
idline1 bigint,
idline2 bigint,
partid int,
id uuid,
alleles int,
coverage int,
distancetonext int,
distancetonextbyline int,
distancetoprev int,
distancetoprevbyline int,
frequency double,
idindividual bigint,
idindividualmorph bigint,
idreferencebuild bigint,
isinexon boolean,
isinorf boolean,
max_length int,
morphid bigint,
position int,
qualityflag int,
ranking int,
referencebuildlength int,
snpsearchid uuid,
synonymous boolean,
PRIMARY KEY ((idline1, idline2, partid), id)
) WITH CLUSTERING ORDER BY (id ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = 'KEYS_ONLY'
AND comment = 'Table with the snp between lines'
AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'}
AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND dclocal_read_repair_chance = 0.0
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND index_interval = 128
AND memtable_flush_period_in_ms = 0
AND populate_io_cache_on_flush = false
AND read_repair_chance = 0.1
AN