i am using Startio Cassandra it way better than stargate as it works on the
latest release of Cassandra and is better on my performance.

we are using it for fulltext search use case

Regards
Asit

On Sun, Mar 22, 2015 at 12:14 PM, Mehak Mehta <meme...@cs.stonybrook.edu>
wrote:

> Hi,
>
> On the basis of some suggestions, I tried using tuplejump for multidimensional
> queries. Since other mostly needed root permissions (for building ) which I
> don't have on my cluster account.
>
> I found a major problem in tuplejump (stargate-core). When I am using it
> with a list type field in my table. It stops working.
> For e.g.
>
> create table person (
>     id int primary key,
>     isActive boolean,
>     age int,
>     eyeColor varchar,
>     name text,
>     gender varchar,
>     company varchar,
>     email varchar,
>     phone varchar,
>     address text,
>     points list<double>,
>     stargate text
> );
>
> with indexing as:
> CREATE CUSTOM INDEX person_idx ON PERSON(stargate) USING some
> 'com.tuplejump.stargate.RowIndex' WITH options =
> {
>         'sg_options':'{
>                 "fields":{
>                 "eyeColor":{},
>                         "age":{},
>                         "phone":{}
>                         }
>         }'
> };
>
> If I insert data in the table along with points list. The following query
> won't give any results (0 rows):
>
> SELECT * FROM RESULTS1 WHERE stargate ='{
>     filter: {
>         type: "range",
>         field: "x",
>         lower: 0
>     }
> }';
>
> I tried removing points list<double> from the table and it works i.e.
> same query will return results.
> Can somebody help me with this problem as I couldn't find much support
> from Stargate.
>
> Please note that I am using Cassandra 2.0.9 compatible with Stargate-core
> as given in link (
> http://stargate-core.readthedocs.org/en/latest/quickstart.html).
>
> Thanks,
> Mehak
>
>
> On Wed, Mar 18, 2015 at 5:45 AM, Andres de la Peña <adelap...@stratio.com>
> wrote:
>
>> Hi,
>>
>> With Stratio Cassandra you can create Lucene based indexes for
>> multidimensional queries this way:
>>
>> ALTER TABLE images.results1 ADD lucene text ;
>>
>> CREATE CUSTOM INDEX lucene_idx ON images.results1 (lucene)
>> USING 'com.stratio.cassandra.index.RowIndex'
>> WITH OPTIONS = {
>>  'refresh_seconds':'1',
>>  'schema':'{
>>   fields:{
>>   image_caseid:{type:"string"},
>> x:{type:"double"},
>> y:{type:"double"} } } '};
>>
>> Then you can perform the query using the dummy column:
>>
>> SELECT * FROM images.results1 WHERE lucene='{ filter:{type:"boolean",
>> must:[
>> {field:"image_caseid", type:"match", value:"mehak"},
>> {field:"x", type:"range", lower:100},
>> {field:"y", type:"range", lower:100}
>> ]}}';
>>
>> However, you can take advantage of partition key to route the query only
>> to the nodes owning the data:
>>
>> SELECT * FROM images.results1 WHERE image_caseid='mehak' AND lucene='{
>> filter:{type:"boolean", must:[
>> {field:"x", type:"range", lower:100},
>> {field:"y", type:"range", lower:100}
>> ]}}';
>>
>> Or, even better:
>>
>> SELECT * FROM images.results1 WHERE image_caseid='mehak' AND x>100 AND
>> lucene='{ filter:{field:"y", type:"range", lower:100}}';
>>
>> Additionally, if your data are geospatial (latitude and longitude), soon
>> you will can use the incoming spatial features.
>>
>>
>>
>> 2015-03-17 23:01 GMT+01:00 Mehak Mehta <meme...@cs.stonybrook.edu>:
>>
>>> Sorry I gave you wrong table definition for query. Here a composite key
>>> of image_caseid, x and uuid which is unique. I have used x in clustering
>>> columns to query it. And used secondary index on y column.
>>>
>>> 1. Example
>>> *cqlsh:images> CREATE TABLE images.results1 (uuid uuid,
>>> analysis_execution_id varchar, analysis_execution_uuid uuid, x  double, y
>>> double, submit_date timestamp, points list<double>,  PRIMARY KEY
>>> ((image_caseid),x,uuid));*
>>> *cqlsh:images> create index results1_y on results1(y);*
>>>
>>> In the below query you can see I have image_caseid as primary key which
>>> is filtered. Even then it is giving error that "*No indexed columns
>>> present*"
>>>
>>> *cqlsh:images> select * from results1 where image_caseid='mehak' and x >
>>> 100 and y<100 order by image_caseid asc;*
>>> *code=2200 [Invalid query] message="No indexed columns present in
>>> by-columns clause with Equal operator"*
>>>
>>> 2. Example
>>> I also tried including both x and y columns as composite key even then
>>> query gives following error:
>>>
>>> *cqlsh:images> CREATE TABLE images.results1 (uuid uuid,
>>> analysis_execution_id varchar, analysis_execution_uuid uuid, x  double, y
>>> double, submit_date timestamp, points list<double>,  PRIMARY KEY
>>> ((image_caseid),x,y,uuid));*
>>>
>>> *cqlsh:images> select * from results1 where image_caseid='mehak' and x >
>>> 100 and y<100 order by image_caseid asc;*
>>> *code=2200 [Invalid query] message="PRIMARY KEY column "y" cannot be
>>> restricted (preceding column "ColumnDefinition{name=x,
>>> type=org.apache.cassandra.db.marshal.DoubleType, kind=CLUSTERING_COLUMN,
>>> componentIndex=0, indexName=null, indexType=null}" is either not restricted
>>> or by a non-EQ relation)"*
>>>
>>> Thanks,
>>> Mehak
>>>
>>>
>>> On Tue, Mar 17, 2015 at 5:19 PM, Jack Krupansky <
>>> jack.krupan...@gmail.com> wrote:
>>>
>>>> Yeah, you may have to add a dummy column populated with a constant, or
>>>> just pick some other additional column to index that you can apply = to.
>>>> It's just a requirement to assure that a filtered select isn't too-too 
>>>> slow.
>>>>
>>>> Uh.... if you are applying = to your primary key then there is no need
>>>> for any filtering. What did you really mean to do? It makes no sense the
>>>> way you have it!
>>>>
>>>> Either go with DSE Search/Solr, or google "Tuplejump Stargate" or
>>>> "Stratio".
>>>>
>>>> -- Jack Krupansky
>>>>
>>>> On Tue, Mar 17, 2015 at 4:51 PM, Mehak Mehta <meme...@cs.stonybrook.edu
>>>> > wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> In my case I am just trying to do range queries on 2 dimensions i.e. x
>>>>> and y.
>>>>>
>>>>> *cqlsh:images> CREATE TABLE images.results1 (uuid uuid,
>>>>> analysis_execution_id varchar, analysis_execution_uuid uuid, x  double, y
>>>>> double, submit_date timestamp, points list<double>,  PRIMARY KEY
>>>>> (image_caseid));*
>>>>>
>>>>> With secondary indexes on x and y:
>>>>> *cqlsh:images> create index results1_y on results1(y);*
>>>>> *cqlsh:images> create index results1_x on results1(x);*
>>>>>
>>>>> But when I try the following the query it is not allowed with an
>>>>> error.
>>>>>
>>>>> *cqlsh:images> select * from results1 where image_caseid='mehak' and x
>>>>> > 100 and y <100 allow filtering;*
>>>>> *code=2200 [Invalid query] message="No indexed columns present in
>>>>> by-columns clause with Equal operator"*
>>>>>
>>>>> Can you tell me why this is not working. I can't understand why
>>>>> Cassandra is not allowing me to use Non EQ range query on secondary 
>>>>> indexes
>>>>> even when I have restricted my primary key by EQ as specified in error
>>>>> message.
>>>>>
>>>>> Also please elaborate on how to use Lucene/Solr in this case with
>>>>> Cassandra.
>>>>>
>>>>> Thanks,
>>>>> Mehak
>>>>>
>>>>> On Tue, Mar 17, 2015 at 2:26 PM, 健峰 高 <g...@beecloud.cn> wrote:
>>>>>
>>>>>> Lucene or Solr is better. I’m using lucene
>>>>>>
>>>>>> 在 2015年3月18日,上午2:11,Jack Krupansky <jack.krupan...@gmail.com> 写道:
>>>>>>
>>>>>> 1. Create multiple secondary indexes, one for each non-key column you
>>>>>> need to index on. Not recommended. Considered an anti-pattern for 
>>>>>> Cassandra.
>>>>>> 2. Use DSE Search/Solr.
>>>>>> 3. Use Lucene-based indexing with TumpleJump/Stargate or Stratio.
>>>>>>
>>>>>> -- Jack Krupansky
>>>>>>
>>>>>> On Tue, Mar 17, 2015 at 1:49 PM, Mehak Mehta <
>>>>>> meme...@cs.stonybrook.edu> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I want to perform range queries (as in x and y ranges) on a large
>>>>>>> data billions of rows.
>>>>>>> CQL allows me to put Non EQ restrictions on only one of the
>>>>>>> clustering columns.
>>>>>>> Its not allowing me to filter the data using any other column even
>>>>>>> with use of Allow Filtering option.
>>>>>>>
>>>>>>> cqlsh:images> *select * from results1 where image_caseid='mehak'
>>>>>>> and x > 100 and y <100 order by x allow filtering;*
>>>>>>> *code=2200 [Invalid query] message="No indexed columns present in
>>>>>>> by-columns clause with Equal operator"*
>>>>>>>
>>>>>>> My table definition is :
>>>>>>>
>>>>>>> *CREATE TABLE images.results1 (uuid uuid, analysis_execution_id
>>>>>>> varchar, analysis_execution_uuid uuid, x  double, y double, loc 
>>>>>>> varchar, w
>>>>>>> double, h double, normalized varchar, type varchar, filehost varchar,
>>>>>>> filename varchar, image_uuid uuid, image_uri varchar, image_caseid 
>>>>>>> varchar,
>>>>>>> image_mpp_x double, image_mpp_y double, image_width double, image_height
>>>>>>> double, objective double, cancer_type varchar,  Area float, submit_date
>>>>>>> timestamp, points list<double>,  PRIMARY KEY ((image_caseid),x));*
>>>>>>>
>>>>>>> Kindly suggest me how to model is data and perform 2d or in general
>>>>>>> multi dimension range queries in Cassandra efficiently.
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Mehak
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>>
>> --
>>
>> Andrés de la Peña
>>
>>
>> <http://www.stratio.com/>
>> Avenida de Europa, 26. Ática 5. 3ª Planta
>> 28224 Pozuelo de Alarcón, Madrid
>> Tel: +34 91 352 59 42 // *@stratiobd <https://twitter.com/StratioBD>*
>>
>
>

Reply via email to