Hi, I checked Startio Cassandra but couldn't get any good documentation for the same. Can you give me some pointers on how to use it. Do I have to build it from the source or I can use it directly with jar files as in case of Stargate. Since I was looking for solution which I don't need a full build and can be used with existing tar of cassandra because I have some restrictions on installing stuff on my server.
Thanks, Mehak On Mon, Mar 23, 2015 at 8:17 AM, Asit KAUSHIK <asitkaushikno...@gmail.com> wrote: > i am using Startio Cassandra it way better than stargate as it works on > the latest release of Cassandra and is better on my performance. > > we are using it for fulltext search use case > > Regards > Asit > > On Sun, Mar 22, 2015 at 12:14 PM, Mehak Mehta <meme...@cs.stonybrook.edu> > wrote: > >> Hi, >> >> On the basis of some suggestions, I tried using tuplejump for >> multidimensional >> queries. Since other mostly needed root permissions (for building ) which I >> don't have on my cluster account. >> >> I found a major problem in tuplejump (stargate-core). When I am using it >> with a list type field in my table. It stops working. >> For e.g. >> >> create table person ( >> id int primary key, >> isActive boolean, >> age int, >> eyeColor varchar, >> name text, >> gender varchar, >> company varchar, >> email varchar, >> phone varchar, >> address text, >> points list<double>, >> stargate text >> ); >> >> with indexing as: >> CREATE CUSTOM INDEX person_idx ON PERSON(stargate) USING some >> 'com.tuplejump.stargate.RowIndex' WITH options = >> { >> 'sg_options':'{ >> "fields":{ >> "eyeColor":{}, >> "age":{}, >> "phone":{} >> } >> }' >> }; >> >> If I insert data in the table along with points list. The following query >> won't give any results (0 rows): >> >> SELECT * FROM RESULTS1 WHERE stargate ='{ >> filter: { >> type: "range", >> field: "x", >> lower: 0 >> } >> }'; >> >> I tried removing points list<double> from the table and it works i.e. >> same query will return results. >> Can somebody help me with this problem as I couldn't find much support >> from Stargate. >> >> Please note that I am using Cassandra 2.0.9 compatible with Stargate-core >> as given in link ( >> http://stargate-core.readthedocs.org/en/latest/quickstart.html). >> >> Thanks, >> Mehak >> >> >> On Wed, Mar 18, 2015 at 5:45 AM, Andres de la Peña <adelap...@stratio.com >> > wrote: >> >>> Hi, >>> >>> With Stratio Cassandra you can create Lucene based indexes for >>> multidimensional queries this way: >>> >>> ALTER TABLE images.results1 ADD lucene text ; >>> >>> CREATE CUSTOM INDEX lucene_idx ON images.results1 (lucene) >>> USING 'com.stratio.cassandra.index.RowIndex' >>> WITH OPTIONS = { >>> 'refresh_seconds':'1', >>> 'schema':'{ >>> fields:{ >>> image_caseid:{type:"string"}, >>> x:{type:"double"}, >>> y:{type:"double"} } } '}; >>> >>> Then you can perform the query using the dummy column: >>> >>> SELECT * FROM images.results1 WHERE lucene='{ filter:{type:"boolean", >>> must:[ >>> {field:"image_caseid", type:"match", value:"mehak"}, >>> {field:"x", type:"range", lower:100}, >>> {field:"y", type:"range", lower:100} >>> ]}}'; >>> >>> However, you can take advantage of partition key to route the query only >>> to the nodes owning the data: >>> >>> SELECT * FROM images.results1 WHERE image_caseid='mehak' AND lucene='{ >>> filter:{type:"boolean", must:[ >>> {field:"x", type:"range", lower:100}, >>> {field:"y", type:"range", lower:100} >>> ]}}'; >>> >>> Or, even better: >>> >>> SELECT * FROM images.results1 WHERE image_caseid='mehak' AND x>100 AND >>> lucene='{ filter:{field:"y", type:"range", lower:100}}'; >>> >>> Additionally, if your data are geospatial (latitude and longitude), soon >>> you will can use the incoming spatial features. >>> >>> >>> >>> 2015-03-17 23:01 GMT+01:00 Mehak Mehta <meme...@cs.stonybrook.edu>: >>> >>>> Sorry I gave you wrong table definition for query. Here a composite key >>>> of image_caseid, x and uuid which is unique. I have used x in clustering >>>> columns to query it. And used secondary index on y column. >>>> >>>> 1. Example >>>> *cqlsh:images> CREATE TABLE images.results1 (uuid uuid, >>>> analysis_execution_id varchar, analysis_execution_uuid uuid, x double, y >>>> double, submit_date timestamp, points list<double>, PRIMARY KEY >>>> ((image_caseid),x,uuid));* >>>> *cqlsh:images> create index results1_y on results1(y);* >>>> >>>> In the below query you can see I have image_caseid as primary key which >>>> is filtered. Even then it is giving error that "*No indexed columns >>>> present*" >>>> >>>> *cqlsh:images> select * from results1 where image_caseid='mehak' and x >>>> > 100 and y<100 order by image_caseid asc;* >>>> *code=2200 [Invalid query] message="No indexed columns present in >>>> by-columns clause with Equal operator"* >>>> >>>> 2. Example >>>> I also tried including both x and y columns as composite key even then >>>> query gives following error: >>>> >>>> *cqlsh:images> CREATE TABLE images.results1 (uuid uuid, >>>> analysis_execution_id varchar, analysis_execution_uuid uuid, x double, y >>>> double, submit_date timestamp, points list<double>, PRIMARY KEY >>>> ((image_caseid),x,y,uuid));* >>>> >>>> *cqlsh:images> select * from results1 where image_caseid='mehak' and x >>>> > 100 and y<100 order by image_caseid asc;* >>>> *code=2200 [Invalid query] message="PRIMARY KEY column "y" cannot be >>>> restricted (preceding column "ColumnDefinition{name=x, >>>> type=org.apache.cassandra.db.marshal.DoubleType, kind=CLUSTERING_COLUMN, >>>> componentIndex=0, indexName=null, indexType=null}" is either not restricted >>>> or by a non-EQ relation)"* >>>> >>>> Thanks, >>>> Mehak >>>> >>>> >>>> On Tue, Mar 17, 2015 at 5:19 PM, Jack Krupansky < >>>> jack.krupan...@gmail.com> wrote: >>>> >>>>> Yeah, you may have to add a dummy column populated with a constant, or >>>>> just pick some other additional column to index that you can apply = to. >>>>> It's just a requirement to assure that a filtered select isn't too-too >>>>> slow. >>>>> >>>>> Uh.... if you are applying = to your primary key then there is no need >>>>> for any filtering. What did you really mean to do? It makes no sense the >>>>> way you have it! >>>>> >>>>> Either go with DSE Search/Solr, or google "Tuplejump Stargate" or >>>>> "Stratio". >>>>> >>>>> -- Jack Krupansky >>>>> >>>>> On Tue, Mar 17, 2015 at 4:51 PM, Mehak Mehta < >>>>> meme...@cs.stonybrook.edu> wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> In my case I am just trying to do range queries on 2 dimensions i.e. >>>>>> x and y. >>>>>> >>>>>> *cqlsh:images> CREATE TABLE images.results1 (uuid uuid, >>>>>> analysis_execution_id varchar, analysis_execution_uuid uuid, x double, y >>>>>> double, submit_date timestamp, points list<double>, PRIMARY KEY >>>>>> (image_caseid));* >>>>>> >>>>>> With secondary indexes on x and y: >>>>>> *cqlsh:images> create index results1_y on results1(y);* >>>>>> *cqlsh:images> create index results1_x on results1(x);* >>>>>> >>>>>> But when I try the following the query it is not allowed with an >>>>>> error. >>>>>> >>>>>> *cqlsh:images> select * from results1 where image_caseid='mehak' and >>>>>> x > 100 and y <100 allow filtering;* >>>>>> *code=2200 [Invalid query] message="No indexed columns present in >>>>>> by-columns clause with Equal operator"* >>>>>> >>>>>> Can you tell me why this is not working. I can't understand why >>>>>> Cassandra is not allowing me to use Non EQ range query on secondary >>>>>> indexes >>>>>> even when I have restricted my primary key by EQ as specified in error >>>>>> message. >>>>>> >>>>>> Also please elaborate on how to use Lucene/Solr in this case with >>>>>> Cassandra. >>>>>> >>>>>> Thanks, >>>>>> Mehak >>>>>> >>>>>> On Tue, Mar 17, 2015 at 2:26 PM, 健峰 高 <g...@beecloud.cn> wrote: >>>>>> >>>>>>> Lucene or Solr is better. I’m using lucene >>>>>>> >>>>>>> 在 2015年3月18日,上午2:11,Jack Krupansky <jack.krupan...@gmail.com> 写道: >>>>>>> >>>>>>> 1. Create multiple secondary indexes, one for each non-key column >>>>>>> you need to index on. Not recommended. Considered an anti-pattern for >>>>>>> Cassandra. >>>>>>> 2. Use DSE Search/Solr. >>>>>>> 3. Use Lucene-based indexing with TumpleJump/Stargate or Stratio. >>>>>>> >>>>>>> -- Jack Krupansky >>>>>>> >>>>>>> On Tue, Mar 17, 2015 at 1:49 PM, Mehak Mehta < >>>>>>> meme...@cs.stonybrook.edu> wrote: >>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> I want to perform range queries (as in x and y ranges) on a large >>>>>>>> data billions of rows. >>>>>>>> CQL allows me to put Non EQ restrictions on only one of the >>>>>>>> clustering columns. >>>>>>>> Its not allowing me to filter the data using any other column even >>>>>>>> with use of Allow Filtering option. >>>>>>>> >>>>>>>> cqlsh:images> *select * from results1 where image_caseid='mehak' >>>>>>>> and x > 100 and y <100 order by x allow filtering;* >>>>>>>> *code=2200 [Invalid query] message="No indexed columns present in >>>>>>>> by-columns clause with Equal operator"* >>>>>>>> >>>>>>>> My table definition is : >>>>>>>> >>>>>>>> *CREATE TABLE images.results1 (uuid uuid, analysis_execution_id >>>>>>>> varchar, analysis_execution_uuid uuid, x double, y double, loc >>>>>>>> varchar, w >>>>>>>> double, h double, normalized varchar, type varchar, filehost varchar, >>>>>>>> filename varchar, image_uuid uuid, image_uri varchar, image_caseid >>>>>>>> varchar, >>>>>>>> image_mpp_x double, image_mpp_y double, image_width double, >>>>>>>> image_height >>>>>>>> double, objective double, cancer_type varchar, Area float, submit_date >>>>>>>> timestamp, points list<double>, PRIMARY KEY ((image_caseid),x));* >>>>>>>> >>>>>>>> Kindly suggest me how to model is data and perform 2d or in general >>>>>>>> multi dimension range queries in Cassandra efficiently. >>>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Mehak >>>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >>> >>> -- >>> >>> Andrés de la Peña >>> >>> >>> <http://www.stratio.com/> >>> Avenida de Europa, 26. Ática 5. 3ª Planta >>> 28224 Pozuelo de Alarcón, Madrid >>> Tel: +34 91 352 59 42 // *@stratiobd <https://twitter.com/StratioBD>* >>> >> >> >