Re: [EXTERNAL] RE: SASI queries- cqlsh vs java driver

Rahul Singh Wed, 27 Feb 2019 10:57:10 -0800

+1 on Datastax and could consider looking at Elassandra.

On Thu, Feb 7, 2019 at 9:14 AM Durity, Sean R <sean_r_dur...@homedepot.com>
wrote:


> Kenneth is right. Trying to port/support a relational model to a CQL model
> the way you are doing it is not going to go well. You won’t be able to
> scale or get the search flexibility that you want. It will make Cassandra
> seem like a bad fit. You want to play to Cassandra’s strengths –
> availability, low latency, scalability, etc. so you need to store the data
> the way you want to retrieve it (query first modeling!). You could look at
> defining the “right” partition and clustering keys, so that the searches
> are within a single, reasonably sized partition. And you could have lookup
> tables for other common search patterns (item_by_model_name, etc.)
>
>
>
> If that kind of modeling gets you to a situation where you have too many
> lookup tables to keep consistent, you could consider something like
> DataStax Enterprise Search (embedded SOLR) to create SOLR indexes on
> searchable fields. A SOLR query will typically be an order of magnitude
> slower than a partition key lookup, though.
>
>
>
> It really boils down to the purpose of the data store. If you are looking
> for primarily an “anything goes” search engine, Cassandra may not be a good
> choice. If you need Cassandra-level availability, extremely low latency
> queries (on known access patterns), high volume/low latency writes, easy
> scalability, etc. then you are going to have to rethink how you model the
> data.
>
>
>
>
>
> Sean Durity
>
>
>
> *From:* Kenneth Brotman <kenbrot...@yahoo.com.INVALID>
> *Sent:* Thursday, February 07, 2019 7:01 AM
> *To:* user@cassandra.apache.org
> *Subject:* [EXTERNAL] RE: SASI queries- cqlsh vs java driver
>
>
>
> Peter,
>
>
>
> Sounds like you may need to use a different architecture.  Perhaps you
> need something like Presto or Kafka as a part of the solution.  If the data
> from the legacy system is wrong for Cassandra it’s an ETL problem?  You’d
> have to transform the data you want to use with Cassandra so that a proper
> data model for Cassandra can be used.
>
>
>
> *From:* Peter Heitman [mailto:pe...@heitman.us <pe...@heitman.us>]
> *Sent:* Wednesday, February 06, 2019 10:05 PM
> *To:* user@cassandra.apache.org
> *Subject:* Re: SASI queries- cqlsh vs java driver
>
>
>
> Yes, I have read the material. The problem is that the application has a
> query facility available to the user where they can type in "(A = foo AND B
> = bar) OR C = chex" where A, B, and C are from a defined list of terms,
> many of which are columns in the mytable below while others are from other
> tables. This query facility was implemented and shipped years before we
> decided to move to Cassandra
>
> On Thu, Feb 7, 2019, 8:21 AM Kenneth Brotman <kenbrot...@yahoo.com.invalid>
> wrote:
>
> The problem is you’re not using a query first design.  I would recommend
> first reading chapter 5 of Cassandra: The Definitive Guide by Jeff
> Carpenter and Eben Hewitt.  It’s available free online at this link
> <https://urldefense.proofpoint.com/v2/url?u=https-3A__books.google.com_books-3Fid-3DuW-2DPDAAAQBAJ-26pg-3DPA79-26lpg-3DPA79-26dq-3Djeff-2Bcarpenter-2Bchapter-2B5-26source-3Dbl-26ots-3D58bUYyNM-2DJ-26sig-3DACfU3U22U58-2DQPlz6kzo0zziNF-2DbP30l4Q-26hl-3Den-26sa-3DX-26ved-3D2ahUKEwi0n-2DnWzajgAhXnHzQIHf6jBJIQ6AEwAXoECAgQAQ-23v-3Donepage-26q-3Djeff-2520carpenter-2520chapter-25205-26f-3Dfalse&d=DwMFaQ&c=MtgQEAMQGqekjTjiAhkudQ&r=aC_gxC6z_4f9GLlbWiKzHm1vucZTtVYWDDvyLkh8IaQ&m=dsY_P-wGUZe0KuIuE01HDz4w9EI5AH4457c9uWyQx5g&s=C6imJ8BRMoV5A9NzORjdrEq6B77ZSAEO9dP__FAXUz8&e=>
> .
>
>
>
> Kenneth Brotman
>
>
>
> *From:* Peter Heitman [mailto:pe...@heitman.us]
> *Sent:* Wednesday, February 06, 2019 6:33 PM
>
>
> *To:* user@cassandra.apache.org
> *Subject:* Re: SASI queries- cqlsh vs java driver
>
>
>
> Yes, I "know" that allow filtering is a sign of a (possibly fatal)
> inefficient data model. I haven't figured out how to do it correctly yet
>
> On Thu, Feb 7, 2019, 7:59 AM Kenneth Brotman <kenbrot...@yahoo.com.invalid>
> wrote:
>
> Exactly.  When you design your data model correctly you shouldn’t have to
> use ALLOW FILTERING in the queries.  That is not recommended.
>
>
>
> Kenneth Brotman
>
>
>
> *From:* Peter Heitman [mailto:pe...@heitman.us]
> *Sent:* Wednesday, February 06, 2019 6:09 PM
> *To:* user@cassandra.apache.org
> *Subject:* Re: SASI queries- cqlsh vs java driver
>
>
>
> You are completely right! My problem is that I am trying to port code for
> SQL to CQL for an application that provides the user with a relatively
> general search facility. The original implementation didn't worry about
> secondary indexes - it just took advantage of the ability to create
> arbitrarily complex queries with inner joins, left joins, etc. I am
> reimplimenting it to create a parse tree of CQL queries and doing the ANDs
> and ORs in the application. Of course once I get enough of this implemented
> I will have to load up the table with a large data set and see if it gives
> acceptable performance for our use case.
>
> On Wed, Feb 6, 2019, 8:52 PM Kenneth Brotman <kenbrotman@yahoo.cominvalid>
> wrote:
>
> Isn’t that a lot of SASI indexes for one table.  Could you denormalize
> more to reduce both columns per table and SASI indexes per table?  Eight
> SASI indexes on one table seems like a lot.
>
>
>
> Kenneth Brotman
>
>
>
> *From:* Peter Heitman [mailto:pe...@heitman.us]
> *Sent:* Tuesday, February 05, 2019 6:59 PM
> *To:* user@cassandra.apache.org
> *Subject:* Re: SASI queries- cqlsh vs java driver
>
>
>
> The table and secondary indexes look generally like this Note that I have
> changed the names of many of the columns to be generic since they aren't
> important to the question as far as I know. I left the actual names for
> those columns that I've created SASI indexes for. The query I use to try to
> create a PreparedStatement is:
>
>
>
> SELECT sql_id, type, cpe_id, serial, product_class, manufacturer,
> sw_version FROM mytable WHERE serial IN :v0 LIMIT :limit0 ALLOW FILTERING
>
>
>
> the schema cql statements are:
>
>
>
> CREATE TABLE IF NOT EXISTS mykeyspace.mytable (
>
>   id text,
>
>   sql_id bigint,
>
>   cpe_id text,
>
>   sw_version text,
>
>   hw_version text,
>
>   manufacturer text,
>
>   product_class text,
>
>   manufacturer_oui text,
>
>   description text,
>
>   periodic_inform_interval text,
>
>   restricted_mode_enabled text,
>
>   restricted_mode_reason text,
>
>   type text,
>
>   model_name text,
>
>   serial text,
>
>   mac text,
>
>    text,
>
>   generic0 timestamp,
>
>   household_id text,
>
>   generic1 int,
>
>   generic2 text,
>
>   generic3 text,
>
>   generic4 int,
>
>   generic5 int,
>
>   generic6 text,
>
>   generic7 text,
>
>   generic8 text,
>
>   generic9 text,
>
>   generic10 text,
>
>   generic11 timestamp,
>
>   generic12 text,
>
>   generic13 text,
>
>   generic14 timestamp,
>
>   generic15 text,
>
>   generic16 text,
>
>   generic17 text,
>
>   generic18 text,
>
>   generic19 text,
>
>   generic20 text,
>
>   generic21 text,
>
>   generic22 text,
>
>   generic23 text,
>
>   generic24 text,
>
>   generic25 text,
>
>   generic26 text,
>
>   generic27 text,
>
>   generic28 int,
>
>   generic29 int,
>
>   generic30 text,
>
>   generic31 text,
>
>   generic32 text,
>
>   generic33 text,
>
>   generic34 text,
>
>   generic35 int,
>
>   generic36 int,
>
>   generic37 int,
>
>   generic38 int,
>
>   generic39 text,
>
>   generic40 text,
>
>   generic41 text,
>
>   generic42 text,
>
>   generic43 text,
>
>   generic44 text,
>
>   generic45 text,
>
>   PRIMARY KEY (id)
>
> );
>
>
>
> CREATE INDEX IF NOT EXISTS bv_sql_id_idx ON mykeyspace.mytable (sql_id);
>
>
>
> CREATE CUSTOM INDEX IF NOT EXISTS bv_serial_idx ON mykeyspace.mytable
> (serial)
>
>        USING 'org.apache.cassandra.index.sasi.SASIIndex'
>
>        WITH OPTIONS = {'mode': 'CONTAINS', 'analyzer_class':
> 'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer',
> 'case_sensitive': 'false'};
>
>
>
> CREATE CUSTOM INDEX IF NOT EXISTS bv_cpe_id_idx ON mykeyspace.mytable
> (cpe_id)
>
>        USING 'org.apache.cassandra.index.sasi.SASIIndex'
>
>        WITH OPTIONS = {'mode': 'CONTAINS', 'analyzer_class':
> 'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer',
> 'case_sensitive': 'false'};
>
>
>
> CREATE CUSTOM INDEX IF NOT EXISTS bv_mac_idx ON mykeyspace.mytable (mac)
>
>        USING 'org.apache.cassandra.index.sasi.SASIIndex'
>
>        WITH OPTIONS = {'mode': 'CONTAINS', 'analyzer_class':
> 'orgapache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer',
> 'case_sensitive': 'false'};
>
>
>
> CREATE CUSTOM INDEX IF NOT EXISTS bv_manufacturer_idx ON
> mykeyspace.mytable (manufacturer)
>
>        USING 'org.apache.cassandra.index.sasi.SASIIndex'
>
>        WITH OPTIONS = {'mode': 'CONTAINS', 'analyzer_class':
> 'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer',
> 'case_sensitive': 'false'};
>
>
>
> CREATE CUSTOM INDEX IF NOT EXISTS bv_manufacturer_oui_idx ON
> mykeyspace.mytable (manufacturer_oui)
>
>        USING 'org.apache.cassandra.index.sasiSASIIndex'
>
>        WITH OPTIONS = {'mode': 'CONTAINS', 'analyzer_class':
> 'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer',
> 'case_sensitive': 'false'};
>
>
>
> CREATE CUSTOM INDEX IF NOT EXISTS bv_hw_version_idx ON mykeyspace.mytable
> (hw_version)
>
>        USING 'org.apache.cassandra.index.sasi.SASIIndex'
>
>        WITH OPTIONS = {'mode': 'CONTAINS', 'analyzer_class':
> 'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer',
> 'case_sensitive': 'false'};
>
>
>
> CREATE CUSTOM INDEX IF NOT EXISTS bv_sw_version_idx ON mykeyspace.mytable
> (sw_version)
>
>        USING 'org.apache.cassandra.index.sasi.SASIIndex'
>
>        WITH OPTIONS = {'mode': 'CONTAINS', 'analyzer_class':
> 'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer',
> 'case_sensitive': 'false'};
>
>
>
> CREATE CUSTOM INDEX IF NOT EXISTS bv_household_id_idx ON
> mykeyspace.mytable (household_id)
>
>        USING 'org.apache.cassandra.index.sasi.SASIIndex'
>
>        WITH OPTIONS = {'mode': 'CONTAINS', 'analyzer_class':
> 'org.apache.cassandra.indexsasi.analyzer.NonTokenizingAnalyzer',
> 'case_sensitive': 'false'};
>
>
>
>
>
> On Tue, Feb 5, 2019 at 3:33 PM Oleksandr Petrov <
> oleksandr.pet...@gmail.com> wrote:
>
> Could you post full table schema (names obfuscated, if required) with
> index creation statements and queries?
>
>
>
> On Mon, Feb 4, 2019 at 10:04 AM Jacques-Henri Berthemet <
> jacques-henri.berthe...@genesys.com> wrote:
>
> I’m not sure why it`s not allowed by the Datastax driver, but maybe you
> could try to use OR instead of IN?
>
> SELECT blah FROM foo WHERE <indexed column> = :val1 OR <indexed column> =
> :val2 ALLOW FILTERING
>
>
>
> It should be the same as IN query, but I don’t if it makes a difference
> for performance.
>
>
>
> *From: *Peter Heitman <pe...@heitman.us>
> *Reply-To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
> *Date: *Monday 4 February 2019 at 07:17
> *To: *"user@cassandra.apache.org <u...@cassandraapache.org>" <
> user@cassandra.apache.org>
> *Subject: *SASI queries- cqlsh vs java driver
>
>
>
> When I create a SASI index on a secondary column, from cqlsh I can execute
> a query
>
>
>
> SELECT blah FROM foo WHERE <indexed column> IN ('mytext') ALLOW FILTERING;
>
>
>
> but not from the java driver:
>
>
>
> SELECT blah FROM foo WHERE <indexed column> IN :val ALLOW FILTERING
>
>
>
> Here I get an exception
>
>
>
> com.datastax.driver.coreexceptions.InvalidQueryException: IN predicates on
> non-primary-key columns (<indexed column>) is not yet supported
>
> at
> com.datastax.driver.coreexceptions.InvalidQueryException.copy(InvalidQueryException.java:49)
> ~[cassandra-driver-core-3.6.0.jar:na]
>
>
>
> Why are they different? Is there anything I can do with the java driver to
> get past this exception?
>
>
>
> Peter
>
>
>
>
>
>
>
>
> --
>
> alex p
>
>
> ------------------------------
>
> The information in this Internet Email is confidential and may be legally
> privileged. It is intended solely for the addressee. Access to this Email
> by anyone else is unauthorized. If you are not the intended recipient, any
> disclosure, copying, distribution or any action taken or omitted to be
> taken in reliance on it, is prohibited and may be unlawful. When addressed
> to our clients any opinions or advice contained in this Email are subject
> to the terms and conditions expressed in any applicable governing The Home
> Depot terms of business or client engagement letter. The Home Depot
> disclaims all responsibility and liability for the accuracy and content of
> this attachment and for any damages or losses arising from any
> inaccuracies, errors, viruses, e.g., worms, trojan horses, etc., or other
> items of a destructive nature, which may be contained in this attachment
> and shall not be liable for direct, indirect, consequential or special
> damages in connection with this e-mail message or its attachment.
>

Re: [EXTERNAL] RE: SASI queries- cqlsh vs java driver

Reply via email to