Re: Running select against cassandra
> > Also is materialized view good for production?

I agree with Sean's and Reid's sentiments about MVs. I still think of MVs as being experimental and not ready for primetime. I would wait for the improvements which may be coming in C* 4.0 but no promises there... yet. :)

Cheers!
Re: Running select against cassandra
Thanks all for the valuable inputs. I agree we need to define the query first and then plan the table schema, but the server has been live in production for 2 years and this is a new requirement, so changing the schema is not an option, and a secondary index is also a bad idea. I was thinking of going with a materialized view, or testing how a plain select performs in non-prod, and seeing which fares better. So I wanted to see if we can do anything other than that with the existing schema. The COPY option was also discussed, but COPY doesn't support a WHERE clause.

On Thursday, February 6, 2020, Reid Pinchback wrote:

> I defer to Sean’s comment on materialized views. I’m more familiar with DynamoDB on that front, where you do this pretty routinely. I was curious so I went looking. This appears to be the C* Jira that points to many of the problem points:
>
> https://issues.apache.org/jira/browse/CASSANDRA-13826
>
> Abdul, you’d probably want to refer to that or similar info. Could be that the more practical resolution is to just have the client write the data twice, if there are two very different query patterns to support. Writes usually have quite low latency in C*, so double-writing may be less of a performance hit, and later drag on memory or I/O, than a query model that makes you browse through more data than necessary.
Re: [EXTERNAL] Re: Running select against cassandra
I defer to Sean’s comment on materialized views. I’m more familiar with DynamoDB on that front, where you do this pretty routinely. I was curious so I went looking. This appears to be the C* Jira that points to many of the problem points:

https://issues.apache.org/jira/browse/CASSANDRA-13826

Abdul, you’d probably want to refer to that or similar info. Could be that the more practical resolution is to just have the client write the data twice, if there are two very different query patterns to support. Writes usually have quite low latency in C*, so double-writing may be less of a performance hit, and later drag on memory or I/O, than a query model that makes you browse through more data than necessary.

From: "Durity, Sean R"
Reply-To: "user@cassandra.apache.org"
Date: Thursday, February 6, 2020 at 4:24 PM
To: "user@cassandra.apache.org"
Subject: RE: [EXTERNAL] Re: Running select against cassandra

Reid is right. You build the tables to easily answer the queries you want. So, start with the query! I inferred a query for you based on what you mentioned. If my inference is wrong, the table structure is likely wrong, too.

So, what kind of query do you want to run?

(NOTE: a select count(*) that is not restricted to within a single partition is a very bad option. Don’t do that.)

The query for my table below is simply:

    select user_count [, other columns] from users_by_day where app_date = ? and hour = ? and minute = ?

Sean Durity
RE: [EXTERNAL] Re: Running select against cassandra
Reid is right. You build the tables to easily answer the queries you want. So, start with the query! I inferred a query for you based on what you mentioned. If my inference is wrong, the table structure is likely wrong, too.

So, what kind of query do you want to run?

(NOTE: a select count(*) that is not restricted to within a single partition is a very bad option. Don’t do that.)

The query for my table below is simply:

    select user_count [, other columns] from users_by_day where app_date = ? and hour = ? and minute = ?

Sean Durity

From: Reid Pinchback
Sent: Thursday, February 6, 2020 4:10 PM
To: user@cassandra.apache.org
Subject: Re: [EXTERNAL] Re: Running select against cassandra

Abdul,

When in doubt, have a query model that immediately feeds you exactly what you are looking for. That’s kind of the data model philosophy that you want to shoot for as much as feasible with C*.

The point of Sean’s table isn’t the similarity to yours; it is how he has it keyed, because that suits a partition structure much better aligned with what you want to request. So I’d say yes, if a materialized view is how you want to achieve a denormalized state where the query model directly supports giving you what you want to query for, that sounds like an appropriate option to consider. You might want a composite partition key for having an efficient selection of narrow time ranges.
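To make the `users_by_day` lookup in this message concrete, here is a small sketch of how a reporting job might derive the bind values from a timestamp. It assumes the date column is named `app_date`, as in the table definition quoted in the thread. The constant and function names are hypothetical, and the statements are only built here; preparing and executing them is left to whatever driver the application already uses.

```python
from datetime import datetime

# One-minute slice: every primary key column is restricted, so one row is read.
MINUTE_SLICE_CQL = (
    "SELECT user_count FROM users_by_day "
    "WHERE app_date = ? AND hour = ? AND minute = ?"
)

# Full day: only the partition key is restricted, so one partition is read.
FULL_DAY_CQL = (
    "SELECT hour, minute, user_count FROM users_by_day WHERE app_date = ?"
)


def minute_slice_params(ts: datetime):
    """Bind values for the one-minute slice that contains ts."""
    return (ts.date(), ts.hour, ts.minute)


def full_day_params(ts: datetime):
    """Bind values for every per-minute row on ts's calendar day."""
    return (ts.date(),)
```

Both statements stay inside a single partition, which is the property the note about `count(*)` is warning about.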
RE: [EXTERNAL] Re: Running select against cassandra
From reports on this mailing list, I do not allow materialized views.

Sean Durity

From: Reid Pinchback
Sent: Thursday, February 6, 2020 4:10 PM
To: user@cassandra.apache.org
Subject: Re: [EXTERNAL] Re: Running select against cassandra

Abdul,

When in doubt, have a query model that immediately feeds you exactly what you are looking for. That’s kind of the data model philosophy that you want to shoot for as much as feasible with C*.

The point of Sean’s table isn’t the similarity to yours; it is how he has it keyed, because that suits a partition structure much better aligned with what you want to request. So I’d say yes, if a materialized view is how you want to achieve a denormalized state where the query model directly supports giving you what you want to query for, that sounds like an appropriate option to consider. You might want a composite partition key for having an efficient selection of narrow time ranges.
Re: [EXTERNAL] Re: Running select against cassandra
Abdul,

When in doubt, have a query model that immediately feeds you exactly what you are looking for. That’s kind of the data model philosophy that you want to shoot for as much as feasible with C*.

The point of Sean’s table isn’t the similarity to yours; it is how he has it keyed, because that suits a partition structure much better aligned with what you want to request. So I’d say yes, if a materialized view is how you want to achieve a denormalized state where the query model directly supports giving you what you want to query for, that sounds like an appropriate option to consider. You might want a composite partition key for having an efficient selection of narrow time ranges.

From: Abdul Patel
Reply-To: "user@cassandra.apache.org"
Date: Thursday, February 6, 2020 at 2:42 PM
To: "user@cassandra.apache.org"
Subject: Re: [EXTERNAL] Re: Running select against cassandra

This is a schema similar to what we have. They want to get the count of concurrently connected users every 1-5 minutes, say.

I am wondering whether a simple select will have performance issues, or whether we can go for materialized views?

    CREATE TABLE usr_session (
        userid bigint,
        session_usr text,
        last_access_time timestamp,
        login_time timestamp,
        status int,
        PRIMARY KEY (userid, session_usr)
    ) WITH CLUSTERING ORDER BY (session_usr ASC);
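The composite partition key suggestion can be read in more than one way. The sketch below assumes one possible layout, `PRIMARY KEY ((app_date, hour), minute)`, which is not a schema anyone in the thread proposed, and shows how a 1 to 5 minute window then maps onto at most two partitions. The function name and the statement text are illustrative only.

```python
from datetime import datetime, timedelta

# Assumed layout (hypothetical): PRIMARY KEY ((app_date, hour), minute)
WINDOW_CQL = (
    "SELECT minute, user_count FROM users_by_day "
    "WHERE app_date = ? AND hour = ? AND minute >= ? AND minute <= ?"
)


def partitions_for_window(end: datetime, minutes: int):
    """Group the last `minutes` one-minute buckets, ending at end's minute,
    by the (date, hour) partition each bucket would live in."""
    buckets = {}
    for offset in range(minutes - 1, -1, -1):
        ts = end - timedelta(minutes=offset)
        buckets.setdefault((ts.date(), ts.hour), []).append(ts.minute)
    return buckets


def window_queries(end: datetime, minutes: int):
    """Return one (statement, bind values) pair per partition touched."""
    return [
        (WINDOW_CQL, (day, hour, mins[0], mins[-1]))
        for (day, hour), mins in partitions_for_window(end, minutes).items()
    ]
```

A five-minute window that ends at 15:02 yields two queries, one for hour 14 (minutes 58 to 59) and one for hour 15 (minutes 0 to 2). Each query is a clustering range inside a single partition.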
Re: [EXTERNAL] Re: Running select against cassandra
This is a schema similar to what we have. They want to get the count of concurrently connected users every 1-5 minutes, say.

I am wondering whether a simple select will have performance issues, or whether we can go for materialized views?

    CREATE TABLE usr_session (
        userid bigint,
        session_usr text,
        last_access_time timestamp,
        login_time timestamp,
        status int,
        PRIMARY KEY (userid, session_usr)
    ) WITH CLUSTERING ORDER BY (session_usr ASC);

On Thu, Feb 6, 2020 at 2:09 PM Durity, Sean R wrote:

> Do you only need the current count or do you want to keep the historical counts also? By active users, does that mean some kind of user that the application tracks (as opposed to the Cassandra user connected to the cluster)?
>
> I would consider a table like this for tracking active users through time:
>
>     CREATE TABLE users_by_day (
>         app_date date,
>         hour int,
>         minute int,
>         user_count int,
>         longest_login_user text,
>         longest_login_seconds int,
>         last_login timestamp,
>         last_login_user text,
>         PRIMARY KEY (app_date, hour, minute)
>     );
>
> Then, your reporting can easily select full days or a specific, one-minute slice. Of course, the app would need to have a timer and write out the data. I would also suggest a TTL on the data so that you only keep what you need (a week, a year, whatever). Of course, if your reporting requires different granularities, you could consider a different time bucket for the table (by hour, by week, etc.)
>
> Sean Durity, Staff Systems Engineer, Cassandra
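One alternative to a materialized view over `usr_session`, suggested elsewhere in this thread, is for the client to write each session touch twice: once to the existing table and once to a table keyed for the reporting query. The sketch below only builds the two statements. The second table, `usr_session_by_minute`, and its columns are hypothetical and appear nowhere in the thread, and the function name is made up for illustration.

```python
from datetime import datetime

# Existing table, as posted in the thread.
SESSION_INSERT_CQL = (
    "INSERT INTO usr_session "
    "(userid, session_usr, last_access_time, login_time, status) "
    "VALUES (?, ?, ?, ?, ?)"
)

# Hypothetical second table, assumed to be keyed as
# PRIMARY KEY ((app_date, hour, minute), userid, session_usr).
BY_MINUTE_INSERT_CQL = (
    "INSERT INTO usr_session_by_minute "
    "(app_date, hour, minute, userid, session_usr) "
    "VALUES (?, ?, ?, ?, ?)"
)


def dual_write_statements(userid, session_usr, login_time, status, now: datetime):
    """Build both writes for one session touch; the caller executes them."""
    return [
        (SESSION_INSERT_CQL, (userid, session_usr, now, login_time, status)),
        (
            BY_MINUTE_INSERT_CQL,
            (now.date(), now.hour, now.minute, userid, session_usr),
        ),
    ]
```

Under that assumed key, the sessions seen in a given minute sit in one partition, so the reporting read never has to scan `usr_session`. The existing table is left unchanged, which matters if altering it is off the table.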
RE: [EXTERNAL] Re: Running select against cassandra
Do you only need the current count or do you want to keep the historical counts also? By active users, does that mean some kind of user that the application tracks (as opposed to the Cassandra user connected to the cluster)?

I would consider a table like this for tracking active users through time:

    CREATE TABLE users_by_day (
        app_date date,
        hour int,
        minute int,
        user_count int,
        longest_login_user text,
        longest_login_seconds int,
        last_login timestamp,
        last_login_user text,
        PRIMARY KEY (app_date, hour, minute)
    );

Then, your reporting can easily select full days or a specific, one-minute slice. Of course, the app would need to have a timer and write out the data. I would also suggest a TTL on the data so that you only keep what you need (a week, a year, whatever). Of course, if your reporting requires different granularities, you could consider a different time bucket for the table (by hour, by week, etc.)

Sean Durity, Staff Systems Engineer, Cassandra

From: Abdul Patel
Sent: Thursday, February 6, 2020 1:54 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Running select against cassandra

It's sort of users connected: the app team needs the number of active users connected, say every 1 to 5 mins.

The timeout at the app end is 120ms.

The information in this Internet Email is confidential and may be legally privileged. It is intended solely for the addressee. Access to this Email by anyone else is unauthorized. If you are not the intended recipient, any disclosure, copying, distribution or any action taken or omitted to be taken in reliance on it, is prohibited and may be unlawful. When addressed to our clients any opinions or advice contained in this Email are subject to the terms and conditions expressed in any applicable governing The Home Depot terms of business or client engagement letter. The Home Depot disclaims all responsibility and liability for the accuracy and content of this attachment and for any damages or losses arising from any inaccuracies, errors, viruses, e.g., worms, trojan horses, etc., or other items of a destructive nature, which may be contained in this attachment and shall not be liable for direct, indirect, consequential or special damages in connection with this e-mail message or its attachment.
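As a rough sketch of the timer-driven write described in this message, the application could count the sessions it considers active, work out the current minute bucket, and build an INSERT that carries a TTL. The one-week TTL, the 300-second activity window, and all function names are assumptions made for illustration. The statement assumes `users_by_day` uses CQL types (`int`, `timestamp`) and is only constructed here, not executed.

```python
from datetime import datetime, timedelta

# Hypothetical retention: keep one week of per-minute rows.
TTL_SECONDS = 7 * 24 * 60 * 60

INSERT_CQL = (
    "INSERT INTO users_by_day (app_date, hour, minute, user_count) "
    "VALUES (?, ?, ?, ?) USING TTL ?"
)


def count_active(last_access_times, now: datetime, active_cutoff_seconds: int = 300):
    """Count sessions whose last access falls inside the activity window."""
    cutoff = now - timedelta(seconds=active_cutoff_seconds)
    return sum(1 for t in last_access_times if t >= cutoff)


def build_insert(now: datetime, user_count: int):
    """Return the statement text and bind values for now's minute bucket."""
    return INSERT_CQL, (now.date(), now.hour, now.minute, user_count, TTL_SECONDS)
```

A scheduler in the application would call `count_active` once per minute with the session timestamps it already tracks, then hand the result of `build_insert` to its driver. Where those timestamps come from is left open, since the thread does not say how the application tracks sessions.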
Re: Running select against cassandra
Also, are materialized views good for production? We are on 3.11.4.

On Thursday, February 6, 2020, Abdul Patel wrote:

> It's sort of users connected: the app team needs the number of active users connected, say every 1 to 5 mins.
> The timeout at the app end is 120ms.
Re: Running select against cassandra
It's sort of users connected: the app team needs the number of active users connected, say every 1 to 5 mins.

The timeout at the app end is 120ms.

On Thursday, February 6, 2020, Michael Shuler wrote:

> You'll have to be more specific. What is your table schema and what is the SELECT query? What is the normal response time?
>
> As a basic guide for your general question, if the query is something sort of irrelevant that should be stored some other way, like a total row count, or most any SELECT that requires ALLOW FILTERING, you're doing it wrong and should re-evaluate your data model.
>
> 1 query per minute is a minuscule fraction of the basic capacity of queries per minute that a Cassandra cluster should be able to handle with good data modeling and table-relevant queries. All depends on the data model and query.
>
> Michael
Re: Running select against cassandra
You'll have to be more specific. What is your table schema and what is the SELECT query? What is the normal response time?

As a basic guide for your general question, if the query is something sort of irrelevant that should be stored some other way, like a total row count, or most any SELECT that requires ALLOW FILTERING, you're doing it wrong and should re-evaluate your data model.

1 query per minute is a minuscule fraction of the basic capacity of queries per minute that a Cassandra cluster should be able to handle with good data modeling and table-relevant queries. All depends on the data model and query.

Michael

On 2/6/20 12:20 PM, Abdul Patel wrote:

> Hi,
>
> Is it advisable to run a select query every minute to grab data from Cassandra for reporting purposes? If not, what's the alternative?

To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org