Kenneth, Thank you for the reply, and sorry for my late response. I did not notice this had a reply until just now!
To answer your questions: In this statement from your email: > > SELECT * FROM m_ps_project_policy_device4 where project_id=1337 and > policy_id=7331 and device_id='1234567890' group by policy_id, device_id > limit 1; > > > > …why would you have the “group by policy_id, device_id” section at all > when you are already doing “policy_id=7331 and device_id=’1234567890’” and > returning one (limit 1)? > The clustering key for the materialized view includes far more than just policy_id and device_id. Using group by allows me to collapse the extraneous keys into just the keys I am interested in (policy_id and device_id). The limit 1 is extraneous from my query generator and can be removed. Also, in your materialized view I think the “clustering order by” should > have started with policy_id, so it would have been: …WITH CLUSTERING > ORDER BY(policy_id, device_id, date_created DESC); > > however that doesn’t seem very performant. > I ordered the clustering keys specifically to have date_created as the first key so that I can (attempt to) do what was described in my question: take the first result from a query which specifies the exact partition keys and get the most recent record back. The issue now is that I seem to be getting the oldest data first when using group by; this is what I would like to solve -- hopefully without having to create yet another materialized view with a different ordering. On Thu, Feb 21, 2019 at 7:18 PM Kenneth Brotman <kenbrot...@yahoo.com.invalid> wrote: > Hey Joseph, > > > > Also, in your materialized view I think the “clustering order by” should > have started with policy_id, so it would have been: …WITH CLUSTERING > ORDER BY(policy_id, device_id, date_created DESC); > > however that doesn’t seem very performant. > > > > Maybe in the materialized view the primary key should have a compound > partition key of (project_id, policy_id, device_id). > > > > > > *From:* Kenneth Brotman [mailto:kenbrot...@yahoo.com.INVALID] > *Sent:* Thursday, February 21, 2019 3:38 PM > *To:* user@cassandra.apache.org > *Subject:* RE: Group By Does Not Follow Clustering Order > > > > Joseph, > > > > In this statement from your email: > > SELECT * FROM m_ps_project_policy_device4 where project_id=1337 and > policy_id=7331 and device_id='1234567890' group by policy_id, device_id > limit 1; > > > > …why would you have the “group by policy_id, device_id” section at all > when you are already doing “policy_id=7331 and device_id=’1234567890’” and > returning one (limit 1)? > > > > Kenneth Brotman > > > > *From:* Joseph Wonesh [mailto:joseph.won...@sticknfind.com] > *Sent:* Thursday, February 21, 2019 10:39 AM > *To:* user@cassandra.apache.org > *Subject:* Re: Group By Does Not Follow Clustering Order > > > > Hi all, > > > > I am bumping this email hoping that it can reach a larger audience. > > > > Thanks, > > Joseph > > > > On Tue, Feb 12, 2019 at 11:45 AM Joseph Wonesh < > joseph.won...@sticknfind.com> wrote: > > Hello, > > > > I have a materialized view defined by the following: > > > > *CREATE *MATERIALIZED VIEW m_ps_project_policy_device0 > *AS *SELECT policy_id, device_id, project_id, namespace, metric_type, > blufi_id, beacon_id, event_uuid, state, date_created, policy_name, > beacon_name, blufi_name, value, duration *FROM *policy_state0 > *WHERE *policy_id *IS NOT NULL AND *device_id *IS NOT NULL AND > *project_id *IS NOT NULL AND *namespace *IS NOT NULL AND *metric_type *IS NOT > NULL AND *blufi_id *IS NOT NULL AND *beacon_id *IS NOT NULL AND *event_uuid > *IS NOT NULL AND *state *IS NOT NULL AND *date_created > *IS NOT NULL *PRIMARY KEY ((project_id), policy_id, device_id, > date_created, blufi_id, beacon_id, state, namespace, metric_type, event_uuid) > *WITH *CLUSTERING *ORDER BY *(date_created *DESC*); > > > > This view works fine if i run a query like the following: > > > > SELECT * FROM m_ps_project_policy_device4 where project_id=1337 and > policy_id=7331 and device_id='1234567890' limit 1; > > > > The result of this query gives me the most recent due to the date_created > desc clustering order. > > > > However, this query does not behave as expected: > > > > SELECT * FROM m_ps_project_policy_device4 where project_id=1337 and > policy_id=7331 and device_id='1234567890' group by policy_id, device_id > limit 1; > > > > The result of this query gives me the FIRST record from the partition, > which is the OLDEST record due to the clustering order desc. > > > > Is this a natural result due to my ordering? Would I need to use a view > that has order by ASC to achieve what I want to do using the built-in group > by aggregations? I am hoping there is a way to achieve what I want to do > (getting the most recent record for each of the <project_id, policy_id, > device_id> tuples using the built-in aggregation functions. > > > > Thanks, > > Joseph Wonesh > > > This message is private and confidential. If you have received message in > error, please notify us and remove from your system. > -- This message is private and confidential. If you have received message in error, please notify us and remove from your system.