Design questions/Schema Help

Mark Mon, 26 Jul 2010 16:44:29 -0700

We are thinking about using Cassandra to store our search logs. Can someone point me in the right direction or lend some guidance on design? I am new to Cassandra and I am having trouble wrapping my head around some of these new concepts. My brain keeps wanting to go back to an RDBMS design.

We will be storing the user's query, the number of hits returned, and the session id. We would like to be able to answer the following questions:

- What are the n most popular queries and their counts within the last x (mins/hours/days/etc.)? Basically, the most popular searches within a given time range.
- What is the most popular query within the last x where hits = 0? Same as above but with an extra "where" clause.
- For session id x give me all their other queries
- What are all the session ids that searched for 'foos'

We accomplish the above functionality with MySQL using 2 tables: one for the raw search log information and the other to keep the aggregate/running counts of queries.

Would this sort of ad-hoc querying be better implemented using Hadoop + Hive? If so, should I be storing all this information in Cassandra and then using Hadoop to retrieve it?


Thanks for your suggestions
