Re: CQL Data Model question

2015-05-14 Thread Alaa Zubaidi (PDF)
when you import to the table: one line in the Oracle table with K filters becomes C(3, K) lines in the C* table. Best regards, Minh *From:* Alaa Zubaidi (PDF) [mailto:alaa.zuba...@pdf.com] *Sent:* Monday, 11 May 2015 20:32 *To:* user@cassandra.apache.org *Subject:* CQL Data Model question Hi

RE: CQL Data Model question

2015-05-12 Thread Ngoc Minh VO
From: Alaa Zubaidi (PDF) [mailto:alaa.zuba...@pdf.com] Sent: Monday, 11 May 2015 20:32 To: user@cassandra.apache.org Subject: CQL Data Model question Hi, I am trying to port an Oracle table to Cassandra. The table is a wide table (931 columns) and could have millions of rows. name, filter1, filter2

Re: CQL Data Model question

2015-05-12 Thread Jack Krupansky
Porting an SQL data model to Cassandra is an anti-pattern - don't do it! Instead, focus on developing a new data model that capitalizes on the key strengths of Cassandra - distributed, scalable, fast writes, fast direct access. Complex and ad-hoc queries are anti-patterns as well. I'll leave it to

CQL Data Model question

2015-05-11 Thread Alaa Zubaidi (PDF)
Hi, I am trying to port an Oracle table to Cassandra. The table is a wide table (931 columns) and could have millions of rows. name, filter1, filter2...filter30, data1, data2...data900 The user would retrieve multiple rows from this table and filter (30 filter columns) by one or more (up to 3)

Re: Data tiered compaction and data model question

2015-02-19 Thread Kai Wang
is the worst-case scenario? Mohammed *From:* cass savy [mailto:casss...@gmail.com] *Sent:* Wednesday, February 18, 2015 4:21 PM *To:* user@cassandra.apache.org *Subject:* Data tiered compaction and data model question We want to track events in log Cf/table and should be able

RE: Data tiered compaction and data model question

2015-02-19 Thread Mohammed Guller
, velocity and variety. It doesn’t look like your data has the volume or velocity that a standard RDBMS cannot handle. Mohammed From: Kai Wang [mailto:dep...@gmail.com] Sent: Thursday, February 19, 2015 6:06 AM To: user@cassandra.apache.org Subject: Re: Data tiered compaction and data model question

Re: Data tiered compaction and data model question

2015-02-19 Thread Roland Etzenhammer
Hi Cass, just a hint from the off - if I got it right you have: Table 1: PRIMARY KEY ( (event_day,event_hr),event_time) Table 2: PRIMARY KEY (event_day,event_time) Assuming your events to write come in by wall clock time, the first table design will have a hotspot on a specific node getting
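The snippet above is cut off, so the following is only a rough illustration of how the two primary keys quoted there group incoming events, not a reconstruction of the rest of the reply. It is a minimal Python sketch that treats the partition key as a plain tuple; the one-event-per-minute workload and the helper names `key_day_hour` and `key_day` are assumptions made for the example.

```python
from collections import Counter
from datetime import datetime, timedelta


def key_day_hour(ts):
    """Partition key of Table 1: (event_day, event_hr)."""
    return (ts.strftime("%Y-%m-%d"), ts.hour)


def key_day(ts):
    """Partition key of Table 2: event_day only."""
    return (ts.strftime("%Y-%m-%d"),)


# Hypothetical workload: one event per minute over a single day.
start = datetime(2015, 2, 18)
events = [start + timedelta(minutes=m) for m in range(24 * 60)]

# How many rows end up in each partition under each design.
rows_t1 = Counter(key_day_hour(ts) for ts in events)
rows_t2 = Counter(key_day(ts) for ts in events)

# Partitions that receive writes during one wall-clock hour (10:00-10:59).
window = [ts for ts in events if ts.hour == 10]
active_t1 = {key_day_hour(ts) for ts in window}
active_t2 = {key_day(ts) for ts in window}
```

With this toy workload, Table 1 spreads the day over 24 smaller partitions while Table 2 keeps everything in one, yet in both designs all writes arriving at a given moment land in a single partition.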

RE: Data tiered compaction and data model question

2015-02-18 Thread Mohammed Guller
What is the maximum number of events that you expect in a day? What is the worst-case scenario? Mohammed From: cass savy [mailto:casss...@gmail.com] Sent: Wednesday, February 18, 2015 4:21 PM To: user@cassandra.apache.org Subject: Data tiered compaction and data model question We want to track

Data tiered compaction and data model question

2015-02-18 Thread cass savy
We want to track events in a log CF/table and should be able to query for events that occurred within a range of minutes or hours on a given day. Multiple events can occur in a given minute. I have listed 2 table designs and am leaning towards table 1 to avoid a large wide row. Please advise on *Table 1*: not very

Re: data model question : finding out the n most recent changes items

2013-07-11 Thread aaron morton
From what you described, this sounds like the most appropriate: CREATE TABLE user_file ( user_id uuid, modified_date timestamp, file_id timeuuid, PRIMARY KEY(user_id, modified_date) ); If you normally need more information about the file then either store that as
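To make the access pattern of that `user_file` layout concrete, here is a small Python sketch that models one partition per `user_id` with rows keyed by `modified_date`. The class `UserFileTable` and its methods are hypothetical stand-ins written for illustration, not a Cassandra driver API, and the file ids are plain strings rather than timeuuids.

```python
from collections import defaultdict
from datetime import datetime


class UserFileTable:
    """Toy model of user_file: PRIMARY KEY (user_id, modified_date)."""

    def __init__(self):
        # user_id -> {modified_date: file_id}; same key overwrites, like an upsert.
        self._partitions = defaultdict(dict)

    def insert(self, user_id, modified_date, file_id):
        self._partitions[user_id][modified_date] = file_id

    def most_recent(self, user_id, n):
        """Return up to n (modified_date, file_id) rows, newest first."""
        rows = self._partitions.get(user_id, {})
        return sorted(rows.items(), reverse=True)[:n]


table = UserFileTable()
table.insert("u1", datetime(2013, 7, 9, 8, 0), "file-a")
table.insert("u1", datetime(2013, 7, 10, 9, 30), "file-b")
table.insert("u1", datetime(2013, 7, 11, 12, 0), "file-a")  # file-a modified again
latest = table.most_recent("u1", 2)
```

Because each modification is a separate row, a file that changes twice appears twice in the partition, which is the "series of timestamps" behaviour raised elsewhere in this thread.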

Re: data model question : finding out the n most recent changes items

2013-07-11 Thread Jimmy Lin
What I mean is, I really just want the last modified date instead of a series of timestamps, while still being able to sort or order by it. (Maybe I should rephrase my question as how to sort or order by the last-modified column in a row.) CREATE TABLE user_file ( user_id uuid, modified_date

RE: data model question : finding out the n most recent changes items

2013-07-11 Thread Lohith Samaga M
-Original Message- From: y2k...@gmail.com on behalf of Jimmy Lin Sent: Thu 11-Jul-13 13:09 To: user@cassandra.apache.org Subject: Re: data model question : finding out the n most recent changes items what I mean is, I really just want the last modified date instead of series of timestamp and still

Re: data model question : finding out the n most recent changes items

2013-07-11 Thread Jimmy Lin
- From: y2k...@gmail.com on behalf of Jimmy Lin Sent: Thu 11-Jul-13 13:09 To: user@cassandra.apache.org Subject: Re: data model question : finding out the n most recent changes items what I mean is, I really just want the last modified date instead of series of timestamp and still able

data model question : finding out the n most recent changes items

2013-07-10 Thread Jimmy Lin
I have an application that needs to find the n most recently modified files for a given user id. I started out with a few tables but still couldn't get what I want; I hope someone can point me in the right direction... See my tables below. #1 won't work, because file_id's timeuuid contains creation

RE: CQL3 Data Model Question

2013-05-08 Thread Adriano Paggi
: Hiller, Dean [mailto:dean.hil...@nrel.gov] Sent: Tuesday, 07 May 2013 05:52 p.m. To: user@cassandra.apache.org Subject: Re: CQL3 Data Model Question Playorm is not yet on CQL3 and Cassandra doesn't work well with 10,000+ CFs, as we went down that path and Cassandra can't cope, so we have one

CQL3 Data Model Question

2013-05-07 Thread Keith Wright
Hi all, I was hoping you could provide some assistance with a data modeling question (my apologies if a similar question has already been posed). I have time based data that I need to store on a per customer (aka app id ) basis so that I can easily return it in sorted order by event time.

Re: CQL3 Data Model Question

2013-05-07 Thread Hiller, Dean
@cassandra.apache.org Subject: CQL3 Data Model Question Hi all, I was hoping you could provide some assistance with a data modeling question (my apologies if a similar question has already been posed). I have time based data that I need to store on a per customer (aka app id ) basis so

Re: CQL3 Data Model Question

2013-05-07 Thread Keith Wright
user@cassandra.apache.org Date: Tuesday, May 7, 2013 2:02 PM To: user@cassandra.apache.org Subject: CQL3 Data Model Question Hi all, I was hoping you could provide some

Re: CQL3 Data Model Question

2013-05-07 Thread Hiller, Dean
: CQL3 Data Model Question Hi all, I was hoping you could provide some assistance with a data modeling question (my apologies if a similar question has already been posed). I have time based data that I need to store on a per customer (aka app id ) basis so that I can easily return it in sorted

Re: Data model question, storing Queue Message

2012-04-30 Thread Morgan Segalis
Hi Aaron, Thank you for your answer, I was beginning to think that my question would never be answered ;-) Actually, this is what I was going for, except one thing: instead of partitioning rows per month, I thought about partitioning per day, so that every day I launch the cleaning tool, and

Re: Data model question, storing Queue Message

2012-04-30 Thread samal
On Mon, Apr 30, 2012 at 4:25 PM, Morgan Segalis msega...@gmail.com wrote: Hi Aaron, Thank you for your answer, I was beginning to think that my question would never be answered ;-) Actually, this is what I was going for, except one thing: instead of partitioning rows per month, I thought

Re: Data model question, storing Queue Message

2012-04-30 Thread Morgan Segalis
Hi Samal, Thanks for the TTL feature, I wasn't aware of its existence. Day partitioning will be less wide than month partitioning (about 30 times less, give or take ;-) ). Per day it should have something like 100 000 messages stored, most of which would be retrieved and so deleted before the TTL

Re: Data model question, storing Queue Message

2012-04-30 Thread samal
On Mon, Apr 30, 2012 at 5:52 PM, Morgan Segalis msega...@gmail.com wrote: Hi Samal, Thanks for the TTL feature, I wasn't aware of its existence. Day partitioning will be less wide than month partitioning (about 30 times less, give or take ;-) ). Per day it should have something like 100

Re: Data model question, storing Queue Message

2012-04-30 Thread Morgan Segalis
Isn't Kafka too young for production use? Clearly that would fit my needs much better, but I can't afford an early-stage project that is not ready for production. Is it? On 30 Apr 2012 at 14:28, samal samalgo...@gmail.com wrote: On Mon, Apr 30, 2012 at 5:52 PM, Morgan Segalis

Re: Data model question, storing Queue Message

2012-04-30 Thread aaron morton
Isn't Kafka too young for production use? The best way to advance the project is to use it and contribute your experience and time. By the way, checking out Kafka is a great idea. There are people around having Fun Times with Kafka in production. Cheers - Aaron Morton

Re: Data model question, storing Queue Message

2012-04-29 Thread aaron morton
A message queue is often not a great use case for Cassandra. For information on how to handle high delete workloads see http://www.datastax.com/dev/blog/leveled-compaction-in-apache-cassandra It is hard to create a model without some idea of the data load, but I would suggest you start with: CF:

Data model question, storing Queue Message

2012-04-26 Thread Morgan Segalis
Hi everyone! I'm fairly new to Cassandra and not yet quite familiar with the column-oriented NoSQL model. I have worked on it for a while, but I can't seem to find the best model for what I'm looking for. I have Erlang software that lets users connect and communicate with each other,

Re: Materialized Views or Index CF - data model question

2012-04-11 Thread aaron morton
Craftsman wrote: Howdy, Can I ask a data model question here? We have a book table with 20 columns, 300 million rows, average row size is 1500 bytes. create table book( book_id, isbn, price, author, titile, ... col_n1, col_n2, ... col_nm ); Data usage: We need to query

Re: Materialized Views or Index CF - data model question

2012-04-10 Thread Data Craftsman
, Data Craftsman wrote: Howdy, Can I ask a data model question here? We have a book table with 20 columns, 300 million rows, average row size is 1500 bytes. create table book( book_id, isbn, price, author, titile, ... col_n1, col_n2, ... col_nm ); Data usage: We need to query

Re: Materialized Views or Index CF - data model question

2012-04-08 Thread aaron morton
if Cassandra will work for you. Hope that helps. - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 6/04/2012, at 6:46 AM, Data Craftsman wrote: Howdy, Can I ask a data model question here? We have a book table with 20 columns, 300 million rows

Re: Materialized Views or Index CF - data model question

2012-04-05 Thread Radim Kolar
Will 1500 bytes row size be large or small for Cassandra from your understanding? Performance degradation starts at 500MB rows; it's very slow if you hit this limit.

Re: data model question

2012-03-13 Thread Tamar Fraenkel
Thanks! Better than mine, as it considered later additions of services! Will update my code, Thanks *Tamar Fraenkel * Senior Software Engineer, TOK Media ta...@tok-media.com Tel: +972 2 6409736 Mob: +972 54 8356490 Fax: +972 2 5612956 On Mon, Mar 12, 2012 at

Re: data model question

2012-03-12 Thread aaron morton
In this case, where you know the query upfront, I add a custom secondary index using another CF to support the query. It's a little easier here because the data won't change. UserLookupCF (using composite types for the key value) row_key: system_name:id e.g. facebook:12345 or twitter:12345
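The lookup-CF idea above can be sketched in a few lines of Python. This is only an in-memory illustration: the dict `user_lookup` stands in for UserLookupCF, the row key is built by simple string concatenation rather than a real composite type, and the helper names `lookup_key`, `register` and `find_user` are made up for the example.

```python
import uuid

# Stand-in for UserLookupCF: row key "system_name:id" -> internal user UUID.
user_lookup = {}


def lookup_key(system_name, external_id):
    """Build the row key, e.g. 'facebook:12345'."""
    return f"{system_name}:{external_id}"


def register(system_name, external_id, internal_id):
    """Write the reverse mapping when a user links an external account."""
    user_lookup[lookup_key(system_name, external_id)] = internal_id


def find_user(system_name, external_id):
    """Single-key read: external id -> internal UUID, or None if unknown."""
    return user_lookup.get(lookup_key(system_name, external_id))


alice = uuid.uuid4()
register("facebook", "12345", alice)
register("twitter", "12345", uuid.uuid4())  # same external id, different service
```

Prefixing the key with the service name is what lets a later service such as Twitter reuse the same structure without colliding with existing Facebook ids.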

Re: data model question

2012-03-12 Thread Sasha Dolgy
An alternative would be to add another row to your user CF specific for Facebook ids. Column ID would be the Facebook identifier and value would be your internal uuid. Consider when you want to add another service like twitter. Will you then add another CF per service or just another row specific

data model question

2012-03-11 Thread Tamar Fraenkel
Hi! I need some advice: I have a user CF, which has a UUID key which is my internal user id. One of the columns is the facebook_id of the user (if it exists). I need to have the reverse mapping from facebook_id to my UUID. My intention is to add a CF for the mapping from Facebook Id to my id: user_by_fbid

Re: data model question

2012-03-11 Thread Marcel Steinbach
Either you do that or you could think about using a secondary index on the fb user name in your primary cf. See http://www.datastax.com/docs/1.0/ddl/indexes Cheers On 11.03.2012 at 09:51, Tamar Fraenkel ta...@tok-media.com wrote: Hi! I need some advice: I have a user CF, which has a UUID key

Re: data model question

2012-03-11 Thread Tamar Fraenkel
Hi! Thanks for the response. From what I read, secondary indices are good only for columns with few possible values. Is this a good fit for my case? I have unique facebook id for every user. Thanks *Tamar Fraenkel * Senior Software Engineer, TOK Media ta...@tok-media.com

Re: Data Model Question

2012-01-23 Thread aaron morton
1. regarding time slicing, if at any point of time I am interested in what happened in the last T minutes, then I will need to query more than one row of the DimentionUpdates, right? Yerp. Sometimes that is what's needed. 2. What did you mean by You will also want to partition the list

Re: Data Model Question

2012-01-22 Thread aaron morton
In general if you are collecting data over time you should consider partitioning the rows to avoid creating very large rows. Also if you have a common request you want to support consider modeling it directly rather than using secondary indexes. Assuming my understanding of the problem is in

Re: Data Model Question

2012-01-21 Thread R. Verlangen
A couple of days ago I came across Countandra ( http://countandra.org/ ). It seems that it might be a solution for you. Gr. Robin 2012/1/20 Tamar Fraenkel ta...@tok-media.com Hi! I am a newbie to Cassandra and seeking some advice regarding the data model I should use to best address

Re: Data Model Question

2012-01-21 Thread Jean-Nicolas Boulay Desjardins
But What about: Rainbird? On Sat, Jan 21, 2012 at 10:52 AM, R. Verlangen ro...@us2.nl wrote: A couple of days ago I came across Countandra ( http://countandra.org/ ). It seems that it might be a solution for you. Gr. Robin 2012/1/20 Tamar Fraenkel ta...@tok-media.com Hi! I am a

Re: Data Model Question

2012-01-21 Thread Milind Parikh
I used rainbird as inspiration for Countandra (some of publicly available data structures from rainbird preso). That said, there are significant differences between the two architectures. Additionally, as Cassandra begins to provide triggers, some very interesting things will become possible in

Re: Data Model Question

2012-01-21 Thread Jean-Nicolas Boulay Desjardins
Milind Parikh, Rainbird is backed by Twitter... My worry is that you might not be around in the future... Also, do you have evidence that your system is better? Because Rainbird is used by Twitter. On Sat, Jan 21, 2012 at 6:55 PM, Milind Parikh milindpar...@gmail.com wrote: I used rainbird as

Re: Data Model Question

2012-01-21 Thread Tamar Fraenkel
Hi, it may be my lack of knowledge, but both have to do with counting, which is not what I need. What is wrong with the two models I suggested? Tamar Sent from my iPod On Jan 22, 2012, at 2:49 AM, Jean-Nicolas Boulay Desjardins jnbdzjn...@gmail.com wrote: Milind Parikh, Rainbird is backed by

Re: Data Model Question

2012-01-21 Thread Edward Capriolo
On Sat, Jan 21, 2012 at 7:49 PM, Jean-Nicolas Boulay Desjardins jnbdzjn...@gmail.com wrote: Milind Parikh, Rainbird is backed by Twitter... My worry is that you might not be around in the future... Also, do you have evidence that your system is better? Because Rainbird is used by Twitter. On

Data Model Question

2012-01-20 Thread Tamar Fraenkel
Hi! I am a newbie to Cassandra and seeking some advice regarding the data model I should use to best address my needs. For simplicity, what I want to accomplish is: I have a system that has users (potentially ~10,000 per day) and they perform actions in the system (total of ~50,000 a day). Each

Re: Data Model Question

2010-12-02 Thread aaron morton
Have you considered using Solr / Lucene for the search? It has a lot more search features, and it is really good at faceted navigation through a product catalogue. It sounds like it would be a better fit for this task. You can build facets for your price ranges, do the product name thing and

Re: Data Model Question

2010-12-02 Thread Jake Luciani
You can also run Solr with Cassandra as the backend: https://github.com/tjake/Lucandra/tree/solandra /shameless_plug -Jake On Thu, Dec 2, 2010 at 6:27 AM, aaron morton aa...@thelastpickle.com wrote: Have you considered using Solr / lucene for the search? It has a lot more search features,

Re: Data Model Question

2010-12-02 Thread Pablo D. Salgado
Hello Aaron and Jake, Thank you for your reply. I've worked with Cassandra for 6 months but I have never used Lucandra. I will try Lucandra, but I must ask (before starting): is it possible to meet my searching/pagination/sorting requirements with Lucandra? Thank you in advance, Pablo 2010/12/2 Jake Luciani

Re: Data Model Question

2010-12-02 Thread Aaron Morton
I say yes to all your questions about what you can do with Solr. Some background on the technology... Lucene is a Java library for doing full text search http://lucene.apache.org/java/docs/index.html Solr turns lucene into a HTTP server and adds a bunch of other features such as making it easier

Re: Data Model Question

2010-12-02 Thread Pablo D. Salgado
Hello Aaron, Thanks for your reply. I will try it. Greetings, Pablo 2010/12/2 Aaron Morton aa...@thelastpickle.com I say yes to all your questions about what you can do with Solr. Some background on the technology... Lucene is a Java library for doing full text search

Data Model Question

2010-12-01 Thread Pablo D. Salgado
Hello, I need to store products data (product.name, product.price, product.state and product.owner) in Cassandra 0.7 rc1. The problem is that I need to get products where product.price > XX AND product.price < XX AND product.name = XXX AND product.state = XXX. Also I need to return the products with

Re: Data model question - column names sort

2010-04-19 Thread Jonathan Ellis
On Thu, Apr 15, 2010 at 6:01 PM, Sonny Heer sonnyh...@gmail.com wrote: Need a way to have two different types of indexes. Key: aTextKey ColumnName: aTextColumnName:55 Value: Key: aTextKey ColumnName: 55:aTextColumnName Value: All the valuable information is stored in the column name

Data model question - column names sort

2010-04-15 Thread Sonny Heer
Need a way to have two different types of indexes. Key: aTextKey ColumnName: aTextColumnName:55 Value: Key: aTextKey ColumnName: 55:aTextColumnName Value: All the valuable information is stored in the column name itself. Above two can be in different column families... Queries: Given a key,
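The queries in the message above are cut off, so the following only illustrates the general idea of keeping the same columns in two sort orders, one led by the text part and one led by the number. It is a Python sketch under stated assumptions: column names are modelled as tuples rather than "text:55" strings (to avoid lexical ordering of digits), the sample data is invented, and `slice_prefix` is a hypothetical helper, not a Cassandra call.

```python
import bisect

# One row's columns, stored twice with the composite name in opposite orders.
by_name = sorted([("apple", 55), ("apple", 7), ("pear", 55)])
by_number = sorted((number, name) for name, number in by_name)


def slice_prefix(columns, first):
    """Return the contiguous run of sorted columns whose first component equals `first`."""
    lo = bisect.bisect_left(columns, (first,))
    hi = lo
    while hi < len(columns) and columns[hi][0] == first:
        hi += 1
    return columns[lo:hi]


apple_columns = slice_prefix(by_name, "apple")
columns_with_55 = slice_prefix(by_number, 55)
```

Since columns are kept sorted by name, each ordering answers a different prefix query with one contiguous slice, which is why the two layouts might live in separate column families.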