Re: CQL Data Model question

2015-05-14 Thread Alaa Zubaidi (PDF)
DF) [mailto:alaa.zuba...@pdf.com] Sent: Monday, 11 May 2015 20:32 To: user@cassandra.apache.org Subject: CQL Data Model question Hi, I am trying to port an Oracle Table to Cassandra. The table is a wide table (931 columns) and could h

Re: CQL Data Model question

2015-05-12 Thread Jack Krupansky
Porting an SQL data model to Cassandra is an anti-pattern - don't do it! Instead, focus on developing a new data model that capitalizes on the key strengths of Cassandra - distributed, scalable, fast writes, fast direct access. Complex and ad-hoc queries are anti-patterns as well. I'll leave it to

RE: CQL Data Model question

2015-05-12 Thread Ngoc Minh VO
From: Alaa Zubaidi (PDF) [mailto:alaa.zuba...@pdf.com] Sent: Monday, 11 May 2015 20:32 To: user@cassandra.apache.org Subject: CQL Data Model question Hi, I am trying to port an Oracle Table to Cassandra. The table is a wide table (931 columns) and could have millions of rows. name, filter1, filter2

CQL Data Model question

2015-05-11 Thread Alaa Zubaidi (PDF)
Hi, I am trying to port an Oracle Table to Cassandra. The table is a wide table (931 columns) and could have millions of rows. name, filter1, filter2...filter30, data1, data2...data900 The user would retrieve multiple rows from this table and filter (30 filter columns) by one or more (up to 3)

Re: Data tiered compaction and data model question

2015-02-19 Thread Roland Etzenhammer
Hi Cass, just a hint from the off - if I got it right you have: Table 1: PRIMARY KEY ( (event_day,event_hr),event_time) Table 2: PRIMARY KEY (event_day,event_time) Assuming your events to write come in by wall clock time, the first table design will have a hotspot on a specific node getting al
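As a rough illustration of the hotspot concern in the snippet above, the sketch below derives the composite partition key `(event_day, event_hr)` from an event timestamp. The names `partition_key` and `writes_per_partition` and the sample burst are hypothetical, not taken from the thread; the sketch only shows that events arriving in wall-clock order within one hour all share a single partition key.

```python
from collections import Counter
from datetime import datetime, timedelta


def partition_key(event_time):
    """Return (event_day, event_hr), mirroring Table 1's partition key."""
    return (event_time.strftime("%Y-%m-%d"), event_time.hour)


def writes_per_partition(event_times):
    """Count how many events fall into each (event_day, event_hr) partition."""
    return Counter(partition_key(t) for t in event_times)


# Hypothetical burst: one event per second for ten minutes.
start = datetime(2015, 2, 19, 6, 0, 0)
burst = [start + timedelta(seconds=s) for s in range(600)]
counts = writes_per_partition(burst)
```

Under these assumptions every write in the burst targets the same partition, which is the situation the reply warns about.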

RE: Data tiered compaction and data model question

2015-02-19 Thread Mohammed Guller
, velocity and variety. It doesn’t look like your data has the volume or velocity that a standard RDBMS cannot handle. Mohammed From: Kai Wang [mailto:dep...@gmail.com] Sent: Thursday, February 19, 2015 6:06 AM To: user@cassandra.apache.org Subject: Re: Data tiered compaction and data model question

Re: Data tiered compaction and data model question

2015-02-19 Thread cass savy
ohammed From: cass savy [mailto:casss...@gmail.com] Sent: Wednesday, February 18, 2015 4:21 PM To: user@cassandra.apache.org Subject: Data tiered compaction and data model question

Re: Data tiered compaction and data model question

2015-02-19 Thread Kai Wang
t in a day? What is the worst-case scenario? Mohammed From: cass savy [mailto:casss...@gmail.com] Sent: Wednesday, February 18, 2015 4:21 PM To: user@cassandra.apache.org Subject: Dat

Re: Data tiered compaction and data model question

2015-02-18 Thread cass savy
avy [mailto:casss...@gmail.com] Sent: Wednesday, February 18, 2015 4:21 PM To: user@cassandra.apache.org Subject: Data tiered compaction and data model question We want to track events in a log CF/table and should be able to query for events that occurred in r

RE: Data tiered compaction and data model question

2015-02-18 Thread Mohammed Guller
What is the maximum number of events that you expect in a day? What is the worst-case scenario? Mohammed From: cass savy [mailto:casss...@gmail.com] Sent: Wednesday, February 18, 2015 4:21 PM To: user@cassandra.apache.org Subject: Data tiered compaction and data model question We want to track

Data tiered compaction and data model question

2015-02-18 Thread cass savy
We want to track events in a log CF/table and should be able to query for events that occurred in a range of minutes or hours for a given day. Multiple events can occur in a given minute. I have listed 2 table designs and am leaning towards table 1 to avoid a large wide row. Please advise on *Table 1*: not very wid

Re: data model question : finding out the n most recent changes items

2013-07-11 Thread Eric Stevens
I think there is not an extremely simple solution to your problem. You will probably need to use multiple tables to get the view you need. One keyed just by file UUID, which tracks some basic metadata about the file including the last modified time. Another as a materialized view of the most rece

Re: data model question : finding out the n most recent changes items

2013-07-11 Thread Jimmy Lin
-Original Message- From: y2k...@gmail.com on behalf of Jimmy Lin Sent: Thu 11-Jul-13 13:09 To: user@cassandra.apache.org Subject: Re: data model question : finding out the n most recent changes items What I mean is, I really just w

RE: data model question : finding out the n most recent changes items

2013-07-11 Thread Lohith Samaga M
-Original Message- From: y2k...@gmail.com on behalf of Jimmy Lin Sent: Thu 11-Jul-13 13:09 To: user@cassandra.apache.org Subject: Re: data model question : finding out the n most recent changes items What I mean is, I really just want the last modified date instead of a series of timestamps and still

Re: data model question : finding out the n most recent changes items

2013-07-11 Thread Jimmy Lin
What I mean is, I really just want the last modified date instead of a series of timestamps and still be able to sort or order by it. (Maybe I should rephrase my question as how to sort or order by the last modified column in a row.) CREATE TABLE user_file ( user_id uuid, modified_date timest

Re: data model question : finding out the n most recent changes items

2013-07-11 Thread aaron morton
From what you described, this sounds like the most appropriate: CREATE TABLE user_file ( user_id uuid, modified_date timestamp, file_id timeuuid, PRIMARY KEY(user_id, modified_date) ); If you normally need more information about the file then either store that as addit
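To make the intent of the `user_file` layout above more concrete, here is a minimal in-memory sketch, assuming rows are grouped by `user_id` and kept ordered by `modified_date` within each group. The class `UserFileTable` and its methods are hypothetical stand-ins, not Cassandra APIs, and the toy does not reproduce Cassandra's upsert behaviour for repeated primary keys.

```python
import bisect
from collections import defaultdict


class UserFileTable:
    """Toy stand-in for user_file with PRIMARY KEY (user_id, modified_date)."""

    def __init__(self):
        # user_id -> list of (modified_date, file_id), kept sorted ascending
        self._partitions = defaultdict(list)

    def insert(self, user_id, modified_date, file_id):
        """Add a row to the user's partition, preserving clustering order."""
        bisect.insort(self._partitions[user_id], (modified_date, file_id))

    def most_recent(self, user_id, n):
        """Return the n newest (modified_date, file_id) pairs, newest first."""
        if n <= 0:
            return []
        rows = self._partitions.get(user_id, [])
        return rows[-n:][::-1]


table = UserFileTable()
table.insert("user-1", 3, "file-c")
table.insert("user-1", 1, "file-a")
table.insert("user-1", 2, "file-b")
latest = table.most_recent("user-1", 2)
```

Because each partition is already ordered by modification time, fetching the n most recent entries is a slice from one end rather than a sort at read time.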

data model question : finding out the n most recent changes items

2013-07-09 Thread Jimmy Lin
I have an application that needs to find the n most recently modified files for a given user id. I started out with a few tables but still couldn't get what I want. I hope someone can point me in the right direction... See my tables below. #1 won't work, because file_id's timeuuid contains creation time,

RE: CQL3 Data Model Question

2013-05-08 Thread Adriano Paggi
ssage- From: Hiller, Dean [mailto:dean.hil...@nrel.gov] Sent: Tuesday, 07 May 2013 05:52 PM To: user@cassandra.apache.org Subject: Re: CQL3 Data Model Question Playorm is not yet on CQL3 and cassandra doesn't work well with +10,000 CF's as we went down that path and cassandra can'

Re: CQL3 Data Model Question

2013-05-07 Thread Hiller, Dean
't believe that we really have any hotspots from what I can tell. Dean From: Keith Wright mailto:kwri...@nanigans.com>> Reply-To: "user@cassandra.apache.org<mailto:user@cassandra.apache.org>" mailto:user@cassand

Re: CQL3 Data Model Question

2013-05-07 Thread Keith Wright
ndra.apache.org<mailto:user@cassandra.apache.org>" mailto:user@cassandra.apache.org>> Date: Tuesday, May 7, 2013 2:02 PM To: "user@cassandra.apache.org<mailto:user@cassandra.apache.org>" mailto:user@cassandra.apache.org>> Subject: CQL3 Data Model Q

Re: CQL3 Data Model Question

2013-05-07 Thread Hiller, Dean
right mailto:kwri...@nanigans.com>> Reply-To: "user@cassandra.apache.org<mailto:user@cassandra.apache.org>" mailto:user@cassandra.apache.org>> Date: Tuesday, May 7, 2013 2:02 PM To: "user@cassandra.apache.org<mailto:user@cassandra.apache.org>" mailto:use

CQL3 Data Model Question

2013-05-07 Thread Keith Wright
Hi all, I was hoping you could provide some assistance with a data modeling question (my apologies if a similar question has already been posed). I have time based data that I need to store on a per customer (aka app id) basis so that I can easily return it in sorted order by event time.

Re: Data model question, storing Queue Message

2012-04-30 Thread aaron morton
> Isn't Kafka too young for production use? The best way to advance the project is to use it and contribute your experience and time. btw, checking out kafka is a great idea. There are people around having Fun Times with Kafka in production Cheers - Aaron Morton Fre

Re: Data model question, storing Queue Message

2012-04-30 Thread Morgan Segalis
Isn't Kafka too young for production use? Clearly that would fit my needs much better, but I can't afford an early-stage project that is not ready for production. Is it? On 30 Apr 2012, at 14:28, samal wrote: > > > On Mon, Apr 30, 2012 at 5:52 PM, Morgan Segalis wrote: > Hi Samal, > > Th

Re: Data model question, storing Queue Message

2012-04-30 Thread samal
On Mon, Apr 30, 2012 at 5:52 PM, Morgan Segalis wrote: > Hi Samal, > > Thanks for the TTL feature, I wasn't aware of its existence. > > Day's partitioning will be less wide than month partitioning (about 30 > times less give or take ;-) ) > Per day it should have something like 100 000 message

Re: Data model question, storing Queue Message

2012-04-30 Thread Morgan Segalis
Hi Samal, Thanks for the TTL feature, I wasn't aware of its existence. Day's partitioning will be less wide than month partitioning (about 30 times less give or take ;-) ) Per day it should have something like 100 000 messages stored, most of it would be retrieved so deleted before the TTL f

Re: Data model question, storing Queue Message

2012-04-30 Thread samal
On Mon, Apr 30, 2012 at 4:25 PM, Morgan Segalis wrote: > Hi Aaron, > > Thank you for your answer, I was beginning to think that my question would > never be answered ;-) > > Actually, this is what I was going for, except one thing, instead of > partitioning row per month, I thought about partition

Re: Data model question, storing Queue Message

2012-04-30 Thread Morgan Segalis
Hi Aaron, Thank you for your answer, I was beginning to think that my question would never be answered ;-) Actually, this is what I was going for, except one thing, instead of partitioning row per month, I thought about partitioning per day, so that every day I launch the cleaning tool, and it

Re: Data model question, storing Queue Message

2012-04-29 Thread aaron morton
Message Queue is often not a great use case for Cassandra. For information on how to handle high delete workloads see http://www.datastax.com/dev/blog/leveled-compaction-in-apache-cassandra It is hard to create a model without some idea of the data load, but I would suggest you start with: CF: Us

Data model question, storing Queue Message

2012-04-26 Thread Morgan Segalis
Hi everyone! I'm fairly new to Cassandra and I'm not yet familiar with the column-oriented NoSQL model. I have worked on it for a while, but I can't seem to find the best model for what I'm looking for. I have Erlang software that lets users connect and communicate with each other, whe

Re: Materialized Views or Index CF - data model question

2012-04-11 Thread aaron morton
x column data value as column name and book_id as value? >> >> You do not need a different CF for each custom secondary index. Try putting >> the name of the index in the row key. >> >> What will you recommend? >> >> Take another look at the queries you *need* to support. Then b

Re: Materialized Views or Index CF - data model question

2012-04-10 Thread Data Craftsman
. Then build a small proof of concept to see if Cassandra will work for you. Hope that helps. - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 6/04/2012, at 6:46 AM, Data Craftsman wrote: Howdy,

Re: Materialized Views or Index CF - data model question

2012-04-08 Thread aaron morton
ill you recommend? Take another look at the queries you *need* to support. Then build a small proof of concept to see if Cassandra will work for you. Hope that helps. - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 6/04/2012, at 6:46 AM, Data Craftsm

Re: Materialized Views or Index CF - data model question

2012-04-05 Thread Radim Kolar
Will 1500 bytes row size be large or small for Cassandra from your understanding? Performance degradation starts at 500MB rows; it's very slow if you hit this limit.

Re: data model question

2012-03-12 Thread Tamar Fraenkel
Thanks! Better than mine, as it considered later additions of services! Will update my code, Thanks *Tamar Fraenkel * Senior Software Engineer, TOK Media ta...@tok-media.com Tel: +972 2 6409736 Mob: +972 54 8356490 Fax: +972 2 5612956 On Mon, Mar 12, 2012 at 11:

Re: data model question

2012-03-12 Thread Sasha Dolgy
An alternative would be to add another row to your user CF specific to Facebook ids. Column ID would be the Facebook identifier and value would be your internal uuid. Consider when you want to add another service like twitter. Will you then add another CF per service or just another row specific now

Re: data model question

2012-03-12 Thread aaron morton
In this case, where you know the query upfront, I add a custom secondary index using another CF to support the query. It's a little easier here because the data won't change. UserLookupCF (using composite types for the key value) row_key: e.g. "facebook:12345" or "twitter:12345" col_name : e.g
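The lookup idea in the snippet above can be sketched as follows. This is only an illustration under stated assumptions: a plain dictionary stands in for the lookup column family, the row key is built by string concatenation rather than a real composite type, and the helper names `lookup_key`, `register` and `find_user` are hypothetical.

```python
def lookup_key(service, external_id):
    """Build a row key such as "facebook:12345" (service, then external id)."""
    if ":" in service:
        raise ValueError("service name must not contain ':'")
    return f"{service}:{external_id}"


# Hypothetical stand-in for the lookup CF: row key -> internal user UUID (text).
user_lookup = {}


def register(service, external_id, internal_uuid):
    """Record the mapping from an external service id to the internal id."""
    user_lookup[lookup_key(service, external_id)] = internal_uuid


def find_user(service, external_id):
    """Return the internal UUID, or None when no mapping exists."""
    return user_lookup.get(lookup_key(service, external_id))


register("facebook", "12345", "internal-uuid-1")
```

Prefixing the key with the service name means a later service, such as Twitter, needs no new column family, only new rows.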

Re: data model question

2012-03-11 Thread Tamar Fraenkel
Hi! Thanks for the response. From what I read, secondary indices are good only for columns with few possible values. Is this a good fit for my case? I have a unique Facebook id for every user. Thanks *Tamar Fraenkel * Senior Software Engineer, TOK Media ta...@tok-media.com

Re: data model question

2012-03-11 Thread Marcel Steinbach
Either you do that or you could think about using a secondary index on the fb user name in your primary cf. See http://www.datastax.com/docs/1.0/ddl/indexes Cheers On 11.03.2012, at 09:51, Tamar Fraenkel wrote: Hi! I need some advice: I have user CF, which has a UUID key which is my internal u

data model question

2012-03-11 Thread Tamar Fraenkel
Hi! I need some advice: I have user CF, which has a UUID key which is my internal user id. One of the columns is the facebook_id of the user (if it exists). I need to have the reverse mapping from facebook_id to my UUID. My intention is to add a CF for the mapping from Facebook Id to my id: user_by_fbid =

Re: Data Model Question

2012-01-23 Thread aaron morton
> 1. regarding time slicing, if at any point of time I am interested in what > happened in the last T minutes, then I will need to query more than one row > of the DimentionUpdates, right? Yerp. Sometimes that's what's needed. > 2. What did you mean by "You will also want to partition the

Re: Data Model Question

2012-01-22 Thread Tamar Fraenkel
Hi! Thank you very much for your response! I have a couple of questions regarding it, some are just to make sure I understood you: 1. regarding time slicing, if at any point of time I am interested in what happened in the last T minutes, then I will need to query more than one row of the Dimention

Re: Data Model Question

2012-01-22 Thread aaron morton
In general if you are collecting data over time you should consider partitioning the rows to avoid creating very large rows. Also if you have a common request you want to support consider modeling it directly rather than using secondary indexes. Assuming my understanding of the problem is in

Re: Data Model Question

2012-01-21 Thread Edward Capriolo
On Sat, Jan 21, 2012 at 7:49 PM, Jean-Nicolas Boulay Desjardins < jnbdzjn...@gmail.com> wrote: > Milind Parikh, Rainbird is backed by Twitter... My worry is that you > might not be around in the future... Also, do you have evidence that > your system is better? Because Rainbird is used by Twitter. >

Re: Data Model Question

2012-01-21 Thread Tamar Fraenkel
Hi It may be my lack of knowledge but both have to do with counting, which is not what I need. What is wrong with the two models I suggested? Tamar Sent from my iPod On Jan 22, 2012, at 2:49 AM, Jean-Nicolas Boulay Desjardins wrote: > Milind Parikh, Rainbird is backed by Twitter... My worry is

Re: Data Model Question

2012-01-21 Thread Jean-Nicolas Boulay Desjardins
Milind Parikh, Rainbird is backed by Twitter... My worry is that you might not be around in the future... Also, do you have evidence that your system is better? Because Rainbird is used by Twitter. On Sat, Jan 21, 2012 at 6:55 PM, Milind Parikh wrote: > > I used rainbird as inspiration for Countand

Re: Data Model Question

2012-01-21 Thread Milind Parikh
I used rainbird as inspiration for Countandra (& some of publicly available data structures from rainbird preso). That said, there are significant differences between the two architectures. Additionally as Cassandra begins to provide triggers, some very interesting things will become possible in Co

Re: Data Model Question

2012-01-21 Thread Jean-Nicolas Boulay Desjardins
But What about: Rainbird? On Sat, Jan 21, 2012 at 10:52 AM, R. Verlangen wrote: > > A couple of days ago I came across Countandra ( http://countandra.org/ ). It > seems that it might be a solution for you. > > Gr. Robin > > > 2012/1/20 Tamar Fraenkel >> >> Hi! >> >> I am a newbie to Cassandra a

Re: Data Model Question

2012-01-21 Thread R. Verlangen
A couple of days ago I came across Countandra ( http://countandra.org/ ). It seems that it might be a solution for you. Gr. Robin 2012/1/20 Tamar Fraenkel > > Hi! > > I am a newbie to Cassandra and seeking some advice regarding the data > model I should use to best address my needs. > >

Data Model Question

2012-01-20 Thread Tamar Fraenkel
Hi! I am a newbie to Cassandra and seeking some advice regarding the data model I should use to best address my needs. For simplicity, what I want to accomplish is: I have a system that has users (potentially ~10,000 per day) and they perform actions in the system (total of ~50,000 a day). Each Use

Re: Data Model Question

2010-12-02 Thread Pablo D. Salgado
Hello Aaron, Thanks for your reply. I will try it. Greetings, Pablo 2010/12/2 Aaron Morton > I say yes to all your questions about what you can do with Solr. > > Some background on the technology... > > Lucene is a Java library for doing full text search > http://lucene.apache.org/java/doc

Re: Data Model Question

2010-12-02 Thread Aaron Morton
I say yes to all your questions about what you can do with Solr. Some background on the technology... Lucene is a Java library for doing full text search http://lucene.apache.org/java/docs/index.html Solr turns Lucene into an HTTP server and adds a bunch of other features such as making it easier

Re: Data Model Question

2010-12-02 Thread Pablo D. Salgado
Hello Aaron and Jake, Thank you for your reply. I've worked with Cassandra for 6 months but I have never used Lucandra. I will try Lucandra, but I must ask (before starting): is it possible to meet my searching/pagination/sorting requirements with Lucandra? Thank you in advance, Pablo 2010/12/2 Jake Luciani

Re: Data Model Question

2010-12-02 Thread Jake Luciani
You can also run Solr with Cassandra as the backend: https://github.com/tjake/Lucandra/tree/solandra -Jake On Thu, Dec 2, 2010 at 6:27 AM, aaron morton wrote: > Have you considered using Solr / lucene for the search? It has a lot more > search features, and it's really good at faceted navigatio

Re: Data Model Question

2010-12-02 Thread aaron morton
Have you considered using Solr / lucene for the search? It has a lot more search features, and it's really good at faceted navigation through a product catalogue. It sounds like it would be a better fit for this task. You can build facets for your price ranges, do the product name thing and filt

Data Model Question

2010-12-01 Thread Pablo D. Salgado
Hello, I need to store "products" data (product.name, product.price, product.state and product.owner) in Cassandra 0.7 rc1. The problem is that I need to get "products" where product.price > XX AND product.price < XX AND product.name = XXX AND product.state = XXX. Also I need to return the products

Re: Data model question - column names sort

2010-04-19 Thread Jonathan Ellis
On Thu, Apr 15, 2010 at 6:01 PM, Sonny Heer wrote: > Need a way to have two different types of indexes. > > Key: aTextKey > ColumnName: aTextColumnName:55 > Value: "" > > Key: aTextKey > ColumnName: 55:aTextColumnName > Value: "" > > All the valuable information is stored in the column name itself

Data model question - column names sort

2010-04-15 Thread Sonny Heer
Need a way to have two different types of indexes. Key: aTextKey ColumnName: aTextColumnName:55 Value: "" Key: aTextKey ColumnName: 55:aTextColumnName Value: "" All the valuable information is stored in the column name itself. Above two can be in different column families... Queries: Given a ke
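The two column-name layouts in the question above sort differently, which a small sketch can show. It assumes column names are compared as plain text; the helpers `text_first` and `number_first`, the sample entries, and the zero-padding width are hypothetical choices for illustration and do not come from the thread.

```python
def text_first(name, number):
    """Column name of the form aTextColumnName:55."""
    return f"{name}:{number}"


def number_first(name, number, width=10):
    """Column name of the form 55:aTextColumnName, zero-padded so that
    text order matches numeric order (the width is an assumption)."""
    return f"{number:0{width}d}:{name}"


entries = [("beta", 55), ("alpha", 100), ("alpha", 7)]

# Sorted as plain text, the way a text comparator would order column names.
by_text = sorted(text_first(n, v) for n, v in entries)
by_number = sorted(number_first(n, v) for n, v in entries)
```

Without padding, a text comparison would place `100:alpha` before `55:beta`, so the number-first layout only supports numeric range scans if the numbers are padded or a numeric comparator is used.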