
Re: Partition Key - Wide rows?

2016-10-06 Thread Saladi Naidu

It depends on the partition/primary key design. To support all 3 queries, the
partition key has to be the org id, with the other columns as clustering keys.
If there are many orgs that will be OK, but if there is only one org, a single
partition will hold all the data, which is not good.

Naidu Saladi
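To illustrate the point about partitions, here is a minimal Python sketch (not Cassandra code) that groups a few made-up issue rows by partition key. The row data and the `rows_per_partition` helper are hypothetical, and real partition size depends on bytes stored, not just row counts.

```python
from collections import Counter

# Made-up rows: (org_id, team_id, project_id, issue_id)
ROWS = [
    ("org1", "teamA", "proj1", "ISSUE-1"),
    ("org1", "teamA", "proj1", "ISSUE-2"),
    ("org1", "teamA", "proj2", "ISSUE-1"),
    ("org1", "teamB", "proj3", "ISSUE-1"),
]


def rows_per_partition(rows, key_width):
    """Count rows per partition when the first `key_width` columns
    of each row form the partition key."""
    return Counter(row[:key_width] for row in rows)


# Partition key = (org_id): every row of the org shares one partition.
by_org = rows_per_partition(ROWS, 1)

# Partition key = (org_id, team_id, project_id): one partition per project.
by_project = rows_per_partition(ROWS, 3)

print(by_org)
print(by_project)
```

With a single org, the first design puts every row in one partition, while the second spreads the same rows over one partition per project.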
 

On Thursday, October 6, 2016 12:14 PM, Ali Akhtar wrote:

> Thanks, Phil.
>
> 1- In my use-case, its probably okay to partition all the org data
> together. This is for a b2b enterprise SaaS application, the customers will
> be organizations.
>
> So it is probably okay to store each org's data next to each other, right?
>
> 2- I'm thinking of having the primary key be: (org_id, team_id,
> project_id, issue_id).
>
> In the above case, will there be a skinny row per issue, or a wide row per
> org / team / project?
>
> 3- Just to double check, with the above primary key, can I still query
> using just the org_id, org + team id, and org + team + project id?
>
> 4- If I wanted to refer to a particular issue, it looks like I'd need to
> send all 4 parameters. That may be problematic. Is there a better way of
> modeling this data?
>
> On Thu, Oct 6, 2016 at 9:30 PM, Philip Persad wrote:
>
> 1) No.  Your first 3 queries will work but not the last one (get issue by
> id).  In Cassandra when you query you must include every preceding portion
> of the primary key.
>
> 2) 64 bytes (16 * 4), or somewhat more if storing as strings?  I don't
> think that's something I'd worry too much about.
>
> 3) Depends on how you build your partition key.  If partition key is (org
> id), then you get one partition per org (probably bad depending on your
> dataset).  If partition key is (org id, team id, project id) then you will
> have one partition per project which is probably fine (again, depending on
> your dataset).
>
> Cheers,
>
> -Phil
> --
> From: Ali Akhtar
> Sent: 2016-10-06 9:04 AM
> To: user@cassandra.apache.org
> Subject: Partition Key - Wide rows?
>
> Heya,
>
> I'm designing some tables, where data needs to be stored in the following
> hierarchy:
>
> Organization -> Team -> Project -> Issues
>
> I need to be able to retrieve issues:
>
> - For the whole org - using org id
> - For a team (org id + team id)
> - For a project (org id + team id + project id)
> - If possible, by using just the issue id
>
> I'm considering using all 4 ids as the primary key. The first 3 will use
> UUIDs, except issue id which will be an alphanumeric string, unique per
> project.
>
> 1) Will this setup allow using all 4 query scenarios?
> 2) Will this make the primary key really long, 3 UUIDs + similar length'd
> issue id?
> 3) Will this store issues as skinny rows, or wide rows? If an org has a
> lot of teams, which have a lot of projects, which have a lot of issues,
> etc, could I have issues w/ running out of the column limit of wide rows?
> 4) Is there a better way of achieving this scenario?

Re: Partition Key - Wide rows?

2016-10-06 Thread Jonathan Haddad

>  In my use-case, its probably okay to partition all the org data together.


Maybe, maybe not.  Cassandra doesn't handle really big partitions very well
right now.  If you've got more than 100MB of data per org, you're better
off breaking it up (by project or team) and doing multiple queries to
stitch the data together client side.
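As a rough sketch of what "multiple queries stitched together client side" could look like, here is a small Python example. It uses in-memory dictionaries as stand-ins for the tables; `ISSUES_BY_PROJECT`, `PROJECTS_BY_ORG`, `fetch_project_issues` and `issues_for_org` are all hypothetical names, and the lookup of an org's projects is an assumption (it would have to live in some small table of its own).

```python
# Stand-in for a table whose partition key is (org_id, team_id, project_id).
ISSUES_BY_PROJECT = {
    ("org1", "teamA", "proj1"): ["ISSUE-1", "ISSUE-2"],
    ("org1", "teamA", "proj2"): ["ISSUE-1"],
    ("org1", "teamB", "proj3"): ["ISSUE-7"],
}

# Stand-in for a small lookup table listing each org's (team_id, project_id) pairs.
PROJECTS_BY_ORG = {
    "org1": [("teamA", "proj1"), ("teamA", "proj2"), ("teamB", "proj3")],
}


def fetch_project_issues(org_id, team_id, project_id):
    """One single-partition read; a real client would run a query here."""
    return ISSUES_BY_PROJECT.get((org_id, team_id, project_id), [])


def issues_for_org(org_id):
    """Read each project partition separately and combine the results."""
    combined = []
    for team_id, project_id in PROJECTS_BY_ORG.get(org_id, []):
        for issue_id in fetch_project_issues(org_id, team_id, project_id):
            combined.append((team_id, project_id, issue_id))
    return combined


def issues_for_team(org_id, team_id):
    """Same idea, restricted to one team's projects."""
    return [row for row in issues_for_org(org_id) if row[0] == team_id]


print(issues_for_org("org1"))
```

The trade-off is more round trips per org-wide read in exchange for smaller partitions.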




Re: Partition Key - Wide rows?

2016-10-06 Thread Ali Akhtar

Thanks, Phil.

1- In my use-case, it's probably okay to partition all the org data
together. This is for a B2B enterprise SaaS application; the customers will
be organizations.

So it is probably okay to store each org's data next to each other, right?

2- I'm thinking of having the primary key be: (org_id, team_id, project_id,
issue_id).

In the above case, will there be a skinny row per issue, or a wide row per
org / team / project?

3- Just to double check, with the above primary key, can I still query
using just the org_id, org + team id, and org + team + project id?

4- If I wanted to refer to a particular issue, it looks like I'd need to
send all 4 parameters. That may be problematic. Is there a better way of
modeling this data?


RE: Partition Key - Wide rows?

2016-10-06 Thread Philip Persad

1) No.  Your first 3 queries will work but not the last one (get issue by
id).  In Cassandra when you query you must include every preceding portion
of the primary key.

2) 64 bytes (16 * 4), or somewhat more if storing as strings?  I don't
think that's something I'd worry too much about.

3) Depends on how you build your partition key.  If the partition key is
(org id), then you get one partition per org (probably bad, depending on your
dataset).  If the partition key is (org id, team id, project id), then you
will have one partition per project, which is probably fine (again, depending
on your dataset).
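To make points 1 and 3 concrete, here is a small Python sketch of the rule in simplified form: a query needs the whole partition key, plus the clustering columns in order with no gaps. It only models equality restrictions and ignores secondary indexes, ALLOW FILTERING and the like; `is_supported` is a made-up helper, not a Cassandra API.

```python
def is_supported(partition_key, clustering_columns, restricted):
    """Return True if equality restrictions on `restricted` fit the
    simplified rule: every partition key column is present, and the
    restricted clustering columns are a leading prefix of the clustering order."""
    restricted = set(restricted)
    partition = set(partition_key)
    if not partition <= restricted:
        return False  # the whole partition key must be supplied
    remaining = restricted - partition
    if not remaining <= set(clustering_columns):
        return False  # column is not part of the primary key
    # No gaps allowed: the restricted clustering columns must be the first N.
    return set(clustering_columns[:len(remaining)]) == remaining


# Design A: PRIMARY KEY ((org_id), team_id, project_id, issue_id)
A = (("org_id",), ("team_id", "project_id", "issue_id"))
# Design B: PRIMARY KEY ((org_id, team_id, project_id), issue_id)
B = (("org_id", "team_id", "project_id"), ("issue_id",))

for name, (pk, ck) in (("A", A), ("B", B)):
    for query in (["org_id"], ["org_id", "team_id"],
                  ["org_id", "team_id", "project_id"], ["issue_id"]):
        print(name, query, is_supported(pk, ck, query))
```

Under this model, design A accepts the first three queries but not the lookup by issue id alone, while design B only accepts queries that name the org, team and project together.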

Cheers,

-Phil
--
From: Ali Akhtar 
Sent: 2016-10-06 9:04 AM
To: user@cassandra.apache.org
Subject: Partition Key - Wide rows?

Heya,

I'm designing some tables, where data needs to be stored in the following
hierarchy:

Organization -> Team -> Project -> Issues

I need to be able to retrieve issues:

- For the whole org - using org id
- For a team (org id + team id)
- For a project (org id + team id + project id)
- If possible, by using just the issue id

I'm considering using all 4 ids as the primary key. The first 3 will use
UUIDs, except issue id which will be an alphanumeric string, unique per
project.

1) Will this setup allow using all 4 query scenarios?
2) Will this make the primary key really long, 3 UUIDs + similar length'd
issue id?
3) Will this store issues as skinny rows, or wide rows? If an org has a lot
of teams, which have a lot of projects, which have a lot of issues, etc,
could I have issues w/ running out of the column limit of wide rows?
4) Is there a better way of achieving this scenario?
