Re: [sqlite] Re: concers about database size

2006-03-24 Thread Jim C. Nasby
On Wed, Mar 22, 2006 at 07:35:32PM +0100, Daniel Franke wrote: > > I can tell you that even 750M rows wouldn't be a huge deal for PostgreSQL, > > and 20G of data is nothing. Though your table would take somewhere > > around 30G due to the higher per-row overhead in PostgreSQL; I'm not > > really su

Re: [sqlite] Re: concers about database size

2006-03-22 Thread Daniel Franke
> > This may take a while, about 20 hours maybe. The partition has approx > > 10GB, I can't afford more. Let's hope that this is sufficient. > > 20 hours seems rather long. Even if you have to worry about uniqueness > constraints, there are ways to deal with that that should be much faster > (deal

Re: [sqlite] Re: concers about database size

2006-03-22 Thread Jim C. Nasby
On Thu, Mar 16, 2006 at 09:53:27PM +0100, Daniel Franke wrote: > > > That would be an excellent question to add to the FAQ: > > "How do I estimate the resource requirements for a database?" > > I spent some time to create 3GB of sample data (just zeros, about half the > size of the actual data s

Re: [sqlite] Re: concers about database size

2006-03-16 Thread John Stanton
Jay Sprenkle wrote: On 3/16/06, Daniel Franke <[EMAIL PROTECTED]> wrote: The original idea was to get rid of thousands of files to store their data in one single container. Those (ASCII) files add up to approx 5GB ... If so, are you trying to use a blender to stir the ocean? You might reeval

Re: [sqlite] Re: concers about database size

2006-03-16 Thread Carl Jacobs
Daniel > The most common scenarios: > - get a single marker from subset of individuals > - get a subset of markers from a single individual > - get a subset of markers from a subset of individuals Sounds like this might define your database. Each individual has 500,000 markers, but maybe they

Re: [sqlite] Re: concers about database size

2006-03-16 Thread Daniel Franke
> > This may take a while, about 20 hours maybe. The partition has approx > > 10GB, I can't afford more. Let's hope that this is sufficient. > > Import data without indexes, then add them when the import is complete. > It's much faster. Thanks. I cannot do this with the real data, unfortunately
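
The advice quoted in this message (load the rows first, build the index afterwards) could look roughly like the sketch below. It is a minimal illustration using Python's standard `sqlite3` module, not anything posted in the thread: the index name `genotypes_marker_individual`, the indexed column pair, the helper `bulk_load` and the sample rows are all assumptions, and the `REFERENCES` clauses of the original table are left out so the snippet runs on its own.

```python
import sqlite3


def bulk_load(conn, rows):
    """Insert all rows first, then build the index in a single pass."""
    conn.execute(
        "CREATE TABLE genotypes ("
        "markerid INTEGER NOT NULL, "
        "individualid INTEGER NOT NULL, "
        "genA INTEGER, "
        "genB INTEGER)"
    )
    # One transaction for the whole import avoids a commit per row.
    with conn:
        conn.executemany("INSERT INTO genotypes VALUES (?, ?, ?, ?)", rows)
    # Assumed index: unique over (markerid, individualid), created only
    # after the data is in place.
    conn.execute(
        "CREATE UNIQUE INDEX genotypes_marker_individual "
        "ON genotypes (markerid, individualid)"
    )


# Hypothetical sample data: 100 markers x 10 individuals, all zeros.
conn = sqlite3.connect(":memory:")
sample = [(m, i, 0, 0) for m in range(100) for i in range(10)]
bulk_load(conn, sample)
count = conn.execute("SELECT COUNT(*) FROM genotypes").fetchone()[0]
```

Creating the unique index at the end still checks uniqueness, but it does so once over the finished table; if duplicates exist, the `CREATE UNIQUE INDEX` statement fails rather than the individual inserts.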

Re: [sqlite] Re: concers about database size

2006-03-16 Thread Jay Sprenkle
On 3/16/06, Daniel Franke <[EMAIL PROTECTED]> wrote: > > > That would be an excellent question to add to the FAQ: > > "How do I estimate the resource requirements for a database?" > > I spent some time to create 3GB of sample data (just zeros, about half the > size of the actual data set I have to

Re: [sqlite] Re: concers about database size

2006-03-16 Thread Daniel Franke
> That would be an excellent question to add to the FAQ: > "How do I estimate the resource requirements for a database?" I spent some time to create 3GB of sample data (just zeros, about half the size of the actual data set I have to deal with). I'm currently importing it into the database. As

Re: [sqlite] Re: concers about database size

2006-03-16 Thread Jay Sprenkle
> Why would it require a lot of RAM? I ask not because I doubt you, but > because my intuition says that a BTree based database should scale > pretty well. While certainly it would run faster if you can fit the > whole thing in RAM, if the index can be made to fit in RAM it seems > like the data

Re: [sqlite] Re: concers about database size

2006-03-16 Thread Nathan Kurz
On Thu, Mar 16, 2006 at 11:34:31AM -0600, Jay Sprenkle wrote: > > > You may legitimately need one really large table but most applications > > > don't. > > Too bad. My guess is that you're doing the right thing trying to consolidate. > It's going to take expensive hardware no matter what you end u

RE: [sqlite] Re: concers about database size

2006-03-16 Thread Fred Williams
> -Original Message- > From: Jay Sprenkle [mailto:[EMAIL PROTECTED] > Sent: Thursday, March 16, 2006 10:15 AM > To: sqlite-users@sqlite.org > Subject: Re: [sqlite] Re: concers about database size > > > > Does sound like an awful lot of data. I think the questi

Re: [sqlite] Re: concers about database size

2006-03-16 Thread Jay Sprenkle
> > You may legitimately need one really large table but most applications > > don't. > > The most common scenarios: > - get a single marker from subset of individuals > - get a subset of markers from a single individual > - get a subset of markers from a subset of individuals > > There is no "o

Re: [sqlite] Re: concers about database size

2006-03-16 Thread Daniel Franke
> How do you use your data? Do you really need to compare all the information > about any and all individuals? > [...] > You may legitimately need one really large table but most applications > don't. The most common scenarios: - get a single marker from subset of individuals - get a subset of
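
The access patterns listed in this message all reduce to one query shape: a set of markers crossed with a set of individuals. The sketch below is one hedged way to express that against the `genotypes` table from this thread, using Python's standard `sqlite3` module. The helper name `fetch_genotypes`, the index name and the sample data are assumptions made for illustration.

```python
import sqlite3


def fetch_genotypes(conn, markers, individuals):
    """Return rows for every (marker, individual) pair in the two subsets."""
    marker_slots = ",".join("?" * len(markers))
    individual_slots = ",".join("?" * len(individuals))
    sql = (
        "SELECT markerid, individualid, genA, genB FROM genotypes "
        f"WHERE markerid IN ({marker_slots}) "
        f"AND individualid IN ({individual_slots}) "
        "ORDER BY markerid, individualid"
    )
    return conn.execute(sql, [*markers, *individuals]).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE genotypes (markerid INTEGER NOT NULL, "
    "individualid INTEGER NOT NULL, genA INTEGER, genB INTEGER)"
)
# Assumed index so lookups by marker and individual avoid a full scan.
conn.execute(
    "CREATE UNIQUE INDEX genotypes_marker_individual "
    "ON genotypes (markerid, individualid)"
)
# Hypothetical data: 5 markers x 4 individuals.
conn.executemany(
    "INSERT INTO genotypes VALUES (?, ?, ?, ?)",
    [(m, i, m, i) for m in range(5) for i in range(4)],
)

one_marker = fetch_genotypes(conn, [2], [0, 1])        # single marker, several individuals
one_person = fetch_genotypes(conn, [1, 3], [2])        # several markers, single individual
both_subsets = fetch_genotypes(conn, [0, 4], [1, 3])   # subset x subset
```

With the index ordered as (markerid, individualid), queries that fix the marker first are served directly; queries driven mainly by individual might benefit from a second index with the columns reversed, at the cost of extra disk space.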

Re: [sqlite] Re: concers about database size

2006-03-16 Thread Jay Sprenkle
> The table in question: > CREATE TABLE genotypes (markerid integer NOT NULL REFERENCES marker(id), > individualid integer NOT NULL REFERENCES individuals(id), > genA integer, > genB integer); > > I don't see how to segment t

[sqlite] Re: concers about database size

2006-03-16 Thread Daniel Franke
Note: this is a combined reply to answers sent by Fred and Jay. Fred wrote: > > > If so, are you trying to use a blender to stir the ocean? > > > You might reevaluate if you're using the right tool for the job. > > > > That's my question: IS sqlite the right tool here? =) > > And I believe he is a

Re: [sqlite] Re: concers about database size

2006-03-16 Thread Jay Sprenkle
On 3/16/06, Daniel Franke <[EMAIL PROTECTED]> wrote: > Each single file contains detailed genotypic information of many > individuals at a given genomic region. We have to implement _loads_ of > quality control measures to ensure the maximum possible data > correctness. Earlier, we did this manual

Re: [sqlite] Re: concers about database size

2006-03-16 Thread Jay Sprenkle
> Does sound like an awful lot of data. I think the question might be > reworded to ask is there any manageable logical groups of data that > might lend themselves to simple segmentation into separate > tables/databases? I thought about asking that as well, but wondered if there wasn't a simpler

[sqlite] Re: concers about database size

2006-03-16 Thread Daniel Franke
> > > You might reevaluate if you're using the right tool for the job. > > That's my question: IS sqlite the right tool here? =) > Then I guess the right question is what are your goals? To make > maintenance easier? > Why were the thousands of files a problem? Short answer: I want to improve - d

RE: [sqlite] Re: concers about database size

2006-03-16 Thread Fred Williams
> -Original Message- > From: Daniel Franke [mailto:[EMAIL PROTECTED] > Sent: Thursday, March 16, 2006 9:32 AM > To: sqlite-users@sqlite.org > Subject: [sqlite] Re: concers about database size > > > > > But now, there's another thing. I figured out how

Re: [sqlite] Re: concers about database size

2006-03-16 Thread Jay Sprenkle
On 3/16/06, Daniel Franke <[EMAIL PROTECTED]> wrote: > The original idea was to get rid of thousands of files to store their data > in one single container. Those (ASCII) files add up to approx 5GB ... > > > If so, are you trying to use a blender to stir the ocean? > > You might reevaluate if you'r

[sqlite] Re: concers about database size

2006-03-16 Thread Daniel Franke
> > But now, there's another thing. I figured out how large my database > > will become and I'm scared of its size: up to 20GB and more! A single > > table, 4 columns, each holding an integer (32 bit) will have > > approximately 750 million rows. This amounts to ~11GB. Adding a > > unique two-col
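
The ~11GB figure quoted here follows from simple arithmetic on the numbers given in the message, as the short Python check below shows. It is a back-of-the-envelope sketch only: it counts raw payload at a fixed 4 bytes per value and ignores per-row, page and index overhead, so it is not a prediction of the actual file size.

```python
ROWS = 750_000_000   # approximate row count from the message
COLUMNS = 4          # four integer columns
BYTES_PER_INT = 4    # 32-bit integers, as stated in the message

raw_bytes = ROWS * COLUMNS * BYTES_PER_INT
raw_gib = raw_bytes / 2**30  # bytes -> GiB

print(f"{raw_bytes:,} bytes is about {raw_gib:.1f} GiB of raw payload")
```

SQLite stores integers in a variable-length form and adds its own record headers, so the real table may come out smaller or larger than this estimate depending on the values stored.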