Dear all,

my aim is to aggregate some data. To be more specific: I have a table 'mytable'
with columns a, b, c, and ctr. The value of ctr is always 1
(individual-level data in raw format).
To find out how many people share the same covariate setting,
I have used the following SQL in the past:

SELECT a, b, c, SUM(ctr) AS aggregat
FROM mytable
GROUP BY a, b, c;
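(For reference, here is a minimal, self-contained sketch of that aggregation using Python's built-in sqlite3 module; the table and column names match the example above, but the sample rows are made up. Since ctr is always 1, SUM(ctr) and COUNT(*) give the same number.)

```python
import sqlite3

# Toy in-memory version of 'mytable'; sample rows are invented.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE mytable (a INTEGER, b INTEGER, c INTEGER, ctr INTEGER)")
rows = [(1, 1, 2, 1), (1, 1, 2, 1), (1, 2, 3, 1), (2, 1, 1, 1)]
con.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?)", rows)

# SUM(ctr) and COUNT(*) agree because ctr is always 1.
result = con.execute(
    "SELECT a, b, c, SUM(ctr) AS aggregat, COUNT(*) AS n "
    "FROM mytable GROUP BY a, b, c"
).fetchall()
print(sorted(result))
```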


The problem I face now is that 'mytable' contains more records than my
older datasets (the new data contain between one and five million
records) and also more than three variables (10 INTEGER variables, 1
TEXT variable).

I assume there is a better way to perform this aggregation than what I
have done so far, and I'd be very happy to get some suggestions. One
possibility I see is to split the initial raw data into smaller chunks,
aggregate each chunk, and then aggregate the partial results. But I don't
think this is a very good solution, and I hope that someone on this list
can give me some hints.
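(To make the chunked idea concrete, here is a small sketch in plain Python with made-up data: each chunk is aggregated separately, and the partial counts are then merged.)

```python
from collections import Counter

# Made-up (a, b, c) rows split into two chunks for illustration.
chunks = [
    [(1, 1, 2), (1, 1, 2), (2, 1, 1)],  # chunk 1
    [(1, 1, 2), (1, 2, 3)],             # chunk 2
]

total = Counter()
for chunk in chunks:
    partial = Counter(chunk)  # per-chunk aggregation (like GROUP BY)
    total.update(partial)     # re-aggregate the partial counts

print(dict(total))
```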


I am using SQLite 3.2.8 on WinXP Pro.
I am not an expert on databases, but so far it has not been necessary to
really learn a lot, since SQLite was very easy to use. (Main use so far:
storing my data in various tables and extracting/aggregating/combining what
I need from R via the RSQLite package, www.r-project.org.)

I hope this is the right place to ask this question, and that it has not
been asked zillions of times before (I searched a bit, but maybe not in
the right places to be successful).

Thank you very much in advance,
Roland

