Dear all, my aim is to aggregate some data. To be more specific: I have a table 'mytable' with variables a, b, c, and ctr. The values for variable ctr are all 1 (individual level data in the raw format). If I want to know how many people with the same covariate setting there are, I used the following SQL code in the past:
SELECT a, b, c, sum(ctr) as aggregat FROM mytable GROUP by a, b, c; The problem I face now is that 'mytable' contains more records than my older datasets (the new data contain between one and five million records) and also more than three variables (10 INTEGER variables, 1 TEXT variables). I assume there is a better way to perform this aggregation than what I did so far. I'd be very happy if I get some suggestions. One possibility I see is to split the initial raw data into smaller chunks, perform the aggregation for all chunks and aggregate the chunks again. But I don't think this is a very good solution and I hope that someone here on this list can give me some hints. I am using sqlite 3.2.8 on WinXP Pro. I am not an expert on databases, but so far it was not necessary to really learn a lot since sqlite was very easy to use (Main use so far: storing my data in various tables and extract/aggregate/combine what I need from R via the RSqlite package (www.r-project.org). I hope this is the right place to ask this question and hopefully this question has not been asked zillions of times before (I searched a bit around but maybe not in the right places to be successful). Thank you very much in advance, Roland -- Ist Ihr Browser Vista-kompatibel? Jetzt die neuesten Browser-Versionen downloaden: http://www.gmx.net/de/go/browser ----------------------------------------------------------------------------- To unsubscribe, send email to [EMAIL PROTECTED] -----------------------------------------------------------------------------