Hi all, based on Pavel's input I did some quick trials and thought I'd let you know the results. I'm running on an old Pentiu...@3.2GHz with 3GB memory and Win-7, so don't expect super-fast times!
1. I have one DB with 5411 units and 1517 measurements, with columns unit_id, m_1, .. m_1517 (sketched in the P.S. below). No indices and not 100% populated: the .db size is 48'834kB, whereas for full population I'd expect 5411*1517*8 = 65'668kB.

To dump the entire DB into a .csv file with sqlite3.exe takes ~35s.
To dump a random column takes 0.42 .. 1.02s, with a mean of 0.65s.
To dump 5 randomly selected columns takes 0.65 .. 1.16s, with a mean of 0.88s.
To dump 100 randomly selected columns takes 2.34 .. 3.11s, with a mean of 2.54s.

(I always dump all units, and the times include the time to start sqlite3 - so it could be improved. The single-column runs show the most dispersion, and it is not clear which columns cost the most. Each series above includes 100 random selections.)

2. I quickly created a DB for 5500 units and 1500 measurements in the format unit_id, meas_id, meas_val (also sketched in the P.S. below). This database is fully populated with random values for meas_val and is 191MByte in size. Dumping the whole DB costs 77s, but this may be an unfair measurement since I simply run "select * from table;" rather than restoring the rectangular format of units x measurements.

Of course, now dumping one meas_id for all unit_id is expensive, so I need an index on meas_id to run queries like 'select * from table where meas_id=n'. That increases the .db size to 293MByte. Likely I'd need another index on unit_id as well, so the DB size would grow by another ~100MByte, which I clearly wouldn't like! With the index, dumping one meas_id with the query above costs 0.61s (I didn't do multiple runs here), but if I now want to select multiple meas_id with queries like

  select * from table where meas_id IN (n1, n2, ..)

it costs 1.62s for 5 meas_id and 27.2s (!!) for 100 meas_id, so I'm penalized not only in size but also in time compared to case 1 above. So I can't see the advantage of normalization here. (Note that I do not plan to join or index on measurement columns!)

So, back to my first question: what do I pay when I increase the compile-time limit of SQLite from 2000 to 32k columns? To be honest, I'm still considering staying within the 2000-column limit with a blocked table layout similar to the following (see the last sketch in the P.S.):

  unit_id, block_id, meas_1, .... meas_2000

so that measurements 2001 .. 4000 would come on a new row. (Yes, that's 2002 columns, so I need to take care... :-) )

Thanks for your help anyway - and let me know if you have further input or remarks, either regarding the measurements above (did I miss something important?) or for the case of >2000 measurements.

Greetings - Stefan
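P.S. For completeness, rough sketches of what I mean. Names like wide_meas, the file names, and the picked columns are placeholders, not my real schema, and the dump script is just one way of driving sqlite3.exe.

The wide layout of case 1 and a per-column dump:

  -- wide layout: one row per unit, one column per measurement
  CREATE TABLE wide_meas (
    unit_id INTEGER PRIMARY KEY,
    m_1     REAL,
    m_2     REAL,
    -- ... and so on up to ...
    m_1517  REAL
  );

  -- dump_cols.sql, run as:  sqlite3.exe test.db < dump_cols.sql
  .mode csv
  .output cols.csv
  SELECT unit_id, m_1, m_2, m_1517 FROM wide_meas;  -- the randomly picked columns
  .output stdout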
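The normalized layout of case 2, with the meas_id index that blew the file up to ~293MByte and the IN() query that got so slow (again placeholder names):

  -- normalized layout: one row per (unit, measurement) pair
  CREATE TABLE narrow_meas (
    unit_id  INTEGER,
    meas_id  INTEGER,
    meas_val REAL
  );

  -- needed so "one measurement for all units" doesn't scan the whole table;
  -- this is the index that cost ~100MByte of extra file size
  CREATE INDEX idx_narrow_meas_id ON narrow_meas(meas_id);

  -- one measurement for all units (~0.61s in my trial)
  SELECT unit_id, meas_val FROM narrow_meas WHERE meas_id = 42;

  -- several measurements at once (the query that took 27.2s for 100 ids)
  SELECT unit_id, meas_id, meas_val
    FROM narrow_meas
   WHERE meas_id IN (17, 123, 999);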
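Regarding the compile-time limit itself: as far as I understand, the relevant knob is the SQLITE_MAX_COLUMN compile-time option, which defaults to 2000 and can be raised to at most 32767, i.e. rebuilding the amalgamation with something like:

  cc -DSQLITE_MAX_COLUMN=32767 -c sqlite3.c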
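And the blocked layout I'm considering. One way to "take care" of the 2002-column problem would be 1998 measurement columns per block, so the table has exactly 2000 columns and stays within the default limit; measurement k would then live in row block_id = (k-1)/1998, column meas_((k-1)%1998 + 1). Again just a sketch with placeholder names:

  -- blocked layout: block 0 holds measurements 1..1998,
  -- block 1 holds measurements 1999..3996, and so on
  CREATE TABLE blocked_meas (
    unit_id   INTEGER,
    block_id  INTEGER,
    meas_1    REAL,
    meas_2    REAL,
    -- ... and so on up to ...
    meas_1998 REAL,
    PRIMARY KEY (unit_id, block_id)
  );

  -- e.g. measurement 2345 for all units: (2345-1)/1998 = 1, (2345-1)%1998 + 1 = 347
  SELECT unit_id, meas_347 FROM blocked_meas WHERE block_id = 1;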