On Fri, May 19, 2006 at 10:02:50PM +0300, Hannu Krosing wrote:
> > > It's just SELECT count(*) FROM (SELECT * FROM accounts ORDER BY bid) a;
> > > If the tape routines were actually storing visibility information, I'd
> > > expect that to be pretty compressible in this case since all the tuples
> > > were presumably created in a single transaction by pgbench.
> > Was he not using pg_bench data ?

Hmm, so there were only 3 integer fields and one varlena structure which
was always empty. Prepend to that a tuple header whose fields are mostly
blank, or at least repeated, and yes, I can see how we might get a
25-to-1 compression. Maybe we need to change pgbench so that it puts
random text in the filler field; that would at least put some strain on
the compression algorithm...

> I guess that tapefiles compress better than average table because they
> are sorted, and thus at least a little more repetitive than the rest.
> If there are varlen types, then they usually also have abundance of
> small 4-byte integers, which should also compress at least better than
> 4/1, maybe a lot better.

Hmm, that makes sense. That also explains the 37-to-1 compression I was
seeing on indexes :).

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to
> litigate.
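
A rough way to sanity-check the reasoning above (sorted rows made of small
4-byte integers behind a mostly blank header should compress very well) is
a small Python sketch. Everything in it is an assumption made for
illustration: zlib stands in for whatever compressor is applied to the
tape files, HEADER is 24 zero bytes rather than the real tuple header
layout, and the three-integer row is only loosely modelled on the pgbench
data discussed here.

```python
import random
import struct
import zlib

# Hypothetical stand-in for a mostly blank tuple header (not the real layout).
HEADER = bytes(24)


def make_rows(n, seed=0):
    """Build n rows of three small integers in arbitrary (unsorted) order."""
    rng = random.Random(seed)
    return [(rng.randrange(50), rng.randrange(10), 0) for _ in range(n)]


def pack(rows):
    """Serialize each row as the fake header plus three 4-byte integers."""
    return b"".join(HEADER + struct.pack("<iii", *row) for row in rows)


def ratio(data):
    """Uncompressed size divided by zlib-compressed size."""
    return len(data) / len(zlib.compress(data))


rows = make_rows(20000)
unsorted_ratio = ratio(pack(rows))        # rows in arbitrary order
sorted_ratio = ratio(pack(sorted(rows)))  # the same rows after a sort

print(f"unsorted: {unsorted_ratio:.1f}-to-1")
print(f"sorted:   {sorted_ratio:.1f}-to-1")
```

The sorted run should come out well ahead of the unsorted one, since equal
rows end up adjacent. Replacing the zero in the third field with random
bytes would be one way to try the "random text in the filler field" idea
and watch the ratio drop.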