Tom Lane wrote:
Dirk Riehle <d...@riehle.org> writes:
For one analysis, I created a table of about 15 columns and then ran an insert with a subselect into that table, where the select drew from three other tables, joined on their ids. One of the other tables has about 60M rows.

The insert ran for about 18h before running out of disk space. Before the query, there had been about 1TB free on the disk.

After the insert query failed, the disk space was not made available again; PostgreSQL did not free it up (or would not free it up quickly).

What that part sounds like is you mistyped the insert such that it
was inserting a huge number of rows.  It's not too hard to do if
you get the join condition wrong --- what you meant to be a sane
join can easily turn into a Cartesian product.  In theory vacuum
could reclaim the space eventually, but it'd take a while.
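
As far as I understand, a plain VACUUM would only mark the dead rows left behind by the failed insert as reusable; getting the disk space back to the OS takes something heavier. A rough sketch, assuming the bloated table is the insert's target table t:

select pg_size_pretty(pg_total_relation_size('t'));  -- how big did it get?
vacuum t;       -- marks the dead rows reusable, but won't shrink the file much
vacuum full t;  -- rewrites the table and returns the space (needs room for the copy)
truncate t;     -- or just throw the contents away, if they aren't needed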

Hmm... here is the insert:

insert into t select ... from commits c, diffs d, sloc_metrics sm where d.commit_id = c.id and sm.diff_id = d.id;

sm has 46M rows and d has 60M rows. I don't know enough about database engines to understand why it would not be able to do the matching incrementally, but might instead run into a Cartesian product?
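
If I understand Tom's point, the engine does match incrementally as long as the join predicates are right; the risk is a typo that drops or mangles one of them. A made-up two-table version of my query, not the actual statement:

-- intended: the predicate ties sm to d, so the result has roughly one row
-- per sloc_metrics row (about 46M)
select * from diffs d, sloc_metrics sm where sm.diff_id = d.id;

-- mistyped (wrong alias): the predicate only filters sm and no longer ties it
-- to d, so every remaining sloc_metrics row pairs with every diffs row,
-- on the order of 46M x 60M output rows
select * from diffs d, sloc_metrics sm where sm.diff_id = sm.id;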

In any case, I pared down the tables and it ran through quickly without further problems. So maybe I had a typo in there.
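
Next time I'll run EXPLAIN on the select part first and sanity-check the planner's row estimate before running the insert; as far as I can tell, a sane plan should estimate roughly sm's row count, while a runaway join would show something astronomically larger. Something like:

explain
select ...   -- same column list as in the insert above (elided here)
from commits c, diffs d, sloc_metrics sm
where d.commit_id = c.id and sm.diff_id = d.id;
-- the first line of the output shows the estimated row count for the whole query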

I rebooted soon thereafter.

During boot, fsck (it must have been fsck) gave me tons of "freeing inode" messages.

And this part is a filesystem bug; it cannot possibly be Postgres'
fault that the filesystem got corrupted.

One would think so? But the reboot should have stopped PostgreSQL properly.

Thanks for the help!

Dirk

--
Phone: +1 (650) 215 3459
Blog: http://dirkriehle.com
http://twitter.com/dirkriehle
