On 21.03.2012 13:14, Fujii Masao wrote:
     PANIC:  space reserved for WAL record does not match what was
written, CurrPos: C/0, EndPos: B/FF000000

So I think that the patch would have a bug which handles WAL boundary wrongly.

Thanks for the testing! These WAL boundary issues are really tricky, you found bugs in that area before, and I found and fixed one before posting the last version, and apparently there's still at least one left.

Overall, what do you (and others) think about the state of this patch? I'm starting to feel that this needs to be pushed to 9.3. That bug might not be very hard to fix, but the fact that these bugs are still cropping up at this late stage makes me uneasy. That boundary calculation in particular is surprisingly tricky, and I think it could be made less tricky with some refactoring of the WAL-logging code, replacing XLogRecPtr with uint64s, like Peter (IIRC) suggested a while ago. And that seems like 9.3 material. Also, there's still these two known issues:

1. It slows down the WAL insertion in a single backend by about 10%
2. With lots of backends inserting tiny records concurrently, you get spinlock contention, which consumes a lot of CPU. Without the patch, you get lwlock contention and bandwidth isn't any better, but you sleep rather than spin.

I'm afraid those issues aren't easily fixable. I haven't been able to identify the source of slowdown in the single-backend case, it seems to be simply the distributed cost of the extra bookkeeping. That might be acceptable, 10% slowdown of raw WAL insertion speed is not good, but WAL insertion only accounts for a fraction of the total CPU usage for any real workload, so I believe the slowdown of a real application would be more like 1-3%, at worst. But I would feel more comfortable if we had more time to test that.

The spinlock contention issue might be acceptable too. I think it would be hard to run into it in a real application, and even then, the benchmarks show that although you spend a lot of CPU time spinning, you get at least the same overall bandwidth with the patch, which is what really matters. And I think it could be alleviated by reducing the time the spinlock is held, and I think that could be done by making the space reservation calculations simpler. If we got rid of the limitation that the WAL record header is never split across WAL pages, and always stored the continuation record header on all WAL pages, the space reservation calculation could be reduced to essentially "currentpos += size". But that again seems 9.3 material.

So, although none of the issues alone is a show-stopper, but considering all these things together, I'm starting to feel that this needs to be pushed to 9.3. Thoughts?

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to