Oliver,

> Haven't you just replaced one preprocessing step with another, then?

Generally not. The most common problem with the current choice of escape character is that there are *lots* of data load scenarios with backslashes in the text strings. The extra preprocessing to escape them is unnecessary on other databases and, in effect, makes the load even slower because you have to prepare the data ahead of time. Also, note that this patch can do escape processing too, and the net result will still be 5+ times faster than what is there.

In the data warehousing industry, data conversion and manipulation are normally kept distinct from data loading. Conversion is done by tools called ETL (Extract Transform Load), and the database will have a very fast path for direct loading of the resulting data. PostgreSQL is definitely a strange database right now in that there is a default filter applied to the data on load. It's even stranger because the load path is so slow, and we've now found that the slowness comes mostly from non-optimized parsing and attribute conversion routines.

The question of how to do escape processing is a separate one, but it is wrapped up in the question of whether to introduce a new loading routine or to optimize the old one.

- Luke