I got interested enough in the psql-with-flex problem to go off and solve it. Attached is a working patch, which I'm now debating whether to apply. Comments solicited...
The patch removes about 200 lines of very spaghetti-ish code in mainloop.c. However, it adds an 875-line flex source file, which might be thought a bad tradeoff :-(. One bright spot is that about half of that total is a direct copy of the main backend lexer, so it's not really as much new, separately maintainable code as all that. Also, Andrew Dunstan's patch for supporting dollar-quoting would add about 100 lines to mainloop.c, versus only a dozen or so lines in the flex implementation. Once that's taken into account I don't think there is a lot of difference in effective SLOC to maintain.

I'm also of the opinion that the new C code in psqlscan.l is much more straightforward than the code removed from mainloop.c, though having just written it, I'm no doubt pretty biased.

Bruce was asking about speed. On normal-size queries I cannot measure any difference at all. For testing purposes I made up a file containing a single 750K query (just a "SELECT big-honking-string-constant", with the string literal broken into lines of 75 bytes). The client-side (psql) CPU time to run this file looks about like this on my machine:

    PGCLIENTENCODING        UNICODE    SJIS

    CVS tip                 1.57       1.82
    flex implementation     0.93       2.33

The flex implementation is consistently faster than CVS tip when dealing with backend-compatible encodings (such as UTF-8). It's consistently slower when it has to deal with a non-backend-safe encoding such as SJIS or Big5. But for real-world cases the differential is down in the noise either way.

I'm inclined to apply this but I can see where a person not comfortable with flex might feel differently. Opinions?

			regards, tom lane
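
For anyone wanting to try a similar timing test, here is a rough sketch of how a test file of the shape described above might be generated. This is not the script actually used. The function name make_big_query, the filler character, the reading of 750K as 750 * 1024 bytes, and the choice to break the literal with embedded newlines are all assumptions made for illustration.

```python
def make_big_query(total_bytes=750 * 1024, line_bytes=75):
    """Build one SELECT whose string literal spans many short lines.

    Assumptions (not taken from the original test): the literal is
    broken up by plain newlines inside the quotes, and the payload is
    just a repeated filler character that needs no quoting.
    """
    line = "x" * line_bytes                # one 75-byte chunk of the literal
    n_lines = total_bytes // line_bytes    # enough chunks to reach the target size
    body = "\n".join([line] * n_lines)     # newlines fall inside the literal
    return "SELECT '" + body + "';\n"
```

Writing the returned string to a file and running it with psql -f under different PGCLIENTENCODING settings, timed with something like time(1), would give numbers comparable in spirit to the table, though the absolute values will depend on the machine.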
Description: psql-flex.patch.gz