On Wed, Jan 04, 2006 at 01:08:47AM +0200, Krasimir Angelov wrote:
> Hi John Goerzen,
>
> I wonder which design decisions are causing you troubles. Could you
> explain this? All features which you mentioned can be added easily to
> HSQL as well. It is better to share the effort on single library
> rather than to have multiple similar libraries. I am willing to work
> on HSQL improvement.

Hi Krasimir,

First off, thank you for all your work on HSQL. It is great to have a
database layer like that for Haskell, and despite my troubles with it, I
think you have done a wonderful service for the community. I continue
to have HSQL code in production and have found it a useful tool.

The final thing that prompted me to do this was that the PostgreSQL --
and possibly the Sqlite -- module for HSQL was segfaulting. I spent
quite a bit of time with gdb and the HSQL code, and even with Simon
Marlow's assistance, was unable to track down the precise cause. To
make matters worse, the problem was intermittent. The Haskell program
in question was pure Haskell, and switching it to HDBC solved this
issue.

I also had extremely high memory usage when dealing with large result
sets -- somewhere on the order of 700MB; the same program consumes about
12MB with HDBC. My guess from looking briefly at the code is that the
entire result set is being read into memory up front.

There were a number of other problems as well:

 * No support for prepared queries or for supplying replaceable
   parameters. (Supported everywhere in HDBC, which removes the need
   for escaping.) That's really my #1 complaint (well, aside from
   the segfaulting <g>).

 * Escaping function was global, rather than per-DB, which caused some
   trouble with Sqlite3 at least. (See SF bug 1324873 that I submitted
   on Oct. 12 with no replies since then)

 * No way to retrieve result data by column index instead of column
   name

 * HSQL provided no way to see the result set as a lazy list, and the
   public API provided no way to implement that. (There is 'fetch',
   but it seems that the entire result set was read into memory in
   advance anyway.)

 * The code wasn't very easy to understand. (This may be just me
   though.)

 * Unclear semantics in multithreaded programs.

 * No test suite.

I knew I couldn't fix it the right way in HSQL (since I had trouble
following the code), and it also seemed like these weren't
high-priority issues for you.
(No blame here; it's the same way for me with the code I maintain. I
can't expect you to fix my bugs in something that's free.)

In hindsight, I should have contacted you first, and I apologize for not
doing that. I just sorta sat down to design a DB API that I'd like, and
pretty soon had a working prototype, and then some drivers... I'm
dangerous when I'm on vacation ;-)

I'm not quite sure where to go from here. Both packages have features
that the other lacks. I don't think that it's possible to merge all the
HDBC features into HSQL without a major API and architecture
refactoring. The HSQL features that HDBC lacks are mostly in progress
already, and I've tried to design the HDBC API with them in mind.

So, I'd invite you to take a look at the HDBC API at

  http://darcs.complete.org/hdbc/doc/Database-HDBC.html

and let me know how you think we might be able to collaborate. If
nothing else, I'm sure we can share ideas. (Some of that you'll see in
HDBC, I'm sure.) Perhaps we could even have an HDBC backend for HSQL
and vice-versa.

-- John

_______________________________________________
Haskell mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/haskell
