RE: Database interface - would like advice on oracle library binding
The following code illustrates a _generic_ interface to low-level database code. The left-fold iterator doQuery is completely generic over any possible iterator -- no matter how many columns the query returns, what are the types of these columns and what is the type of the seed (accumulator). The code for doQuery remains the same. The iterator allocates buffers for columns at the beginning and frees the buffers at the very end. Again, this buffer handling is generic. There is no longer need to write extraction/unmarshalling function for specific types of rows. We only need fetching functions for specific datatypes (not columns!). Again, the query and the row buffer management code is completely generic. I guess I'm repeating myself. The tests: -- Query returns one column of type String -- Never mind undefined: we return some static data in the buffers, -- we don't have any oracle to bind to test1 = doQuery undefined undefined iter1 ([]::[String]) where iter1:: String - [String] - Either [String] [String] iter1 s acc = Right $ s:acc -- Query returns two columns of types String and Int test2 = doQuery undefined undefined iter2 ([]::[(String,Int)]) where iter2:: String - Int - [(String,Int)] - Either [(String,Int)] [(String,Int)] iter2 s i acc = Right $ (s,i):acc -- Query returns three columns of types Int, String and Int test3 = doQuery undefined undefined iter3 ([]::[(Int,String,Int)]) where iter3:: Int - String - Int - [(Int,String,Int)] - Either [(Int,String,Int)] [(Int,String,Int)] iter3 i1 s i2 acc = Right $ (i1,s,i2):acc Use the function runtests to run either of these tests. The code follows. Compiler flags: -fglasgow-exts -fallow-overlapping-instances -- DB column buffers type BufferSize = Int data BufferType = ORA_char | ORA_int type Position = Int -- column number of the result table data Buffer = Buffer { bufptr :: String -- for this stub, just use String , nullindptr :: String -- likewise , retsizeptr :: String -- likewise , size:: BufferSize , pos:: Position , ora_type:: BufferType } -- understandably, below is just a stub ... alloc_buffer (siz, typ) ps = return $ Buffer { bufptr = show ps, pos = ps, size = siz, ora_type = typ} -- In this stub, don't do anything free ptr = return () -- DB Column types class DBType a where alloc_buffer_hints:: a - (BufferSize, BufferType) col_fetch:: Buffer - IO a instance DBType String where alloc_buffer_hints _ = (2000, ORA_char) col_fetch buffer = return (bufptr buffer) instance DBType Int where alloc_buffer_hints _ = (4, ORA_int) col_fetch buffer = return (read $ bufptr buffer) -- need to add more ... -- Row iteratees. Note, the folowing two instances cover ALL possible -- iteratees. No other instances are needed class SQLIteratee iter seed where iter_apply:: [Buffer] - seed - iter - IO (Either seed seed) alloc_buffers:: Position - iter - seed - IO [Buffer] instance (DBType a) = SQLIteratee (a-seed-Either seed seed) seed where iter_apply [buf] seed fn = col_fetch buf = (\v - return$ fn v seed) alloc_buffers n _ _ = sequence [alloc_buffer (alloc_buffer_hints (undefined::a)) n] instance (SQLIteratee iter' seed, DBType a) = SQLIteratee (a-iter') seed where iter_apply (buf:others) seed fn = col_fetch buf = (\v - iter_apply others seed (fn v)) alloc_buffers n fn seed = do this_buffer - alloc_buffer (alloc_buffer_hints (undefined::a)) n other_buffers - alloc_buffers (n+1) (fn (undefined::a)) seed return (this_buffer:other_buffers) free_buffers = mapM_ free -- The left fold iterator -- the query executor data Session -- not relevant for this example data SQLStmt db_execute session query = return () db_fetch_row buffers = return () -- use static data doQuery:: (SQLIteratee iter seed) = Session - SQLStmt - iter - seed - IO seed -- In this example, we just allocate buffers, fetch two rows and terminate -- with a clean-up doQuery session query iteratee seed = do buffers - alloc_buffers 0 iteratee seed db_execute session query db_fetch_row buffers (Right seed) - iter_apply buffers seed iteratee db_fetch_row buffers (Right seed) - iter_apply buffers seed iteratee free_buffers buffers return seed -- Tests -- Query returns one column of type String test1 = doQuery undefined undefined iter1 ([]::[String]) where iter1:: String - [String] - Either [String] [String] iter1 s acc = Right $ s:acc -- Query returns two columns of types String and Int test2 = doQuery undefined undefined iter2 ([]::[(String,Int)]) where iter2:: String - Int - [(String,Int)] - Either [(String,Int)] [(String,Int)] iter2 s i acc = Right $ (s,i):acc -- Query returns three columns of types Int, String and Int test3 = doQuery
Database interface - would like advice on oracle library binding
(2nd attempt; mailman thinks I'm not a list member, but it still keeps sending me mail.) Still making slow progress on an Oracle database binding... now I'm trying to fit the API I have into some sort of abstract interface (like the one(s) discussed previously: http://haskell.org/pipermail/haskell-cafe/2003-August/004957.html ). 1. Is the left-fold the best/only interface to expose? I think yes, but that doesn't allow fine-grained control over cursors i.e. being able to open many cursors at once and interleave fetches from them. Or does it? 2. I'm finding it hard to write a doQuery function that takes an extraction function that isn't a pig to write. Some advice would be useful here... (and a long-ish explanation follows): The Oracle Call Interface (OCI) requires that I allocate buffers for the result of a single row fetch, before the first row is fetched. So a query involves: - prepare statement etc - allocate buffers (one for each column - call OCI C function DefineByPos) - fetch row - extract/marshal data from buffer into Haskell types (which are then processed by fold function) - fetch, marshal (repeat until no more rows) - free buffers i.e. the same buffers are re-used for each row. The problem for me is how to specify the left-fold function in terms of the low-level API. If I want to supply extraction functions (to extract Haskell values from result buffer), how do I manage the buffer allocation in the doQuery function? The buffer allocate/free code also needs to know the column positions and types, in the same manner as the extract functions. I want to be able to write code like this: results - doQuery dbconn sqltext [] \row results - do name- stringv row 1 address - stringv row 2 return (name,address):results .. where the stringv function extracts/marshals a Haskell String from the result buffer. The intermediate approach I currently have means I have to pass an IO action into the doQuery function that, when evaluated, allocates the buffer and returns two more actions: - an action that extracts the row as a tuple - another action that frees the buffer The doQuery function evaluates the initial action (to allocate the buffer), uses the extract action to build the result (at present a list), and when there are no more rows, uses the free action to free the buffer. This approach is quite awkward (especially w.r.t. writing extract functions), and it's hard for me to see how to build a better interface. Hard, because of the memory management requirements. Here's a chunk of the code. A lot of it is OCI plumbing, but I hope you can see how awkward it is to create an extract function (see ex3 at the bottom). Given pointers to the buffer, extract a string of variable length (you have to terminate it yourself). fetchStringVal :: OCIColInfo - IO String fetchStringVal (_, bufptr, nullindptr, retsizeptr) = do retsize - liftM cShort2Int (peek retsizeptr) nullind - liftM cShort2Int (peek nullindptr) -- unused pokeByteOff (castPtr bufptr) retsize nullByte val - peekCString (castPtr bufptr) return val Free a single column's buffer. freeColBuffer :: OCIColInfo - IO () freeColBuffer (_, bufptr, nullindptr, retsizeptr) = do free bufptr free retsizeptr free nullindptr Create a buffer for a string column, and return the extract and free IO actions. getExtractFnString :: Int - ErrorHandle - StmtHandle - IO (IO String, IO ()) getExtractFnString posn err stmt = do c - defineCol err stmt posn 2000 oci_SQLT_CHR return ((fetchStringVal c), (freeColBuffer c)) doQuery uses the extractFns action to create the result buffer, and the two actions (extract and free) which are passed to doQuery2. doQuery2 :: ErrorHandle - StmtHandle - IO a - IO () - [a] - IO [a] doQuery2 err stmt extractData freeMem results = do rc - fetch err stmt if rc == oci_NO_DATA then do freeMem return results else do v - extractData doQuery2 err stmt extractData freeMem (v:results) doQuery :: Session - String - (ErrorHandle - StmtHandle - IO (IO a, IO ())) - IO [a] doQuery (Sess env err con) qry extractFns = do stmt - getStmt env err prepare err stmt qry execute err con stmt (extractData, freeMem) - extractFns err stmt doQuery2 err stmt extractData freeMem [] The interface provided by doQuery means I have to write extract functions like this. Here's one for a select that returns three String columns. It's quite awkward... ex3 :: ErrorHandle - StmtHandle - IO (IO (String, String, String), IO ()) ex3 err stmt = do (fetchcol1, freecol1) - getExtractFnString 1 err stmt (fetchcol2, freecol2) - getExtractFnString 2 err stmt (fetchcol3, freecol3) - getExtractFnString 3 err stmt return ( do { s1 - fetchcol1; s2 - fetchcol2; s3 - fetchcol3; return (s1, s2, s3) } , do { freecol1; freecol2; freecol3 } )
Database interface - would like advice on oracle library binding
Bayley, Alistair writes: : | Still making slow progress on an Oracle database binding... now I'm trying | to fit the API I have into some sort of abstract interface (like the one(s) | discussed previously: | http://haskell.org/pipermail/haskell-cafe/2003-August/004957.html ). | | | 1. Is the left-fold the best/only interface to expose? I think yes, | but that doesn't allow fine-grained control over cursors i.e. being | able to open many cursors at once and interleave fetches from | them. Or does it? It looks like the interleaving would be limited to a nested loop structure: a cursor could be processed in full during one extraction for another cursor. Application-side nested loop structures are often a poor substitute for server-side joins. | 2. I'm finding it hard to write a doQuery function that takes an | extraction function that isn't a pig to write. Some advice would be | useful here... (and a long-ish explanation follows): : Here's my attempt to summarise the piggishness you describe: The interface to Oracle requires that you initialise a cursor by allocating a suitably typed buffer for each column prior to fetching the first row, and finalise a cursor by freeing those buffers after fetching the last row. This means that we must iterate over the columns 3 times. We would prefer to express this iteration only once, and have the other 2 happen automatically within the library. (As opposed to what ex3 does, which is to iterate for getExtractFnString, iterate for fetchcolN, and iterate for freecolN.) Here's one approach: find the OCI equivalent of JDBC's ResultSetMetaData, and use it to drive the allocation and freeing of buffers. Here's another: Add a mode attribute to the abstract data type which encompasses ErrorHandle and StmtHandle. (I'll persist in referring to that ADT as Cursor.) Expect extraction functions to be written along these lines: \cursor result - do field1 - getIntcursor field2 - getString cursor field3 - getString cursor return ((field1, field2, field3):result, True) Make getInt (and friends) behave differently depending on the mode of the cursor they're passed: either allocate a buffer and return _|_, decode and return the current column of the current row, or free a buffer and return _|_. doQuery could then apply the extraction function once in Allocate mode after opening the cursor, once per fetched row in Decode mode, and once in Free mode at the end. There's nothing to stop an extraction function from varying the number of get___ functions it applies, or trying to match their results when not in Decode mode. These weakness could be mitigated by: Pointing out that some database connection standards (JDBC, and for all I know also ODBC) don't guarantee that you can still get at a row's 1st column after you've looked at its 2nd column, i.e. there's a precedent for such fragility. Complicating the extraction functions by giving them the type (Cursor - b - IO (IO (b, Bool))) , expecting that all the get___ functions are applied in the outer IO layer, and undertaking that the inner IO layer will only be used in Decode mode. Regards, Tom ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe