Nathan Kurz <[EMAIL PROTECTED]> writes: > On Thu, Dec 15, 2005 at 09:17:48PM +0000, Andrew McDermott wrote: >> For example, I'm currently computing a histogram in application code for >> a moderately sized table (7+ million rows), but I'm wondering whether it >> would be quicker to get the results computed via a custom SQLite >> function. I'm expecting it to be quicker because the current >> implementation traverses the JNI boundary for each row in the result set >> whereas a custom function wouldn't need to do this. > > No, it is not currently possible. The way the VDBE is set up, all > functions return only a single value. But do you need multiple rows, > or multiple values? For a histogram, would multiple values suffice?
Thanks for this suggestion. I have done this and it works just fine. > > If so you can fake it. With aggregate functions it's possible to > compute the histogram, and then return it in some special form. You > could for example define a histogram function that returns a comma > separated list of text, or you could have it return an array as a blob. > >> select histogram(column) from table; >> 1,4,1 > > Or you could split the histogram function into pieces, one for each > bin, and write a simple aggregate function that just counts the > occurences of one particular value: > >> select number(column,1), number(column,2), number(column,3) from table; >> 1,4,1 > > Both of these would be quite easy to do. That said, I'd really love > if it were possible to return either multiple values or multiple rows > from a user defined function. It would make my life much easier, > since then I could work with the results at the SQL level rather than > the application level. As it is, I'm currently I'm doing stuff like > writing functions that cache the results and return a handle, then > writing accessor functions that lookup the result via that handle. > > My impression is that multiple values would be difficult, as the VDBE > parser would have to know in advance what columns would be returned by > the function to be able to use the results internally. Multiple rows > (of a single value) seem like they would be a fairly simple addition, > as one would only need to specify that a function might return > multiple rows, and not how many or their type. > > --nate > -- andy