Nathan Kurz <[EMAIL PROTECTED]> writes:
> On Thu, Dec 15, 2005 at 09:17:48PM +0000, Andrew McDermott wrote: >> For example, I'm currently computing a histogram in application code for >> a moderately sized table (7+ million rows), but I'm wondering whether it >> would be quicker to get the results computed via a custom SQLite >> function. I'm expecting it to be quicker because the current >> implementation traverses the JNI boundary for each row in the result set >> whereas a custom function wouldn't need to do this. > > No, it is not currently possible. The way the VDBE is set up, all > functions return only a single value. But do you need multiple rows, > or multiple values? For a histogram, would multiple values suffice? Thanks for this suggestion. I have done this and it works just fine. > > If so you can fake it. With aggregate functions it's possible to > compute the histogram, and then return it in some special form. You > could for example define a histogram function that returns a comma > separated list of text, or you could have it return an array as a blob. > >> select histogram(column) from table; >> 1,4,1 > > Or you could split the histogram function into pieces, one for each > bin, and write a simple aggregate function that just counts the > occurences of one particular value: > >> select number(column,1), number(column,2), number(column,3) from table; >> 1,4,1 ----------------- This pertains to the mailing list thread http://www.mail-archive.com/[email protected]/msg12076.html Andy, Nathan, I've recently "discovered" sqlite and I am very happy with it! As such, this is my first post, and I came across this thread starting back in 2005 while searching for how to return multiple values in table form from a user-defined extension function. I'm sure that this problem has been solved already, and that I'm not doing anything new, but I didn't see this approach in the mailing list, so I thought I'd give an alternate way to return multiple values (or approximate the return, rather than having to deal with a comma-delimited list, which doesn't allow further SQL set functions to be applied to the result). The approach is to construct (and index) a simple one-dimensional table with integers ranging from 1 to some arbitrarily large number (larger than the number of rows in your largest table, for example) and then use this as a type of "iterator" across the data. For example, to create a histogram from data values, you may try the following: sqlite>.mode column sqlite>.header on sqlite>create table his(value int); sqlite>insert into his values (1); sqlite>insert into his values (2); sqlite>insert into his values (3); sqlite>insert into his values (4); sqlite>insert into his values (5); sqlite>insert into his values (4); sqlite>insert into his values (1); sqlite>insert into his values (1); sqlite>insert into his values (1); sqlite>create table iterator(i int); sqlite>insert into iterator values (1); sqlite>insert into iterator values (2); sqlite>insert into iterator values (3); sqlite>insert into iterator values (4); sqlite>insert into iterator values (5); sqlite>insert into iterator values (6); sqlite>insert into iterator values (7); sqlite>insert into iterator values (8); sqlite>insert into iterator values (9); sqlite>create index iter on iterator(i); sqlite>select iterator.i as i, count(his.value) as count from his,iterator where his.value=iterator.i group by iterator.i; i count --------- --------- 1 4 2 1 3 1 4 2 5 1 sqlite>select iterator.i as i, iterator.i+2 as j, count(his.value) as count from his,iterator where his.value between iterator.i and iterator.i+2 group by iterator.i; i j count --------- --------- --------- 1 3 6 2 4 4 3 5 4 4 6 3 5 7 1 An interesting thing about using such an 'iterator table' is that it can be used for traversals using simple functions as well as aggregate functions. Consider, for example, the simple function charindex contained in extension-functions.c contributed by Mikey C. and packaged by Liam Healy. This function takes two arguments, one a string to be found in another larger string, designated by the second argument. There is an optional third argument, which designates the string position from which to begin searching. The function returns the integer position of the start of the first string in the second string: sqlite>select charindex("ELVIS","ELVIShasleftthebuildingbutELVISwillbebacktomorrownight") as pos; pos --------- 1 To find all occurrences of ELVIS, and more importantly, to return the results in tabular form, use the iterator table and the optional third parameter, to which we pass the iterator value: sqlite>select charindex("ELVIS","ELVIShasleftthebuildingbutELVISwillbebacktomorrownight",i ) as pos from iterator where iterator.i<length("ELVIShasleftthebuildingbutELVISwillbebacktomorrownight") and pos!=0 group by pos; pos --------- 1 27 Please forgive me if I'm making this too simple and people already know this. But I use this a lot with my own user-defined functions and it saves me a lot of trouble, especially since I can do it all in SQL without a lot of ODBC back and forth, which is slow. Best Regards, Michael --------------------------------- Michael Janis <mailto:[EMAIL PROTECTED]> [EMAIL PROTECTED] [EMAIL PROTECTED] --------------------------------- _______________________________________________ sqlite-users mailing list [email protected] http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

