Nathan Kurz <[EMAIL PROTECTED]> writes:

 

> On Thu, Dec 15, 2005 at 09:17:48PM +0000, Andrew McDermott wrote:

>> For example, I'm currently computing a histogram in application code for

>> a moderately sized table (7+ million rows), but I'm wondering whether it

>> would be quicker to get the results computed via a custom SQLite

>> function.  I'm expecting it to be quicker because the current

>> implementation traverses the JNI boundary for each row in the result set

>> whereas a custom function wouldn't need to do this.

> 

> No, it is not currently possible.  The way the VDBE is set up, all

> functions return only a single value.  But do you need multiple rows,

> or multiple values?  For a histogram, would multiple values suffice?

 

Thanks for this suggestion.  I have done this and it works just fine.

 

> 

> If so you can fake it.  With aggregate functions it's possible to

> compute the histogram, and then return it in some special form.  You

> could for example define a histogram function that returns a comma

> separated list of text, or you could have it return an array as a blob.

> 

>> select histogram(column) from table;

>> 1,4,1

> 

> Or you could split the histogram function into pieces, one for each

> bin, and write a simple aggregate function that just counts the

> occurences of one particular value:

> 

>> select number(column,1), number(column,2), number(column,3) from table;

>> 1,4,1

 

-----------------

 

This pertains to the mailing list thread
http://www.mail-archive.com/[email protected]/msg12076.html

 

Andy, Nathan,

 

I've recently "discovered" sqlite and I am very happy with it!  As such,
this is my first post, and I came across this thread starting back in 2005
while searching for how to return multiple values in table form from a
user-defined extension function.  I'm sure that this problem has been solved
already, and that I'm not doing anything new, but I didn't see this approach
in the mailing list, so I thought I'd give an alternate way to return
multiple values (or approximate the return, rather than having to deal with
a comma-delimited list, which doesn't allow further SQL set functions to be
applied to the result).

 

The approach is to construct (and index) a simple one-dimensional table with
integers ranging from 1 to some arbitrarily large number (larger than the
number of rows in your largest table, for example) and then use this as a
type of "iterator" across the data.

 

For example, to create a histogram from data values, you may try the
following:

 

sqlite>.mode column

sqlite>.header on

sqlite>create table his(value int);

sqlite>insert into his values (1);

sqlite>insert into his values (2);

sqlite>insert into his values (3);

sqlite>insert into his values (4);

sqlite>insert into his values (5);

sqlite>insert into his values (4);

sqlite>insert into his values (1);

sqlite>insert into his values (1);

sqlite>insert into his values (1);

 

sqlite>create table iterator(i int);

sqlite>insert into iterator values (1);

sqlite>insert into iterator values (2);

sqlite>insert into iterator values (3);

sqlite>insert into iterator values (4);

sqlite>insert into iterator values (5);

sqlite>insert into iterator values (6);

sqlite>insert into iterator values (7);

sqlite>insert into iterator values (8);

sqlite>insert into iterator values (9);

sqlite>create index iter on iterator(i);

 

sqlite>select iterator.i as i, count(his.value) as count from his,iterator
where his.value=iterator.i group by iterator.i;

i           count

---------   ---------

1           4

2           1

3           1

4           2

5           1

 

sqlite>select iterator.i as i, iterator.i+2 as j, count(his.value) as count
from his,iterator where his.value between iterator.i and iterator.i+2 group
by iterator.i;

i          j          count

---------  ---------  ---------

1          3          6

2          4          4

3          5          4

4          6          3

5          7          1

 

An interesting thing about using such an 'iterator table' is that it can be
used for traversals using simple functions as well as aggregate functions.
Consider, for example, the simple function charindex contained in
extension-functions.c contributed by Mikey C. and packaged by Liam Healy.
This function takes two arguments, one a string to be found in another
larger string, designated by the second argument.  There is an optional
third argument, which designates the string position from which to begin
searching.  The function returns the integer position of the start of the
first string in the second string:

 

sqlite>select
charindex("ELVIS","ELVIShasleftthebuildingbutELVISwillbebacktomorrownight")
as pos;

pos

---------

1

 

To find all occurrences of ELVIS, and more importantly, to return the
results in tabular form, use the iterator table and the optional third
parameter, to which we pass the iterator value:

 

sqlite>select
charindex("ELVIS","ELVIShasleftthebuildingbutELVISwillbebacktomorrownight",i
) as pos from iterator

       where
iterator.i<length("ELVIShasleftthebuildingbutELVISwillbebacktomorrownight")
and pos!=0

       group by pos;

pos

---------

1

27

 

Please forgive me if I'm making this too simple and people already know
this.  But I use this a lot with my own user-defined functions and it saves
me a lot of trouble, especially since I can do it all in SQL without a lot
of ODBC back and forth, which is slow.

 

 

Best Regards,

 

Michael

 

---------------------------------

Michael Janis

 <mailto:[EMAIL PROTECTED]> [EMAIL PROTECTED]

[EMAIL PROTECTED]

---------------------------------

 

_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to