A reversed-string index, which would be used transparently on ends-with
queries, would not have to store the data redundantly in the base table.
 That is what I think you are suggesting?Though I would welcome the "smart"
approach you suggest, it is far  more ambitious than a raw-reverse function.
I was trying to keep my request in the shallow end of the swimming pool.

The main drawback of the raw-reverse version of the index-optimized
ends-with query is the need to store the reversed form in the base
table. Disk is very cheap nowadays, so that's not where the real problem
lies; the real problem is that the unreversed and the reversed form of the
text data might fall out of synch with each other.  A smart unicode-savvy
"reverse index" for use with ends-with queries could avoid that particular
pitfall. But it would require a change to the query-parsing logic.

SQLite's unoptimized ends-with queries are fast on desktop PCs if the table
fits in available RAM :  5.5 seconds on the first query when the table is
still disk-resident and 120ms  after the table is ram-resident, this against
 a table with 265K rows.  The common scenario where an index would
significantly improve ends-with performance is a (typically handheld)
computing device with gigabytes of disk but a relatively underpowered CPU
and limited RAM .  Shared-resource web hosting environments are another. The
shared-hosting environment might be used to justify a request for the smart
unicode-savvy approach but not a request for the raw-reverse function,
because one would have ready access to the naive solution via the UDF
mechanism.

Regards
Tim Romano
Swarthmore PA

P.S. A raw-reverse function should probably be called FLIP() so as not to
get bollixed up with any reverse() function in UDF libraries.  FLIP connotes
something inelegant as well :-)
On Fri, Jul 30, 2010 at 1:30 PM, Roger Binns <rog...@rogerbinns.com> wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 07/30/2010 08:00 AM, Tim Romano wrote:
> > Several things might be responsible for there being no "outpouring of
> > support" for a raw reverse function.
>
> Also note that you are proposing it as a very specific workaround for a
> specific scenario which is why you want the codepoints reversed rather than
> being done "correctly" when dealing with combining unicode codepoints.  ie
> your required reverse semantics are not correct for 100% of people who
> would
> want a reverse function :)
>
> - From your further description it is apparent that you don't actually want
> a
> reverse function, but rather a string endswith function that is not O(n).
>
> Another example usage is processing web logs, where for example you want to
> find all accesses to names ending in 'google.com'.
>
> > That the SQL dev team has not responded I take as an indication that this
> > would be very low priority for them,
>
> I don't think the issue was articulated well enough (I think my second
> paragraph above is better :-) and the solution you required is not a SQL
> standard nor fits in with how SQLite development works in general, and you
> rejected the existing means by which SQLite allows people to customise it.
>
> You can of course demand their attention through the mechanisms listed at:
>
>  http://www.hwaci.com/sw/sqlite/prosupport.html
>
> > Anyway, the lack of such a function is only a problem for me vis-a-vis an
> > amalgamation distribution of SQLite.
>
> It is only a problem if you are using an environment where SQLite is hidden
> inside a black box and that environment prevents you from using any of the
> SQLite extension mechanisms.  Given SQLite's very liberal license that
> really is a self inflicted problem.
>
> What I suggest you do is a survey of how other databases do this, and
> provide some timings when using SQLite as it currently stands and with a
> workaround such as using a custom reverse function.  It becomes a far more
> compelling case when you can show query times going from unusable to quick,
> as well as some example usage scenarios.  (Yes this is a fair bit of work,
> but that is exactly what you are requesting of the SQLite team - why should
> they do lots of work if you are not prepared to.)
>
> There are also many other possible solutions than a visible reverse
> function.  For example an index behind the scenes could be created that
> allows searching from either end, or something similar with a collation.
>
> > I may decide to forget about it as a
> > target platform and rewrite the app as a web-service, where I would
> indeed
> > have access to the UDF mechanism.  But there was merit in having the
> > application work in offline mode too.
>
> Are the data sets the same size?  I'd expect offline use to be done with
> smaller data sets and using a server for when they are larger.  Pulling
> numbers out of thin air, maybe using O(n) queries up to a million rows is
> just fine and after that you use the server with performance enhancements?
>
> Roger
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.10 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
>
> iEYEARECAAYFAkxTDDMACgkQmOOfHg372QS4zACgg+X3EhdX0Ue4tz4IC9bUH+Ov
> CioAoN1+zhreXEGCtw0MCh5k7QcH/0Yn
> =zmY5
> -----END PGP SIGNATURE-----
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to