On 02/22/2016 10:27 AM, Ephraim Stevens wrote:
> Greetings All,
>
> I use the Full Text Search facilities extensively. The feature is intuitive
> and works beautifully and super fast even with 100GB+ databases. I must
> applaud the SQLITE dev team on such amazing work.
>
> I have two questions with regards to FTS.
>
> 1) I'm going out on a limb here:
>
> As some of you may know, there is a undocumented, and not officially
> supported feature/bug where you can specify your own delimiters using the
> "simple" tokenizer:
>
>
> http://article.gmane.org/gmane.comp.db.sqlite.general/74199
>
>
>
> for example to create an table where tokens are split by an exclamation
> point and carat, one would execute:
>
> create virtual table TEST using FTS4(title, body, tokenize=simple "" "!^");
>
> Note the first argument is ignored, the second argument is where the
> delimiters should be specified in quotes.
>
> This is very, very convenient and I've used it extensively without issue (I
> hope the feature never gets removed).
>
> The question here is, how can one specify non printable characters within
> the sqlite3 command shell. Often, delimiters are non-printable characters
> such as NUL (0x00) or SOH (0x01) or even the newline character.  If I
> wanted to specify either one of those in my delimiter string parameter
> above, how could it successfully be done?
>
>

0x00 is likely impossible, but for other ASCII range characters you can 
probably just insert the value into the SQL string, printable or not.


>
>
>
> 2)  The colon ':'  is used in the FTS engine as a column specifier:
>
> select * from FTS_TABLE where COL:foo MATCH "test123".   This means show me
> any rows that have test123 in the COL column only (and not any other
> column).
>
> Is there a way to adjust/change this column delimiter during table setup or
> is this a hard-coded value?  I ask because sometimes, I need to be able to
> perform searches with the colon character as part of a string.  For
> example, when searching on NETWORK MAC addresses. At the same time, I don't
> want to lose the ability to utilize column specifiers.

I think quoted strings are passed to the tokenizer without 
interpretation. So if your tokenizer considers ":" to be a token 
character, you might be able to do:

   ... MATCH '"01:23:45:67:89:AB"' ...

or similar.

Dan.



Reply via email to