Interesting point.  This seems like the kind of thing that could be
implemented in the existing fts codebase without involving a version
change.  It may also be more general than just hyphenated words.  For
instance, $12.50 might be more usefully translated as the phrase search
"12 50" than as a search for all documents which contain 12 and 50
anywhere.  I bet five minutes of thought would turn up 25 other
examples, just in English.
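
To make that concrete, here is a rough sketch of the rewriting step,
done as a preprocessing pass outside of fts rather than in the fts
code itself.  The function name and the (excluded, words) output shape
are made up for illustration, and it deliberately simplifies: any run
of non-alphanumeric characters inside a chunk is treated as a joiner,
not just a single minus sign, and only ASCII letters and digits count
as word characters.

```python
import re


def rewrite_query(query):
    """Split a user query into (excluded, words) pairs.

    Hypothetical helper, not part of fts.  A chunk is excluded only
    when its minus sign is the first character after whitespace (or at
    the start of the query).  A chunk that yields more than one word,
    such as low-budget or $12.50, is meant to be searched as a phrase.
    """
    terms = []
    for chunk in query.split():
        excluded = chunk.startswith("-")
        # Interior punctuation (hyphens, dots, currency signs) only
        # separates the words of a phrase; it never negates anything.
        words = re.findall(r"[A-Za-z0-9]+", chunk)
        if words:
            terms.append((excluded, words))
    return terms
```

With this, "low -budget" comes out as low (kept) plus budget
(excluded), while "low-budget" and "$12.50" each come out as a single
two-word phrase.  How each pair is then rendered into an actual MATCH
expression is left open here.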

Fair warning, though: it's not entirely clear that the fts search
syntax should hew too closely to consumer-oriented search syntax.
It's in a strange place: most people would think it a poor idea
(indeed, dangerous!) to put user-entered expressions in their WHERE
clauses.  In other search systems you might have an API for
constructing query trees which you would pass in, but that may be
weird to express via SQL.  I'm not sure where this rather muddy point
leaves us, because it's probably not relevant to this specific case
:-).

Caveat for the above: I've spent all of five minutes thinking about
your posting, and I was interrupted in the middle.  But I'll try to
factor it into future thinking.

Thanks,
scott



On Wed, Apr 30, 2008 at 8:58 AM, Ralf Junker <[EMAIL PROTECTED]> wrote:
> Hello,
>
> I have a small concern about the FTS negative term search syntax. Currently, 
> all terms following any minus sign ("-") are excluded from the search. This 
> is a very welcome feature, but consider searching for these hyphenated words:
>
>  Coca-Cola       -> FTS finds Coca, but never Cola
>  low-budget      -> FTS finds low, but never budget
>  twelve-year-old -> FTS finds twelve, but never year and never old
>  part-time       -> FTS finds part, but never time
>  full-time       -> FTS finds full, but never time
>
> These results do not match what most users will expect. Well, one can ask 
> them to leave out the minus sign, but users will habitually leave it in 
> because they learned from major search engines that it is the intended 
> behavior. Consider Google, which explicitly states:
>
> "Note: when you include a negative term in your search, be sure to include a 
> space before the minus sign."
>
> Source: 
> http://www.google.com/support/bin/static.py?page=searchguides.html&ctx=basics
>
> Therefore I would like to consider adding these search syntax rules:
>
> 1. A minus sign excludes a search term only when located at the beginning of 
> the search query or after a white space (space, tab, etc.):
>
>  "low-budget"  -> Find both low and budget.
>  "low -budget" -> Find low, but not budget.
>  "-low budget" -> Do not find low, but find budget.
>  "-low-budget" -> Do not find the "low budget" phrase.
>
> 2. In case the minus sign is a term separator and two or more search terms 
> are separated by single minus signs only, they constitute a phrase search:
>
>  "twelve-year-old" -> "twelve year old" (phrase search)
>  "part-time"       -> "part time" (phrase search)
>
> I believe that these changes would make the FTS search syntax more intuitive 
> to use and more conformant to major search engines.
>
> Would there be a chance that they could be implemented in current FTS3 and/or 
> the upcoming FTS4?
>
> Any thoughts?
>
> Ralf
>
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>