Hi Clint,

If you are using Lucene, you need to use the Lucene query syntax for
your searches.  Here is the online guide:

http://lucene.apache.org/java/2_9_3/queryparsersyntax.html
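
For the cross-table AND case in your message, query syntax alone may not
be enough, since you describe each index as covering a single table. One
possible workaround is to search each term separately, map TAG hits to
their stories through STORY_X_TAG, and intersect the story ids. Below is a
minimal Python sketch of that idea only. The function name, the `search`
callable (a stand-in for one FTL_SEARCH_DATA call per term) and all the
sample ids are hypothetical, and it handles only a flat AND of plain terms.

```python
def stories_matching_all(terms, search, story_ids_for_tag):
    """Return story ids that match every term, via the story itself or a tag.

    search(term) is assumed to yield (table, key) pairs, similar in spirit
    to the TABLE and KEYS columns of a full-text search result.
    story_ids_for_tag is assumed to map TAG_ID -> set of STORY_IDs, i.e.
    the content of a link table such as STORY_X_TAG.
    """
    result = None
    for term in terms:
        matched = set()
        for table, key in search(term):
            if table == "STORY":
                matched.add(key)
            elif table == "TAG":
                # Resolve a tag hit to every story carrying that tag.
                matched.update(story_ids_for_tag.get(key, ()))
        # AND semantics: keep only stories matched by every term so far.
        result = matched if result is None else result & matched
    return result or set()


# Made-up sample data, for illustration only.
HITS = {
    "mdt": [("TAG", 20), ("STORY", 1)],
    "mm": [("TAG", 10), ("STORY", 2)],
}
STORY_X_TAG = {10: {1, 3}, 20: {2, 4}}


def fake_search(term):
    # Hypothetical stand-in for a per-term full-text search.
    return HITS.get(term.lower(), [])


both = stories_matching_all(["MDT", "mm"], fake_search, STORY_X_TAG)
```

This assumes the application already has the terms as a list, so it does
not remove the need to parse free-form queries that mix AND, OR and
phrases.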

Cheers,

Matt Bishop

On Jan 2, 4:20 am, Clint Hyde <clinthyd...@gmail.com> wrote:
> this is version 1.2.140, with Lucene 2.9.3
> Here's the schema description: domain is stories in magazines. I have approx 
> 150 thousand stories entered.
> The tables are STORY, AUTHOR, PHOTOGRAPHER, TAG.
> They are linked via foreign keys in STORY_X_AUTHOR, STORY_X_PHOTOGRAPHER, 
> STORY_X_TAG. This is because a story can have multiple authors, authors have 
> multiple stories, etc.
> Lucene indexes exist for all the first four tables. I use FTL_SEARCH_DATA for 
> word-based queries.
> In many cases, this is all ok.
> But in some, it is not.
> Here's a typical result:
> select * from magazine_index.ftl_search_Data('MDT OR MM', 0,0);
>
> SCHEMA          TABLE  COLUMNS     KEYS     SCORE
> MAGAZINE_INDEX  TAG    (TAG_ID)    (172)    1.0
> MAGAZINE_INDEX  TAG    (TAG_ID)    (11171)  0.8492550849914551
> MAGAZINE_INDEX  STORY  (STORY_ID)  (35977)  0.8492550849914551
> MAGAZINE_INDEX  STORY  (STORY_ID)  (35978)  0.8492550849914551
> MAGAZINE_INDEX  STORY  (STORY_ID)  (62788)  0.8492550849914551
> MAGAZINE_INDEX  STORY  (STORY_ID)  (38561)  0.625
> MAGAZINE_INDEX  TAG    (TAG_ID)    (9415)   0.6187184453010559
> MAGAZINE_INDEX  TAG    (TAG_ID)    (29177)  0.5307844281196594
>
> Note that I used an OR query there, MDT OR MM. Either word could be in 
> either TAG or STORY; there are NO cases with both. This works fine. 172 is 
> the TAG_ID for MM and 11171 is the TAG_ID for MDT; the other TAG_IDs are 
> for tags whose names have multiple words, including mm or mdt.
> But if I use AND:
> select * from magazine_index.ftl_search_Data('MDT AND mm', 0,0);
>
> SCHEMA  TABLE  COLUMNS  KEYS  SCORE
> (no rows, 11 ms)
>
> You can see that I get nothing. This is bad, and I will get (already have) 
> unhappy users. The problem is that the indexing isn't quite what I thought 
> it would be. Things work fine as long as the query matches within a single 
> table; matching across tables doesn't work.
> What is going on: if the AND'd words are in the same story title, the search 
> works ok. But when I want to AND a title word and a tag word, that always 
> fails, because the indexes Lucene makes don't go across tables. In fact, that 
> may not even mean anything to Lucene.
> The problem is that Lucene only indexes a single table against itself, and so 
> Lucene's searches can't do what I need. In some abstract sense this may have 
> an algorithmic solution, but I expect it is exponential. In effect it forces 
> me to do my own outside join, because the search results don't feed directly 
> into another query.
> Recognizing when to do my own join requires me to parse the query, which I 
> really don't want to do. If I have to do that, I might as well create my own 
> indexing engine for this specific problem, which is also not what I want to do.
> Suggestions anyone? (restructuring the data back into a flat file is really 
> not what I want to do, although that would solve this problem)
>  -- clint

