Re: simple search api (was Re: mimetype standardisation by testsets)

Mikkel Kamstrup Erlandsen Mon, 27 Nov 2006 08:44:08 -0800

2006/11/27, Kevin Krammer <[EMAIL PROTECTED]>:


On Monday 27 November 2006 12:08, Mikkel Kamstrup Erlandsen wrote:

I am not a searching or indexing expert, merely wanted to input some
information regarding D-Bus sync/async calls :)

> I think you raise a really good question, Kevin. Let me first introduce
> some terminology to ease the communication.
>
> Page Query: All results for a given query are returned in one chunk. This
> call is still *async* since it is over dbus. This is how it is suggested
> on the WasabiDraft wiki page.
>
> Async Query: Query results trickle in as the search engine picks them up.
> I.e. all query results are not returned in one batch.

I'd rather call it "Full" and "Partial" Query or Query with "Full"
or "Partial" delivery.




I was not trying to establish a convention, I just needed some words for
it.  For what it's worth I think Full- and Partial Delivery are the best
terms. However, for method names I actually think my names make more sense.

> In the page query the client can simulate an async query by requesting
> several blocking queries with the same query string, but different
> page-ranges. This gives a small problem with page ordering, but nothing
> that the client app could not work around. The big benefit for page queries
> is that server side sorting (score, relevance, date, whatever) is a
> no-brainer for the client. Just append the "sort:<sorttype>" switch to the
> query string.

How long does a search service have to cache such a query - result
combination?



That's up to the implementation.

Or is searching so fast, that the same query can be re-done on every call?


Again, some backends will have native caching capabilities, others won't. I
think we should focus on keeping the interface easy to use for application
developers, and leave the headaches to the search engine devs... Sorry guys
:-)
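
As a rough illustration of the paged approach discussed above, here is a
minimal Python sketch. It does not talk to D-Bus at all; page_query, its
offset/count parameters and the "sort:name" sort type are hypothetical
stand-ins, not part of the WasabiDraft proposal. It only shows how a client
could walk consecutive page ranges while the server handles the sorting.

```python
from typing import Iterator, List

def page_query(index: List[str], query: str, offset: int, count: int) -> List[str]:
    """Hypothetical stand-in for the proposed Query method.

    Returns one page of matching URIs. The whole query is re-run on every
    call, i.e. this toy backend does no caching at all.
    """
    words = query.split()
    terms = [w for w in words if not w.startswith("sort:")]
    hits = [uri for uri in index if all(t in uri for t in terms)]
    if "sort:name" in words:  # assumed sort type, for illustration only
        hits.sort()
    return hits[offset:offset + count]

def paged_results(index: List[str], query: str, page_size: int) -> Iterator[List[str]]:
    """Simulate trickling results by requesting consecutive page ranges."""
    offset = 0
    while True:
        page = page_query(index, query, offset, page_size)
        if not page:
            break
        yield page
        offset += page_size

index = [
    "file:///c/report.txt",
    "file:///a/report.odt",
    "file:///b/notes.txt",
]
pages = list(paged_results(index, "report sort:name", 1))
```

Because this toy backend repeats the full search for each page, it also
shows why the caching question matters: without a cache, the cost of a
query is paid once per page.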

> In the async query you have a sorting problem. The client cannot sort the
> hits, unless each returned URI also has metadata associated with it (it
> looks this stuff up with another dbus call). I see a huge benefit in
> allowing the results to trickle in (and allows for canceling queries as
> Kevin points out). The async query is also much more suitable for live
> queries (in the sense of updating the query when the on-disk files change -
> or are deleted/created).

Would it be possible to associate a sorting key with each match?
If so it could be part of the returned data, i.e. the result being an
array of tuples of URI and key.




I don't know if this would make sense actually... How would the backend know
what the final sort order would be if it hasn't collected all hits? - I'm
not ruling it out, I'm just not able to see how it would work out...
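
One way it might work, sketched in Python under a strong assumption: the
key is an absolute per-hit value (a score or a modification date, say) that
does not depend on the other hits. In that case the backend never needs to
know the final order; the client simply inserts each arriving (URI, key)
tuple at the right position. The merge_batch helper and the sample data are
made up for illustration. The approach would not help if the key is
relative to the complete result set.

```python
import bisect
from typing import List, Tuple

Hit = Tuple[float, str]  # (sort key, URI), stored key-first so tuples sort by key

def merge_batch(sorted_hits: List[Hit], batch: List[Tuple[str, float]]) -> None:
    """Insert a batch of (URI, key) tuples, keeping sorted_hits ordered by key."""
    for uri, key in batch:
        bisect.insort(sorted_hits, (key, uri))

results: List[Hit] = []
# Two partial deliveries arriving at different times.
merge_batch(results, [("file:///a", 0.4), ("file:///b", 0.9)])
merge_batch(results, [("file:///c", 0.7)])

# Highest key first, e.g. best score at the top of the result list.
ordered_uris = [uri for _key, uri in reversed(results)]
```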

> So what do I think? I see 2 options:
>
> 1) Change the Query method name to PageQuery and add another AsyncQuery
> with a signature and behavior we need to think a bit about.
>
> 2) Don't change the org.freedesktop.search.simple interface, but create
> another interface generally aimed at live queries - or maybe include this
> in the "advanced" search interface when we get to defining that.

A more advanced interface could be based on query objects, i.e. the client
requests a remote peer object for a specific query and the service creates
a handler object and returns the object path.



Yeah, that could be an idea. This would not be a good idea for apps spawning
tons of searches though. And I actually think we should pay close attention
to catering for massive search requests. I can easily picture a future where
some client or other runs a bunch of searches in the background to show
information relevant to your current context... (just one example).

The client can then call this object's methods and listen to this object's
signals, without needing to reference it with the query string at each call
or on each signal. The object path will be the reference.



Again, I like the idea, but I see some problems with it (as mentioned
above). Maybe it should rather be a server side client proxy or something
(that sounds like an oxymoron :-)), where the remote object does not
represent a query, but rather a dedicated connection. I know that this
is possible with dbus, but I have never played around with it...
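
To make the query-object idea concrete, here is a small in-process Python
sketch. It uses no real D-Bus bindings; SearchService, QueryObject, the
method names and the /org/freedesktop/search/query/N path layout are all
hypothetical. It only models the shape of the interaction: the service
hands out an object path, and the client uses that path, rather than the
query string, to listen for hits and to cancel.

```python
import itertools
from typing import Callable, Dict, List

class QueryObject:
    """Hypothetical per-query handler; its object path is the only reference."""

    def __init__(self, path: str, query: str) -> None:
        self.path = path
        self.query = query
        self.cancelled = False
        self._listeners: List[Callable[[List[str]], None]] = []

    def connect_hits_added(self, callback: Callable[[List[str]], None]) -> None:
        """Register a listener, standing in for a D-Bus signal subscription."""
        self._listeners.append(callback)

    def emit_hits(self, uris: List[str]) -> None:
        """Deliver a partial batch of hits unless the query was cancelled."""
        if self.cancelled:
            return
        for callback in self._listeners:
            callback(uris)

    def cancel(self) -> None:
        self.cancelled = True

class SearchService:
    """Hypothetical service that creates one handler object per query."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self.objects: Dict[str, QueryObject] = {}

    def new_query(self, query: str) -> str:
        # Assumed path layout, chosen only for this example.
        path = "/org/freedesktop/search/query/%d" % next(self._ids)
        self.objects[path] = QueryObject(path, query)
        return path

service = SearchService()
path = service.new_query("report")
received: List[str] = []
service.objects[path].connect_hits_added(received.extend)

service.objects[path].emit_hits(["file:///a/report.odt"])
service.objects[path].cancel()
service.objects[path].emit_hits(["file:///c/report.txt"])  # dropped after cancel
```

Each query costs one server-side object here, which is the concern about
apps that spawn very many searches.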

Cheers,
Mikkel

_______________________________________________
xdg mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/xdg
