Wednesday, from David Axmark:
> > Your other point about exact vs. approximate answers is unclear, I expect
> > that Google's answers are exact for their currently available indexes at any
> > given time.  But even if they are approximate, I'd be happy with that too.
> > The scoring on a FULLTEXT search in Mysql is "exact" but based on a
> > formula that is approximate anyway.
>
> No, MySQL returns all data according to a search. Web engines return
> what they they find on one search machine. So you can get different
> results with Google every time you hit refresh if you are routed to
> different machines. This had happened to me when I was looking for the
> number of matches and not the result itself.
>
> So we should try to make fulltext searches with a limit between 10 and
> 100 be fast to be closer to google.
>
> I have also head about some other things web search engines do since I
> know some people at FAST but I have forgot that already.

My opinion is, that mySQL itself never should try to find approximate
matches. This is against the definition of SQL itself. SQL is a fourth
generation language. That means, if you say "SELECT", the engine selects.
And it has to be as exactly that, what I have searched, every time, on
every machine in any combination with the same data.

So SQL needs a new language construct to make an approximate search. But
what is an approximate search? How is approximate defined?

I don't think it is a good idea to implement it in this way.
Approximazation must be always done on the application level, cause it is
highly dependend on application, what an approximate result could be.

> We will try to make every feature as good as possible. But we do have
> limited resources.

Exactly. FTS is not so important as other features and people which want
you to include a new feature should think about supporting mysql with
money. :-)

But (yes, we support mysql! :-) I think the need is growing rapidly, cause
the amount of data, that has to be indexed is growing over the years. And
other DB's have much more experices with it. Currently we can live with
the speed. Those who cannot live with it should buy better machines,
think about their SE-concept or support mysql.

Search engines techniques are *not* trivial, so the last way is in my eyes
one of the cheapest.


> Well there is always the option of sponsoring further fulltext
> development. We have a guy who has been working on the GNU fulltext
> engines who is interesting in working with MySQL fulltext. But for the
> moment we can not afford it.

This was my first thought: People write about speed problems and how to
cluster and so on. Things, that I would calculate with weeks and high TCO.

But it maybe much cheaper to pay mySQL for this. How much do you estimate
would it cost to implement inverted files? I think this is difficult,
cause Sergei told me, that he couldn't use mySQL-index files any more.

I just ask, nothing special in sight, but many questions from everyone who
needs it. Cause FTS is a feature which highly improves the value of a web
site. And coustomers have no problem to pay for things they think they get
money for. FTS is such a thing.

But perhaps if we know, under which circumstances FTS is improved, it is
easier for us to find a possible way to share the costs for it or find a
compromise. I also understand, if mySQL don't want to speak about it
here.

I think it is also important for us, how much it can be theoretically
improved. My calculations showed me a theoretical speed up of factor 100
or so. This is ... wow. But in live everything is most times slower...


> So if some of you are interested in sponsoring this (or know about
> others who might be) write to [EMAIL PROTECTED]

Or like this... maybe we find coustomers who needs it. Think it's
possible.

My personal feeling is and my stomach says, that fulltext indexing is a
feature, which needs to be expanded.

-- 

SSilk - Alexander Aulbach - Herbipolis/Frankonia Minoris



---------------------------------------------------------------------
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/           (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Reply via email to