[sqlite] how can i optimize this query

2006-08-24 Thread Cesar David Rodas Maldonado

I have done this project
http://www.phpclasses.org/browse/package/3303.htmlis a fulltext
search. Please help me to optimize this souce, couse i am
planning to move this souce to C. This a fulltext search that supports
SQLite or mysql. Please Help me to do this project better. Thanks to all

On 8/24/06, Cesar David Rodas Maldonado <[EMAIL PROTECTED]> wrote:


I will like to join to your group! what i have to do? how can we exchange
info?

Thanks Pierre Aubert


On 8/24/06, Pierre Aubert < [EMAIL PROTECTED]> wrote:
>
> Hello Cesar,
> I have a lot of idea for ft3 but not much time currently. If you want to
> join the project you are welcome. My next work will be on creating a sqlite
> extension called ft3:) with a module and
> virtual tables.
>
> Pierre
>
> P.S.: don't stress over your english, French is my mother tongue and I
> do a lot of mistakes
> myself. On the other hand, it is necessary to speak in english thus ...
>
> - Message d'origine 
> De : Cesar David Rodas Maldonado <[EMAIL PROTECTED] >
> À : Pierre Aubert <[EMAIL PROTECTED]>
> Envoyé le : Mercredi, 23 Août 2006, 7h25mn 47s
> Objet : Re: [sqlite] Re : [sqlite] how can i optimize this query
>
>
> Hello Pierre Aubert
>
> I saw your project ft3, that was my inspiration, can I ask you more
> about
> your projects? more details?
>
> Thanks Cesar Rodas
>
> On 8/23/06, Pierre Aubert <[EMAIL PROTECTED]> wrote:
> >
> > Hello Cesar,
> > may be you can have a look at ft3 (ft3.sourceforge.net) which does the
> > same thing.
> > Some advices:
> > * partition your table ft_index on wordid
> >I use 1024  tables and use them like that:
> > ft_index[ wordid % 1024 ]
> >   to access them. This reduce the size of the table and thus the size
> of
> > the index.
> > * use a large block size via pragma (this speed up large reads)
> >
> > FT3 ways is to switch to a binary index stored into a blob for each
> word.
> > Something like
> > create table ft_index (
> > wordid int,
> > scores blob
> > );
> >
> > and the blob is an array of (docid, rank, position, ...)
> >
> > This is a lot faster of course but you have to deal with the join
> > yourself.
> >
> > PIerre.
> >
> > P.S.: I will try to use the new virtual table interface to access the
> blob
> > datas.
> >
> > - Message d'origine 
> > De : Cesar David Rodas Maldonado < [EMAIL PROTECTED]>
> > À : sqlite-users@sqlite.org
> > Envoyé le : Mardi, 22 Août 2006, 10h10mn 30s
> > Objet : [sqlite] how can i optimize this query
> >
> > I have the next table with about 10.000.000 of records-
> >
> > CREATE TABLE ft_index (
> >   docid int(11) NOT NULL default '0',
> >   wordid int(11) NOT NULL default '0',
> >   posicion int(11) NOT NULL default '0',
> >   ranking float NOT NULL default '0',
> >   lang int(11) NOT NULL default '0',
> >   KEY docid (docid,wordid,posicion,ranking),
> >   KEY lang (lang)
> > );
> >
> > How can i optimize the next query, couse i need velocity (this is for
> a
> > fulltext search project):
> >
> >
> > select
> > t0.*,
> > t0.ranking + t1.ranking + t2.ranking + t3.ranking + t4.ranking as
> ranking
> > from ft_index as t0
> > inner join ft_index as t1 on (t0.docid = t1.docid)
> > inner join ft_index as t2 on ( t0.docid = t2.docid)
> > inner join ft_index as t3 on (t0.docid = t3.docid)
> > inner join ft_index as t4 on (t0.docid = t4.docid)
> > where (t0.wordid = '18929') AND (t1.wordid = '27283') AND( t2.wordid =
> > '4351' and t2.posicion + 1 = t3.posicion and t3.wordid = '9418' and
> > t3.posicion + 1 = t4.posicion ) group by t0.docid order by ranking;
> >
> > Every inner join is for search a word that i save in another table
> (with
> > the
> > number of words).
> >
> >
> >
> >
> >
> >
>
>



[sqlite] Re : [sqlite] how can i optimize this query

2006-08-23 Thread Pierre Aubert
Hello Cesar,
may be you can have a look at ft3 (ft3.sourceforge.net) which does the same 
thing.
Some advices:
* partition your table ft_index on wordid
   I use 1024  tables and use them like that:
ft_index[ wordid % 1024 ] 
  to access them. This reduce the size of the table and thus the size of the 
index.
* use a large block size via pragma (this speed up large reads)

FT3 ways is to switch to a binary index stored into a blob for each word. 
Something like
create table ft_index (
wordid int,
scores blob
);

and the blob is an array of (docid, rank, position, ...)

This is a lot faster of course but you have to deal with the join yourself.

PIerre.

P.S.: I will try to use the new virtual table interface to access the blob 
datas.

- Message d'origine 
De : Cesar David Rodas Maldonado <[EMAIL PROTECTED]>
À : sqlite-users@sqlite.org
Envoyé le : Mardi, 22 Août 2006, 10h10mn 30s
Objet : [sqlite] how can i optimize this query

I have the next table with about 10.000.000 of records-

CREATE TABLE ft_index (
  docid int(11) NOT NULL default '0',
  wordid int(11) NOT NULL default '0',
  posicion int(11) NOT NULL default '0',
  ranking float NOT NULL default '0',
  lang int(11) NOT NULL default '0',
  KEY docid (docid,wordid,posicion,ranking),
  KEY lang (lang)
);

How can i optimize the next query, couse i need velocity (this is for a
fulltext search project):


select
t0.*,
t0.ranking + t1.ranking + t2.ranking + t3.ranking + t4.ranking as ranking
from ft_index as t0
inner join ft_index as t1 on (t0.docid = t1.docid)
inner join ft_index as t2 on (t0.docid = t2.docid)
inner join ft_index as t3 on (t0.docid = t3.docid)
inner join ft_index as t4 on (t0.docid = t4.docid)
where (t0.wordid = '18929') AND (t1.wordid = '27283') AND( t2.wordid =
'4351' and t2.posicion + 1 = t3.posicion and t3.wordid = '9418' and
t3.posicion + 1 = t4.posicion ) group by t0.docid order by ranking;

Every inner join is for search a word that i save in another table (with the
number of words).






[sqlite] how can i optimize this query

2006-08-22 Thread Cesar David Rodas Maldonado

I have the next table with about 10.000.000 of records-

CREATE TABLE ft_index (
 docid int(11) NOT NULL default '0',
 wordid int(11) NOT NULL default '0',
 posicion int(11) NOT NULL default '0',
 ranking float NOT NULL default '0',
 lang int(11) NOT NULL default '0',
 KEY docid (docid,wordid,posicion,ranking),
 KEY lang (lang)
);

How can i optimize the next query, couse i need velocity (this is for a
fulltext search project):


select
t0.*,
t0.ranking + t1.ranking + t2.ranking + t3.ranking + t4.ranking as ranking
from ft_index as t0
inner join ft_index as t1 on (t0.docid = t1.docid)
inner join ft_index as t2 on (t0.docid = t2.docid)
inner join ft_index as t3 on (t0.docid = t3.docid)
inner join ft_index as t4 on (t0.docid = t4.docid)
where (t0.wordid = '18929') AND (t1.wordid = '27283') AND( t2.wordid =
'4351' and t2.posicion + 1 = t3.posicion and t3.wordid = '9418' and
t3.posicion + 1 = t4.posicion ) group by t0.docid order by ranking;

Every inner join is for search a word that i save in another table (with the
number of words).