Hi,

this is the exact table definition:

create table wordlinks (
    id integer primary key,
    document integer,
    docword integer,
    count integer
);

In fact I don't need the primary key. What I need is an index on docword and an index on count.
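
For reference, here is a minimal sketch of the two indexes in question, written against Python's bundled sqlite3 module. The index names are made up for illustration, and the module links against SQLite 3.x rather than the 2.x series discussed in this thread, so it shows the statements only, not the slow behaviour.

```python
import sqlite3

# In-memory database so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE wordlinks (
        id       INTEGER PRIMARY KEY,
        document INTEGER,
        docword  INTEGER,
        count    INTEGER
    )
    """
)

# Index names are hypothetical; the original message does not name them.
conn.execute("CREATE INDEX wordlinks_docword_idx ON wordlinks (docword)")
conn.execute("CREATE INDEX wordlinks_count_idx ON wordlinks (count)")

# List the explicit indexes that now exist on the table.
index_names = sorted(
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'index' AND tbl_name = 'wordlinks'"
    )
)
print(index_names)
```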

For tables of about 2,000,000 records it takes about 10 to 20 minutes to add those
indexes. The index on 'docword' is the slowest; the one on 'count' is relatively fast.

Each document will contain about 50-400 words (roughly 250 on average).
docword can be considered an almost 'random' number, since the documents will not be
alike. This may be part of the problem: docword may take tens of thousands of unique
values per table of 2,000,000 rows. count will be a number of around 1-50.

So I am indexing a large number of documents here. The most common words will probably
have an index < 20000 or so, but the number of unique words can be quite high because of
the different languages involved.
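
To reproduce the timing without my data, something like the following rough sketch could be used. It fills the table with synthetic rows shaped like the description above and times each index separately. The vocabulary size of 50,000, the uniform distribution of docword, the helper names and the scaled-down row count are all assumptions for illustration; real word frequencies are far more skewed, and Python's sqlite3 module uses SQLite 3.x, so the numbers will not match what I see.

```python
import random
import sqlite3
import time


def build_sample(conn, n_rows, words_per_doc=250, vocab=50_000, seed=0):
    """Fill wordlinks with synthetic rows (assumed, uniform distribution)."""
    rng = random.Random(seed)
    conn.execute(
        "CREATE TABLE wordlinks ("
        "id INTEGER PRIMARY KEY, document INTEGER, "
        "docword INTEGER, count INTEGER)"
    )
    rows = (
        (i // words_per_doc, rng.randrange(1, vocab + 1), rng.randint(1, 50))
        for i in range(n_rows)
    )
    with conn:  # one transaction for the whole bulk insert
        conn.executemany(
            "INSERT INTO wordlinks (document, docword, count) VALUES (?, ?, ?)",
            rows,
        )


def time_index(conn, column):
    """Return the seconds taken to build an index on one column."""
    start = time.perf_counter()
    conn.execute(f"CREATE INDEX idx_{column} ON wordlinks ({column})")
    conn.commit()
    return time.perf_counter() - start


conn = sqlite3.connect(":memory:")
N_ROWS = 200_000  # scaled down from ~2,000,000 to keep the sketch quick
build_sample(conn, N_ROWS)

timings = {col: time_index(conn, col) for col in ("docword", "count")}
for col, secs in timings.items():
    print(f"index on {col}: {secs:.3f} s")
```

Pointing sqlite3.connect at a file instead of ":memory:" and raising N_ROWS gets closer to the real situation, since disk I/O is probably a large part of the cost.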

A piece of the table is listed below.

regards, and thanks for your help,

rene

sqlite> select * from wordlinks where id>16250 limit 100;
16251|93|7487|1|
16252|93|6606|1|
16253|93|7491|1|
16254|93|7492|4|
16255|93|7495|1|
16256|93|7497|1|
16257|93|7499|2|
16258|93|7500|1|
16259|93|914|7|
16260|93|1874|17|
16261|93|1364|2|
16262|93|7505|8|
16263|93|1704|4|
16264|93|918|3|
16265|93|921|3|
16266|93|922|1|
16267|93|923|19|
16268|93|7514|1|
16269|93|1883|54|
16270|93|7518|2|
16271|93|7521|4|
16272|93|4894|1|
16273|93|7526|5|
16274|93|924|163|
16275|93|7528|1|
16276|93|1373|3|
16277|93|7531|1|
16278|93|927|20|
16279|93|1706|6|
16280|93|7534|5|
16281|93|7535|4|
16282|93|7538|1|
16283|93|935|16|
16284|93|7540|1|
16285|93|71|178|
16286|93|7544|2|
16287|93|4615|64|
16288|93|7546|22|
16289|93|943|5|
16290|93|946|3|
16291|93|372|5|
16292|93|949|1|
16293|93|951|7|
16294|93|7554|1|
16295|93|1385|8|
16296|93|955|2|
16297|93|7557|5|
16298|93|72|134|
16299|93|3008|16|
16300|93|3011|3|
16301|93|7560|1|
16302|174|74|17|
16303|174|5245|1|
16304|174|5252|5|
16305|174|808|4|
16306|174|5265|1|
16307|174|864|1|
16308|174|5277|2|
16309|174|5281|1|
16310|174|5287|1|
16311|174|5292|1|
16312|174|5299|1|
16313|174|5304|1|
16314|174|1041|1|
16315|174|5316|2|
16316|174|5322|8|
16317|174|5327|6|
16318|174|5333|1|
16319|174|5337|3|
16320|174|5343|1|
16321|174|5347|1|
16322|174|5353|1|
16323|174|5358|1|
16324|174|5364|4|
16325|174|5370|1|
16326|174|5376|2|
16327|174|5382|1|
16328|174|5388|1|
16329|174|5395|1|
16330|174|5402|2|
16331|174|5409|1|
16332|174|5416|1|
16333|174|5421|2|
16334|174|5430|1|
16335|174|5437|1|
16336|174|820|6|
16337|174|103|2|
16338|174|5456|1|
16339|174|1|6|
16340|174|1919|2|
16341|174|5474|1|
16342|174|5479|2|
16343|174|5486|1|
16344|174|5493|1|
16345|174|1190|1|
16346|174|5503|1|
16347|174|5509|1|
16348|174|5518|3|
16349|174|2047|5|
16350|174|2155|1|


On Monday, May 24, 2004 16:18:08, "D. Richard Hipp" <[EMAIL PROTECTED]> wrote:
> rene wrote:
> >
> > My questions are:
> > * Is this a bug?
> > * Why does indexing take that long?
> > * Will it be fixed in 3.0?
> > * Should I consider using embedded MySQL instead? (I have not tested whether
> >   it performs better.)
> > * Would it help if I created a separate database for this particular table?
> >   (About 30% of the space is used by other tables.)
> >
>
> This is a performance issue that should be looked into.
> You can help by providing us with the schema that you
> are using and several dozen rows of sample data.
>
> --
> D. Richard Hipp -- [EMAIL PROTECTED] -- 704.948.4565
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
