Can't speak to tsearch2 in specific but I have learned to be very cautious -- caching does indeed make a noticible difference on this sort of thing, especially if you have enough RAM to hold a significant amount of the data. Either keep changing the query target or do something violent to wipe the cache(s).
Greg Williamson DBA GlobeXplorer LLC -----Original Message----- From: Kris Kiger [mailto:[EMAIL PROTECTED] Sent: Friday, September 24, 2004 2:59 PM To: [EMAIL PROTECTED] Subject: Re: [ADMIN] tsearch2 poor performance Here is the explain analyze output, funny thing, after I ran josh's query, mine ran a lot faster....maybe it forced a caching?; search_test=# explain analyze select count(*) from product where vector @@ to_tsquery('oil'); Aggregate (cost=6113.09..6113.09 rows=1 width=0) (actual time=19643.372..19643.376 rows=1 loops=1) -> Index Scan using vector_idx on product (cost=0.00..6105.58 rows=3001 width=0) (actual time=0.381..18145.917 rows=226357 loops=1) Index Cond: (vector @@ '\'oil\''::tsquery) Filter: (vector @@ '\'oil\''::tsquery) Total runtime: 19643.597 ms search_test=# explain analyze select count(*) from product where vector @@ to_tsquery('hydrogen'); Aggregate (cost=6113.09..6113.09 rows=1 width=0) (actual time=19629.766..19629.769 rows=1 loops=1) -> Index Scan using vector_idx on product (cost=0.00..6105.58 rows=3001 width=0) (actual time=0.378..18127.573 rows=226868 loops=1) Index Cond: (vector @@ '\'hydrogen\''::tsquery) Filter: (vector @@ '\'hydrogen\''::tsquery) Total runtime: 19629.992 ms Here is Josh's; search_test=# explain analyze SELECT count(q) FROM product, to_tsquery('oil') AS q WHERE vector @@ q; Aggregate (cost=6150597.03..6150597.03 rows=1 width=32) (actual time=21769.526..21769.530 rows=1 loops=1) -> Nested Loop (cost=0.00..6143097.02 rows=3000001 width=32) (actual time=0.424..20450.208 rows=226357 loops=1) -> Function Scan on q (cost=0.00..12.50 rows=1000 width=32) (actual time=0.023..0.031 rows=1 loops=1) -> Index Scan using vector_idx on product (cost=0.00..6105.58 rows=3000 width=32) (actual time=0.376..18165.415 rows=226357 loops=1) Index Cond: (product.vector @@ "outer".q) Filter: (product.vector @@ "outer".q) Total runtime: 21769.786 ms Disabling Index usage slowed it down: search_test=# explain analyze select count(*) from product where vector @@ to_tsquery('hydrogen'); Aggregate (cost=347259.51..347259.51 rows=1 width=0) (actual time=24675.933..24675.936 rows=1 loops=1) -> Seq Scan on product (cost=0.00..347252.00 rows=3001 width=0) (actual time=0.320..23164.492 rows=226868 loops=1) Filter: (vector @@ '\'hydrogen\''::tsquery) Total runtime: 24676.091 ms Time: 24678.842 ms search_test=# explain analyze SELECT count(q) FROM product, to_tsquery('oil') AS q WHERE vector @@ q; Aggregate (cost=67847264.50..67847264.50 rows=1 width=32) (actual time=83631.201..83631.204 rows=1 loops=1) -> Nested Loop (cost=12.50..67839764.50 rows=3000001 width=32) (actual time=0.214..82294.710 rows=226357 loops=1) Join Filter: ("outer".vector @@ "inner".q) -> Seq Scan on product (cost=0.00..339752.00 rows=3000000 width=32) (actual time=0.107..27563.952 rows=3000000 loops=1) -> Materialize (cost=12.50..22.50 rows=1000 width=32) (actual time=0.003..0.006 rows=1 loops=3000000) -> Function Scan on q (cost=0.00..12.50 rows=1000 width=32) (actual time=0.019..0.023 rows=1 loops=1) Total runtime: 83631.385 ms Here are the results of stat: search_test=# select * from stat('select vector from product') order by ndoc desc, nentry; word | ndoc | nentry --------------+---------+--------- anoth | 1187386 | 1477442 bear | 696668 | 780963 take | 675319 | 736410 relat | 491469 | 528259 toward | 490653 | 528369 defin | 490572 | 527099 live | 490538 | 527401 beyond | 490124 | 527957 behind | 490087 | 527735 insid | 489530 | 527074 near | 489504 | 527721 around | 489244 | 526870 mean | 478201 | 512699 complex | 440339 | 468669 light | 438685 | 468140 ball | 438567 | 468168 pit | 438293 | 467807 dress | 438128 | 467260 player | 437633 | 466753 secret | 433279 | 457246 love | 423777 | 442694 give | 423691 | 441305 need | 423336 | 434409 peopl | 423336 | 434409 believ | 423336 | 434409 rememb | 423336 | 434409 howev | 421762 | 434194 real | 419906 | 435074 furthermor | 416672 | 434413 indic | 416508 | 434919 exampl | 416508 | 434919 alway | 415543 | 432861 sometim | 415543 | 432861 see | 410706 | 434586 inde | 408379 | 434283 fruit | 363203 | 381862 cook | 362674 | 381112 graduat | 362444 | 381284 chees | 362358 | 381040 hesit | 307431 | 317550 self | 301001 | 312312 hard | 300138 | 310167 spirit | 299310 | 312092 know | 298246 | 309010 laugh | 294136 | 302392 make | 287633 | 295003 find | 287550 | 294770 goe | 279336 | 287025 team | 228000 | 234703 footbal | 228000 | 234703 void | 227914 | 234681 formless | 227914 | 234681 board | 227907 | 234797 chess | 227907 | 234797 submarin | 227869 | 234727 inferior | 227858 | 234357 viper | 227855 | 234865 cylind | 227847 | 234505 suit | 227822 | 234376 class | 227822 | 234376 action | 227822 | 234376 diskett | 227802 | 234786 roller | 227792 | 234524 coaster | 227792 | 234524 mate | 227785 | 234431 ritual | 227785 | 234431 engin | 227784 | 234575 steam | 227784 | 234575 industri | 227780 | 234312 fire | 227775 | 234532 hydrant | 227775 | 234532 briar | 227769 | 234524 patch | 227769 | 234524 mastadon | 227677 | 234665 defend | 227617 | 234410 blade | 227603 | 234356 razor | 227603 | 234356 cab | 227578 | 234554 driver | 227578 | 234554 cough | 227570 | 234324 syrup | 227570 | 234324 cowboy | 227566 | 234663 chop | 227564 | 234437 pork | 227564 | 234437 ceo | 227557 | 234760 rattlesnak | 227554 | 234323 hell | 227540 | 234313 flavor | 227540 | 234313 maelstrom | 227537 | 234404 mulch | 227531 | 234311 cyprus | 227531 | 234311 tack | 227525 | 234462 carpet | 227525 | 234462 movi | 227505 | 234207 theater | 227505 | 234207 spider | 227466 | 234524 cone | 227463 | 234198 pine | 227463 | 234198 garbag | 227459 | 234207 beer | 227443 | 234077 bottl | 227443 | 234077 polygon | 227438 | 234267 judg | 227425 | 234565 blith | 227409 | 233979 traffic | 227403 | 234051 paper | 227397 | 234028 napkin | 227397 | 234028 apart | 227393 | 233911 build | 227393 | 233911 cocker | 227368 | 233926 spaniel | 227368 | 233926 bay | 227358 | 234261 cargo | 227358 | 234261 order | 227357 | 233885 short | 227357 | 233885 polar | 227326 | 234118 demon | 227324 | 234442 minivan | 227317 | 234292 bulb | 227314 | 234089 fundrais | 227308 | 234235 eggplant | 227306 | 234202 cake | 227299 | 234075 bowl | 227299 | 234110 paycheck | 227295 | 234224 sheriff | 227292 | 234313 turkey | 227271 | 234267 turn | 227265 | 234210 signal | 227265 | 234210 chestnut | 227250 | 234104 hole | 227239 | 233975 puncher | 227239 | 233975 tabloid | 227238 | 234341 microscop | 227236 | 234067 reclin | 227234 | 233946 dolphin | 227231 | 234080 pen | 227222 | 234269 pig | 227222 | 234269 wed | 227221 | 233860 bullfrog | 227211 | 234144 truck | 227208 | 233980 pickup | 227208 | 233980 agent | 227201 | 233840 insur | 227201 | 233840 girl | 227201 | 233934 scout | 227201 | 233934 drill | 227200 | 233986 power | 227200 | 233986 ocean | 227187 | 234211 case | 227173 | 233983 crank | 227173 | 233983 squid | 227169 | 234056 senat | 227167 | 234147 fraction | 227161 | 234065 custom | 227152 | 234128 burglar | 227148 | 234014 grizzli | 227133 | 233955 wheel | 227122 | 233813 asteroid | 227108 | 233928 anomali | 227106 | 234156 acceler | 227103 | 233428 particl | 227103 | 233428 saw | 227082 | 233934 chain | 227082 | 233934 reactor | 227035 | 234061 wedg | 227033 | 234143 photon | 227029 | 234017 deficit | 227029 | 234102 vacuum | 227021 | 233760 cleaner | 227021 | 233760 cashier | 227010 | 233858 scyth | 227001 | 233928 cloud | 226981 | 233569 format | 226981 | 233569 tornado | 226968 | 234058 grand | 226936 | 233730 piano | 226936 | 233730 tripod | 226930 | 233755 tomato | 226928 | 233915 sandwich | 226923 | 233786 earring | 226912 | 233665 train | 226912 | 233712 freight | 226912 | 233712 skyscrap | 226901 | 233755 abstract | 226890 | 233658 mortician | 226883 | 233781 warranti | 226876 | 233935 atom | 226868 | 233467 hydrogen | 226868 | 233467 satellit | 226866 | 233680 corpor | 226858 | 233818 globul | 226853 | 233980 cow | 226832 | 233808 jersey | 226832 | 233808 salad | 226830 | 233400 buzzard | 226804 | 233825 lot | 226794 | 233643 park | 226794 | 233643 prime | 226793 | 233325 minist | 226793 | 233325 clot | 226780 | 233380 blood | 226780 | 233380 tuba | 226765 | 233575 tape | 226749 | 233388 record | 226749 | 233388 line | 226747 | 233574 dancer | 226747 | 233574 nation | 226736 | 233796 bartend | 226653 | 233422 hockey | 226645 | 233178 canyon | 226617 | 233699 ski | 226610 | 233451 lodg | 226610 | 233451 stovepip | 226608 | 233489 crane | 226590 | 233578 sand | 226572 | 233270 grain | 226572 | 233270 dust | 226570 | 233391 bunni | 226570 | 233391 lover | 226564 | 233628 fairi | 226554 | 233743 plaintiff | 226537 | 233563 wheelbarrow | 226520 | 233206 food | 226445 | 233228 stamp | 226445 | 233228 umbrella | 226380 | 233273 avocado | 226375 | 232942 oil | 226357 | 233266 filter | 226357 | 233266 financi | 220105 | 225116 complet | 162829 | 164065 ridicul | 162346 | 163592 handl | 162200 | 163390 singl | 162200 | 163390 single-handl | 162200 | 163390 greedili | 162123 | 163379 careless | 162009 | 163193 somewhat | 161979 | 163205 accur | 161975 | 163228 overwhelm | 161946 | 163107 usual | 161930 | 163158 ostens | 161826 | 163020 lazili | 161809 | 163133 slyli | 161803 | 163149 underhand | 161751 | 162955 non | 161585 | 162823 chalant | 161585 | 162823 non-chal | 161585 | 162823 seldom | 161525 | 162739 accident | 161511 | 162676 almost | 161508 | 162782 often | 161488 | 162733 bare | 161401 | 162659 eager | 161278 | 162513 wise | 161073 | 162341 inexor | 161042 | 162265 feverish | 160805 | 162020 thorough | 160611 | 161823 home | 154672 | 155766 return | 154672 | 155766 lost | 154655 | 155567 glori | 154655 | 155567 start | 154655 | 155567 reminisc | 154655 | 155567 rumin | 154577 | 155776 read | 154529 | 155642 magazin | 154529 | 155642 pray | 154478 | 155748 floor | 154396 | 155477 sweep | 154396 | 155477 nag | 154271 | 155259 feel | 154271 | 155259 remors | 154271 | 155259 procrastin | 154256 | 155371 wake | 154220 | 155397 sleep | 154217 | 155353 panic | 154189 | 155346 get | 154168 | 155253 drunk | 154168 | 155253 stink | 154168 | 155253 hibern | 154158 | 155358 die | 153973 | 155223 fli | 153943 | 155056 rage | 153943 | 155056 flagel | 153916 | 155067 self-flagel | 153916 | 155067 daydream | 153864 | 155043 medit | 153816 | 154935 ceas | 153735 | 154815 exist | 153735 | 154815 joy | 153672 | 154754 beam | 153672 | 154754 trembl | 153656 | 154799 loud | 153635 | 154665 hide | 153592 | 154797 break | 153559 | 154599 coffe | 153559 | 154599 earn | 153538 | 154540 mile | 153538 | 154540 flier | 153538 | 154540 frequent | 153538 | 154540 leav | 153535 | 154730 rejoic | 153226 | 154412 sell | 147231 | 148103 plan | 147046 | 147809 escap | 147046 | 147809 throw | 146973 | 147764 negoti | 146905 | 147704 prenupti | 146905 | 147704 agreement | 146905 | 147704 card | 146892 | 147731 trade | 146892 | 147731 basebal | 146892 | 147731 oper | 146888 | 147787 small | 146888 | 147787 stand | 146888 | 147787 drink | 146881 | 147727 night | 146881 | 147727 steal | 146835 | 147847 pencil | 146835 | 147847 seek | 146816 | 148029 figur | 146801 | 147908 write | 146736 | 147720 letter | 146736 | 147720 recogn | 146723 | 147823 truce | 146684 | 147630 eat | 146670 | 147874 compet | 146647 | 147760 buy | 146642 | 147522 gift | 146642 | 147522 expens | 146642 | 147522 big | 146626 | 147717 fan | 146626 | 147717 fall | 146597 | 147601 assist | 146587 | 147589 requir | 146587 | 147589 chang | 146542 | 147479 heart | 146542 | 147479 conquer | 146542 | 147695 money | 146481 | 147450 borrow | 146481 | 147450 ignor | 146475 | 147643 share | 146415 | 147404 shower | 146415 | 147404 fault | 146413 | 147361 subtl | 146413 | 147361 kind | 146402 | 147492 great | 146397 | 147367 upon | 146396 | 147366 honor | 146396 | 147366 bestow | 146396 | 147366 pee | 146394 | 147477 avoid | 146392 | 147388 contact | 146392 | 147388 pink | 146372 | 147347 slip | 146372 | 147347 aid | 146367 | 147225 teach | 146366 | 147516 sanit | 146361 | 147477 lice | 146360 | 147409 buri | 146360 | 147483 cold | 146357 | 147220 pour | 146357 | 147220 freez | 146357 | 147220 water | 146357 | 147220 sea | 146347 | 147217 deep | 146347 | 147217 fish | 146347 | 147217 organ | 146321 | 147476 grit | 146289 | 147227 satiat | 146251 | 147349 assimil | 146251 | 147377 tri | 146188 | 147200 seduc | 146188 | 147200 reach | 146132 | 147008 understand | 146132 | 147008 brainwash | 146068 | 147158 admir | 146050 | 147021 caricatur | 145989 | 147107 deriv | 145941 | 146790 pervers | 145941 | 146790 satisfact | 145941 | 146790 moral | 145854 | 146733 lectur | 145854 | 146733 befriend | 145799 | 146963 learn | 145758 | 146666 lesson | 145758 | 146666 play | 145738 | 146706 pinochl | 145738 | 146706 peek | 145698 | 146737 danc | 145555 | 146637 fashion | 78762 | 79203 muddi | 78750 | 79236 hypnot | 78747 | 79204 childlik | 78579 | 79002 loyal | 78575 | 79056 mysteri | 78554 | 79047 annoy | 78532 | 79032 slow | 78517 | 78996 twist | 78515 | 79016 unstabl | 78510 | 78945 feder | 78501 | 78967 rever | 78501 | 79008 wrinkl | 78495 | 78965 rude | 78495 | 78975 boil | 78493 | 78972 high | 78481 | 78940 paid | 78481 | 78940 geosynchron | 78478 | 78931 greasi | 78476 | 78961 cosmopolitan | 78459 | 78903 fat | 78438 | 78935 inciner | 78429 | 78896 dot | 78426 | 78864 polka | 78426 | 78864 polka-dot | 78426 | 78864 outer | 78415 | 78910 phoni | 78411 | 78895 pathet | 78405 | 78869 purpl | 78405 | 78895 frozen | 78403 | 78886 nearest | 78396 | 78879 statesmanlik | 78386 | 78830 dirt | 78376 | 78828 encrust | 78376 | 78828 dirt-encrust | 78376 | 78828 sur | 78371 | 78895 obsequi | 78369 | 78805 salti | 78360 | 78834 imagin | 78356 | 78808 south | 78325 | 78787 american | 78325 | 78787 load | 78318 | 78832 righteous | 78282 | 78760 fractur | 78281 | 78737 educ | 78278 | 78682 colleg | 78278 | 78682 college-educ | 78278 | 78682 mitochondri | 78269 | 78745 treacher | 78265 | 78697 spartan | 78252 | 78707 felin | 78244 | 78713 ravish | 78242 | 78765 patern | 78241 | 78701 psychot | 78238 | 78693 shabbi | 78228 | 78685 dreamlik | 78224 | 78642 loath | 78221 | 78653 self-loath | 78221 | 78653 world | 78203 | 78658 call | 78183 | 78610 so-cal | 78183 | 78610 radioact | 78182 | 78623 alleg | 78178 | 78664 cantanker | 78159 | 78620 makeshift | 78159 | 78648 gentl | 78156 | 78609 fri | 78143 | 78648 linguist | 78141 | 78586 overrip | 78134 | 78572 varig | 78132 | 78609 vapor | 78105 | 78548 impromptu | 78104 | 78569 actual | 78104 | 78592 self-actu | 78104 | 78592 frighten | 78100 | 78544 molten | 78100 | 78567 gratifi | 78098 | 78528 bur | 78094 | 78563 hairi | 78092 | 78563 foreign | 78083 | 78569 tatter | 78050 | 78518 frustrat | 78044 | 78474 stoic | 78036 | 78503 eurasian | 78033 | 78513 proverbi | 78031 | 78519 green | 78024 | 78450 skinni | 78023 | 78524 familiar | 78016 | 78477 optim | 78006 | 78483 bohemian | 78002 | 78500 overpr | 77983 | 78411 pompous | 77955 | 78460 difficult | 77938 | 78375 raspi | 77924 | 78461 soggi | 77912 | 78381 resplend | 77910 | 78351 blotch | 77910 | 78380 fals | 77908 | 78409 infect | 77907 | 78399 magnific | 77898 | 78350 snooti | 77897 | 78422 moron | 77886 | 78362 moldi | 77865 | 78370 precis | 77860 | 78331 crispi | 77856 | 78324 smelli | 77813 | 78279 tempor | 77810 | 78244 alaskan | 77808 | 78258 elus | 77775 | 78245 miser | 77772 | 78232 flatul | 77761 | 78201 orbit | 77723 | 78157 mean-spirit | 77660 | 78113 flabbi | 77649 | 78110 nuclear | 77609 | 78069 go | 15532 | 15545 made | 1 | 1 america | 1 | 1 If you need anything else, let me know! Kris Oleg Bartunov wrote: >Kris, > >could you post 'explain analyze' output ? >Also, could you disable index usage (set enable_indexscan=off) >and rerun search using tsearch2 ? > >also, could you run 'stat' function to see frequency distribution >of words. See http://www.sai.msu.su/~megera/oddmuse/index.cgi/Tsearch_V2_Notes >for details. > >Oleg > > >>Hi all. I am doing some work with tsearch2 and am not sure what to >>expect out of it, performance wise. Here is my setup: >> >> Table "public.product" >> Column | Type | Modifiers >>-------------+----------+------------------------------------------------- >> description | text | >> product_id | integer | default nextval('product_product_id_seq'::text) >> vector | tsvector | >>Indexes: >> "vector_idx" gist (vector) >>Triggers: >> tsvectorupdate BEFORE INSERT OR UPDATE ON product FOR EACH ROW EXECUTE PROCEDURE >> tsearch2('vector', 'description') >> >>This table has 3,000,000 rows in it. Each description field has roughly 50 >>characters. There are fewer than ten thousand distinct words in my 3,000,000 rows. >>The vector was filled using the description fields values. I ran a vacuum full >>analyze before executing any of my queries. >> >>Here are a couple of tests I performed using the tsearch index and like; >> >>search_test=# select count(*) from product where vector @@ to_tsquery('oil'); >> count >>-------- >> 226357 >>(1 row) >> >>Time: 191056.230 ms >> >>search_test=# select count(*) from product where vector @@ to_tsquery('hydrogen'); >> count >>-------- >> 226868 >>(1 row) >> >>Time: 306411.957 ms >> >>search_test=# select count(*) from product where description like '% oil %'; >> count >>-------- >> 226357 >>(1 row) >> >>Time: 38426.851 ms >> >>search_test=# select count(*) from product where description like '% hydrogen %'; >> count >>-------- >> 226868 >>(1 row) >> >>Time: 38265.421 ms >> >> >>Both of the likes are using a sequential scan and both of the tsearch queries use >>the gist index. Did I miss a configuration parameter, are these queries incorrectly >>using tsearch2,or is this tsearch2's average performance? Thanks in advance for the >>input! >> >>Kris >> >> ---------------------------(end of broadcast)--------------------------- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED]) ---------------------------(end of broadcast)--------------------------- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html