Re: [GENERAL] How to inject knowledge into a Postgres database
On Tuesday 11 October 2005 01:50, Jim C. Nasby wrote: What's the datatype on naamvrouw? Notice that it's being casted to text, which means an index on that column won't be used. Datatype is text. My problem is that I want it to use the date of birth index. I've increased the statistics to 1000, and only occasionally, because of the very uneven distribution of surnames, the planner does the wrong thing. (Thinking that only 14 rows will match 'blaa' instead of 767.) I've tried to set an even higher limit, but 1000 seems to be the maximum. Is this hardcoded, or can I set a higher maximum somewhere ? If the only penalty is slower analyzing, I don't care: we analyze at night when these system are idle. Cheers, Han Holl ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] How to inject knowledge into a Postgres database
[EMAIL PROTECTED] writes: I've increased the statistics to 1000, and only occasionally, because of the very uneven distribution of surnames, the planner does the wrong thing. I've tried to set an even higher limit, but 1000 seems to be the maximum. Is this hardcoded, or can I set a higher maximum somewhere ? You'd have to change the source code, but it's a simple tweak in the ALTER SET STATISTICS code. If the only penalty is slower analyzing, I don't care: we analyze at night when these system are idle. You'd be wrong about that --- the planner operations that use the data would necessarily be slower, too. I don't have any concrete information about how much slower, but I'd be hesitant to raise the figure much beyond 1000 ... However, if you can show you have a real-world case that benefits, I'd be willing to think about raising the wired-in limit to 1 or so. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] How to inject knowledge into a Postgres database
On Thursday 13 October 2005 16:06, Tom Lane wrote: You'd have to change the source code, but it's a simple tweak in the ALTER SET STATISTICS code. I don't think I'd want to do that. If the only penalty is slower analyzing, I don't care: we analyze at night when these system are idle. You'd be wrong about that --- the planner operations that use the data would necessarily be slower, too. I don't have any concrete information about how much slower, but I'd be hesitant to raise the figure much beyond 1000 ... However, if you can show you have a real-world case that benefits, I'd be willing to think about raising the wired-in limit to 1 or so. The example I gave earlier in the thread, date_of_birth = 'some-date' and surname like 'blaa%', was a real life example, but I had to pull it from a logfile that logs queries longer than 500 ms. It happens two or three times a day in a laboratory with 50 people querying the database all day. Estimates for date_of_birth number of rows are quite good (even at the default stats of 10) but surnames are just too unevenly distributed. But in 99% of all cases the guess is right, and by making it a nested query I could improve 1% and worsen 99%. Cheers, Han Holl ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] How to inject knowledge into a Postgres database
Tom, Oleg, Yonathan, thanks for the suggestions. Indeed, upping the statistics from 10 to 100 helped. But order by did not: palga=# explain analyze select rapnaam from udps where geboortedatum = '1966-01-01' and naamvrouw like 'vos%' order by geboortedatum; QUERY PLAN -- Sort (cost=18.07..18.08 rows=1 width=18) (actual time=774.941..774.941 rows=0 loops=1) Sort Key: main.geboortedatum - Index Scan using nv on main (cost=0.00..18.06 rows=1 width=18) (actual time=746.121..746.121 rows=0 loops=1) Index Cond: (((naamvrouw)::text = 'vos'::character varying) AND ((naamvrouw)::text 'vot'::character varying)) Filter: ((geboortedatum = '1966-01-01'::date) AND ((naamvrouw)::text ~~ 'vos%'::text)) Total runtime: 775.068 ms (6 rows) I got a similar problem with a functional index, but I guess my only option is to create a real column with the results of the function, and replace the functional index with a real one. Thanks to all, Han Holl ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] How to inject knowledge into a Postgres database
What's the datatype on naamvrouw? Notice that it's being casted to text, which means an index on that column won't be used. On Mon, Oct 10, 2005 at 11:29:38AM +0200, [EMAIL PROTECTED] wrote: Tom, Oleg, Yonathan, thanks for the suggestions. Indeed, upping the statistics from 10 to 100 helped. But order by did not: palga=# explain analyze select rapnaam from udps where geboortedatum = '1966-01-01' and naamvrouw like 'vos%' order by geboortedatum; QUERY PLAN -- Sort (cost=18.07..18.08 rows=1 width=18) (actual time=774.941..774.941 rows=0 loops=1) Sort Key: main.geboortedatum - Index Scan using nv on main (cost=0.00..18.06 rows=1 width=18) (actual time=746.121..746.121 rows=0 loops=1) Index Cond: (((naamvrouw)::text = 'vos'::character varying) AND ((naamvrouw)::text 'vot'::character varying)) Filter: ((geboortedatum = '1966-01-01'::date) AND ((naamvrouw)::text ~~ 'vos%'::text)) Total runtime: 775.068 ms (6 rows) I got a similar problem with a functional index, but I guess my only option is to create a real column with the results of the function, and replace the functional index with a real one. Thanks to all, Han Holl ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] How to inject knowledge into a Postgres database
Oleg Bartunov wrote: You could increase statistics or try contrib/tsearch2 Oleg On Fri, 7 Oct 2005, [EMAIL PROTECTED] wrote: Hello, We have a table of people with a date-of-birth and a surname, both indexed. We have queries like this: select report from table where dateofbirth = '1966-12-12' and surname like 'boer%' The planner estimates that in a 1.5M record table 'boer%' will have one record and 1966-12-12 about 40. Unfortunately, names are not evenly distributed, and in some combinations it will have to process many thousands of records. On some older and slower systems, this hurts. Is there a way to tell the planner to always prefer the dateofbirth index ? Alternatively, to inform it about the wildly uneven distribution of surnames (this must be even worse in China). A third possibility would be to rewrite such a query as a nested query: is there a rewrite query - hook (like apaches mod_rewrite) available ? Or is there a somewhere a proxy-server that could do this ? Cheers, Han Holl ---(end of broadcast)--- TIP 6: explain analyze is your friend Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match You can also force it to use the index using ORDER BY dateofbirth. Cheers, Ben-Nes Yonatan ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[GENERAL] How to inject knowledge into a Postgres database
Hello, We have a table of people with a date-of-birth and a surname, both indexed. We have queries like this: select report from table where dateofbirth = '1966-12-12' and surname like 'boer%' The planner estimates that in a 1.5M record table 'boer%' will have one record and 1966-12-12 about 40. Unfortunately, names are not evenly distributed, and in some combinations it will have to process many thousands of records. On some older and slower systems, this hurts. Is there a way to tell the planner to always prefer the dateofbirth index ? Alternatively, to inform it about the wildly uneven distribution of surnames (this must be even worse in China). A third possibility would be to rewrite such a query as a nested query: is there a rewrite query - hook (like apaches mod_rewrite) available ? Or is there a somewhere a proxy-server that could do this ? Cheers, Han Holl ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] How to inject knowledge into a Postgres database
[EMAIL PROTECTED] writes: Is there a way to tell the planner to always prefer the dateofbirth index ? Alternatively, to inform it about the wildly uneven distribution of surnames (this must be even worse in China). Increase the statistics target for that column, and re-ANALYZE the table. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] How to inject knowledge into a Postgres database
You could increase statistics or try contrib/tsearch2 Oleg On Fri, 7 Oct 2005, [EMAIL PROTECTED] wrote: Hello, We have a table of people with a date-of-birth and a surname, both indexed. We have queries like this: select report from table where dateofbirth = '1966-12-12' and surname like 'boer%' The planner estimates that in a 1.5M record table 'boer%' will have one record and 1966-12-12 about 40. Unfortunately, names are not evenly distributed, and in some combinations it will have to process many thousands of records. On some older and slower systems, this hurts. Is there a way to tell the planner to always prefer the dateofbirth index ? Alternatively, to inform it about the wildly uneven distribution of surnames (this must be even worse in China). A third possibility would be to rewrite such a query as a nested query: is there a rewrite query - hook (like apaches mod_rewrite) available ? Or is there a somewhere a proxy-server that could do this ? Cheers, Han Holl ---(end of broadcast)--- TIP 6: explain analyze is your friend Regards, Oleg _ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83 ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match