Re: [PERFORM] Maximum number of sequences that can be created
2012/5/11 Robert Klemme shortcut...@googlemail.com:
>> On Fri, May 11, 2012 at 12:50 PM, Vidhya Bondre meetvbon...@gmail.com wrote:
>>> Is there any max limit set on sequences that can be created on the
>>> database? Also would like to know: if we create millions of sequences
>>> in a single db, what is the downside of it?
>>
>> The sequences AFAIK are accounted as relations. A large list of relations
>> may slow down different system utilities like vacuuming (or may not,
>> depending on queries and indexes on pg_class).
>
> On the contrary: what would be the /advantage/ of being able to create
> millions of sequences? What's the use case?

We are using sequences as statistics counters - they produce almost no
performance impact and we can tolerate their non-transactional nature. I can
imagine someone who wants to have a sequence per user or other relation row.

--
Best regards,
Vitalii Tymchyshyn
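The sequence-as-counter pattern described above can be sketched roughly as follows; the sequence name is hypothetical. Note that nextval() is not rolled back with the enclosing transaction, which is exactly the non-transactional behavior being tolerated here:

```sql
-- Hypothetical example: one sequence per statistics counter.
-- nextval() survives transaction rollback, so aborted transactions
-- still bump the count (acceptable for rough statistics).
CREATE SEQUENCE page_views_counter;

-- Increment the counter on each event:
SELECT nextval('page_views_counter');

-- Read the current value (8.x/9.x style: select from the sequence relation):
SELECT last_value FROM page_views_counter;
```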
Re: [PERFORM] Maximum number of sequences that can be created
On Sun, May 13, 2012 at 10:12 AM, Віталій Тимчишин tiv...@gmail.com wrote:
> 2012/5/11 Robert Klemme shortcut...@googlemail.com:
>> On the contrary: what would be the /advantage/ of being able to create
>> millions of sequences? What's the use case?
>
> We are using sequences as statistics counters - they produce almost no
> performance impact and we can tolerate their non-transactional nature. I
> can imagine someone who wants to have a sequence per user or other
> relation row.

I can almost see the point. But my natural choice in that case would be a
table with two columns. Would that actually be so much less efficient? Of
course you'd have fully transactional behavior and thus locking.

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
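Robert's two-column-table alternative might look like the sketch below (the schema is invented for illustration). Unlike a sequence, the UPDATE is fully transactional, but concurrent incrementers of the same counter serialize on the row lock until commit:

```sql
-- Hypothetical transactional counter table (the two-column alternative).
CREATE TABLE counters (
    name  text   PRIMARY KEY,
    value bigint NOT NULL DEFAULT 0
);

INSERT INTO counters (name) VALUES ('page_views');

-- Increment: rolls back with the transaction, but takes a row lock,
-- so heavy concurrent updates contend where nextval() would not.
UPDATE counters SET value = value + 1 WHERE name = 'page_views';

SELECT value FROM counters WHERE name = 'page_views';
```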
Re: [PERFORM] Any disadvantages of using =ANY(ARRAY()) instead of IN?
Hello Noah,

Thanks a lot for your feedback and explanations.

> Since you have 15+ tables at the top level, the genetic query optimizer
> should be kicking in and delivering a plan in reasonable time, albeit with
> plan quality hazards. There's a danger zone when the deterministic planner
> is still in effect but {from,join}_collapse_limit have limited the scope
> of its investigation. If you're in that zone and have not hand-tailored
> your explicit join order, poor plans are unsurprising. What exact
> configuration changes are you using?

Basically only the changes suggested here a year ago, which made the problem
go away for less complex queries:

geqo_threshold = 20
from_collapse_limit = 13
join_collapse_limit = 13

> Hundreds of rows, no. Consider this example:
>
> IN(...):         Total runtime: 2200.767 ms
> ANY(ARRAY(...)): Total runtime: 11748.348 ms

In case there is an index on C, the resulting index scan is, even with 1000
elements, 3 times faster on my notebook. However, both queries execute in
next-to-no time (15 vs. 5 ms).

> Filing a bug report with the content you've already posted would not add
> much, but a self-contained test case could prove useful. Many of the
> deficiencies that can make ANY(ARRAY(...)) win do represent unimplemented
> planner intelligence more than bugs. Incidentally, you can isolate whether
> ANY(ARRAY(...))'s advantage comes solely from suppressing the subquery
> collapse. Keep IN but tack OFFSET 0 onto the subquery. If this gives the
> same performance as ANY(ARRAY(...)), then the subquery-collapse
> suppression was indeed the source of advantage.

I see your point - some dumb logic to replace IN with ANY(ARRAY(...))
wouldn't always yield better results. I'll try to come up with a
self-contained test case.

Thanks again, Clemens
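Noah's OFFSET 0 experiment could be set up along these lines (table and column names are placeholders). OFFSET 0 acts as an optimization fence: the planner plans the subquery separately instead of collapsing it into the outer join tree, mimicking the effect attributed to = ANY(ARRAY(...)):

```sql
-- Original form: the planner may pull the subquery up into the outer query,
-- enlarging the join search space.
SELECT *
FROM table1
WHERE table1.id IN (SELECT id FROM table16 WHERE somevalue = 42);

-- With OFFSET 0: the subquery is planned on its own, so only its
-- collapse-suppression differs from the IN form above.
SELECT *
FROM table1
WHERE table1.id IN (SELECT id FROM table16 WHERE somevalue = 42 OFFSET 0);
```

Comparing EXPLAIN ANALYZE timings of the two forms against the ANY(ARRAY(...)) variant is the isolation test Noah describes.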
Re: [PERFORM] Maximum number of sequences that can be created
On Sun, May 13, 2012 at 1:12 AM, Віталій Тимчишин tiv...@gmail.com wrote:
> 2012/5/11 Robert Klemme shortcut...@googlemail.com:
>> On Fri, May 11, 2012 at 12:50 PM, Vidhya Bondre meetvbon...@gmail.com wrote:
>>> Is there any max limit set on sequences that can be created on the
>>> database? Also would like to know if we create millions of sequences
>>> in a single db, what is the downside of it.
>
> The sequences AFAIK are accounted as relations. Large list of relations
> may slowdown different system utilities like vacuuming (or may not,
> depends on queries and indexes on pg_class).

Not "may slow down". Change that to "will slow down, and possibly corrupt
your system".

In my experience (PG 8.4.x), the system can handle in the neighborhood of
100,000 relations pretty well. Somewhere over 1,000,000 relations, the
system becomes unusable. It's not that it stops working -- day-to-day
operations such as querying your tables and running your applications
continue to work. But system operations that have to scan for table
information seem to freeze (maybe they run out of memory, or are
encountering an O(N^2) operation and simply cease to complete). For example,
pg_dump fails altogether: after 24 hours, it won't even start writing to its
output file. The auto-completion in psql of table and column names freezes
the system. It takes minutes to drop one table. Stuff like that.

You'll have a system that works, but can't be backed up, dumped, repaired
or managed.

As I said, this was 8.4.x. Things may have changed in 9.x.

Craig
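As a rough sanity check for the relation counts Craig describes, the system catalog can be queried directly. Since sequences are counted as relations, they show up here too (a standard pg_class query, shown as a sketch):

```sql
-- Count relations by kind: r = table, i = index, S = sequence,
-- v = view, t = TOAST table (kinds vary slightly by PostgreSQL version).
SELECT relkind, count(*)
FROM pg_class
GROUP BY relkind
ORDER BY count(*) DESC;
```

If the total approaches the hundreds of thousands, the catalog-scanning symptoms described above become a real risk on 8.4-era systems.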
Re: [PERFORM] Any disadvantages of using =ANY(ARRAY()) instead of IN?
On Tue, May 01, 2012 at 04:34:10PM +0200, Clemens Eisserer wrote:
> select ... from table1 ... left outer join table15
>   WHERE table1.id IN (select id ... join table16 ... join table20
>                       WHERE table20.somevalue=?)
>
> Starting with some amount of joins, the optimizer starts to do quite
> suboptimal things like hash-joining huge tables where selectivity would
> be very low. I already raised join_collapse_limit and
> from_collapse_limit, but after a certain point query planning starts to
> become very expensive.

On Sun, May 13, 2012 at 04:35:30PM +0200, Clemens Eisserer wrote:
>> Since you have 15+ tables at the top level, the genetic query optimizer
>> should be kicking in and delivering a plan in reasonable time, albeit
>> with plan quality hazards. There's a danger zone when the deterministic
>> planner is still in effect but {from,join}_collapse_limit have limited
>> the scope of its investigation. If you're in that zone and have not
>> hand-tailored your explicit join order, poor plans are unsurprising.
>> What exact configuration changes are you using?
>
> Basically only the changes suggested here a year ago, which made the
> problem go away for less complex queries:
>
> geqo_threshold = 20
> from_collapse_limit = 13
> join_collapse_limit = 13

Given those settings and the query above, the planner will break the 15
top-level tables into lists of 13 and 2 tables, and the 20 subquery tables
into lists of 13 and 7 tables. The split points arise from order of
appearance in the query text. The planner then optimizes join order within
each list, but not across lists. That perfectly explains your observation
of "hash-joining huge tables where selectivity would be very low".

If it were me, I would try two things. First, set from_collapse_limit = 100,
join_collapse_limit = 100, geqo_threshold = 8. This will let the planner
consider all join orders for the 35 tables; it will use the genetic query
optimizer to choose one. See if the plan time and resulting plan are decent.
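The first experiment can be tried per-session before committing anything to postgresql.conf (values as suggested above; the query itself is elided):

```sql
-- Session-level settings for the first experiment:
-- collapse limits high enough to merge all 35 tables into one join problem,
-- geqo_threshold low enough that the genetic optimizer handles the
-- resulting search space instead of the exhaustive planner.
SET from_collapse_limit = 100;
SET join_collapse_limit = 100;
SET geqo_threshold = 8;

EXPLAIN SELECT ...;  -- run the original 35-table query here
```

SET affects only the current session, so a bad outcome costs nothing beyond the one slow EXPLAIN.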
Second, set from_collapse_limit = 100, join_collapse_limit = 100, geqo = off,
and EXPLAIN the query. This will take forever and might exhaust all system
memory. If it does complete, you'll see the standard planner's opinion of an
optimal join order. You can then modify the textual join order in your query
to get the same plan despite returning to a lower join_collapse_limit. Since
Hibernate is generating your queries, that may prove inconvenient; it's the
remaining escape hatch if the genetic query optimizer does not suffice.
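Once a good join order is known, it can be frozen by writing explicit JOIN syntax and lowering join_collapse_limit, as the PostgreSQL planner then takes the textual order literally (table names below are hypothetical):

```sql
-- With join_collapse_limit = 1, explicit JOIN clauses are never reordered,
-- so the textual order below becomes the actual join order.
SET join_collapse_limit = 1;

SELECT *
FROM table3
JOIN table1 ON table1.ref = table3.id   -- joined first
JOIN table2 ON table2.ref = table1.id;  -- joined second
```

This is the "modify the textual join order" step above made concrete; with generated queries (e.g. from Hibernate) it requires controlling the emitted SQL, which is the inconvenience noted.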