Re: [HACKERS] MySQL Compatibility WAS: postgresql.conf settings autoconfiguration
Dear all, Just a quick message about postgresql.conf auto-configuration. When MySQL users test PostgreSQL, they load their data and run simple SQL queries. If postgresql.conf is configured with default values, it may produce slow results. Would there be a way for postgresql.conf to auto configure? This is how MySQL works. Nothing is configurable. And people love it. So why not implement a simple auto configuration mechanism which would set basic parameters. Example autotune=true; autotune_profile='web site'; autotune_profile='dedicated SQL server'; ... Kind regards, Jean-Michel signature.asc Description: Ceci est une partie de message numériquement signée
Re: [HACKERS] MySQL Compatibility WAS: 8.5 release timetable, again
Le mercredi 26 août 2009 à 15:51 -0700, Josh Berkus a écrit : I doubt this would be an accurate description of all Drupal developers. My opinion was : Before adding replication to PostgreSQL, it would be better to support a basic set of MySQL syntax seems relevant: DELETE FROM table1, table2 FROM table1 INNER JOIN table2 ON ... UPDATE table1, table1 SET table1.foo=1, table2.bar=3 WHERE table1.foo=2 INNER JOIN table2 ..., etc ... There is not much to add to PostgreSQL, but it seems relevant. Otherwise, replicating some MySQL SQL syntax will not work. As you know, people willing to use PostgreSQL replication are possibly already MySQL replication users. So if they test and PostgreSQL fails, this is too bad. Drupal was only an example, I did not mean to criticize all Drupal developers. Just to say they focus on PHP and not SQL. They don't have time to port MySQL code. Besides, they are very nice people. So I apologize for this short (false) focus. Bye, Jean-Michel signature.asc Description: Ceci est une partie de message numériquement signée
Re: [HACKERS] MySQL Compatibility WAS: 8.5 release timetable, again
Le jeudi 27 août 2009 à 14:27 -0500, Jaime Casanova a écrit : the point was that if we simply were saying: hey! mysql can interpret this, make postgres do the same then we could end up with a lot of broken stuff... just because mysql users think is wonderful to not have to write sane code... Not MySQL in general. Only the subset which helps and seems reasonable. Bye, JM signature.asc Description: Ceci est une partie de message numériquement signée
Re: [HACKERS] 8.5 release timetable, again
Le mercredi 26 août 2009 à 01:36 -0600, Rick Gigger a écrit : One possible reason that replication is more critical now than it was a year ago is the rise in cloud computing. The ability to fire up instances on demand is much more useful when some of those boxes can be database servers and those databases servers can replicate the primary database and start doing something useful. As far as I can tell this one feature alone is the one thing that makes it hard to convince people to migrate away from Mysql despite it's demonstrable inferiority in many other areas. I think you should have a deep look at these two manuals that I wrote for Drupal: Guidelines for writing MySQL and PostgreSQL compliant SQL http://drupal.org/node/14 and Quidelines for writing SQL efficient code: http://drupal.org/node/559302 I have been using PostgreSQL since 1998. I took part in the development of pgAdmin 1 and pgAdmin 2. I am very proud of the PostgreSQL community, but I think it should fix some important issues in the domain of SQL compliance and compatibility. When reading developers comments on Drupal web site, it seems that there is deep misunderstanding between developers and SQL databases. I would say that 1% of developers know database technology. For example, most Drupal developers think that an INNER JOIN is faster than a LEFT JOIN. The reality of facts is that developers will not even test PostgreSQL and stop using it after the first SQL warning or error. So I would recommend focussing on usability. Then you can work on replication and materilized views. You probably know that a cloud requires several computers. With materialized view, a single computer can perform 100 times better. I see plenty of of possibilities to improve speed using materialized views. But first and firstly ... focus on usability. Then make a pre-release of a PostgreSQL 8.4.2 release or so ... We need to wipe out this MySQL issue once for all. If there is a compat MySQL mode or functions, then include it in core. This is too important for PostgreSQL success. Why MySQL usability is achieved add materialized views and MySQL is dead in the performance statistics, beaten 99% by PostgreSQL. Kind regards, Jean-Michel signature.asc Description: Ceci est une partie de message numériquement signée
Re: [HACKERS] PostgreSQL does CAST implicitely between int and a domain derived from int during SELECT query
Le mercredi 26 août 2009 à 10:07 -0400, Andrew Dunstan a écrit : Isn't that cast effectively a no-op? What is no-op in English (I am French, pardon my English). I will update table definition in 5 minutes. Stay tuned. Kind regards, Jean-Michel signature.asc Description: Ceci est une partie de message numériquement signée
[HACKERS] PostgreSQL does CAST implicitely between int and a domain derived from int during SELECT query
Dear friends, Thank you for your previous answers. I am running into a systemic problem using Drupal under PostgreSQL 8.4 Drupal relies heavily on a domain derived from int: CREATE DOMAIN int_unsigned AS integer CONSTRAINT int_unsigned_check CHECK ((VALUE = 0)); Analysing slow queries, I noticed that PostgreSQL 8.4 would cast data from int4 to int_unsigned. Details and query plan: http://drupal.org/node/559986 Some queries range between 400ms and 700ms. The problem is that Drupal send 300 SQL queries per page. Of course these queries are cached in PHP. But not solving it makes it difficult to run a Drupal site under heavy load. What do you think? Is this because of the constraint or is this a possible feature missing in PostgreSQL. What solution do you recommend to solve this problem? Is there a way to use a custom type? Should we convert the schema to int and add local constraints? A lot of SQL code hanging around could be impacted and this seems like a serious issue to me. So it seems difficult to convert all these applications. For usability, it should be better to avoid cast between int and int_unsigned. Kind regards, Jean-Michel signature.asc Description: Ceci est une partie de message numériquement signée
Re: [HACKERS] PostgreSQL does CAST implicitely between int and a domain derive from int during SELECT query
Le mercredi 26 août 2009 à 09:16 -0500, Kevin Grittner a écrit : Please repost with that the pgsql-performance list Done: http://drupal.org/node/559986 Kind regards, JMP signature.asc Description: Ceci est une partie de message numériquement signée
Re: [HACKERS] 8.5 release timetable, again
Le mercredi 26 août 2009 à 09:30 -0500, Kevin Grittner a écrit : It's not clear to me what you feel is needed. That could mean many things Dear Kevin, I rarely post on Hackers, so I will try to explain: * I use PostgreSQL since 1998. * I took part in the development of pgAdmin 12. * I love PostgreSQL and I believe MySQL sucks. Recently I was forced to use MySQL for Kdenlive.org project and the database sometimes stops responding sending nothing or crap. I believe that if you use MySQL in your company for a paid work, you can die of a heart attack. But at the same time: * I rewrote very long and tedious queries from PHPBB. Maybe 100 of them. They are now part of PhpBB3. I drove all queries below 30 ms and this enables PhpBB to scale easily. I consider this is my work. * I think Drupal queries presently have performance problems. If I wanted, I would be able to drive down Drupal web site, using a collection of simple queries on projects, forum and comments. But I don't of course. This is why I wrote http://drupal.org/node/559302 and http://drupal.org/node/14 Everytime I try a new Drupal module under PostgreSQL, I run into tiny SQL problems ranging from error to performance drop. The most problematic problem is http://drupal.org/node/559986 To fix a problem, I need to open a thread on Drupal web site and wait for the maintainer to discuss and commit. To give an example, Drupal includes a query optimizer written in PHP, which sometimes adds DISTINCT to queries. In the forum, some incredible query fetches 22000 rows, copies the rows to an arrays and then computes the array. This allows to display previous and next message. But we are not going to change the world of MySQL users, which believe they know SQL programming and in reality are complete beginners, who like to boast about farms and replication, just like Windows users like to collect Adobe products on DVDs and discuss with friends about them. IMHO for what I know from the porting work (I worked heavily on PHPBB3 and now Drupal), the first goal is to achieve compatibility with issues mentioned there: http://drupal.org/node/14 and add mysql compat functions in PostgreSQL core without breaking existing code. Then I can insure that 99% of MySQL compatibility problems will be behind. When this is achieve, we will be able to start education of developers. And this will take another decade. To win over MySQL, the best is to work on materialized views. There are very good articles available from hackers. Why not port to C. Materialized which which update when the data is needed would be perfect. Then we can convince Drupal hackers to add views in the schema. The trick would be that MySQL would support normal views, whereas we would also support materialized. We can do the same with nearly all available frameworks: PhpBB, etc ... Web apps are 95% of PostgreSQL possible users. Kind regards, Jean-Michel signature.asc Description: Ceci est une partie de message numériquement signée
Re: [HACKERS] 8.5 release timetable, again
Dear Kevin So when you talk about focusing on usablility improvements you mean that priority should be given to supporting MySQL-specific syntax extensions and ensuring that there are no queries where the MySQL optimizer comes up with a more efficient plan than PostgreSQL? Yes. PostgreSQL should be able to run MySQL code quoted here: This is a prerequisite for people to be willing to test and adopt PostgreSQL. People are not willing to debug frameworks like Drupal and port them to PostgreSQL. We are quite alone and lost. One concern I have is that you don't mention PostgreSQL configuration in your performance advice, and I seem to remember you said that you didn't tune your postgresql.conf file beyond boosting the shared_buffers setting. If that's true, you might be somewhat surprised with the performance improvements if you tweak just a few other settings. shared_buffer 24M. Kind regards, Jean-Michel signature.asc Description: Ceci est une partie de message numériquement signée
Re: [HACKERS] MySQL Compatibility WAS: 8.5 release timetable, again
Second, we're not going to support MySQL's *bugs* and *bad design decisions* which is what lazy developers actually want; they want something exactly the same as MySQL, including bugs. If they want that, they can use MySQL. We are not MySQL, and trying to out-MySQL MySQL is stupid, just as it would be to copy Oracle exactly. I understand what you mean. To tell you how lazy MySQL people are is my last experience in the Drupal world. In short, on my devel server, Drupal previous/next link display SQL script returns 21.000 rows. Of course, it should return only two rows. The 21.000 rows are then processed step by step by a PHP script. I open a bug and one of Drupal core developers answers: Jean-Michel, this is a friendly warning, please change your behavior. This is getting really annoying. In fact, this core developer does not like the way I try to explain how to use databases. I wrote two tutorials: http://drupal.org/node/559302 and http://drupal.org/node/14 The truth is that Drupal core developers do not believe fixing the prev/next link script is important. They don't care for SQL and don't understand the relationship between SQL queries and CPU cycles. Read this page: http://drupal.org/project/prev_next This performance issue was known for several months and these MySQL developers did not even fixed it. Finally, I escalated to the founder of the Drupal community to ask him to integrate a proper SQL Previous/Next script into Drupal. Their last answer: Holy crap jmpoure, this is not how the community works. We don't beg to Dries. Read: http://drupal.org/node/559424 Now, there are things which MySQL does better which we should fix, because they are real problems for our users who already like PostgreSQL. These include simple replication, upgrade in place, driver maintenance, covering indexes, MERGE, etc. But we'll do these because they make *Postgres* better, not because MySQL has them. After reading my story, I hope we can agree that noone is going to port any MySQL code to PostgreSQL ever. This demands too much intellectual efforts. Many people will migrate from DB2 and Oracle to PostgreSQL. But no MySQL developer is going to use PostgreSQL if he needs to modify SQL queries. I don't want to be offensive, but I really believe it. So we should support a minimal set of MySQL SQL instructions. After several years of porting MySQL code to PostrgeSQL, I believe that this limited list is enough: http://drupal.org/node/14 This is quite a straightforward need. Without this list of issues, PostgreSQL may never be able to run popular products developed under MySQL. Think of all commercial and free software projects. The impact of MySQLisms are huge. I can only compare it to Windows vs. GNU/Linux or FreeBSD. This is what comes in mind first. We are not leaving in a perfect world and there no reason to achieve perfectness. So let's support this list, please: http://drupal.org/node/14 Kind regards, Jean-Michel signature.asc Description: Ceci est une partie de message numériquement signée
Re: [HACKERS] MySQL Compatibility WAS: 8.5 release timetable, again
Le mercredi 26 août 2009 à 22:06 +0100, Greg Stark a écrit : With your current approach you're likely to get dismissed out of hand, not unlike what I can well believe happened in the Drupal world. This is the case. That would be unfortunate because I think there are 2 or 3 real improvements hidden in your list. Then explain I don't have your skills. Thanks. signature.asc Description: Ceci est une partie de message numériquement signée
Re: [HACKERS] MySQL Compatibility WAS: 8.5 release timetable, again
Le mercredi 26 août 2009 à 16:56 -0400, Alvaro Herrera a écrit : Have you tried Drupal 7? It's said to have many of these corrected. Maybe you should stop wasting your time with 6.x. I am running a large community on the Net and people would like to migrate our framework to Drupal. We agreed to. I already assembled 30 modules to develop a complete service. But it does not scale very well as my testings show that queries are slow. I wrote SQL guides to inform the Drupal community. It seems that Drupal developers do not make a relationship between database and CPU time or memory usage. Therefore Drupal PHP cache is filled with SQL queries. At it demands more and more memory. I am probably going to migrate to Drupal 6.x in a few days and I will fix queries by hand in case of problems. This is what happened when we migrated to PhpBB. Bye, Jean-Michel signature.asc Description: Ceci est une partie de message numériquement signée
Re: [HACKERS] DELETE syntax on JOINS
With proper foreign keys and ON DELETE CASCADE, why would supporting such syntax even be necessary? Porting existing abstraction layers from ANSI JOINs to ON DELETE CASCADE is complicated. What I don't get is why this is such a usability issue. Subqueries in DELETE FROM work perfectly well, and provide more flexibility than most users know what to do with. The ANSI syntax allows deleting one or several tables at once. Subqueries are not supported by MySQL on DELETE. Again, this is a usability issue to gain market shares and happy users againts MySQL. Kind regards, Jean-Michel signature.asc Description: Ceci est une partie de message numériquement signée
[HACKERS] DELETE syntax on JOINS
Dear Friends, First, thank you very much for considering a fix on the GROUP BY issue. I am starting a new thread about another issue: It seems that DELETE cannot understand INNER JOINS and needs HAVING. Read: http://drupal.org/node/62 (main message) http://drupal.org/node/555648 I don't see why PostgreSQL would not be able to run queries like: DELETE h FROM history AS h INNER JOIN term_node AS tn ON (h.nid = tn.nid) INNER JOIN term_data AS td ON (td.tid = tn.tid) WHERE h.uid = 2067 AND td.vid = 2 Ultimately, why not allow: DELETE h, tn FROM history AS h INNER JOIN term_node AS tn ON (h.nid = tn.nid) INNER JOIN term_data AS td ON (td.tid = tn.tid) WHERE h.uid = 2067 AND td.vid = 2 IMHO this would improve compliance towards other database systems. To me this seems to be in the reasonable scope of compatibility. Kind regards, Jean-Michel signature.asc Description: Ceci est une partie de message numériquement signée
[HACKERS] Feedback about Drupal SQL debugging
Dear friends, I have been using PostgreSQL since 6.3 releases and I am a real fan. Of course, I never use nor trust MySQL to deliver data. Now I use Drupal 6.3 with PostgreSQL 8.4. I loose a lot of time correcting Drupal SQL. You may be interested in my developer feedback. I gathered some real examples here: Guidelines for writing MySQL and PostgreSQL compliant SQL = http://drupal.org/node/14 This page gathers most frequent problems that Drupal users and developers encounter when using PostgreSQL. I would be delighted to have your feedback. Could some issues reasonably be fixed for a better Drupal support? Kind regards, Jean-Michel signature.asc Description: Ceci est une partie de message numériquement signée
Re: [HACKERS] Feedback about Drupal SQL debugging
I've registered for the Drupal site so that I can fix and/or expand some of your items. Thanks. I corrected the index on dual fields page. If you want to discuss Drupal PostgreSQL again, please post on the pgsql-advocacy list or the pgsql-php mailing lists. pgsql-hackers isn't the best place to get people to help you. I would prefer no, please. This post is made to understand what needs to be done at PostgreSQL level for better Drupal supports. As written previously, Drupal developers write MySQL code. Some of this code is not portable, okay. BTW, why don't we have a multi-argument version of CONCAT()? In 8.4, it would be possible ... I should add it to mysqlcompat library. yes. In PostgreSQL core ... PostgreSQL requires all non-aggregated fields to be present in the GROUP BY clause (I fixed 10 such issues in Drupal code). http://drupal.org/node/30 Why can't PostgreSQL add the required field automatically? Could this be added to PostgreSQL to-do-list? Kind regards, Jean-Michel signature.asc Description: Ceci est une partie de message numériquement signée
Re: [HACKERS] XML index support
Thank you all for these kind answers. Cheers, JMP -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] XML index support
Dear friends, You may know me as I worked a long time ago on pgAdmin I with Dave. As this is an XML related question and XML is quite new, I am posting on hackers ML. If any solution is being developped, please inform us. We would like to develop a free REST database (real-estate standard) based on a PostgreSQL schema. This is a free solution for free data also, released under BSD or GPL license. We are charity. This is supposed to a killer application, so we need to drive down queries to 3ms to 5ms to allow hundreds of simultaneous queries. Php will probably be part of the package. What is in your opinion the best way to achive this : * develop a traditional databe and implement materialized views in PL OR * implement the new XML type with some index (GIST) designed for PostgreSQL. Which one? OR * write triggers behind XML tables to pullulate a traditionnal database, add indexes and query traditionnal database. Kind regards, Jean-Michel -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] XML index support
Dear friends, You may know me as I worked a long time ago on pgAdmin I with Dave. As this is an XML related question and XML is quite new, I am posting on hackers ML. We would like to develop a free REST database (real-estate standard) based on a PostgreSQL schema. This is a free solution for free data also, released under BSD or GPL license. We are charity. This is supposed to a killer application, so we need to drive down queries to 3ms to 5ms to allow hundreds of simultaneous queries. Php will probably be part of the package. What is in your opinion the best way to achive this : * develop a traditional databe and implement materialized views in PL OR * implement the new XML type with some index (GIST) designed for PostgreSQL. Which one? OR * write triggers behind XML tables to pullulate a traditionnal database, add indexes and query traditionnal database. Kind regards, Jean-Michel -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgresql Materialized views
Dear Friends, I hope that this flame war can stop, as it is useless. The logic of free software is that developers pick-up issues, based on their skills and interest. The power of the cummunity is to gather very talented developers from all over the planet. Freedom is the logic and there is no need to drive the community. In the end, PostgreSQL relies on the knownledge of talented developers. When posting this thread, I hope that a talented developer would some day pick-up the materialized view issue and work on it, during a process of discussion. In a few days, I will post some precise statistics on how much MATERIALIZED VIEW could boost a 400.000 posts PhpBB 3.0 forum. I will write the required PL trigger, test them and install them on my server. Then each individual is able to decide whether materiazed views are important or not. This is my definition of freedom. Freedom of choice. I don't ask for more. Now, I hope that the list can return to a more peaceful state. Kind regards, Jean-Michel Pouré ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Postgresql Materialized views
In fairness to Jean-Michel, he has spent hundreds of hours in the past doing just that and far more for the pgAdmin users in the community - I'm sure we can excuse him for asking for what many do think would be a useful feature in the hopes that someone listening might just decide to pick it up. In the meantime though - have you tried rolling your own materialised views with some triggers Jean-Michel? I have good results doing that in the past. Dear friends, For of all, thanks for picking up my message and replying. I agree with Dave Page and others that it is hard to find contributors for a Free project. When writing 2% of pgAdmin2 code, I noticed that you could count 10.000 end-users for ONE developper. This is why I usually offer a bounty for any kind of feature. Two months ago, my last bounty for a Kdenlive feature (100€), but no-one aggreed to receive the money after developing the feature, probably because there were dozens of contibutors (people writing code on top of someone else code). But the power of PostgreSQL is to be a real community, like Xorg or Apache. This makes all the difference. This was the heart of my message. This being said, our behind the scene needs are: * We are a non-profit organisation based in France, providing real-estate listing services. * We are going to spend 2000€ buying two servers (Phantom 4-die + 8GB 2U servers), running on Debian: one for PostgreSQL, the other for Apache2. * We plan to launch the first virtual real-estate agency managed by a free community. * We are going to use a mixture of PhpBB 3.0, Drupal and a custom applications. * We plan to handle thousands of simultaneous users. So ... Based on my current logs, I know that I may have to buy and run a web farm. Which I do not want, because it implies extra hosting costs. AND I know that ONE PostgreSQL database can handle 1000 simultaneous queries, when using server-side code. I wrote a small PostgreSQL query optimization HOWTO : http://area51.phpbb.com/phpBB/viewtopic.php?f=3t=29292 Pardon me if some information is not accurate. I tried my best to educate PhpBB developers choosing the right technology. PostgreSQL... Now, I am going to write the requested server-side code for PhpBB 3.0 Then I will benchmark queries on our large server. In the end I hope that someone will be willing to pick-up this MATERIALIZED VIEW issue, so that any applications running on PostgreSQL can benefit from lightning speed. If someone is willing to receive a bounty, please contact me. Being a non-profit organisation, I can ask money to contributors on our web site. This will not be any kind of large sum of money, something between 100€ and 500€. Kind regards and happy new year. Jean-Michel Pouré ---(end of broadcast)--- TIP 6: explain analyze is your friend
[HACKERS] Postgresql Materialized views
Dear Friends, In my past development projects, I always used the concept of Materialized VIEW to speed-up SELECTs over INSERTs. You are well aware of Jonathan Gardner preliminary work: http://www.varlena.com/varlena/GeneralBits/Tidbits/matviews.html When do you plan to add MATERIALIZED VIEWS to PostgreSQL? This would be major improvement ni the case of Web applications. I run a 400.000+ message board using PhpBB 3.0. After optimization, some queries still need 30 millisecond to run. With Materialized views, it should be possible to drive these queries to 1 millisecond. This means that in some situations a PostgreSQL backend could handle 10 times more queries. My database handles 10 to 20 queries every second. There are 100 selects for 1 INSERT. But my database could well handle over 500 queries a second using materialized views. At my level, here are my plans: 1) Publish some long query LOGs from my database, longuer than 30 milliseconds. 2) Write some PL code to demonstrate the interest in Materialized Views. Publish benckmarks showing time improvement, like 1 milisecond. 30x faster. 3) Then wait for someone on Hackers mailing list to pick-up this important issue and integrate Materialized views in PostgreSQL schema and SQL language. Any information and discussion about materialized views is welcome. Please pick-up this important issue for developpers. There is no need to concentrate on complex issues, when handling materialized views could boost somme web apps. by a factor of 10 or more. Kind regards and happy new year. I hope that 2008 will be the year of materialized views. Jean-Michel Pouré ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[HACKERS] Materialized Views in PostgreSQL
Dear friends and all, Johnathan Gardner did a wonderful job on materialized views: http://jonathangardner.net/PostgreSQL/materialized_views/matviews.html Are there any plans to integrate materialized views written in plain C in the future. This can help gain a lot of time in Web applications. For example, Web CMS or even forums have 99% SELECT queries. Being able to use materialized view would boost speed. Kind regards, Jean-Michel ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[HACKERS] Materialized Views in PostgreSQL
Dear friends and all, Johnathan Gardner did a wonderful job on materialized views: http://jonathangardner.net/PostgreSQL/materialized_views/matviews.html Are there any plans to integrate materialized views written in plain C in the future. This can help gain a lot of time in Web applications. For example, Web CMS (or even forums) have 99% SELECT queries. Being able to use materialized view would boost speed. Kind regards, Jean-Michel ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster