[HACKERS] SoC ECPG Enhancements
The Google Summer of Code web page lists as a possible project "ECPG Enhancements": enable ECPG to generate calls directly to libpq rather than calls to its own libraries. What would be the point of that? -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] [PATCHES] Bitmapscan changes
Hannu Krosing wrote: On one fine day, Wed 2007-03-14 at 10:22, Heikki Linnakangas wrote: Clustered indexes have roughly the same performance effect and use cases as clustered indexes on MS SQL Server and index-organized tables on Oracle, but the way I've implemented them is significantly different. On other DBMSs, the index and heap are combined into a single b-tree structure. My implementation is less invasive: there are no changes to the heap, for example, and it doesn't require moving live tuples. Do you keep visibility info in the index? No. If there is no visibility data in the index, then I can't see how it gets the same performance effect as index-organized tables, as a lot of random heap access is still needed. Let me illustrate the effect in the best case, with a table that consists of just the key:

Normal b-tree:

Root  Leaf  Heap
aaa - aaa - aaa
      bbb - bbb
      ccc - ccc
ddd - ddd - ddd
      eee - eee
      fff - fff
ggg - ggg - ggg
      hhh - hhh
      iii - iii

Clustered b-tree:

Root  Heap
aaa - aaa
      bbb
      ccc
ddd - ddd
      eee
      fff
ggg - ggg
      hhh
      iii

The index is much smaller, one level shallower in the best case. A smaller index means that more of it fits in cache. If you're doing random access through the index, you need less I/O because you don't have to fetch as many index pages. You still need to access the heap for the visibility information, as you pointed out, but the savings come from doing less index I/O. How close to the best case do you get in practice? It depends on your schema (narrow tables, or tables with wide keys, gain the most) and on the clusteredness of the table. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
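The claimed depth savings follow from b-tree fanout arithmetic: with N index entries and f entries per page, the tree needs roughly ceil(log_f N) levels, so shrinking the entry count can remove a level. A back-of-the-envelope sketch (Python used only for illustration; the row counts, fanout, and grouping factor are made-up numbers, not measurements):

```python
import math

def btree_levels(n_keys: int, fanout: int) -> int:
    """Approximate number of levels in a b-tree holding n_keys entries
    with the given per-page fanout (a single leaf page counts as 1)."""
    if n_keys <= fanout:
        return 1
    return math.ceil(math.log(n_keys, fanout))

# Illustrative numbers only: 100M rows, ~300 entries per index page.
normal = btree_levels(100_000_000, 300)      # one entry per heap tuple
# If a clustered index keeps one entry per group of, say, 20 adjacent
# heap tuples, the index holds 20x fewer entries:
clustered = btree_levels(100_000_000 // 20, 300)
print(normal, clustered)  # 4 3
```

With these assumed numbers the clustered index is one level shallower, which is exactly the best-case effect described above.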
Re: [HACKERS] [PATCHES] Bitmapscan changes
Alvaro Herrera wrote: Which is why we don't do things that way. The code must fit within the general architecture before application -- particularly if it's an internal API change. That's what the review process is for. Yes, of course. As I've said, I have the time to work on this, but I need to get the review process *started*. Otherwise I'll just tweak and polish the patch for weeks, and end up with something that gets rejected in the end anyway. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
[HACKERS] where to add/change commands
Hi folks, I am trying to make the CLUSTER command just a bit verbose, as an option. I added bits to gram.y that should support CLUSTER [VERBOSE] ... but psql doesn't seem to pick it up. Where else do I need to update it, besides gram.y, cluster.c, and of course adding a new field to ClusterStmt? I tried searching the list for such hints, but didn't find anything useful. thanks. -- Grzegorz Jaskiewicz [EMAIL PROTECTED] C/C++ Freelance to hire.
Re: [HACKERS] CLUSTER and MVCC
Heikki Linnakangas wrote: One potential issue I'm seeing is that if we rely on the unbroken chain starting from OldestXmin, and a tuple isn't there because of a bug, for example, the later version of the tuple is skipped and the row is lost. After further thought, I don't feel comfortable with the idea, because of the above loss of robustness. I'm thinking of keeping an in-memory mapping of the old and new tids of updated tuples while clustering, instead. That means that cluster requires a little bit of memory for each RECENTLY_DEAD updated tuple. In the worst case you run out of memory if there are too many of those in the table, but I doubt that's going to be a problem in practice. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
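The proposed bookkeeping amounts to a hash table from old tids to new tids, consulted when re-linking update chains. A minimal sketch of the idea, with tids modeled as (block, offset) pairs (this illustrates the data structure only, not the actual backend code):

```python
# Old-tid -> new-tid mapping built while copying tuples to the new heap.
tid_map: dict[tuple[int, int], tuple[int, int]] = {}

def record_copy(old_tid, new_tid):
    """Remember where an updated (RECENTLY_DEAD) tuple was moved to."""
    tid_map[old_tid] = new_tid

def translate_ctid(old_ctid):
    """Translate a chain pointer to its new-heap location. If the
    successor wasn't memorized, only the chain link is lost; the row
    data itself is still intact in the new heap."""
    return tid_map.get(old_ctid)

record_copy((7, 3), (0, 1))   # tuple moved from block 7, offset 3
record_copy((9, 1), (0, 2))
assert translate_ctid((7, 3)) == (0, 1)
assert translate_ctid((42, 5)) is None  # unknown link: chain broken, row kept
```

The memory cost is one small entry per RECENTLY_DEAD updated tuple, matching the worst-case concern stated above.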
Re: [HACKERS] where to add/change commands
Grzegorz Jaskiewicz wrote: I am trying to make CLUSTER command just a bit verbose,as an option here. Added bits to gram.y that should support CLUSTER [VERBOSE] ... but psql doesn't seem to pick it up. psql? There's some code to detect commands that can't be run in a transaction block in src/bin/psql/common.c; maybe that's what you're looking for. Or did you mean something else? In what way doesn't it pick it up? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] [PATCHES] Bitmapscan changes
Joshua D. Drake wrote: This is what I suggest. Provide a tarball of -head with the patch applied. Here you are: http://community.enterprisedb.com/git/pgsql-git-20070315.tar.gz Provide a couple of use cases that can be run, with an explanation of how to verify them. There are a number of simple test cases on the web page that I've used (perfunittests). I can try to simplify them and add explanations. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] where to add/change commands
On Mar 15, 2007, at 11:36 AM, Heikki Linnakangas wrote: Grzegorz Jaskiewicz wrote: I am trying to make CLUSTER command just a bit verbose,as an option here. Added bits to gram.y that should support CLUSTER [VERBOSE] ... but psql doesn't seem to pick it up. psql? There's some code to detect commands that can't be run in a transaction block in src/bin/psql/common.c, maybe that's what you're looking for. Meaning that if I type CLUSTER VERBOSE in psql, I get a syntax error. Or did you mean something else? How doesn't it pick it up? Well, a patch is probably worth 1000 words. Hope that attaching such a small file isn't a crime in this neck of the woods. clusterverbose.patch Description: Binary data -- Grzegorz Jaskiewicz [EMAIL PROTECTED]
Re: [HACKERS] where to add/change commands
Hi, psql? There's some code to detect commands that can't be run in a transaction block in src/bin/psql/common.c, maybe that's what you're looking for. Or did you mean something else? How doesn't it pick it up? I think he probably meant that he was getting a syntax error, even after making all the changes. Grzegorz, I would have suggested to make an entry for VERBOSE in parser/keywords.c, but it already seems to contain an entry for VERBOSE. I hope you are using the opt_verbose rule in your gram.y in the CLUSTER [VERBOSE] case. Regards, Nikhils -- EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] where to add/change commands
On Mar 15, 2007, at 11:46 AM, NikhilS wrote: Grzegorz, I would have suggested making an entry for VERBOSE in parser/keywords.c, but it already seems to contain an entry for VERBOSE. I hope you are using the opt_verbose rule in your gram.y in the CLUSTER [VERBOSE] case. Sure I am. My 'bison' skills are not very high, but I was trying to mimic the 'VACUUM' syntax there; see the patch. -- Grzegorz Jaskiewicz [EMAIL PROTECTED] C/C++ freelance to hire
Re: [HACKERS] where to add/change commands
Grzegorz Jaskiewicz wrote: meaning that if I type in CLUSTER VERBOSE in psql, I get syntax error. Your patch works perfectly fine for me. -- Peter Eisentraut http://developer.postgresql.org/~petere/
Re: [HACKERS] where to add/change commands
On Mar 15, 2007, at 12:26 PM, Peter Eisentraut wrote: Your patch works perfectly fine for me. Ok, I'll try make distclean / make -j2 / sudo make install, reboot, and will see. I was kinda believing that remaking the whole thing from scratch over and over again wouldn't be a necessity here. thanks. -- Grzegorz Jaskiewicz C/C++ freelance for hire
Re: [HACKERS] where to add/change commands
Grzegorz Jaskiewicz [EMAIL PROTECTED] writes: On Mar 15, 2007, at 12:26 PM, Peter Eisentraut wrote: Your patch works perfectly fine for me. Ok, I'll try make distclean/make -j2/sudo make install, reboot and will see. I was kinda beliving that remaking whole thing from scratch over and over again wouldn't be a necessity here. thanks. You may want to configure with --enable-depend. It shouldn't be necessary to get make to notice changes to gram.y but perhaps your problem lies elsewhere. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] where to add/change commands
ok, it works okay. Thanks. I am really serious about adding this patch. I would like it to not only show which tables/indices are being clustered, but also some sort of progress information (print the percentage of the job done, etc.). Any hints on what might be useful for others? Perhaps there's a slim chance I could offer it to mainline? -- Grzegorz Jaskiewicz C/C++ freelance for hire
[HACKERS] [RFC] CLUSTER VERBOSE
Hi folks, I figured I should start a brand new thread for this one, so here you go. I am in need of a verbose CLUSTER, i.e. one that would give me feedback and progress. Because CLUSTER is divided into two major operations (data reordering, index rebuild), I see it this way: CLUSTER on I: index name T: table name, data reordering; CLUSTER on I: index name T: table name, index rebuild; and then: CLUSTER 10%, CLUSTER 12%, etc. (yeah, I know how hard it is to write a good progress indicator). I don't have even the slightest doubt that it can be useful, just like VACUUM VERBOSE is. So no question about it. I am seeking comments and ideas. The patch would not be very intrusive; at the moment no one is using VERBOSE for CLUSTER, because it is not there, and nothing would change in this area. I am looking for opinions on what information should be presented. Perhaps there's also use for some information it might gather elsewhere (stats, etc.), but that's not really my point at the moment. Thanks for all comments. btw, I would really appreciate not being CCed on this; I have been subscribed here for years now (since 8.0 times). ta. -- Grzegorz Jaskiewicz C/C++ freelance for hire
Re: [HACKERS] how to add seconds to a TimestampTz
Tom Lane wrote: Alvaro Herrera [EMAIL PROTECTED] writes: Is there a better way than going to time_t and back? Isn't the standard SQL-level locution timestamptz + numeric_value * '1 second'::interval ? I'm not sure what would be the most convenient realization of this at the C level, but please stay away from time_t ... Yes, it is, but we would have to be calling interval input, interval-times-integer, then interval-plus-timestamp each time we scheduled a worker (to calculate the time of the next worker start), which sounds like too much work ... an idea would be to store the result of interval input and multiplication, and recalculate each time we got SIGHUP. But then, maybe this is premature optimization on my part. I'll write it in the natural form for now, and then we'll see if we should rewrite it; and then, maybe it's easier to write something equivalent to TimestampDifference ... -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] how to add seconds to a TimestampTz
On Mar 15, 2007, at 5:58 AM, Tom Lane wrote: ? I'm not sure what would be the most convenient realization of this at the C level, but please stay away from time_t ... What's wrong with time_t? Does postgres have some sort of time API that can be used instead? -- Grzegorz Jaskiewicz C/C++ freelance for hire
[HACKERS] Additional global stats for pg_database?
Hi! I'd like to add the ability to show global counters for a couple of things currently available in pg_stat_all_tables (for example, number of rows added etc.) for the entire cluster, to allow monitoring an entire cluster without having to connect to each individual database. Should I try adding this to the existing pg_stat_database view, or should I look at creating a separate view or two? //Magnus
Re: [HACKERS] how to add seconds to a TimestampTz
If you read the autovacuum_naptime into an Interval object once, why can't you just use timestamptz_pl_interval? You won't be using the interval input/output repeatedly, surely. Regards Sailesh -- Sailesh Krishnamurthy Amalgamated Insight [W] (650) 242-3503 [C] (650) 804-6585 -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Alvaro Herrera Sent: Wednesday, March 14, 2007 3:46 PM To: Hackers Subject: [HACKERS] how to add seconds to a TimestampTz Is there a better way than going to time_t and back? I am currently using this: db->next_worker = time_t_to_timestamptz(timestamptz_to_time_t(current_time) + autovacuum_naptime); (db->next_worker is a TimestampTz, as is current_time. autovacuum_naptime is an integer number of seconds.) but it doesn't seem clean, and the comments in the functions more or less say that their use is discouraged. I saw about doing it via the interval input/output but that's an awful lot of work ... Is this the first time this is going to be done in the backend? -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
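For reference, the arithmetic under discussion is simply "timestamptz plus naptime seconds"; at the SQL level that is now() + autovacuum_naptime * interval '1 second'. As an analogy only (Python datetime standing in for the backend's timestamp types), the scheduling computation being sought looks like:

```python
from datetime import datetime, timedelta, timezone

autovacuum_naptime = 60  # seconds, as the GUC being discussed

def next_worker(current_time: datetime, naptime_s: int) -> datetime:
    # Equivalent of timestamptz + naptime * interval '1 second',
    # with no round-trip through time_t.
    return current_time + timedelta(seconds=naptime_s)

t = datetime(2007, 3, 15, 12, 0, 0, tzinfo=timezone.utc)
print(next_worker(t, autovacuum_naptime))  # 2007-03-15 12:01:00+00:00
```

The point of the thread is that the backend wants this same direct addition on its native timestamp representation rather than converting to and from time_t.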
Re: [HACKERS] My honours project - databases using dynamically attached entity-properties
On Wed, 14 Mar 2007, David Fetter wrote: On Tue, Mar 13, 2007 at 05:54:34PM +, Richard Huxton wrote: David Fetter wrote: On Tue, Mar 13, 2007 at 02:21:37PM +, Richard Huxton wrote: David Fetter wrote: On Tue, Mar 13, 2007 at 09:31:45AM +, Richard Huxton wrote: * Another good example is the questionnaire. With all due respect, this is a solved problem *without EAV or run-time DDL*. The URL below has one excellent approach to this. http://www.varlena.com/GeneralBits/110.php Which broadly speaking was the solution I used for my questionnaire, except I had a restricted set of types so basically just coerced them to text and side-stepped the inheritance issue. To the extent that it's dynamic, it's still just EAV though. That's precisely the difference between the above solution and yours, and it's the difference between a good design and one that will come up and bit you on the as^Hnkle. It's still basically EAV (either approach). The key fault with EAV is that the tables have no semantic meaning - answer_int contains number of oranges, days since birth and the price of a tube ticket in pennies. Stuffing all of those into an answer_int is *precisely* what the end user must not do. That's pilot error. Now, with a questionnaire that might not matter because everything is an answer and you're not necessarily going to do much more than count/aggregate it. See above. It doesn't remove the need for run-time DDL if you allow users to add their own questions. Sure it does. When a user, who should be talking with you, wants to ask a new kind of question, that's the start of a discussion about what new kind(s) of questions would be generally applicable in the questionnaire schema. Then, when you come to an agreement, you roll it into the new schema, and the whole system gets an improvement. Fine, but if you're not letting the user extend the system, then it's not really addressing Edward's original posting, is it? It's my contention that Edward's original idea is ill-posed. 
SQL is just fine for doing this kind of thing, and it's *not that hard*. If the user's talking to me, I might as well just write the DDL myself - it's the talk that'll take the time, not writing a dozen lines of SQL. It's the talk that's the important part. Machines are really bad at seeing the broader picture. In the attempt to save a few minutes' discussion, he's trying to borrow that time from a system asked to do things that computers are inherently bad at doing, and every end user will pay that time back at a very high rate of interest. This is precisely the kind of false economy that so plagues software development and maintenance these days. The interesting part of the problem (from a Comp-Sci point of view) is precisely in automating part of that discussion. It's providing an abstraction so that you don't end up with a mass of attributes while still providing freedom to the user. This freedom and efficiency you're talking about is better supplied, IMHO, by putting a standard DDL for questionnaires up on a pgfoundry or an SF.net. That way, improvements to the DDL get spread all over the world, and a very large amount of wheel reinvention gets avoided. Reusable components are a big chunk of both freedom and efficiency. :) Cheers, D Maybe I should rethink the problem a bit - from the very brief initial research I've done, it seems EAV schemas have two common uses: 1) When new attributes have to be created on-the-fly 2) When the number of possible properties for an entity greatly (orders of magnitude) exceeds the number of properties any one entity is likely to have. I'm not sure about solving the first problem - there seems to be a lot of debate around this. I can see reasons for and against allowing this. However I think the second is a very real problem. One such example is a patient record system. For each patient we have a table of common data (dob, sex, height, weight etc) but as well as this a patient can present with many symptoms. 
This might be a table of 40,000 possible symptoms. Let's say we want to run a query on these symptoms (using a boolean expression) to return the patient records which match the query string on the symptoms. (This turns out to be a very similar problem to the 'tags' example I first presented) - assume a similar schema. With more than a couple of symptoms and a complex tree, the resulting SQL can span pages. When I first started thinking about this project I believed the two problems to be essentially the same class of problem, but this may not be the case. What do people think? - Also, thanks everyone for your input thus far. It has been very valuable. Eddie Stanley
Re: [HACKERS] My honours project - databases using dynamically attached entity-properties
David Fetter wrote: On Wed, Mar 14, 2007 at 12:07:45PM +1300, Edward Stanley wrote: On Wed, 14 Mar 2007, David Fetter wrote: On Tue, Mar 13, 2007 at 05:54:34PM +, Richard Huxton wrote: David Fetter wrote: On Tue, Mar 13, 2007 at 02:21:37PM +, Richard Huxton wrote: David Fetter wrote: On Tue, Mar 13, 2007 at 09:31:45AM +, Richard Huxton wrote: * Another good example is the questionnaire. With all due respect, this is a solved problem *without EAV or run-time DDL*. The URL below has one excellent approach to this. http://www.varlena.com/GeneralBits/110.php Which broadly speaking was the solution I used for my questionnaire, except I had a restricted set of types so basically just coerced them to text and side-stepped the inheritance issue. To the extent that it's dynamic, it's still just EAV though. That's precisely the difference between the above solution and yours, and it's the difference between a good design and one that will come up and bit you on the as^Hnkle. It's still basically EAV (either approach). The key fault with EAV is that the tables have no semantic meaning - answer_int contains number of oranges, days since birth and the price of a tube ticket in pennies. Stuffing all of those into an answer_int is *precisely* what the end user must not do. That's pilot error. Now, with a questionnaire that might not matter because everything is an answer and you're not necessarily going to do much more than count/aggregate it. See above. It doesn't remove the need for run-time DDL if you allow users to add their own questions. Sure it does. When a user, who should be talking with you, wants to ask a new kind of question, that's the start of a discussion about what new kind(s) of questions would be generally applicable in the questionnaire schema. Then, when you come to an agreement, you roll it into the new schema, and the whole system gets an improvement. 
Fine, but if you're not letting the user extend the system, then it's not really addressing Edward's original posting, is it? It's my contention that Edward's original idea is ill-posed. SQL is just fine for doing this kind of thing, and it's *not that hard*. If the user's talking to me, I might as well just write the DDL myself - it's the talk that'll take the time, not writing a dozen lines of SQL. It's the talk that's the important part. Machines are really bad at seeing the broader picture. In the attempt to save a few minutes' discussion, he's trying to borrow that time from a system asked to do things that computers are inherently bad at doing, and every end user will pay that time back at a very high rate of interest. This is precisely the kind of false economy that so plagues software development and maintenance these days. The interesting part of the problem (from a Comp-Sci point of view) is precisely in automating part of that discussion. It's providing an abstraction so that you don't end up with a mass of attributes while still providing freedom to the user. This freedom and efficiency you're talking about is better supplied, IMHO, by putting a standard DDL for questionnaires up on a pgfoundry or an SF.net. That way, improvements to the DDL get spread all over the world, and a very large amount of wheel reinvention gets avoided. Reusable components are a big chunk of both freedom and efficiency. :) Cheers, D Maybe I should rethink the problem a bit - from the very brief initial research I've done, it seems EAV schemas have two common uses: 1) When new attributes have to be created on-the-fly 2) When the number of possible properties for an entity greatly (orders of magnitude) exceeds the number of properties any one entity is likely to have. Um, no. The first use case is bad coding practice, and the second is a classic case for a join table, which is the standard way to handle M:N relationships. 
I'm not sure about solving the first problem - there seems to be a lot of debate around this. I can see reasons for and against allowing this. However I think the second is a very real problem. One such example is a patient record system. For each patient we have a table of common data (dob, sex, height, weight etc.) but as well as this a patient can present with many symptoms. This might be a table of 40,000 possible symptoms. Here's how I'd do that:

CREATE TABLE patient (
    patient_id SERIAL PRIMARY KEY, /* for simplicity. Some combination
       of columns in the table would also have a UNIQUE NOT NULL
       constraint on it. */
    ...
);

CREATE TABLE symptom (
    symptom_id SERIAL PRIMARY
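The join-table design being proposed can be exercised end-to-end. A hedged sketch, using SQLite purely because it is convenient to run (the abbreviated column lists are filled in with placeholder columns; names like patient_presents_with follow the schema quoted later in the thread), including the "patients presenting with all of these symptoms" query that the M:N schema makes straightforward:

```python
import sqlite3

# Illustrative only: a miniature version of the patient/symptom schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patient (patient_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE symptom (symptom_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE patient_presents_with (
        patient_id INTEGER NOT NULL REFERENCES patient(patient_id),
        symptom_id INTEGER NOT NULL REFERENCES symptom(symptom_id),
        UNIQUE (patient_id, symptom_id)
    );
    INSERT INTO patient VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO symptom VALUES (1, 'fever'), (2, 'cough');
    INSERT INTO patient_presents_with VALUES (1, 1), (1, 2), (2, 2);
""")
# "Patients presenting with ALL of the listed symptoms" -- a
# relational-division query over the join table, no EAV needed.
wanted = (1, 2)
rows = conn.execute("""
    SELECT p.name FROM patient p
    JOIN patient_presents_with ppw USING (patient_id)
    WHERE ppw.symptom_id IN (?, ?)
    GROUP BY p.patient_id, p.name
    HAVING COUNT(DISTINCT ppw.symptom_id) = 2
""", wanted).fetchall()
print(rows)  # [('Ann',)]
```

Even with 40,000 possible symptoms, the join table stays one row per (patient, symptom) fact, and queries stay a fixed size rather than growing with the symptom vocabulary.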
Re: [HACKERS] Calculated view fields (8.1 != 8.2)
Tom Lane wrote: Gaetano Mendola [EMAIL PROTECTED] writes: I can imagine a case where a lower module exports a view to an upper layer, stating the interface as a list of fields: first_name, last_name, with a *hidden* field that is a function call that updates statistics on how many times a given record was selected; then this technique can not be used anymore starting with 8.2.x. You're living in a dream world if you think that works reliably in *any* version of Postgres. But for starters, what is your definition of selected --- pulled from the physical table? Accumulated into an aggregate? Delivered as a recognizable row to the client? Delivered N times to the client due to joining N times to some other table? Well, that was not a good example. I don't have any problem marking from now on all my functions as stable/immutable (the ones I use in views), but I still believe it is a source of bad performance to evaluate a function on discarded rows, and at the same time this breaks the principle of least surprise. Regards Gaetano Mendola
Re: [HACKERS] My honours project - databases using dynamically attached entity-properties
On Wednesday 2007-03-14 08:26, Csaba Nagy wrote: On Wed, 2007-03-14 at 16:08, [EMAIL PROTECTED] wrote: On Wed, Mar 14, 2007 at 02:28:03PM +, Gregory Stark wrote: David Fetter [EMAIL PROTECTED] writes: CREATE TABLE symptom ( symptom_id SERIAL PRIMARY KEY, /* See above. */ ... ); CREATE TABLE patient_presents_with ( patient_id INTEGER NOT NULL REFERENCES patient(patient_id), symptom_id INTEGER NOT NULL REFERENCES symptom(symptom_id), UNIQUE(patient_id, symptom_id) ); I'm just glad I don't have your doctor. I hope mine doesn't think symptoms are all boolean values. Where is the boolean above? It is M:N, with each side having whatever data is required. The boolean is assumed in the symptoms table. In any case, even if it's not a boolean value, even if the symptoms table is a complex one on its own, it is still one single type for all symptoms of all patients. The real problem is that in some real-world applications you have a mix of wildly varying types of attributes a user might want to use, and you can't know what those will be beforehand... the symptoms thing is simple to solve in the way David did it, but there really are other situations which a simple M:N can't easily cover. How would you handle a database of user settings for 10K different applications and 100M different users, where each application must be able to store its own (type-safe!) settings in the same structure, and applications come and go with their own settings? Come up with a good solution to this, combined with queries like give me all the users who have this set of settings set to these values running fast, and then you're talking. Cheers, Csaba. At that point traditional relational database design starts to break down. If you need THAT MUCH flexibility it's probably time to look at network databases, knowledge bases, extensible knowledge bases, and ad hoc knowledge bases (OWL, RDF, etc).
Flexibility, friendliness to marketeers or accountants, extensibility: none of these are really the strong points of relational databases. Databases as they exist today do best with finite domains that can be formally organized.
Re: [HACKERS] Additional global stats for pg_database?
Magnus Hagander wrote: Hi! I'd like to add the ability to show global counters of a couple of things currently available in pg_stat_all_tables (for example number of rows added etc) in a global counter for the entire cluster, for monitoring of an entire cluster without having to connect to each individual database. Should I try adding this to the existing pg_stat_database view, or should I look at creating a separate view or two? I'd say add that to each database entry. To get the sum of all databases you'd need some sort of refactoring, because currently when a backend reads the stats file it only gets its own stats. One idea is to create an additional counter in the stats file, stored separately from the database entries. If you add tuples updated/deleted/inserted counters to databases, autovacuum can make use of that too (it's even mentioned in an XXX comment in the code somewhere). -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] CLUSTER and MVCC
Heikki Linnakangas [EMAIL PROTECTED] writes: I'm thinking of keeping an in-memory mapping of old and new tids of updated tuples while clustering, instead. That means that cluster requires a little bit of memory for each RECENTLY_DEAD updated tuple. In the worst case that means that you run out of memory if there's too many of those in the table, but I doubt that's going to be a problem in practice. That is more or less isomorphic to what VACUUM FULL does. While people have complained about VACUUM FULL's memory usage on occasion, just at the moment I feel that the main problem with it is complexity. If we still haven't gotten all the bugs out of VACUUM FULL after more than eight years of work on it, what are the odds that we can make CLUSTER do it right the first time? regards, tom lane
Re: [HACKERS] Backend crash in 8.2.3 with plpgsql function
Tom Lane wrote: SM [EMAIL PROTECTED] writes: I got a backend crash in Postgresql 8.2.3 with the plpgsql function. The following statement in psql causes a signal 11: psql=# create function testpl() returns void as 'begin return; end;' language 'plpgsql'; Worksforme ... For me too. $ psql Welcome to psql 8.2.3, the PostgreSQL interactive terminal. Type: \copyright for distribution terms \h for help with SQL commands \? for help with psql commands \g or terminate with semicolon to execute query \q to quit kalman=# create function testpl() returns void as 'begin return; end;' language 'plpgsql'; CREATE FUNCTION kalman=# select version(); version - PostgreSQL 8.2.3 on i686-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.1 20070105 (Red Hat 4.1.1-51) (1 row) Regards Gaetano Mendola
Re: [HACKERS] UPDATE using sub selects
NikhilS wrote: I have tried some update-subselect variations and they seem to work. For example the case in src/test/regress/sql/update.sql, which used to fail till now, seems to work: UPDATE update_test SET (a,b) = (SELECT a,b FROM update_test WHERE c = 'foo') WHERE a = 10; What's the expected result if the subselect returns more than one tuple? I expect no update at all in the case of an empty result set; is this the case? Regards Gaetano Mendola
Re: [HACKERS] Additional global stats for pg_database?
On Thu, Mar 15, 2007 at 10:25:57AM -0400, Alvaro Herrera wrote: Magnus Hagander wrote: Hi! I'd like to add the ability to show global counters of a couple of things currently available in pg_stat_all_tables (for example number of rows added etc) in a global counter for the entire cluster, for monitoring of an entire cluster without having to connect to each individual database. Should I try adding this to the existing pg_stat_database view, or should I look at creating a separate view or two? I'd say add that to each database entry. To get the sum of all databases you'd need some sort of refactoring, because currently when a backend reads the stats file it only gets its own stats. One idea is to create an additional counter in the stats file, stored separately from the database entries. If you add tuples-updated/deleted/inserted counters to databases, autovacuum can make use of that too (it's even mentioned in an XXX comment in the code somewhere). As you can surely tell, I didn't actually check what code would be needed before I asked ;-) I'll see what/when I can do. //Magnus
Re: [HACKERS] how to add seconds to a TimestampTz
Grzegorz Jaskiewicz wrote: On Mar 15, 2007, at 5:58 AM, Tom Lane wrote: ? I'm not sure what would be the most convenient realization of this at the C level, but please stay away from time_t ... What's wrong with time_t? Does postgres have some sort of time API that can be used instead? Sure. See TimestampTz (as mentioned in the subject). -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] CLUSTER and MVCC
Tom Lane wrote: Heikki Linnakangas [EMAIL PROTECTED] writes: I'm thinking of keeping an in-memory mapping of old and new tids of updated tuples while clustering, instead. That means that cluster requires a little bit of memory for each RECENTLY_DEAD updated tuple. In the worst case that means that you run out of memory if there's too many of those in the table, but I doubt that's going to be a problem in practice. That is more or less isomorphic to what VACUUM FULL does. While people have complained about VACUUM FULL's memory usage on occasion, just at the moment I feel that the main problem with it is complexity. If we still haven't gotten all the bugs out of VACUUM FULL after more than eight years of work on it, what are the odds that we can make CLUSTER do it right the first time? Well, I can't guarantee that there's no bugs. To copy a chain correctly, we need to correctly detect tuples that have a t_ctid pointing to a non-dead tuple (non-dead meaning HeapTupleSatisfiesVacuum(tuple) != DEAD), and tuples that are being pointed to by a non-dead tuple. If we incorrectly detect that a tuple belongs to either of those categories, when in fact it doesn't, we don't corrupt anything, but we waste a little bit of memory memorizing the tuple unnecessarily. To detect tuples in the first category, we need to check that xmax of the tuple isn't invalid, and t_ctid doesn't point to itself. To detect tuples in the second category, we need to check that xmin isn't invalid, and is greater than OldestXmin. With both categories correctly identified, it's just a matter of mapping old ctids to corresponding tids in the new heap. Unlike in my first proposal, if something nevertheless goes wrong in detecting the chains, we only lose the chaining between the tuples, but we don't otherwise lose any data. The latest version of each row is fine anyway. I think this approach is pretty robust, and it fails in a good way. 
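A minimal Python sketch of the chain-mapping idea Heikki describes above (this is an illustration only, not PostgreSQL code; the real implementation works on heap tuples, ctids, and HeapTupleSatisfiesVacuum() in C, and the field names and OLDEST_XMIN below are simplified stand-ins for the real visibility checks):

```python
# Hypothetical sketch: while copying tuples into the new heap, memorize an
# old-tid -> new-tid mapping for tuples that may belong to an update chain,
# then re-link t_ctid pointers in a second pass.

OLDEST_XMIN = 100

def needs_mapping(t):
    # Category 1: xmax is valid and t_ctid points to another tuple.
    points_onward = t["xmax"] is not None and t["t_ctid"] != t["ctid"]
    # Category 2: xmin is valid and newer than OldestXmin, so an older
    # non-dead version may still point at this tuple.
    may_be_target = t["xmin"] is not None and t["xmin"] > OLDEST_XMIN
    return points_onward or may_be_target

def cluster(old_heap):
    new_heap, tid_map = [], {}
    for i, t in enumerate(old_heap):            # copy in (sorted) order
        new_tid = 10 + i                        # tids change in the new heap
        if needs_mapping(t):
            tid_map[t["ctid"]] = new_tid        # memorize old -> new
        new_heap.append(dict(t, ctid=new_tid, t_ctid=new_tid))
    for old, copied in zip(old_heap, new_heap): # second pass: re-link chains
        if old["t_ctid"] != old["ctid"] and old["t_ctid"] in tid_map:
            copied["t_ctid"] = tid_map[old["t_ctid"]]
    return new_heap

old = [
    {"ctid": 0, "xmin": 90,  "xmax": 150,  "t_ctid": 1},  # updated version
    {"ctid": 1, "xmin": 150, "xmax": None, "t_ctid": 1},  # its successor
]
new = cluster(old)
assert new[0]["t_ctid"] == new[1]["ctid"] == 11  # chain survives the rewrite
```

As in the description above, a false positive in needs_mapping() merely wastes a map entry, and a missed chain link loses only the chaining, not the tuple data.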
-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] [PATCHES] Bitmapscan changes
Heikki Linnakangas wrote: Joshua D. Drake wrote: This is what I suggest. Provide a tarball of -head with the patch applied. Here you are: http://community.enterprisedb.com/git/pgsql-git-20070315.tar.gz Provide a couple of use cases that can be run with explanation of how to verify the use cases. There's a number of simple test cases on the web page that I've used (perfunittests). I can try to simplify them and add explanations. I am downloading now. Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/
Re: [HACKERS] My honours project - databases using dynamically attached entity-properties
Josh Berkus wrote: And then what? dynamically construct all your SQL queries? Sure, sounds like a simple solution to me... Not to mention DB security issues. How do you secure your database when your web client has DDL access? So, Edward, the really *interesting* idea would be to come up with a secure, normalized way to do UDFs *without* EAV tables. People would be very impressed. I have a system with many essentially user-defined fields, and was thinking of creating something similar to an Array type and writing some GIST indexes for it. My current workaround is to store them as a YAML document and use tsearch to index it (with application logic to further refine the results) - but an EAV datatype that could be put in tables and effectively indexed would be of quite a bit of interest here. And yes, a better way to do UDFs would be even cooler.
Re: [HACKERS] My honours project - databases using dynamically attached entity-properties
On Thu, Mar 15, 2007 at 08:31:24AM -0700, Ron Mayer wrote: Josh Berkus wrote: And then what? dynamically construct all your SQL queries? Sure, sounds like a simple solution to me... Not to mention DB security issues. How do you secure your database when your web client has DDL access? So, Edward, the really *interesting* idea would be to come up with a secure, normalized way to do UDFs *without* EAV tables. People would be very impressed. I have a system with many essentially user-defined fields, and was thinking of creating something similar to an Array type and writing some GIST indexes for it. Do give hstore a try. It indexes over name/value mappings (the same authors have intarray, indexing over arrays). If it doesn't solve your problem it'll be at least a very good starting point for what you mention above (basically it's a lossy GiST index based on a hash bitmap; I was quite impressed). Regards, tomás
Re: [HACKERS] My honours project - databases using dynamically attached entity-properties
On Mar 15, 2007, at 11:31, Ron Mayer wrote: Josh Berkus wrote: And then what? dynamically construct all your SQL queries? Sure, sounds like a simple solution to me... Not to mention DB security issues. How do you secure your database when your web client has DDL access? So, Edward, the really *interesting* idea would be to come up with a secure, normalized way to do UDFs *without* EAV tables. People would be very impressed. I have a system with many essentially user-defined fields, and was thinking of creating something similar to an Array type and writing some GIST indexes for it. My current workaround is to store them as a YAML document and use tsearch to index it (with application logic to further refine the results) - but an EAV datatype that could be put in tables and effectively indexed would be of quite a bit of interest here. And yes, a better way to do UDFs would be even cooler. Out of all the databases that I have used, postgresql offers the most flexible DDL, mostly for one reason: it can operate within transactions. To handle arbitrary strings as column identifiers, the column names could actually be stripped down to lower-case letters and the real title could be stored in a separate table or as column comments. Mr. Berkus' concern regarding the security implications is already handled by privilege separation or security-definer functions. The OP's concern about the difficulty of querying a schema structure is alleviated via any number of APIs in Perl, JDBC, etc. It seems to me that postgresql is especially well-suited to run DDL at runtime, so what's the issue? -M
[HACKERS] tsearch_core for inclusion
Last time there was a fight about the syntax of the introduced commands, and we (Oleg and I) have developed a variant of the patch with another syntax. We will not change the docs until agreement is reached; the current version is at http://mira.sai.msu.su/~megera/pgsql/ftsdoc/ The following demonstrates a subset of the FTS syntax using the example from http://mira.sai.msu.su/~megera/pgsql/ftsdoc/fts-complete-tut.html. The syntax which was proposed: begin; CREATE FULLTEXT CONFIGURATION public.pg ( LOCALE = 'ru_RU.UTF-8' ) LIKE english WITH MAP; CREATE FULLTEXT DICTIONARY pg_dict ( OPT = 'pg_dict.txt' ) LIKE synonym; CREATE FULLTEXT DICTIONARY en_ispell ( OPT = 'DictFile=/usr/local/share/dicts/ispell/english-utf8.dict, AffFile=/usr/local/share/dicts/ispell/english-utf8.aff, StopFile=/usr/local/share/dicts/ispell/english-utf8.stop' ) LIKE ispell_template; ALTER FULLTEXT DICTIONARY en_stem SET OPT='/usr/local/share/dicts/ispell/english-utf8.stop'; ALTER FULLTEXT MAPPING ON pg FOR lword,lhword,lpart_hword WITH pg_dict,en_ispell,en_stem; DROP FULLTEXT MAPPING ON pg FOR email, url, sfloat, uri, float; end; Patch: http://www.sigaev.ru/misc/tsearch_core-0.38.gz Revising the above FTS syntax as Peter suggested, we removed the unnecessary parentheses and equal signs, and OPT becomes OPTION.
begin; CREATE FULLTEXT CONFIGURATION public.pg LOCALE 'ru_RU.UTF-8' LIKE english WITH MAP; CREATE FULLTEXT DICTIONARY pg_dict OPTION 'pg_dict.txt' LIKE synonym; CREATE FULLTEXT DICTIONARY en_ispell OPTION 'DictFile=/usr/local/share/dicts/ispell/english-utf8.dict, AffFile=/usr/local/share/dicts/ispell/english-utf8.aff, StopFile=/usr/local/share/dicts/ispell/english-utf8.stop' LIKE ispell_template; ALTER FULLTEXT DICTIONARY en_stem SET OPTION '/usr/local/share/dicts/ispell/english-utf8.stop'; ALTER FULLTEXT MAPPING ON pg FOR lword,lhword,lpart_hword WITH pg_dict,en_ispell,en_stem; DROP FULLTEXT MAPPING ON pg FOR email, url, sfloat, uri, float; end; Patch: http://www.sigaev.ru/misc/tsearch_core-0.38.1.gz A comparison of these syntaxes with the current tsearch2 syntax is at http://mira.sai.msu.su/~megera/pgsql/ftsdoc/fts-syntax-compare.html So, which syntax is more attractive? And are there any other objections? -- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/
Re: [HACKERS] My honours project - databases using dynamically attached entity-properties
Ron Mayer [EMAIL PROTECTED] writes: I have a system with many essentially user-defined fields, and was thinking of creating something similar to an Array type and writing some GIST indexes for it. My current workaround is to store them as a YAML document and use tsearch to index it (with application logic to further refine the results) - but an EAV datatype that could be put in tables and effectively indexed would be of quite a bit of interest here. And yes, a better way to do UDFs would be even cooler. I think that's what hstore does. Honestly I'm not entirely certain though. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] My honours project - databases using dynamically attached entity-properties
On Thu, 2007-03-15 at 17:01, A.M. wrote: It seems to me that postgresql is especially well-suited to run DDL at runtime, so what's the issue? The issue is that some applications are not well suited to run DDL at runtime :-) As I already mentioned in another post in this thread, our application also has a requirement of user-defined fields in one table. The problem is that the table is accessed so heavily in peak hours that it is simply impossible to take an exclusive lock on it without causing an extended perceived downtime of the application. And guess what, users will always want to add new fields in peak hours... We did solve this in our case with some application logic, but a generic solution would be nice ;-) Cheers, Csaba.
Re: [HACKERS] My honours project - databases using dynamically attached entity-properties
Gregory Stark wrote: Ron Mayer [EMAIL PROTECTED] writes: I have a system with many essentially user-defined fields, and was thinking of creating something similar to an Array type and writing some GIST indexes for it. My current workaround is to store them as a YAML document and use tsearch to index it (with application logic to further refine the results) - but an EAV datatype that could be put in tables and effectively indexed would be of quite a bit of interest here. And yes, a better way to do UDFs would be even cooler. I think that's what hstore does. Honestly I'm not entirely certain though. Does hstore nest? My impression is that it doesn't. Which might well not matter, of course. cheers andrew
[HACKERS] Proposal: minor SPI interface change
While fooling with getting SPI to use the plancache, I got annoyed again about how SPI_prepare() and related functions refer to SPI plan pointers as void *. I'm all for having the structure be opaque to callers, but to me void * means any old pointer, which this surely is not. I'd like to change things so that spi.h does typedef struct _SPI_plan *SPIPlanPtr; and then SPI_prepare is declared to return SPIPlanPtr not void *, and likewise for the other SPI functions dealing with plans. AFAICS this does not break code that refers to plan pointers as void * because C compilers will allow implicit casts between void * and SPIPlanPtr. However, for code that we feel like updating, the result is more readable and less error-prone. There are people out there who want their code to compile against multiple PG versions. To use the improved notation and still compile against pre-8.3 headers, they could do #if CATALOG_VERSION_NO whatever typedef void *SPIPlanPtr; #endif Comments, objections? regards, tom lane
Re: [HACKERS] [RFC] CLUSTER VERBOSE
On 3/15/07, Grzegorz Jaskiewicz [EMAIL PROTECTED] wrote: I figure I should start a brand new thread for this one, so here you go. I am in need of a verbose CLUSTER, i.e. one that would give me feedback and progress. Because CLUSTER is divided into two major operations (data reordering, index rebuild), I see it this way: CLUSTER on I: index name T: table name, data reordering CLUSTER on I: index name T: table name, index rebuild and then: CLUSTER 10% CLUSTER 12%, etc. Well, I'm afraid that would be inconsistent with other VERBOSE commands (VACUUM VERBOSE), which don't give a progress indication other than that of a specific stage being finished. I think if you want to add VERBOSE to CLUSTER, it should behave exactly like all other 'VERBOSE' commands. And as for progress indication, there have been proposals for more or less similar features, like: http://archives.postgresql.org/pgsql-hackers/2006-07/msg00719.php As I recall, the ideas which caught the most traction were indicating current progress via shared memory (pg_stat_activity) and a GUC variable which instructs the server to send notices indicating the progress status. The latter is harder. I'm afraid creating such a feature 'just for CLUSTER' is not the greatest idea -- there are lots of other places where having a progress bar would be a great benefit. REINDEX, most ALTER TABLEs, CREATE INDEX, even long running SELECTs, UPDATEs and DELETEs, not to mention VACUUM, would equally benefit from it. I think you will have a hard time trying to push a CLUSTER-specific extension when there is a need for a more generic one. Regards, Dawid
[HACKERS] pltcl vs. multilib machines
So I just updated my x86_64 machine to Fedora Core 6, and I find that rebuilding our CVS HEAD gives a batch of noise that was not there before: /usr/bin/ld: skipping incompatible /usr/lib/libtcl8.4.so when searching for -ltcl8.4 /usr/bin/ld: skipping incompatible /usr/lib/libdl.so when searching for -ldl /usr/bin/ld: skipping incompatible /usr/lib/libdl.a when searching for -ldl /usr/bin/ld: skipping incompatible /usr/lib/libpthread.so when searching for -lpthread /usr/bin/ld: skipping incompatible /usr/lib/libpthread.a when searching for -lpthread /usr/bin/ld: skipping incompatible /usr/lib/libieee.a when searching for -lieee /usr/bin/ld: skipping incompatible /usr/lib/libm.so when searching for -lm /usr/bin/ld: skipping incompatible /usr/lib/libm.a when searching for -lm /usr/bin/ld: skipping incompatible /usr/lib/libc.so when searching for -lc /usr/bin/ld: skipping incompatible /usr/lib/libc.a when searching for -lc /usr/bin/ld: skipping incompatible /usr/lib/libc.so when searching for -lc /usr/bin/ld: skipping incompatible /usr/lib/libc.a when searching for -lc It turns out that this is because the link command for pltcl includes -L/usr/lib, so that gets searched before /usr/lib64. And the reason the command includes that is that that's what it says in TCL_LIB_SPEC in /usr/lib/tclConfig.sh. There is also a /usr/lib64/tclConfig.sh which says the right things, but we aren't finding that because this is how we determine where to look for tclConfig.sh: $ echo 'puts $auto_path' | tclsh /usr/share/tcl8.4 /usr/share /usr/lib /usr/lib64 $ Perhaps I should lobby the Red Hat guys to change the order of that result, but really this is more our problem than theirs: whichever way tclsh reports it, it will be wrong for trying to build Postgres with the other word width on a multilib machine. 
The ideal thing would be to try to verify that the found tclConfig.sh is compatible with the compiler switches we want to use, but I can't think of any reasonably robust way to do that (ie, something that's likely to work with non-gcc compilers...). Any ideas? regards, tom lane
Re: [HACKERS] UPDATE using sub selects
Gaetano Mendola [EMAIL PROTECTED] writes: NikhilS wrote: I have tried some update-subselect variations and they seem to work. For example the case in the src/test/regress/sql/update.sql, which used to fail till now, seems to work: UPDATE update_test SET (a,b) = (select a,b FROM update_test where c = 'foo') WHERE a = 10; What's the expected result if the tuple from subselect is more than 1? Error, per SQL99 section 7.14: 1) If the cardinality of a row subquery is greater than 1 (one), then an exception condition is raised: cardinality violation. I expect no update at all in case of void result set, is this the case ? No, you get nulls; it's a subquery not a join. Per SQL99 7.1: c) If the row value constructor is a row subquery, then: i) Let R be the result of the row subquery and let D be the degree of R. ii) If the cardinality of R is 0 (zero), then the result of the row value constructor is D null values. iii) If the cardinality of R is 1 (one), then the result of the row value constructor is R. regards, tom lane
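The SQL99 rules Tom quotes can be sketched in a few lines of Python (a hypothetical helper for illustration, not PostgreSQL code): zero rows yield a row of D nulls, exactly one row yields that row, and more than one raises a cardinality violation.

```python
# Sketch of SQL99 7.1/7.14 semantics for a row subquery used as a row
# value constructor, following the rules quoted above.

def row_subquery_result(rows, degree):
    if len(rows) > 1:
        raise ValueError("cardinality violation")   # SQL99 7.14
    if not rows:
        return (None,) * degree                     # D null values
    return rows[0]                                  # exactly one row: use it

assert row_subquery_result([], 2) == (None, None)           # empty -> nulls
assert row_subquery_result([(1, "foo")], 2) == (1, "foo")   # one row -> row
try:
    row_subquery_result([(1,), (2,)], 1)                    # >1 row -> error
except ValueError as e:
    assert "cardinality" in str(e)
```

So in Gaetano's example, an empty subselect result sets (a,b) to NULLs rather than skipping the update.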
Re: [HACKERS] tsearch_core for inclusion
On Thursday 15 March 2007 12:17, Teodor Sigaev wrote: Last try there was a fight about syntax of introduced commands. And we (Oleg and me) developed variant of patch with another syntax. We will not change docs until agreement will be reached, current version http://mira.sai.msu.su/~megera/pgsql/ftsdoc/ Following demonstrates subset of FTS syntax using example from http://mira.sai.msu.su/~megera/pgsql/ftsdoc/fts-complete-tut.html. This is nice. snip Comparing that syntaxes with current tsearch2 is placed at http://mira.sai.msu.su/~megera/pgsql/ftsdoc/fts-syntax-compare.html So, which is syntax more attractive? Honestly I don't find any of these syntaxes to be head and shoulders above the others, but would probably lean toward the original syntax, since it has some level of familiarity among the existing user base. And is there some another objections? Most people whom I talk to about tsearch who want the syntax changed to make it easier want something akin to just CREATE INDEX fti1 on t1(c1) USING FULLTEXT and then be done with it. This patch isn't going to give people that. I'm also concerned about the stability of the tsearch api in general wrt including it in core. Currently the recommended upgrade practice is to dump/reload without tsearch, installing the new server's version of tsearch instead. IMHO this is not an acceptable solution for core-included features. So is this actually going to be improved in a core tsearch? System tables are not dumped by default, so that seems easier, until you consider that your custom tsearch install will then be lost on upgrade... oops! -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
Re: [HACKERS] tsearch_core for inclusion
And is there some another objections? Most people whom I talk to about tsearch who want the syntax changed to make it easier want something akin to just CREATE INDEX fti1 on t1(c1) USING FULLTEXT and then be done with it. This patch isn't going to give people that. +1 (again) Although I would see something like this: CREATE INDEX fti1 on t1(c1) USING FULLTEXT [WITH] GIST | GIN Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/
Re: [HACKERS] tsearch_core for inclusion
Joshua D. Drake [EMAIL PROTECTED] writes: +1 (again) Although I would see something like this: CREATE INDEX fti1 on t1(c1) USING FULLTEXT [WITH] GIST | GIN Surely the CREATE INDEX syntax has got enough warts on it already. Can't we express this as a particular operator class, or something that doesn't add any new syntax? regards, tom lane
Re: [HACKERS] tsearch_core for inclusion
Tom Lane wrote: Joshua D. Drake [EMAIL PROTECTED] writes: +1 (again) Although I would see something like this: CREATE INDEX fti1 on t1(c1) USING FULLTEXT [WITH] GIST | GIN Surely the CREATE INDEX syntax has got enough warts on it already. Can't we express this as a particular operator class, or something that doesn't add any new syntax? Do you mean something like: CREATE INDEX fti1 ON t1 USING GIST|GIN(c1) WITH (FULLTEXT) The above I got from the idea of FILLFACTOR so you could have something like WITH (FILLFACTOR = 70, FULLTEXT) (not even sure if that is relevant) OR CREATE INDEX ftil ON t1 USING GIST|GIN(C1 FULLTEXT); Where FULLTEXT is like VARCHAR OPS? I could live with that. Sincerely, Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/
Re: [HACKERS] tsearch_core for inclusion
Joshua D. Drake [EMAIL PROTECTED] writes: Tom Lane wrote: Surely the CREATE INDEX syntax has got enough warts on it already. Do you mean something like: CREATE INDEX ftil ON t1 USING GIST|GIN(C1 FULLTEXT); Where FULLTEXT is like VARCHAR OPS? Yeah, that one. It might be more consistent to spell it as fulltext_ops but I wouldn't insist on it. Of course the issue not addressed here is where you specify all the secondary configuration data (the stuff currently handled by config tables in the contrib implementation). Perhaps the WITH clause would work for that, though in the current code WITH is targeted at the index AM not individual opclasses. regards, tom lane
Re: [HACKERS] tsearch_core for inclusion
Tom Lane wrote: Joshua D. Drake [EMAIL PROTECTED] writes: Tom Lane wrote: Surely the CREATE INDEX syntax has got enough warts on it already. Do you mean something like: CREATE INDEX ftil ON t1 USING GIST|GIN(C1 FULLTEXT); Where FULLTEXT is like VARCHAR OPS? Yeah, that one. It might be more consistent to spell it as fulltext_ops but I wouldn't insist on it. *shrug* fulltext_ops is probably more accurate but FULLTEXT is more friendly :). I find you normally can't have both, my vote would probably be consistency. Of course the issue not addressed here is where you specify all the secondary configuration data (the stuff currently handled by config tables in the contrib implementation). Perhaps the WITH clause would work for that, though in the current code WITH is targeted at the index AM not individual opclasses. Not sure what to say here. WITH seems logical and I don't think we want to add yet another keyword but I certainly see your point. Sincerely, Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/
Re: [HACKERS] tsearch_core for inclusion
On Thu, 15 Mar 2007, Robert Treat wrote: On Thursday 15 March 2007 12:17, Teodor Sigaev wrote: Last try there was a fight about syntax of introduced commands. And we (Oleg and me) developed variant of patch with another syntax. We will not change docs until agreement will be reached, current version http://mira.sai.msu.su/~megera/pgsql/ftsdoc/ Following demonstrates subset of FTS syntax using example from http://mira.sai.msu.su/~megera/pgsql/ftsdoc/fts-complete-tut.html. This is nice. snip Comparing that syntaxes with current tsearch2 is placed at http://mira.sai.msu.su/~megera/pgsql/ftsdoc/fts-syntax-compare.html So, which is syntax more attractive? Honestly I don't find any of these syntaxes to be head and shoulders above the others, but would probably lean toward the original syntax, since it has some level of familiarity among the existing user base. And is there some another objections? Most people whom I talk to about tsearch who want the syntax changed to make it easier want something akin to just CREATE INDEX fti1 on t1(c1) USING FULLTEXT and then be done with it. This patch isn't going to give people that. Since we use the standard postgresql-ish CREATE INDEX command, I assume people want to skip creation of the tsvector column? How would they manage complex document indexing, when a document is a combination (with different weights) of many text attributes from several tables, for example? There are several other issues with that approach; for example, we need to store positional information somewhere for ranking. It's awkward to parse the document every time to get this information. I'm also concerned about the stability of the tsearch api in general wrt including it in core. Currently the recommended upgrade practice is to dump/reload without tsearch, installing the new server's version of tsearch instead. IMHO this is not an acceptable solution for core-included features. So is this actually going to be improved in a core tsearch?
System tables are not dumped by default, so that seems easier, until you consider that your custom tsearch install will then be lost on upgrade... oops! This is exactly the reason why we want to include tsearch into core; it was discussed several times. Regards, Oleg _ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83
Re: [HACKERS] tsearch_core for inclusion
On Thu, 15 Mar 2007, Joshua D. Drake wrote: And is there some another objections? Most people whom I talk to about tsearch who want the syntax changed to make it easier want something akin to just CREATE INDEX fti1 on t1(c1) USING FULLTEXT and then be done with it. This patch isn't going to give people that. +1 (again) Although I would see something like this: CREATE INDEX fti1 on t1(c1) USING FULLTEXT [WITH] GIST | GIN Yet another syntax addition :) What's wrong with the standard CREATE INDEX ? Regards, Oleg _ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83
Re: [HACKERS] SoC ECPG Enhancements
Peter, The Google Summer of Code web page lists as possible project ECPG Enhancments: Enable ECPG to generate calls directly for libpq rather than calls to its own libraries. What would be the point of that? More importantly, Michael hasn't volunteered to mentor this year, so I think we should probably take the idea down. -- Josh Berkus PostgreSQL @ Sun San Francisco
Re: [HACKERS] My honours project - databases using dynamically attached entity-properties
On Thu, Mar 15, 2007 at 12:49:10PM -0400, Andrew Dunstan wrote: Gregory Stark wrote: Ron Mayer [EMAIL PROTECTED] writes: I have a system with many essentially user-defined fields, and was thinking of creating something similar to an Array type and writing some GIST indexes for it. [...] I think that's what hstore does. Honestly I'm not entirely certain though. Does hstore nest? My impression is that it doesn't. Which might well not matter, of course. If what you mean is to have mappings of mappings, then no. Hstore implements a data type for a (finite) mapping (a set of key/value pairs, think hash for perl folks), with operations like H1 contains H2 (in the sense that all key/value pairs in H2 are also in H1) supported by an index. Keys and values are strings. But I don't see why it couldn't be extended to more structured data types (one of the good things of hstore at the moment is its surprising simplicity, because it provides a good example of what GiST is good for, so I'd understand if the authors didn't want to add that much hair without need). In my view it's one of the cutest applications of GiST. Regards, tomás
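The containment semantics described above (H1 contains H2 iff every key/value pair of H2 is also present in H1) can be sketched with plain Python dicts; this is an illustration of the operation only, since hstore itself is a C extension whose containment checks are accelerated by a GiST index rather than a linear scan:

```python
# Dict-based sketch of hstore-style containment: h1 "contains" h2 when
# every key/value pair of h2 is also present in h1. Keys and values are
# strings, and the mapping does not nest.

def contains(h1, h2):
    return all(k in h1 and h1[k] == v for k, v in h2.items())

doc = {"color": "red", "size": "L", "brand": "acme"}   # hypothetical data
assert contains(doc, {"color": "red"})                 # pair present
assert not contains(doc, {"color": "blue"})            # key present, wrong value
assert not contains(doc, {"weight": "1kg"})            # key absent
```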
Re: [HACKERS] tsearch_core for inclusion
Most people whom I talk to about tsearch who want the syntax changed to make it easier want something akin to just CREATE INDEX fti1 on t1(c1) USING FULLTEXT and then be done with it. This patch isn't going to give people that. Since we use the standard postgresql-ish CREATE INDEX command, I assume people want to skip creation of the tsvector column? That would be great. How would they manage complex document indexing, when a document is a combination (with different weights) of many text attributes from several tables, for example? Just to give you some more work, could I have both, please? Seriously, if the current powerful functionality could be combined with a dead simple solution for new users and those who don't need it, that would be very good. This could be an auto-generated hidden column or something, as long as the user doesn't need to see or care about it in the simple case. /Magnus