Re: [GENERAL] select where in and order
Tony Smith wrote on Thursday, March 09, 2006 6:33 PM To: pgsql-general@postgresql.org Subject: [GENERAL] select where in and order I have two tables action and group: action id, name group: action_id rank I what to select from action table by order by the rank in the group table. If I use select * from action where id in (select action_id from group order by rank) Try something like: select a.*, g.action_id FROM action a INNER JOIN group g ON a.id = g.action_id ORDER BY g.action_id ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] Looking for a fix to index bloat
[EMAIL PROTECTED] wrote onTuesday, February 28, 2006 7:13 AM We are suffering from the same issue that is described in this email thread http://archives.postgresql.org/pgsql-general/2005-07/msg00486.php. I don't know if this is the appropriate place to make this request, so if not, please forgive me. However, in our particular case, we don't have enough disk space nor money to allow the indexes to grow to a steady state. This isn't what you asked for, but if you can afford a modest amount of downtime you could always drop/recreate the indexes. -Roger
Re: [GENERAL] Database Comparison tool?
I have a script I've been using that does a db comparison, and it works very well. In order to ensure things are in the right order, I have to ... - query for table and view names (FROM pg_tables WHERE schemaname = 'public' ...), with an ORDER BY clause, natch. - create a batch command file with one line for each table and view. This command is a pg_dump of the schema, which is appended to an output file I also output function definitions, as well as the actual data (not just the schema) of some tables that basically have static or lookup data. There's a little more to it to suit my particular needs, but the general approach works well. Oh, one other thing I sometimes need to do is to delete lines with db owner if the two db's have different owners. -Roger -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Rick Gigger Sent: Thursday, February 09, 2006 11:09 PM To: Philippe Ferreira Cc: Nicholas Walker; pgsql-general@postgresql.org Subject: Re: [GENERAL] Database Comparison tool? Is the ordering guaranteed to be the same on both boxes if you do this? Rick On Feb 9, 2006, at 1:03 PM, Philippe Ferreira wrote: Are there any tools that can compare a database schema, and produce sql of the changes from one version to the next. We have a development server, and it would be great to be able to just run a tool, where we could produce the changes, review it, and then commit to production. Hi, Do a pgdump of both databases, and use the diff tool to compare the two generated files ! (But I hope your databases are not too big...) Philippe Ferreira. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] xml_valid function
John Gray wrote on Friday, January 27, 2006 12:24 PM On Wed, 25 Jan 2006 17:11:04 -0800, George Pavlov wrote: Not sure what the correct forum for pgxml/xml2 questions is. I was wondering what is the definition of valid that the xml_valid(text) function that is part of that module uses? It seems different from the W3C definition of valid XML (is there an implicit DTD?) Maybe it is more akin to well-formed? It is indeed well-formed. That just seemed a long name for the function! John Valid means it's been checked against, and conforms to, a DTD. If it hasn't been then it can't be said to be valid. -Roger ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] Postgres 8.1.2, Java, JDO, and case sensitivity woes
testdb=# CREATE TABLE foo (field_one int4); CREATE TABLE testdb=# INSERT INTO foo VALUES (1); INSERT 0 1 testdb=# INSERT INTO foo VALUES (2); INSERT 0 1 testdb=# SELECT * FROM foo; field_one --- 1 2 (2 rows) testdb=# SELECT * FROM FOO; field_one --- 1 2 (2 rows) -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Matthew Hixson Sent: Friday, January 27, 2006 3:53 PM To: Postgres General Subject: [GENERAL] Postgres 8.1.2, Java, JDO, and case sensitivity woes I'm trying to get a large Java application which makes use of an Oracle JDO layer to work with Postgres. Set aside for a moment the discussion of whether or not that is going to work. What I have found is that different parts of this application are referring to a table in all uppercase and in other parts referring to the table all in lowercase. Is there a way to configure Postgres so that it does not treat FOO and foo as two different tables? Thanks, -M@ ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] Performance large tables.
Benjamin Arai wrote on Saturday, December 10, 2005 3:37 PM ... On the other hand there is a weekly update (This is the problem) that updates all of the modified records for a bunch of finacial data such as closes and etc. For the most part they are records of the type name,date,value. The update currently takes almost two days. The update does deletions, insertion, and updates depending on what has happened from the previous week. For the most part the updates are simple one liners. I currently commit in large batch to increase performance but it still takes a while as stated above. From evaluating the computers performance during an update, the system is thrashing both memory and disk. I experimented with batch size and found that large batches (thousands or tens of thousands) slowed things down in our situation, while using a batch size of around 100 or so sped things up tremendously. Of course, YMMV ... -Roger ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] Reduce NUMERIC size by 2 bytes, reduce max length to 508 digits
Bruce Momjian Sent: Friday, December 02, 2005 9:39 AM There is a patch under consideration for 8.2 that would reduce the storage requirement for numeric values by two bytes, but also reduce the range of allowed numeric values to 508 digits. ... Is that an acceptable tradeoff (reduced size, reduced range) for our users? I would be in favor of this change. What's the plan for anyone who may be currently using 508 digits (if there is anyone!)? -Roger ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Aggregates, group, and order by
On Monday, November 07, 2005 12:12 AM Michael Glaesemann wrote: select bar_id, array_accum(foo_value) from ordered_foo group by bar_id order by bar_id; bar_id | array_accum +- 1 | {delta,alpha,charlie,bravo} 2 | {C,B,A,D} The result I'd like to see is bar_id | array_accum +- 1 | {alpha,bravo,charlie,delta} 2 | {A,B,C,D} select bar_id, array_accum(foo_value) from (SELECT * FROM ordered_foo ORDER BY foo_pos) foo group by bar_id order by bar_id; bar_id,array_accum 1,{alpha,bravo,charlie,delta} 2,{A,B,C,D} ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] PSQL suggested enhancement
-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Michael Glaesemann Sent: Wednesday, October 19, 2005 11:24 PM On Oct 20, 2005, at 14:50 , Dean Gibson (DB Administrator) wrote: PSQL has the option to output the result of queries in several different formats, including HTML. Suggestion: have an option to output query results in XML format. My personal opinion on this is that there are a lot of different ideas about how the XML should/could be written, and a the current output can be piped to a script in insert favorite scripting language here to format to match anyone's wish. Rather than have psql decree the one and true format or include so many as to make maintainability burdensome, further formatting is best left to the end user. By that reasoning there shouldn't be html format either. The beauty of xml output is that it can easily be processed via xslt to create whatever format the user wants. So there need not be one true format but simply one root format that could be easily transformed. -Roger Michael Glaesemann ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] PSQL suggested enhancement
On Thursday, October 20, 2005 1:01 PM, Martijn van Oosterhout wrote: On Thu, Oct 20, 2005 at 09:28:25AM -0700, Dean Gibson (DB Administrator) wrote: I just find it surprising that XML is not one of the formats provided, considering that XML is considered a data interchange format (much more than HTML, which is a representation format). All jokes aside, saying output to XML is like saying all our documentation will use words from the english dictionary. Yes, you made a constraint but until you decide grammer, syntax and style, you havn't decided anything yet. Realistically, psql is only ever going to support one XML format, we can't add more styles every time someone asks for one. Hence, we should pick the one that is going to be most easily loaded into other programs (the entire point of XML, right?). To that end, we should probably aim for something like the OpenDocument table format, which looks a bit like below. At least that way you stand a chance of being able to import it and/or display it. If pg outputs a simple xml format, it can easily be transformed via xslt into OpenDoc table format, alternate html formats, or the alternate xml format of your choice. I would argue against outputting this one specific OpenDoc format, even though it is the flavor of the month right now. -Roger ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] Dump all except some tables?
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Jim C. Nasby Sent: Thursday, October 06, 2005 3:34 PM Subject: Re: [GENERAL] Dump all except some tables? ... I find myself wondering if it would be good to allow for specifying a set of rules for what to dump in a file, probably something like a set of regexes with a way to specify if it's an include or exclude rule. Seems like it would be a heck of a lot simpler to do that for complex cases than deal with a pile of spaghetti on the command-line It may be useful to cut down on command line clutter if one could specify a file holding a list of table names to include/exclude. -Roger Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] Migration from MS SQL 2K
PGAdmin II had a "Migration Wizard" plugin that did a great job. I used it a lot to convert MS SQL 7 and 2000 db's to Postgres 7.x. It did not convert views, but I wrote some code that did that. Unfortunately, PGAdmin II doesn't work with PG 8. And PGAdmin III doesn't have the Migration Wizard plugin, at least it didn't the last I checked. So ... you could use PGAdmin II and the Migration Wizard to convert to Postgres 7.x. Then dump the db, and import into PG 8. Or course, there may be more sophisticated (and possible $$) software that will do what you want more simply. -Roger -Original Message-From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]On Behalf Of TheNice SpiderSent: Wednesday, September 28, 2005 10:48 PMTo: pgsql-general@postgresql.orgSubject: [GENERAL] Migration from MS SQL 2K Hi, Currently I have serious problem to migrate a "production" MS SQL 2K to Postgres 8 for Windows. Is there any tools to import MS SQL 2K to Postgres 8 including: - PK and FK - View, Trigger, Store Procedure - Username - Index and Constraint - Diagrams Regards. Yahoo! for GoodClick here to donate to the Hurricane Katrina relief effort.
Re: [GENERAL] Setting WHERE on a VIEW with aggregate function.
I have a view to generate a list of instructors and a count of their future classes. instructors is a link table between class and person. CREATE VIEW future_instructor_counts AS SELECT person.id AS person_id, first_name, last_name, count(instructors.class) AS class_count FROM class, instructors, person WHERE class.id= instructors.class AND person.id = instructors.person AND class_time now() GROUP BY person_id, first_name, last_name; The trick is to do the data aggregation separately, then JOIN in whatever other fields you want. Something like this: CREATE VIEW future_instructor_counts AS SELECT * FROM (SELECT person.id AS person_id, first_name, last_name) personinfo INNER JOIN (SELECT class.id FROM class WHERE class_time now() ) classes INNER JOIN (SELECT id, count(class) AS class_count FROM instructors GROUP BY id) classcount ON personinfo.person_id = instructors.id AND classes.id = instructors.id In many cases when using aggregate functions you get just the fields you need from the agg function (typically an id plus the aggregate result) and JOIN with other tables (or even the same table) to get other info such as first_name, last_name, etc. Otherwise, if you GROUP BY additional fields so you can get them in the output, you may be making the db do additional work. 1) With an aggregate function in the query, is there any way to remove the AND class_time now() so that timestamp can be passed in the select? That is, I'd like to be able to do this? select * from instructor_counts where class_time now(); But class_time is not part of the VIEW so that's not valid. No problem, just make it a part of the view. See the classes section below. CREATE VIEW future_instructor_counts AS SELECT * FROM (SELECT person.id AS person_id, first_name, last_name) personinfo INNER JOIN -- Add class_time field! (SELECT class.id, class_time FROM class WHERE class_time now() ) classes INNER JOIN (SELECT id, count(class) AS class_count FROM instructors GROUP BY id) classcount ON personinfo.person_id = instructors.id AND classes.id = instructors.id [Disclaimer: I've not tested this code at all. It could help if you sent table definitions and maybe even dummy data via insert commands.] And if it was included then I don't have an aggregate function any more - no more grouping. If you do the agg function separately like this that isn't an issue. You join tables to get whatever fields you'd like to have in your output. 2) I think I'm missing something obvious. I know that I need to specify all my non-aggregate columns in the GROUP BY, but I don't under stand why. Really, the results are just grouped only by person.id so why the need to specify the other columns. And if you don't specify all the columns then Postgresql reports: ERROR: column person.id must appear in the GROUP BY clause or be used in an aggregate function Is there a reason Postgresql doesn't just add the column automatically? It does in other cases (like a missing table in a join). As I mention above, if you GROUP BY additional fields just to get them in the output, you may be making the db do additional work. I seem to remember that in a later SQL standard (ie, after SQL-99 but I could be wrong) I believe it allows you to specify additional fields in SELECT that are not in the GROUP BY clause. But PG isn't there yet. -Roger -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Bill Moseley Sent: Friday, September 16, 2005 11:30 AM To: pgsql-general@postgresql.org Subject: [GENERAL] Setting WHERE on a VIEW with aggregate function. I have a view to generate a list of instructors and a count of their future classes. instructors is a link table between class and person. CREATE VIEW future_instructor_counts AS SELECT person.id AS person_id, first_name, last_name, count(instructors.class) AS class_count FROM class, instructors, person WHERE class.id= instructors.class AND person.id = instructors.person AND class_time now() GROUP BY person_id, first_name, last_name; I have two very basic SQL questions: 1) With an aggregate function in the query, is there any way to remove the AND class_time now() so that timestamp can be passed in the select? That is, I'd like to be able to do this? select * from instructor_counts where class_time now(); But class_time is not part
Re: [GENERAL] Setting WHERE on a VIEW with aggregate function.
I have a view to generate a list of instructors and a count of their future classes. select * from instructor_counts where class_time now(); But class_time is not part of the VIEW so that's not valid. No problem, just make it a part of the view. No problem, I said ... famous last words. select * from instructor_counts order by class_count desc; -- Returns: person_id | first_name | class_count ---++- 1 | Joe| 2 2 | Mary | 1 3 | Bob| 1 4 | Cindy | 1 (4 rows) My GOAL above is to be able to add a WHERE class_time $some_time. Armed with your table ddl and sample data I can see the problem more clearly. Unfortunately class_time cannot be a column in the view output. For example, look at the Joe line above ... if he teaches two classes which class_time would it show? Since class_time can't be a column in the view output it can't be used in a WHERE clause. So it would appear to me that you won't able to meet your goal by simply using a view. However, there are other solutions you can use which may or may not be appropriate. I can think of three at the moment. #1: Function Solution: To use functions you may first need to run this at command line: createlang plpgsql electric Then create a function that you can pass in a date to: CREATE FUNCTION getclasscount(timestamp) RETURNS refcursor AS ' DECLARE curs refcursor; BEGIN OPEN curs FOR SELECT * FROM (SELECT person, count(class) AS class_count FROM instructors INNER JOIN class ON class.id = instructors.person WHERE class.class_time $1 GROUP BY person ) classcount INNER JOIN (SELECT person.id AS person_id, first_name FROM person ) personinfo ON personinfo.person_id = classcount.person RETURN curs; END; ' LANGUAGE 'plpgsql'; This would be the best solution if you are in control of the application source code. In Java, for example, it's relatively simple to call this function and return the result as a result set. If you're working in Java I'd be glad to show you same sample code. If you really do need a view for some reason, then this wouldn't work. #2: Simplify the Select Criteria Solution: = A slightly less flexible approach, but one that may be workable, would be to add two boolean columns (with default value of '0') to the class table: completed and current. Then once a semester you run a simple query that updates them. Something like: UPDATE class SET current = '1' where class_time = '2005-09-01'; UPDATE class SET completed = '1' where class_time '2005-09-01'; Then the view would be: CREATE VIEW vclasscount AS SELECT * FROM (SELECT person, count(class) AS class_count FROM instructors INNER JOIN (SELECT id FROM class WHERE class.completed = '0' AND class.current = '0') futureclasses ON futureclasses.id = instructors.class GROUP BY person ) classcount INNER JOIN (SELECT person.id AS person_id, first_name FROM person ) personinfo ON personinfo.person_id = classcount.person This would be a nice simple solution, but may not be flexible enough in that you can't specify an ad-hoc date or date-range ... you would only see future class count. 3: Use a Temp Table === Again, if you have control of application logic, you could: 1) SELECT * INTO futureclasses FROM class where class_time ? 2) Then make the view against futureclasses rather than classes. Good luck! -Roger Bill Moseley ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] Problem using NULLIF in a CASE expression
Try this: SELECT CASE WHEN btrim(' A string', ' ') = '' OR field-name IS NULL THEN NULL ELSE 6 END AS type_id; -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Bruno BAGUETTE Sent: Friday, September 09, 2005 4:45 AM To: pgsql-general@postgresql.org Subject: [GENERAL] Problem using NULLIF in a CASE expression Hello, I have a problem using NULLIF inside a CASE expression. I explain the goal of my query : I have a table where there is two fields : one varchar field and one INT8 field. When the varchar field is empty I have to display the INT8 field. If the varchar field is not empty I have to display a NULL value (even if the INT8 field contains something). BUT : that table is used by several applications and theses applications does not always store NULL in the varchar field. Some applications store an empty string '', or stores space char(s). SO, I have to do a btrim and to get a NULL value when I just have '' after the btrim. I wrote another (and quite shorter!) SQL query to resume the problem : SELECT CASE NULLIF(btrim(' A string', ' '), '') WHEN NOT NULL THEN NULL ELSE 6 END AS type_id; ERROR: operator does not exist: text = boolean HINT: No operator matches the given name and argument type(s). You may need to add explicit type casts. Why this query does not accept the NULLIF ? What can I do to solve that problem ? Thanks in advance ! Bruno BAGUETTE. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] [OT?] ETL tools
Jose Gonzalez wrote: The situation is a bit chaotic as they're using a lot of local Access databases, some databases hosted in an old version of Microsoft SQL Server and a lot of data in other non relational files (SPSS, Excel, ...). I was hoping to impose a bit of order and I started installing a current version of PostgreSQL to host all the databases they're using. ... Maybe I could try another approach? Personally, I would write code (Java or whatever) to do the work. There will almost certainly be cases where you need to do special data massaging, or special rules for special cases, and that will be a lot easier to do when you are in complete control of what happens. I would be afraid that an ETL tool ... 1. Would have a tedious learning curve. 2. You would discover (after x hours) that it doesn't do something you absolutely need to be able to do Then again, I haven't used any ETL tools (well, not for a long, long time), unless you count PGAdmin [http://www.pgadmin.org/] The PGAdmin-II app had an excellent MS SQL Server - Postgres data conversion plug-in. I used it many, many times with zero problems, with both SQL Server 7 and 2000. Unfortunately, the last I checked the current PGAdmin-III app doesn't seem to have or support this plugin, and the PGAdmin-II app doesn't work with Postgres 8 iirc. If, somehow, this converter was available again you could give it a shot, but I don't think it supported much more than straight table copy type stuff. The problem with writing the code is that you'll need to do it from a platform that can access all the data sources. I've used Java for these types of tasks. Postgres, of course, has a JDBC driver, so there's no problem there. MS SQL Server 2000 has a Microsoft JDBC driver, but I have used one that I bought that works with SQL Server 7, which did not come with a Microsoft JDBC driver. So if you're using a pre-2000 version of SQL Server you will need to hunt up a JDBC driver. (Actually, the ODBC-JDBC bridge exists ... that is not recommended for any kind of real world use, but maybe would work for a one-time pull.) I've successfully accessed Excel data from Java using the free Java Excel API [http://www.andykhan.com/jexcelapi/index.html] I've used the sun sun.jdbc.odbc.JdbcOdbcDriver driver to access MS Access files. Again, this uses the bridge, but for just reading data would probably prove adequate. Good luck! -Roger Jose ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] plpgsql: returning multiple named columns from function *simply*
John Lawler wrote: In MSSQL, I can write a stored procedure that does something like this: CREATE PROCEDURE test( @lookup char(50)) WITH ENCRYPTION AS BEGIN -- ... a bunch of code to do some lookup, and then ... SELECT @Result1 AS Result1, @Result2 AS Result2, @Result3 AS Result3, @Result4 AS Result4 Not sure if this a function like this does everything you may want, but it may work for you. Disclaimer: I have not actually used the proc with the As for the column names, but I'd expect it should work. CREATE FUNCTION getlogdata(int4, int4, int4, int4) RETURNS refcursor AS ' DECLARE curs refcursor; BEGIN OPEN curs FOR SELECT logdata.datavalue As Result1,logdata.timestamp As Result2 from logdata where logfielddatatype = $1 and graphtargetlog = $2 and (timestamp = $3 OR $3 = 0) and (timestamp = $4 OR $4 = 0) order by timestamp; RETURN curs; END; ' LANGUAGE 'plpgsql'; ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[GENERAL] Upgrade OS from Redhat AS3 (2.4 kernel) to AS4 (2.6) - Advice on keeping PG happy?
Due to performance issues (which I wrote about on pg-performance list) we will be upgrading our Redhat AS3.0 box (kernel 2.4) to Redhat AS4 (kernel 2.6) in order to take advantage of the better I/O that the 2.6 kernel offers. (I say upgrading but actually we'll wipe the box and start fresh.) We are running Postgres 8.0.3. I had assumed we would have to backup/restore the entire db, but I got to wondering if it would be possible to simply point the new Postgres install to the existing db (the data itself is on a separate drive array) and be back up and running without the time to do the restore. Is this a relatively simple matter? Or is it too hazardous and tricky to bother with? If it's possible I'd appreciate a list of what I'd need to save off the old drive, where the os and Postgres program itself lived. Also: Anything to watch out for with Postgres on 2.6 kernel as opposed to 2.4? Any settings that should be different? We've been advised to try "elevator=deadline". We write pretty much 24 hours a day, and have I/O issues! -Roger