[GENERAL] Group by clause creating "ERROR: wrong record type supplied in RETURN NEXT" (version 8.1.11 -- grr...)
Hi all,

If I try to execute a dynamic query inside a function with a group by statement, returning a setof, I get a weird error. It may be due to the antiquated database version, but I would appreciate all the info I can get (I tried looking in the PG bug tracker, but ... hahaha). If it is as simple as upgrading, I will lean on the admin (I don't control the box, or this wouldn't be an issue). I will try to duplicate on a new machine later this week.

First the pass-through function (takes a sql statement, tries to clean it, executes it):

create or replace function mkn.query_table_data (selectstring_p text)
returns setof record as $_$
DECLARE
    outputrec_v record;
    nasty_strings_re_v text;
    rowcnt int := 0;
BEGIN
    -- build regex from table of nasty strings
    nasty_strings_re_v := (select array_to_string(array_accum(badword), '|')
                             from mkn.badwords);
    raise debug '%', nasty_strings_re_v;
    if (selectstring_p ~* nasty_strings_re_v) then
        -- bad stuff in query
        raise exception 'Disallowed strings in query';
    else
        -- get the records and return them
        for outputrec_v in execute selectstring_p loop
            rowcnt := rowcnt + 1;
            return next outputrec_v;
        end loop;
        -- if no rows selected raise an exception (catch later)
        if rowcnt <= 0 then
            raise exception 'Zero rows returned';
        end if;
        insert into mkn.custom_queries_log (query, output_rows, error_code, error_msg)
        values (selectstring_p, rowcnt, NULL, NULL);
    end if;
END;
$_$ LANGUAGE plpgsql;

-- Now a query that works OK being passed through this function
select * from mkn.query_table_data ('select p087001 as pop from datatable_00041 order by pop desc limit 10') as FOO (pop integer);

 pop
------
 3583
 3555
 3417
 3410
 3352
 3133
 3122
 3013
 2957
 2941
(10 rows)

-- Now a query that fails being passed through the function
select * from mkn.query_table_data ('select sum(p087001) as pop from datatable_00040 group by substr(geo_id, 13, 6) order by pop desc limit 10') as FOO (pop integer);

ERROR:  wrong record type supplied in RETURN NEXT
CONTEXT:  PL/pgSQL function "query_table_data" line 15 at return next

-- Now, what happens if I run the failing inside query directly from psql
select sum(p087001) as pop from datatable_00041 group by substr(geo_id, 13, 6) order by pop desc limit 10;

 pop
------
 7498
 7181
 7130
 7094
 6879
 6839
 6677
 6662
 6632
 6567
(10 rows)

-- Now, the version:
select version();
                                   version
-----------------------------------------------------------------------------
 PostgreSQL 8.1.11 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20070626 (Red Hat 4.1.2-14)
(1 row)

-- thanks to everyone for their help, yet again!

-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
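A likely explanation (hedged, since I can't test against an 8.1 box): sum() over an integer column returns bigint, so the grouped query produces a bigint column that no longer matches the (pop integer) column definition list; the first, non-aggregated query returned the integer column directly, which is why it worked, and the direct psql run works because no column definition list is involved. If that is the cause, either cast inside the passed-in query or declare the output column as bigint:

```sql
-- cast the aggregate back to integer so the row type matches FOO (pop integer)
select * from mkn.query_table_data (
  'select sum(p087001)::integer as pop
     from datatable_00040
    group by substr(geo_id, 13, 6)
    order by pop desc limit 10') as FOO (pop integer);

-- or, equivalently, declare the output column to match what sum() returns
select * from mkn.query_table_data (
  'select sum(p087001) as pop
     from datatable_00040
    group by substr(geo_id, 13, 6)
    order by pop desc limit 10') as FOO (pop bigint);
```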
[GENERAL] "INNER JOIN .... USING " in an UPDATE
Hi all,

(Version 3.5.5) I have tried to figure this out, but the docs, google, and all my imagination fail me. I want to use a join clause with a "using list" in an update statement. The following works, but it uses the WHERE version of a join:

update new_pivoted_table a set "2008-11-10" = b.data_stuff
  from test_pivot b
 where a.id=b.id and a.id2=b.id2 and date_ex='2008-11-10';
UPDATE 3

The following doesn't work, to my chagrin:

wsprague=# update new_pivoted_table a set "2008-11-10" = b.data_stuff from test_pivot b join a using (id, id2) where date_ex='2008-11-10';
ERROR:  relation "a" does not exist

Neither does this:

wsprague=# update new_pivoted_table set "2008-11-10" = b.data_stuff from test_pivot b join new_pivoted_table using (id, id2) where date_ex='2008-11-10';
ERROR:  table name "new_pivoted_table" specified more than once

The following gives the wrong answer (updates five rows, all to 2 -- wrong):

update new_pivoted_table set "2008-11-10" = b.data_stuff from test_pivot b join new_pivoted_table c using (id, id2) where date_ex='2008-11-10';

I would REALLY LIKE to use the using clause, because it works great in autogenerated code for EXECUTE statements, if at all possible. I also would like to avoid iterating.

Here is "test_pivot":

 id | id2 | date_ex    | data_stuff
----+-----+------------+------------
  1 | one | 2008-10-10 |          1
  1 | one | 2008-11-10 |          2
  2 | one | 2008-11-10 |          3
  1 | one | 2008-12-10 |          4
  2 | one | 2008-12-10 |          5
  3 | one | 2008-12-10 |          6
  1 | two | 2008-11-10 |          7
  1 | two | 2008-11-10 |          8
  2 | two | 2008-10-10 |          9

Here is new_pivoted_table:

 id | id2 | 2008-10-10 | 2008-11-10 | 2008-12-10
----+-----+------------+------------+------------
  1 | one |            |            |
  1 | two |            |            |
  2 | one |            |            |
  2 | two |            |            |
  3 | one |            |            |

Tx!
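The underlying restriction is that the UPDATE target table cannot be referenced inside a JOIN in the FROM list; it gets its own implicit range table entry. A sketch of a workaround that keeps an autogeneration-friendly shape (untested against the poster's schema): push the join-key list into a subquery in FROM, then tie the subquery back to the target with a row comparison built from the same key list:

```sql
-- sketch: the (id, id2) list can be spliced into both the subquery
-- and the row comparison by the code that autogenerates the EXECUTE
update new_pivoted_table t
   set "2008-11-10" = sub.data_stuff
  from (select id, id2, data_stuff
          from test_pivot
         where date_ex = '2008-11-10') sub
 where (t.id, t.id2) = (sub.id, sub.id2);
```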
[GENERAL] Using refcursors in application code (php, python, whatever...)
Hi all,

Does anyone have any advice on using application code with a refcursor? This is a follow-up to my "is there a safe-ish way to execute arbitrary sql"? Now that I have a way to execute my arbitrary sql, I need to generate an html table with the arbitrary result inside some PHP (or whatever), so I need to loop over the results of the cursor (or do some weird return setof magic which I just figured out). Sorry to ask here, but I think it is more likely that the readers here will even know what a refcursor is than on a massive php email list.

Here is what I have working in psql:

select query_table_data('select * from mkn_data.datatable_00013', 'boobear');
 query_table_data
------------------
 boobear
(1 row)

mkn=# FETCH FORWARD ALL FROM boobear;
 col1  | col2
-------+------
 bob   |   34
 alice |   32
(2 rows)

Now I need something that says $curs.fetchall()
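The key gotcha for application code is that a refcursor only lives until the transaction commits, so the function call and the FETCH must run inside one explicit transaction. A sketch of the whole exchange as the driver would issue it (names from the post):

```sql
-- everything in one transaction, because the cursor vanishes at commit
begin;
select query_table_data('select * from mkn_data.datatable_00013', 'boobear');
fetch forward all from boobear;  -- from PHP/Python, run this FETCH as an
                                 -- ordinary query and loop over its rows
commit;
```

In psycopg2, for example, a named cursor created with the refcursor's name (conn.cursor('boobear')) can reportedly fetchall() from it directly; failing that, just issuing the FETCH as plain SQL, as above, works from any driver.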
Re: [GENERAL] Read only access, via functions only
> you can set transactions read only but the user can turn them off. Use
> views, functions and GRANT.

Views was the key word. I had tried to do it with functions and GRANT alone. Thanks to the collective brain that is a listserv.

-W
[GENERAL] Read only access, via functions only
Hi all,

Is there a way to set up a schema such that a certain role has (1) read only access to (2) all the tables, but (3) must use predefined functions to use that access? Items 1 and 2 are so that the end user doesn't stomp on the data. I want item 3 in order to force the application programmers (PHP) not to craft their own select statements but to go through the API.

Thoughts? I have about given up, but maybe somebody can make it easy for me. Running version 8.2.something (stupid Linux Distros). (I can post code if necessary, I am just feeling a little lazy...)

Thanks W
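One common pattern for all three requirements (a sketch only; the schema, table, and role names are hypothetical): revoke direct table access from the web role entirely, then expose SECURITY DEFINER functions owned by a role that does have SELECT, so the API functions are the only path to the data:

```sql
-- hypothetical names: schema "app", table "app.customers", role "webuser"
revoke all on schema app from webuser;
grant usage on schema app to webuser;
revoke all on app.customers from webuser;

-- runs with the function owner's rights, not the caller's
create or replace function app.get_customer(p_id int)
returns setof app.customers as $$
    select * from app.customers where id = p_id;
$$ language sql stable security definer;

grant execute on function app.get_customer(int) to webuser;
```

With this, webuser can neither SELECT nor modify the tables directly, but can call the granted functions, which are read-only by construction.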
Re: [GENERAL] Semi-customized queries? Subset of SQL? Accessing the parser? Injection?
> Or do it with simple combo boxes if you
> want to limit the users to crippled queries.)

I want to limit my users to *half* crippled queries -- arbitrary column lists, where clauses, group by lists, and sort by lists. I want to make sure that they aren't doing any data modifications nested inside a where clause or a column definition as a subquery.

> I don't see anything that suggests hacking the SQL parser
> is going to be a useful thing to do.

I would think that I could *use* (definitely not hack -- good god!) the parser to ask how deep the nested subqueries are, etc.

> I'm guessing that roles, constraints, resource limits and possibly
> a sacrificial replicated database will provide the answer to your
> actual problem, but we'd need to know what that is first.

I am thinking that I may need to give them all, as in all or nothing..., and kind of follow David Wilson's plan above. I was hoping someone had already done what Sam Mason suggested as being the "fun thing", though ...

Oh -- I think query builders are a thing of the devil.

Thanks to all for putting up with my lack of good writing.

-W
Re: [GENERAL] Semi-customized queries? Subset of SQL? Accessing the parser? Injection?
> If they're that smart, they're smart enough to deal with SQL, and
> likely to be frustrated by a like-sql-but-not command language or
> a GUI query designer.
>
> Instead, create a user that only has enough access to read data (and
> maybe create temporary tables) and use that user to give them
> a sql commandline.
>
> It'll be drastically less development effort for you, and the end result
> is less likely to frustrate your users.

Can't do that. (Or I wouldn't have asked the question.) Need a WWW interface, period. Thanks for the comment, though.
[GENERAL] Semi-customized queries? Subset of SQL? Accessing the parser? Injection?
Hi all,

I am writing an application that allows users to analyze demographic and economic data, and I would like the users to be able to pick columns, transform columns with functions (economists take the logarithm of everything), and write customized WHERE and GROUP-BY clauses. This is kind of like passing through a query to the DB in a library catalog. Has anybody found a good way to do this, especially inside the database from a plpgsql function (select * from custom_query('table1', 'col1 > 100'))?

I don't want to just concatenate a user supplied WHERE clause, at least without somehow checking the resulting statement for (1) only one statement, (2) no data modification clauses, and (3) only one "level" in the tree. It seems like if I could interact with an SQL parser through a script, I could accomplish this relatively easily. Perhaps SPI can help me (give me hints!), though I don't really want to write any C. Perhaps I am wrong about the possibility of this at all.

I realize that roles and permissions can help protect the system, but I still feel nervous. Has anybody done a similar thing, or tried? The problem is that if we try to parameterize everything, then we don't really allow the kind of data exploration that we are shooting for, and these guys / gals are smart enough to deal with a little syntax.

Thanks! -W
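For the application side, the screening step can at least be sketched without touching the SQL parser. This is a crude illustration only -- a keyword blacklist is NOT a real parser and can be evaded, so it belongs behind the role/permission protections discussed above, never instead of them. All names here are hypothetical:

```python
import re

# Crude screen for a user-supplied WHERE clause: reject statement
# separators and data-modification / DDL keywords.  Illustrative only;
# real safety comes from roles and permissions, not from this check.
FORBIDDEN = re.compile(
    r';|\b(insert|update|delete|drop|alter|grant|revoke|truncate|create|copy)\b',
    re.IGNORECASE)

def looks_safe(where_clause: str) -> bool:
    """Return True if no separator or forbidden keyword appears."""
    return FORBIDDEN.search(where_clause) is None

# usage
print(looks_safe("col1 > 100 and log(col2) < 5"))  # True
print(looks_safe("col1 > 100; drop table t"))      # False
```

Note the word boundaries (\b): a harmless column name like created_at does not trip the "create" keyword.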
Re: [GENERAL] Performance of views
>> Am I right to avoid VIEWS within application code?

How one uses views is more a matter of taste and best practices than a matter of rules like this. Frankly, this "rule" sounds rather ill conceived. My feeling is that views can be difficult to maintain when they are nested, but otherwise use them whenever they simplify matters. I also feel that they should only be defined when they represent, well, views of the data that make sense in a long-term way; don't use them for a one-off application.

As for performance -- let me reiterate: create the most elegant design, possibly with views, and only worry about performance AFTER PROFILING.

-W
Re: [GENERAL] Update with a Repeating Sequence
Untested ideas (beware): Use an insert trigger that does:

curr_seq := select max(seq) from foo where field_id = NEW.field_id;
if curr_seq is null then
    NEW.seq := 0;
else
    NEW.seq := curr_seq + 1;
end if;

(You have to figure out how to build the trigger infrastructure...) If you need to do it on a table that is already populated, let us know.

On Tue, Oct 14, 2008 at 9:21 AM, Bill Thoen <[EMAIL PROTECTED]> wrote:
> Grzegorz Jas'kiewicz wrote:
>>
>> alter table foo add newid sequencial;
>> alter table foo drop field_id;
>> alter table foo rename newid to field_id;
>>
> I can't do that; I need to preserve the field_id values.
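The trigger infrastructure alluded to above can be sketched like this (untested against the poster's schema; table "foo" and columns "field_id"/"seq" are taken from the thread):

```sql
create or replace function foo_set_seq() returns trigger as $$
declare
    curr_seq integer;
begin
    select max(seq) into curr_seq from foo where field_id = NEW.field_id;
    if curr_seq is null then
        NEW.seq := 0;
    else
        NEW.seq := curr_seq + 1;
    end if;
    return NEW;  -- BEFORE trigger: the modified row gets inserted
end;
$$ language plpgsql;

create trigger foo_seq_trig
    before insert on foo
    for each row execute procedure foo_set_seq();
```

One caveat worth flagging: two concurrent inserts for the same field_id can both see the same max(seq); a unique constraint on (field_id, seq) would at least make that fail loudly instead of silently duplicating.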
Re: [GENERAL] deleting the master but not the detail
hi

> I have one of those master-detail relationships here and I need to be
> able to delete the master but leave the details untouched

When you create the table with an FK constraint, use the ON DELETE SET NULL option, or SET DEFAULT. And read the docs on "CREATE TABLE": http://www.postgresql.org/docs/8.3/static/sql-createtable.html

>>> Then remove the referential integrity constraint, since it's obviously
>>> incompatible with your business requirements.

Wrong. See above. Unless I misunderstand, in which case I apologize for the noise
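A minimal sketch of the suggestion, with hypothetical master/detail table names:

```sql
create table master (
    id integer primary key
);

create table detail (
    id        integer primary key,
    master_id integer references master (id) on delete set null
);

-- deleting a master row now leaves its detail rows in place,
-- with master_id set to NULL rather than blocking the delete
delete from master where id = 1;
```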
Re: [GENERAL] Table size
> I meant, "I did not know such facility exists"

When you use pgautodoc, it automatically grabs those comments and puts them in the web page it creates... more coolness!
Re: [GENERAL] Making a schema "read-only" (was Unexpected message in grant/revoke script)
Thanks to Eric and Tom, I think I have got it. Here is the function for adding a new student, who can select anything in public and can do anything at all in their own schema.

revoke all on schema public from public; -- done only once

create or replace function new_student (text) returns void as $$
declare
    t_name text;
begin
    -- personal schema
    execute 'create role ' || $1 || ' LOGIN';
    execute 'create schema authorization ' || $1;
    -- public schema
    execute 'revoke all on schema public from ' || $1;
    execute 'grant usage on schema public to ' || $1;
    for t_name in select table_name from information_schema.tables
                  where table_schema = 'public' order by table_name loop
        raise notice 'granting select to % on %', $1, t_name;
        execute 'grant select on ' || t_name || ' to ' || $1;
    end loop;
end;
$$ language plpgsql;

select new_student ('fobar'); --etc
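One hedged refinement worth considering: since $1 is spliced straight into DDL, a student name like bob; drop table x would be executed verbatim. quote_ident() makes each splice a single safe identifier:

```sql
create or replace function new_student (text) returns void as $$
declare
    t_name text;
begin
    -- quote_ident() turns the argument into one safe identifier,
    -- so nothing in the name can escape into the DDL
    execute 'create role ' || quote_ident($1) || ' LOGIN';
    execute 'create schema authorization ' || quote_ident($1);
    execute 'revoke all on schema public from ' || quote_ident($1);
    execute 'grant usage on schema public to ' || quote_ident($1);
    for t_name in select table_name from information_schema.tables
                  where table_schema = 'public' order by table_name loop
        execute 'grant select on ' || quote_ident(t_name)
             || ' to ' || quote_ident($1);
    end loop;
end;
$$ language plpgsql;
```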
Re: [GENERAL] Making a schema "read-only" (was Unexpected message in grant/revoke script)
On Fri, Mar 14, 2008 at 1:30 PM, Erik Jones <[EMAIL PROTECTED]> wrote:
> On Mar 14, 2008, at 3:22 PM, Webb Sprague wrote:
> >>> Start with
> >>>    revoke all on schema public from public
> >>> and then grant only what you want.
> >
> > Oh -- to grant select permissions on all the tables in the public
> > schema, do I have to do it table-by-table? I know I can write a loop
> > and use information_schema if necessary, but if I don't have to I would
> > like to know.
>
> Typically what's done is to do that for one user "group" and then make
> all of your actual login users members of that group.

Oh -- I guess overlapping groups would work, but wouldn't I still have to give select permissions to this collective role by going table-by-table? And I want a separate role for each person, so that they can't stomp all over each other's data.

And now I have a new problem -- what could be wrong with the "alter schema ... owner to" line below:

revoke all on schema public from public;

create or replace function new_student (text) returns void as $$
declare
    t_name text;
begin
    -- personal schema
    execute 'create role ' || $1 || ' LOGIN';
    execute 'create schema ' || $1;
    execute 'alter schema ' || $1 || ' owner to ' || $1; -- THIS THROWS AN ERROR see below
    execute 'grant all on schema ' || $1 || ' to ' || $1 || ' with grant option';
    for t_name in select table_name from information_schema.tables
                  where table_schema = 'public' order by table_name loop
        raise notice 'granting select to % on %', $1, t_name;
        execute 'grant select on ' || t_name || ' to ' || $1;
    end loop;
end;
$$ language plpgsql;

oregon=# select new_student('foobarbar');
ERROR:  unrecognized node type: 1651470182
CONTEXT:  SQL statement "alter schema foobarbar owner to foobarbar"
PL/pgSQL function "new_student" line 7 at EXECUTE statement

Thanks again for helping me understand this most tedious of database stuff

w
Re: [GENERAL] Making a schema "read-only" (was Unexpected message in grant/revoke script)
> > Start with
> >     revoke all on schema public from public
> > and then grant only what you want.

Oh -- to grant select permissions on all the tables in the public schema, do I have to do it table-by-table? I know I can write a loop and use information_schema if necessary, but if I don't have to I would like to know.

Thx
Re: [GENERAL] Making a schema "read-only" (was Unexpected message in grant/revoke script)
On Fri, Mar 14, 2008 at 12:55 PM, Tom Lane <[EMAIL PROTECTED]> wrote:
> "Webb Sprague" <[EMAIL PROTECTED]> writes:
> > Also, I revoked what I thought was everything possible on the public
> > schema, but a user is still able to create a table in that schema --
> > could someone explain:
>
> > oregon=# revoke create on schema public from foobar cascade;
> > REVOKE
>
> You've got a conceptual error here:

Not surprising...

> Start with
>     revoke all on schema public from public
> and then grant only what you want.

I will give it a go, and thanks!

>             regards, tom lane
[GENERAL] Making a schema "read-only" (was Unexpected message in grant/revoke script)
> I have the following function:

Now that I know how to write the function, my design flaws and lack of understanding are more apparent...

I was trying to give all logged-in users read-only access to the public schema, and full access to the schema that corresponds to their username. The idea is that they can grab data from public with a select into, and if they need to further process it, they need to store the derived table in their personal schema. Is this possible? Is it possible to do without granting/revoking on each table one-by-one in public (I had incorrect syntax in the function below trying to grant select to a schema)?

Also, I revoked what I thought was everything possible on the public schema, but a user is still able to create a table in that schema -- could someone explain:

oregon=# revoke create on schema public from foobar cascade;
REVOKE
oregon=# revoke all on schema public from foobar cascade;
REVOKE
oregon=# commit;
COMMIT
oregon=# revoke all on schema public from foobar cascade;
REVOKE
oregon=# set role foobar;
SET
oregon=> create table public.foo (id int);
CREATE TABLE
oregon=> commit;
COMMIT
oregon=>

This is for a class -- all the students need access to the data, but I don't want them to practice deletes on the example table... I apologize for the stupid questions, but I haven't ever had call to dive into the weirdnesses of grant/revoke before. Thanks again!

> create function new_student (text) returns text as $$
> declare
>     wtf integer := 1;
> begin
>     execute 'create schema ' || $1;
>     execute 'create role ' || $1 || 'LOGIN';
>     execute 'revoke all on schema public from ' || $1;
>     execute 'grant select on schema public to ' || $1;
>     execute 'grant all on schema ' || $1 || ' to ' || $1 || ' with grant option';
>     return $1;
> end;
> $$ language plpgsql;
>
> When I run this with select new_student('foobar'), I get the following
> error message
[GENERAL] Unexpected message in grant/revoke script
Hi all, I have the following function:

create function new_student (text) returns text as $$
declare
    wtf integer := 1;
begin
    execute 'create schema ' || $1;
    execute 'create role ' || $1 || 'LOGIN';
    execute 'revoke all on schema public from ' || $1;
    execute 'grant select on schema public to ' || $1;
    execute 'grant all on schema ' || $1 || ' to ' || $1 || ' with grant option';
    return $1;
end;
$$ language plpgsql;

When I run this with select new_student('foobar'), I get the following error message
[GENERAL] Unexpected error in grant/revoke script
Hi all, I have the following function:

create function new_student (text) returns text as $$
declare
    wtf integer := 1;
begin
    execute 'create schema ' || $1;
    execute 'create role ' || $1 || 'LOGIN';
    execute 'revoke all on schema public from ' || $1;
    execute 'grant select on schema public to ' || $1;
    execute 'grant all on schema ' || $1 || ' to ' || $1 || ' with grant option';
    return $1;
end;
$$ language plpgsql;

When I run this with select new_student('foobar'), I get the following error message:

oregon=# \i new_student.sql
CREATE FUNCTION
oregon=# select new_student('foobar');
ERROR:  role "foobar" does not exist
CONTEXT:  SQL statement "revoke all on schema public from foobar"
PL/pgSQL function "new_student" line 6 at EXECUTE statement

However, I can run the following from the command line just fine:

# create role foobar login;
# revoke all on schema public from foobar;

Can anyone explain and help me fix? TIA.
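A likely culprit, judging from the concatenation alone: there is no space before LOGIN in the second EXECUTE, so the statement built is "create role foobarLOGIN" -- which creates a role named foobarLOGIN, and the later revoke on plain foobar then fails exactly as shown. Adding the space should fix it:

```sql
-- was:  execute 'create role ' || $1 || 'LOGIN';   -- builds "create role foobarLOGIN"
execute 'create role ' || $1 || ' LOGIN';           -- note the space before LOGIN
```

(When it works from psql, the space is typed by hand, which is why the manual test succeeds.)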
Re: [GENERAL] how do you write aggregate function
On Mon, Mar 10, 2008 at 11:00 AM, Justin <[EMAIL PROTECTED]> wrote:
> That comment was not meant to be an insult or disparaging in any way what
> so ever. If it was taken as such then i'm sorry.

I am sure it would have been fine in person, I just think over email it sounded abrasive. But could you stop topquoting please?

> It seems the biggest performance hit is copying of the array content from
> one memory variable to another which is happening allot.

Yeah, I think arrays just can't handle a whole lot of data, that is all. They are "tricky", and shouldn't be used for heavy lifting (more than 1k of elements feels like you are asking for trouble).

> I'm not really against using a temp tables to hold onto values. I used to
> do that in Foxpro when i hit the hard limit on its array but other problems
> start popping up. If we use a temp table keeping track what going with
> other users can make life fun.

I think temp tables have scope, though you should test this, so that you can use them with impunity in functions and not worry with multiple users.

> I really want to figure this out how to speed it up. I have to write allot
> more aggregate functions to analyze R&D data which will happen latter this
> year. right now this function will be used in calculating manufacturing
> cost.

I think a combination of aggregate functions along with some more design would be best. For example: can you have a trigger calculate the normalized weight of a row on insert? Have triggers keep another table with summary information updated as you modify the data? Etc. There is a lot to PG that would help for this kind of thing.
Re: [GENERAL] data import - duplicates
I haven't tested but this is what I would do (uses arrays, which are handy when you need them), with the names changed to protect the innocent:

begin;

-- create a table with some duplicates in one of the columns (y is ck)
wsprague=# select x, x%4 as y into temp fbar from generate_series(1,10) as x(a);
SELECT
wsprague=# select * from fbar;
 x  | y
----+---
  1 | 1
  2 | 2
  3 | 3
  4 | 0
  5 | 1
  6 | 2
  7 | 3
  8 | 0
  9 | 1
 10 | 2
(10 rows)

-- create a table with the pk, an array of the duplicate keys, and the length of that array
select y, x_list, array_upper(x_list, 1) as x_len into baz
  from (select y, array_accum(x) as x_list from fbar group by y) a;
wsprague=# select * from baz;
 y |  x_list  | x_len
---+----------+-------
 3 | {3,7}    |     2
 2 | {2,6,10} |     3
 1 | {1,5,9}  |     3
 0 | {4,8}    |     2
(4 rows)

-- delete all rows that don't have ck in the first element of the pk list
wsprague=# delete from fbar where not exists (select 1 from baz where fbar.x=baz.x_list[1]);
DELETE 6
wsprague=# select * from fbar;
 x | y
---+---
 1 | 1
 2 | 2
 3 | 3
 4 | 0
(4 rows)

commit;

I hope that makes sense. It should be easy to make into a function (like clean_table(table_name text, pk_name text, dup_key_name text)). I don't know how well it will work with indexes. You could probably even write a function to do the entire import inside postgres, starting with a copy and moving to merging tables and ending with some consistency checks, and thus benefit from transaction isolation of the whole process.

HTH

On Sat, Mar 8, 2008 at 9:42 PM, brian <[EMAIL PROTECTED]> wrote:
> I have ~350K rows of sample data that has come to me in 64 text files
> (CSV) that I'd like to import into a new database. Each file can be
> considered its own category and is so named. That is, each file name
> will be inserted into a separate categories table. I'd like to relate
> each row to its category. However, while the rows are unique in each
> file, some rows are found in more than one file.
>
> I also must parse some of the fields into separate columns, but that's
> no big deal. But it means that I must do some pre-processing on these
> files, in any case.
>
> After some thought, I thought I might brute-force the problem with Perl
> by reading each line of each file into an assoc. array unless it's
> already in it, in which case I'd append the "key" based on the list of
> categories that line is found in (I'd still need to parse out the keys
> later but I'd get rid of the dupes). Each array row would be like so:
>
> 'key' => '1,4,etc.', 'text' => 'a line'
>
> Aside from the fact that the array search would become ridiculously
> large after just a couple of files, I realised that this is a
> non-starter for the simple fact that this data comprises less than 25%
> of the eventual total. So refactoring it in this way would be a waste of
> time (there will probably be dupes in the files to come).
>
> So, I'd like to instead parse out my columns properly and write each
> line (along with its category key) to a new, single file to be copied
> into a working table. Once I've done so, is there an efficient method I
> can use to select all duplicates (save for the category key) into a set
> from which I could then select into the final table (and insert the keys
> into the category join table)?
>
> For example (pk is the PK from the working table and ck is the category
> key), my dupes query on the working table would give the following set:
>
> pk ck
> 1  1  a a a a
> 2  3  a a a a
> 3  3  b b b b
> 4  7  b b b b
> 5  4  a a a a
>
> I would then want to insert just the unique rows into the final table
> yet add all of the PKs and CKs to the category join table. After
> that was done, I'd delete all of these from the working table and then
> move the unique rows that are left to the final table (and insert the
> keys into the join table).
>
> I hope that makes sense. I'm not looking for anyone to do my homework
> for me; I'm sure I could fix up a tasty function for this (the data is
> destined for MySQL, alas, but I'll be damned if I won't use PG for the
> heavy lifting). What I'm really looking for is a handy way to grab all
> of those dupes.
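For the pre-processing side, the Perl "assoc. array" idea quoted above translates to a few lines of Python. This is only a sketch of the accumulate-categories-per-unique-line step (the file layout and category keys are hypothetical):

```python
# Keep each distinct line once, accumulating the category key of every
# file it appears in -- the dict plays the role of the Perl assoc. array.
def merge_files(files):
    """files: list of (category_key, list_of_lines).
    Returns {line: [category_keys]} in first-seen order."""
    seen = {}
    for cat_key, lines in files:
        for line in lines:
            seen.setdefault(line, []).append(cat_key)
    return seen

# usage: "a,a" is in both categories, "b,b" only in the second
files = [(1, ["a,a"]), (2, ["a,a", "b,b"])]
print(merge_files(files))  # {'a,a': [1, 2], 'b,b': [2]}
```

The memory concern raised in the quoted message still applies, of course; a dict just makes the lookup O(1) rather than a linear array search.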
Re: [GENERAL] How to make update rapidly?
Post the table, the query, and the explain output, and then we can help you. On Feb 19, 2008 7:38 PM, hewei <[EMAIL PROTECTED]> wrote: > Hi,Every body; >I have a table contains 100,000 rows, and has a primary key(int). > Now ,I need to execute sql command like "update .. where id=*"(id > is primary key). > I expect execute 1200-1600 sqlcommands per second(1200-1600/s). >In test,when the id increase by degrees in sqlcommands, then I can reach > the speed(1600/s); > But in fact , the id in sqlcommands is out of rule, then the speed is > very slow, just 100/s. > what can i do? can you help me ? > ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] dynamic crosstab
> > > It occurs to me that it shouldn't be terribly difficult to make an
> > > alternate version of crosstab() that returns an array rather than
> > > tuples (back when crosstab() was first written, Postgres didn't
> > > support NULL array elements). Is this worth considering for 8.4?
> >
> > How about returning generic rows? Is that possible?
>
> One hack I've used in the past to get those is serializing the rows:
> XML, YAML and most recently JSON.
>
> > It would be really neat if you didn't have to specify the return
> > type in the query that invoked the crosstab.
>
> It would be handy :)

+1

What about (for a 2-dim crosstab, anyway): take a table and two column names to group by, and return the following results: a 1-d array with the column names, a 1-d array with the row names, and a 2-d array with the cell values. A function to take these three arrays and make csv-readable text would be great; also a function to "explode" the arrays into a table (like an array_accum inverse), but this would take a type or something. Is this what everyone means anyway?
Re: [GENERAL] arrays of floating point numbers / linear algebra operations into the DB
> I'm quite proud, this is my first C extension function ;-)
> I'd gladly post the code if it's ok for the list users. It's more or
> less 100 lines of code.

This approach seems promising... I would definitely like to see it.

> By the way, Webb: I took a look at GSL and it seems to me that, from a
> linear algebra point of view, it's basically cblas, so I'd use cblas
> directly.
> Please let me know your thoughts/advices,

The only thing about GSL is that it would make it easier to tie into some very sophisticated stuff later, and (I think) the basic linear algebra is probably just as fast as CBLAS, and we could implement it first. It would also be easy to define a big project as: "bring GSL to Postgres", and then people could work on pieces. But if you actually write it, you get to decide :) GSL licensing is GNU-ish, so maybe that is a deal breaker, too.

w
Re: [GENERAL] arrays of floating point numbers / linear algebra operations into the DB
On Feb 1, 2008 2:31 AM, Enrico Sirola <[EMAIL PROTECTED]> wrote:
> Hello,
> I'd like to perform linear algebra operations on float4/8 arrays

Having avoided a bunch of real work wondering about linear algebra and PG, did you consider the Gnu Scientific Library? We would still need to hook everything together, but it seems to do a lot of this, and is written in C, etc.
Re: [GENERAL] arrays of floating point numbers / linear algebra operations into the DB
> ...linear algebra ... > >>> ... matrices and vectors . > >> ...Especially if some GIST or similar index could efficiently search > >> for vectors "close" to other vectors... > > > > Hmm. If I get some more interest on this list (I need just one LAPACK > > / BLAS hacker...), I will apply for a pgFoundry project and appoint > > myself head of the peanut gallery... > > I think you should start one. I'd be happy to help. OK. You are on. I think designing an interface is the first step, and I am inclined to use matlab syntax plus cool things I wish they had (convolution matrices, recycling, etc). > I'm rather proficient in C; somewhat literate about postgres' GIST > stuff (I think a couple of my bugfix patches were accepted in postgis); Nifty! I am having trouble bending my head around how we can fit 10K by 10K matrices into Datums, but if you have worked with PostGIS then a lot of those big geographic fields might help. > and deal with a big database doing lots of similarity-based searches (a > 6'2" guy with light brown hair being similar to a 6'1" guy with dark > blond hair) - and am experimenting with modeling some of the data as > vectors in postgres. Well, I bet a good linear algebra library would help. A lot. :) ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
Re: [GENERAL] arrays of floating point numbers / linear algebra operations into the DB
(I had meant also to add that a linear algebra package would help Postgres to be the mediator for real-time data, from things like temperature sensors, etc, and their relationship to not-so-scientific data, say in a manufacturing environment). On Feb 1, 2008 12:19 PM, Ron Mayer <[EMAIL PROTECTED]> wrote: > Webb Sprague wrote: > > On Feb 1, 2008 2:31 AM, Enrico Sirola <[EMAIL PROTECTED]> wrote: > >> I'd like to perform linear algebra operations on float4/8 arrays... > > > > If there were a coherently designed, simple, and fast LAPACK/ MATLAB > > style library and set of datatypes for matrices and vectors in > > Postgres, I think that would be a HUGE plus for the project! > > I'd also be very excited about this project. > > Especially if some GIST or similar index could efficiently search > for vectors "close" to other vectors. That would be very interesting, as we could play with a multitude of different distance metrics from Analysis! Wow! > I'd be eager to help, test, debug, etc; but probably aren't qualified > to take the lead on such a project. I almost think the hardest part would be to spec it out and design the interface to the libraries. Once we had that, the libraries are already there, though figuring out how we are going to handle gigabyte-size elements (e.g. a satellite image) will require some finesse, and perhaps some tiling ... Hmm. If I get some more interest on this list (I need just one LAPACK / BLAS hacker...), I will apply for a pgFoundry project and appoint myself head of the peanut gallery...
Re: [GENERAL] arrays of floating point numbers / linear algebra operations into the DB
On Feb 1, 2008 2:31 AM, Enrico Sirola <[EMAIL PROTECTED]> wrote: > Hello, > I'd like to perform linear algebra operations on float4/8 arrays. > These tasks are typically carried out using ad hoc optimized libraries > (e.g. BLAS). If there were a coherently designed, simple, and fast LAPACK/ MATLAB style library and set of datatypes for matrices and vectors in Postgres, I think that would be a HUGE plus for the project! I would have used it on a project I am working on in mortality forecasting (I would have been able to put all of my mathematics in the database instead of using scipy), it would tie in beautifully with the GIS and imagery efforts, it would ease fancy statistics calculation on database infrastructure, it would provide useful libraries for the datamining/ knowledge discovery types, etc, etc. If we just had fast matrix arithmetic, eigen-stuff (including singular value decomposition), convolution, random matrix generation, and table <-> matrix functions, that would be amazing and would provide the material for further library development, since a lot of complex algorithms just fall out when you can do advanced linear algebra. We need to be able to convert transparently between matrices/ vectors (which I think should be simple N by 1 matrices by default) and arrays, but we would probably want to go for a separate datatype in order to get speed, since scientifically important matrices can be HUGE. Just my fairly worthless $0.02, as all I would provide would be to be a tester and member of the peanut gallery, but there you go. Seems like a perfect Summer Of Code project for someone better at C-level programming than me. -W
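To make the wishlist concrete: the "table <-> matrix" conversion and the basic arithmetic are the easy parts to picture. A toy Python illustration (a real implementation would hand the inner loop to BLAS/LAPACK rather than interpret it, and the function names here are mine):

```python
def table_to_matrix(rows):
    # table <-> matrix: each row tuple from a query becomes one matrix row
    return [list(r) for r in rows]

def mat_mul(a, b):
    # naive triple loop; this is what BLAS's dgemm computes, only slowly
    n, k, m = len(a), len(b), len(b[0])
    assert len(a[0]) == k, "inner dimensions must agree"
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(m)]
            for i in range(n)]
```

The hard parts, as noted above, are the datatype/storage side (huge matrices in Datums) and the eigen/SVD routines, which is exactly where wrapping an existing library earns its keep.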
Re: [GENERAL] SVN event hooks for PL/PGSQL functions and DDL?
I don't mean to be thick headed, but ... On Jan 18, 2008 1:11 AM, Blazej Oleszkiewicz <[EMAIL PROTECTED]> wrote: > Its simple. ... it seems not so simple, as (1) SVN docs say don't modify the repository before a commit, (2) I would think that I would need to call pg_dump BEFORE the commit in order to get a diff on the function code if appropriate, playing hell with (1) above. If the version control doesn't track my on the fly changes to the functions, this little project is all for naught. I will start experimenting today, but I would love any random thoughts on this topic. -W > Create runable script "post-commit" in > /hooks > > script body may look like > > == begin == > #! /bin/sh > pg_dump [option...] [dbname] > /path/to/db/backups/$2.dump > == end == > > "$1" is REPOS > "$2" is REVISION > > > It works as follows > When you make commit on SVN it runs the hook script which makes db > dump (look at pg_dump options). > > Regards, > Blazej > > 2008/1/16, Webb Sprague <[EMAIL PROTECTED]>: > > > In another thread, someone mentioned writing hooks for Subversion that > > would grab function definitions and DDL statements from the current > > database and push them into the repository? > > > > Does anyone have a few scripts/ cookbook examples for this? Is there > > a cookbook section on the postgres wiki where they might go? I am > > sure I can figure this out, but I wouldn't mind cheating > > > > Thx > > > > ---(end of broadcast)--- > > TIP 6: explain analyze is your friend > > > ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[GENERAL] SVN event hooks for PL/PGSQL functions and DDL?
In another thread, someone mentioned writing hooks for Subversion that would grab function definitions and DDL statements from the current database and push them into the repository? Does anyone have a few scripts/ cookbook examples for this? Is there a cookbook section on the postgres wiki where they might go? I am sure I can figure this out, but I wouldn't mind cheating Thx ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] Getting process id of a connection?
> > I think select pg_backend_pid(); will do that. Perfect. I tried googling, but I didn't try \df *pid*, which would have found it. I tried to figure out the pg_stat_activity view, but I can't think of a WHERE condition that would make it give me the info I wanted. Thx again to everyone.
[GENERAL] Getting process id of a connection?
Hi all, Is there a way to determine the pid of a database connection from within that connection? As a hypothetical example, I would like to be able to do the following: $ps x PID TTY STAT TIME COMMAND 11674 ?S 0:00 sshd: [EMAIL PROTECTED]/1 11675 pts/1Ss 0:00 -bash 11682 pts/1T 0:00 psql 11685 pts/1R+ 0:00 ps x psql=# select CURRENT_PID; 11682 I want this so that I can log the psycopg2 connection pid, and kill it to test reconnection code. Thanks! -W ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] FK creation -- "ON DELETE NO ACTION" seems to be a no-op
> > > > ... "currentsessions_username_fkey" FOREIGN KEY (username) > > REFERENCES authorizedusers(username) ON UPDATE CASCADE > > Hmm, NO ACTION is the default. Oh, how embarrassing. Never mind... > > -- > Alvaro Herrerahttp://www.CommandPrompt.com/ > The PostgreSQL Company - Command Prompt, Inc. > ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[GENERAL] FK creation -- "ON DELETE NO ACTION" seems to be a no-op
Hi list, First, my select version() gives: PostgreSQL 8.2.4 on x86_64-pc-linux-gnu, compiled by GCC x86_64-pc-linux-gnu-gcc (GCC) 4.1.2 (Gentoo 4.1.2 p1.0.2) The Problem: when I run ALTER TABLE currentsessions ADD CONSTRAINT currentsessions_username_fkey FOREIGN KEY (username) REFERENCES authorizedusers on delete no action on update cascade; It is not updating the table correctly to reflect the "no action". After \d currentsessions (and some editing) I get the following, which, as I understand it, should say something about the "ON DELETE NO ACTION": ... "currentsessions_username_fkey" FOREIGN KEY (username) REFERENCES authorizedusers(username) ON UPDATE CASCADE However, if I drop the constraint and rebuild it with ALTER TABLE currentsessions ADD CONSTRAINT currentsessions_username_fkey FOREIGN KEY (username) REFERENCES authorizedusers on delete cascade on update cascade; I get: ... "currentsessions_username_fkey" FOREIGN KEY (username) REFERENCES authorizedusers(username) ON UPDATE CASCADE ON DELETE CASCADE Is this a bug or a strange thing I had thought I understood? Or has it been fixed in 8.2.5 ("Fix incorrect handling of some foreign-key corner cases (Tom)" -- from the release notes) If anyone wants more data, just holler. Thanks W ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
Re: [GENERAL] find out data types using sql or php
> ... to determine the field datatype (text, numeric, > bool, etc.) I am not sure if this helps, but you can dig around in the system files (pg_catalog.*), and probably write a query that gets the types of every column in the data table you want to insert to. Not a big push, but maybe it will get you started. I don't have links at hand -- sorry. > and then use some regex or something along those lines to > attempt to verify that the data is good, and then mark the is_ok field > (obviously a bool) as true, and use is_ok = TRUE in the insert/select > statement. Can somebody give me a push in the right direction? This sounds like a lot of programming, but that you have the right idea. I am not sure how you would use the "is_ok" within SQL; I would probably only try to insert data that is ok, and filter that in the application. I might also think in terms of wrapping everything in a transaction, assuming it is all good, and then rolling back and catching your exception and giving the user a decent error -- such as "reformat your spreadsheet, doofus, and try uploading again"... Very doable with psycopg2 and python. -W > > ---(end of broadcast)--- > TIP 3: Have you checked our extensive FAQ? > >http://www.postgresql.org/docs/faq > ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
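The validate-then-mark approach can be sketched in the application layer. A hypothetical Python fragment (in practice the type names would come from a pg_catalog query — pg_attribute joined to pg_type — not a hard-coded dict, and these regexes are only illustrative):

```python
import re

# hypothetical per-type validators; real type names come from the catalog
VALIDATORS = {
    "integer": re.compile(r"^[+-]?\d+$"),
    "numeric": re.compile(r"^[+-]?\d+(\.\d+)?$"),
    "boolean": re.compile(r"^(t|f|true|false|0|1)$", re.IGNORECASE),
}

def is_ok(value, pg_type):
    """Decide whether a raw spreadsheet cell is insertable for the
    given column type; unknown types (e.g. text) accept anything."""
    pattern = VALIDATORS.get(pg_type)
    return bool(pattern.match(value)) if pattern else True
```

As suggested above, you would run every cell through is_ok() before building the INSERT, and hand anything that fails back to the user with a readable error.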
[GENERAL] Inheritance fixing timeline? (Was "Inherited FK Indexing")
> > Is it > > possible to have FK that spans into child tables? > > This is a well known (and documented, see [1]) deficiency. It's due to > the current implementation of indices, which are bound to exactly one > table, meaning they do return a position within the table, but cannot > point to different tables. Is this set to be fixed in any particular release? (Here's hoping that someone smarter than I am is working on it...) -W ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] Do AGGREGATES consistently use sort order?
order/aggregate thing is a general question. > > Yes. > > You can even do this with GROUP BY as long as the leading columns of the ORDER > BY inside the subquery exactly matches the GROUP BY columns. > > In theory we can't promise anything about future versions of Postgres but > there are lots of people doing this already so if ever this was lost there > would probably be some new explicit way to achieve the same thing. Does anyone have any spec links, oracle behavior, or whatever? For now I will trust Postgres to continue behaving sanely, but I am curious. Thx to Gregory for the quick reply > > -- > Gregory Stark > EnterpriseDB http://www.enterprisedb.com > ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
[GENERAL] Do AGGREGATES consistently use sort order?
I have the following query: select array_accum(name) from (select name from placenames where desig='crater' order by name desc) a; with array_accum defined as: CREATE AGGREGATE array_accum ( BASETYPE = anyelement, SFUNC = array_append, STYPE = anyarray, INITCOND = '{}' ); Can I count on this aggregate to take each new item in sorted order when it adds it to the state vector? So that if I have the following: oregon_2007_08_20=# select * from (select name from placenames where desig='crater' order by name desc) a; name Yapoah Crater West Crater Twin Craters Timber Crater Red Crater Newberry Crater Nash Crater Mount Mazama Millican Crater Little Nash Crater Le Conte Crater Jordan Craters Diamond Craters Coffeepot Crater Cayuse Crater Black Crater Big Hole Belknap Crater (18 rows) I can always count on (note the order name): \a oregon_2007_08_20=# select array_accum(name) from (select name from placenames where desig='crater' order by name desc) a; array_accum {"Yapoah Crater","West Crater","Twin Craters","Timber Crater","Red Crater","Newberry Crater","Nash Crater","Mount Mazama","Millican Crater","Little Nash Crater","Le Conte Crater","Jordan Craters","Diamond Craters","Coffeepot Crater","Cayuse Crater","Black Crater","Big Hole","Belknap Crater"} (1 row) I am interested in stitching a line out of points in postgis, but the order/aggregate thing is a general question. Thx W ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
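One way to think about the question: an aggregate is just a left fold of its SFUNC over the rows in whatever order the executor delivers them, so everything hinges on the subquery's sort order surviving the plan. A toy Python model of the array_accum definition above:

```python
def array_accum(rows):
    state = []                  # INITCOND = '{}'
    for row in rows:            # rows arrive in executor order
        state = state + [row]   # SFUNC = array_append
    return state
```

For what it's worth, PostgreSQL 8.4 later made the ordering explicit with array_agg(name ORDER BY name DESC), which removes any reliance on the subquery's sort order being preserved.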
Re: [GENERAL] Postgresql performance in production environment
> The command you gave only shows currently > active users, not the historic peak of connections for instance. I'll > keep digging the manual but would love any nudges in the right > direction, thanks! Can you set up a snapshot in a cronjob? It would still only be a sample of a sample, but it might help.
Re: [GENERAL] [RESEND] Transaction auto-abort causes grief with Spring Framework
> Isn't the 'try' statement rather similar to a 'savepoint' command? I > realize it would be difficult to override the behaviour of try {...} > catch (...) {...}, but it shouldn't be too hard to wrap it somehow for > exceptions in database code. Yes, but I believe the OP was getting two levels of his application mixed up: he was doing something that caused a rollback in the *database*, then hoping to recover in a catch block in the *application* without terminating the aborted transaction in the database. Or so I gather. You are right in another post about the purpose and design of transactions, and don't use the discussion here as a model, though drivers seem to often do weird stuff with transactions behind your back. Psycopg (python) does an implicit begin, so you must commit, which then starts another begin automatically. I think you can set a handle to do autocommit, but I never do. This seems best because it forces you to handle transactions explicitly, but I can imagine other (bad) approaches, and Spring may use them (though I think the difficulty is that MS-SQL is sloppy, not Spring, and that the OP is still getting used to TX's and MS-SQL covered up some things that shouldn't have been covered). W ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] [RESEND] Transaction auto-abort causes grief with Spring Framework
> Pgs... like a warning that you can't do this; > > begin > insert 1 --works > insert 2 --fails > commit > > row 1 will exist in db (yes, no kidding). This will not work in pg, > which I now see is obviously correct. This should either be a FAQ for MS-SQL or Spring, but since PG does it canonically it doesn't actually get asked very frequently ... Sounds like a nightmare specifically designed for vendor lock-in. My condolences. I am not sure how you can insert into a log even with savepoints, unless you put the logging statement first and then follow it with the insert.
Re: [GENERAL] [RESEND] Transaction auto-abort causes grief with Spring Framework
> It is still a possible point of confusion, but I am starting to think > that pgsql has it right, and mssql has it wrong. (I feel betrayed!) This > issue probably deserves a prominent place in the FAQ! Betrayed? Yes. Surprised? I hope not :) I think your driver (JDBC? or?) is doing autocommit (though I am not sure), and in the discussion we are conflating rollbacks (in the database) with exception handling (in the application, which, I presume, is Java?). You may be best off reworking your transactions, but in order to clear up the confusion you will probably need to post some actual code here (we can handle it, don't worry...). -W
Re: [GENERAL] [RESEND] Transaction auto-abort causes grief with Spring Framework
> it is all so easy with other dbs, but with postgresql > it is a nightmare... the only solution I can see is to remove the > declarative transactions in Spring and start using manual transaction > blocks around everything that could possibly go wrong... just because of > a quirk in postgresql This may or may not be a design flaw in postgresql, but perhaps you could reduce the emotional pitch of your emails - some people here (and I am not one of them, to be honest) put in a lot of work for no payoff except the pride of contributing to a good product, and to have it insulted seems, well, personally insulting, not to mention unprofessional and a little silly and not something that belongs on a listserv. That being said, I feel your pain. Perhaps we can take this as an opportunity to make the product better, rather than an opportunity to bag on it by calling it "quirky" and "a nightmare".
Re: [GENERAL] Matrix (linear algebra) operations and types in PG?
Hi all, Take a look at pl/r, http://www.joeconway.com/plr/. This is an interface to R: http://www.r-project.org/ I'm not familiar with this, but I think this may be helpful for you. Shoot -- I should have said that I knew about plr -- supposedly a great project (maybe the reason there is no matrix type/operator in PG), but not what I am interested in, for reasons more to do with curiosity and the joy of hacking than with any pragmatic need. (Thanks, though, Andreas)
[GENERAL] Matrix (linear algebra) operations and types in PG?
Hi all, This is just a random question/idea (I tried googling, but couldn't get an answer quickly): Has anyone written a library to effect linear algebra types and operations through Postgres? E.G., convert a float 2-d array to a matrix, convert a float 1-d array to a vector, multiply them, take the matrix power, calculate eigenvalues, some easy functions for converting tables to matrices, etc. This might be extremely useful to some of the data analysis folks out there--I would have used it a year ago on a project. It seems like a fairly straightforward project, which would consist primarily in hooking up the LAPACK libraries and defining some types and operations. Is there something tricky that I haven't foreseen? If I had the summer free, I would apply for a google summer of code project (hint to the world...). ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] simple coordinate system
http://www.postgresql.org/docs/8.2/static/datatype-geometric.html Have you looked at these yet? If not, you asked your question prematurely and should have read the docs. If so, in what respect do they not work for you? On 3/15/07, Robin Ericsson <[EMAIL PROTECTED]> wrote: On 3/15/07, Webb Sprague <[EMAIL PROTECTED]> wrote: > ... planning a simple coordinate system, where objects are described > > as x, y and z. Are there any contribs or extensions available that can > > help me with datatypes, calculation of length between two points, etc? > > google postgis. It is for geographic stuff, so maybe overkill, but > maybe not. There are are also geometry types native to Postgres that > you can find in the docs I was thinking about PostGIS, but it seemed overkill for my purpose. Therefore I asked in the first place :) -- regards, Robin ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] simple coordinate system
... planning a simple coordinate system, where objects are described as x, y and z. Are there any contribs or extensions available that can help me with datatypes, calculation of length between two points, etc? google postgis. It is for geographic stuff, so maybe overkill, but maybe not. There are also geometry types native to Postgres that you can find in the docs -- regards, Robin
Re: [GENERAL] Database slowness -- my design, hardware, or both?
OK, I modified things to use interpolation. Here's the updated query: explain UPDATE Transactions SET previous_value = previous_value(id) WHERE new_value IS NOT NULL AND new_value <> '' AND node_id IN (351, 169, 664, 240); And here is the query plan produced by explain: QUERY PLAN -- Bitmap Heap Scan on transactions (cost=8842.88..98283.93 rows=407288 width=249) Recheck Cond: (node_id = ANY ('{351,169,664,240}'::integer[])) Filter: ((new_value IS NOT NULL) AND (new_value <> ''::text)) -> Bitmap Index Scan on node_id_idx (cost=0.00..8842.88 rows=434276 width=0) Index Cond: (node_id = ANY ('{351,169,664,240}'::integer[])) (5 rows) I'm still a bit surprised by how different the query plan came out with what would seem like a minor change. Do you have new \timings? What you or I think is a minor change isn't necessarily what the planner thinks is a minor change, especially when you change data from something that requires a query to something deterministic. I would suggest changing your function to remove as many such queries as possible too (I am thinking of the order by limit 1). This would be a good move also in that you narrow down the number of moving parts to diagnose, and it just makes the whole thing cleaner. The meta-moral is that db optimization requires systematic experimentation. Use the database to store the results of the various experiments! In light of this, I would suggest you try removing the check clauses and seeing if you get a difference too. Just like Francis Bacon said -- don't deduce from first principles, experiment! I would also try amortizing the analysis with triggers, rather than building the table all at once; this may be better or worse, depending on the on-line character of the application (e.g., if they are waiting at an ATM, it might be a deal breaker to add two seconds to do an insert / update, but not if you are tailing off a log file that gets updated every minute or so.)
W
Re: [GENERAL] Database slowness -- my design, hardware, or both?
I am not able to look as closely as it deserves ... ... but I see two seq scans in your explain in a loop -- this is probably not good. If you can find a way to rewrite the IN clause (either de-normalizing through triggers to save whatever you need on an insert and not have to deal with a set, or by using except in the query, or something else more creative)... Also -- there is a good book on temporal databases by Snodgrass that might give some interesting ideas; maybe you have already seen it, but still. I am thinking you could increment a sequence variable to give you a "tick" integer with each action, rather than trying to use timestamps with all their overhead and inaccuracy (1 second is a long time, really). Lamport also did work on clocks that might apply. Also have you tried dropping all your fk and checks just to see if you get a difference in speed on an update? It would be interesting, perhaps. If you could get rid of the sorted limit 1 clause in your function, there would be less variability, making it easier to understand; you probably need to denormalize somehow, perhaps using ticks; I am not sure. Could a trigger set your previous_value on insert or update, rather than querying for it later? I'm now thinking of separating each activity into its own database, in the hopes that this will improve the system speed enough to do what I need. But I'm far from convinced that this will really help. Better to figure out the real problem -- more interesting, more scalable. Hope my incoherent brain dump lights a spark.
Re: [GENERAL] Importing *huge* mysql database into pgsql
I would like to convert a mysql database with 5 million records and growing, to a pgsql database. All the stuff I have come across on the net has things like "mysqldump" and "psql -f", which sounds like I will be sitting forever getting this to work. Have you tried it? 5 million rows seem doable. In postgres, make sure you disable indexes and checks when you do your import, and use the bulk copy. How long is forever? Can you go offline? If you only need to do it once, it probably won't be too painful W ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] Database slowness -- my design, hardware, or both?
Heh. Sure thing. I wasn't sure how much detail to give when initially posting. Looks like enough to get the real experts on the list started :) I will try to look again tomorrow, but I bet other folks have better intuition than me. How much concurrency is there on your database?
Re: [GENERAL] Database slowness -- my design, hardware, or both?
Well, I've tried to do massive UPDATEs as much as possible. But the patterns that we're looking for are basically of the variety, "If the user clicks on X and then clicks on Y, but without Z between the two of them, and if these are all part of the same simulation run, then we tag action X as being of interest to us." So it's oodles of keeping track of back-and-forth for each of the rows in the table, and looking forward and backward in the table. I agree that row-at-a-time thinking isn't the best way to work, but I didn't see a good alternative for our purposes. I'm open to any and all suggestions. Can you post at least some table schemas, indexes, queries, and explain output? I think of database optimization as a serious case of devil in the details, and generalities (like -- make sure you index, make sure your indexes help using explain, avoid row-at-a-time thinking) won't get you far. So if we had something concrete to work with, well, we would have something concrete to work with. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
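For what it's worth, the "X then Y without Z" tagging described above is a single linear scan once the rows arrive ordered by (run, timestamp), which is one reason set-based retrieval plus a little procedural scanning can beat row-at-a-time lookups. A hedged Python sketch (the event labels are stand-ins for the real click types):

```python
def tag_interesting(events):
    """Return indexes of 'X' events that are later followed by a 'Y'
    with no 'Z' in between -- the pattern described above, found in a
    single pass over events already sorted by time within one run."""
    interesting, pending_x = [], None
    for i, event in enumerate(events):
        if event == "X":
            pending_x = i          # most recent candidate X
        elif event == "Z":
            pending_x = None       # a Z between X and Y disqualifies it
        elif event == "Y" and pending_x is not None:
            interesting.append(pending_x)
            pending_x = None
    return interesting
```

The same idea ports to a plpgsql loop over a cursor ordered by (simulation_run, timestamp), touching each row exactly once instead of looking forward and backward per row.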
Re: [GENERAL] Why can't I put a BEFORE EACH ROW trigger on a view?
I also think that a view is supposed to be just that -- a *view* of underlying data, which in no way modifies the data. I don't know much about the design ideas behind SQL, but I think this view of views (haha) is an underlying assumption. If you are modifying data when you do a select on a view, you are probably not using SQL the way it was intended (not that that is a bad thing, but ...) Postgresql has "rules" which I *think* can rewrite select statements. Rules are kind of a pain, but maybe what you want. On 2/24/07, Webb Sprague <[EMAIL PROTECTED]> wrote: >. I have _additional_ > constraints to place on modifications done through > views, and trickyness involved in modifying the > underlying tables. Write a function foo that returns a set, then a view: "create view as select * from foo()". Incorporate all the trickiness in the function, including variables, multiple temporary tables, whatever you need. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Why can't I put a BEFORE EACH ROW trigger on a view?
. I have _additional_ constraints to place on modifications done through views, and trickyness involved in modifying the underlying tables. Write a function foo that returns a set, then a view: "create view as select * from foo()". Incorporate all the trickiness in the function, including variables, multiple temporary tables, whatever you need. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] backup database by cloning itself
Here is the link to Elein's presentation: http://www.varlena.com/GeneralBits/Tidbits/tt.pdf What [about] postgresql "time travel"? I have never used it, and it looks a little bit unmaintained, but it might be perfect with some tweaking: ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
Re: [GENERAL] backup database by cloning itself
>> Maybe his real goal "all the backups readily available to be read by >> my program (opening the backup read only)" is to have a historical >> record of what certain records looked like in the past. What about postgresql "time travel"? I have never used it, and it looks a little bit unmaintained, but it might be perfect with some tweaking: contrib/spi/README.timetravel I know that someone at SFPUG presented on using time travel last year, and she may have more info.
Re: [GENERAL] MOVE cursor in plpgsql?
> Is there a way to move a cursor in plpgsql in the same way as in > regular sql? ... Wouldn't it be easier to list the parts in a random order (simply ORDER BY RANDOM()) and then use modulo by number of actors (but there's no ROWNUM so a loop is needed anyway). Something like... I think you are right, Thomas, about the modulo, and that might be even better stylistically (more obvious to the reader) -- thanks! I still wonder if MOVE-ing a cursor in plpgsql is impossible, possible, or available soon? Thanks again, W
[GENERAL] MOVE cursor in plpgsql?
Hi all, Is there a way to move a cursor in plpgsql in the same way as in regular SQL? The function below would like to move the cursor back to the start each time the cursor runs out of rows, creating pairs of integers that are randomly put together. The "motivation" for this is to randomly assign the parts of a Shakespearean play (N=25) to not enough actors (N=6), giving each actor several parts. (To be truly fair, I would have to weight by number of lines, but that is for version 2...) If there is a more graceful way to solve the problem, I am interested, but I would like to know about the MOVE issue in any case.

CREATE TYPE int_pair AS (i int, j int);

CREATE OR REPLACE FUNCTION ASSIGN_RAND (tab1_nm text, tab2_nm text)
RETURNS SETOF int_pair AS $PROC$
DECLARE
    curs1 REFCURSOR;
    tab1_id int;
    tab2_id int;
    id_pair int_pair;
BEGIN
    OPEN curs1 FOR EXECUTE 'select id from (select id, random() as r from '
        || quote_ident($2) || ' order by r) x';
    FOR tab1_id IN EXECUTE 'select id from (select id, random() as r from '
        || quote_ident($1) || ' order by r) x' LOOP
        id_pair.i := tab1_id;
        FETCH curs1 INTO id_pair.j;
        IF NOT FOUND THEN
            MOVE FIRST IN curs1;  -- XXX gives error!
            FETCH curs1 INTO id_pair.j;
        END IF;
        RETURN NEXT id_pair;
    END LOOP;
    RETURN;
END;
$PROC$ LANGUAGE plpgsql;
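The plpgsql of this era has no MOVE statement (MOVE is plain-SQL only), but the same effect can be had by closing and re-opening the cursor. A sketch of the inner loop, reusing the function's own names:

```sql
-- Rewind by CLOSE + OPEN instead of MOVE FIRST.  Note that re-opening
-- re-runs the query, so the random() ordering is drawn afresh -- which
-- is arguably what you want here anyway.
FETCH curs1 INTO id_pair.j;
IF NOT FOUND THEN
    CLOSE curs1;
    OPEN curs1 FOR EXECUTE 'select id from (select id, random() as r from '
        || quote_ident($2) || ' order by r) x';
    FETCH curs1 INTO id_pair.j;
END IF;
RETURN NEXT id_pair;
```

If a stable order mattered, the query could be materialized into a temp table once and the cursor opened over that instead.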
[GENERAL] Regular expressions and arrays and ANY() question
I am trying to figure out how to use a regex with ANY(), without any luck, to determine whether at least one element of an array (on the right) matches a given constant pattern (on the left). I think the problem is that the operator expects the pattern on the right with the target string on the left, but I want it reversed (i.e. I want 'Corvalli%' like 'Corvallis', whereas for postgres the only supported form is 'Corvallis' like 'Corvalli%'). Has anybody worked around this before? See below for code. TIA. Feel free to email me directly.

or_gis=# select * from quads_w_cities where 'Corvallis' = any (cities);  -- this works like I want
 ohiocode |               cities
----------+-------------------------------------
 44123e2  | {Albany,Corvallis,Tangent,Estacada}
 44123e3  | {Corvallis,Philomath}
(2 rows)

or_gis=# select * from quads_w_cities where 'corv.*' ~ any (cities);  -- I want this to give me something just like the above
 ohiocode | cities
----------+--------
(0 rows)

The variants 'corv.*' ~~ any (cities), 'corv.*' ~* any (cities), 'Corv.*' ~* any (cities), and '.*Corv.*' ~* any (cities) all return zero rows as well.
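One way around the fixed operand order is a small helper function that loops over the array and applies the regex the right way round. A sketch -- any_matches is an invented name, and on 8.4+ an EXISTS over unnest(cities) would do the same job without the helper:

```sql
-- True if any element of arr matches pat (case-insensitive regex).
CREATE OR REPLACE FUNCTION any_matches(arr text[], pat text)
RETURNS boolean AS $$
DECLARE
    i int;
BEGIN
    FOR i IN array_lower(arr, 1) .. array_upper(arr, 1) LOOP
        IF arr[i] ~* pat THEN
            RETURN true;
        END IF;
    END LOOP;
    RETURN false;
END;
$$ LANGUAGE plpgsql;

SELECT * FROM quads_w_cities WHERE any_matches(cities, '^corv');
```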
Re: [GENERAL] Form builder?
This is no small task. But that a man's reach should exceed his grasp... All of that being said, if you want to do it yourself, I would still claim that you'd get there a lot faster adopting Andromeda, because all you are really trying to do is embellish what we've already done. The problems with Andromeda are threefold, the last being trivial and yet breaking the camel's back for me:

1. I don't know Andromeda, while I kind of know Python/Tk.
2. Andromeda is not an item in the standard Linux-style toolkit, while Python/Tkinter is part of the system-level installation on most Linuxes and is easily installed on Windows and Macs.
3. I can't get Andromeda to compile on my installation of Gentoo.

I am still figuring out the vagaries of Tkinter, but I think my approach will involve mapping a little language of table names, modifiers, and connecting operators to a sequence of forms. I think I can do a little magic in propagating fields chosen in early forms down to defaults in later forms in a sequence, to get the effect of entering, say, a student and her three classes. I will announce to the list if I get a working prototype. w
Re: [GENERAL] Form builder?
So far, here are the candidates: Andromeda, Lazarus, and Rekall. I was probably fairly inarticulate in my first post, but none of these seem to meet my criteria for automatic generation of forms based on the database definition. Most of the above frameworks have a good deal more functionality than I need, at least at first. Really I want to be able to open, say, ipython and type: Someobject.form(table='sometablename', times=3) (Maybe at the SQL prompt: "> \form name=name times=3") and have it cycle three times through a reasonable (though possibly still imperfect) form for entering three rows in table sometablename. I don't want to do any developing except for finding out the table names in the database. I don't want to drag and drop forms into a visual editor and hook them up with procedures, and any extra processing should be done inside the database via triggers, defaults, etc. (so the system would have to handle rollbacks and notices gracefully). Speed of data entry is the most important thing in the form and form chain itself. I have some ideas for chaining together forms when there are FKs, but I will talk about that later. I think it may be up to me at this point. Would anyone else find this useful?
[GENERAL] Form builder?
I don't want to revisit or be redundant... but is there a quick and dirty and cross-platform system for developing user input forms for Postgres? Ideally, I am interested in something such that you can give it ("it" being something like a Python function) a table name, resulting in a magically appearing input form. It should be able to deal with getting the format more or less correct for a datatype, not displaying system columns, etc., using the system tables to get the necessary info. I thought first about hacking xdialog, but it doesn't have multiple-field forms. I have looked at wx, but real GUI programming is complex and not general. Right now I have an ugly hack that writes out a text file with colon-separated lines for each row, with a defaults option so that you can chain together a set of one-to-many forms (e.g. you enter a person, then you can cycle through library books, each referencing that person, in a for loop). I would rather trade functionality for ease of use, if it can get it 90% correct just by the table name and the defaults. And I refuse to use XML. And I want pure Linux. If I have to develop the darn thing, of course I am happy to share. I want to develop an anthropological fieldwork database, but form-based data entry is important for ease of use. Cheers.
[GENERAL] Wal logs continued...
Earlier I posted with my problems about the WAL logs eating up all my disk space. I tried the solutions offered -- checkpoint after a big copy and shortening the time between flushes. They helped somewhat. Unfortunately, the problem now seems to happen when I vacuum-analyze after a big delete. Even though the vacuum takes more than 2 minutes (the amount of time between flushes that I set in postgresql.conf), the logs continue to grow. Currently the vacuum has been running for about 20 minutes after a delete of about 2,000,000 records out of 5,000,000, and the WAL logs are at about 1.5 G. Versions: RH 6.2, PG 7.1.1 Is there some way to set the number of log files before a flush/checkpoint? We are going to go to a bigger machine next week, but this is quite an inconvenience, and it would probably benefit the DB as a whole to place some limit on the size of the WAL. I would code it myself, but I can't yet (next year, after I finish a couple of Stevens' books). If there is anything else I can do to help, please let me know. Thanks, W
[GENERAL] Max simultaneous users
We have a table with a lot of user sessions (basically end time and length of connection). We would like to query this table to count the max number of simultaneous sessions, but we are stumped on how to do that. The only thing I have been able to think of is to iterate over the entire table and count the number of connections at the beginning and end of each user session, keeping a variable that records the time and max number of sessions at each iteration. We can do this either in Perl or PL/pgSQL, but it seems like it would be *horribly* slow, especially considering we have about 250,000 of these records coming in a day. I also wonder if there might be some intermediate data structure that we could create to make this easy, but I thought I would rely on other people's experience before trying to do it myself... :) Thanks
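The iterate-and-count idea can be pushed into a single query: emit a +1 event at each session start and a -1 at each end, then take the maximum of the running sum. A sketch, assuming a sessions(start_time, end_time) table (with the real schema, start_time would be end_time minus the length) and window functions (8.4+):

```sql
-- Max concurrent sessions via a +1/-1 sweep over session boundaries.
-- Ordering by (t, delta) makes a -1 sort before a +1 at the same
-- instant, so back-to-back sessions do not count as overlapping.
SELECT max(running) AS max_simultaneous
FROM (
    SELECT sum(delta) OVER (ORDER BY t, delta) AS running
    FROM (
        SELECT start_time AS t, +1 AS delta FROM sessions
        UNION ALL
        SELECT end_time   AS t, -1 AS delta FROM sessions
    ) events
) sums;
```

The same sweep works in Perl or plpgsql by selecting the union ordered by time and keeping a running counter and a running max; either way it is one pass over 2N rows rather than N passes over the table.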
[GENERAL] Fixed width COPY
Does anybody know of a good way to COPY a file into a table if the data is in a fixed-width format? Do I just have to write some code with scanf (ick)? For example:

 FIELD DESCRIPTION                       LENGTH  POSITION
 1. Action (E=eff, D=del, P=pend eff)       1      1
 2. Action Date (mmddyy)                    6      2-7
 3. Country Code                            3      8-10
 4. filler                                  1      11
 5. Paired Code                             3      12-14
 6. filler                                  1      15
 7. Country Name                           65      16-80
 8. Footnote code (LERGINFO, Sec 1.1)       1      81
 9. filler                                  9      82-90
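One common workaround that avoids scanf entirely: COPY each line into a one-column staging table, then slice it with substr() using the positions above. A sketch -- the codes table and its column names are invented:

```sql
-- Staging table holds each raw fixed-width line as a single text value.
CREATE TABLE staging (line text);
-- \copy staging from '/path/to/file'   (or COPY ... FROM as superuser)

INSERT INTO codes (action, action_date, country_code, paired_code,
                   country_name, footnote)
SELECT substr(line,  1,  1),         -- Action
       substr(line,  2,  6),         -- Action Date (mmddyy)
       substr(line,  8,  3),         -- Country Code
       substr(line, 12,  3),         -- Paired Code
       trim(substr(line, 16, 65)),   -- Country Name, padding trimmed
       substr(line, 81,  1)          -- Footnote code
FROM staging;
```

One caveat: COPY's default text format treats tabs and backslashes specially, so if those can occur in the file, the load needs a delimiter that never appears in the data.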
Re: [GENERAL] Vacuum analyze keeps hanging (RedHat 6.2, PG 7.03)
I had this problem with 7.0.3, but it cleared up completely with 7.1 W

James Thornton wrote:
> Vacuum analyze keeps hanging here...
>
> NOTICE: --Relation referer_log--
> NOTICE: Pages 529: Changed 1, reaped 509, Empty 0, New 0; Tup 24306: Vac 43000, Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 72, MaxLen 324; Re-using: Free/Avail. Space 5205100/5193540; EndEmpty/Avail. Pages 0/508. CPU 0.03s/0.11u sec.
> NOTICE: Index referer_log_date_idx: Pages 159; Tuples 24306: Deleted 0. CPU 0.01s/0.08u sec.
>
> My system: RedHat 6.2, PostgreSQL 7.0.3, PIII 500 MHz, 320 MB memory
>
> - JT
> James Thornton, http://jamesthornton.com
[GENERAL] WAL Log using all my disk space!
Hi all, The problem: I do a large bulk copy once a day (100,000 records of Radius data), tearing down indices, truncating a large table that contains summary information, and rebuilding everything after the copy. Over the course of this operation, I can generate up to 1.5 gigs of WAL data in pg_xlog. Sometimes (like just now), I will run out of disk space and the postmaster will crash. I try to restart it, and it errors out. Then I delete all the WAL logs, try to restart, and (surprise) it errors out again. I tried to set some of the WAL parameters in postgresql.conf like so:

wal_buffers = 4              # min 4
wal_files = 8                # range 0-64
wal_sync_method = fdatasync  # fsync or fdatasync or open_sync or open_datasync

but I get 24+ separate files. I would like to recover without an initdb, but if that isn't possible, I would definitely like to avoid this problem in the future. Thanks to all
Re: [GENERAL] Database Connect
Aahh! I have a million Perl scripts that rely on this syntax in the DBI to connect to remote databases, like:

$dbh = DBI->connect("dbi:Pg:dbname=datab@remotehost", 'wsprague', 'pass',
                    {RaiseError => 1, AutoCommit => 1})
    or print STDERR "$DBI::errstr";

Umm, do you have any idea what we should do now? Thanks for all the effort that you guys have put into Postgres -- if I had been following pgsql-hackers I probably could have put my 2 cents in earlier. W

Tom Lane wrote:
> > But when I start psql with the command 'psql mybase@myhost' I receive:
> > psql: FATAL 1: Database "mybase@myhost" does not exist in system catalog
>
> This syntax isn't supported.
[GENERAL] NFS mounted DBs and Vacuum
I have a server for which PG_DATA = /home/maxtor, an NFS mounted disk. Normally it works great, but when I try to vacuum, it takes FOREVER. Should I not even try to use remote storage like this? Has anybody else run into a similar problem? Thanks in advance, -- Webb Sprague Programmer O1 Communications
[GENERAL] Invisible tables
Hello all, In my quest to learn PG and SQL programming, I have created tables in a database "foo". I am able to insert, select, etc. just fine, but when I use "\dt" to show them they don't appear. They ARE listed in the system table "pg_tables", however. I have also tried to createdb "test1" after destroying it once, but I get "Sorry, test1 already exists", even though I deleted it from "data/base/". Any ideas on how to fix this? I am running 6.5.3 (the last one before 7.0) on RH 6.0. I have also run out of disk space before this and had to destroy some databases and init them again, if that is leading to this weird behavior. Thanks in advance
[GENERAL] Create Database Error
A couple of days ago I tried to create a database and got an error to the effect that I couldn't do that (nothing more specific). I then tried to mkdir in my DATA directory for a new database (as root), and I got "Can't make directory--not enough space in DIR". I deleted everything and then did initdb again, and it works fine. My tables were almost empty (I am just experimenting), but I never "vacuumed" -- could that be the problem? Thanks in advance (especially to all the people who wrote such an amazing RDBMS in the first place!)