[SQL] a select statement that sometimes joins
Here's a situation I've run into more than once with SQL: I want to select all the rows in a table that match some criteria, where one of the criteria is possibly having a related entry in a second table. For my example, let's say I have a table named 'messages' and another named 'message_attachments'. The former has a primary key of msg_id; the latter also contains msg_id, and has attachment_id as its primary key. This statement shows me all the messages that also have attachments:

  SELECT messages.msg_id, message_attachments.attachment_id
    FROM messages, message_attachments
   WHERE messages.msg_id = message_attachments.msg_id;

But I want a statement that says: "Show me all the messages, and include information about an attachment if they have one." (Let's further assume that a message will have only one attachment.) Is this possible? Anyone like to share an example? Much thanks.

-mark
http://mark.stosberg.com/

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
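What the post is asking for is an outer join: a LEFT JOIN keeps every row from messages and fills the attachment columns with NULL when there is no match. Here's a small sketch of the idea, run through SQLite from Python for illustration (the table names follow the post; the sample data is invented):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE messages (msg_id INTEGER PRIMARY KEY);
CREATE TABLE message_attachments (
    attachment_id INTEGER PRIMARY KEY,
    msg_id INTEGER REFERENCES messages(msg_id)
);
INSERT INTO messages (msg_id) VALUES (1), (2), (3);
INSERT INTO message_attachments (attachment_id, msg_id) VALUES (10, 2);
""")

# LEFT JOIN: every message appears once; attachment_id comes back as
# NULL (None in Python) for messages that have no attachment.
rows = con.execute("""
    SELECT m.msg_id, a.attachment_id
      FROM messages m
      LEFT JOIN message_attachments a ON m.msg_id = a.msg_id
     ORDER BY m.msg_id
""").fetchall()
print(rows)  # [(1, None), (2, 10), (3, None)]
```

The same LEFT JOIN syntax works in Postgres (supported since 7.1; older releases required the implicit-join style shown in the post, which can only express the inner join).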
[SQL] function to format floats as money?
Hello, I'm curious to know if there is a function available in Postgres 7.0.3 (or 7.1) that will format a float-style number as "money". I understand that the money type is going away in the future, so using a float type that is sometimes formatted like money seems like a good alternative. So ideally, I'm looking for a solution that won't go away when the money type does. :) Thanks!

-mark
http://mark.stosberg.com/
Re: [SQL] function to format floats as money?
Peter Eisentraut wrote:
> Mark Stosberg writes:
> > I'm curious to know if there is a function available in Postgres 7.0.3
> > (or 7.1) that will format a float style number as "money". I understand
> > that the money type is going away in the future, so using a float type
> > that is sometimes formatted like money seems like a good alternative. So
> > ideally, I'm looking for a solution that won't go away when the money type
> > does. :) Thanks!
>
> to_char() for formatting. numeric for storage.
>
> Using floats for monetary amounts is not only an extremely bad idea
> because of the inexactness of storage and arithmetic, it might even be
> illegal if you're using it for official purposes.

Thanks Peter. So if the money type is going away, and floats can be illegal, what's the best way to store money?

-mark
personal website: http://mark.stosberg.com/
Summersault Website Development: http://www.summersault.com/
Re: [SQL] function to format floats as money? (removing space padding)
Now that I've figured out that numeric is good for storing money, and that I can format it like this:

  to_char(price, '9,999,999.99') as price

I discovered that sometimes this returns leading spaces I don't want. I can get rid of them like this:

  trim(to_char(price, '9,999,999.99')) as price

Is that the recommended money formatting style, for amounts less than 9,999,999.99? (assuming I'll tack on my own currency symbol?) Are there other general styles that folks like for this? Thanks,

-mark
personal website: http://mark.stosberg.com/
Summersault Website Development: http://www.summersault.com/
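For what it's worth, to_char() also accepts an "FM" (fill mode) prefix in the format mask, e.g. to_char(price, 'FM9,999,999.00'), which suppresses the leading padding without a separate trim(). If the formatting is done in the application instead, the same result is easy to reproduce; here's a sketch in Python using Decimal (the function name is just for illustration):

```python
from decimal import Decimal

def format_money(amount):
    """Group thousands with commas and always show two decimal places,
    with no leading padding -- comparable to
    trim(to_char(price, '9,999,999.99')) in Postgres."""
    return "{:,.2f}".format(Decimal(amount))

print(format_money("1234567.5"))  # 1,234,567.50
print(format_money("0.1"))        # 0.10
```

Using Decimal (like SQL's numeric) rather than float avoids the rounding surprises Peter warned about.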
Re: [SQL] Client/Server Security question
Lonnie Cumberland wrote:
> Hello All,
>
> We are developing an application that will allow our websites to talk to our
> database.
>
> In the interest of security, I am wondering if it is possible to turn off some
> of the functions in the SQL command list such that a user can only communicate
> to the database through our functions.
>
> What I mean is this. We have built a number of "C" extensions and PL/pgSQL
> procedures that will work on our database, but I only want to allow an outside
> query to only one or two of our selected entry points.
>
> The webserver interface query statement might, for example, be able to only
> call "select register_user(...)" or "select login_user()" and NONE of
> the other PostgreSQL command functions.
>
> I only want to allow access to these functions from the outside world, but the
> server needs to be able to execute all of the original functions without
> restrictions.

Lonnie,

Have you checked the Postgres docs on security and access? It offers a lot of flexibility. For example, you can use a different postgres username to access the database from the outside world, in conjunction with using "grant" statements and views to give that user only the ability to perform specific actions on specific tables and views. If after reading the docs you still have specific questions about details that are not clear, send a follow-up post with a more specific question and we can give you a more useful answer. :)

-mark
[SQL] creating tables that are visible across databases
Hello, I'd like to create some tables that would be visible across databases, much like the postgres system tables. These would be for "static" data, such as state and country codes, and geo-spatial data. I couldn't find this mentioned in the docs, but unless this feature of the system tables is magical, it appears to be possible. Did I miss an explanation in some docs, or could someone give me a pointer? Thanks!

-mark
personal website: http://mark.stosberg.com/
Summersault Website Development: http://www.summersault.com/
[SQL] behavior of ' = NULL' vs. MySQL vs. Standards
Hello, I'm a long-time Postgres user who uses MySQL when I have to. I recently ran into an issue with MySQL where this construct didn't do what I expected:

  WHERE date_column = NULL

I expected it to work like "date_column IS NULL", as it does in Postgres 7.0.2, but instead it returned an empty result set. After conversing with some folks on the MySQL list, it was mentioned that:

* "NULL is *NOT* a value. It's an absence of a value, and doing *any* comparisons with NULL is invalid (the result must always be NULL, even if you say "foo = NULL")."
* Postgres's handling is non-standard (even if it's intuitive).

My questions then are: 1.) What IS the standard for handling NULLs? and 2.) If Postgres's handling is different from the standard, what's the reason? To me, having "= NULL" be the same as "IS NULL" is intuitive and thus useful, but I also like the appeal of using standards when possible. :) Thanks!

-mark
http://mark.stosberg.com/
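On question 1: the SQL standard says any comparison with NULL yields "unknown", so "= NULL" never matches a row; "IS NULL" is the standard test for a missing value. (The old Postgres "= NULL" behavior was a compatibility quirk that later releases made optional and off by default.) SQLite implements the standard semantics, so here's a sketch demonstrating the difference from Python:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (date_column TEXT)")
con.executemany("INSERT INTO t VALUES (?)", [("2001-05-01",), (None,)])

# Standard SQL: "= NULL" is never true (the comparison yields unknown),
# so this matches no rows -- the behavior the MySQL list described.
eq_null = con.execute(
    "SELECT count(*) FROM t WHERE date_column = NULL").fetchone()[0]

# "IS NULL" is the standard way to test for the absence of a value.
is_null = con.execute(
    "SELECT count(*) FROM t WHERE date_column IS NULL").fetchone()[0]

print(eq_null, is_null)  # 0 1
```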
Re: [SQL] About table column names.
David BOURIAUD wrote:
> Hi the list!
> As far as I know, column names for a table can't contain any space,
> tabs, and other sort of "exotic" characters.

In fact, I know you can have at least spaces in your column names, like this:

  mark=> create table t ("column one" text);
  CREATE

Just put quotes around them.

> Is there a way to add a
> description of a table column anywhere in postgres tables, or does it
> have to be handled manually by creating a custom table handling this
> kind of data? Thanks in advance for any suggestion.

I'm interested in this, too. It seems more useful than having them in a SQL file... which can sometimes get out of sync with the database. :)

-mark
http://mark.stosberg.com/
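The quoting trick above is standard SQL: double-quoted identifiers may contain spaces and other "exotic" characters, and most engines support them. (For the column-description half of the question, Postgres has a built-in COMMENT ON COLUMN command, and psql's \d+ displays the stored descriptions.) Here's a sketch of the quoted-identifier behavior using SQLite from Python:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Double-quoted identifiers may contain spaces (standard SQL).
con.execute('CREATE TABLE t ("column one" TEXT)')
con.execute('INSERT INTO t ("column one") VALUES (?)', ("hello",))
val = con.execute('SELECT "column one" FROM t').fetchone()[0]
print(val)  # hello
```

Note that once an identifier is created quoted, it must be quoted (with matching case, in Postgres) every time it is referenced.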
Re: [SQL] Re: Help!!! Trying to "SELECT" and get a tree structure back.
--CELKO-- wrote:
> >> The table causing my headache:
>
> CREATE TABLE app_components (
>   id          NUMERIC(7) NOT NULL PRIMARY KEY,
>   name        VARCHAR(100) NOT NULL,
>   description VARCHAR(500) NULL,
>   parent_id   NUMERIC(7) NULL
>               REFERENCES app_components(id)
>               ON DELETE CASCADE,
>   CONSTRAINT appcomp_name_u UNIQUE (name, parent_id));
> <<

I first tried the above approach to model trees in SQL, which also caused me headaches. The recursion needed to find all the ancestors for a given id was slow. So I bought and looked through Joe Celko's book (he recently posted on this topic). I implemented his ideas, and found that they were better than the method above (and faster, as he says), but I still wasn't satisfied. First, I didn't like that the notation wasn't easily parsable for me. Updating and deleting categories felt like hacks, and moving a category seemed like too much work. So I kept looking for new ideas to model trees in SQL.

On my third try, I found a solution I was happy with, which I'll call the "sort key" method. I first read about it here: http://philip.greenspun.com/wtr/dead-trees/53013.htm (search for "Sort keys deserve some discussion" on that page).

The sort key is a single string that gives you the location of a node in a tree. Used in conjunction with a parent_id, I found that most of the questions I was asking were easy to answer: Who is my parent? Who are all my ancestors? Who are my immediate children? How many descendants do I have? Who are my siblings? Furthermore, it's fairly straightforward to manipulate items using this structure, and queries are fast-- most questions can be answered with one SQL statement. Finally, the sort_keys are fairly human-parsable, which is nice.

The trade-off for all these features is that you have a fixed maximum number of immediate children for any parent (based on how many characters are used for each piece of the sort key). I think in my application to categorize data, each parent can only have 62 immediate children. I can live with that.
Cascade is a complete (free) Perl/Postgres application using this scheme, if you are interested in seeing these ideas in action. Its homepage is here: http://summersault.com/software/cascade/ You'll be able to get a demo and source code from there.

Thanks,
-mark
http://mark.stosberg.com/
[SQL] need assistance with multi-row matching expression
Hello, I'm using Postgres 7.1 and have been enjoying the upgraded "sub-select" support in this release versus older versions. At the moment, I'm stuck with a SQL issue that I haven't run into before.

I need to select the data for all the "parks" that match some search criteria. The parks are stored in a "parks" table with park_id as a primary key. Part of the search includes the logic of "match parks that include all these features". The "features" are stored in their own table, and are related to the parks table with a park_feature_map table, which contains a park_id column and a feature_id column. A user can select 0 to N features, and each park might have 0 to N entries in the park_feature_map table.

Where I'm stuck is that I'm used to putting together SQL statements to match a given row. This is different-- to create a successful match for a park_id, I need to match against N rows, where N is the number of feature_ids provided. How do I do that? Can I do it in one query? Thanks!

-mark
http://mark.stosberg.com/
Re: [SQL] need assistance with multi-row matching expression
On Mon, 19 Aug 2002, Nick Fankhauser wrote:
> This may not be the best way, but I couldn't resist taking a shot at it...

Thanks for the response, Nick. If only I had known I was going to get a response from a block away, I would have just come down to say hi. :)

I had an "a ha" moment about this over lunch. I was making the problem much harder than it needed to be, having assured myself I was going to need some advanced SQL feature to solve the problem. Some testing seems to reveal that I can address this problem simply by joining against the park_feature_map table N times. This way I only need to match against 1 row in each of these tables, which is easy in SQL. Here's the statement I tested with for N=2:

  SELECT p.park_id, park_name
    FROM parks p
    JOIN park_feature_map map_4  ON (p.park_id = map_4.park_id  AND map_4.feature_id=4)
    JOIN park_feature_map map_15 ON (p.park_id = map_15.park_id AND map_15.feature_id=15);

In this way, only the parks that match all the features are returned. Thanks again for your help!

-mark
http://mark.stosberg.com/
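The join-per-feature technique above is a classic way to express "match ALL of these" (sometimes called relational division). Here's a runnable sketch of the same query using SQLite from Python (the sample parks are invented):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE parks (park_id INTEGER PRIMARY KEY, park_name TEXT);
CREATE TABLE park_feature_map (park_id INTEGER, feature_id INTEGER);
INSERT INTO parks VALUES (1, 'Riverside'), (2, 'Hilltop'), (3, 'Meadow');
-- Riverside has features 4 and 15; the others have only one each.
INSERT INTO park_feature_map VALUES (1, 4), (1, 15), (2, 4), (3, 15);
""")

# One join per required feature: only parks matching *all* the
# requested features survive every join.
rows = con.execute("""
    SELECT p.park_id, p.park_name
      FROM parks p
      JOIN park_feature_map map_4  ON p.park_id = map_4.park_id  AND map_4.feature_id  = 4
      JOIN park_feature_map map_15 ON p.park_id = map_15.park_id AND map_15.feature_id = 15
""").fetchall()
print(rows)  # [(1, 'Riverside')]
```

An alternative that avoids generating N aliases is a single join with `feature_id IN (...)` plus `GROUP BY p.park_id HAVING count(DISTINCT feature_id) = N`.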
Re: [SQL] Event recurrence - in database or in application code ????
Hello Darrin,

I recently implemented what I would consider the "hard part" of a solution to this using Perl and Postgres. My solution handles multi-day events and recurring events, including events that are both multi-day and recurring. Here's an overview of how I did it:

A table called "calendar" has just one column, "date". I inserted 10,000 rows into the table, one for every day starting a couple of years back and going _way_ into the future. This is so that when I construct a SELECT statement to say "show me every day in May, 2002", I get back a row for every day, regardless of whether or not there was an event.

A second table, "events", holds my events, including an event_id, and start and end dates and times. There is one row for each event, no matter if it recurs or is multi-day.

A third table, "events_calendar", is built based on the "events" table. In this table, a row is inserted for every day that an event occurs. So if an event spans 3 days and occurs a total of 3 times, there are 9 rows added to this table. For recurring events, the start and end dates and times are adjusted to be "local" to this occurrence, not the original start date and time. In addition to the fields contained in the "events" table, the events_calendar table also has a "date" column to denote which date is being referred to.

Now with a simple SELECT statement that joins the calendar table with the events_calendar table, I can easily build a public view of the data with events appearing on as many dates as they should.

On the administrative side, I have a few functions to make this work:
- a function to build the entire events_calendar table initially
- some functions to handle inserting events into events_calendar
- some functions to handle deleting events from events_calendar

When I make an insert in the events table, I run the functions to create the inserts for the events_calendar. When I delete from the events table, the related rows from the events_calendar table get deleted.
When updating the events table, I delete from events_calendar, and then re-insert into it. I'm sure this piece could be done with triggers, but I'm much better at writing Perl, so I did it that way. :)

I've been happy with this solution. I think the Perl turned out to be fairly easy to understand and maintain, the SQL that needs to be used ends up being fairly straightforward, and the performance is good because the selects to view the calendar are fairly simple. The one drawback is that sometime before 2028, I have to remember to add some rows to the calendar table. :)

-mark
http://mark.stosberg.com/

On Tue, 20 Aug 2002, Darrin Domoney wrote:
> One of the features that I am attempting to implement in the system that I
> am building is the capability to schedule events (with recurrence). My
> question to those of you that are more experienced in postgresql is whether
> you would implement this functionality in the database level using triggers
> or at the application code level (PHP).
>
> Ideally I would like to be able to generate an event off a template
> appointment (initial appt) and have it schedule identical appointments
> hourly, daily, weekly, monthly, or by selectable range (ie: first tuesday
> every month). I would also like to have the capability to remove an
> appointment and leave others (much like using C pointers - I think) or to
> remove all (cascading delete).
>
> Any suggestions, etc gratefully appreciated.
>
> Darrin
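A minimal sketch of the calendar/events_calendar join described above, using SQLite from Python (one week of calendar rows and a single 3-day event; the event name is invented, and the post pre-loads ~10,000 days rather than 7):

```python
import sqlite3
from datetime import date, timedelta

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE calendar (date TEXT PRIMARY KEY)")
con.execute("CREATE TABLE events_calendar (event_id INTEGER, date TEXT, name TEXT)")

# One calendar row per day, whether or not anything happens that day.
start = date(2002, 5, 1)
con.executemany("INSERT INTO calendar VALUES (?)",
                [((start + timedelta(days=i)).isoformat(),) for i in range(7)])

# A 3-day event expands to 3 rows in events_calendar.
con.executemany("INSERT INTO events_calendar VALUES (?, ?, ?)",
                [(1, "2002-05-02", "Conference"),
                 (1, "2002-05-03", "Conference"),
                 (1, "2002-05-04", "Conference")])

# LEFT JOIN from calendar guarantees a row for every day, event or not.
rows = con.execute("""
    SELECT c.date, e.name
      FROM calendar c
      LEFT JOIN events_calendar e ON c.date = e.date
     ORDER BY c.date
""").fetchall()
print(rows[0])  # ('2002-05-01', None)
print(rows[1])  # ('2002-05-02', 'Conference')
```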
Re: [SQL] Event recurrence - in database or in application code ????
On 21 Aug 2002, Robert Treat wrote:
> On Tue, 2002-08-20 at 22:15, Mark Stosberg wrote:
> > I've been happy with this solution. I think the Perl turned out to be
> > fairly easy to understand and maintain, the SQL that needs to be used
> > ends up being fairly straightforward, and the performance is good
> > because the selects to view the calendar are fairly simple. The one
> > drawback is that sometime before 2028, I have to remember to add some
> > rows to the calendar table. :)
>
> You need to add rows as well as re-populate a bunch of info for
> recurring dates that are not listed forward right?

Perhaps this will answer your question, Robert-- one point I didn't mention before is that I don't allow events to recur forever; they have to end after some finite number of times. You could add a birthday and tell it to repeat once a year for the next 100 years, for example. I wouldn't have to go and add rows for these later, though-- the rows needed for the next 100 years would already be generated in the events_calendar table. The only thing that "expires" with my solution is the dates in the calendar table. I could make the dates run for the next 100 years just as easily as 28 years; I just figured the system would probably get a significant revamp sometime in the next quarter century. :)

-mark
http://mark.stosberg.com/
Re: [SQL] Separating data sets in a table
On Sun, 25 Aug 2002, Andreas Tille wrote:
> On Sat, 24 Aug 2002, Mark Stosberg wrote:
> > On Thu, 22 Aug 2002, Andreas Tille wrote:
> > > Hello,
> > >
> > > I want to solve the following problem:
> > >
> > > CREATE TABLE Ref    ( Id    int ) ;
> > > CREATE TABLE Import ( Id    int,
> > >                       Other varchar(42),
> > >                       Flag  int,
> > >                       Ts    timestamp ) ;
> > > CREATE TABLE Data   ( Id    int,
> > >                       Other varchar(42) ) ;
> >
> > larger problem. I get the sense that you have data you are importing on a
> > regular basis from outside Postgres, and you want to check it before
> > it gets moved into production, but I'm not exactly sure what's happening.
>
> You are completely right. I just do an import from an external database.
> The person I obtain the data from does an output of the table in a form
> to do a "COPY FROM". The problem is that it might happen that there are
> some data rows which infringe referential integrity and I have to ask
> the data provider for additional data which are referenced by the Id
> mentioned above. So I have to sort out those data sets which have no
> known Id in my production data.

Andreas,

Thanks for the clarification. Here's an idea about how to solve your problem. As you are importing your data, instead of doing it all at once, try importing it a row at a time into a table that has the RI turned on. Check each insert to see if it's successful. If it's not successful, then insert that row into a table that /doesn't/ have RI (maybe "import_failures"), perhaps also including the error that Postgres returned. (With Perl DBI, this is available in $DBI::errstr.) Then when you are done, you can look in import_failures for a report of which rows need some assistance.

If you need every row that's imported into the production table to succeed, you can do all this inside of a transaction, and roll it back if any of the inserts fail. [ thinks for a moment. ] Of course, that would normally roll back your inserts into import_failures too, so perhaps you can use a second database connection to make sure those always happen.

I hope that helps. Perhaps thinking in terms of "row-at-a-time processing" will help you solve your problem.

-mark
http://mark.stosberg.com/
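Here's a sketch of the row-at-a-time idea in Python, using SQLite and its IntegrityError in place of Perl DBI and Postgres (table names follow the thread; the import_failures layout is invented):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite needs this for RI
con.executescript("""
CREATE TABLE Ref (Id INTEGER PRIMARY KEY);
CREATE TABLE Data (Id INTEGER REFERENCES Ref(Id), Other TEXT);
CREATE TABLE import_failures (Id INTEGER, Other TEXT, error TEXT);
INSERT INTO Ref VALUES (1), (2);
""")

incoming = [(1, 'ok'), (99, 'unknown id'), (2, 'ok too')]
for row in incoming:
    try:
        con.execute("INSERT INTO Data VALUES (?, ?)", row)
    except sqlite3.IntegrityError as e:
        # Failed rows land in a side table (no RI), with the error text,
        # for the "which rows need assistance" report.
        con.execute("INSERT INTO import_failures VALUES (?, ?, ?)",
                    row + (str(e),))

good = con.execute("SELECT count(*) FROM Data").fetchone()[0]
bad = con.execute("SELECT Id FROM import_failures").fetchall()
print(good, bad)  # 2 [(99,)]
```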
Re: [SQL] how to refer to tables in another database( or schema)
On Mon, 19 Aug 2002, Stephan Szabo wrote:
> On Mon, 19 Aug 2002, Jiaqing wrote:
> > Hello,
> > I'm still new here and new to PostgreSQL. I'd like to know that after I
> > have created two databases on my site, such as one called backend and
> > another called admin, how do I refer to (query) a table from backend
> > while I'm connected to the admin database? Is it possible to do that in
> > PostgreSQL? Any answer is appreciated.
>
> In addition to previous answers (dblink related), in 7.3 schemas will
> be implemented and you may be able to use one database with two schemas,
> in which case normal sql should work. This isn't out yet, so it's a
> future concern.

One problem space that I think either of these solutions might address is the issue of having static "country code" and "state code" tables reproduced on many databases throughout an installation. Would anyone recommend either of these solutions, or another one, for addressing this issue? I'm not looking forward to the day when a new country appears and I have to find all the places I have country code lists to add it. :)

-mark
http://mark.stosberg.com/
Re: [SQL] recursive sql (using the sort_key method)
In article <[EMAIL PROTECTED]>, <[EMAIL PROTECTED]> wrote:
> can anyone recommend a good reference source for doing recursive sql on
> postgresql? i want to do something similar to a BOM expansion. (i.e. i need
> to traverse a self-referencing table that stores a tree structure and answer
> a question like "Get me A and all of A's descendents")

Floyd,

When building Cascade ( http://summersault.com/software/cascade ), I struggled with a few different models for storing a tree structure in Postgres. Here are some bits of how the system I settled on works. I've been really happy with it, both in terms of performance and in terms of ease of writing queries that make use of it.

 category_id | integer                | not null default nextval('"cas_category_category_id_seq"'::text)
 parent_id   | integer                |
 sort_key    | character varying(255) |

The 'parent_id' is not strictly needed, but makes some queries easier. The 'sort_key' is the real crux of the system. It may be best explained by illustration. Each node in the tree has a two-letter code associated with it. For the root node in the tree, this is 'aa'. Each child node forms its "sort_key" value by taking its parent's value and appending its own.
So the first child of the root node would have 'aaaa', and the second child would have 'aaab'. Here's an actual snapshot of my database using this (from Skatepark.org):

 category_id | parent_id | sort_key |        name
-------------+-----------+----------+---------------------
           0 |           | aa       | Top
          10 |         0 | aaab     | Propaganda
          43 |        10 | aaabaa   | Quotes
          12 |        10 | aaabab   | Presentations
          64 |        10 | aaabac   | Public Parks
          65 |        10 | aaabad   | Private Parks
          66 |        10 | aaabae   | Essays
          67 |        10 | aaabaf   | Letters
          69 |        10 | aaabah   | Surveys
          70 |        10 | aaabai   | Waivers
           4 |        10 | aaabaj   | Legislation
          54 |         4 | aaabajaa | Youth in Politics
          36 |        10 | aaabak   | Statistics
           3 |        10 | aaabal   | Media Coverage
          30 |         3 | aaabalaa | Success Stories
          19 |        10 | aaabam   | Sarcastic Rants
           8 |        10 | aaaban   | Web Services
          37 |         0 | aaag     | Fund-raising
          46 |        37 | aaagaa   | Grants
           9 |         0 | aaai     | Design and Building

Answering a question like "Get me all descendants of the 'Propaganda' category" becomes very easy:

  SELECT category_id, name from cas_category WHERE sort_key like 'aaab%';

By using "LIKE" above, and checking the length of the sort_key, just about any tree-related query becomes easy, especially when you have the parent_id as well. You can look at the Cascade source code for more examples that use this.

The one 'drawback' to this system is that it doesn't support trees of infinite size. If I'm doing my math right, I think the design above 'only' supports 676 children per node. I've never run into that limitation. :) Of course, you could always make each piece of the sort_key longer if you needed to support more children per node.

Mark

> Regards,
>
> Floyd Shackelford
> 4 Peaks Technology Group, Inc.
> VOICE: 334.735.9428
> FAX: 702.995.6462
> EMAIL: [EMAIL PROTECTED]
> ICQ #: 161371538
> PGP Key ID: 0x2E84F2F2
> PGP Fone at private.fwshackelford.com on request
>
> Shackelford Motto: ACTA NON VERBA - Actions, not words
>
> Alabama State Motto: AUDEMUS JURA NOSTRA DEFENDERE - We Dare Defend Our
> Rights
>
> The Philosophy of Liberty: http://www.isil.org/resources/introduction.swf
>
> "We have allowed our constitutional republic to deteriorate into a virtually
> unchecked direct democracy. Today's political process is nothing more than a
> street fight between various groups seeking to vote themselves other
> people's money. Individual voters tend to support the candidate that
> promises them the most federal loot in whatever form, rather than the
> candidate who will uphold the rule of law." --Rep. Ron Paul

--
http://mark.stosberg.com/
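The sort_key scheme above reduces to a couple of LIKE patterns. Here's a runnable sketch using SQLite from Python, loaded with a few rows from the snapshot:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE cas_category
               (category_id INT, parent_id INT, sort_key TEXT, name TEXT)""")
con.executemany("INSERT INTO cas_category VALUES (?, ?, ?, ?)", [
    (0,  None, 'aa',       'Top'),
    (10, 0,    'aaab',     'Propaganda'),
    (43, 10,   'aaabaa',   'Quotes'),
    (4,  10,   'aaabaj',   'Legislation'),
    (54, 4,    'aaabajaa', 'Youth in Politics'),
    (37, 0,    'aaag',     'Fund-raising'),
])

# All descendants of 'Propaganda' (sort_key 'aaab'): one LIKE on the prefix.
desc = con.execute("""
    SELECT name FROM cas_category
     WHERE sort_key LIKE 'aaab%' AND sort_key != 'aaab'
     ORDER BY sort_key
""").fetchall()
print(desc)  # [('Quotes',), ('Legislation',), ('Youth in Politics',)]

# Immediate children only: the prefix plus exactly one more two-letter code.
children = con.execute("""
    SELECT name FROM cas_category
     WHERE sort_key LIKE 'aaab__'
     ORDER BY sort_key
""").fetchall()
print(children)  # [('Quotes',), ('Legislation',)]
```

Note how ORDER BY sort_key also yields a depth-first tree ordering for free.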
Re: [SQL] A tricky sql-query...
On 2003-10-22, Timo <[EMAIL PROTECTED]> wrote:
> You can't have any recursion in a pure sql-query, can you?

It depends on how you think of recursion, I'd say. You can join on the same table a number of times by giving it a different alias each time. You have to manually specify (or generate with application code) all these aliases and joins, though. Sometimes people use this technique to implement tree structures in SQL.

Mark
--
http://mark.stosberg.com/
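A sketch of the unrolled-self-join technique: the "recursion" is written out by hand, one alias per level. Here it finds a grandparent two levels up (SQLite from Python; the schema is invented for illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE node (id INT, parent_id INT)")
# 1 is the root, 2 is its child, 3 is its grandchild.
con.executemany("INSERT INTO node VALUES (?, ?)",
                [(1, None), (2, 1), (3, 2)])

# One alias per level: n -> its parent p -> its grandparent g.
gp = con.execute("""
    SELECT g.id
      FROM node n
      JOIN node p ON n.parent_id = p.id
      JOIN node g ON p.parent_id = g.id
     WHERE n.id = 3
""").fetchone()[0]
print(gp)  # 1
```

The limitation is exactly what the post says: the depth must be fixed in advance (later SQL standards added WITH RECURSIVE for the unbounded case).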
[SQL] Avoiding RI failure with INSERT ... SELECT FROM
Hello! I got an unexpected SQL failure last night, and want to see how to prevent it in the future. This is with 8.1.3. I have a couple of insert statements that run in a transaction block, initializing rows that will be used the next day:

  INSERT into item_hit_log
    SELECT item_id, CURRENT_DATE + CAST('1 day' AS interval), 0
      FROM items where item_state = 'available';

  INSERT into item_view_log
    SELECT item_id, CURRENT_DATE + CAST('1 day' AS interval), 0
      FROM items where item_state = 'available';

The "items" table has a few hundred thousand rows in it, so this takes a bit to run. The error we got last night was:

  ERROR:  insert or update on table "item_hit_log" violates foreign key constraint "item_id_fk"
  DETAIL:  Key (item_id)=(451226) is not present in table "items".

Re-running the transaction block a few minutes later worked. What's an alternate design that avoids this possible error? Thanks!

Mark
[SQL] INSERT ... SELECT FROM .... FOR UPDATE?
Hello, This is a re-phrasing of a question I asked recently, inquiring about an approach I'm considering. My query is:

> INSERT into item_hit_log (item_id, hit_date, hit_count)
>   SELECT item_id, CURRENT_DATE + CAST('1 day' AS interval), 0
>     FROM items where item_state = 'available';

The "items" table has a few hundred thousand rows in it, and is likely to be updated by other queries while this runs. The error we got last night was:

  ERROR:  insert or update on table "item_hit_log" violates foreign key constraint "item_id_fk"
  DETAIL:  Key (item_id)=(451226) is not present in table "items".

Re-running the transaction block a few minutes later worked. From reading the docs, it sounds like "SELECT ... FOR UPDATE" might be the perfect solution here: http://www.postgresql.org/docs/8.1/static/sql-select.html#SQL-FOR-UPDATE-SHARE

Since it's hard to test for this kind of failure, I wanted to get the opinions of others here on whether this would be a good idea. Thanks!

Mark
[SQL] recommended data model for a search?
We have an application that manages RSS feeds for searches. Our first attempt at modeling this was to create a table for the searches with the standard integer primary key generated from a sequence, and then have one column per search key. Each row should represent a unique search. When a new search comes in, we need to look up the key/value pairs received and find the unique search ID that corresponds to them. (I.e., we need to know whether we have an "insert" case or an "update" case, without the benefit of being provided the primary key for the update case.)

The problem with this design is that there are 13 potential search terms, and the "unique" index we would like to create to avoid duplicates and speed up searches would necessarily span all 13 columns. Further complicating the matter is the handling of nulls, which need to be considered as part of the unique signature of a search, but are not easily indexed that way.

Having gotten this far, I'm taking a step back to consider if there's a better way to design this. Perhaps there's a standard solution for this, but I'm not sure where to find it. Here's one idea I've had for a refined design: Each search can be represented as a unique, sorted query string. It would be easy for the application to compute this and then make an MD5 hash of it (or similar). The MD5 hash would then be used as the unique key instead of a standard integer. This would be easily indexable for quick look-ups, and would allow us to create a unique index that doesn't have a problem with null values, too.

Is this a reasonable approach, or is there a better design I'm overlooking?

Thanks,

Mark

--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
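A sketch of the proposed signature scheme in Python (the parameter names are invented, and the NULL placeholder is just one possible convention; the key point is that the canonical string must be sorted and must distinguish "term is NULL" from "term is empty"):

```python
import hashlib
from urllib.parse import urlencode

def search_signature(params):
    """Build a canonical, sorted query string from the search terms and
    hash it. None (SQL NULL) maps to a reserved placeholder so that a
    missing value is part of the signature, unlike in a unique index."""
    canonical = urlencode(sorted((k, '\0' if v is None else v)
                                 for k, v in params.items()))
    return hashlib.md5(canonical.encode()).hexdigest()

a = search_signature({'city': 'Richmond', 'species': 'dog', 'age': None})
b = search_signature({'species': 'dog', 'age': None, 'city': 'Richmond'})
print(a == b)   # True: key order doesn't affect the signature
print(len(a))   # 32 hex characters, easily indexed as a fixed-width key
```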
[SQL] Need help revoking access WHERE state = 'deleted'
We are working on a project to start storing some data as "soft deleted" (WHERE state = 'deleted') instead of hard-deleting it. To make sure that we never accidentally expose the deleted rows through the application, I had the idea to use a view and permissions for this purpose. I thought I could revoke SELECT access to the "entities" table, but then grant SELECT access to a view:

  CREATE VIEW entities_not_deleted AS
    SELECT * FROM entities WHERE state != 'deleted';

We could then find/replace in the code to replace references to the "entities" table with the "entities_not_deleted" view. However, this isn't working; I get "permission denied" when trying to use the view (as the same user that has had their SELECT access removed from the underlying table). We are not stuck on this design. What's a recommended way to solve this problem?

Mark
Re: [SQL] Need help revoking access WHERE state = 'deleted'
On 02/28/2013 01:02 PM, Ben Morrow wrote:
> Quoth m...@summersault.com (Mark Stosberg):
>>
>> We are working on a project to start storing some data as "soft deleted"
>> (WHERE state = 'deleted') instead of hard-deleting it.
>>
>> To make sure that we never accidentally expose the deleted rows through
>> the application, I had the idea to use a view and permissions for this
>> purpose.
>>
>> I thought I could revoke SELECT access to the "entities" table, but then
>> grant SELECT access to a view:
>>
>> CREATE VIEW entities_not_deleted AS SELECT * FROM entities WHERE state
>> != 'deleted';
>>
>> We could then find/replace in the code to replace references to the
>> "entities" table with the "entities_not_deleted" table
>
> (If you wanted to you could instead rename the table, and use rules on
> the view to transform DELETE to UPDATE SET state = 'deleted' and copy
> across INSERT and UPDATE...)
>
>> However, this isn't working, I "permission denied" when trying to use
>> the view. (as the same user that has had their SELECT access removed to
>> the underlying table.)
>
> Works for me. Have you made an explicit GRANT on the view? Make sure
> you've read section 37.4 'Rules and Privileges' in the documentation,
> since it explains the ways in which this sort of information hiding is
> not ironclad.

Thanks for the response, Ben. Here's a "screenshot" of our issue, showing that even an explicit grant on the view doesn't fix things. This is with 9.1.

# Revoke from the underlying table
db=> revoke select on entities from myuser;
REVOKE

# Try selecting through the view
db=> SELECT 1 FROM entities_not_deleted WHERE some_col = 'y';
ERROR:  permission denied for relation entities

# Explicitly grant access to the view.
db=> grant select on entities_not_deleted to myuser;
GRANT

# Try again to use the view. Still fails
db=> SELECT 1 FROM entities_not_deleted WHERE some_col = 'y';
ERROR:  permission denied for relation entities

I've also now read 37.4.
That was helpful, but didn't lead to a breakthrough for me.

Mark

-- 
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] Need help revoking access WHERE state = 'deleted'
On 02/28/2013 02:08 PM, Tom Lane wrote:
> Mark Stosberg writes:
>> # Explicitly grant access to the view.
>> db=> grant select on entities_not_deleted to myuser;
>> GRANT
>>
>> # Try again to use the view. Still fails
>> db=> SELECT 1 FROM entities_not_deleted WHERE some_col = 'y';
>> ERROR:  permission denied for relation entities
>
> What's failing is that the *owner of the view* needs, and hasn't got,
> select access on the entities table. This is a separate check from
> whether the current user has permission to select from the view.
> Without such a check, views would be a security hole.

This was precisely our issue. Thanks, Tom. I changed the owner of the view,
and our approach is working now.

Mark
[SQL] Peer-review requested of soft-delete scheme
Hello,

I'm working on designing a soft-delete scheme for our key entity -- there
are 17 other tables that reference our key table via RI. Let's call the
table "foo".

I understand there are a couple of common design patterns for soft-deletes:

1. Use a trigger to move the rows to a "tombstone table".
2. Add a "deleted flag" to the table.

The "tombstone table" approach is out for us because of all the RI. The
"deleted flag" approach would be a natural fit for us. There's already a
"state" column in the table, and there will only be a small number of rows
in the "soft-deleted" state at a time, as we'll hard-delete them after a few
months. The table has only about 10,000 rows in it anyway.

My challenge is that I want to make it very hard or impossible to access
the soft-deleted rows through SELECT statements. There are lots of SELECT
statements in the system.

My current idea is to rename the "foo" table to something that would stand
out, like "foo_with_deleted_rows". Then we would create a view named "foo"
that would select all the rows except the soft-deleted ones. I think that
would make it unlikely for a developer or reviewer to mess up SELECTs
involving the table. Insert/Update/Delete statements against the table are
few, and could reference the underlying table directly.

Is this sensible? Is there another approach to soft-deletes I should be
considering?

Thanks!

Mark
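The rename-plus-view idea can be sketched end to end. This is a minimal
demo using SQLite via Python's sqlite3 module (not PostgreSQL); the "foo"
and "state" names come from the post above, and the sample rows are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Original table, with a "state" column used for soft-deletes.
cur.execute("CREATE TABLE foo (foo_id INTEGER PRIMARY KEY, state TEXT)")
cur.executemany("INSERT INTO foo (foo_id, state) VALUES (?, ?)",
                [(1, "active"), (2, "deleted"), (3, "active")])

# Rename the real table so its name stands out, then put a view named
# "foo" in front of it that hides the soft-deleted rows.
cur.execute("ALTER TABLE foo RENAME TO foo_with_deleted_rows")
cur.execute("""
    CREATE VIEW foo AS
    SELECT * FROM foo_with_deleted_rows WHERE state != 'deleted'
""")

# Existing SELECTs against "foo" now silently skip deleted rows...
visible = [r[0] for r in cur.execute("SELECT foo_id FROM foo ORDER BY foo_id")]
print(visible)   # [1, 3]

# ...while maintenance code can still reach everything directly.
total = cur.execute("SELECT count(*) FROM foo_with_deleted_rows").fetchone()[0]
print(total)     # 3
```

The same pattern carries over to PostgreSQL, where the view can additionally
be combined with permissions as discussed in the thread above.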
[SQL] GROUP BY on a column which might exist in one of two tables
Hello,

I could use some help figuring out a complex "GROUP BY". As a setup, let's
say I have two tables which each store a single "hit" or "view" per row:

hits
    hit_id
    partner_id

views
    view_id
    partner_id

There is of course a "partners" table with a "partner_id" column.

My target result is more like:

partner_id   total_views   total_hits

Where the totals are counts of the rows in the hits and views tables. There
should be no rows for partners without hits or views.

My first attempt at the SQL for this used GROUP BY
COALESCE(hits.partner_id, views.partner_id), which didn't work. I'm stuck
on what to try next.

Thanks!

Mark
Re: [SQL] GROUP BY on a column which might exist in one of two tables
> select
>   coalesce(h.partner_id, v.partner_id) as partner_id,
>   coalesce(v.count, 0) as total_views,
>   coalesce(h.count, 0) as total_hits
> from
>   (select partner_id, count(*) from hits group by partner_id) as h
>   full outer join
>   (select partner_id, count(*) from views group by partner_id) as v
>   on h.partner_id = v.partner_id
> ;

That looks right. Thanks!
Re: [SQL] GROUP BY on a column which might exist in one of two tables
On Wed, 2008-06-25 at 14:20 +, Greg Sabino Mullane wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: RIPEMD160
>
>> Where the totals are counts of the rows in the hits and views
>> tables. There should be no rows for partners without hits or views.
>
> How about something like this?:
>
> SELECT partner_id, COUNT(view_id) AS total_views, COUNT(hit_id) AS total_hits
> FROM (
>     SELECT partner_id, hit_id, NULL AS view_id FROM hits
>     UNION ALL
>     SELECT partner_id, NULL, view_id FROM views
> ) AS foo
> GROUP BY 1;

That was very helpful, Greg. My new design looks much like this.

Mark
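Greg's UNION ALL approach is easy to verify with a self-contained script.
This sketch uses SQLite via Python's sqlite3 module (not PostgreSQL), with
made-up sample data; COUNT(col) ignores the NULLs contributed by the
"other" branch of the UNION, which is what makes the trick work:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE hits (hit_id INTEGER PRIMARY KEY, partner_id INTEGER)")
cur.execute("CREATE TABLE views (view_id INTEGER PRIMARY KEY, partner_id INTEGER)")
# Partner 1 has only hits, partner 3 has only views, partner 2 has both.
cur.executemany("INSERT INTO hits (partner_id) VALUES (?)", [(1,), (1,), (2,)])
cur.executemany("INSERT INTO views (partner_id) VALUES (?)", [(2,), (3,)])

rows = cur.execute("""
    SELECT partner_id,
           COUNT(view_id) AS total_views,
           COUNT(hit_id)  AS total_hits
    FROM (
        SELECT partner_id, hit_id, NULL AS view_id FROM hits
        UNION ALL
        SELECT partner_id, NULL, view_id FROM views
    ) AS foo
    GROUP BY partner_id
    ORDER BY partner_id
""").fetchall()
print(rows)   # [(1, 0, 2), (2, 1, 1), (3, 1, 0)]
```

Note that partners with neither hits nor views never appear in the result,
matching the requirement in the original question.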
[SQL] Egroupware infolog query slow (includes query plan)
Hello,

I could use some help figuring out how to speed up a query. Below are the
SQL and query plan for a common SELECT done by the open source eGroupware
project.

Now that there are about 16,000 rows in egw_infolog and 32,000 in
egw_infolog_extra, the below query takes about 6 minutes to finish!

I'm not skilled enough at reading the "EXPLAIN ANALYZE" output to
understand what the primary problem is.

Thanks!

Mark

###

SELECT DISTINCT main.*, (
    SELECT count(*) FROM egw_infolog sub
    WHERE sub.info_id_parent = main.info_id
      AND (info_owner=6
           OR ((','||info_responsible||',' LIKE '%,-2,%'
                OR ','||info_responsible||',' LIKE '%,-1,%'
                OR ','||info_responsible||',' LIKE '%,6,%')
               AND info_access='public')
           OR info_owner IN (6)
           OR (info_access='public' AND info_owner IN (6)))
) AS info_anz_subs
FROM egw_infolog main
LEFT JOIN egw_infolog_extra ON main.info_id = egw_infolog_extra.info_id
WHERE (
    (info_owner=6
     OR ((','||info_responsible||',' LIKE '%,-2,%'
          OR ','||info_responsible||',' LIKE '%,-1,%'
          OR ','||info_responsible||',' LIKE '%,6,%')
         AND info_access='public')
     OR info_owner IN (6)
     OR (info_access='public' AND info_owner IN (6)))
    AND info_status <> 'deleted'
)
ORDER BY info_datemodified DESC
LIMIT 15 OFFSET 0

Query plan:

Limit (cost=68624989.18..68624991.31 rows=15 width=1011) (actual time=686260.735..686260.878 rows=15 loops=1)
  -> Unique (cost=68624989.18..68627288.59 rows=16212 width=1011) (actual time=686260.733..686260.857 rows=15 loops=1)
    -> Sort (cost=68624989.18..68625068.47 rows=31716 width=1011) (actual time=686260.730..686260.766 rows=29 loops=1)
      Sort Key: main.info_datemodified, main.info_id, main.info_type, main.info_from, main.info_addr, main.info_subject, main.info_des, main.info_owner, main.info_responsible, main.info_access, main.info_cat, main.info_startdate, main.info_enddate, main.info_id_parent, main.info_planned_time, main.info_used_time, main.info_status, main.info_confirm, main.info_modifier, main.info_link_id, main.info_priority, main.pl_id, main.info_price, main.info_percent, main.info_datecompleted, main.info_location, main.info_custom_from, (subplan)
      -> Merge Left Join (cost=0.00..68594428.95 rows=31716 width=1011) (actual time=21.358..684226.134 rows=32424 loops=1)
        Merge Cond: (main.info_id = egw_infolog_extra.info_id)
        -> Index Scan using egw_infolog_pkey on egw_infolog main (cost=0.00..3025.84 rows=16212 width=1011) (actual time=0.060..135.766 rows=16212 loops=1)
          Filter: (((info_owner = 6) OR (','::text || (info_responsible)::text) || ','::text) ~~ '%,-2,%'::text) OR (((','::text || (info_responsible)::text) || ','::text) ~~ '%,-1,%'::text) OR (((','::text || (info_responsible)::text) || ','::text) ~~ '%,6,%'::text)) AND ((info_access)::text = 'public'::text)) OR (info_owner = 6) OR (((info_access)::text = 'public'::text) AND (info_owner = 6))) AND ((info_status)::text <> 'deleted'::text))
        -> Index Scan using egw_infolog_extra_pkey on egw_infolog_extra (cost=0.00..1546.30 rows=32424 width=4) (actual time=0.025..317.272 rows=32424 loops=1)
        SubPlan
          -> Aggregate (cost=2162.60..2162.61 rows=1 width=0) (actual time=21.073..21.073 rows=1 loops=32424)
            -> Seq Scan on egw_infolog sub (cost=0.00..2122.07 rows=16212 width=0) (actual time=21.065..21.065 rows=0 loops=32424)
              Filter: ((info_id_parent = $0) AND ((info_owner = 6) OR (','::text || (info_responsible)::text) || ','::text) ~~ '%,-2,%'::text) OR (((','::text || (info_responsible)::text) || ','::text) ~~ '%,-1,%'::text) OR (((','::text || (info_responsible)::text) || ','::text) ~~ '%,6,%'::text)) AND ((info_access)::text = 'public'::text)) OR (info_owner = 6) OR (((info_access)::text = 'public'::text) AND (info_owner = 6
Total runtime: 686278.730 ms
Re: [SQL] Egroupware infolog query slow (includes query plan)
I should have mentioned in the last post that PostgreSQL 8.2.9 is in use. I
could upgrade to 8.3.x if that is expected to help performance in this case.

Mark

On Sun, 2008-07-06 at 16:23 -0400, Mark Stosberg wrote:
> Hello,
>
> I could use some help figuring out how to speed up a query. Below is the
> SQL and query plan for a common SELECT done by the open source
> eGroupware project.
>
> [snip]
[SQL] Best practice for long-lived journal tables: bigint or recycling IDs?
Hello,

I have some tables that continually collect statistics, and then over time
are pruned as the stats are aggregated into more useful formats.

For some of these tables, it is foreseeable that the associated sequences
would be incremented past the max value of the "int" type in the normal
course of things.

I see two options to prepare for that:

1. Convert the primary keys to "bigint", which should be good enough
   "forever". I suppose there would be some minor storage and performance
   penalty.

2. Reset the sequence at some point. There would be no "collisions",
   because the older rows would have long been pruned out. I suppose there
   is an improbable edge case in which we restore some old data from tape
   and then are confused because some new data has the same IDs, but as I
   said, these tables are used as temporary holding locations, not
   permanent storage.

Both options have some appeal for me. What have others done?

Mark

-- 
 . . . . . . . . . . . . . . . . . . . . . . . . . . . 
   Mark Stosberg            Principal Developer
   [EMAIL PROTECTED]        Summersault, LLC
   765-939-9301 ext 202     database driven websites
 . . . . . http://www.summersault.com/ . . . . . . . .
[SQL] Re: Best practice for long-lived journal tables: bigint or recycling IDs?
On Tue, 8 Jul 2008 17:20:13 -0400 Alvaro Herrera <[EMAIL PROTECTED]> wrote:

> Mark Stosberg wrote:
> >
> > Hello,
> >
> > I have some tables that continually collect statistics, and then over
> > time are pruned as the stats are aggregated into more useful formats.
> >
> > For some of these tables, it is foreseeable that the associated
> > sequences would be incremented past the max value of the "int" type in
> > the normal course of things.
> >
> > I see two options to prepare for that:
>
> 3. Deal with wraparound by ensuring that the applications behave sanely

Wrap-around?

Exceeding the max size of "int" looks more like a brick wall than
wrap-around to me:

insert into t values (2147483648);
ERROR:  integer out of range

Mark

-- 
 . . . . . . . . . . . . . . . . . . . . . . . . . . . 
   Mark Stosberg            Principal Developer
   [EMAIL PROTECTED]        Summersault, LLC
   765-939-9301 ext 202     database driven websites
 . . . . . http://www.summersault.com/ . . . . . . . .
[SQL] Re: Best practice for long-lived journal tables: bigint or recycling IDs?
> > > 3. Deal with wraparound by ensuring that the applications behave sanely
> >
> > Wrap-around?
> >
> > Exceeding the max size of "int" looks more like a brick wall than
> > wrap-around to me:
> >
> > insert into t values (2147483648);
> > ERROR:  integer out of range
>
> Hmm, you can alter the sequence so that it wraps around at the point it
> reaches INT_MAX. So inserting this number would never actually happen.

Ah, that does look like the best solution. I'll confirm that will work for
our cases.

Thanks!

Mark

> alvherre=# create table t (a serial);
> NOTICE:  CREATE TABLE will create implicit sequence "t_a_seq" for serial
> column "t.a"
> CREATE TABLE
> alvherre=# alter sequence t_a_seq maxvalue 2147483647;
> ALTER SEQUENCE
> alvherre=# alter sequence t_a_seq cycle;
> ALTER SEQUENCE
> alvherre=# select setval('t_a_seq', 2147483645);
>    setval
> ------------
>  2147483645
> (1 row)
>
> alvherre=# insert into t default values;
> INSERT 0 1
> alvherre=# insert into t default values;
> INSERT 0 1
> alvherre=# insert into t default values;
> INSERT 0 1
> alvherre=# insert into t default values;
> INSERT 0 1
> alvherre=# insert into t default values;
> INSERT 0 1
> alvherre=# select * from t;
>      a
> ------------
>  2147483646
>  2147483647
>           1
>           2
>           3
> (5 rows)
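The wraparound behavior of a cycling sequence can be modeled outside the
database. This plain-Python sketch mimics the effect of
CREATE SEQUENCE ... MAXVALUE ... CYCLE (the function and variable names are
mine, not sequence internals):

```python
INT_MAX = 2**31 - 1  # maxvalue matching a plain "int" / serial column


def cycling_sequence(start=1, minvalue=1, maxvalue=INT_MAX):
    """Generator mimicking a sequence declared with MAXVALUE and CYCLE."""
    val = start
    while True:
        yield val
        # On reaching maxvalue, wrap back to minvalue instead of erroring.
        val = minvalue if val == maxvalue else val + 1


# Reproduce Alvaro's demo: start two below INT_MAX and draw four values.
seq = cycling_sequence(start=INT_MAX - 1)
first4 = [next(seq) for _ in range(4)]
print(first4)   # [2147483646, 2147483647, 1, 2]
```

This only works safely when, as in the journal tables described above, rows
near the old end of the range have been pruned before the sequence comes
back around.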
[SQL] Best practices for geo-spatial city name searches?
Hello,

I use PostgreSQL and the "cube" type to perform geo-spatial zipcode
proximity searches. I'm wondering about the best practices for also
supporting a geo-spatial distance search based on a city name rather than a
zipcode.

In our original data model, we used a 'zipcodes' table, with the zipcode as
the primary key. This can of course contain a "City Name" column, but there
is a problem with this, illustrated by the "Nome, Alaska" case.

Nome's zipcode is 99762. It maps to multiple cities, including Diomede,
Alaska and Nome, Alaska. In the data model described, only the "Diomede"
row is imported, and the other rows, including the "Nome, Alaska" row, are
dropped. So if you try to search for Nome, Alaska, you won't find anything.

One solution would be to have a "cities" table, with the city/state as the
primary key, and a zipcode as an additional column. Then, by joining on the
zipcodes table, the coordinates for a city could be found.

Is there any other way I should be considering modelling the data to
support searches on zipcodes and cities?

Thanks!

Mark

-- 
 . . . . . . . . . . . . . . . . . . . . . . . . . . . 
   Mark Stosberg            Principal Developer
   m...@summersault.com     Summersault, LLC
   765-939-9301 ext 202     database driven websites
 . . . . . http://www.summersault.com/ . . . . . . . .
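The proposed cities-table model can be sketched with SQLite via Python's
sqlite3 module (not PostgreSQL); the lat/lon column names and the
coordinate values here are made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""CREATE TABLE zipcodes (
    zipcode TEXT PRIMARY KEY, lat REAL, lon REAL)""")
cur.execute("""CREATE TABLE cities (
    city TEXT, state TEXT, zipcode TEXT REFERENCES zipcodes,
    PRIMARY KEY (city, state))""")

# One zipcode shared by two cities -- the "Nome, Alaska" problem.
cur.execute("INSERT INTO zipcodes VALUES ('99762', 64.5, -165.4)")
cur.executemany("INSERT INTO cities VALUES (?, ?, ?)",
                [("Nome", "AK", "99762"), ("Diomede", "AK", "99762")])

# A city-name search resolves to coordinates through the join, so both
# Nome and Diomede are findable even though they share a zipcode.
row = cur.execute("""
    SELECT z.lat, z.lon
    FROM cities c JOIN zipcodes z ON c.zipcode = z.zipcode
    WHERE c.city = 'Nome' AND c.state = 'AK'
""").fetchone()
print(row)   # (64.5, -165.4)
```

With the coordinates in hand, the existing cube-based proximity search can
proceed exactly as it does for a plain zipcode lookup.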