Re: [GENERAL] full text search on hstore or json with materialized view?
On Tue, 18 Apr 2017 14:38:15 -0700, Rj Ewing wrote:

> I am evaluating postgres as a datastore for our webapp. We are moving away from a triple store db due to performance issues.
>
> Our data model consists of sets of user defined attributes. Approx 10% of the attributes tend to be 100% filled, with 50% of the attributes approx 25% filled. This is fairly sparse data, and it seems that jsonb or hstore will be best for us.
>
> Unfortunately, from my understanding, postgres doesn't support fulltext search across hstore or jsonb key:values or even the entire document. While this is not a deal breaker, this would be a great feature to have. We have been experimenting w/ elasticsearch a bit, and particularly enjoy this feature, however we don't really want to involve the complexity and overhead of adding elasticsearch in front of our datasource right now.

hstore and JSON values all really are just formatted text with a custom column type. You can create tsvectors from the values if you cast them to text. Note that a tsvector can only work on a /flat/ key:value structure: it won't understand nesting, and even with a flat store it won't understand the difference between keys/tags and the associated values. E.g., you will be able to see that a value contains both "foo" and "bar", but to distinguish 'foo:bar' from 'bar:foo' or 'foo:q,bar:z' you either must check the token positions (from the tsvector) or *try* to extract the key(s) you are interested in and check the associated value(s).

This might work ok if you search only for keys in a "document" ... but trying to search values, I think, would be far too complicated. It might help if you stored a 2D array instead of a flat structure, but even that would be fairly complicated to work with.

> An idea that has come up is to use a materialized view or secondary table with triggers, where we would have 3 columns (id, key, value).
>
> I think this would allow us to store a tsvector and gin index.
> Giving us the ability to use fulltext search on k:v pairs, then join the original data on the id field to return the entire record.

This is a much better idea because it separates the key from the value, and unlike the full "document" case [above], you will know that the FTS index is covering only the values. If you need to preserve key order to reconstruct records, you will need an additional column to maintain that ordering.

> is anyone currently doing this? Is there a better alternative? Any performance issues that immediately jump out (I realize the writes will take longer)?
>
> the nature of our data is "relatively" static with bulk uploads (100 - 1000 records). So we can sacrifice some write performance.
>
> RJ

Having to "reconstruct" records will make reads take longer as well, but I think separating the keys and values is the best way to do it.

YMMV,
George

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
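For what it's worth, the key/value side-table approach discussed in this thread could be sketched roughly as follows. All table, column, trigger, and index names here are hypothetical, and the 'english' text search configuration is just an example; treat this as an untested sketch, not the poster's actual schema:

```sql
-- Side table holding one row per (record, key) pair.
CREATE TABLE record_kv (
    id    bigint  NOT NULL,   -- joins back to the original record
    ord   integer NOT NULL,   -- preserves key order for reconstruction
    key   text    NOT NULL,
    value text,
    tsv   tsvector
);

CREATE INDEX record_kv_tsv_idx ON record_kv USING gin (tsv);

-- Keep tsv in sync on insert/update.
CREATE OR REPLACE FUNCTION record_kv_tsv_update() RETURNS trigger AS $$
BEGIN
    NEW.tsv := to_tsvector('english', coalesce(NEW.value, ''));
    RETURN NEW;
END
$$ LANGUAGE plpgsql;

CREATE TRIGGER record_kv_tsv_trg
    BEFORE INSERT OR UPDATE ON record_kv
    FOR EACH ROW EXECUTE PROCEDURE record_kv_tsv_update();

-- Full text search over values only, then join back to the full records.
SELECT r.*
FROM records r
JOIN (SELECT DISTINCT id
      FROM record_kv
      WHERE tsv @@ to_tsquery('english', 'foo')) m USING (id);
```

Because the tsvector is built from the value column alone, the index covers only values, which is exactly the property George describes above.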
Re: [GENERAL] Large data and slow queries
Thanks John. Yes, you are absolutely right: you want the index to be bottom heavy so you can cull as much as possible at the top. I'm familiar with that principle from once implementing a brute-force sudoku solver.

I've been working on this all afternoon. By reducing the longitude and latitude columns to float4, I found about a 50% improvement in performance in my test cases. It may also use less space. So part of the problem was my choice of data type. We've computed that float4 has a worst case precision of about 1.6m, which we are okay with for analytics data.

Another option we may consider is using a (signed) integer - e.g. longitude = 180*(v/2^31) and latitude = 180*(v/2^31) - as this has a uniform error across all points, but it's a bit more cumbersome to handle. Is there a rational datatype in postgres which works like this?

On 19 April 2017 at 16:42, John R Pierce wrote:
> On 4/18/2017 9:01 PM, Samuel Williams wrote:
>> We want the following kinds of query to be fast:
>>
>> SELECT ... AND (latitude > -37.03079375089291 AND latitude < -36.67086424910709 AND longitude > 174.6307139779924 AND longitude < 175.0805140220076);
>
> I wonder if GIST would work better if you use the native POINT type, and compared it like
>
> mypoint <@ BOX '((174.6307139779924,-37.03079375089291),(175.0805140220076,-36.67086424910709))'
>
> with a gist index on mypoint...
>
> but, it all hinges on which clauses in your query are most selective, that's where you want an index.
>
> --
> john r pierce, recycling bits in santa cruz
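A rough back-of-the-envelope check of that ~1.6 m worst-case figure, plus the scaled-integer scheme mentioned above. This is a sketch: the 111.32 km/degree value is the equatorial figure, and the exact number depends on how you measure error (spacing vs. rounding, which is half the spacing):

```python
import math

# Worst-case spacing (ulp) of a 32-bit float at the largest longitude
# magnitude, |lon| ~= 180 degrees. float4 has a 24-bit significand
# (23 explicit bits), so the spacing at x is 2**(floor(log2(|x|)) - 23).
lon = 180.0
ulp_degrees = 2.0 ** (math.floor(math.log2(lon)) - 23)

# One degree of longitude at the equator is roughly 111.32 km.
METERS_PER_DEGREE = 111_320.0
worst_err_m = ulp_degrees * METERS_PER_DEGREE   # ~1.7 m spacing;
# rounding to the nearest representable value is at most half of that.

# The scaled-integer alternative: lon = 180 * (v / 2**31), a uniform
# step of 180/2**31 ~= 8.4e-8 degrees (~9 mm at the equator).
lon2 = 174.6307139779924
v = round(lon2 / 180.0 * 2**31)   # encode into a signed 32-bit range
decoded = 180.0 * (v / 2**31)     # decode

print(f"float4 worst-case spacing near 180 deg: {worst_err_m:.2f} m")
print(f"integer roundtrip error: {abs(decoded - lon2):.2e} degrees")
```

Postgres has no built-in scaled-rational type like this, so the encode/decode would have to live in the application or in SQL functions.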
Re: [GENERAL] Large data and slow queries
On 4/18/2017 9:01 PM, Samuel Williams wrote:
> We want the following kinds of query to be fast:
>
> SELECT ... AND (latitude > -37.03079375089291 AND latitude < -36.67086424910709 AND longitude > 174.6307139779924 AND longitude < 175.0805140220076);

I wonder if GIST would work better if you use the native POINT type, and compared it like

mypoint <@ BOX '((174.6307139779924,-37.03079375089291),(175.0805140220076,-36.67086424910709))'

with a gist index on mypoint...

but, it all hinges on which clauses in your query are most selective; that's where you want an index.

--
john r pierce, recycling bits in santa cruz
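A hedged sketch of what the suggestion above might look like against the table from this thread (column names are assumed from the quoted query; untested):

```sql
-- Assumes user_event has float latitude/longitude columns, per the thread.
-- point is (x, y), i.e. (longitude, latitude).
ALTER TABLE user_event ADD COLUMN pos point;
UPDATE user_event SET pos = point(longitude, latitude);

CREATE INDEX user_event_pos_gist ON user_event USING gist (pos);

-- Bounding-box search; box corners are (lon, lat) pairs.
SELECT count(*)
FROM user_event
WHERE pos <@ box '((174.6307139779924,-37.03079375089291),(175.0805140220076,-36.67086424910709))';
```

Whether the planner prefers this over the btree still depends on selectivity, as John notes.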
Re: [GENERAL] Large data and slow queries
Oh, I've also tried earth distance and ll_to_earth in a GIST index... it was slower than the BTREE index on a small subset of data in my tests.

On 19 April 2017 at 16:01, Samuel Williams wrote:
> Hi.
>
> We have 400,000,000 records in a table (soon to be 800,000,000), here is the schema (\d+):
>
> https://gist.github.com/ioquatix/bddda36d9e4ffaceb7a62d7b62259121
>
> We want the following kinds of query to be fast:
>
> SELECT DISTINCT "user_event"."user_id" FROM "user_event" WHERE "user_event"."what" IN ('poll', 'location_change', 'midnight_location') AND ("user_event"."created_at" >= '2016-04-19 01:23:55') AND (latitude > -37.03079375089291 AND latitude < -36.67086424910709 AND longitude > 174.6307139779924 AND longitude < 175.0805140220076);
>
> We have a btree index and it appears to be working. However, it's still pretty slow.
>
> EXPLAIN ANALYZE gives the following:
> https://gist.github.com/ioquatix/bddda36d9e4ffaceb7a62d7b62259121#gistcomment-2065314
>
> I'm thinking that I need to do the following to help:
>
> CLUSTER user_event ON index_user_event_for_visits_3 followed by analyze... Our data is mostly time series but sometimes we get some dumps with historical records.
>
> Perhaps add a BRIN index on created_at.
>
> I'm wondering if we can use an index to cache all user_ids seen on a given day. If we change our index to be more discrete, e.g. created_at::date, would this help? The set union of user_ids for 365 days should be pretty fast?
>
> I'm open to any ideas or suggestions; ideally we can keep optimisations within the database, rather than adding a layer of caching on top.
>
> Kind regards,
> Samuel
[GENERAL] Large data and slow queries
Hi.

We have 400,000,000 records in a table (soon to be 800,000,000), here is the schema (\d+):

https://gist.github.com/ioquatix/bddda36d9e4ffaceb7a62d7b62259121

We want the following kinds of query to be fast:

SELECT DISTINCT "user_event"."user_id" FROM "user_event" WHERE "user_event"."what" IN ('poll', 'location_change', 'midnight_location') AND ("user_event"."created_at" >= '2016-04-19 01:23:55') AND (latitude > -37.03079375089291 AND latitude < -36.67086424910709 AND longitude > 174.6307139779924 AND longitude < 175.0805140220076);

We have a btree index and it appears to be working. However, it's still pretty slow.

EXPLAIN ANALYZE gives the following:
https://gist.github.com/ioquatix/bddda36d9e4ffaceb7a62d7b62259121#gistcomment-2065314

I'm thinking that I need to do the following to help:

CLUSTER user_event ON index_user_event_for_visits_3 followed by analyze... Our data is mostly time series but sometimes we get some dumps with historical records.

Perhaps add a BRIN index on created_at.

I'm wondering if we can use an index to cache all user_ids seen on a given day. If we change our index to be more discrete, e.g. created_at::date, would this help? The set union of user_ids for 365 days should be pretty fast?

I'm open to any ideas or suggestions; ideally we can keep optimisations within the database, rather than adding a layer of caching on top.

Kind regards,
Samuel
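The two maintenance ideas above (CLUSTER on the existing index, plus a BRIN index on created_at) would look something like the following; the BRIN index name is invented, and the lock warning matters at this table size:

```sql
-- BRIN is tiny and works well when created_at correlates with the
-- physical row order (mostly time-series inserts, as described above).
CREATE INDEX user_event_created_brin ON user_event USING brin (created_at);

-- Physically reorder the table by the existing btree index, then refresh
-- planner statistics. Note that CLUSTER takes an ACCESS EXCLUSIVE lock
-- and rewrites the whole table, which is significant at 400M rows.
CLUSTER user_event USING index_user_event_for_visits_3;
ANALYZE user_event;
```

CLUSTER is also a one-time reorder: rows inserted afterwards are not kept in index order, so the historical dumps mentioned above would gradually erode the correlation.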
[GENERAL] Experiencing "compressed data is corrupted" error
Hello,

I have a massive table in my database, about 1.5TB and around 3.5 billion rows (that is one partition of a massive table). When I try to do a backup of just this one big table, after a while I get the following error:

pg_dump: Dumping the contents of table "ta_low" failed: PQgetResult() failed.
pg_dump: Error message from server: ERROR: compressed data is corrupted
pg_dump: The command was: COPY public.ta_low (foreign_id, name, numbers) TO stdout;

where foreign_id is bigint, name is varchar(10) with extended storage, and numbers is a quite long array of ints, also extended storage obviously.

The table works normally; I guess there will be a small number of rows that are corrupted. I tried using the approach described here:

https://no0p.github.io/postgresql/2013/04/02/postgres-corruption-resolution.html

However, upon creating the checking function and running the query:

select ctid from ta_low where chk(ta_low);

I get another error, despite the exception handling in the function:

ERROR: compressed data is corrupted
CONTEXT: PL/pgSQL function chk(anyelement) while storing call arguments into local variables
** Error **
ERROR: compressed data is corrupted
SQL state: XX000
Context: PL/pgSQL function chk(anyelement) while storing call arguments into local variables

I want to back up this table and possibly replace the corrupted rows, although it is not top priority, and if single digits of rows out of the billions are not working it's not a big issue. I'm using PostgreSQL 9.5.

Any advice on how to find/replace/delete the corrupted rows, or how to back up the table, is appreciated. Thank you.
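The chk(ta_low) call fails because the row is detoasted while being passed as an argument, before the function's exception handler is in scope. One workaround people use is to do the detoasting inside the exception block instead, probing tuples by ctid. A rough, untested sketch (function name hypothetical; some corruption raises errors PL/pgSQL cannot catch, and probing 3.5 billion tuples this way is very slow, so in practice you would narrow the ctid range first):

```sql
CREATE OR REPLACE FUNCTION find_bad_tuples(tbl regclass)
RETURNS SETOF tid AS $$
DECLARE
    nblocks bigint;
    blk     bigint;
    itm     int;
    dummy   text;
BEGIN
    SELECT pg_relation_size(tbl) / current_setting('block_size')::bigint
        INTO nblocks;
    FOR blk IN 0 .. nblocks - 1 LOOP
        FOR itm IN 1 .. 291 LOOP   -- MaxHeapTuplesPerPage for 8 kB pages
            BEGIN
                -- Casting the row to text forces detoasting, so corrupted
                -- compressed data is actually read -- inside this handler.
                EXECUTE format(
                    'SELECT t::text FROM %s t WHERE ctid = ''(%s,%s)''',
                    tbl, blk, itm) INTO dummy;
            EXCEPTION WHEN OTHERS THEN
                RETURN NEXT format('(%s,%s)', blk, itm)::tid;
            END;
        END LOOP;
    END LOOP;
END
$$ LANGUAGE plpgsql;
```

Once the bad ctids are known, the usual (lossy) cleanup is DELETE FROM ta_low WHERE ctid = '(blk,itm)' for each, after which pg_dump should complete.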
Re: [GENERAL] streaming replication and archive_status
On Tue, Apr 18, 2017 at 5:20 AM, Luciano Mittmann wrote:
> Hi Jeff,
>
>> Does each file in pg_xlog/archive_status/ have a corresponding file one directory up?
>
> no corresponding file on pg_xlog directory. That is the question.. for some reason or some parameter that I do not know, the files are considered consumed but are not erased later.

I can see how a well-timed crash could leave behind a few .done files, but not 75 thousand of them. Are they still accumulating, or was it only a historical accumulation?

Also, is this on Windows?

Cheers,

Jeff
Re: [GENERAL] full text search on hstore or json with materialized view?
A step in the right direction for me, however it doesn't appear to support per-field full text searching. It is exciting though!

On Tue, Apr 18, 2017 at 3:00 PM, Bruce Momjian wrote:
> On Tue, Apr 18, 2017 at 02:38:15PM -0700, Rj Ewing wrote:
>> I am evaluating postgres as a datastore for our webapp. We are moving away from a triple store db due to performance issues.
>>
>> Our data model consists of sets of user defined attributes. Approx 10% of the attributes tend to be 100% filled, with 50% of the attributes approx 25% filled. This is fairly sparse data, and it seems that jsonb or hstore will be best for us.
>>
>> Unfortunately, from my understanding, postgres doesn't support fulltext search across hstore or jsonb key:values or even the entire document. While this is not a deal breaker, this would be a great feature to have. We have been experimenting w/ elasticsearch a bit, and particularly enjoy this feature, however we don't really want to involve the complexity and overhead of adding elasticsearch in front of our datasource right now.
>
> Full text search of JSON and JSONB data is coming in Postgres 10, which is to be released in September of this year:
>
> https://www.depesz.com/2017/04/04/waiting-for-postgresql-10-full-text-search-support-for-json-and-jsonb/
>
> --
> Bruce Momjian  http://momjian.us
> EnterpriseDB   http://enterprisedb.com
>
> + As you are, so once was I. As I am, so you will be. +
> + Ancient Roman grave inscription +
Re: [GENERAL] full text search on hstore or json with materialized view?
On Tue, Apr 18, 2017 at 02:38:15PM -0700, Rj Ewing wrote:
> I am evaluating postgres as a datastore for our webapp. We are moving away from a triple store db due to performance issues.
>
> Our data model consists of sets of user defined attributes. Approx 10% of the attributes tend to be 100% filled, with 50% of the attributes approx 25% filled. This is fairly sparse data, and it seems that jsonb or hstore will be best for us.
>
> Unfortunately, from my understanding, postgres doesn't support fulltext search across hstore or jsonb key:values or even the entire document. While this is not a deal breaker, this would be a great feature to have. We have been experimenting w/ elasticsearch a bit, and particularly enjoy this feature, however we don't really want to involve the complexity and overhead of adding elasticsearch in front of our datasource right now.

Full text search of JSON and JSONB data is coming in Postgres 10, which is to be released in September of this year:

https://www.depesz.com/2017/04/04/waiting-for-postgresql-10-full-text-search-support-for-json-and-jsonb/

--
Bruce Momjian  http://momjian.us
EnterpriseDB   http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
[GENERAL] full text search on hstore or json with materialized view?
I am evaluating postgres as a datastore for our webapp. We are moving away from a triple store db due to performance issues.

Our data model consists of sets of user defined attributes. Approx 10% of the attributes tend to be 100% filled, with 50% of the attributes approx 25% filled. This is fairly sparse data, and it seems that jsonb or hstore will be best for us.

Unfortunately, from my understanding, postgres doesn't support fulltext search across hstore or jsonb key:values or even the entire document. While this is not a deal breaker, this would be a great feature to have. We have been experimenting w/ elasticsearch a bit, and particularly enjoy this feature, however we don't really want to involve the complexity and overhead of adding elasticsearch in front of our datasource right now.

An idea that has come up is to use a materialized view or secondary table with triggers, where we would have 3 columns (id, key, value). I think this would allow us to store a tsvector and gin index, giving us the ability to use fulltext search on k:v pairs, then join the original data on the id field to return the entire record.

Is anyone currently doing this? Is there a better alternative? Any performance issues that immediately jump out (I realize the writes will take longer)?

The nature of our data is "relatively" static, with bulk uploads (100 - 1000 records). So we can sacrifice some write performance.

RJ
Re: [GENERAL] tuple statistics update
Tom DalPozzo writes:
> Hi, I'm using libpq to insert tuples in my table and keep looking at statistics through psql instead.
> I noticed that sometimes n_tuple_ins is not updated even after 1 min that my transaction committed.

If your session is just sitting, that's not surprising. I think stats updates are only transmitted to the collector at transaction end (and even then, only if it's been at least N msec since the last transmission from the current session).

regards, tom lane
Re: [GENERAL] tuple statistics update
On 04/17/2017 09:18 AM, Tom DalPozzo wrote:
> Hi, I'm using libpq to insert tuples in my table and keep looking at statistics through psql instead. I noticed that sometimes n_tuple_ins is not updated even after 1 min that my transaction committed. My libpq connection is kept alive. If I close the connection then the stats get updated. I know that stats are not instantaneous, but I thought that after a while that a transaction is committed it would be updated.

Any of this apply?:

https://www.postgresql.org/docs/9.6/static/monitoring-stats.html

"Another important point is that when a server process is asked to display any of these statistics, it first fetches the most recent report emitted by the collector process and then continues to use this snapshot for all statistical views and functions until the end of its current transaction. So the statistics will show static information as long as you continue the current transaction. Similarly, information about the current queries of all sessions is collected when any such information is first requested within a transaction, and the same information will be displayed throughout the transaction. This is a feature, not a bug, because it allows you to perform several queries on the statistics and correlate the results without worrying that the numbers are changing underneath you. But if you want to see new results with each query, be sure to do the queries outside any transaction block. Alternatively, you can invoke pg_stat_clear_snapshot(), which will discard the current transaction's statistics snapshot (if any). The next use of statistical information will cause a new snapshot to be fetched."

> Regards Pupillo

--
Adrian Klaver
adrian.kla...@aklaver.com
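To illustrate the paragraph quoted above: inside a single transaction the statistics snapshot stays frozen until you discard it ('mytable' here is a placeholder table name):

```sql
BEGIN;
SELECT n_tup_ins FROM pg_stat_user_tables WHERE relname = 'mytable';
-- Repeating the same query here returns identical numbers: the
-- snapshot taken at first access is reused for the whole transaction.
SELECT pg_stat_clear_snapshot();  -- discard the frozen snapshot
SELECT n_tup_ins FROM pg_stat_user_tables WHERE relname = 'mytable';
COMMIT;
```

Outside a transaction block, each query fetches a fresh snapshot on its own.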
[GENERAL] tuple statistics update
Hi, I'm using libpq to insert tuples in my table and keep looking at statistics through psql instead. I noticed that sometimes n_tuple_ins is not updated even 1 min after my transaction committed. My libpq connection is kept alive. If I close the connection then the stats get updated. I know that stats are not instantaneous, but I thought that a while after a transaction is committed they would be updated.

Regards
Pupillo
[GENERAL] [OT] Help: stories of database security and privacy
Hi folks, in a few weeks I'll start a short course on the basics of database security for a group of high-school students with a background in elementary relational theory and SQL. I plan to explain the usage of grant/revoke, RBAC, DAC, and inference in statistical databases.

I'd like to take the opportunity to also engage students about the topic of privacy (or lack thereof). So, I am here to ask if you have interesting/(in)famous stories to share on database security/privacy "gone wrong" or "done right"(tm), possibly with technical details (not necessarily to share with the students, but for me to understand the problems).

I am asking on this list because I will use PostgreSQL, so maybe I can collect ideas that I can implement or demonstrate in practice.

Thanks in advance,
Life.
Re: [GENERAL] # of connections and architecture design
On Tue, Apr 18, 2017 at 2:42 AM, Moreno Andreo wrote:
> Hi all,
> As many of you read last Friday (and many tried to help, too, and I still thank you very much), I had a bad service outage. I was pointed to reduce the maximum number of connections using a pooler, and that's what I'm building in a test lab, but I'm wondering if there's something I can do with my overall architecture design.
>
> ATM we host one database per customer (about 400 now) and every customer has two points of access to data:
> - Directly to the database, via rubyrep, to replicate the database he has on his own machine
> - Via WCF self-hosted web services to read other customers' data
>
> Every customer can access (and replicate) his database from a number of different positions (max 3). Customers are organized in groups (max 10 per group), and there is the chance that someone accesses someone else's data via WCF. For example, pick a group of 5: everyone running rubyrep with only one position enabled, and getting data from the others' databases. If I'm not wrong, that's 5 connections (rubyrep) plus 5*4 (everyone connecting to everyone else's database) for WCF, so 25 connections. Now imagine a group of 10.
>
> Last Friday I was told that 350 connections is quite a big number and things can begin to slow down. OK. When something slows down I'm used to searching for and finding the bottleneck (CPU, RAM, IO, etc). If everything was running apparently fine (CPU < 10%, RAM used < 20%, I/O rarely over 20%), how can I say there's a bottleneck that's slowing things down? Am I missing something?
>
> Another thing is that on a customer server (with a similar, smaller architecture) I _do_ have a connection leak problem that's under investigation, but when things begin to slow down I simply run pg_terminate_backend on all connections with an age > 10 min and everything goes back to normal. On my server, last Friday, it did not help, so I thought that was not the main problem.
> I've got no problems in splitting this architecture in how many servers I > need, but I think I need some tips on how to design this, in order to avoid > major issues in the near future (ask for details if needed). > > The current PostgreSQL 9.5.6 server is an 8 core VM with 52 GB RAM and > Debian 8. > WCF server is Windows 2012 R2 4-core, 16 GB RAM. > > While facing the issue none of them showed up any kind of overload and > their logs were clean. > > I'm a bit scared it can happen again. > The logs being clean doesn't help much, if your log settings are set to be too terse. Is log_lock_waits on? log_checkpoints? track_io_timing (doesn't show up in the logs, you have to query database views)? Is log_min_duration_statement set to a reasonable value? log_autovacuum_min_duration? Are you using pg_stat_statement (also doesn't show up in the logs, you have to query it), and perhaps auto_explain? Cheers, Jeff
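For reference, the settings Jeff lists might be enabled along these lines in postgresql.conf. The thresholds shown are illustrative starting points, not recommendations tuned for this workload:

```
log_lock_waits = on                  # log waits longer than deadlock_timeout
log_checkpoints = on
track_io_timing = on                 # exposes I/O time in pg_stat views
log_min_duration_statement = 250ms   # log statements slower than this
log_autovacuum_min_duration = 0      # log every autovacuum run
shared_preload_libraries = 'pg_stat_statements'   # requires a server restart
```

pg_stat_statements additionally needs CREATE EXTENSION pg_stat_statements in the database before its view becomes queryable.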
[GENERAL] With the password stored a another database, how to securely connecting to server
Hi folks

I have a question about secure connections. I'm designing a local app with a micro database in h2 (it is a Java app); this app connects to PostgreSQL with many users. I want to store the password encrypted with md5 in the micro db. But my question is: with this encrypted password, how do I securely connect to the server? What do you recommend here? Should I write a .pgpass file from the app with the user data? Or what is the best practice for this?

Thanks for your help!

-
Dame un poco de fe, eso me bastará.
Rozvo Ware Solutions

--
View this message in context: http://www.postgresql-archive.org/With-the-password-stored-a-another-database-how-to-securely-connecting-to-server-tp5956994.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.
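One caveat on the .pgpass idea: libpq expects the plaintext password in that file (it performs the md5 challenge-response itself during authentication), so an md5 digest stored in your micro db cannot be pasted in directly. The file format, with illustrative values, is:

```
# ~/.pgpass (or %APPDATA%\postgresql\pgpass.conf on Windows)
# hostname:port:database:username:password
db.example.com:5432:appdb:appuser:s3cret
```

On Unix, libpq ignores the file unless its permissions are 0600 (chmod 0600 ~/.pgpass). Each field may also be * to match anything.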
Re: [GENERAL] QGIS Loads Black Screen For PostGIS Out-Db Raster Data
On 04/18/2017 07:37 AM, Osahon Oduware wrote:
> I forgot to mention that I have put the question on the qgis-user mailing list. Thanks for your help.

Great, because I don't have a clue where to go from here.

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: [GENERAL] QGIS Loads Black Screen For PostGIS Out-Db Raster Data
I forgot to mention that I have put the question on the qgis-user mailing list. Thanks for your help. On Tue, Apr 18, 2017 at 3:31 PM, Osahon Oduware wrote: > Hi Adrian, > > I think it is a problem with QGIS as things seems fine from the PostGIS > end (raster data) going by the following queries: > > *SELECT r_table_name, r_raster_column, out_db FROM raster_columns;* > returned the following output: > "test_rast","rast","[True, True, True]" > > *SELECT t.rid, (md).isoutdb, (md).path* > *FROM test_rast AS t, ST_BandMetaData(t.rast) AS md* > *limit 1;* > returned the following output: > 1,True,"/home/nagispg/local/src/SID_Test/rast_1.sid" > > *SELECT ST_Width(rast) wdth, ST_Height(rast) hgt* > *FROM test_rast;* > returned the following output: > 1,7000 > > The "Add to canvas" menu is a pop-up that is displayed when you > right-click on the raster table from the DB Manager Menu (Database => DB > Manager => DB Manager) > > On Tue, Apr 18, 2017 at 3:03 PM, Adrian Klaver > wrote: > >> On 04/18/2017 06:41 AM, Osahon Oduware wrote: >> >>> Hi All, >>> >>> I have an out-db raster on PostGIS which I tried to load in QGIS (v >>> 2.14.12-Essen) utilizing the DB Manager to establish a connection. I >>> could view the raster table, but when I try to "Add to canvas" it >>> displays a black screen instead of the raster image. >>> >> >> There is data in the table? >> >> I am not that familiar with GIS, much less QGIS. Still a quick look at >> the docs found this: >> >> http://docs.qgis.org/2.14/en/docs/user_manual/working_with_r >> aster/supported_data.html#what-is-raster-data >> >> I see Add Raster layer, nothing about Add to canvas. >> >> If the above does not answer the question I would say you will have a >> better chance of getting an answer here: >> >> http://lists.osgeo.org/mailman/listinfo/qgis-user >> >> >> >>> I would be glad if someone could help me to resolve this. >>> >> >> >> -- >> Adrian Klaver >> adrian.kla...@aklaver.com >> > >
Re: [GENERAL] CANNOT USE ANY INDEX ON UPSERT (INSERT.....ON CONFLICT)
Hi,

Thank you for your suggestion, I'll try to implement it.

Many thanks,
Cheers,
Agharta

On 18/04/2017 12:38, Alban Hertroys wrote:
> On 18 Apr 2017, at 10:13, agharta wrote:
>> Hi all, I have a problem with the INSERT ... ON CONFLICT sql command. Reading the 9.6 documentation I see that the ON CONFLICT command will accept only index_column_name or index_expression (unique composite/primary indexes are valid too). So, my problem is that I can't create any type of upsert-valid index. Let me explain. I have a table T1 containing F1, F2, F3, F4 fields. I can insert the same record in T1, MAX TWICE.
>
> How is UPSERT supposed to know which of a pair of duplicate records it is supposed to update? You'll have to make them unique somehow. The safest approach is usually to add a surrogate key based on a sequence.
>
>> I can have records like (A,B,C,D), (B,A,D,C), etc.. and (A,B,C,D) AGAIN. Any next insert of (A,B,C,D) is not allowed (actually it is avoided by a complex-and-slow-performance select count in a before insert/update trigger).
>
> You're probably better off with an EXISTS query there. Something like:
>
> select F1, F2, F3, F4, case when exists (select 1 from T1 t where t.F1 = T1.F1 and t.F2 = T1.F2 and t.F3 = T1.F3 and t.F4 = T1.F4 and t.pk <> T1.pk) then 1 else 0 end as have_duplicate from T1 where F1 = NEW.F1 and F2 = NEW.F2 and F3 = NEW.F3 and F4 = NEW.F4 limit 1;
>
> The pk field in there is the surrogate key from the previous paragraph. Alternatively, wrap your COUNT around a sub-query that's limited to 2 results. No extra pk needed in that case, unless you still need to use UPSERT with that. In either case it will make a big difference to have an index on at least (F1, F2, F3, F4), perhaps with the new pk column added at the end.
>
>> In this case I can't create any type of primary/unique index, like a composite F1,F2,F3,F4 index. (correct me if I am wrong please).
>
> Correct, you'll most likely have to add a new one (unless someone comes up with better suggestions).
Alban Hertroys
--
If you can't see the forest for the trees, cut the trees and you'll find there is no forest.
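The surrogate-key plus count-limited-insert scheme discussed in this thread might be sketched as follows (all names hypothetical; note the count-limited INSERT is not race-free under concurrent writers without extra locking or SERIALIZABLE isolation):

```sql
-- Surrogate key makes each physical row addressable, even when
-- (f1,f2,f3,f4) appears twice.
CREATE TABLE t1 (
    pk bigserial PRIMARY KEY,
    f1 text, f2 text, f3 text, f4 text
);

-- Non-unique index: duplicates are allowed, but lookups stay fast.
CREATE INDEX t1_f_idx ON t1 (f1, f2, f3, f4);

-- Allow at most two copies of the same (f1,f2,f3,f4): insert only when
-- fewer than two already exist. The LIMIT 2 sub-query stops counting
-- early, per Alban's suggestion.
INSERT INTO t1 (f1, f2, f3, f4)
SELECT 'A', 'B', 'C', 'D'
WHERE (SELECT count(*) FROM (
           SELECT 1 FROM t1
           WHERE f1 = 'A' AND f2 = 'B' AND f3 = 'C' AND f4 = 'D'
           LIMIT 2) s) < 2;
```

With the pk in place, an update can then target exactly one of the pair by its pk value, which is the part plain ON CONFLICT cannot express here.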
Re: [GENERAL] QGIS Loads Black Screen For PostGIS Out-Db Raster Data
Hi Adrian,

I think it is a problem with QGIS, as things seem fine from the PostGIS end (raster data) going by the following queries:

*SELECT r_table_name, r_raster_column, out_db FROM raster_columns;*
returned the following output:
"test_rast","rast","[True, True, True]"

*SELECT t.rid, (md).isoutdb, (md).path*
*FROM test_rast AS t, ST_BandMetaData(t.rast) AS md*
*limit 1;*
returned the following output:
1,True,"/home/nagispg/local/src/SID_Test/rast_1.sid"

*SELECT ST_Width(rast) wdth, ST_Height(rast) hgt*
*FROM test_rast;*
returned the following output:
1,7000

The "Add to canvas" menu is a pop-up that is displayed when you right-click on the raster table from the DB Manager menu (Database => DB Manager => DB Manager).

On Tue, Apr 18, 2017 at 3:03 PM, Adrian Klaver wrote:
> On 04/18/2017 06:41 AM, Osahon Oduware wrote:
>> Hi All,
>>
>> I have an out-db raster on PostGIS which I tried to load in QGIS (v 2.14.12-Essen) utilizing the DB Manager to establish a connection. I could view the raster table, but when I try to "Add to canvas" it displays a black screen instead of the raster image.
>
> There is data in the table?
>
> I am not that familiar with GIS, much less QGIS. Still, a quick look at the docs found this:
>
> http://docs.qgis.org/2.14/en/docs/user_manual/working_with_raster/supported_data.html#what-is-raster-data
>
> I see Add Raster layer, nothing about Add to canvas.
>
> If the above does not answer the question I would say you will have a better chance of getting an answer here:
>
> http://lists.osgeo.org/mailman/listinfo/qgis-user
>
>> I would be glad if someone could help me to resolve this.
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
Re: [GENERAL] QGIS Loads Black Screen For PostGIS Out-Db Raster Data
On 04/18/2017 06:41 AM, Osahon Oduware wrote:
> Hi All,
>
> I have an out-db raster on PostGIS which I tried to load in QGIS (v 2.14.12-Essen) utilizing the DB Manager to establish a connection. I could view the raster table, but when I try to "Add to canvas" it displays a black screen instead of the raster image.

There is data in the table?

I am not that familiar with GIS, much less QGIS. Still, a quick look at the docs found this:

http://docs.qgis.org/2.14/en/docs/user_manual/working_with_raster/supported_data.html#what-is-raster-data

I see Add Raster layer, nothing about Add to canvas.

If the above does not answer the question I would say you will have a better chance of getting an answer here:

http://lists.osgeo.org/mailman/listinfo/qgis-user

> I would be glad if someone could help me to resolve this.

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: [GENERAL] UDP buffer drops / statistics collector
Okay, so I've run an strace on the collector process during a buffer drop event. I can see evidence of a recvfrom loop pulling in a *maximum* of 142kb. While I've had already increased rmem_max, it would appear this is not being observed by the kernel. rmem_default is set to 124kb, which would explain the above read maxing out just slightly beyond this (presuming a ring buffer filling up behind the read). I'm going to try increasing rmem_default and see if it has any positive effect.. (and then investigate why the kernel doesn't want to consider rmem_max).. On Tue, Apr 18, 2017 at 8:05 AM Tim Kane wrote: > Hi all, > > I'm seeing sporadic (but frequent) UDP buffer drops on a host that so far > I've not been able to resolve. > > The drops are originating from postgres processes, and from what I know - > the only UDP traffic generated by postgres should be consumed by the > statistics collector - but for whatever reason, it's failing to read the > packets quickly enough. > > Interestingly, I'm seeing these drops occur even when the system is idle.. > but every 15 minutes or so (not consistently enough to isolate any > particular activity) we'll see in the order of ~90 packets dropped at a > time. > > I'm running 9.6.2, but the issue was previously occurring on 9.2.4 (on the > same hardware) > > > If it's relevant.. there are two instances of postgres running (and > consequently, 2 instances of the stats collector process) though 1 of those > instances is most definitely idle for most of the day. > > In an effort to try to resolve the problem, I've increased (x2) the UDP > recv buffer sizes on the host - but it seems to have had no effect. > > cat /proc/sys/net/core/rmem_max > 1677216 > > The following parameters are configured > > track_activities on > track_counts on > track_functions none > track_io_timing off > > > There are approximately 80-100 connections at any given time. 
> > It seems that the issue started a few weeks ago, around the time of a > reboot on the given host... but it's difficult to know what (if anything) > has changed, or why :-/ > > > Incidentally... the documentation doesn't seem to have any mention of UDP > whatsoever. I'm going to use this as an opportunity to dive into the > source - but perhaps it's worth improving the documentation around this? > > My next step is to try disabling track_activities and track_counts to see > if they improve matters any, but I wouldn't expect these to generate enough > data to flood the UDP buffers :-/ > > Any ideas? > > > >
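A minimal sketch of the kernel-side knobs involved (values are illustrative). Note that rmem_max is only a ceiling for what a process may request via setsockopt(SO_RCVBUF); a socket that never makes that call gets rmem_default — which would explain rmem_max appearing to be ignored here:

```shell
# Inspect the current receive-buffer limits (values in bytes)
sysctl net.core.rmem_default net.core.rmem_max

# rmem_max only caps explicit setsockopt(SO_RCVBUF) requests; a socket
# that never asks for more stays at rmem_default, so raising the default
# is the knob that actually affects a socket left at its defaults.
sysctl -w net.core.rmem_default=8388608   # example value, tune as needed

# Watch whether drops ("packet receive errors") keep accumulating
netstat -su | grep -i error
```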
Re: [GENERAL] Clone PostgreSQL schema
On Tue, Apr 18, 2017 at 3:48 AM, R. Reiterer wrote: > Hi Melvin, > > after a first test, the function seems to work perfect! MANY THX!!! > > Regards, > > Reinhard > > Am 17.04.2017 17:21 schrieb Melvin Davidson: > >> On Mon, Apr 17, 2017 at 11:02 AM, Melvin Davidson >> wrote: >> >> On Mon, Apr 17, 2017 at 9:42 AM, Melvin Davidson >>> wrote: >>> >>> On Mon, Apr 17, 2017 at 2:20 AM, R. Reiterer >>> wrote: >>> Hi Melvin, >>> >>> thanks again for your help! I did some testing, but views in the >>> new schema still refer to the old schema. >>> >>> Regards, Reinhard >>> >>> Am 17.04.2017 04:53 schrieb Melvin Davidson: >>> On Sun, Apr 16, 2017 at 4:42 PM, R. Reiterer >>> wrote: >>> >>> Unfortunately, I do not have the skills to improve the function. >>> Maybe someone at dba.stackexchange.com [1] [1] can help me. I'll >>> >>> open a >>> >>> ticket. I hope this is okay for you. >>> >>> Am 16.04.2017 22:31 schrieb Melvin Davidson: >>> I missed to note that this is a VIEW issue (?) >>> AH, IN THAT CASE, YOU ARE RIGHT. I AM RETIRED, BUT IF YOU WISH, GO >>> AHEAD AND APPLY THE FIX YOURSELF. JUST MAKE A COMMENT AT THE TOP TO >>> INCLUDE THE DATE, YOUR NAME AND THE FIX. >>> >>> On Sun, Apr 16, 2017 at 4:24 PM, R. Reiterer >>> wrote: >>> >>> Hi Melvin, >>> >>> thanks for your reply. I missed to note that this is a VIEW issue >>> (?). After duplicating a schema, views in the cloned schema >>> (schema_new) refer still to the source schema (schema_old) in the >>> FROM clause: >>> >>> View in cloned schema (schema_new) --> >>> >>> CREATE VIEW schema_new.my_view AS >>> SELECT * >>> FROM schema_old.my_table; >>> >>> To me 'FROM schema_new.my_table' would be more logical. >>> >>> Regards, Reinhard >>> >>> Am 16.04.2017 22:12 schrieb Melvin Davidson: >>> On Sun, Apr 16, 2017 at 3:20 PM, R. 
Reiterer wrote: >>> >>> Hi Melvin, >>> >>> I use your PL/pgSQL function posted at >>> https://www.postgresql.org/message-id/CANu8FixK9P8UD43nv2s%2Bc-9jHkyy8wPUc_f5K3HcrmJYVPr-wQ%40mail.gmail.com >>> to clone schemas in PostgreSQL databases. Many thanks for your >>> work! >>> >>> I noticed that in cloned schemas the schema name isn't updated in >>> the FROM clause: >>> >>> schema_old --> >>> >>> CREATE VIEW schema_old.my_view AS >>> SELECT * >>> FROM schema_old.my_table; >>> >>> schema_new --> >>> >>> CREATE VIEW schema_new.my_view AS >>> SELECT * >>> FROM schema_old.my_table; >>> >>> Are you interested in fixing this? >>> >>> Regards, >>> >>> Reinhard >>> >>> FIRST, THANK YOU FOR THE COMPLIMENT. >>> >>> However, AFAIC, there is nothing to "fix" with regards to cloning the >>> schema name. In a database, you cannot have two schemas with the >>> same name, >>> so what would be the point? If you want to "clone" to a different >>> database, then just use pg_dump and pg_restore. >>> >>> -- >>> >>> MELVIN DAVIDSON >>> I reserve the right to fantasize. Whether or not you >>> wish to share my fantasy is entirely up to you. >>> >>> Reinhard, >>> >>> After reviewing things, I note it's possible that you downloaded an >>> earlier version that had some errors in it and was not as complete. >>> >>> Therefore, I've attached the latest, more complete version of the >>> function. Please let me know if this solves the problem. 
>>> >>> -- >>> >>> MELVIN DAVIDSON >>> I reserve the right to fantasize. Whether or not you >>> wish to share my fantasy is entirely up to you. >>> >> >> My apologies, >> >> I thought I had a fix. I even worked on it a couple of hours this >> morning, but it seems it's a bit trickier than I thought. I'll keep trying >> until I get it right. >> >> OK REINHARD, I THINK I HAVE IT, PLEASE TRY THE REVISION I HAVE >> ATTACHED. >> >> OOPS, I FORGOT TO REMOVE THE PREMATURE RETURN, USE THIS LATEST >> ATTACHED. >> >> -- >> >> MELVIN DAVIDSON >> I reserve the right to fantasize. Whether or not you >> wish to share my fantasy is entirely up to you.
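For the view problem discussed in this thread, one possible fix inside a clone function is to recreate each cloned view with the old schema name rewritten. This is only a sketch (the schema names 'schema_old'/'schema_new' follow the example above); a plain textual replace() is naive — it will also rewrite the string if it occurs inside literals — so review the generated DDL before trusting it:

```sql
DO $$
DECLARE
    v record;
BEGIN
    FOR v IN
        SELECT viewname, definition
        FROM pg_views
        WHERE schemaname = 'schema_new'
    LOOP
        -- Recreate the view with references to the source schema rewritten
        EXECUTE format(
            'CREATE OR REPLACE VIEW schema_new.%I AS %s',
            v.viewname,
            replace(v.definition, 'schema_old.', 'schema_new.'));
    END LOOP;
END $$;
```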
[GENERAL] QGIS Loads Black Screen For PostGIS Out-Db Raster Data
Hi All, I have an out-db raster on PostGIS which I tried to load in QGIS (v 2.14.12-Essen) utilizing the DB Manager to establish a connection. I could view the raster table, but when I try to "Add to canvas" it displays a black screen instead of the raster image. I would be glad if someone could help me to resolve this.
Re: [GENERAL] Error During PostGIS Build From Source on Linux
Hi Adrian, This suggestion helped to resolve the problem. Thanks a lot for your help. On Sat, Apr 15, 2017 at 8:02 PM, Adrian Klaver wrote: > On 04/15/2017 11:25 AM, John R Pierce wrote: > >> On 4/15/2017 10:49 AM, Adrian Klaver wrote: >> >>> >>> Probably going to involve dealing with ldconfig. >>> >>> Look in /etc/ld.so.conf and see if the directory that contains >>> libproj.so.12 is in there? >>> >> >> the catch-22 is, /etc/ld.so.conf is a global thing, and this guy is >> building all his stuff to run under his /home/username/ (where, btw, >> selinux might get unhappy with you). instead, if it's private stuff >> like that, use LD_LIBRARY_PATH in the environment of the processes that >> need the libraries. or link with the paths hard coded, eg >> -Wl,-rpath=/home/username//lib64 when linking the packages that >> refer to these libraries. >> > > Well when I did my build I also did some of it in my /home directory to > replicate the OP's process to a certain extent. To get everything to run I > symlinked the *.so's into the corresponding global directory covered by > ld.so.conf and ran ldconfig. Then all was golden. Before that I got > similar error messages when CREATEing the postgis extension. > > -- > Adrian Klaver > adrian.kla...@aklaver.com
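To summarize the three approaches discussed above as commands (paths are the examples from the thread, not real installs):

```shell
# 1) Per-process, no root required: point the loader at the private libs
export LD_LIBRARY_PATH=/home/username/lib64:$LD_LIBRARY_PATH

# 2) Bake the search path in at link time for your own builds,
#    e.g. when linking something against the private libproj
gcc main.o -L/home/username/lib64 -Wl,-rpath=/home/username/lib64 -lproj -o prog

# 3) System-wide, needs root: register the directory with the loader cache
echo '/home/username/lib64' | sudo tee /etc/ld.so.conf.d/local-libs.conf
sudo ldconfig
```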
Fwd: [GENERAL] Error During PostGIS Build From Source on Linux
Noted.

> Something I meant to refer to earlier, it is considered bad etiquette to cross post to multiple lists. It means in this case I have to remember to drop the postgis list from my replies as I am not a member of that list and I get a rejection notice if I forget.
Re: [GENERAL] streaming replication and archive_status
2017-04-17 20:04 GMT-03:00 Jeff Janes : > 2017-04-17 17:08 GMT-03:00 Jeff Janes : > >> On Mon, Apr 17, 2017 at 12:22 PM, Luciano Mittmann >> wrote: >> >>> Hi All, >>> >>> does anyone know why there are so many files in the directory >>> pg_xlog/archive_status/ on the replication server? >>> >>> # ls pg_xlog/archive_status/ | wc -l >>> >>> 75217 >>> >>> Is it possible to clean these .done files, or do I just not need to worry? >>> >>> It does not occur on primary or standalone servers, just on replication. >>> >> >> What version? Are you logging checkpoints, and if so what do they say? >> >> > On Mon, Apr 17, 2017 at 1:24 PM, Luciano Mittmann > wrote: > >> Hi Jeff, >> >> checkpoint message on standby node: >> >> [ 2017-04-17 17:21:56 BRT] @ LOG: restartpoint complete: wrote 21475 >> buffers (2.6%); 0 transaction log file(s) added, 0 removed, 0 recycled; >> write=149.816 s, sync=0.064 s, total=149.890 s; sync files=314, >> longest=0.002 s, average=0.000 s; distance=145449 kB, estimate=236346 kB >> [ 2017-04-17 17:21:56 BRT] @ LOG: recovery restart point at 126/A7072A88 >> [ 2017-04-17 17:21:56 BRT] @ DETAIL: last completed transaction was at >> log time 2017-04-17 17:21:02.289164-03 >> > > I wish the checkpoint logging code would tell you how many transaction > log files it intentionally retained, too. I haven't seen it "accidentally" > retain files, but seeing the number logged would help simplify > troubleshooting. > > Does each file in pg_xlog/archive_status/ have a corresponding file one > directory up? > > Cheers, > > Jeff > Hi Jeff, > Does each file in pg_xlog/archive_status/ have a corresponding file one directory up? No, there is no corresponding file in the pg_xlog directory. That is the question.. for some reason, or because of some parameter that I do not know, the files are considered consumed but are not erased later. regards,
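Jeff's question can be checked mechanically — whether each .done status file still has its WAL segment one directory up. A sketch, assuming the 9.x data-directory layout and run on the standby:

```shell
# List .done entries whose WAL segment is gone from pg_xlog
cd "$PGDATA"/pg_xlog
for f in archive_status/*.done; do
    seg=$(basename "$f" .done)            # e.g. 0000000100000126000000A7
    [ -e "$seg" ] || echo "orphan status file: $f"
done
```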
Re: [GENERAL] CANNOT USE ANY INDEX ON UPSERT (INSERT.....ON CONFLICT)
> On 18 Apr 2017, at 10:13, agharta wrote: > > Hi all, > > I have a problem with the INSERT ... ON CONFLICT sql command. > > Reading the 9.6 documentation I see that the ON CONFLICT command accepts only > index_column_name or index_expression (unique composite/primary indexes are > valid too). > > So, my problem is that I can't create any type of upsert-valid index. Let me > explain. > > I have a table T1 containing F1, F2, F3, F4 fields. > > I can insert the same record in T1, MAX TWICE. How is UPSERT supposed to know which of a pair of duplicate records it is supposed to update? You'll have to make them unique somehow. The safest approach is usually to add a surrogate key based on a sequence. > I can have records like (A,B,C,D),(B,A,D,C), etc.. and (A,B,C,D) AGAIN. Any > other insert of (A,B,C,D) is not allowed (actually it is avoided by a > complex-and-slow-performance select count in a before insert/update trigger). You're probably better off with an EXISTS query there. Something like:

select F1, F2, F3, F4,
       case when exists (select 1 from T1 t
                          where t.F1 = T1.F1 and t.F2 = T1.F2
                            and t.F3 = T1.F3 and t.F4 = T1.F4
                            and t.pk <> T1.pk)
            then 1 else 0
       end as have_duplicate
  from T1
 where F1 = NEW.F1 and F2 = NEW.F2 and F3 = NEW.F3 and F4 = NEW.F4
 limit 1;

The pk field in there is the surrogate key from the previous paragraph. Alternatively, wrap your COUNT around a sub-query that's limited to 2 results. No extra pk needed in that case, unless you still need to use UPSERT with that. In either case it will make a big difference to have an index on at least (F1, F2, F3, F4), perhaps with the new pk column added at the end. > In this case I can't create any type of primary/unique index, like a > composite F1, F2, F3, F4 index. (correct me if I am wrong please). Correct, you'll most likely have to add a new one (unless someone comes up with better suggestions). Alban Hertroys -- If you can't see the forest for the trees, cut the trees and you'll find there is no forest. 
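The surrogate-key idea above can be pushed one step further to get an UPSERT-compatible arbiter index: number each copy explicitly and make the numbered tuple unique. A sketch, with all names illustrative and untested against the poster's schema:

```sql
CREATE TABLE t1 (
    pk      bigserial PRIMARY KEY,          -- surrogate key
    f1 text, f2 text, f3 text, f4 text,
    copy_no smallint NOT NULL DEFAULT 1
            CHECK (copy_no IN (1, 2))       -- at most two copies allowed
);

CREATE UNIQUE INDEX t1_row_copy_uq ON t1 (f1, f2, f3, f4, copy_no);

-- First insert uses copy_no = 1; if it conflicts, the application
-- retries with copy_no = 2.  A third identical row then fails the
-- unique index (and the CHECK) outright -- no counting trigger needed.
INSERT INTO t1 (f1, f2, f3, f4, copy_no)
VALUES ('A', 'B', 'C', 'D', 1)
ON CONFLICT (f1, f2, f3, f4, copy_no) DO NOTHING;
```

With this, the "max twice" rule is enforced declaratively and ON CONFLICT has a proper index to infer.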
[GENERAL] # of connections and architecture design
Hi all, As many of you have read last Friday (and many have tried to help, too, and I still thank you very much), I had a bad service outage. I was pointed to reduce the maximum number of connections using a pooler, and that's what I'm building in a test lab, but I'm wondering if there's something I can do with my overall architecture design. ATM we host one database per customer (about 400 now) and every customer has two points of access to the data:
- Directly to the database, via rubyrep, to replicate the database he has on his own machine
- Via WCF self-hosted web services to read other customers' data
Every customer can access (and replicate) his database from a number of different positions (max 3). Customers are organized in groups (max 10 per group), and there is the chance that someone accesses someone else's data via WCF. For example, pick a group of 5: everyone running rubyrep with only one position enabled, and getting data from the others' databases. If I'm not wrong, it's 5 connections (rubyrep) plus 5*4 (everyone connecting to everyone else's database) for WCF, so 25 connections. Now imagine a group of 10. Last Friday I was told that 350 connections is quite a big number and things can begin to slow down. Ok. When something slows down I'm used to searching for and finding the bottleneck (CPU, RAM, IO, etc). If everything was running apparently fine (CPU < 10%, RAM used < 20%, I/O rarely over 20%), how can I say there's a bottleneck that's slowing things down? Am I missing something? Another thing is that on a customer server (with a similar, smaller architecture) I _do_ have a connection leak problem that's under investigation, but when things begin to slow down I simply run a pg_terminate_backend on all connections with an age > 10 min and everything goes back to normal. On my server, last Friday, it did not help, so I thought that was not the main problem. 
I've got no problem splitting this architecture across as many servers as I need, but I think I need some tips on how to design this, in order to avoid major issues in the near future (ask for details if needed). The current PostgreSQL 9.5.6 server is an 8-core VM with 52 GB RAM and Debian 8. The WCF server is Windows 2012 R2, 4-core, 16 GB RAM. While facing the issue neither of them showed any kind of overload and their logs were clean. I'm a bit scared it can happen again. Thanks again Moreno
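Since a pooler was already suggested, a minimal pgbouncer sketch may help frame the discussion. All values are illustrative; note that transaction pooling keeps the server-side connection count low but breaks session-level features (prepared statements, advisory locks, LISTEN/NOTIFY), which may matter for rubyrep — test before relying on it:

```ini
[databases]
; wildcard: forward any database name to the local cluster
* = host=127.0.0.1 port=5432

[pgbouncer]
listen_addr = *
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 2000    ; clients pgbouncer will accept
default_pool_size = 5     ; server connections per database/user pair
```

With ~400 databases and a handful of clients each, this caps the real PostgreSQL backends at default_pool_size per database/user pair instead of one backend per client.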
[GENERAL] CANNOT USE ANY INDEX ON UPSERT (INSERT.....ON CONFLICT)
Hi all, I have a problem with the INSERT ... ON CONFLICT sql command. Reading the 9.6 documentation I see that the ON CONFLICT command accepts only index_column_name or index_expression (unique composite/primary indexes are valid too). So, my problem is that I can't create any type of upsert-valid index. Let me explain. I have a table T1 containing F1, F2, F3, F4 fields. I can insert the same record in T1, *MAX TWICE*. I can have records like (A,B,C,D),(B,A,D,C), etc.. and (A,B,C,D) AGAIN. Any other insert of (A,B,C,D) is not allowed (actually it is avoided by a complex-and-slow-performance select count in a before insert/update trigger). In this case I can't create any type of primary/unique index, like a composite F1, F2, F3, F4 index. (correct me if I am wrong please). So, how to use UPSERT in this case? Best regards, Agharta
[GENERAL] UDP buffer drops / statistics collector
Hi all, I'm seeing sporadic (but frequent) UDP buffer drops on a host that so far I've not been able to resolve. The drops are originating from postgres processes, and from what I know - the only UDP traffic generated by postgres should be consumed by the statistics collector - but for whatever reason, it's failing to read the packets quickly enough. Interestingly, I'm seeing these drops occur even when the system is idle.. but every 15 minutes or so (not consistently enough to isolate any particular activity) we'll see in the order of ~90 packets dropped at a time. I'm running 9.6.2, but the issue was previously occurring on 9.2.4 (on the same hardware) If it's relevant.. there are two instances of postgres running (and consequently, 2 instances of the stats collector process) though 1 of those instances is most definitely idle for most of the day. In an effort to try to resolve the problem, I've increased (x2) the UDP recv buffer sizes on the host - but it seems to have had no effect. cat /proc/sys/net/core/rmem_max 1677216 The following parameters are configured track_activities on track_counts on track_functions none track_io_timing off There are approximately 80-100 connections at any given time. It seems that the issue started a few weeks ago, around the time of a reboot on the given host... but it's difficult to know what (if anything) has changed, or why :-/ Incidentally... the documentation doesn't seem to have any mention of UDP whatsoever. I'm going to use this as an opportunity to dive into the source - but perhaps it's worth improving the documentation around this? My next step is to try disabling track_activities and track_counts to see if they improve matters any, but I wouldn't expect these to generate enough data to flood the UDP buffers :-/ Any ideas?