[GENERAL] Checksums and full_page_writes
Hi all. I've read thread [0], src/backend/storage/page/README and everything I found in the documentation about checksums [1, 2], but I haven't understood what my risk is if I enable checksums during initdb and turn off full_page_writes. Am I right that I can get torn pages on disk in that case, but that I will find out about it during checksum verification when the page is read from disk into shared buffers? And is there any way to recover such a torn page (from a replica, or from a backup plus archived logs, for example)?

[0] http://www.postgresql.org/message-id/20111217213324.ga4...@fetter.org
[1] http://www.postgresql.org/docs/current/static/app-initdb.html#APP-INITDB-DATA-CHECKSUMS
[2] http://www.postgresql.org/docs/current/static/wal-reliability.html

--
Vladimir
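For concreteness, a minimal sketch of the configuration being asked about, assuming a 9.4 cluster (the commands are illustrative, not from the thread):

    -- The cluster must be created with checksums: initdb --data-checksums (or -k).
    SHOW data_checksums;                      -- read-only GUC (9.4+); 'on' if initdb -k was used
    ALTER SYSTEM SET full_page_writes = off;  -- 9.4+; on older versions edit postgresql.conf
    SELECT pg_reload_conf();                  -- full_page_writes can change on reload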
Re: [GENERAL] help troubleshooting invalid page header error
On 23 Dec 2014 12:05, Cory Zue c...@dimagi.com wrote:

Hi all,

Our postgres instance on one of our production machines has recently been returning errors of the form "DatabaseError: invalid page header in block 1 of relation base/16384/76623" from normal queries. I've been reading that these are often linked to hardware errors, but I would like to better understand what else it could be, or how to determine that for sure. I've filled out the standard issue reporting template below. Any feedback or troubleshooting instructions would be much appreciated.

---

A description of what you are trying to achieve and what results you expect:
Intermittent queries are failing with the error "DatabaseError: invalid page header in block 1 of relation base/16384/76623"

PostgreSQL version number you are running:
PostgreSQL 8.4.13 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.4.6 20120305 (Red Hat 4.4.6-4), 64-bit

How you installed PostgreSQL: from standard package installer

Changes made to the settings in the postgresql.conf file:

             name             |       current_setting       |        source
------------------------------+-----------------------------+----------------------
 checkpoint_completion_target | 0.9                         | configuration file
 checkpoint_segments          | 32                          | configuration file
 checkpoint_timeout           | 15min                       | configuration file
 DateStyle                    | ISO, MDY                    | configuration file
 default_text_search_config   | pg_catalog.english          | configuration file
 effective_cache_size         | 1GB                         | configuration file
 lc_messages                  | en_US.UTF-8                 | configuration file
 lc_monetary                  | en_US.UTF-8                 | configuration file
 lc_numeric                   | en_US.UTF-8                 | configuration file
 lc_time                      | en_US.UTF-8                 | configuration file
 log_checkpoints              | on                          | configuration file
 log_connections              | off                         | configuration file
 log_destination              | csvlog                      | configuration file
 log_directory                | /opt/data/pgsql/data/pg_log | configuration file
 log_disconnections           | off                         | configuration file
 log_duration                 | on                          | configuration file
 log_filename                 | postgres-%Y-%m-%d_%H%M%S    | configuration file
 log_lock_waits               | on                          | configuration file
 log_min_duration_statement   | 250ms                       | configuration file
 log_rotation_age             | 1d                          | configuration file
 log_rotation_size            | 1GB                         | configuration file
 log_temp_files               | 0                           | configuration file
 log_timezone                 | Asia/Kolkata                | command line
 log_truncate_on_rotation     | on                          | configuration file
 logging_collector            | on                          | configuration file
 maintenance_work_mem         | 768MB                       | configuration file
 max_connections              | 500                         | configuration file
 max_stack_depth              | 2MB                         | environment variable
 port                         | 5432                        | command line
 shared_buffers               | 4GB                         | configuration file
 ssl                          | on                          | configuration file
 TimeZone                     | Asia/Kolkata                | command line
 timezone_abbreviations       | Default                     | command line
 wal_buffers                  | 16MB                        | configuration file
 work_mem                     | 48MB                        | configuration file

It's also probably worth noting that postgres is installed on an encrypted volume which is mounted using ecryptfs.

Operating system and version: RedHatEnterpriseServer, version 6.6

What program you're using to connect to PostgreSQL: Python (django)

Is there anything relevant or unusual in the PostgreSQL server logs?: I see lots of instances of this error (and similar). I'm not sure what else I should be looking for.

What you were doing when the error happened / how to cause the error: I haven't explicitly tried to reproduce it, but it seems to consistently happen with certain queries. However, the system was rebooted shortly before the errors started occurring. The system was rebooted because another database (elasticsearch) was having problems on the same machine and the reboot was to attempt to resolve things.
The EXACT TEXT of the error message you're getting, if there is one: DatabaseError: invalid page header in block 1 of relation base/16384/76623
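For readers hitting the same message: the numbers in it can be mapped to objects with a standard catalog lookup (a sketch, not from the thread; run the second query while connected to the affected database):

    -- 16384 is the database OID, 76623 the relation's on-disk file node
    SELECT datname FROM pg_database WHERE oid = 16384;
    -- then, connected to that database:
    SELECT relname, relkind FROM pg_class WHERE relfilenode = 76623;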
[GENERAL] Drastic select count performance hit when jsonb GIN indices are present
Hi,

I've been playing with jsonb for storing and querying data from the Mailgun Events API (http://documentation.mailgun.com/api-events.html#examples). I already have a system that parses the JSON to csv and loads it into standard tables, but what better way to spend the holidays than nerding out on new postgres tech :-)? I am using the official postgres-supplied 9.4 Ubuntu repo on 14.04.

I have been testing both inserting the raw API call JSON ({"items": [{...}, {...}], "paging": ...}) and extracting the actual events from the items array ({...}, {...}) and inserting one row per event rather than one row per page of 100 events. I did this to try and benefit from GIN indices, but much to my surprise, adding indices when there is a row per event actually drastically *reduces* performance!

mgevents=# create table myevents (event jsonb);
mgevents=# create index idx_myevents on myevents using gin (event jsonb_path_ops);
mgevents=# create index idx_myevents_no_path on myevents using gin (event);
mgevents=# select count(*) from myevents;
  count
---------
 3715600
(1 row)

mgevents=# explain (analyze, buffers) select count(*) from myevents where event ->> 'event' = 'delivered';
                                                         QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=279884.45..279884.46 rows=1 width=0) (actual time=14626.213..14626.216 rows=1 loops=1)
   Buffers: shared read=448208
   ->  Seq Scan on myevents  (cost=0.00..279838.00 rows=18578 width=0) (actual time=0.719..11432.283 rows=1417152 loops=1)
         Filter: ((event ->> 'event'::text) = 'delivered'::text)
         Rows Removed by Filter: 2298448
         Buffers: shared read=448208
 Planning time: 0.074 ms
 Execution time: 14626.955 ms
(8 rows)

mgevents=# explain (analyze, buffers) select count(*) from myevents where event @> '{"event": "delivered"}';
                                                         QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=3624.38..3624.39 rows=1 width=0) (actual time=80984.095..80984.098 rows=1 loops=1)
   Buffers: shared hit=1 read=298589
   ->  Bitmap Heap Scan on myevents  (cost=35.80..3615.09 rows=3716 width=0) (actual time=351.510..77669.907 rows=1417152 loops=1)
         Recheck Cond: (event @> '{"event": "delivered"}'::jsonb)
         Heap Blocks: exact=298362
         Buffers: shared hit=1 read=298589
         ->  Bitmap Index Scan on idx_myevents  (cost=0.00..34.87 rows=3716 width=0) (actual time=257.219..257.219 rows=1417152 loops=1)
               Index Cond: (event @> '{"event": "delivered"}'::jsonb)
               Buffers: shared hit=1 read=227
 Planning time: 3.197 ms
 Execution time: 80986.340 ms
(11 rows)

mgevents=# drop index idx_myevents;
mgevents=# explain (analyze, buffers) select count(*) from myevents where event @> '{"event": "delivered"}';
                                                         QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=3631.38..3631.39 rows=1 width=0) (actual time=81898.250..81898.252 rows=1 loops=1)
   Buffers: shared hit=383 read=299133
   ->  Bitmap Heap Scan on myevents  (cost=42.80..3622.09 rows=3716 width=0) (actual time=534.816..78526.944 rows=1417152 loops=1)
         Recheck Cond: (event @> '{"event": "delivered"}'::jsonb)
         Heap Blocks: exact=298362
         Buffers: shared hit=383 read=299133
         ->  Bitmap Index Scan on idx_myevents_no_path  (cost=0.00..41.87 rows=3716 width=0) (actual time=453.412..453.412 rows=1417152 loops=1)
               Index Cond: (event @> '{"event": "delivered"}'::jsonb)
               Buffers: shared hit=383 read=771
 Planning time: 2.322 ms
 Execution time: 81898.578 ms
(11 rows)

mgevents=# drop index idx_myevents_no_path;
mgevents=# explain (analyze, buffers) select count(*) from myevents where event @> '{"event": "delivered"}';
                                                         QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=270558.29..270558.30 rows=1 width=0) (actual time=19834.530..19834.532 rows=1 loops=1)
   Buffers: shared hit=130935 read=317273
   ->  Seq Scan on myevents  (cost=0.00..270549.00 rows=3716 width=0) (actual time=4.650..16764.726 rows=1417152 loops=1)
         Filter: (event @> '{"event": "delivered"}'::jsonb)
         Rows Removed by Filter: 2298448
         Buffers: shared hit=130935 read=317273
 Planning time: 0.238 ms
 Execution time: 19834.605 ms
(8 rows)

So it doesn't even appear to be a problem with the operator (->> vs @>) but rather that the planner is using the index rather than just doing a scan. Now as I'm completely new to jsonb and GIN there
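As an aside, the per-event split described at the top of the post can be written in SQL roughly like this (a sketch: rawpages and its page column are hypothetical names for a staging table of raw API responses; jsonb_array_elements is a stock 9.4 function):

    -- explode each page's "items" array into one row per event object
    INSERT INTO myevents (event)
    SELECT jsonb_array_elements(page -> 'items')
    FROM rawpages;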
Re: [GENERAL] Check constraint on foreign table using SQL function
On 25.12.2014 23:50, Adrian Klaver wrote:

On 12/25/2014 11:21 AM, Andreas Ulbrich wrote:

Hey. In the first part I connect to testdb (the remote db). Here I create the schema 'andreas'. That the search_path is correct is shown by the output of the plpgsql version.

Yeah, that was a false alarm on my part. Forgot about $user in the path.

Then, in the second part, I'm postgres (this db is empty since compiling). The user mapping says I want to be andreas@testdb. The plpgsql version (and the other workarounds: schema-qualified names, SET search_path) shows that my assumptions are correct. First I'm wondering why the constraints are checked in SELECT, but this is not the main problem, because if I did an INSERT I would get the same problem.

The issue seems to begin here:

CREATE TABLE IF NOT EXISTS tab_b (
  id INTEGER PRIMARY KEY,
  id_a INTEGER NOT NULL REFERENCES tab_a,
  name TEXT,
  CHECK(test_name_b(id_a, name))
);

The CHECK calls test_name_b(), which has SELECT $2 = name FROM tab_a WHERE id = $1 in it. As Tom said, fdw calls have a limited search_path, and the tab_a table is not schema-qualified in the function, so:

2014-12-24 13:11:27 CET andreas(at)testdb QUERY: SELECT $2 = name FROM tab_a WHERE id = $1
2014-12-24 13:11:27 CET postgres(at)postgres ERROR: relation tab_a does not exist
2014-12-24 13:11:27 CET postgres(at)postgres CONTEXT: Remote SQL command: SELECT id, id_a, name FROM andreas.tab_b SQL function test_name_b during inlining

As you found out, you need to be explicit about your schemas when going through fdw: either schema-qualify object names or set an explicit search_path. All this starts when you try to create the foreign table:

2014-12-24 13:11:27 CET postgres(at)postgres STATEMENT: TABLE ftab_b;

I believe that the inlining does not use the search_path set in the ALTER DATABASE. Here is a \d output before the END of the transaction in the first part:

            List of relations
 Schema  |   Name   |     Type      |  Owner
---------+----------+---------------+---------
 andreas | tab_a    | table         | andreas
 andreas | tab_b    | table         | andreas
 test    | unaccent | foreign table | test

Regards
Andreas

O.K. I've seen. I tried again the plpgsql version:

1.) The INSERTs on testdb print the correct search_path in the raise: WARNING: test_name_b called: $user, test, public, ext
2.) Here TABLE ftab_b works, because the check constraint is not applied (the constraint function is not called). Is there any deeper reason why the check is done with the SQL function but not with plpgsql during SELECT? It seems not necessary to do the check - for me.
3.) If I try to make an INSERT from postgres to testdb, it fails. And here the search path is not set: WARNING: test_name_b called: pg_catalog

I think that's a gap in the fdw.

My idea was the following: I want to do some refactoring, so I created a new DB with a new schema holding the refactored data structure. To populate this new structure I wanted to use fdw SELECTs against the old structure. The problem is, I can't make that many changes in the old structure. Further, qualified table access in the functions makes them not reusable in the new structure.

O.K., the new idea is: create a new 'export' schema in the old DB with constraint-less, export-formatted tables (JSON), populate those in the old DB, and then use fdw against these specially designed tables.

Thanks
Andreas
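A sketch of the two workarounds mentioned above (schema qualification, or an explicit search_path); the function signature follows the thread, the exact body is assumed:

    -- Option 1: qualify the table inside the function body
    CREATE OR REPLACE FUNCTION test_name_b(integer, text) RETURNS boolean
    LANGUAGE sql AS $$
        SELECT $2 = name FROM andreas.tab_a WHERE id = $1
    $$;

    -- Option 2: keep the body unqualified but pin a search_path on the
    -- function itself, so it no longer depends on the caller's setting
    ALTER FUNCTION test_name_b(integer, text) SET search_path = andreas;

As I understand it, a SQL function carrying its own SET clause is also excluded from inlining, which sidesteps the "during inlining" error quoted above.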
Re: [GENERAL] Drastic select count performance hit when jsonb GIN indices are present
->  Bitmap Heap Scan on myevents  (cost=35.80..3615.09 rows=3716 width=0) (actual time=351.510..77669.907 rows=1417152 loops=1)
      Recheck Cond: (event @> '{"event": "delivered"}'::jsonb)
      Heap Blocks: exact=298362
      Buffers: shared hit=1 read=298589
...
Execution time: 80986.340 ms

->  Bitmap Heap Scan on myevents  (cost=42.80..3622.09 rows=3716 width=0) (actual time=534.816..78526.944 rows=1417152 loops=1)
      Recheck Cond: (event @> '{"event": "delivered"}'::jsonb)
      Heap Blocks: exact=298362
      Buffers: shared hit=383 read=299133
...
Execution time: 81898.578 ms

Hi Anton,

What you see there (I think) is the performance hit of random disk reads on a non-cached database. Try increasing shared_buffers to a value the table and index fit into, and redo the queries a few times until you see something like Buffers: shared hit=bigvalue read=0, then compare performance; it might change the timing quite a lot.

Also, I recommend setting track_io_timing=on in postgresql.conf, and after that use explain (analyze, buffers, timing) to check how much time the database spends doing IO operations. Also try performing vacuum analyze myevents; before testing, because it seems that you don't have an up-to-date visibility map on the table.

However, even in the fully cached case, selecting 40% of the table rows will almost always be faster via a sequential scan, so I don't expect miracles.

--
Maxim Boguk
Senior Postgresql DBA
http://www.postgresql-consulting.ru/ http://www.postgresql-consulting.com/
Phone RU: +7 910 405 4718 Phone AU: +61 45 218 5678
LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk Jabber: maxim.bo...@gmail.com МойКруг: http://mboguk.moikrug.ru/

"People problems are solved with people. If people cannot solve the problem, try technology. People will then wish they'd listened at the first stage."
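Spelled out, that measurement recipe looks roughly like this (track_io_timing and the EXPLAIN options are stock 9.4 features; the query is the one from the thread):

    SET track_io_timing = on;    -- superuser-only; per-session is enough for testing
    VACUUM ANALYZE myevents;     -- refresh statistics and the visibility map
    EXPLAIN (ANALYZE, BUFFERS, TIMING)
    SELECT count(*) FROM myevents WHERE event @> '{"event": "delivered"}';
    -- with track_io_timing on, plan nodes report "I/O Timings: read=... write=..."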
Re: [GENERAL] help troubleshooting invalid page header error
Hi all,

Thanks for the responses. Chiru, I'm looking into your suggestion. Sameer, here is the kernel version info:

Linux dimagi 2.6.32-431.20.5.el6.x86_64 #1 SMP Wed Jul 16 05:26:53 EDT 2014 x86_64 x86_64 x86_64 GNU/Linux

Does that seem like it could be a problematic version?

More generally - I'm still wondering whether I should chalk this failure up to a transient/random issue, or whether I should be more worried about the hardware on the machine. According to our diagnostic tools, disk and memory are fine, but it's still not clear to me how it got into this state. Any general bits of information regarding the potential causes of these types of issues would be much appreciated.

thanks,
Cory

On Fri, Dec 26, 2014 at 6:55 AM, Sameer Kumar sameer.ku...@ashnik.com wrote:
[...]
Re: [GENERAL] Drastic select count performance hit when jsonb GIN indices are present
Hi Maxim,

Thanks for the suggestions. Unfortunately, it does appear to be an IO issue, but not one I can get around (on my laptop at least!). Postgres seems to refuse to put any of the index or table in shared buffers at all.

For some reason, there seems to be a very great increase in space required for having each event (item) on its own row as opposed to keeping the original items array within far fewer database rows:

        relation         |  size
-------------------------+---------
 public.myevents         | 3502 MB
 ...
 pg_toast.pg_toast_57487 | 800 MB
 ...
 public.events           | 2232 kB
 ...

myevents is where each event has a row to itself, and events has 100 events in an array within the JSON rows (the toast table above is where that data seems to be stored) - actually the events table has more data in it, as there are the extra paging objects which have been removed from myevents. Performing vacuum analyze seems to have no effect either.

Getting back to my original point - you pointed out that for queries that need a decent % of the table it will be cheaper to do a scan, which is exactly what the query planner does for the relational version. If it only needs a small % of the values it looks at the index, and for a large % it goes for a scan (it also puts everything in shared buffers and is lightning quick!). Is this just a lack of maturity in the jsonb planner, or am I missing something?

Thanks again,
Anton

On 26 December 2014 at 14:19, Maxim Boguk maxim.bo...@gmail.com wrote:
[...]

--
echo '16i[q]sa[ln0=aln100%Pln100/snlbx]sbA0D4D465452snlbxq' | dc
This will help you for 99.9% of your problems ...
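The size listing above looks like the output of something along these lines (a guess at the query, built from standard size functions):

    SELECT n.nspname || '.' || c.relname AS relation,
           pg_size_pretty(pg_total_relation_size(c.oid)) AS size
    FROM pg_class c
    JOIN pg_namespace n ON n.oid = c.relnamespace
    WHERE c.relkind IN ('r', 't')   -- ordinary tables and their toast tables
    ORDER BY pg_total_relation_size(c.oid) DESC
    LIMIT 20;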
Re: [GENERAL] Check constraint on foreign table using SQL function
On 12/26/2014 05:18 AM, Andreas Ulbrich wrote:
[...]

> 2.) Here TABLE ftab_b works, because the check constraint is not applied (the constraint function is not called). Is there any deeper reason why the check is done with the SQL function but not with plpgsql during SELECT? It seems not necessary to do the check - for me.

This, from what I gather, is the result of the SQL inlining process, whereby SQL functions are optimized ahead of use. So the error shows up when Postgres walks through the SQL function to do the optimization. In the plpgsql function this does not happen. Someone with more knowledge of the internals will have to fill in the details.

> 3.) If I try to make an INSERT from postgres to testdb, it fails. And here the search path is not set: WARNING: test_name_b called: pg_catalog

The search_path is set, it is just very restricted.

> I think that's a gap in the fdw.

Well, as Tom said in his post:

"Anyway, as far as your second question goes, the postgres_fdw wrapper intentionally forces the search_path to be just pg_catalog in its remote session. We're aware that this breaks carelessly written triggers and CHECK functions and so on, but really such functions are broken anyhow; they should not be assuming anything about what search_path they're called with."
> My idea was the following: I want to do some refactoring, so I created a new DB with a new schema holding the refactored data structure. To populate this new structure I wanted to use fdw SELECTs against the old structure. The problem is, I can't make that many changes in the old structure. Further, qualified table access in the functions makes them not reusable in the new structure. O.K., the new idea is: create a new 'export' schema in the old DB with constraint-less, export-formatted tables (JSON), populate those in the old DB, and then use fdw against these specially designed tables.

Well, fdw is not the only game in town; there is also dblink:

http://www.postgresql.org/docs/9.3/interactive/dblink.html

> Thanks
> Andreas

--
Adrian Klaver
adrian.kla...@aklaver.com
[GENERAL] question about window function in C
Hi!

I'm trying to write a window function in C. In the function I'm using a tuplesort to sort the window data and then do some processing. Now when I'm running the function I get this:

WARNING: temporary file leak: File 43 still referenced

The warning comes from my failure to call tuplesort_end at the appropriate time. From the beginning I thought that calling tuplesort_end when current_pos in the window approached WinGetPartitionRowCount would do it. That seemed to work at first, but when I put a LIMIT clause in the query it doesn't work. So I think I need to use some callback mechanism to get that to work. I've found some references in the postgres source code to RegisterExprContextCallback, which seemed promising, but I have no idea how to get the right expression context to put in as the first argument to that function.

So my question is: how do I shut down the tuplesort properly after the last call to my window function?

I'm running PostgreSQL 9.3.5 on i686-pc-linux-gnu, compiled by gcc-4.4.real (Debian 4.4.5-8) 4.4.5, 32-bit

Best Regards
Dan S
Re: [GENERAL] question about window function in C
Dan S strd...@gmail.com writes:

> I'm trying to write a window function in C. In the function I'm using a tuplesort to sort the window data and then do some processing.

Hmm ... why do you feel you need to do that? The window function's input should already be sorted according to the window specification.

> Now when I'm running the function I get this:
> WARNING: temporary file leak: File 43 still referenced
> The warning comes from my failure to call tuplesort_end at the appropriate time.

Unsurprising.

> So I think I need to use some callback mechanism to get that to work. I've found some references in the postgres source code to RegisterExprContextCallback, which seemed promising, but I have no idea how to get the right expression context to put in as the first argument to that function.

I don't think there is one :-(. WindowAgg has a per-input-tuple econtext, and a per-output-tuple econtext, but what you'd need for this is a partition-lifespan econtext, which doesn't exist. It's possible that we could promote the partcontext memory context into a full econtext so as to support this sort of behavior. But I'd want to see a reasonably convincing use-case for it.

regards, tom lane
[GENERAL] How to install pgAdmin 1.20 on Mint Rebecca?
Can someone point me to simple instructions on how to install pgAdmin 1.20 on Linux Mint Rebecca (17.1 - based on Ubuntu trusty)?

Thanks,
Edson Richter
Re: [GENERAL] How to install pgAdmin 1.20 on Mint Rebecca?
On 12/26/2014 10:41 AM, Edson Carlos Ericksson Richter wrote:
> Can someone point me to simple instructions on how to install pgAdmin 1.20 on Linux Mint Rebecca (17.1 - based on Ubuntu trusty)?

Well, a fairly extensive search showed no package for that combination of OS version and pgAdmin version. So the choices seem to be:

1) Stick with pgAdmin 1.18.
2) Wait for someone to release a newer pgAdmin package for Mint.
3) Install from source:
http://git.postgresql.org/gitweb/?p=pgadmin3.git;a=blob;f=INSTALL;h=4a96ce64ff695a3942d41b5982540ce61fcb996d;hb=REL-1_20_0_PATCH

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: [GENERAL] question about window function in C
Well, I'm trying to implement a window function that works on range types and produces 'atomic ranges' for each input range. Let's say I have a set of ranges, some overlapping, some not, and I want to split each range at every boundary of every overlapping range and return those. So for each range r I want to return an array of ranges that consists of range r split at every overlapping range boundary.

I need to consider both upper and lower boundaries, and to be able to do this in one pass over the data I need to sort both the upper and lower boundaries and also sort the original ranges, so I can loop over the ranges and boundaries in lockstep to produce the arrays. As a last step I sort the arrays of atomic ranges back into the original order in the window, so I can read out the tuplesort in that order and return each array to its corresponding range r.

(The result can be used to answer questions like what is the maximum number of simultaneously overlapping ranges, and at which ranges the maximum occurs.)

Best Regards
Dan S

2014-12-26 18:57 GMT+01:00 Tom Lane t...@sss.pgh.pa.us:
[...]
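To make the desired input/output concrete, here is a pure-SQL sketch of the same computation on made-up bounded ranges (an illustration, not Dan's C implementation; int4range is chosen arbitrarily):

    CREATE TEMP TABLE r (id int, rng int4range);
    INSERT INTO r VALUES (1, '[1,5)'), (2, '[3,8)'), (3, '[10,12)');

    -- collect every boundary, cut the number line into adjacent pieces,
    -- then hand each range the pieces it contains
    WITH bounds AS (
        SELECT DISTINCT b
        FROM r, LATERAL (VALUES (lower(rng)), (upper(rng))) v(b)
    ), cuts AS (
        SELECT b, lead(b) OVER (ORDER BY b) AS nxt FROM bounds
    )
    SELECT r.id, r.rng,
           array_agg(int4range(c.b, c.nxt) ORDER BY c.b) AS atomic_ranges
    FROM r
    JOIN cuts c ON c.nxt IS NOT NULL
               AND r.rng @> int4range(c.b, c.nxt)
    GROUP BY r.id, r.rng
    ORDER BY r.id;

    -- id |   rng   | atomic_ranges
    -- ---+---------+---------------------
    --  1 | [1,5)   | {"[1,3)","[3,5)"}
    --  2 | [3,8)   | {"[3,5)","[5,8)"}
    --  3 | [10,12) | {"[10,12)"}

The one-pass C window function would presumably avoid materializing this join, which is the point of doing it with tuplesorts.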
Re: [GENERAL] help troubleshooting invalid page header error
Hi Cory,

After recovering the table, turn off the zero_damaged_pages parameter.

On Fri, Dec 26, 2014 at 9:13 PM, Cory Zue c...@dimagi.com wrote:
[...]
Re: [GENERAL] Checksums and full_page_writes
On Fri, Dec 26, 2014 at 01:11:57PM +0300, Borodin Vladimir wrote:
> Hi all. I've read thread [0], src/backend/storage/page/README and everything I found in the documentation about checksums [1, 2], but I haven't understood what my risk is if I enable checksums during initdb and turn off full_page_writes. Am I right that I can get torn pages on disk in that case, but that I will find out about it during checksum verification when the page is read from disk into shared buffers?

Yes.

> And is there any way to recover such a torn page (from a replica, or from a backup plus archived logs, for example)?

There is no way to _fix_ the torn page, but you can ignore the error with ignore_checksum_failure:

http://www.postgresql.org/docs/9.4/static/runtime-config-developer.html

--
Bruce Momjian br...@momjian.us http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +
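A usage sketch of that developer option (the table name is hypothetical; note the page contents may still be garbage even when the read succeeds):

    SET ignore_checksum_failure = on;    -- superuser-only, session-local
    SELECT count(*) FROM damaged_table;  -- checksum failure becomes a warning, not an error
    SET ignore_checksum_failure = off;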
Re: [GENERAL] Checksums and full_page_writes
On Fri, Dec 26, 2014 at 04:07:11PM -0500, Bruce Momjian wrote:
> There is no way to _fix_ the torn page, but you can ignore the error with ignore_checksum_failure:
> http://www.postgresql.org/docs/9.4/static/runtime-config-developer.html

Oh, you can fail over to the replica, of course, but there is no clean way to restore just the torn page.

--
Bruce Momjian br...@momjian.us http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +
Re: [GENERAL] How to install pgAdmin 1.20 on Mint Rebecca?
Yes, building from source worked flawlessly, even for me (and I'm not a build-C-apps-from-source kind of guy).

Regards,
Edson

On 26-12-2014 17:15, Adrian Klaver wrote:
[...]
Re: [GENERAL] help troubleshooting invalid page header error
Hi Chiru,

I am trying to pg_dump the database to have a snapshot of the current state. I've turned on 'zero_damaged_pages', but pg_dump is still failing with an invalid page header error - this time from what looks like a sequence object that is auto-setting IDs on a table. Any advice on how to remove this error? Here is the full query that's failing:

SELECT sequence_name, start_value, last_value, increment_by,
       CASE WHEN increment_by > 0 AND max_value = 9223372036854775807 THEN NULL
            WHEN increment_by < 0 AND max_value = -1 THEN NULL
            ELSE max_value END AS max_value,
       CASE WHEN increment_by > 0 AND min_value = 1 THEN NULL
            WHEN increment_by < 0 AND min_value = -9223372036854775807 THEN NULL
            ELSE min_value END AS min_value,
       cache_value, is_cycled, is_called
FROM unfinishedsubmissionstub_id_seq

On Fri, Dec 26, 2014 at 2:35 PM, chiru r chir...@gmail.com wrote:
[...]
Re: [GENERAL] help troubleshooting invalid page header error
(nevermind - it looks like the zero_damaged_pages setting only took for the duration of the session)

On Fri, Dec 26, 2014 at 5:15 PM, Cory Zue c...@dimagi.com wrote:
[...]
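For reference, the scoping issue here: SET only lasts for the issuing session, so pg_dump's own connection never saw it. A sketch of two ways to make it stick for the dump (the database name is hypothetical, and per chiru's advice the setting should be turned back off afterwards):

    -- applies to every new connection to this database, including pg_dump's
    ALTER DATABASE mydb SET zero_damaged_pages = on;
    -- ...or put zero_damaged_pages = on in postgresql.conf and reload.
    -- Afterwards:
    ALTER DATABASE mydb RESET zero_damaged_pages;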
Re: [GENERAL] How to install pgAdmin 1.20 on Mint Rebecca?
On 12/26/2014 02:16 PM, Edson Carlos Ericksson Richter wrote:
> Yes, building from source worked flawlessly, even for me (and I'm not a build-C-apps-from-source kind of guy).

Great that it worked for you. I have had issues with the wxGTK portion at various times, though it seems that may not be a problem in the future:

http://pgsnake.blogspot.com/2014/12/the-story-of-pgadmin.html

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: [GENERAL] Drastic select count performance hit when jsonb GIN indices are present
> Getting back to my original point - you pointed out that for queries that need a decent % of the table it will be cheaper to do a scan, which is exactly what the query planner does for the relational version. If it only needs a small % of the values it looks at the index, and for a large % it goes for a scan (it also puts everything in shared buffers and is lightning quick!). Is this just a lack of maturity in the jsonb planner, or am I missing something?

Hi Anton,

Good selectivity estimators exist only for scalar data types. For complex data types such as json/jsonb, building a reasonable selectivity estimator is a very complicated task, so the database can only guess in these cases. In your case the database guessed the number of returned rows with a three-orders-of-magnitude error (estimated 3716 rows, actual 1417152 rows).

Personally, I don't expect serious progress in json/jsonb selectivity estimators in the near future, so it's better to avoid low-selectivity queries against indexed json/jsonb fields.

--
Maxim Boguk
Senior Postgresql DBA
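One partial workaround (an editorial suggestion, not from the thread): the planner does collect statistics for expression indexes, so indexing the extracted scalar gives it real numbers to estimate with:

    CREATE INDEX idx_myevents_type ON myevents ((event ->> 'event'));
    ANALYZE myevents;   -- pg_stats now has a row for the indexed expression
    EXPLAIN SELECT count(*) FROM myevents WHERE event ->> 'event' = 'delivered';
    -- the row estimate should now be close to the real 1417152, letting the
    -- planner pick the seq scan on its own for low-selectivity values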
Re: [GENERAL] How to install pgAdmin 1.20 on Mint Rebecca?
On 26-12-2014 22:13, Adrian Klaver wrote:
> Great that it worked for you. I have had issues with the wxGTK portion at various times, though it seems that may not be a problem in the future:
> http://pgsnake.blogspot.com/2014/12/the-story-of-pgadmin.html

Well, I followed the build instructions step by step and found no errors at all. Maybe I'm just lucky, but I prefer to believe that the instructions are well written.

I love Mint for Java + PostgreSQL development. I've never found an easier-to-maintain OS - even easier than Windows boxes, which are almost plug and play (and believe me: I was an MCSE for a long while).

Regards,
Edson
Re: [GENERAL] question about window function in C
On Fri, Dec 26, 2014 at 1:19 PM, Dan S strd...@gmail.com wrote:
[...]
Re: [GENERAL] question about window function in C
On Fri, Dec 26, 2014 at 11:41 PM, Merlin Moncure mmonc...@gmail.com wrote:
[...]

shoot -- sorry for empty mail -- misclick. anyways, can you give some precise examples of what you want to occur? for example, a set of sql taking actual inputs and the output you would like to see...

merlin