Re: [HACKERS] [PATCH] Tests for reloptions

2017-10-29 Thread Nikolay Shaplov
В письме от 19 октября 2017 14:20:52 Вы написали:

> Yeah, it would perhaps be good idea to ensure we don't break things that
> are documented to work.  If the tests don't take too long, I'm not
> opposed to testing every single option.  As you say, code coverage is
> important but it's not the only goal.
> 
> I'm hesitant to hardcode things like the number of bits in bloom, as you
> had in the original.  If I understand correctly, that number could
> change with compile options (different blocksize?), so I removed that
> part.  I also fixed a few spelling errors.
> 
> And pushed.  Let's see what the buildfarm says about this.
While merging this commit to my branch, I found two issues that as I think 
needs fixing. Hope this does not require creating new commit request...

First is missing tab.

Second i think it is better to write "The OIDS option is not stored as 
_reloption_" otherwise it cat be read as if it is not stored at all.

See patch in the attachment.

Thank you again for your work with the patch. I've seen how much you have 
change it.

PS do I get right that 80 character code width rule is applied to SQL tests 
too?

-- 
Do code for fun. Can do it for money (Perl & C/C++ ~10h/week)diff --git a/src/test/regress/sql/reloptions.sql b/src/test/regress/sql/reloptions.sql
index c9119fd..21a315b 100644
--- a/src/test/regress/sql/reloptions.sql
+++ b/src/test/regress/sql/reloptions.sql
@@ -47,12 +47,12 @@ SELECT reloptions FROM pg_class WHERE oid = 'reloptions_test'::regclass;
 ALTER TABLE reloptions_test RESET (autovacuum_enabled,
 	autovacuum_analyze_scale_factor);
 SELECT reloptions FROM pg_class WHERE oid = 'reloptions_test'::regclass AND
-reloptions IS NULL;
+	reloptions IS NULL;
 
 -- RESET fails if a value is specified
 ALTER TABLE reloptions_test RESET (fillfactor=12);
 
--- The OIDS option is not stored
+-- The OIDS option is not stored as reloption
 DROP TABLE reloptions_test;
 CREATE TABLE reloptions_test(i INT) WITH (fillfactor=20, oids=true);
 SELECT reloptions, relhasoids FROM pg_class WHERE oid = 'reloptions_test'::regclass;

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Tests for reloptions

2017-10-22 Thread Nikolay Shaplov
В письме от 19 октября 2017 14:20:52 Вы написали:

> I'm hesitant to hardcode things like the number of bits in bloom, as you
> had in the original.  If I understand correctly, that number could
> change with compile options (different blocksize?), so I removed that
> part.  

#define MAX_BLOOM_LENGTH(256 * SIGNWORDBITS)

#define SIGNWORDBITS ((int) (BITS_PER_BYTE * sizeof(BloomSignatureWord)))

typedef uint16 BloomSignatureWord;

Here everything is based on uint16, and it is platform independent, as far as 
I can get.

But this is not really important now... 


> I also fixed a few spelling errors.
Thank you. I am not so good with natural languages :-)

> And pushed.
Thanx!

> Let's see what the buildfarm says about this.
It seems to me that it is quite happy about these tests :-)

> Oh, one more thing: be careful when editing parallel_schedule.  There
> are constraints on the number of entries
Oh, I did not payed attention to this issue, through it it mentioned in  
parallel_schedule comments. I've added it to the wiki 
https://wiki.postgresql.org/wiki/Regression_test_authoring So it was all 
described in one place.


> in each group; you had added a
> 20th entry after the comment that the group can only have 19.
Oups it was definitely my mistake. I should be more attentive... :-( 


-- 
Do code for fun. Can do it for money (Perl & C/C++ ~10h/week)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Tests for reloptions

2017-10-06 Thread Nikolay Shaplov
В письме от 3 октября 2017 11:48:43 пользователь Michael Paquier написал:

> > src/backend/access/common/reloptions.c get only 7 lines, it was quite
> > covered by existing test, but all most of the access methods gets some
> > coverage increase:
> > 
> > src/backend/access/brin 1268 -> 1280 (+18)
> > src/backend/access/gin  2924 -> 2924 (0)
> > src/backend/access/gist 1673 -> 2108 (+435)
> > src/backend/access/hash 1580 -> 1638 (+58)
> > src/backend/access/heap 2863 -> 2866 (+3)
> > src/backend/access/nbtree   2565 -> 2647 (+82)
> > src/backend/access/spgist   2066 -> 2068 (+2)
> > 
> > Though I should say that incredible coverage boost for gist, is the result
> > of not steady results of test run. The real value should be much less...
> +-- Test buffering and fillfactor reloption takes valid values
> +create index gist_pointidx2 on gist_point_tbl using gist(p) with
> (buffering = on, fillfactor=50);
> +create index gist_pointidx3 on gist_point_tbl using gist(p) with
> (buffering = off);
> +create index gist_pointidx4 on gist_point_tbl using gist(p) with
> (buffering = auto);
> Those are the important bits doing the boost in gist visibly. To be kept.
> 
> -CREATE INDEX hash_f8_index ON hash_f8_heap USING hash (random float8_ops);
> +CREATE INDEX hash_f8_index ON hash_f8_heap USING hash (random
> float8_ops) WITH (fillfactor=60);
> Woah. So that alone increases hash by 58 steps.
Might be... As I have noticed, tests  for indexes filled with random data test 
coverage, gives random coverage. There needed more random data to give steady 
results. I am gathering statistics now, and later I hope I will offer another 
patch(es) to fix it. So not all of 58 lines might be result of this patch :-)
 
> +CREATE INDEX bloomidx2 ON tst USING bloom (i, t) WITH (length=0);
> +ERROR:  value 0 out of bounds for option "length"
> +DETAIL:  Valid values are between "1" and "4096".
> contrib/bloom contributes to the coverage of reloptions.c thanks to
> its coverage of the add_ routines when the library is loaded. And
> those don't actually improve any coverage, right?
I actually run only simple "make check". running "make check" for bloom will 
also add some add_ routines to coverage.


> > Nevertheless tests touches the reloptions related code, checks for proper
> > error handling, and it is good.
> 
> Per your measurements reloptions.c is improved by 1.7%, or 7 lines as
> you say. Five of them are in parse_one_reloption for integer (2) and
> reals (2), plus one error at the top. Two are in transformRelOptions
> for a valid namespace handling. In your patch reloptions.sql is 141
> lines. Couldn't it be shorter with the same improvements? It looks
> that a lot of tests are overlapping with existing ones.
> 
> > I think we should commit it.
> 
> My take is that a lighter version could be committed instead. My
> suggestion is to keep the new test reloptions minimal so as it tests
> the relopt types and their bounds, and I think that you could remove
> the same bounding checks in the other tests that you added: bloom,
> gist, etc.

I've been thinking a lot, and rereading the patch. When I reread it I've been 
thinking that I would like to add more tests to it now... ;-)

If the only purpose of tests is to get better coverage, then I would agree 
with you. But I've been thinking that tests also should check that everything 
behaves as expected (or as written in documentation).

I would say that options names is a of part of SQL dialect that postgres uses, 
kind of part of the syntax. It is good to be sure that they still supported. 
So I would add a test for every option heap supports.

Checking minimum and maximum values are more disputable. The argumentation for 
it is that min and max are written in documentation, and it is good to check 
that postgres behaves according to documentation...



But it you tell me for sure that test only for code coverage and should not do 
anything more, than I surely remove all tests that are not increase test 
coverage.


-- 
Do code for fun. Can do it for money (Perl & C/C++ ~10h/week)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Tests for reloptions

2017-09-30 Thread Nikolay Shaplov

While working with reloptions refactoring patch, I've written series of tests 
that triggers reloptions related code in all access methods. (I needed it to 
make sure I did not break anything while coding)

I've included these tests to that patch.

Meanwhile Alvaro suggested to commit these tests before the main patch, in 
order to make sure, that this patch does not change usual behavior.

So these tests separated from reloptions patch is in the attachment.

I've removed tests that check functionality that were introduced only in my 
patch, and kept those that checks things that are already in postgres.

I also compared test coverage before and after applying this patch 
(You can also compare, I put coverage results online
http://lj.nataraj.su/2017/reloptions_fix/coverage-master/
http://lj.nataraj.su/2017/reloptions_fix/coverage-patched/ )

Tests adds almost 600 lines to the test covered code (but see note at the end 
of the letter)

src/backend/access/common/reloptions.c get only 7 lines, it was quite covered 
by existing test, but all most of the access methods gets some coverage 
increase:

src/backend/access/brin 1268 -> 1280 (+18)
src/backend/access/gin  2924 -> 2924 (0)
src/backend/access/gist 1673 -> 2108 (+435)
src/backend/access/hash 1580 -> 1638 (+58)
src/backend/access/heap 2863 -> 2866 (+3) 
src/backend/access/nbtree   2565 -> 2647 (+82)
src/backend/access/spgist   2066 -> 2068 (+2)

Though I should say that incredible coverage boost for gist, is the result of 
not steady results of test run. The real value should be much less...

Nevertheless tests touches the reloptions related code, checks for proper 
error handling, and it is good.

I think we should commit it.







  

 

-- 
Do code for fun. Can do it for money (Perl & C/C++ ~10h/week)diff --git a/contrib/bloom/expected/bloom.out b/contrib/bloom/expected/bloom.out
index cbc50f7..df9b7b9 100644
--- a/contrib/bloom/expected/bloom.out
+++ b/contrib/bloom/expected/bloom.out
@@ -210,3 +210,25 @@ ORDER BY 1;
  text_ops | t
 (2 rows)
 
+-- reloptions test
+DROP INDEX bloomidx;
+CREATE INDEX bloomidx ON tst USING bloom (i, t) WITH (length=7, col1 = 4);
+SELECT reloptions FROM pg_class WHERE oid = 'bloomidx'::regclass;
+reloptions 
+---
+ {length=7,col1=4}
+(1 row)
+
+-- check for min and max values
+CREATE INDEX bloomidx2 ON tst USING bloom (i, t) WITH (length=0);
+ERROR:  value 0 out of bounds for option "length"
+DETAIL:  Valid values are between "1" and "4096".
+CREATE INDEX bloomidx2 ON tst USING bloom (i, t) WITH (length=4097);
+ERROR:  value 4097 out of bounds for option "length"
+DETAIL:  Valid values are between "1" and "4096".
+CREATE INDEX bloomidx2 ON tst USING bloom (i, t) WITH (col1=0);
+ERROR:  value 0 out of bounds for option "col1"
+DETAIL:  Valid values are between "1" and "4095".
+CREATE INDEX bloomidx2 ON tst USING bloom (i, t) WITH (col1=4096);
+ERROR:  value 4096 out of bounds for option "col1"
+DETAIL:  Valid values are between "1" and "4095".
diff --git a/contrib/bloom/sql/bloom.sql b/contrib/bloom/sql/bloom.sql
index 2227460..39b0239 100644
--- a/contrib/bloom/sql/bloom.sql
+++ b/contrib/bloom/sql/bloom.sql
@@ -81,3 +81,14 @@ SELECT opcname, amvalidate(opc.oid)
 FROM pg_opclass opc JOIN pg_am am ON am.oid = opcmethod
 WHERE amname = 'bloom'
 ORDER BY 1;
+
+-- reloptions test
+
+DROP INDEX bloomidx;
+CREATE INDEX bloomidx ON tst USING bloom (i, t) WITH (length=7, col1 = 4);
+SELECT reloptions FROM pg_class WHERE oid = 'bloomidx'::regclass;
+-- check for min and max values
+CREATE INDEX bloomidx2 ON tst USING bloom (i, t) WITH (length=0);
+CREATE INDEX bloomidx2 ON tst USING bloom (i, t) WITH (length=4097);
+CREATE INDEX bloomidx2 ON tst USING bloom (i, t) WITH (col1=0);
+CREATE INDEX bloomidx2 ON tst USING bloom (i, t) WITH (col1=4096);
diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out
index ed03cb9..e03d72d 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -3551,3 +3551,10 @@ create table parted_validate_test_1 partition of parted_validate_test for values
 alter table parted_validate_test add constraint parted_validate_test_chka check (a > 0) not valid;
 alter table parted_validate_test validate constraint parted_validate_test_chka;
 drop table parted_validate_test;
+-- test alter column options
+CREATE TABLE tmp(i integer);
+INSERT INTO tmp VALUES (1);
+ALTER TABLE tmp ALTER COLUMN i SET (n_distinct = 1, n_distinct_inherited = 2);
+ALTER TABLE tmp ALTER COLUMN i RESET (n_distinct_inherited);
+ANALYZE tmp;
+DROP TABLE tmp;
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index 064adb4..16a3b8b 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -2337,7 +2337,7 @@ Options: fastupdate=on, gin_pending_list_limit=128
 CREATE INDEX hash_i4_index ON 

Re: [HACKERS] [PATCH] Move all am-related reloption code into src/backend/access/[am-name] and get rid of relopt_kind for custom AM

2017-09-05 Thread Nikolay Shaplov
В письме от 3 сентября 2017 11:45:43 пользователь Alvaro Herrera написал:

> > I've just made sure that patch is still applyable to the current master.
> > 
> > And I am still waiting for reviews :-)
> 
> I see that this patch adds a few tests for reloptions, for instance in
> bloom.  I think we should split this in at least two commits, one that
> adds those tests before the functionality change, so that they can be
> committed in advance, verify that the buildfarm likes it with the
> current code, and verify the coverage.
This sounds as a good idea. I created such patch and started a new thread for 
it https://www.postgresql.org/message-id/2615372.orqtEn8VGB%40x200m (I will 
add that thread to commitfest as soon as I understand what test should be left 
there)

> I also noticed that your patch adds an error message that didn't exist
> previously,
> 
> +   ereport(ERROR,
> +   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> +  errmsg("col%i should not be grater than length",
> i)));
> 
> this seems a bit troublesome, because of the "col%i" thing 
What the problem with col%i ?

> and also because of the typo.
Oh... I always had bad marks for all natural languages in school :-( ;-)

> I wonder if this means you've changed the way sanity checks work here.
This should not be like this (I hope). I will attentively look at this part of 
code again, and explain what exactly I've done. (I did it a year ago and now 
need some time to recall)

> The new checks around toast table creation look like they belong to a
> completely independent patch also ... 
This sounds reasonable too. Will do it, it is possible, as far as I remember.

> the error message there goes against message style guidelines also.
Oh... these style guidelines... will pay attention to it too...

-- 
Do code for fun. Can do it for money (Perl & C/C++ ~10h/week)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PATCH] Tests for reloptions

2017-09-05 Thread Nikolay Shaplov

This thread continues discussion at 
https://www.postgresql.org/message-id/20170903094543.kkqdbdjuxwa5z6ji@alvherre.pgsql
(Shortly: I refactored reloptions code, Alvaro offered to commit tests before 
the full patch)

> I see that this patch adds a few tests for reloptions, for instance in
> bloom.  I think we should split this in at least two commits, one that
> adds those tests before the functionality change, so that they can be
> committed in advance, verify that the buildfarm likes it with the
> current code, and verify the coverage.

This sounds as a really good idea.

Though I have several questions. This tests also covers some functionality 
that were introduced only in my patch:

1. Forbid SET and RESET options where they should not be changed
2. Forbid creating tables with toasts options when no toast table is created
3. Split StdRdOptions into HeapOptions and ToastOptions and forbid uising Heap 
specific options for toast.

In the patch I've attached I've commented out this functionality. But I am not 
quite sure that it is good idea to commit it this way in master.

May be it would be good to make 1-3 as a separate patches and bring it's tests 
with, as a separate commit. But...

2nd can be easily ported to master. It does not depend much on my reloptions 
patch as far as I remember.

It would be insane to port 1st feature to master. It highly integrated with 
reloptions mechanism, It would require complete reimplementation of this 
feature using old reloptions tools. I totally do not want to do it 

The 3rd functionality came from philosophy one relation-type -- one options 
catalog, that was implemented in my reloptions patch. It is not really needed 
in master without the full patch. With some efforts I think it can be made as a 
separate patch, thought I would also try to avoid it if possible.


So the questions still is: should we commit not existent functionality tests 
commented, uncomented but with no proper error response in expected output, or 
just remove these tests from this patch?

-- 
Do code for fun. Can do it for money (Perl & C/C++ ~10h/week)diff --git a/contrib/bloom/expected/bloom.out b/contrib/bloom/expected/bloom.out
index cbc50f7..37fc7f9 100644
--- a/contrib/bloom/expected/bloom.out
+++ b/contrib/bloom/expected/bloom.out
@@ -210,3 +210,33 @@ ORDER BY 1;
  text_ops | t
 (2 rows)
 
+-- reloptions test
+DROP INDEX bloomidx;
+CREATE INDEX bloomidx ON tst USING bloom (i, t) WITH (length=7, col1 = 4);
+SELECT reloptions FROM pg_class WHERE oid = 'bloomidx'::regclass;
+reloptions 
+---
+ {length=7,col1=4}
+(1 row)
+
+-- check for min and max values
+CREATE INDEX bloomidx2 ON tst USING bloom (i, t) WITH (length=0);
+ERROR:  value 0 out of bounds for option "length"
+DETAIL:  Valid values are between "1" and "4096".
+CREATE INDEX bloomidx2 ON tst USING bloom (i, t) WITH (length=4097);
+ERROR:  value 4097 out of bounds for option "length"
+DETAIL:  Valid values are between "1" and "4096".
+CREATE INDEX bloomidx2 ON tst USING bloom (i, t) WITH (col1=0);
+ERROR:  value 0 out of bounds for option "col1"
+DETAIL:  Valid values are between "1" and "4095".
+CREATE INDEX bloomidx2 ON tst USING bloom (i, t) WITH (col1=4096);
+ERROR:  value 4096 out of bounds for option "col1"
+DETAIL:  Valid values are between "1" and "4095".
+ check post_validate for colN 0) not valid;
 alter table parted_validate_test validate constraint parted_validate_test_chka;
 drop table parted_validate_test;
+-- test alter column options
+CREATE TABLE tmp(i integer);
+INSERT INTO tmp VALUES (1);
+ALTER TABLE tmp ALTER COLUMN i SET (n_distinct = 1, n_distinct_inherited = 2);
+ALTER TABLE tmp ALTER COLUMN i RESET (n_distinct_inherited);
+ANALYZE tmp;
+DROP TABLE tmp;
diff --git a/src/test/regress/expected/brin.out b/src/test/regress/expected/brin.out
index ca80f00..6568b6b 100644
--- a/src/test/regress/expected/brin.out
+++ b/src/test/regress/expected/brin.out
@@ -507,3 +507,6 @@ EXPLAIN (COSTS OFF) SELECT * FROM brin_test WHERE b = 1;
Filter: (b = 1)
 (2 rows)
 
+ Check that changing reloptions for brin index is not allowed
+--ALTER INDEX brinidx SET (pages_per_range = 10);
+--ALTER INDEX brinidx RESET (pages_per_range);
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index 064adb4..16a3b8b 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -2337,7 +2337,7 @@ Options: fastupdate=on, gin_pending_list_limit=128
 CREATE INDEX hash_i4_index ON hash_i4_heap USING hash (random int4_ops);
 CREATE INDEX hash_name_index ON hash_name_heap USING hash (random name_ops);
 CREATE INDEX hash_txt_index ON hash_txt_heap USING hash (random text_ops);
-CREATE INDEX hash_f8_index ON hash_f8_heap USING hash (random float8_ops);
+CREATE INDEX hash_f8_index ON hash_f8_heap USING hash (random float8_ops) WITH (fillfactor=60);
 CREATE UNLOGGED TABLE unlogged_hash_table (id int4);
 CREATE INDEX 

Re: [HACKERS] pgbench tap tests & minor fixes.

2017-08-31 Thread Nikolay Shaplov
Hi All!

I am about to set "Ready for commit" status to this patch. So there is my 
summary for commiter, so one does not need to carefully read all the thread.

This patch is consists of three parts. May be they should be commited 
separately, then Fabien will split them, I think.

1. The tests.

Tests are quite good. May be I myself would write them in another style, but 
"there is more than one way to do it" in perl, you know. 
These test covers more than 90% of C code of pgbench, which is good. (I did 
not check this myself, but see no reason not to trust Fabien)

The most doubtful part of the patch is the following: the patch introduce 
command_checks_all function in TestLib.pm that works like command_like 
function but also allows to check STDERR output and exit status.

First: I have some problem with the name, I would call it command_like_more or 
something similar, but I decided to leave it for commiter to make final choice.

Second: I think that command_checks and command_like do very similar things. I 
think that later all lests should be rewritten to use command_checks, and get 
rid of command_like, And I do not know how to put this better in the 
developing workflow. May be it should be another patch after this one is 
commited.

2. Patch for -t/-R/-L case.

Current pgbench does not process -t/-R/-L case correctly. This was also fixed 
in this patch. 

Now if you set number of transactions using -t/--transactions, combining with 
-R/--rate or -L/--latency-limit, you can be sure there would be not more than 
were specified in -t and they are properly counted.

This part is quite clear

3. \n process in process_backslash_command error output

process_backslash_command raises an error if there are some problems with 
backslash commands, and prints the command that has error.

But it considers that there is always \n symbol at the end of the command and 
just chop it out. But when the backslash command is at the end of the file, 
there is no \n at the end of line.

So this patch introduces expr_scanner_chomp_substring function that works just 
like expr_scanner_get_substring but it skips \n or \r\n symbols at the end of 
line.

I still have some problems with function name here, but have no idea how to do 
it better.



So that's it. I hope my involvement in the review process were useful. Will be 
happy to see this patch commited into master :-)

-- 
Do code for fun. Can do it for money (Perl & C/C++ ~10h/week)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench tap tests & minor fixes

2017-08-20 Thread Nikolay Shaplov
В письме от 25 июня 2017 20:42:58 пользователь Fabien COELHO написал:
> > may be this it because of "end_offset + 1" in expr_scanner_chomp_substring
> > ? Why there is + 1 there?
> 
> Thanks for the debug! Here is a v9 which does a rebase after the
> reindentation and fixes the bad offset.
Sorry I am back here after a big pause (I hope)

I've applied the patch to the current master, and it seems that one test now 
fails:

regress_log_002_pgbench_no_server:

not ok 70 - pgbench option error: prepare after script err /(?^:query mode .* 
before any)/

#   Failed test 'pgbench option error: prepare after script err /(?^:query 
mode .* before any)/'
#   at /home/nataraj/dev/review-
pgbench/ccc/postgresql/src/bin/pgbench/../../../src/test/perl/TestLib.pm line 
369.
#   'connection to database "" failed:
# could not connect to server: No such file or directory
#   Is the server running locally and accepting
#   connections on Unix domain socket "/tmp/.s.PGSQL.5432"?
# '
# doesn't match '(?^:query mode .* before any)'
opts=-i -S, stat=1, out=(?^:^$), err=(?^:cannot be used in initialization), 
name=pgbench option error: init vs run# Running: pgbench -i -S


-- 
Do code for fun. Can do it for money (Perl & C/C++ ~10h/week)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench tap tests & minor fixes

2017-06-25 Thread Nikolay Shaplov
В письме от 15 июня 2017 21:10:12 Вы написали:
> > As for me, I would do expr_scanner_chomp_substring(PsqlScanState, int,
> > int&); that changes end_offset as desired...
> 
> Why not.
> 
> > And use it instead of end_offset = expr_scanner_offset(sstate) - 1;
> 
> I removed these?
> 
> > The second issue: you are removing all trailing \n and \r. I think you
> > should remove only one \n at the end of the string, and one \r before \n
> > if there was one.
> 
> So chomp one eol.
> 
> Is the attached version better to your test?

I've expected from expr_scanner_chomp_substring to decrement end_offset, so it 
would work more like perl chomp function, but the way you've done is also 
good.

The sad think is that in v7 and v8 TAP tests fails. (in v6 it still works, I 
have local branches for all your patches versions). I did not check it bdefore 
in v7, just read the code. It was my mistake 

-- 
Do code for fun. Can do it for money (Perl & C/C++ ~10h/week)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench tap tests & minor fixes

2017-06-25 Thread Nikolay Shaplov
В письме от 25 июня 2017 19:03:54 пользователь Fabien COELHO написал:
> Hello Nikolay,
> 
> >> Is the attached version better to your test?
> > 
> > I've expected from expr_scanner_chomp_substring to decrement end_offset,
> > so it would work more like perl chomp function, but the way you've done
> > is also good.
> 
> Ok.
> 
> > The sad think is that in v7 and v8 TAP tests fails. (in v6 it still works,
> > I have local branches for all your patches versions). I did not check it
> > bdefore in v7, just read the code. It was my mistake
> 
> Could you be more precise please? Which TAP tests are failing? Could you
> give the log showing the issues encountered?

I am building dev postgres with --enable-cassert

and get a lot of 

'pgbench: exprscan.l:354: expr_scanner_get_substring: Assertion `end_offset <= 
strlen(state->scanbuf)' failed.

may be this it because of "end_offset + 1" in expr_scanner_chomp_substring ?

Why there is + 1 there?

> 
> I did "make check" and "make check-world", both PASS.
> 
> ISTM that manually in pgbench right know with patch v8 I have:
> 
>   sh> make check
>   rm -rf '/home/fabien/DEV/GIT/postgresql'/tmp_install
>   /bin/mkdir -p '/home/fabien/DEV/GIT/postgresql'/tmp_install/log
>   make -C '../../..' DESTDIR='/home/fabien/DEV/GIT/postgresql'/tmp_install
>   install >'/home/fabien/DEV/GIT/postgresql'/tmp_install/log/install.log
> 2>&1 rm -rf /home/fabien/DEV/GIT/postgresql/src/bin/pgbench/tmp_check/log
> cd . && TESTDIR='/home/fabien/DEV/GIT/postgresql/src/bin/pgbench'
> PATH="/home/fabien/DEV/GIT/postgresql/tmp_install/usr/local/pgsql/bin:$PATH
> "
> LD_LIBRARY_PATH="/home/fabien/DEV/GIT/postgresql/tmp_install/usr/local/pgsq
> l/lib" PGPORT='65432'
> PG_REGRESS='/home/fabien/DEV/GIT/postgresql/src/bin/pgbench/../../../src/te
> st/regress/pg_regress' prove -I ../../../src/test/perl/ -I .  t/*.pl
> t/001_pgbench_with_server.pl .. ok
>   t/002_pgbench_no_server.pl  ok
>   All tests successful.
>   Files=2, Tests=360,  6 wallclock secs ( 0.04 usr  0.02 sys +  4.53 cusr
> 0.22 csys =  4.81 CPU) Result: PASS
> 
> Which looks ok.

-- 
Do code for fun. Can do it for money (Perl & C/C++ ~10h/week)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench tap tests & minor fixes

2017-06-14 Thread Nikolay Shaplov
В письме от 8 июня 2017 19:56:02 пользователь Fabien COELHO написал:

> > So this should be fixed in both expr_scanner_get_substring cases, and keep
> > last symbol only if it is not "\n".
> 
> Indeed, this is a mess. The code assumes all stuff is a line ending with
> '\n', but this is not always the case.
> 
> Here is a v7 which chomps the final newline only if there is one.

Sorry, but I still have problems with new solution...

First is function name. "expr_scanner_get_line" does not deal with any line at 
all... it gets substring and "chomps" it.

As for me, I would do expr_scanner_chomp_substring(PsqlScanState, int, int&);
that changes end_offset as desired... And use it instead of end_offset = 
expr_scanner_offset(sstate) - 1;



The second issue: you are removing all trailing \n and \r. I think you should 
remove only one \n at the end of the string, and one \r before \n if there was 
one.
I do not think there will be any practical difference between my and yours 
solutions here, but mine is more neat, I think... I do not have enough 
imagination to think about if "\n\r\r\r\r\r\n" case can happen, and what will 
happen of we remove them all... So I suggest to remove only the last one.


-- 
Do code for fun. Can do it for money (Perl & C/C++ ~10h/week)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench tap tests & minor fixes

2017-06-07 Thread Nikolay Shaplov
В письме от 30 мая 2017 17:24:26 Вы написали:


> > I still have three more questions. A new one:
> > 
> > 
> > 
> >my_command->line = expr_scanner_get_substring(sstate,
> >
> >  start_offset,
> > 
> > - end_offset);
> > + end_offset + 1);
> > 
> > 
> > I do not quite understand what are you fixing here,
> 
> I fix a truncation which appears in an error message later on.
> 
> > I did not find any mention of it in the patch introduction letter.
> 
> Indeed. Just a minor bug fix to avoid an error message to be truncated. If
> you remove it, then you can get:
> 
>   missing argument in command "sleep"
>   \slee
> 
> Note the missing "p"...
> 
> > And more, expr_scanner_get_substring is called twice in very similar
> > code, and it is fixed only once. Can you tell more about this fix.
> 
> I fixed the one which was generating truncated messages. I did not notice
> other truncated messages while testing, so I assume that other calls are
> correct, but I did not investigate further, so I may be wrong. Maybe in
> other instances the truncation removes a "\n" which is intended?

I did some investigation: The code there really suppose that there is always 
\n at the end of the line, and truncates the line. It is done in 

/* Get location of the ending newline */
end_offset = expr_scanner_offset(sstate) - 1;

just two lines above the code we are discussing.

When you have one line code /sleep 2ms with no "end of line" symbol at the 
end, it will cut off "s" instead of "\n"

You've fix it, but now it will leave \n, in all sleeps in multiline scripts.

So this should be fixed in both expr_scanner_get_substring cases, and keep last 
symbol only if it is not "\n".

> Here is a v6.
> 
>   - it uses "progress == 0"
> 
>   - it does not change "nxacts <= 0" because of possible wrapping
> 
>   - it fixes two typos in a perl comment about the checks_all function
> 
>   - I kept "checks all" because this talks more to me than "like more"
> if a native English speaker or someone else has an opinion, it is
> welcome.
Ok, let's leave this for commiter to make final decision.

> Also, if someone could run a test on windows, it would be great.
I'll try to ask a friend of mine to run this on windows...

-- 
Nikolay Shaplov, independent Perl & C/C++ developer. Available for hire.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench tap tests & minor fixes

2017-05-08 Thread Nikolay Shaplov
В письме от 8 мая 2017 22:08:51 пользователь Fabien COELHO написал:

> >while (thread->throttle_trigger < now_us -
> >latency_limit &&
> >
> >   /* with -t, do not overshoot */
> >   (nxacts <= 0 || st->cnt < nxacts))
> 
>  ...
> 
> >if (nxacts > 0 && st->cnt >= nxacts)
> >{
> >
> >st->state = CSTATE_FINISHED;
> >break;
> >
> >}
> > 

st->cnt -- number of transactions finished successed or failed, right?

one iteration of for(;;) is for one transaction or really less. Right? We 
can't process two tansactions in one iteration of this loop. So  we can't 
increase st->cnt more then once during one iteration?


So let's look at the while loop:

   while (thread->throttle_trigger < now_us - latency_limit 
&&
   /* with -t, do not overshoot */
   (nxacts <= 0 || st->cnt < nxacts))
{
processXactStats(thread, st, , true, agg);
/* next rendez-vous */
wait = getPoissonRand(thread, throttle_delay);
thread->throttle_trigger += wait;
st->txn_scheduled = thread->throttle_trigger;
}


Let's imagine that thread->throttle_trigger is  now_us - 10 000,
latency_limit is 5 000 and throttle_delay is 100

How many times you would call  processXactStats in this while loop?
And each time it would do st->cnt++

And this while loop is inside for(;;) in which as I said above we can do st-
>cnt++ not more than once. I see no logic here.


PS This is a fast reply. May be it will make things clear fast wither for me 
or for you. I will carefully answer your full letter tomorrow (I hope nothing 
will prevent me from doing it)

-- 
Nikolay Shaplov, independent Perl & C/C++ developer. Available for hire.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench tap tests & minor fixes

2017-04-24 Thread Nikolay Shaplov
В письме от 24 апреля 2017 09:01:18 пользователь Fabien COELHO написал:
> > To sum up:
> > 
> > - I agree to add a generic command TestLib & a wrapper in PostgresNode,
> > 
> >   instead of having pgbench specific things in the later, then call
> >   them from pgbench test script.
> > 
> > - I still think that moving the pgbench scripts inside the test script
> > 
> >   is a bad idea (tm).

My sum up is the following:

I see my job as a reviewer is to tell "I've read the code, and I am sure it is 
good".

I can tell this about this code, if:

- There is no pgbench specific staff in src/test/perl. Or there should be 
_really_big_ reason for it. 

- All the testing is done via calls of TestLib.pm functions. May be these 
functions should be improved somehow. May be there should be some warper 
around them. But not direct IPC::Run::run call.

- All the pgbench scripts are stored in one file. 36 files are not acceptable.  
I would include them in the test script itself. May be it can be done in other 
ways. But not 36 less then 100 byte files in source code tree...


May be I am wrong. I am not a guru. But then somebody else should say "I've 
read the code, and I am sure it is good" instead of me. And it would be his 
responsibility then. But if you ask me, issues listed above are very 
important, and until they are solved I can not tell "the code is good", and I 
see no way to persuade me. May be just ask somebody else...


-- 
Nikolay Shaplov, independent Perl & C/C++ developer. Available for hire.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench tap tests & minor fixes

2017-04-24 Thread Nikolay Shaplov
В письме от 23 апреля 2017 22:02:25 пользователь Fabien COELHO написал:
> Hello Nikolay,
> 
> >> Hmmm. The pre-existing TAP test in pgbench is about concurrent commits,
> >> it
> >> is not to test pgbench itself. Pgbench allows to run some programmable
> >> clients in parallel very easily, which cannot be done simply otherwise. I
> >> think that having it there would encourage such uses.
> > 
> > Since none of us is Tom Lane, I think we have no moral right to encourage
> > somebody doing somebody in postgres.
> 
> I do not get it.
> 
> > People do what they think better to do.
> 
> Probably.
> 
> >> [...] abstraction. For me, all pg standard executables could have their
> >> methods in PostgresNode.pm.
> > 
> > you are speaking about
> > local $ENV{PGPORT} = $self->port;
> > ?
> 
> Yes. My point is that this is the internal stuff of PostgresNode and that
> it should stay inside that class. The point of the class is to organize
> postgres instances for testing, including client-server connections. From
> an object oriented point of view, the method to identify the server could
> vary, thus the testing part should not need to know, unless what is tested
> is this connection variations, hence it make sense to have it there.
> 
> > Why do you need it here after all? Lots of TAP tests for bin utilities
> > runs
> > them using command_like function from TestLib.pm and need no setting
> > $ENV{PGPORT}.
> 
> The test I've seen that do not need it do not connect to the server.
> In order to connect to the server they get through variants from
> PostgresNode which set the variables before calling the TestLib function.
> 
> > Is pgbench so special? If it is so, may be it is good reason to
> > fix utilities from TestLib.pm, so they can take port from PostgresNode.
> 
> Nope. TestLib does not know about PostgresNode, the idea is rather that
> PostgresNode knows and wraps around TestLib when needed.

Actually as I look at this part, I feeling an urge to rewrite this code, and 
change it so, that all command_like calls were called in a context of certain 
node, but it is subject for another patch. In this patch it would be good to 
use TestLib functions as they are now, or with minimum modifications.
 
> > If in these tests we can use command_like instead of pgbench_likes it
> > would increase maintainability of the code.
> 

> "command_like" variants do not look precisely at all results (status,
> stdout, stderr) and are limited to what they check (eg one regex at a
> time). Another thing I do not like is that they expect commands as a list
> of pieces, which is not very readable.
checking all this things sounds as a good idea. 

> 
> Now I can add a command_likes which allows to manage status, stdout and
> stderr and add that in TestLib & PostgresNode .
This is good idea too, I still do not understand why do you want to add it to 
PostgresNode. 

And name command_likes -- a very bad solution, as it can easily be confused 
with command_like. If it is possible to do it backward compatible, I would try 
to improve command_like, so it can check all the results. If it is not, I 
would give this function a name that brings no confusion.

> >>> I think all the data from this file should be somehow imported into
> >>> 001_pgbench.pl and these files should be created only when tests is
> >>> running, and then removed when testing is done.
> >> 
> >> Hmmm. I thought of that but was very unconvinced by the approach: pgbench
> >> basically always requires a file, so what the stuff was doing was
> >> writting
> >> the script into a file, checking for possible errors when doing so, then
> >> unlinking the file and checking for errors again after the run.
> > 
> > I think I was wrong about deleting file after tests are finished. Better
> > keep them.
> 
> Hmmm... Then what is the point not to have them as files to begin with?

Not to have them in source code tree in git.

The rest would be in the answer to another sum up letter.

-- 
Nikolay Shaplov, independent Perl & C/C++ developer. Available for hire.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench tap tests & minor fixes

2017-04-23 Thread Nikolay Shaplov
once chooses the right one, 
there would be no need in additional backslashes.

> Finally, if the script is inside the perl
> script it makes it hard to run the test outside of it when a problem is
> found, so it is a pain.
I've took back my words about deleting. After a first run one will have these 
files "in flesh" so they would be available for further experiments.

I would speak again about maintainability. Having 36 files, most of them <100b 
size -- is a very bad idea for maintainability. If each commit of new 
functionality would add 36 small files, we will drown in these files quite 
soon. 
These files should be generated on fly. I am 100% sure of it.

The way they are generated may vary... I would prefer to have the script 
source code written close to the test that uses it, where it is possible, but 
this is just my wishes. 

 
PS. I've read the perl code through much more carefully. All other things are 
looking quite good to me.

-- 
Nikolay Shaplov, independent Perl & C/C++ developer. Available for hire.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench tap tests & minor fixes

2017-04-20 Thread Nikolay Shaplov
В письме от 17 апреля 2017 14:51:45 пользователь Fabien COELHO написал:

> When developping new features for pgbench, I usually write some tests
> which are lost when the feature is committed. Given that I have submitted
> some more features and that part of pgbench code may be considered for
> sharing with pgsql, I think that improving the abysmal state of tests
> would be a good move.
Hi! Since I used to be good at perl, I like tests, and I've dealt with 
postgres TAP tests before, I think I can review you patch. If it is OK for 
everyone.

For now I've just gave this patch a quick look through... But nevertheless I 
have something to comment:

> The attached patch:
> 
> (1) extends the existing perl tap test infrastructure with methods to test
> pgbench, i.e. "pgbench" which runs a pgbench test and "pgbench_likes"
> which allows to check for expectations.
I do not think it is good idea adding this functions to the PostgresNode.pm. 
They are pgbench specific. I do not think anybody will need them outside of 
pgbench tests. 

Generally, these functions as they are now, should be placed is separate .pm 
file. like it was done in src/test/ssl/

May be you should move some code into some kind of helpers and keep them in 
PostgresNode.pm. 

Or write universal functions that can be used for testing any postgres console 
tool. Then they can be placed in PostgresNode.pm

> (3) add a lot of new very small tests so that coverage jumps from very low
> to over 90% for source files. I think that derived files (exprparse.c,
> exprscan.c) should be removed from coverage analysis.
> 
> Previous coverage status:
> 
>   exprparse.y 0.0 % 0 / 770.0 %0 / 8
>   exprscan.l  0.0 % 0 / 102   0.0 %0 / 8
>   pgbench.c   28.3 %  485 / 1716  43.1 %  28 / 65
> 
> New status:
> 
>   exprparse.y 96.1 %73 / 76   100.0 %  8 / 8
>   exprscan.l  92.8 %90 / 97   100.0 %  8 / 8
>   pgbench.c   90.4 %  1542 / 1705  96.9 % 63 / 65
> 
> The test runtime is about doubled on my laptop, which is not too bad given
> the coverage achieved.
What tool did you use to calculate the coverage?

Lots of small test looks good at first glance, except the point that adding 
lots of small files with pgbench scripts is not great idea. This makes code 
tree too overloaded with no practical reason. I am speaking of 
src/bin/pgbench/t/scripts/001_0* files.

I think all the data from this file should be somehow imported into 
001_pgbench.pl and these files should be created only when tests is running, 
and then removed when testing is done.

I think that job for creating and removing these files should be moved to 
pgbench and pgbench_likes. But there is more then one way to do it. ;-)



That's what I've noticed so far... I will give this patch more attentive look 
soon.

-- 
Nikolay Shaplov, independent Perl & C/C++ developer. Available for hire.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] BRIN autosummarize tests

2017-04-19 Thread Nikolay Shaplov

Hi!

Alvaro, you've recently commited patch with BRIN autosummarize

Tell me, this feature can't be really tested via regression tests?

Because now I am rebasing my reloptions patch (again!), and as it was partly 
rebased, I run make check. 
At that point I did not moved implementation of this option completely, and 
thus autosummarize were forever set to false.

And in this obviously broken situation all tests were successful when I run 
them. May be there can be a way to make sure it is really works, and one 
should add it to the tests?



-- 
Nikolay Shaplov, independent Perl & C/C++ developer. Available for hire.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Move all am-related reloption code into src/backend/access/[am-name] and get rid of relopt_kind for custom AM

2017-03-28 Thread Nikolay Shaplov
В письме от 26 марта 2017 15:02:12 пользователь Alvaro Herrera написал:
> Nikolay Shaplov wrote:
> > If I would think about it now: we always know how many options we will
> > have. So we can just pass this number to palloc and assert if somebody
> > adds more options then expected... What do yo think about it.
> 
> I think we need to preserve the ability to add custom options, because
> extensions may want to do that.

I've been thinking about it for a while... I think this might lead to reducing 
the quality of the code...

For example: There was 10 options for some relation type. I decided to add one 
more option, but i did not ++ the number of options for 
allocateOptionsCatalog. So space for 10 options were allocated, and then when 
11th option is added, optionCatalogAddItem would allocate space for ten more 
options, and nine of them would not be used. And nobody will notice it.

So, I see here four solutions:

0. Leave it as it was. (We both do not like it)
1. Forbid dynamic number of options (you do not like it)
2. Allow dynamic number of options only for special cases, and in all other 
cases make them strict, and asserting if option number is wrong. This can be 
done in two ways:
2a. Add strict boolean flag, that tells if allow to add more options or not
2b. Do not add any flags, but if number of items is specified, then process 
number of items in strict mode. If it is set to -1, then do as it is done now, 
totally dynamically.

I would prefer 2b, if you sure that somebody will need dynamic number of 
options.

PS. I hardly believe that there can be dynamic number of options, as this 
options later are mapped into C-structure that is quite static. No use case 
comes into my mind, where I would like to have dynamic number of options, not 
knowing at build time, how many of them there would be.



-- 
Nikolay Shaplov, independent Perl & C/C++ developer. Available for hire.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Move all am-related reloption code into src/backend/access/[am-name] and get rid of relopt_kind for custom AM

2017-03-28 Thread Nikolay Shaplov
В письме от 17 марта 2017 14:21:26 пользователь Alvaro Herrera написал:

> Please make sure to mark functions as static (e.g. bringetreloptcatalog).
I am a bit confused here:

For brin and nbtree this is a good idea: brin.c and nbtree.c has AM-handler 
inside it, and  there are other static functions there. Adding one more static 
function here seems to be quite logical.

For gin, gist and spgist, authors seems to use [index_name]_private.h files to 
hide internal functions from outside code. In ginutil.c and spgutils.c, where 
AM-handler is located, there is no static functions at all...  gist.c has, but 
I think I should write similar code for all *_private.h indexes. 

So I think it wold be good to hide catalog function via *_pricate.h include 
file for these three indexes.

hash.c is quite a mess...
There is no hash_private.h, AM-handles is located in hash.c, that has "This 
file contains only the public interface routines." comment at the beginning, 
and there is no static functions inside. I do not know what is the right way 
to hide hashgetreloptcatalog function here...

What would you advise?


-- 
Nikolay Shaplov, independent Perl & C/C++ developer. Available for hire.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Move all am-related reloption code into src/backend/access/[am-name] and get rid of relopt_kind for custom AM

2017-03-26 Thread Nikolay Shaplov
В письме от 26 марта 2017 15:02:12 Вы написали:
> Nikolay Shaplov wrote:
> > If I would think about it now: we always know how many options we will
> > have. So we can just pass this number to palloc and assert if somebody
> > adds more options then expected... What do yo think about it.
> 
> I think we need to preserve the ability to add custom options, because
> extensions may want to do that.
Ok. At least this will not do any harm :-)

-- 
Nikolay Shaplov, independent Perl & C/C++ developer. Available for hire.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Move all am-related reloption code into src/backend/access/[am-name] and get rid of relopt_kind for custom AM

2017-03-26 Thread Nikolay Shaplov
В письме от 23 марта 2017 16:14:58 пользователь Fabrízio de Royes Mello 
написал:
> On Thu, Mar 23, 2017 at 3:58 PM, Alvaro Herrera <alvhe...@2ndquadrant.com>
> 
> wrote:
> > Copying Fabrízio Mello here, who spent some time trying to work on
> > reloptions too.  He may have something to say about the new
> > functionality that this patch provides.
> 
> Thanks Álvaro, I'll look the patch and try to help in some way.
Thank you, that would be great!

-- 
Nikolay Shaplov, independent Perl & C/C++ developer. Available for hire.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Move all am-related reloption code into src/backend/access/[am-name] and get rid of relopt_kind for custom AM

2017-03-26 Thread Nikolay Shaplov
В письме от 17 марта 2017 14:21:26 пользователь Alvaro Herrera написал:
> I gave this patch a quick skim.
Thanks!

> At first I was confused by the term
> "catalog"; I thought it meant we stored options in a system table.  But
> that's not what is meant at all; instead, what we do is build these
> "catalogs" in memory.  Maybe a different term could have been used, but
> I'm not sure it's stricly necessary.  I wouldn't really bother.
Yes "catalog" is quite confusing. I did not find better name while I was 
writing the code, so I take first one, that came into my mind.
If you, or somebody else, have better idea how to call this sets of options 
definitions I will gladly rename it, as catalog is a bad name. May be 
OptionsDefSet instead of OptionsCatalog?


> I'm confused by the "no ALTER, no lock" rule.  Does it mean that if
> "ALTER..SET" is forbidden?  Because I noticed that brin's
> pages_per_range is marked as such, but we do allow that option to change
> with ALTER..SET, so there's at least one bug there, and I would be
> surprised if there aren't any others.

If you grep, for example, gist index code for "buffering_mode" option, you will 
see, that this option is used only in gistbuild.c. There it is written into 
the meta page, and afterwards, value from meta page is used, and one from 
options, is just ignored.
Nowdays you can successfully alter this value, but this will not affect 
anything until index is recreated... I thought it is very confusing behavior 
and decided that we should just forbid such alters.


> Please make sure to mark functions as static (e.g. bringetreloptcatalog).
Ok. Will do.

> Why not pass ->struct_size and ->postprocess_fun (which I'd rename it as
> ->postprocess_fn, more in line with our naming style) as a parameter to
> allocateOptionsCatalog?
struct_size -- very good idea!
postprocess_fn -- now it is rarely used. In most cases it is NULL. May be it 
would be ok to set it afterwards in that rare cases when it is needed.

> Also, to avoid repalloc() in most cases (and to
> avoid pallocing more options that you're going to need in a bunch of
> cases, perhaps that function should the number of times you expect to
> call AddItems for that catalog (since you do it immediately afterwards
> in all cases I can see), and allocate that number.  If later another
> item arrives, then repalloc using the same code you already have in
> AddItems().
I've copied this code from reloptions code for custom options. Without much 
thinking about it.

If I would think about it now: we always know how many options we will have. 
So we can just pass this number to palloc and assert if somebody adds more 
options then expected... What do yo think about it.


> Something is wrong with leading whitespace in many places; either you
> added too many tabs, or the wrong number spaces; not sure which but
> visually it's clearly wrong.  ... Actually there are whitespace-style
> violations in several places; please fix using pgindent (after adding
> any new typedefs your defining to typedefs.list).
I will run pgindent on my code.



-- 
Nikolay Shaplov, independent Perl & C/C++ developer. Available for hire.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [DOC][PATCH]bloom index options limits

2016-06-03 Thread Nikolay Shaplov

Hi! I've changed limits of the length option in the "Introduction" section. 
They should be 1 ≤ colN ≤ 2048 instead of 1 < colN < 2048 (I've talk with the 
author about it)

I've also added minimal maximum and default values for description of options 
in the "Parameters" section, because I think user should be able to see full 
description of the options in the section where they are described, not in the 
introduction.

-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Companydiff --git a/doc/src/sgml/bloom.sgml b/doc/src/sgml/bloom.sgml
index 49cb066..36ddc39 100644
--- a/doc/src/sgml/bloom.sgml
+++ b/doc/src/sgml/bloom.sgml
@@ -23,7 +23,7 @@
Since a signature is a lossy representation of all indexed attributes,
search results must be rechecked using heap information.
The user can specify signature length (in uint16, default is 5) and the
-   number of bits, which can be set per attribute (1 < colN < 2048).
+   number of bits, which can be set per attribute (1  colN  2048).
   
 
   
@@ -51,7 +51,8 @@
 length
 
  
-  Length of signature in uint16 type values
+  Number of uint16 units in signature, e. g. default value length = 5
+  defines 80-bit signature. Allowed values for length are from 1 to 256.
  
 

@@ -61,7 +62,8 @@
 col1  col16
 
  
-  Number of bits for corresponding column
+  Number of bits for corresponding column. Allowed values are from 1
+  to 2048, but not greater than length*16. Default value is 2.
  
 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH][Documination] Add optional USING keyword before opclass name in INSERT statemet

2016-05-31 Thread Nikolay Shaplov
В письме от 31 мая 2016 15:38:38 пользователь Robert Haas написал:

> >>> 99% of the time, you'd be right.  But this is an unusual case, for the
> >>> reasons I mentioned before.
> >> 
> >> I tend to agree with Nikolay.  I can't see much upside in making this
> >> change.  At best, nothing will break.  At worst, something will break.
> >> But how do we actually come out ahead?
> > 
> > We come out ahead by not having to make the documentation more confusing.
> > 
> > Basically, we have the opportunity to fix an ancient mistake here at
> > very low cost.  I do not think that doubling down on the mistake is
> > a better answer.
> 
> I'm not convinced, but we don't have to agree on everything...
I am not convinced too. But I will not argue hard for the patch as far as my 
main goal was to report inconsistency. Through the I consider Tom's proposal 
quite strange...


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Move all am-related reloption code into src/backend/access/[am-name] and get rid of relopt_kind

2016-05-27 Thread Nikolay Shaplov
В письме от 27 мая 2016 15:05:58 Вы написали:
> Nikolay Shaplov wrote:
> > Story start from the point that I found out that a.m. can not forbid
> > changing some of it's reloptions with ALTER INDEX command. That was not
> > necessary before, because all reloptions at that existed at that time can
> > be changed on fly. But now for bloom index it is unacceptable, because
> > for changing bloom's reloptions for existing index will lead to index
> > malfunction.
> 
> Hmm, this sounds like a bug to me.  In BRIN, if you change the
> pages_per_range option for an existing index, the current index
> continues to work because the value used during the last index build is
> stored in the metapage.  Only when you reindex after changing the option
> the new value takes effect.
> 
> I think Bloom should do likewise.

I do not think that it is the best behavior. Because if we came to this 
situation, the current value of pages_per_range that index actually using is 
not available for user, because he is not able to look into meta page.

In this case it would be better either forbid changing the options, so it 
would be consistent, or force index rebuild, then it would be consistent too. 
I would vote for first behavior as this is less work to do for me, and can be 
changed later, if it is really needed for some case.

PS sorry I did not add mail list to the CC, so I send it second time...


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Move all am-related reloption code into src/backend/access/[am-name] and get rid of relopt_kind

2016-05-27 Thread Nikolay Shaplov
В письме от 24 мая 2016 17:12:16 пользователь Nikolay Shaplov написал:

While working on this patch I met some difficulties that makes me to completely 
rewrite a code that is responsible for interacting reloptions.c with access 
methods.

Story start from the point that I found out that a.m. can not forbid changing 
some of it's reloptions with ALTER INDEX command. That was not necessary  
before, because all reloptions at that existed at that time can be changed on 
fly. But now for bloom index it is unacceptable, because for changing bloom's 
reloptions for existing index will lead to index malfunction.

I was thinking about adding for_alter flag argument for parseRelOptions funtion 
and pass it all the way from indexcmds.c & tablecmds.c, and add some flag in 
the definition of reloptions to mark those reloptions that should not be used 
in ALTER INDEX clause. 

This theoretically this will give us a way to throw error when user tries to 
change an reloption that should not be changed.

But unfortunately it turned out that in ATExecSetRelOptions new reloptions is 
merged with existing reloptions values using transformRelOptions, and only 
after that the result is sent for validation. 

So on the step of the validation we can not distinguish old options from a new 
one. 

I've been thinking how to work this around, but found out that there is no 
simple way. I can pass another array of flags through the whole call stack. But 
this make all thing uglier. 

I can create another function that do pre-validation for new reloptions, 
before doing final validation of the whole set. But this will make me to have  
to entities available from the a.m.: the descriptor of  reloptions and 
function for custom validation of the results (i.e. like adjustBloomOptions 
for Bloom).

But it would be not good if we dedicate two entires of a.m. to reoptions 
definition. And there can be even more... 

So I came to a conclusion: gather all the staff that defines the all the 
behavior of reloption parsing and validation into one structure, both array of 
relopt_value, custom validation function, and all the other staff.

And this structure is the only thing that a.m. gives to reloptions.c code for 
parsing and validation purposes. And that should be enough.

I hope this make code more flexible and more easy to understand.



-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH][Documination] Add optional USING keyword before opclass name in INSERT statemet

2016-05-26 Thread Nikolay Shaplov
В письме от 26 мая 2016 10:05:56 пользователь Tom Lane написал:

> > 2. I think expression with USING in it is more human readable:
> > CREATE INDEX (xxx op_yyy);
> > is less sensible then
> > CREATE INDEX (xxx USING op_yyy);
> 
> Yes.  If we were working in a green field, it would have been better to
> make USING (or some other reserved word) required, not optional, there.
> That would have been better for readability and would have avoided some
> syntactic headaches, such as the need to parenthesize the expressions in
> expression indexes.  However, we're about 18 years too late to make that
> decision.  Opclass with no introductory keyword is the entrenched standard
> at this point, and we're never going to be able to remove it.
No, but we cat keep "USING" as an optional keyword there as it were, just 
mention it in the docs. It seems logical to me. 

// Actually I did not expected any discussion for this case. Documentations 
missed an optional keyword, documentation should be fixed. I am surprised that 
there can be any other opinions ;-)

-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Move all am-related reloption code into src/backend/access/[am-name] and get rid of relopt_kind

2016-05-26 Thread Nikolay Shaplov
В письме от 25 мая 2016 14:03:17 Вы написали:

> > > > >This all should me moved behind "access method" abstraction...
> > > > 
> > > > +1 relopt_kind should be moved in am, at least. Or removed.
> > > 
> > > Hm, but we have tablespace options too, so I'm not sure that using AM as
> > > abstraction level is correct.
> > 
> > We will use am for all indexes, and keep all the rest in relopotion.c for
> > non-indexes. May be divided options catalog into sections one section for
> > each kind.
> That makes sense.  I can review the patch later.
That would be great! ;-)


> 
> > And as I also would like to use this code for dynamic attoptions later, I
> > would like to remove relopt_kind at all. Because it spoils live in that
> > case.
> As I remember, Fabrízio (in CC) had a patch for dynamic reloptions, but
> there was some problem with it and we dumped it; see
> https://www.postgresql.org/message-id/CAFcNs+rqCq1H5eXW-cvdti6T-xo2STEkhjREx
> =odmakk5ti...@mail.gmail.com for previous discussion.

I've read the start of that thread. In my opinion Fabrízio offering quite 
different thing. I am speaking about adding options to a new object (access 
method or later operator class). You add these options when you create this 
object and you have a monopoly of defining options for it.

Fabrízio offers adding options for relkind that already exists. This seems to 
be a complex thing. You should resolve conflicts between two extensions that 
want to define same option for example. Somehow deal with the situation when 
the extension is dropped. Should we remove reloptions created by that 
extension from pg_class then?

And moreover, as far as I understand reloptions is about how does relation 
operates. And the external extension would not affect this at all. So we are 
speaking not about options of a relation, but about extension that want to 
store some info about some relation. I think in general this extension should 
store it's info into it's own data structures. 

May be there is cases when you will really need such behavior. But it is quite 
different task, have almost nothing in common with things I am going to do :-)

-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH][Documination] Add optional USING keyword before opclass name in INSERT statemet

2016-05-26 Thread Nikolay Shaplov
В письме от 24 мая 2016 12:47:20 пользователь Tom Lane написал:
> Nikolay Shaplov <n.shap...@postgrespro.ru> writes:
> > If I read gram.y code for insert statement, I see that there is an
> > optional
> > USING keyword before opclass name
> > 
> > opt_class:  any_name{ $$ = $1; }
> > 
> > | USING any_name{ $$ = $2; }
> > | /*EMPTY*/ { $$ = NIL; }
> > 
> > ;
> > 
> > but it the documentation this keyword is omitted.
> 
> I think we should seriously consider fixing this code/docs discrepancy
> by making the code match the docs, not vice versa.  That is, let's just
> remove the USING alternative here entirely.
> 
> If we wanted to make the docs match the code, it would not only be
> CREATE INDEX that would have to be patched, because that's not the
> only place that index_elem can appear.  As far as I can find in a
> quick search, none of the relevant statements have ever documented
> that USING is allowed here; nor does it appear that any client-side
> code of ours makes use of the keyword.
> 
> Also, because USING is already used elsewhere in CREATE INDEX (to
> introduce the optional index AM name), I think that documenting its
> use in this clause would add confusion not subtract it.  References
> to "the USING clause in CREATE INDEX" would become ambiguous.
> 
> This wouldn't be something to back-patch, of course, but I think it's
> an entirely reasonable change to make in HEAD.
> 
> Comments?

I have two arguments for not removing USING there. 

1. Backward compatibility. Are you sure, that nobody ever wrote a code using 
this "USING" keyword? I would say it is unlikely, but will not be sure 100%. 
Dropping this backward compatibility (even with small chance that it was ever 
used) for no practical reason is not wise at all. If it might bring some pain 
to somebody, then why do it after all...

2. I think expression with USING in it is more human readable:

CREATE INDEX (xxx op_yyy);

is less sensible then 

CREATE INDEX (xxx USING op_yyy);

in my opinion. In second example person that does not know SQL at all, will 
understand that xxx is main object or action, and op_yyy is about how this xxx 
will be done or used.

If somebody would like to write more readable code, why we should forbid it to 
him?

2.1. As far as I can get the general idea of SQL, there is a tendency to put 
keywords (optional or not) between to object names. Like this

SELECT a _AS_ b from 

I like this tendency

2.2. I am not familiar with SQL ISO standart, and I suppose there is no USING 
at all in that case, but I think it would be good to look there to check for 
it or for something similar

2.3. And the last, when I found out about this keyword, I started to use it in 
my SQL statements that I use while development, and I just liked it. I will 
miss it if you remove it ;-) 


-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Move all am-related reloption code into src/backend/access/[am-name] and get rid of relopt_kind

2016-05-25 Thread Nikolay Shaplov
В письме от 25 мая 2016 13:25:38 Вы написали:
> Teodor Sigaev wrote:
> > >This all should me moved behind "access method" abstraction...
> > 
> > +1 relopt_kind should be moved in am, at least. Or removed.
> 
> Hm, but we have tablespace options too, so I'm not sure that using AM as
> abstraction level is correct.
We will use am for all indexes, and keep all the rest in relopotion.c for 
non-indexes. May be divided options catalog into sections one section for each 
kind.

And as I also would like to use this code for dynamic attoptions later, I 
would like to remove relopt_kind at all. Because it spoils live in that case.

-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PROPOSAL] Move all am-related reloption code into src/backend/access/[am-name] and get rid of relopt_kind

2016-05-24 Thread Nikolay Shaplov

While working on patch for attribute options for indexes (see  
http://www.postgresql.org/message-id/5213596.TqFRiqmCTe@nataraj-amd64 )
I found out that code for reloptions is not flexible at all.

All definitions of attoptons are stored in central catalog in 
src/backend/access/common/reloptions.c
It is done this way for both heap and tuple relations, and also for all index 
access methods that goes with source code. 

Most of the code of the indexes is now hidden behind 
"access method" abstraction, but not the reloption code. If you grep through 
src/backend/access/common/reloptions.c, you will find RELOPT_KIND_GIN, 
RELOPT_KIND_BTREE, RELOPT_KIND_GIST, RELOPT_KIND_SPGIST, RELOPT_KIND_BRIN...

This all should me moved behind "access method" abstraction...

Custom indexes, like postgresql/contrib/bloom can add own reloptions, and 
relopt_kind. But the number of relopt_kinfd available is short (it would be 
enough for reloptions, but if we add attoptions, we will meet shortage soon).

So my proposial is the following: Each access method has it's own catalog of 
options (reloptions, and later attoptions) and when it want to call any 
function from src/backend/access/common/reloptions.c it uses a reference to 
that catalog instead of relopt_kind.

In the patch that is attached to this message, there is an idea of how it can 
be done.

In that patch I've left relopt_kind, and added refence to options catalog, 
and functions from reloptions.c uses that one that is defined. I've implemented 
"catalog reference" version for bloom index, and all the rest still work as 
they were.

In final patch there should be no relopt_kind at all, all descriptions of 
reloptions for indexes should go to src/backend/access/[am-name], reloptions 
for heap, toast, views and so on, should stay in 
src/backend/access/common/reloptions.c but should be stored as separate option 
catalog for each type. I still not sure what to do  with 
RELOPT_KIND_HEAP | RELOPT_KIND_TOAST options. But one or another solutions can 
be found...

-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Companydiff --git a/contrib/bloom/blutils.c b/contrib/bloom/blutils.c
index 05dbe87..a71f7d0 100644
--- a/contrib/bloom/blutils.c
+++ b/contrib/bloom/blutils.c
@@ -36,8 +36,8 @@
 
 PG_FUNCTION_INFO_V1(blhandler);
 
-/* Kind of relation optioms for bloom index */
-static relopt_kind bl_relopt_kind;
+/* Catalog of relation options for bloom index */
+static options_catalog bl_relopt_catalog;
 
 static int32 myRand(void);
 static void mySrand(uint32 seed);
@@ -51,15 +51,18 @@ _PG_init(void)
 	int			i;
 	char		buf[16];
 
-	bl_relopt_kind = add_reloption_kind();
+	bl_relopt_catalog.definitions = NULL;
+	bl_relopt_catalog.num = 0;
+	bl_relopt_catalog.max = 0;
+	bl_relopt_catalog.need_initialization = 1;
 
-	add_int_reloption(bl_relopt_kind, "length",
+	add_int_reloption(0, _relopt_catalog, "length",
 	  "Length of signature in uint16 type", 5, 1, 256);
 
 	for (i = 0; i < INDEX_MAX_KEYS; i++)
 	{
 		snprintf(buf, 16, "col%d", i + 1);
-		add_int_reloption(bl_relopt_kind, buf,
+		add_int_reloption(0, _relopt_catalog, buf,
 	  "Number of bits for corresponding column", 2, 1, 2048);
 	}
 }
@@ -457,7 +460,7 @@ bloptions(Datum reloptions, bool validate)
 		tab[i + 1].offset = offsetof(BloomOptions, bitSize[i]);
 	}
 
-	options = parseRelOptions(reloptions, validate, bl_relopt_kind, );
+	options = parseRelOptions(reloptions, validate, _relopt_catalog, 0, );
 	rdopts = allocateReloptStruct(sizeof(BloomOptions), options, numoptions);
 	fillRelOptions((void *) rdopts, sizeof(BloomOptions), options, numoptions,
    validate, tab, INDEX_MAX_KEYS + 1);
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 89bad05..c804212 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -761,7 +761,7 @@ brinoptions(Datum reloptions, bool validate)
 		{"pages_per_range", RELOPT_TYPE_INT, offsetof(BrinOptions, pagesPerRange)}
 	};
 
-	options = parseRelOptions(reloptions, validate, RELOPT_KIND_BRIN,
+	options = parseRelOptions(reloptions, validate, NULL, RELOPT_KIND_BRIN,
 			  );
 
 	/* if none set, we're done */
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 7448c7f..38fe2c8 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -378,6 +378,7 @@ static bits32 last_assigned_kind = RELOPT_KIND_LAST_DEFAULT;
 static int	num_custom_options = 0;
 static relopt_gen **custom_options = NULL;
 static bool need_initialization = true;
+static int	max_custom_options = 0;
 
 static void initialize_reloptions(void);
 static void parse_one_reloption(relopt_value *option, char *text_str,
@@ -496,32 +497,31 @@ add_reloption_kind(void)
  *		main parser table.
  */
 static void
-add_reloption(relopt_gen

[HACKERS] [PROPOSAL][PROTOTYPE] Individual options for each index column: Opclass options

2016-05-24 Thread Nikolay Shaplov
e is written, it is good, but you 
can't reuse it for attoptions as it is binded to relopt_kind and quite 
centralized.

So first step to implement attoptions for indexes, will be rewriting reoptions 
code to get rid of relopt_kind, make all access methods has its own option 
descriptor catalogs and use reference to this catalog instead of relopt_kind.

This should be a separate patch and I think I will start another thread for 
it. I will write another letter for relopt_kind remove issue...

-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Companydiff --git a/contrib/bloom/blutils.c b/contrib/bloom/blutils.c
index 05dbe87..f439403 100644
--- a/contrib/bloom/blutils.c
+++ b/contrib/bloom/blutils.c
@@ -36,8 +36,8 @@
 
 PG_FUNCTION_INFO_V1(blhandler);
 
-/* Kind of relation optioms for bloom index */
-static relopt_kind bl_relopt_kind;
+/* Catalog of relation options for bloom index */
+static options_catalog bl_relopt_catalog;
 
 static int32 myRand(void);
 static void mySrand(uint32 seed);
@@ -51,15 +51,18 @@ _PG_init(void)
 	int			i;
 	char		buf[16];
 
-	bl_relopt_kind = add_reloption_kind();
+	bl_relopt_catalog.definitions = NULL;
+	bl_relopt_catalog.num = 0;
+	bl_relopt_catalog.max = 0;
+	bl_relopt_catalog.need_initialization = 1;
 
-	add_int_reloption(bl_relopt_kind, "length",
+	add_int_reloption(0, _relopt_catalog, "length",
 	  "Length of signature in uint16 type", 5, 1, 256);
 
 	for (i = 0; i < INDEX_MAX_KEYS; i++)
 	{
 		snprintf(buf, 16, "col%d", i + 1);
-		add_int_reloption(bl_relopt_kind, buf,
+		add_int_reloption(0, _relopt_catalog, buf,
 	  "Number of bits for corresponding column", 2, 1, 2048);
 	}
 }
@@ -104,6 +107,7 @@ blhandler(PG_FUNCTION_ARGS)
 	amroutine->amcostestimate = blcostestimate;
 	amroutine->amoptions = bloptions;
 	amroutine->amvalidate = blvalidate;
+	amroutine->amattoptions = NULL;
 
 	PG_RETURN_POINTER(amroutine);
 }
@@ -457,7 +461,7 @@ bloptions(Datum reloptions, bool validate)
 		tab[i + 1].offset = offsetof(BloomOptions, bitSize[i]);
 	}
 
-	options = parseRelOptions(reloptions, validate, bl_relopt_kind, );
+	options = parseRelOptions(reloptions, validate, _relopt_catalog, 0, );
 	rdopts = allocateReloptStruct(sizeof(BloomOptions), options, numoptions);
 	fillRelOptions((void *) rdopts, sizeof(BloomOptions), options, numoptions,
    validate, tab, INDEX_MAX_KEYS + 1);
diff --git a/contrib/intarray/_int.h b/contrib/intarray/_int.h
index d524f0f..a05350f 100644
--- a/contrib/intarray/_int.h
+++ b/contrib/intarray/_int.h
@@ -47,15 +47,24 @@
 
 
 /* bigint defines */
-#define SIGLENINT  63			/* >122 => key will toast, so very slow!!! */
-#define SIGLEN	( sizeof(int)*SIGLENINT )
-#define SIGLENBIT (SIGLEN*BITS_PER_BYTE)
+//#define SIGLENINT  63			/* >122 => key will toast, so very slow!!! */
+//#define SIGLEN	( sizeof(int)*SIGLENINT )
+//#define SIGLENBIT (SIGLEN*BITS_PER_BYTE)
 
-typedef char BITVEC[SIGLEN];
+
+#define SIGLEN_BYTES(i) (sizeof(int)*i)	   /* array signature length in bytes */
+#define SIGLEN_BITS(i) \
+			(SIGLEN_BYTES(i)*BITS_PER_BYTE) /* array signature length in bits */
+
+
+//typedef char BITVEC[SIGLEN];
 typedef char *BITVECP;
 
-#define LOOPBYTE \
-			for(i=0;i<SIGLEN;i++)
+//#define LOOPBYTE \
+//			for(i=0;i<SIGLEN;i++)
+
+#define LOOPBYTE_NEW(siglenint) \
+			for(i=0;i<SIGLEN_BYTES(siglenint);i++)
 
 /* beware of multiple evaluation of arguments to these macros! */
 #define GETBYTE(x,i) ( *( (BITVECP)(x) + (int)( (i) / BITS_PER_BYTE ) ) )
@@ -63,8 +72,12 @@ typedef char *BITVECP;
 #define CLRBIT(x,i)   GETBYTE(x,i) &= ~( 0x01 << ( (i) % BITS_PER_BYTE ) )
 #define SETBIT(x,i)   GETBYTE(x,i) |=  ( 0x01 << ( (i) % BITS_PER_BYTE ) )
 #define GETBIT(x,i) ( (GETBYTE(x,i) >> ( (i) % BITS_PER_BYTE )) & 0x01 )
-#define HASHVAL(val) (((unsigned int)(val)) % SIGLENBIT)
-#define HASH(sign, val) SETBIT((sign), HASHVAL(val))
+
+//#define HASHVAL(val) (((unsigned int)(val)) % SIGLENBIT)
+#define HASHVAL_NEW(val, siglenint) (((unsigned int)(val)) % (unsigned int)(SIGLEN_BITS(siglenint)))
+
+//#define HASH(sign, val) SETBIT((sign), HASHVAL(val))
+#define HASH_NEW(sign, val, siglenint) SETBIT((sign), HASHVAL_NEW(val, siglenint))
 
 /*
  * type of index key
@@ -76,12 +89,23 @@ typedef struct
 	char		data[FLEXIBLE_ARRAY_MEMBER];
 } GISTTYPE;
 
+/* intarray opclass options */
+typedef struct IntArrayOpclassOptions
+{
+	int32		vl_len_;		/* varlena header (do not touch directly!) */
+	int			sig_len_int;	/* FIXME comment */
+}	IntArrayOpclassOptions;
+
+
 #define ALLISTRUE		0x04
 
 #define ISALLTRUE(x)	( ((GISTTYPE*)x)->flag & ALLISTRUE )
 
 #define GTHDRSIZE		(VARHDRSZ + sizeof(int32))
-#define CALCGTSIZE(flag) ( GTHDRSIZE+(((flag) & ALLISTRUE) ? 0 : SIGLEN) )
+
+#define GT_EMPTY_SIZE	GTHDRSIZE
+
+#define GT_SIZE(siglenint) (GTHDRSIZE + SIGLEN_BYTES(siglenint))
 
 #define GETSIGN(x)		( (BITVECP)( (char*)x+

[HACKERS] [PATCH] add missing "USING bloom" into bloom extension documentation

2016-05-24 Thread Nikolay Shaplov

In current bloom documentation an example

CREATE INDEX bloomidx ON tbloom(i1,i2,i3) 
   WITH (length=5, col1=2, col2=2, col3=4);

does not work, as it does not actually create a bloom index

you should add "USING bloom" to the insert statement, in order make this 
example work.

Patch in the attachment fixes an example. Please commit it ;-)

-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Companydiff --git a/doc/src/sgml/bloom.sgml b/doc/src/sgml/bloom.sgml
index 7349095..ab84b69 100644
--- a/doc/src/sgml/bloom.sgml
+++ b/doc/src/sgml/bloom.sgml
@@ -76,7 +76,7 @@
   
 
 
-CREATE INDEX bloomidx ON tbloom(i1,i2,i3) 
+CREATE INDEX bloomidx ON tbloom USING bloom (i1,i2,i3)
WITH (length=5, col1=2, col2=2, col3=4);
 
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PATCH][Documination] Add optional USING keyword before opclass name in INSERT statemet

2016-05-16 Thread Nikolay Shaplov

If I read gram.y code for insert statement, I see that there is an optional 
USING keyword before opclass name


opt_class:  any_name{ $$ = $1; }
| USING any_name{ $$ = $2; }
| /*EMPTY*/ { $$ = NIL; }
;

but it the documentation this keyword is omitted.

I'd like to offer a patch that fixes this problem 

-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Companydiff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index 7dee405..788bc3f 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
  
 
 CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] name ] ON table_name [ USING method ]
-( { column_name | ( expression ) } [ COLLATE collation ] [ opclass ] [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...] )
+( { column_name | ( expression ) } [ COLLATE collation ] [ [ USING ] opclass ] [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...] )
 [ WITH ( storage_parameter = value [, ... ] ) ]
 [ TABLESPACE tablespace_name ]
 [ WHERE predicate ]

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PATCH] amroutine->amsupport from numeric to defined constants

2016-04-26 Thread Nikolay Shaplov

While working with postgres code, I found that for gist index number of 
amsupport procs are defined two times. First in src/include/access/gist.h

#define GISTNProcs  9

and second in src/backend/access/gist/gist.c

amroutine->amsupport = 9;

I think this number should be specified only once, in .h file.

So I would offer a patch for all access methods. That changes amsupport and 
amstrategies from numbers to defined constants. (I've added two of them where 
they were missing)

See attachment.

If it is good, I will add it to the commitfest.

-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Companydiff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index 9450267..a2450f4 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -35,7 +35,7 @@ ginhandler(PG_FUNCTION_ARGS)
 	IndexAmRoutine *amroutine = makeNode(IndexAmRoutine);
 
 	amroutine->amstrategies = 0;
-	amroutine->amsupport = 6;
+	amroutine->amsupport = GINNProcs;
 	amroutine->amcanorder = false;
 	amroutine->amcanorderbyop = false;
 	amroutine->amcanbackward = false;
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 996363c..a290887 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -57,7 +57,7 @@ gisthandler(PG_FUNCTION_ARGS)
 	IndexAmRoutine *amroutine = makeNode(IndexAmRoutine);
 
 	amroutine->amstrategies = 0;
-	amroutine->amsupport = 9;
+	amroutine->amsupport = GISTNProcs;
 	amroutine->amcanorder = false;
 	amroutine->amcanorderbyop = true;
 	amroutine->amcanbackward = false;
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index 8c89ee7..4fecece 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -51,8 +51,8 @@ hashhandler(PG_FUNCTION_ARGS)
 {
 	IndexAmRoutine *amroutine = makeNode(IndexAmRoutine);
 
-	amroutine->amstrategies = 1;
-	amroutine->amsupport = 1;
+	amroutine->amstrategies = HTMaxStrategyNumber;
+	amroutine->amsupport = HASHNProcs;
 	amroutine->amcanorder = false;
 	amroutine->amcanorderbyop = false;
 	amroutine->amcanbackward = true;
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index bf8ade3..013394c 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -84,8 +84,8 @@ bthandler(PG_FUNCTION_ARGS)
 {
 	IndexAmRoutine *amroutine = makeNode(IndexAmRoutine);
 
-	amroutine->amstrategies = 5;
-	amroutine->amsupport = 2;
+	amroutine->amstrategies = BTMaxStrategyNumber;
+	amroutine->amsupport = BTNProcs;
 	amroutine->amcanorder = true;
 	amroutine->amcanorderbyop = false;
 	amroutine->amcanbackward = true;
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index 201203f..bc679bf 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -36,7 +36,7 @@ spghandler(PG_FUNCTION_ARGS)
 	IndexAmRoutine *amroutine = makeNode(IndexAmRoutine);
 
 	amroutine->amstrategies = 0;
-	amroutine->amsupport = 5;
+	amroutine->amsupport = SPGISTNProc;
 	amroutine->amcanorder = false;
 	amroutine->amcanorderbyop = false;
 	amroutine->amcanbackward = false;
diff --git a/src/include/access/hash.h b/src/include/access/hash.h
index 3a68390..fa3f9b6 100644
--- a/src/include/access/hash.h
+++ b/src/include/access/hash.h
@@ -239,6 +239,7 @@ typedef HashMetaPageData *HashMetaPage;
  *	Since we only have one such proc in amproc, it's number 1.
  */
 #define HASHPROC		1
+#define HASHNProcs		1
 
 
 /* public routines */
diff --git a/src/include/access/nbtree.h b/src/include/access/nbtree.h
index ca50349..d956900 100644
--- a/src/include/access/nbtree.h
+++ b/src/include/access/nbtree.h
@@ -454,6 +454,7 @@ typedef struct xl_btree_newroot
 
 #define BTORDER_PROC		1
 #define BTSORTSUPPORT_PROC	2
+#define BTNProcs			2
 
 /*
  *	We need to be able to tell the difference between read and write

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-11-25 Thread Nikolay Shaplov
В письме от 25 ноября 2015 22:27:57 пользователь Michael Paquier написал:
> On Wed, Nov 25, 2015 at 10:16 PM, Nikolay Shaplov <n.shap...@postgrespro.ru>
> wrote:
> > В письме от 18 ноября 2015 15:52:54 пользователь Michael Paquier написал:
> > Teodor Sigaev asked a very good question: does it properly do upgrade from
> > 1.3
> > to 1.4
> > 
> > I've rechecked and fixed
> 
> What was actually the problem? 

tuple_data_split should be defined before heap_page_item_attrs.

And there were also error in the first argument of tuple_data_split there. It 
should be  "rel_oid oid" instead of "t_oid rel_oid" 

> I have to admit that I forgot to test that
> directly, and did not spot anything obvious on the 1.3--1.4.sql file.
yes. but each of us added a non-obvious mistake there :-)

-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-11-25 Thread Nikolay Shaplov
В письме от 18 ноября 2015 15:52:54 пользователь Michael Paquier написал:

Teodor Sigaev asked a very good question: does it properly do upgrade from 1.3 
to 1.4

I've rechecked and fixed

here is a patch.

> On Wed, Nov 18, 2015 at 3:10 AM, Nikolay Shaplov wrote:
> > Everything seems to be ok. I've changed only one thing in your version
> > of the patch. I've renamed split_tuple_data to
> > tuple_data_split_internal, because when there are split_tuple_data and
> > tuple_data_split in the same file, nobody will guess what the difference
> > between these function and why are they named in such a strange way.
> 
> Yep, that sounds better this way.
> 
> > If it is ok, set proper status, and let commiter do his job :-)
> 
> OK. I have switched the status of this patch to "Ready for committer"
> (please, committer-san, double-check the area around
> tuple_data_split_internal when fetching data for each attribute, I
> think that we got that right but I may be missing something as well).

-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Companydiff --git a/contrib/pageinspect/Makefile b/contrib/pageinspect/Makefile
index aec5258..91ab119 100644
--- a/contrib/pageinspect/Makefile
+++ b/contrib/pageinspect/Makefile
@@ -5,9 +5,9 @@ OBJS		= rawpage.o heapfuncs.o btreefuncs.o fsmfuncs.o \
 		  brinfuncs.o ginfuncs.o $(WIN32RES)
 
 EXTENSION = pageinspect
-DATA = pageinspect--1.3.sql pageinspect--1.2--1.3.sql \
-	pageinspect--1.1--1.2.sql pageinspect--1.0--1.1.sql \
-	pageinspect--unpackaged--1.0.sql
+DATA = pageinspect--1.4.sql pageinspect--1.3--1.4.sql \
+	pageinspect--1.2--1.3.sql pageinspect--1.1--1.2.sql \
+	pageinspect--1.0--1.1.sql pageinspect--unpackaged--1.0.sql
 PGFILEDESC = "pageinspect - functions to inspect contents of database pages"
 
 ifdef USE_PGXS
diff --git a/contrib/pageinspect/heapfuncs.c b/contrib/pageinspect/heapfuncs.c
index 8d1666c..546a051 100644
--- a/contrib/pageinspect/heapfuncs.c
+++ b/contrib/pageinspect/heapfuncs.c
@@ -27,8 +27,11 @@
 
 #include "access/htup_details.h"
 #include "funcapi.h"
-#include "utils/builtins.h"
+#include "catalog/pg_type.h"
 #include "miscadmin.h"
+#include "utils/array.h"
+#include "utils/builtins.h"
+#include "utils/rel.h"
 
 
 /*
@@ -55,6 +58,42 @@ bits_to_text(bits8 *bits, int len)
 
 
 /*
+ * text_to_bits
+ *
+ * Converts a c-string representation of bits into a bits8-array. This is
+ * the reverse operation of previous routine.
+ */
+static bits8 *
+text_to_bits(char *str, int len)
+{
+	bits8	   *bits;
+	int			off = 0;
+	char		byte = 0;
+
+	bits = palloc(len + 1);
+
+	while (off < len)
+	{
+		if (off % 8 == 0)
+			byte = 0;
+
+		if ((str[off] == '0') || (str[off] == '1'))
+			byte = byte | ((str[off] - '0') << off % 8);
+		else
+			ereport(ERROR,
+	(errcode(ERRCODE_DATA_CORRUPTED),
+	 errmsg("illegal character '%c' in t_bits string", str[off])));
+
+		if (off % 8 == 7)
+			bits[off / 8] = byte;
+
+		off++;
+	}
+
+	return bits;
+}
+
+/*
  * heap_page_items
  *
  * Allows inspection of line pointers and tuple headers of a heap page.
@@ -122,8 +161,8 @@ heap_page_items(PG_FUNCTION_ARGS)
 		HeapTuple	resultTuple;
 		Datum		result;
 		ItemId		id;
-		Datum		values[13];
-		bool		nulls[13];
+		Datum		values[14];
+		bool		nulls[14];
 		uint16		lp_offset;
 		uint16		lp_flags;
 		uint16		lp_len;
@@ -154,7 +193,8 @@ heap_page_items(PG_FUNCTION_ARGS)
 			lp_offset + lp_len <= raw_page_size)
 		{
 			HeapTupleHeader tuphdr;
-			int			bits_len;
+			bytea		*tuple_data_bytea;
+			int			tuple_data_len;
 
 			/* Extract information from the tuple header */
 
@@ -168,6 +208,14 @@ heap_page_items(PG_FUNCTION_ARGS)
 			values[9] = UInt32GetDatum(tuphdr->t_infomask);
 			values[10] = UInt8GetDatum(tuphdr->t_hoff);
 
+			/* Copy raw tuple data into bytea attribute */
+			tuple_data_len = lp_len - tuphdr->t_hoff;
+			tuple_data_bytea = (bytea *) palloc(tuple_data_len + VARHDRSZ);
+			SET_VARSIZE(tuple_data_bytea, tuple_data_len + VARHDRSZ);
+			memcpy(VARDATA(tuple_data_bytea), (char *) tuphdr + tuphdr->t_hoff,
+	 tuple_data_len);
+			values[13] = PointerGetDatum(tuple_data_bytea);
+
 			/*
 			 * We already checked that the item is completely within the raw
 			 * page passed to us, with the length given in the line pointer.
@@ -180,11 +228,11 @@ heap_page_items(PG_FUNCTION_ARGS)
 			{
 if (tuphdr->t_infomask & HEAP_HASNULL)
 {
-	bits_len = tuphdr->t_hoff -
-		offsetof(HeapTupleHeaderData, t_bits);
+	int	bits_len =
+			((tuphdr->t_infomask2 & HEAP_NATTS_MASK) / 8 + 1) * 8;
 
 	values[11] = CStringGetTextDatum(
- bits_to_text(tuphdr->t_bits, bits_len * 8));
+ bits_to_text(tuphdr->t_bits, bits_len));
 }
 else
 	nulls[11] = true;
@@ -208,7 +256,7 @@ heap_page_it

Re: [HACKERS] [PROPOSAL] TAP test example

2015-11-19 Thread Nikolay Shaplov
В письме от 19 ноября 2015 10:57:27 пользователь Tom Lane написал:
> Nikolay Shaplov <n.shap...@postgrespro.ru> writes:
> > But this is not the issue. We can change it any way we like. The question
> > do we need such example at all, or no?
> 
> I'm kind of -1 on the idea of a module that doesn't actually do something
> *useful*. 
Actually it tests that sameuser  option from pg_hba.conf works as expected.  
Allow user to connect only to database with the same name as user name.

I can't say it is something really useful. But my intention was to write an 
example, so I choose the most simple functionality to test, so the user of the 
example can pay all attention to perl related staff. 

I just thought this example might be useful to others. If so, I can prepare it 
for commit. Otherwise I will just use it were I intended.

-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] TAP test example

2015-11-19 Thread Nikolay Shaplov
В письме от 19 ноября 2015 09:39:41 пользователь Jim Nasby написал:
> On 11/19/15 8:42 AM, Nikolay Shaplov wrote:
> > +sub psql_ok
> > +{
> > +   my $db = shift;
> > +   my $sql = shift;
> > +   my $comment = shift;
> 
> Isn't the preferred method of parameter assignment to use @_?
Hm... it is the way I wrote perl programs. There is more then one way to do it 
in perl, you know ;-) I think that this way is more understandable for others. 
But this is not the issue. We can change it any way we like. The question do 
we need such example at all, or no?

> Also, I'd think one of the examples should use DBI, since presumably one
> of the big benefits to tap is not dealing with raw psql output...

I wrote this example based on ssl TAP test. There was no DBI there. And I 
think it was done for purpose. If we add DBI to tests, then we should add it 
to build dependencies. And it is not a good idea, and so not a good example.

-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PROPOSAL] TAP test example

2015-11-19 Thread Nikolay Shaplov

Don't we need a simple example of TAP tests in src/test ? Something that test 
a trivial feature, but shows basic testing tricks?

While explaining to my friends how does TAP test works I wrote an example TAP 
test. May be we we should add it to the core with sensible README explanation?

-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Companydiff --git a/src/test/Makefile b/src/test/Makefile
index b713c2c..a45129c 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -17,7 +17,7 @@ SUBDIRS = regress isolation modules
 # We don't build or execute examples/, locale/, or thread/ by default,
 # but we do want "make clean" etc to recurse into them.  Likewise for ssl/,
 # because the SSL test suite is not secure to run on a multi-user system.
-ALWAYS_SUBDIRS = examples locale thread ssl
+ALWAYS_SUBDIRS = examples locale thread ssl tap-examples
 
 # We want to recurse to all subdirs for all standard targets, except that
 # installcheck and install should not recurse into the subdirectory "modules".
diff --git a/src/test/tap-examples/Makefile b/src/test/tap-examples/Makefile
new file mode 100644
index 000..e0d6e10
--- /dev/null
+++ b/src/test/tap-examples/Makefile
@@ -0,0 +1,7 @@
+
+subdir = src/test/tap-examples
+top_builddir = ../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+	$(prove_check)
diff --git a/src/test/tap-examples/TapExamples.pm b/src/test/tap-examples/TapExamples.pm
new file mode 100644
index 000..80fe13b
--- /dev/null
+++ b/src/test/tap-examples/TapExamples.pm
@@ -0,0 +1,34 @@
+package TapExamples;
+
+use strict;
+use TestLib;
+use Test::More;
+
+use Exporter 'import';
+our @EXPORT = qw(
+  psql_ok psql_fails
+);
+
+sub psql_ok
+{
+	my $db = shift;
+	my $sql = shift;
+	my $comment = shift;
+	my $res;
+	eval {$res = psql($db,$sql)};
+	$res = 0 if $@;
+	ok($res, $comment);
+}
+
+sub psql_fails
+{
+	my $db = shift;
+	my $sql = shift;
+	my $comment = shift;
+	my $res;
+	eval {$res = psql($db,$sql)};
+	$res = 0 if $@;
+	ok(!$res, $comment);
+}
+
+1;
diff --git a/src/test/tap-examples/t/001_sameuser_test.pl b/src/test/tap-examples/t/001_sameuser_test.pl
new file mode 100644
index 000..b7236ae
--- /dev/null
+++ b/src/test/tap-examples/t/001_sameuser_test.pl
@@ -0,0 +1,36 @@
+# Minimal test testing streaming replication
+use strict;
+use warnings;
+use TestLib;
+use Test::More "no_plan";
+use TapExamples;
+
+my $tempdir = TestLib::tempdir;
+#my $tempdir = 'my_tmp';
+
+diag "setting up data directory in \"$tempdir\"...";
+
+my $current_user = (getpwuid($>))[0];
+my $db1 = $current_user;
+my $db2 = "${current_user}_another_db";
+
+diag "Running postgres as user '$current_user' with default configuration";
+start_test_server($tempdir);
+
+psql_ok('postgres', "CREATE DATABASE $db1", "Creating DB '$db1'");
+psql_ok('postgres', "CREATE DATABASE $db2", "Creating DB '$db2'");
+
+psql_ok($db1, "select now()", "Connecting to '$db1'");
+psql_ok($db2, "select now()", "Connecting to '$db2'");
+
+diag "Updateing pg_hba.conf, setting 'local sameuser all trust'";
+open HBA, ">$tempdir/pgdata/pg_hba.conf";
+print HBA "# TYPE  DATABASEUSERADDRESS METHOD\n";
+print HBA "local   sameuserall trust \n";
+close HBA;
+
+diag "Restarting postgres";
+restart_test_server();
+
+psql_ok($db1, "select now()", "Connecting to '$db1'");
+psql_fails($db2, "select now()", "Should fail when connecting to '$db2'");

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-11-17 Thread Nikolay Shaplov

> >> I still have an opinion that documentation should be more verbose, than
> >> your version, but I can accept your version.
> > 
> > I am not sure that's necessary, pageinspect is for developers.
> > 
> >> Who is going to add heap_page_item_attrs to your patch? me or you?
> > 
> > I recall some parts of the code I still did not like much :) I'll grab
> > some room to have an extra look at it.
> 
> I have added back heap_page_item_attrs the patch attached, fixing at
> the same time some typos found while looking again at the code. If you
> are fine with this version, let's have a committer look at it.

Everything seems to be ok. I've changed only one thing in your version
of the patch. I've renamed split_tuple_data to
tuple_data_split_internal, because when there are split_tuple_data and
tuple_data_split in the same file, nobody will guess what the difference
between these function and why are they named in such a strange way.

If it is ok, set proper status, and let commiter do his job :-)



-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
diff --git a/contrib/pageinspect/Makefile b/contrib/pageinspect/Makefile
index aec5258..91ab119 100644
--- a/contrib/pageinspect/Makefile
+++ b/contrib/pageinspect/Makefile
@@ -5,9 +5,9 @@ OBJS		= rawpage.o heapfuncs.o btreefuncs.o fsmfuncs.o \
 		  brinfuncs.o ginfuncs.o $(WIN32RES)
 
 EXTENSION = pageinspect
-DATA = pageinspect--1.3.sql pageinspect--1.2--1.3.sql \
-	pageinspect--1.1--1.2.sql pageinspect--1.0--1.1.sql \
-	pageinspect--unpackaged--1.0.sql
+DATA = pageinspect--1.4.sql pageinspect--1.3--1.4.sql \
+	pageinspect--1.2--1.3.sql pageinspect--1.1--1.2.sql \
+	pageinspect--1.0--1.1.sql pageinspect--unpackaged--1.0.sql
 PGFILEDESC = "pageinspect - functions to inspect contents of database pages"
 
 ifdef USE_PGXS
diff --git a/contrib/pageinspect/heapfuncs.c b/contrib/pageinspect/heapfuncs.c
index 8d1666c..546a051 100644
--- a/contrib/pageinspect/heapfuncs.c
+++ b/contrib/pageinspect/heapfuncs.c
@@ -27,8 +27,11 @@
 
 #include "access/htup_details.h"
 #include "funcapi.h"
-#include "utils/builtins.h"
+#include "catalog/pg_type.h"
 #include "miscadmin.h"
+#include "utils/array.h"
+#include "utils/builtins.h"
+#include "utils/rel.h"
 
 
 /*
@@ -55,6 +58,42 @@ bits_to_text(bits8 *bits, int len)
 
 
 /*
+ * text_to_bits
+ *
+ * Converts a c-string representation of bits into a bits8-array. This is
+ * the reverse operation of previous routine.
+ */
+static bits8 *
+text_to_bits(char *str, int len)
+{
+	bits8	   *bits;
+	int			off = 0;
+	char		byte = 0;
+
+	bits = palloc(len + 1);
+
+	while (off < len)
+	{
+		if (off % 8 == 0)
+			byte = 0;
+
+		if ((str[off] == '0') || (str[off] == '1'))
+			byte = byte | ((str[off] - '0') << off % 8);
+		else
+			ereport(ERROR,
+	(errcode(ERRCODE_DATA_CORRUPTED),
+	 errmsg("illegal character '%c' in t_bits string", str[off])));
+
+		if (off % 8 == 7)
+			bits[off / 8] = byte;
+
+		off++;
+	}
+
+	return bits;
+}
+
+/*
  * heap_page_items
  *
  * Allows inspection of line pointers and tuple headers of a heap page.
@@ -122,8 +161,8 @@ heap_page_items(PG_FUNCTION_ARGS)
 		HeapTuple	resultTuple;
 		Datum		result;
 		ItemId		id;
-		Datum		values[13];
-		bool		nulls[13];
+		Datum		values[14];
+		bool		nulls[14];
 		uint16		lp_offset;
 		uint16		lp_flags;
 		uint16		lp_len;
@@ -154,7 +193,8 @@ heap_page_items(PG_FUNCTION_ARGS)
 			lp_offset + lp_len <= raw_page_size)
 		{
 			HeapTupleHeader tuphdr;
-			int			bits_len;
+			bytea		*tuple_data_bytea;
+			int			tuple_data_len;
 
 			/* Extract information from the tuple header */
 
@@ -168,6 +208,14 @@ heap_page_items(PG_FUNCTION_ARGS)
 			values[9] = UInt32GetDatum(tuphdr->t_infomask);
 			values[10] = UInt8GetDatum(tuphdr->t_hoff);
 
+			/* Copy raw tuple data into bytea attribute */
+			tuple_data_len = lp_len - tuphdr->t_hoff;
+			tuple_data_bytea = (bytea *) palloc(tuple_data_len + VARHDRSZ);
+			SET_VARSIZE(tuple_data_bytea, tuple_data_len + VARHDRSZ);
+			memcpy(VARDATA(tuple_data_bytea), (char *) tuphdr + tuphdr->t_hoff,
+	 tuple_data_len);
+			values[13] = PointerGetDatum(tuple_data_bytea);
+
 			/*
 			 * We already checked that the item is completely within the raw
 			 * page passed to us, with the length given in the line pointer.
@@ -180,11 +228,11 @@ heap_page_items(PG_FUNCTION_ARGS)
 			{
 if (tuphdr->t_infomask & HEAP_HASNULL)
 {
-	bits_len = tuphdr->t_hoff -
-		offsetof(HeapTupleHeaderData, t_bits);
+	int	bits_len =
+			((tuphdr->t_infomask2 & HEAP_NATTS_MASK) / 8 + 1) * 8;
 
 	values[11] = CStringGetTextDatum(
- bits_to_text(tuphdr->t_bits, bits_len * 8));
+ bits_to_text(tuphdr->t_bits, bits_len));
 }
 else
 	nulls[11] = true;
@@ -20

Re: [HACKERS] pageinspect patch, for showing tuple data

2015-11-11 Thread Nikolay Shaplov
В письме от 28 октября 2015 16:57:36 пользователь Michael Paquier написал:
> On Sat, Oct 17, 2015 at 1:48 AM, Michael Paquier wrote:
> > On Sat, Oct 17, 2015 at 5:15 AM, Nikolay Shaplov wrote:
> >> Or it's ready to commit, and just not marked this way?
> > 
> > No, I don't think we have reached this state yet.
> > 
> >> I am going to make report based on this patch in Vienna. It would be
> >> nice to have it committed until then :)
> > 
> > This is registered in this November's CF and the current one is not
> > completely wrapped up, so I haven't rushed into looking at it.
> 
> So, I have finally been able to look at this patch in more details,
> resulting in the attached. I noticed a couple of things that should be
> addressed, mainly:
> - addition of a new routine text_to_bits to perform the reverse
> operation of bits_to_text. This was previously part of
> tuple_data_split, I think that it deserves its own function.
> - split_tuple_data should be static
> - t_bits_str should not be a text, rather a char* fetched using
> text_to_cstring(PG_GETARG_TEXT_PP(4)). This way there is no need to
> perform extra calculations with VARSIZE and VARHDRSZ
> - split_tuple_data can directly use the relation OID instead of the
> tuple descriptor of the relation
> - t_bits was leaking memory. For correctness I think that it is better
> to free it after calling split_tuple_data.
> - PG_DETOAST_DATUM_COPY allocates some memory, this was leaking as
> well in raw_attr actually. I refactored the code such as a bytea* is
> used and always freed when allocated to avoid leaks. Removing raw_attr
> actually simplified the code a bit.
> - I simplified the docs, that was largely too verbose in my opinion.
> - Instead of using VARATT_IS_1B_E and VARTAG_EXTERNAL, using
> VARATT_IS_EXTERNAL and VARATT_IS_EXTERNAL_ONDISK seems more adapted to
> me, those other ones are much more low-level and not really spread in
> the backend code.
> - Found some typos in the code, the docs and some comments. I reworked
> the error messages as well.
> - The code format was not really in line with the project guidelines.
> I fixed that as well.
> - I removed heap_page_item_attrs for now to get this patch in a
> bare-bone state. Though I would not mind if this is re-added (I
> personally don't think that's much necessary in the module
> actually...).
> - The calculation of the length of t_bits using HEAP_NATTS_MASK is
> more correct as you mentioned earlier, so I let it in its state.
> That's actually more accurate for error handling as well.
> That's everything I recall I have. How does this look?
You've completely rewrite everything ;-)

Let everything be the way you wrote. This code is better than mine.

With one exception. I really  need heap_page_item_attrs function. I used it in 
my Tuples Internals presentation 
https://github.com/dhyannataraj/tuple-internals-presentation
and I am 100% sure that this function is needed for educational purposes, and 
this function should be as simple as possible, so students cat use it without 
extra efforts.

I still have an opinion that documentation should be more verbose, than your 
version, but I can accept your version.

Who is going to add heap_page_item_attrs to your patch? me or you?

-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-10-16 Thread Nikolay Shaplov
So what's next?
We need something else to discuss?
We need somebody else's opinion to rule this out?
Or it's ready to commit, and just not marked this way?

I am going to make report based on this patch in Vienna. It would be
nice to have it committed until then :)

On 02.10.2015 07:10, Michael Paquier wrote:
> On Thu, Oct 1, 2015 at 8:13 PM, Nikolay Shaplov wrote:
>> В письме от 30 сентября 2015 13:49:00 пользователь Michael Paquier
> написал:
>>> - errmsg("input page too small (%d
> bytes)",
>>> raw_page_size)));
>>> +errmsg("input page too small (%d
>>> bytes)", raw_page_size)));
>>> Please be careful of spurious diffs. Those just make the life of
> committers
>>> more difficult than it already is.
>> Oh, that's not diff. That's I've fixed indent on the code I did not write
> :-)
>
> pgindent would have taken care of that if needed. There is no need to add
> noise in the code for this patch.
>
>>> + 
>>> + General idea about output columns: lp_*
>>> attributes
>>> + are about where tuple is located inside the page;
>>> + t_xmin, t_xmax,
>>> + t_field3, t_ctid are
> about
>>> + visibility of this tuplue inside or outside of the transaction;
>>> + t_infomask2, t_infomask
> has
>>> some
>>> + information about properties of attributes stored in tuple data.
>>> + t_hoff sais where tuple data begins and
>>> + t_bits sais which attributes are NULL and
> which
>>> are
>>> + not. Please notice that t_bits here is not an actual value that is
>>> stored
>>> + in tuple data, but it's text representation  with '0' and '1'
>>> charactrs.
>>> + 
>>> I would remove that as well. htup_details.h contains all this
> information.
>> Made it even more shorter. Just links and comments about representation of
>> t_bits.
> I would completely remove this part.
>
>> There also was a bug in original pageinspect, that did not get t_bits
> right
>> when there was OID in the tuple.  I've fixed it too.
> Aha. Good catch! By looking at HeapTupleHeaderGetOid if the tuple has an
> OID it will be stored at the end of HeapTupleHeader and t_hoff includes it,
> so bits_len is definitely larger of 4 bytes should an OID be present.
>if (tuphdr->t_infomask & HEAP_HASNULL)
>{
> -   bits_len = tuphdr->t_hoff -
> -   offsetof(HeapTupleHeaderData, t_bits);
> +  int bits_len =
> +  ((tuphdr->t_infomask2 & HEAP_NATTS_MASK)/8 +
> 1)*8;
>
> values[11] = CStringGetTextDatum(
> -   bits_to_text(tuphdr->t_bits,
> bits_len * 8));
> +  bits_to_text(tuphdr->t_bits,
> bits_len));
> }
> And here is what you are referring to in your patch. I think that we should
> instead check for HEAP_HASOID and reduce bits_len by the size of Oid should
> one be present. As this is a bug that applies to all the existing versions
> of Postgres it would be good to extract it as a separate patch and then
> apply your own patch on top of it instead of including in your feature.
> Attached is a patch, this should be applied down to 9.0 I guess. Could you
> rebase your patch on top of it?
>
>> Here is next patch in attachment.
> Cool, thanks.
>
> -test=# SELECT * FROM heap_page_items(get_raw_page('pg_class', 0));
> +
> +test=# select * from heap_page_items(get_raw_page('pg_range', 0));
> This example in the docs is far too long in character length... We should
> get that reduced.
>
> +  Please notice that t_bits in tuple header
> structure
> +  is a binary bitmap, but heap_page_items returns
> +  t_bits as human readable text representation of
> +  original t_bits bitmap.
> This had better remove the mention to "please notice". Still as this is
> already described in htup_details.h there is no real point to describe it
> again here: that's a bitmap of NULLs.
>
>



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PATCH] minor doc patch

2015-10-06 Thread Nikolay Shaplov
This patch changes  in
 
http://www.postgresql.org/docs/9.5/static/sql-createtype.html

"variable length" into "variable-length", since in other places there it is 
written with "-" in the middle, not " ".


-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Companydiff --git a/doc/src/sgml/ref/create_type.sgml b/doc/src/sgml/ref/create_type.sgml
index a6a4644..5a09f19 100644
--- a/doc/src/sgml/ref/create_type.sgml
+++ b/doc/src/sgml/ref/create_type.sgml
@@ -324,7 +324,7 @@ CREATE TYPE name
internallength.
Base data types can be fixed-length, in which case
internallength is a
-   positive integer, or variable  length, indicated by setting
+   positive integer, or variable-length, indicated by setting
internallength
to VARIABLE.  (Internally, this is represented
by setting typlen to -1.)  The internal representation of all

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-10-02 Thread Nikolay Shaplov
В письме от 2 октября 2015 13:10:22 пользователь Michael Paquier написал:
 
> >> + 
> >> + General idea about output columns: lp_*
> >> attributes
> >> + are about where tuple is located inside the page;
> >> + t_xmin, t_xmax,
> >> + t_field3, t_ctid are
> 
> about
> 
> >> + visibility of this tuplue inside or outside of the transaction;
> >> + t_infomask2, t_infomask
> 
> has
> 
> >> some
> >> + information about properties of attributes stored in tuple data.
> >> + t_hoff sais where tuple data begins and
> >> + t_bits sais which attributes are NULL and
> 
> which
> 
> >> are
> >> + not. Please notice that t_bits here is not an actual value that is
> >> stored
> >> + in tuple data, but it's text representation  with '0' and '1'
> >> charactrs.
> >> + 
> >> I would remove that as well. htup_details.h contains all this
> 
> information.
> 
> > Made it even more shorter. Just links and comments about representation of
> > t_bits.
> 
> I would completely remove this part.

Michael my hand would not do it. I've been working as a lecturer for six 
years. If I want to pass an information in a comfortable way to reader, there 
should go some binding phrase. It may be very vague, but it should outline  
(may be in a very brief way, but still outline) an information that would be 
found if he follows the links. 

If we just give links "knowledge flow" will be uncomfortable for person who 
reads it.


> -test=# SELECT * FROM heap_page_items(get_raw_page('pg_class', 0));
> +
> +test=# select * from heap_page_items(get_raw_page('pg_range', 0));
> This example in the docs is far too long in character length... We should
> get that reduced.

May be should do 

\x and limit 1 like this:

test=# select * from heap_page_items(get_raw_page('pg_range', 0)) limit 1;
-[ RECORD 1 ]---
lp  | 1
lp_off  | 8144
lp_flags| 1
lp_len  | 48
t_xmin  | 1
t_xmax  | 0
t_field3| 0
t_ctid  | (0,1)
t_infomask2 | 6
t_infomask  | 2304
t_hoff  | 24
t_bits  | 
t_oid   | 
t_data  | \x400f1700ba074a0f520f




> +  Please notice that t_bits in tuple header
> structure
> +  is a binary bitmap, but heap_page_items returns
> +  t_bits as human readable text representation of
> +  original t_bits bitmap.
> This had better remove the mention to "please notice". Still as this is
> already described in htup_details.h there is no real point to describe it
> again here: that's a bitmap of NULLs.

heap_page_items returns text(!) as t_bits. This is unexpected. This is not 
described in page layout documentation. We should tell about it here 
explicitly. 


-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-10-02 Thread Nikolay Shaplov
В письме от 2 октября 2015 13:10:22 Вы написали:

> > There also was a bug in original pageinspect, that did not get t_bits
> 
> right
> 
> > when there was OID in the tuple.  I've fixed it too.
> 
> Aha. Good catch! By looking at HeapTupleHeaderGetOid if the tuple has an
> OID it will be stored at the end of HeapTupleHeader and t_hoff includes it,
> so bits_len is definitely larger of 4 bytes should an OID be present.
>if (tuphdr->t_infomask & HEAP_HASNULL)
>{
> -   bits_len = tuphdr->t_hoff -
> -   offsetof(HeapTupleHeaderData, t_bits);
> +  int bits_len =
> +  ((tuphdr->t_infomask2 & HEAP_NATTS_MASK)/8 +
> 1)*8;
> 
> values[11] = CStringGetTextDatum(
> -   bits_to_text(tuphdr->t_bits,
> bits_len * 8));
> +  bits_to_text(tuphdr->t_bits,
> bits_len));
> }
> And here is what you are referring to in your patch. I think that we should
> instead check for HEAP_HASOID and reduce bits_len by the size of Oid should
> one be present. As this is a bug that applies to all the existing versions
> of Postgres it would be good to extract it as a separate patch and then
> apply your own patch on top of it instead of including in your feature.
> Attached is a patch, this should be applied down to 9.0 I guess. Could you
> rebase your patch on top of it?
No we should not do it, because after t_bits there always goes padding bytes. 
So OID or the top of tuple data is properly aligned. So we should not use 
t_hoff for determinating t_bit's size at all.

Here is an example. I create a table with 10 columns and OID. (ten is greater 
then 8, so there should be two bytes of t_bits data)

create table test3 (a1 int, a2 int, a3 int, a4 int,a5 int,a6 int, a7 int,a8 
int, a9 int, a10 int) with oids;
insert into test3 VALUES
(1,2,3,4,5,6,7,8,null,10);

With your patch we get 

test=# select lp, t_bits, t_data from heap_page_items(get_raw_page('test3', 0));
 lp |  t_bits  |
   t_data   
+--+
  1 | 0100 | 
\x010002000300040005000600070008000a00
(1 row)


here we get 40 bits of t_bits.

With my way to calculate t_bits length we get

test=# select lp, t_bits, t_data from heap_page_items(get_raw_page('test3', 0));
 lp |  t_bits  |   t_data   

+--+
  1 | 0100 | 
\x010002000300040005000600070008000a00
(1 row)

16 bits as expected.

So I would keep my version of bits_len calculation


-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-10-01 Thread Nikolay Shaplov
В письме от 30 сентября 2015 13:49:00 пользователь Michael Paquier написал:



> 
> - errmsg("input page too small (%d bytes)",
> raw_page_size)));
> +errmsg("input page too small (%d
> bytes)", raw_page_size)));
> Please be careful of spurious diffs. Those just make the life of committers
> more difficult than it already is.

Oh, that's not diff. That's I've fixed indent on the code I did not write :-)


> + 
> + General idea about output columns: lp_*
> attributes
> + are about where tuple is located inside the page;
> + t_xmin, t_xmax,
> + t_field3, t_ctid are about
> + visibility of this tuplue inside or outside of the transaction;
> + t_infomask2, t_infomask has
> some
> + information about properties of attributes stored in tuple data.
> + t_hoff sais where tuple data begins and
> + t_bits sais which attributes are NULL and which
> are
> + not. Please notice that t_bits here is not an actual value that is
> stored
> + in tuple data, but it's text representation  with '0' and '1'
> charactrs.
> + 
> I would remove that as well. htup_details.h contains all this information.
Made it even more shorter. Just links and comments about representation of 
t_bits.


> +
> +test=# select * from heap_page_item_attrs(get_raw_page('pg_range',
> 0),'pg_range'::regclass);
> +[loong tuple data]
> This example is too large in character per lines, I think that you should
> cut a major part of the fields, why not just keeping lp and t_attrs for
> example.
Did id.

> 
> +  
> +   
> +rel_oid
> +oid
> +OID of the relation, of the tuple we want to split
> +   
> +
> +   
> +tuple_data
> +bytea
> +tuple raw data to split
> +
> +   
> In the description of tuple_data_split, I am not sure it is worth listing
> all the argument of the function like that. IMO, we should just say: those
> are the fields returned by "heap_page_items". tuple_data should as well be
> renamed to t_data.
Did it.

> 
> +  tuple attributes instead of one peace of raw tuple data. All other
> return
> This is not that "peaceful" to me. It should be "piece" :)
Oops ;-)

> +   values[13] = PointerGetDatum(tuple_data_bytea);
> +   nulls[13] = false;
> There is no point to set nulls[13] here.
Oh, you are right!

> 
> +
> +test=# select tuple_data_split('pg_range'::regclass,
> '\x400f1700ba074a0f520f'::bytea, 2304, 6, null);
> +   tuple_data_split
> +---
>  +
> {"\\x400f","\\x1700","\\x","\\xba07","\\x4a0f","\\x5
> 20f"} +(1 row)
> This would be more demonstrative if combined with heap_page_items, like
> that for example:
> SELECT tuple_data_split('pg_class'::regclass, t_data, t_infomask,
> t_infomask2, t_bits) FROM heap_page_items(get_raw_page('pg_class', 0));
> And actually this query crashes.
Oh... It crached because I did not check that t_data can actually be NULL.

There also was a bug in original pageinspect, that did not get t_bits right 
when there was OID in the tuple.  I've fixed it too. 


Here is next patch in attachment.


-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Companydiff --git a/contrib/pageinspect/Makefile b/contrib/pageinspect/Makefile
index aec5258..e4bc1af 100644
--- a/contrib/pageinspect/Makefile
+++ b/contrib/pageinspect/Makefile
@@ -5,7 +5,7 @@ OBJS		= rawpage.o heapfuncs.o btreefuncs.o fsmfuncs.o \
 		  brinfuncs.o ginfuncs.o $(WIN32RES)
 
 EXTENSION = pageinspect
-DATA = pageinspect--1.3.sql pageinspect--1.2--1.3.sql \
+DATA = pageinspect--1.4.sql pageinspect--1.3--1.4.sql pageinspect--1.2--1.3.sql \
 	pageinspect--1.1--1.2.sql pageinspect--1.0--1.1.sql \
 	pageinspect--unpackaged--1.0.sql
 PGFILEDESC = "pageinspect - functions to inspect contents of database pages"
diff --git a/contrib/pageinspect/heapfuncs.c b/contrib/pageinspect/heapfuncs.c
index 8d1666c..4fd3087 100644
--- a/contrib/pageinspect/heapfuncs.c
+++ b/contrib/pageinspect/heapfuncs.c
@@ -29,7 +29,14 @@
 #include "funcapi.h"
 #include "utils/builtins.h"
 #include "miscadmin.h"
+#include "utils/array.h"
+#include "utils/rel.h"
+#include "catalog/pg_type.h"
 
+Datum split_tuple_data(char *tuple_data, uint16 tuple_data_len,
+			TupleDesc tuple_desc, uint16 t_infomask,
+			uint16 t_infomask2, bits8 *t_bits, b

Re: [HACKERS] pageinspect patch, for showing tuple data

2015-09-29 Thread Nikolay Shaplov
В письме от 26 сентября 2015 20:57:25 пользователь Michael Paquier написал:

> > So I would consider two options: Either move t_infomask/t_infomask2
> > details
> > into storage.sgml or leave as it is.
> 
> The documentation redirects the reader to look at htup_details.h (the
> documentation is actually incorrect, I sent a separate patch), and I
> see no point in duplicating such low-level descriptions, that would be
> double maintenance for the same result.

What do you think about this? (I've changed only heap_tuple_items 
documentation there)



-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Companydiff --git a/contrib/pageinspect/Makefile b/contrib/pageinspect/Makefile
index aec5258..e4bc1af 100644
--- a/contrib/pageinspect/Makefile
+++ b/contrib/pageinspect/Makefile
@@ -5,7 +5,7 @@ OBJS		= rawpage.o heapfuncs.o btreefuncs.o fsmfuncs.o \
 		  brinfuncs.o ginfuncs.o $(WIN32RES)
 
 EXTENSION = pageinspect
-DATA = pageinspect--1.3.sql pageinspect--1.2--1.3.sql \
+DATA = pageinspect--1.4.sql pageinspect--1.3--1.4.sql pageinspect--1.2--1.3.sql \
 	pageinspect--1.1--1.2.sql pageinspect--1.0--1.1.sql \
 	pageinspect--unpackaged--1.0.sql
 PGFILEDESC = "pageinspect - functions to inspect contents of database pages"
diff --git a/contrib/pageinspect/heapfuncs.c b/contrib/pageinspect/heapfuncs.c
index 8d1666c..52a0663 100644
--- a/contrib/pageinspect/heapfuncs.c
+++ b/contrib/pageinspect/heapfuncs.c
@@ -29,7 +29,14 @@
 #include "funcapi.h"
 #include "utils/builtins.h"
 #include "miscadmin.h"
+#include "utils/array.h"
+#include "utils/rel.h"
+#include "catalog/pg_type.h"
 
+Datum split_tuple_data(char *tuple_data, uint16 tuple_data_len,
+			TupleDesc tuple_desc, uint16 t_infomask,
+			uint16 t_infomask2, bits8 *t_bits, bool
+			do_detoast, int error_level);
 
 /*
  * bits_to_text
@@ -91,7 +98,7 @@ heap_page_items(PG_FUNCTION_ARGS)
 		if (raw_page_size < SizeOfPageHeaderData)
 			ereport(ERROR,
 	(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-  errmsg("input page too small (%d bytes)", raw_page_size)));
+	 errmsg("input page too small (%d bytes)", raw_page_size)));
 
 		fctx = SRF_FIRSTCALL_INIT();
 		mctx = MemoryContextSwitchTo(fctx->multi_call_memory_ctx);
@@ -122,8 +129,8 @@ heap_page_items(PG_FUNCTION_ARGS)
 		HeapTuple	resultTuple;
 		Datum		result;
 		ItemId		id;
-		Datum		values[13];
-		bool		nulls[13];
+		Datum		values[14];
+		bool		nulls[14];
 		uint16		lp_offset;
 		uint16		lp_flags;
 		uint16		lp_len;
@@ -155,6 +162,8 @@ heap_page_items(PG_FUNCTION_ARGS)
 		{
 			HeapTupleHeader tuphdr;
 			int			bits_len;
+			bytea		*tuple_data_bytea;
+			int			tuple_data_len;
 
 			/* Extract information from the tuple header */
 
@@ -168,6 +177,14 @@ heap_page_items(PG_FUNCTION_ARGS)
 			values[9] = UInt32GetDatum(tuphdr->t_infomask);
 			values[10] = UInt8GetDatum(tuphdr->t_hoff);
 
+			/* Copy raw tuple data into bytea attribute */
+			tuple_data_len = lp_len - tuphdr->t_hoff;
+			tuple_data_bytea = (bytea *) palloc(tuple_data_len + VARHDRSZ);
+			SET_VARSIZE(tuple_data_bytea, tuple_data_len + VARHDRSZ);
+			memcpy(VARDATA(tuple_data_bytea), (char *) tuphdr + tuphdr->t_hoff,
+	 tuple_data_len);
+			values[13] = PointerGetDatum(tuple_data_bytea);
+			nulls[13] = false;
 			/*
 			 * We already checked that the item is completely within the raw
 			 * page passed to us, with the length given in the line pointer.
@@ -208,7 +225,7 @@ heap_page_items(PG_FUNCTION_ARGS)
 			 */
 			int			i;
 
-			for (i = 4; i <= 12; i++)
+			for (i = 4; i <= 13; i++)
 nulls[i] = true;
 		}
 
@@ -223,3 +240,223 @@ heap_page_items(PG_FUNCTION_ARGS)
 	else
 		SRF_RETURN_DONE(fctx);
 }
+
+PG_FUNCTION_INFO_V1(tuple_data_split);
+Datum
+tuple_data_split(PG_FUNCTION_ARGS)
+{
+	Oidrel_oid;
+	bytea		   *raw_data;
+	uint16			t_infomask;
+	uint16			t_infomask2;
+	text		   *t_bits_str;
+	RelationData   *rel;
+	TupleDesc		tuple_desc;
+	bool			do_detoast = false;
+	interror_level = ERROR;
+
+	bits8		   *t_bits = NULL;
+	Datum			res;
+
+	rel_oid		= PG_GETARG_OID(0);
+	raw_data	= PG_GETARG_BYTEA_P(1);
+	t_infomask	= PG_GETARG_INT16(2);
+	t_infomask2	= PG_GETARG_INT16(3);
+	t_bits_str	= PG_ARGISNULL(4) ? NULL : PG_GETARG_TEXT_P(4);
+	if (PG_NARGS()>=6)
+		do_detoast = PG_GETARG_BOOL(5);
+	if (PG_NARGS()>=7)
+		error_level = PG_GETARG_BOOL(6) ? WARNING : ERROR;
+
+	if (!superuser())
+		ereport(ERROR,
+(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+	(errmsg("must be superuser to use raw page functions";
+
+	/*
+	 * Here we converting t_bits string back to the bits8 array,
+	 * as it really is in the tuple header
+	 */
+	if (t_infomask & HEAP_HASNULL)
+	{
+		int		bits_str_len;
+		int		bits_len;
+		char   *p;
+		int		off;
+		char	byte = 0;
+
+		bits_len = (t_infomask2 & HEAP_NATTS_MASK) / 8 + 1;
+		if (!t_bits_str)
+		

Re: [HACKERS] pageinspect patch, for showing tuple data

2015-09-28 Thread Nikolay Shaplov
ere I parse t_bits from text back to bit 
representation.  Here there is not chance to get non-strict behavior, since 
there is no way to guess what should we do if we met characters then '0' or 
'1'. Or what to do if there is only 7 characters but we should write whole 
byte.etc...
Moreover as I wrote above bay be we just do not need warning_mode at all for 
current situation.

-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-09-25 Thread Nikolay Shaplov
В письме от 11 сентября 2015 15:12:04 пользователь Michael Paquier написал:

> > Ok.Let's come to the final decision with tuple_data_parse, and i will add
> > this switch there and to pure sql heap_page_item_attrs
> 
> Fine for me.

Here is final version with documentation.

Hope it will be the last one. :-)

-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Companydiff --git a/contrib/pageinspect/Makefile b/contrib/pageinspect/Makefile
index aec5258..e4bc1af 100644
--- a/contrib/pageinspect/Makefile
+++ b/contrib/pageinspect/Makefile
@@ -5,7 +5,7 @@ OBJS		= rawpage.o heapfuncs.o btreefuncs.o fsmfuncs.o \
 		  brinfuncs.o ginfuncs.o $(WIN32RES)
 
 EXTENSION = pageinspect
-DATA = pageinspect--1.3.sql pageinspect--1.2--1.3.sql \
+DATA = pageinspect--1.4.sql pageinspect--1.3--1.4.sql pageinspect--1.2--1.3.sql \
 	pageinspect--1.1--1.2.sql pageinspect--1.0--1.1.sql \
 	pageinspect--unpackaged--1.0.sql
 PGFILEDESC = "pageinspect - functions to inspect contents of database pages"
diff --git a/contrib/pageinspect/heapfuncs.c b/contrib/pageinspect/heapfuncs.c
index 8d1666c..d42ca48 100644
--- a/contrib/pageinspect/heapfuncs.c
+++ b/contrib/pageinspect/heapfuncs.c
@@ -29,7 +29,11 @@
 #include "funcapi.h"
 #include "utils/builtins.h"
 #include "miscadmin.h"
+#include "utils/array.h"
+#include "utils/rel.h"
+#include "catalog/pg_type.h"
 
+Datum split_tuple_data(char *tuple_data, uint16 tuple_data_len, TupleDesc tuple_desc, uint16 t_infomask, uint16 t_infomask2, bits8 *t_bits, bool do_detoast, int error_level);
 
 /*
  * bits_to_text
@@ -122,8 +126,8 @@ heap_page_items(PG_FUNCTION_ARGS)
 		HeapTuple	resultTuple;
 		Datum		result;
 		ItemId		id;
-		Datum		values[13];
-		bool		nulls[13];
+		Datum		values[14];
+		bool		nulls[14];
 		uint16		lp_offset;
 		uint16		lp_flags;
 		uint16		lp_len;
@@ -155,6 +159,8 @@ heap_page_items(PG_FUNCTION_ARGS)
 		{
 			HeapTupleHeader tuphdr;
 			int			bits_len;
+			bytea		*tuple_data_bytea;
+			int			tuple_data_len;
 
 			/* Extract information from the tuple header */
 
@@ -168,6 +174,13 @@ heap_page_items(PG_FUNCTION_ARGS)
 			values[9] = UInt32GetDatum(tuphdr->t_infomask);
 			values[10] = UInt8GetDatum(tuphdr->t_hoff);
 
+			/* Copy raw tuple data into bytea attribute */
+			tuple_data_len = lp_len - tuphdr->t_hoff;
+			tuple_data_bytea = (bytea *) palloc(tuple_data_len + VARHDRSZ);
+			SET_VARSIZE(tuple_data_bytea, tuple_data_len + VARHDRSZ);
+			memcpy(VARDATA(tuple_data_bytea), (char *) tuphdr + tuphdr->t_hoff, tuple_data_len);
+			values[13] = PointerGetDatum(tuple_data_bytea);
+			nulls[13] = false;
 			/*
 			 * We already checked that the item is completely within the raw
 			 * page passed to us, with the length given in the line pointer.
@@ -208,7 +221,7 @@ heap_page_items(PG_FUNCTION_ARGS)
 			 */
 			int			i;
 
-			for (i = 4; i <= 12; i++)
+			for (i = 4; i <= 13; i++)
 nulls[i] = true;
 		}
 
@@ -223,3 +236,210 @@ heap_page_items(PG_FUNCTION_ARGS)
 	else
 		SRF_RETURN_DONE(fctx);
 }
+
+PG_FUNCTION_INFO_V1(tuple_data_split);
+Datum
+tuple_data_split(PG_FUNCTION_ARGS)
+{
+	Oidrel_oid;
+	bytea			*raw_data;
+	uint16			t_infomask;
+	uint16			t_infomask2;
+	text			*t_bits_str;
+	RelationData	*rel;
+	TupleDesc		tuple_desc;
+	bool			do_detoast = false;
+	interror_level = ERROR;
+
+	bits8			*t_bits = NULL;
+	Datum			res;
+
+	rel_oid		= PG_GETARG_OID(0);
+	raw_data	= PG_GETARG_BYTEA_P(1);
+	t_infomask	= PG_GETARG_INT16(2);
+	t_infomask2	= PG_GETARG_INT16(3);
+	t_bits_str	= PG_ARGISNULL(4) ? NULL : PG_GETARG_TEXT_P(4);
+	if (PG_NARGS()>=6) do_detoast = PG_GETARG_BOOL(5);
+	if (PG_NARGS()>=7) error_level = PG_GETARG_BOOL(6)?WARNING:ERROR;
+
+	if (!superuser())
+		ereport(ERROR,
+(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+	(errmsg("must be superuser to use raw page functions";
+
+	/*
+	 * Here we converting t_bits string back to the bits8 array,
+	 * as it really is in the tuple header
+	 */
+	if (t_infomask & HEAP_HASNULL)
+	{
+		int bits_str_len;
+		int bits_len;
+		char * p;
+		int off;
+		char byte = 0;
+		bits_len = (t_infomask2 & HEAP_NATTS_MASK) / 8 + 1;
+		if (! t_bits_str)
+		{
+			ereport(ERROR,
+	(errcode(ERRCODE_DATA_CORRUPTED),
+		errmsg("t_bits argument is NULL, though we expect it to be NOT NULL and %i character long", bits_len * 8)));
+		}
+		bits_str_len =  VARSIZE(t_bits_str) - VARHDRSZ;
+		if (bits_str_len % 8)
+		{
+			ereport(ERROR,
+	(errcode(ERRCODE_DATA_CORRUPTED),
+		errmsg("t_bits argument length should be multiple of eight")));
+		}
+		if (bits_len * 8 != bits_str_len)
+		{
+			ereport(ERROR,
+	(errcode(ERRCODE_DATA_CORRUPTED),
+		errmsg("wrong t_bits argument length. Expected %i, actual is %i", bits_len * 8, bits_str_len)));
+		}
+		t_bits = palloc(bits_len + 1);
+
+		p = (char *) t_bits_str + VARHDRSZ;
+		off

Re: [HACKERS] pageinspect patch, for showing tuple data

2015-09-25 Thread Nikolay Shaplov
В письме от 25 сентября 2015 20:59:29 пользователь Michael Paquier написал:

> > Here is final version with documentation.
> 
> Thanks! I just had a short look at it:
> - I am not convinced that it is worth declaring 3 versions of
> tuple_data_split.
How which of them should we leave? 

> - The patch does not respect the project code style,
> particularly one-line "if condition {foo;}" are not adapted, code line is 
limited
> to 80 characters, etc. The list is basically here:
> http://www.postgresql.org/docs/current/static/source.html
I did my best. Results are attached.

> - Be aware of typos: s/whitch/which is one.
I've run spellchecker on all comments. Hope that I removed most of the 
mistakes. But as I am not native speaker, I will not be able to eliminate them 
all. I will need help here.
 
> +t_infomask2
> +integer
> +stores number of attributes (11 bits of the word). The
> rest are used for flag bits:
> +
> +0x2000 - tuple was updated and key cols modified, or tuple deleted
> +0x4000 - tuple was HOT-updated
> +0x8000 - this is heap-only tuple
> +0xE000 - visibility-related bits (so called "hint bits")
> +
> This large chunk of documentation is a duplicate of storage.sgml. If
> that's really necessary, it looks adapted to me to have more detailed
> comments at code level directly in heapfuncs.c.
Hm... There is no explanation of t_infomask/t_infomask2 bits in storage.sgml. 

If there is no source of information other then source code, then the 
documentation is not good.

If there were information about t_infomask/t_infomask2 in storage.sgml, then I 
would add "See storage.sgml for more info" into pageinspect doc, and thats 
all. But since there is no such information there, I  think that the best 
thing is to quote comments from source code there, so you can get all 
information from documentation, not looking for it in the code.

So I would consider two options: Either move t_infomask/t_infomask2 details 
into storage.sgml or leave as it is.

I am lazy, and does not feel confidence about touching main documentation, so I 
would prefer to leave as it is.


> The example of tuple_data_split in the docs would be more interesting
> if embedded with a call to heap_page_items.

This example would be almost not readable. May be I should add one more praise 
explaining where did we take arguments for tuple_data_split


-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Companydiff --git a/contrib/pageinspect/Makefile b/contrib/pageinspect/Makefile
index aec5258..e4bc1af 100644
--- a/contrib/pageinspect/Makefile
+++ b/contrib/pageinspect/Makefile
@@ -5,7 +5,7 @@ OBJS		= rawpage.o heapfuncs.o btreefuncs.o fsmfuncs.o \
 		  brinfuncs.o ginfuncs.o $(WIN32RES)
 
 EXTENSION = pageinspect
-DATA = pageinspect--1.3.sql pageinspect--1.2--1.3.sql \
+DATA = pageinspect--1.4.sql pageinspect--1.3--1.4.sql pageinspect--1.2--1.3.sql \
 	pageinspect--1.1--1.2.sql pageinspect--1.0--1.1.sql \
 	pageinspect--unpackaged--1.0.sql
 PGFILEDESC = "pageinspect - functions to inspect contents of database pages"
diff --git a/contrib/pageinspect/heapfuncs.c b/contrib/pageinspect/heapfuncs.c
index 8d1666c..52a0663 100644
--- a/contrib/pageinspect/heapfuncs.c
+++ b/contrib/pageinspect/heapfuncs.c
@@ -29,7 +29,14 @@
 #include "funcapi.h"
 #include "utils/builtins.h"
 #include "miscadmin.h"
+#include "utils/array.h"
+#include "utils/rel.h"
+#include "catalog/pg_type.h"
 
+Datum split_tuple_data(char *tuple_data, uint16 tuple_data_len,
+			TupleDesc tuple_desc, uint16 t_infomask,
+			uint16 t_infomask2, bits8 *t_bits, bool
+			do_detoast, int error_level);
 
 /*
  * bits_to_text
@@ -91,7 +98,7 @@ heap_page_items(PG_FUNCTION_ARGS)
 		if (raw_page_size < SizeOfPageHeaderData)
 			ereport(ERROR,
 	(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-  errmsg("input page too small (%d bytes)", raw_page_size)));
+	 errmsg("input page too small (%d bytes)", raw_page_size)));
 
 		fctx = SRF_FIRSTCALL_INIT();
 		mctx = MemoryContextSwitchTo(fctx->multi_call_memory_ctx);
@@ -122,8 +129,8 @@ heap_page_items(PG_FUNCTION_ARGS)
 		HeapTuple	resultTuple;
 		Datum		result;
 		ItemId		id;
-		Datum		values[13];
-		bool		nulls[13];
+		Datum		values[14];
+		bool		nulls[14];
 		uint16		lp_offset;
 		uint16		lp_flags;
 		uint16		lp_len;
@@ -155,6 +162,8 @@ heap_page_items(PG_FUNCTION_ARGS)
 		{
 			HeapTupleHeader tuphdr;
 			int			bits_len;
+			bytea		*tuple_data_bytea;
+			int			tuple_data_len;
 
 			/* Extract information from the tuple header */
 
@@ -168,6 +177,14 @@ heap_page_items(PG_FUNCTION_ARGS)
 			values[9] = UInt32GetDatum(tuphdr->t_infomask);
 			values[10] = UInt8GetDatum(tuphdr->t_hoff);
 
+			/* Copy raw tuple data into bytea attribute */
+			tuple_data_len = lp_len - tuphdr

Re: [HACKERS] pageinspect patch, for showing tuple data

2015-09-18 Thread Nikolay Shaplov
В письме от 11 сентября 2015 15:12:04 Вы написали:

> > Ok.Let's come to the final decision with tuple_data_parse, and i will add
> > this switch there and to pure sql heap_page_item_attrs
> 
> Fine for me.

So I've modified the code, now we have:

heap_page_items - have a column with raw tuple data

tuple_data_split - takes oid, raw tuple data, infomask, infomask2 and bits 
parsed as string and returns bytea[] with attribute raw values. It also have 
two optional arguments do_detoast that forces function to detoast attribute, 
and warning_mode that allows to set this function to warning mode, and do not 
stop working if some inconsistency were found. 

there is also pure sql function heap_page_item_attrs that takes page data, and 
table oid, and returns same data as heap_page_items but bytea[] of attributes 
instead of one whole piece of raw data. It also has optional argument 
do_detoast that allows to get bytea[] of detoasted attribute data.

I've decided that there is no real need in warning_mode in 
heap_page_item_attrs so there is no such argument there.

So now it is still RFC. Final patch with documentation will come soon

-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Companydiff --git a/contrib/pageinspect/Makefile b/contrib/pageinspect/Makefile
index aec5258..e4bc1af 100644
--- a/contrib/pageinspect/Makefile
+++ b/contrib/pageinspect/Makefile
@@ -5,7 +5,7 @@ OBJS		= rawpage.o heapfuncs.o btreefuncs.o fsmfuncs.o \
 		  brinfuncs.o ginfuncs.o $(WIN32RES)
 
 EXTENSION = pageinspect
-DATA = pageinspect--1.3.sql pageinspect--1.2--1.3.sql \
+DATA = pageinspect--1.4.sql pageinspect--1.3--1.4.sql pageinspect--1.2--1.3.sql \
 	pageinspect--1.1--1.2.sql pageinspect--1.0--1.1.sql \
 	pageinspect--unpackaged--1.0.sql
 PGFILEDESC = "pageinspect - functions to inspect contents of database pages"
diff --git a/contrib/pageinspect/heapfuncs.c b/contrib/pageinspect/heapfuncs.c
index 8d1666c..d42ca48 100644
--- a/contrib/pageinspect/heapfuncs.c
+++ b/contrib/pageinspect/heapfuncs.c
@@ -29,7 +29,11 @@
 #include "funcapi.h"
 #include "utils/builtins.h"
 #include "miscadmin.h"
+#include "utils/array.h"
+#include "utils/rel.h"
+#include "catalog/pg_type.h"
 
+Datum split_tuple_data(char *tuple_data, uint16 tuple_data_len, TupleDesc tuple_desc, uint16 t_infomask, uint16 t_infomask2, bits8 *t_bits, bool do_detoast, int error_level);
 
 /*
  * bits_to_text
@@ -122,8 +126,8 @@ heap_page_items(PG_FUNCTION_ARGS)
 		HeapTuple	resultTuple;
 		Datum		result;
 		ItemId		id;
-		Datum		values[13];
-		bool		nulls[13];
+		Datum		values[14];
+		bool		nulls[14];
 		uint16		lp_offset;
 		uint16		lp_flags;
 		uint16		lp_len;
@@ -155,6 +159,8 @@ heap_page_items(PG_FUNCTION_ARGS)
 		{
 			HeapTupleHeader tuphdr;
 			int			bits_len;
+			bytea		*tuple_data_bytea;
+			int			tuple_data_len;
 
 			/* Extract information from the tuple header */
 
@@ -168,6 +174,13 @@ heap_page_items(PG_FUNCTION_ARGS)
 			values[9] = UInt32GetDatum(tuphdr->t_infomask);
 			values[10] = UInt8GetDatum(tuphdr->t_hoff);
 
+			/* Copy raw tuple data into bytea attribute */
+			tuple_data_len = lp_len - tuphdr->t_hoff;
+			tuple_data_bytea = (bytea *) palloc(tuple_data_len + VARHDRSZ);
+			SET_VARSIZE(tuple_data_bytea, tuple_data_len + VARHDRSZ);
+			memcpy(VARDATA(tuple_data_bytea), (char *) tuphdr + tuphdr->t_hoff, tuple_data_len);
+			values[13] = PointerGetDatum(tuple_data_bytea);
+			nulls[13] = false;
 			/*
 			 * We already checked that the item is completely within the raw
 			 * page passed to us, with the length given in the line pointer.
@@ -208,7 +221,7 @@ heap_page_items(PG_FUNCTION_ARGS)
 			 */
 			int			i;
 
-			for (i = 4; i <= 12; i++)
+			for (i = 4; i <= 13; i++)
 nulls[i] = true;
 		}
 
@@ -223,3 +236,210 @@ heap_page_items(PG_FUNCTION_ARGS)
 	else
 		SRF_RETURN_DONE(fctx);
 }
+
+PG_FUNCTION_INFO_V1(tuple_data_split);
+Datum
+tuple_data_split(PG_FUNCTION_ARGS)
+{
+	Oidrel_oid;
+	bytea			*raw_data;
+	uint16			t_infomask;
+	uint16			t_infomask2;
+	text			*t_bits_str;
+	RelationData	*rel;
+	TupleDesc		tuple_desc;
+	bool			do_detoast = false;
+	interror_level = ERROR;
+
+	bits8			*t_bits = NULL;
+	Datum			res;
+
+	rel_oid		= PG_GETARG_OID(0);
+	raw_data	= PG_GETARG_BYTEA_P(1);
+	t_infomask	= PG_GETARG_INT16(2);
+	t_infomask2	= PG_GETARG_INT16(3);
+	t_bits_str	= PG_ARGISNULL(4) ? NULL : PG_GETARG_TEXT_P(4);
+	if (PG_NARGS()>=6) do_detoast = PG_GETARG_BOOL(5);
+	if (PG_NARGS()>=7) error_level = PG_GETARG_BOOL(6)?WARNING:ERROR;
+
+	if (!superuser())
+		ereport(ERROR,
+(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+	(errmsg("must be superuser to use raw page functions";
+
+	/*
+	 * Here we converting t_bits string back to the bits8 array,
+	 * as it really is in the tuple header
+	 */
+	if (t_infomask & HEAP_HASNULL)
+	{
+		int bits_str_len;
+		int bits_len;
+		char *

Re: [HACKERS] pageinspect patch, for showing tuple data

2015-09-10 Thread Nikolay Shaplov
В письме от 10 сентября 2015 15:46:25 пользователь Michael Paquier написал:

> > So if move tuple data parsing into separate function, then we have to pass
> > these values alongside the tuple data. This is not convenient at all.
> > So I did it in a way you see in the patch.
> 
> Why is it not convenient at all? Yes, you have a point, we need those
> fields to be able to parse the t_data properly. Still the possibility
> to show individual fields of a tuple as a bytea array either with
> toasted or detoasted values is a concept completely different from
> simply showing the page items, which is what, it seems to me,
> heap_page_items is aimed to only do. Hence, As t_infomask2, t_infomask
> and t_bits are already available as return fields of heap_page_items,
> we should simply add a function like that:
> heap_page_item_parse(Oid relid, bytea data, t_infomask2 int,
> t_infomask int, t_bits int, bool force_detoast, warning_mode bool)
> returns bytea[]

Just compare two expressions:

select * from heap_page_item_attrs(get_raw_page('test', 0),'test'::regclass);

and 

select  lp, lp_off, lp_flags, lp_len, t_xmin, t_xmax, t_field3, t_ctid, 
t_infomask2, t_infomask, t_hoff, t_bits, t_oid,  tuple_data_parse (
t_data,  t_infomask, t_infomask2, t_bits, 'test'::regclass, true, false) 
from heap_page_item_attrs(get_raw_page('test', 0));

The second variant is a total mess and almost unsable...

Though I've discussed this issue with friedns, and we came to conclusion that 
it might be good to implement tuple_data_parse and then implement 
easy to use heap_page_item_attrs in pure SQL, using heap_page_items and 
tuple_data_parse.

That would keep usage simplicity, and make code more simple as you offer.

The only one argument against it is that in internal representation t_bist is 
binary, and in sql output it is string with '0' and '1' characters. We will 
have to convert it back to binary mode. This is not difficult, but just useless 
convertations there and back again.

What do you think about this solution?


> Note that the data corruption checks apply only to this function as
> far as I understand, so I think that things could be actually split
> into two independent patches:
> 1) Upgrade heap_page_items to add the tuple data as bytea.
> 2) Add the new function able to parse those fields appropriately.
> 
> As this code, as you justly mentioned, is aimed mainly for educational
> purposes to understand a page structure, we should definitely make it
> as simple as possible at code level, and it seems to me that this
> comes with a separate SQL interface to control tuple data parsing as a
> bytea[]. We are doing no favor to our users to complicate the code of
> pageinspect.c as this patch is doing in its current state, especially
> if beginners want to have a look at it.
> 
> >> Actually not two functions, but just one, with an extra flag to be
> >> able to enforce detoasting on the field values where this can be done.
> > 
> > Yeah, I also thought about it. But did not come to any final conclusion.
> > Should we add forth argument, that enables detoasting, instead of adding
> > separate function?
> 
> This is definitely something you want to control with a switch.
Ok.Let's come to the final decision with tuple_data_parse, and i will add this 
switch there and to pure sql heap_page_item_attrs


-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-09-09 Thread Nikolay Shaplov
В письме от 8 сентября 2015 11:53:24 Вы написали:
> On Sat, Sep 5, 2015 at 1:05 AM, Nikolay Shaplov wrote:
> > В письме от 4 сентября 2015 14:58:29 пользователь Michael Paquier написал:
> > > Documentation is missing, that would be good to have to understand what
> > > each function is intended to do.
> > 
> > I were going to add documentation when this patch is commited, or at least
> > approved for commit. So I would know for sure that function definition
> > would
> > not change, so had not to rewrite it again and again.
> 
> I doubt that this patch would be committed without documentation, this is a
> requirement for any patch.
Ok. I can do it. 

> > So if it is possible I would write documentation and test when this patch
> > is
> > already approved.
> 
> I'm fine with that. Let's discuss its shape and come up with an agreement.
> It would have been good to post test cases of what this stuff actually does
> though, people reviewing this thread are limited to guess on what is tried
> to be achieved just by reading the code.

To test non detoasted functions one should do:

CREATE EXTENSION pageinspect;

create table test (a int, b smallint,c varchar);
insert into test VALUES
(-1,2,'111'),
(2,null,'222'),
(3,8,'333'),
(null,16,null);

Then 

# select * from heap_page_items(get_raw_page('test', 0));
 lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | 
t_infomask2 | t_infomask | t_hoff |  t_bits  | t_oid | t_data 
++--++++--++-+++--+---+
  1 |   8152 |1 | 34 |763 |  0 |0 | (0,1)  |
   3 |   2050 | 24 |  |   | \x020009313131
  2 |   8120 |1 | 32 |763 |  0 |0 | (0,2)  |
   3 |   2051 | 24 | 1010 |   | \x020009323232
  3 |   8080 |1 | 34 |763 |  0 |0 | (0,3)  |
   3 |   2050 | 24 |  |   | \x030008000933
  4 |   8048 |1 | 26 |763 |  0 |0 | (0,4)  |
   3 |   2049 | 24 | 0100 |   | \x1000

or 

# select * from heap_page_item_attrs(get_raw_page('test', 0),'test'::regclass);
 lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | 
t_infomask2 | t_infomask | t_hoff |  t_bits  | t_oid | t_attrs  
   
++--++++--++-+++--+---+-
  1 |   8152 |1 | 34 |763 |  0 |0 | (0,1)  |
   3 |   2050 | 24 |  |   | 
{"\\x","\\x0200","\\x09313131"}
  2 |   8120 |1 | 32 |763 |  0 |0 | (0,2)  |
   3 |   2051 | 24 | 1010 |   | 
{"\\x0200",NULL,"\\x09323232"}
  3 |   8080 |1 | 34 |763 |  0 |0 | (0,3)  |
   3 |   2050 | 24 |  |   | 
{"\\x0300","\\x0800","\\x0933"}
  4 |   8048 |1 | 26 |763 |  0 |0 | (0,4)  |
   3 |   2049 | 24 | 0100 |   | {NULL,"\\x1000",NULL}
(4 rows)

For detoasted function you should do something like this:

CREATE EXTENSION pageinspect;

create table test2 (c varchar);

insert into test2 VALUES ('++');
insert into test2 VALUES (repeat('+',2100));

Then  heap_page_item_attrs will show you compressed attr data:

# select * from heap_page_item_attrs(get_raw_page('test2', 
0),'test2'::regclass);
 lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | 
t_infomask2 | t_infomask | t_hoff | t_bits | t_oid |
t_attrs
++--++++--++-++++---+---
  1 |   8160 |1 | 27 |768 |  0 |0 | (0,1)  |
   1 |   2050 | 24 ||   | {"\\x072b2b"}
  2 |   8096 |1 | 59 |769 |  0 |0 | (0,2)  |
   1 |   2050 | 24 ||   | 
{"\\x8e003408fe2b0f01ff0f01ff0f01ff0f01ff0f01ff0f01ff0f01ff010f01aa"}
(2 rows)

and heap_page_item_detoast_attrs will show you original data

# select * from heap_page_item_detoast_attrs(get_raw_page('test2', 
0),'test2'::regclass);
[will not paste output here it is too long, see it for yourself]


> That's actually where
> documentation, even a draft of documentation helps a lot in the review to
> see if what is expected by the 

Re: [HACKERS] pageinspect patch, for showing tuple data

2015-09-04 Thread Nikolay Shaplov
В письме от 4 сентября 2015 14:58:29 пользователь Michael Paquier написал:

> Yeah, I think that's acceptable to have a switch, defaulting to ERROR if
> caller specifies nothing.
> 
> Here are some observations after looking at the code, no functional testing.
> 
> +   int error_mode = PG_NARGS()==3 ? PG_GETARG_BOOL(2)
> 
> :NULL;
> 
> When handling additional arguments, it seems to me that it is more readable
> to use something like that:
> if (PG_NARGS >= 3)
> {
>  arg = PG_GETARG_XXX(2);
>  //etc.
> }
> 
> +   error_mode = error_mode?WARNING:ERROR;
> 
> This generates warnings at compilation.
> 
yeah, what I have done here is too complex with no reason. I've simplified this 
code now.

> +   if (SRF_IS_FIRSTCALL() && (error_mode == WARNING))
> +   {
> +   ereport(WARNING,
> +   (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
> +   (errmsg("Runnung heap_page_item_attrs in
> WARNING mode.";
> +   }
> 
> I am not convinced that this is necessary.
I've removed it.

> 
> +   inter_call_data->raw_page_size = VARSIZE(raw_page) -
> VARHDRSZ;
> +   if (inter_call_data->raw_page_size < SizeOfPageHeaderData)
> Including raw_page_size in the status data does not seem necessary to me.
> The page data is already available for each loop so you could just extract
> it from it.
> 
> +   ereport(inter_call_data->error_level,
> +   (errcode(ERRCODE_DATA_CORRUPTED),
> +   errmsg("Data corruption: number of
> attributes in tuple header is greater than number of attributes in tuple
> descripor")));
> I'd rather switch the error reports related to data corruption from ereport
> to elog, which is more suited for internal errors, and it seems to me that
> it is the case here.
Hm... First, since we have proper error code, do not see why not give it.

Second, this is not exactly internal error, in some cases user tries to parse 
corrupted data on purpose. So for user it is not an internal error, error from 
deep below, but error on the level he is currently working at. So I would not 
call it internal error.

> heapfuncs.c:370:7: warning: variable 'raw_attr' is used uninitialized
> whenever 'if' condition is false [-Wsometimes-uninitialized]
> if (!is_null)
> ^~~~
> heapfuncs.c:419:43: note: uninitialized use occurs here
> raw_attrs = accumArrayResult(raw_attrs, raw_attr, is_null,
> BYTEAOID, fctx->multi_call_memory_ctx);
> ^~~~
> heapfuncs.c:370:3: note: remove the 'if' if its condition is always true
> if (!is_null)
> ^
> heapfuncs.c:357:17: note: initialize the variable 'raw_attr' to silence
> this warning
> Datum raw_attr;
> My compiler complains about uninitialized variables.
Fixed it

> +--
> +-- _heap_page_items()
> +--
> +CREATE FUNCTION _heap_page_items(IN page bytea, IN relname text,
> I am not sure why this is necessary and visibly it overlaps with the
> existing declaration of heap_page_items. The last argument is different
> though, and I recall that we decided not to use anymore the relation name
> specified as text, but its OID instead in this module.
Oh! This should not go to the final patch :-( Sorry. Removed it.


> Documentation is missing, that would be good to have to understand what
> each function is intended to do.

I were going to add documentation when this patch is commited, or at least 
approved for commit. So I would know for sure that function definition would 
not change, so had not to rewrite it again and again.

So if it is possible I would write documentation and test when this patch is 
already approved.

> Code has some whitespaces.
I've found and removed some. Hope that was all of them...


-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Companydiff --git a/contrib/pageinspect/Makefile b/contrib/pageinspect/Makefile
index aec5258..e4bc1af 100644
--- a/contrib/pageinspect/Makefile
+++ b/contrib/pageinspect/Makefile
@@ -5,7 +5,7 @@ OBJS		= rawpage.o heapfuncs.o btreefuncs.o fsmfuncs.o \
 		  brinfuncs.o ginfuncs.o $(WIN32RES)
 
 EXTENSION = pageinspect
-DATA = pageinspect--1.3.sql pageinspect--1.2--1.3.sql \
+DATA = pageinspect--1.4.sql pageinspect--1.3--1.4.sql pageinspect--1.2--1.3.sql \
 	pageinspect--1.1--1.2.sql pageinspect--1.0--1.1.sql \
 	pageinspect--unpackaged--1.0.sql
 PGFILEDESC = "pageinspect - functions to inspect contents of database pages"
diff --git a/contrib/pageinspect/heapfuncs.c b/contrib/pageinspect

Re: [HACKERS] pageinspect patch, for showing tuple data

2015-09-02 Thread Nikolay Shaplov
В письме от 3 августа 2015 15:35:23 пользователь Alvaro Herrera написал:
> Nikolay Shaplov wrote:
> > This patch adds several new functions, available from SQL queries. All
> > these functions are based on heap_page_items, but accept slightly
> > different arguments and has one additional column at the result set:
> > 
> > heap_page_tuples - accepts relation name, and bulkno, and returns usual
> > heap_page_items set with additional column that contain snapshot of tuple
> > data area stored in bytea.
> 
> I think the API around get_raw_page is more useful generally.  You might
> have table blocks stored in a bytea column for instance, because you
> extracted from some remote server and inserted into a local table for
> examination.  If you only accept relname/blkno, there's no way to
> examine data other than blocks directly in the database dir, which is
> limiting.
> 
> > There is also one strange function: _heap_page_items it is useless for
> > practical purposes. As heap_page_items it accepts page data as bytea, but
> > also get a relation name. It tries to apply tuple descriptor of that
> > relation to that page data.
> 
> For BRIN, I added something similar (brin_page_items), but it receives
> the index OID rather than name, and constructs a tuple descriptor to
> read the data.  I think OID is better than name generally.  (You can
> cast the relation name to regclass).
> 
> It's easy to misuse, but these functions are superuser-only, so there
> should be no security issue at least.  The possibility of a crash
> remains real, though, so maybe we should try to come up with a way to
> harden that.

I've done as you've said: Now patch adds two functions heap_page_item_attrs 
and heap_page_item_detoast_attrs and these function takes raw_page and 
relation OID as arguments. They also have third optional parameter that allows 
to change error mode from ERROR to WARNING.

Is it ok now?


-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Companydiff --git a/contrib/pageinspect/Makefile b/contrib/pageinspect/Makefile
index aec5258..e4bc1af 100644
--- a/contrib/pageinspect/Makefile
+++ b/contrib/pageinspect/Makefile
@@ -5,7 +5,7 @@ OBJS		= rawpage.o heapfuncs.o btreefuncs.o fsmfuncs.o \
 		  brinfuncs.o ginfuncs.o $(WIN32RES)
 
 EXTENSION = pageinspect
-DATA = pageinspect--1.3.sql pageinspect--1.2--1.3.sql \
+DATA = pageinspect--1.4.sql pageinspect--1.3--1.4.sql pageinspect--1.2--1.3.sql \
 	pageinspect--1.1--1.2.sql pageinspect--1.0--1.1.sql \
 	pageinspect--unpackaged--1.0.sql
 PGFILEDESC = "pageinspect - functions to inspect contents of database pages"
diff --git a/contrib/pageinspect/heapfuncs.c b/contrib/pageinspect/heapfuncs.c
index 8d1666c..59eae0f 100644
--- a/contrib/pageinspect/heapfuncs.c
+++ b/contrib/pageinspect/heapfuncs.c
@@ -29,7 +29,12 @@
 #include "funcapi.h"
 #include "utils/builtins.h"
 #include "miscadmin.h"
+#include "utils/array.h"
+#include "utils/rel.h"
+#include "catalog/namespace.h"
+#include "catalog/pg_type.h"
 
+#include "rawpage.h"
 
 /*
  * bits_to_text
@@ -53,6 +58,9 @@ bits_to_text(bits8 *bits, int len)
 	return str;
 }
 
+Datum heap_page_items_internal(FunctionCallInfo fcinfo, bytea *raw_page, Oid rel_oid, int error_level, bool do_detoast);
+void fill_header_data(FuncCallContext *fctx, Datum *values, bool *nulls, bool do_detoast);
+void split_tuple_data(FuncCallContext *fctx, HeapTupleHeader tuphdr, Datum *values, bool *nulls, bool do_detoast);
 
 /*
  * heap_page_items
@@ -66,38 +74,80 @@ typedef struct heap_page_items_state
 	TupleDesc	tupd;
 	Page		page;
 	uint16		offset;
+	TupleDesc	page_tuple_desc;
+	int		raw_page_size;
+	int		error_level;
 } heap_page_items_state;
 
 Datum
 heap_page_items(PG_FUNCTION_ARGS)
 {
 	bytea	   *raw_page = PG_GETARG_BYTEA_P(0);
+	return heap_page_items_internal(fcinfo, raw_page, InvalidOid, ERROR, false);
+}
+
+PG_FUNCTION_INFO_V1(heap_page_item_attrs);
+Datum
+heap_page_item_attrs(PG_FUNCTION_ARGS)
+{
+	bytea		*raw_page = PG_GETARG_BYTEA_P(0);
+	Oid		rel_oid = PG_GETARG_OID(1);
+	int		error_mode = PG_NARGS()==3 ? PG_GETARG_BOOL(2) :NULL;
+	error_mode = error_mode?WARNING:ERROR;
+	if (SRF_IS_FIRSTCALL() && (error_mode == WARNING))
+	{
+		ereport(WARNING,
+			(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+(errmsg("Runnung heap_page_item_attrs in WARNING mode.";
+	}
+
+	return heap_page_items_internal(fcinfo, raw_page, rel_oid, error_mode, false);
+}
+
+PG_FUNCTION_INFO_V1(heap_page_item_detoast_attrs);
+Datum
+heap_page_item_detoast_attrs(PG_FUNCTION_ARGS)
+{
+	bytea		*raw_page = PG_GETARG_BYTEA_P(0);
+	Oid		rel_oid = PG_GETARG_OID(1);
+	int		error_mode = PG_NARGS()==3 ? PG_GETARG_BOOL(2) :NULL;
+	error_mode = error_mode?WARNING:ERROR;
+	if (SRF_IS_FIRSTCALL() && (erro

[HACKERS] commitfest does not see my real latest patch

2015-09-02 Thread Nikolay Shaplov
Hi!

I've added a patch to commitfest

https://commitfest.postgresql.org/7/363/


but in my mail thread I've committed two version of patch. Old one 
(pageinspect.diff) and a new one (pageinspect_show_tuple_data.diff)

and commitfest software for some reason sees only old patch, not new. I think 
that this may be a bug. 

And also I'd like reviewers to see proper patch as "Latest patch", if someone 
can manually fix that, it would be great!

-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-08-05 Thread Nikolay Shaplov
В письме от 3 августа 2015 15:35:23 Вы написали:
 Nikolay Shaplov wrote:
  This patch adds several new functions, available from SQL queries. All
  these functions are based on heap_page_items, but accept slightly
  different arguments and has one additional column at the result set:
  
  heap_page_tuples - accepts relation name, and bulkno, and returns usual
  heap_page_items set with additional column that contain snapshot of tuple
  data area stored in bytea.
 
 I think the API around get_raw_page is more useful generally.  You might
 have table blocks stored in a bytea column for instance, because you
 extracted from some remote server and inserted into a local table for
 examination.  If you only accept relname/blkno, there's no way to
 examine data other than blocks directly in the database dir, which is
 limiting.
 
  There is also one strange function: _heap_page_items it is useless for
  practical purposes. As heap_page_items it accepts page data as bytea, but
  also get a relation name. It tries to apply tuple descriptor of that
  relation to that page data.
 
 For BRIN, I added something similar (brin_page_items), but it receives
 the index OID rather than name, and constructs a tuple descriptor to
 read the data.  I think OID is better than name generally.  (You can
 cast the relation name to regclass).
 
 It's easy to misuse, but these functions are superuser-only, so there
 should be no security issue at least.  The possibility of a crash
 remains real, though, so maybe we should try to come up with a way to
 harden that.

Hmm... may be you are right. I'll try to change it would take raw page and 
OID.

-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pageinspect patch, for showing tuple data

2015-08-03 Thread Nikolay Shaplov
В письме от 3 августа 2015 14:30:46 пользователь Michael Paquier написал:
 On Mon, Aug 3, 2015 at 1:03 AM, Nikolay Shaplov
 
 n.shap...@postgrespro.ru wrote:
  Hi!
  
  I've created a patch for pageinspect that allows to see data stored in the
  tuple.
  
  This patch has two main purposes:
  
  1. Practical: Make manual DB recovery more simple
 
 To what are you referring to in this case? Manual manipulation of
 on-disk data manually?
Yes, when DB is broken for example

 
  b) I have plenty of sanity check while reading parsing that tuple, for
  this
  function I've changed error level from ERROR to WARNING. This function
  will
  allow to write proper tests that all these checks work as they were
  designed (I hope to write these tests sooner or later)
 
 +ereport(inter_call_data-error_level,
 +(errcode(ERRCODE_DATA_CORRUPTED),
 +errmsg(Data corruption: Iterating over
 tuple data reached out of actual tuple size)));
 I don't think that the possibility to raise a WARNING is a good thing
 in those code paths. If data is corrupted this may crash, and I am not
 sure that anybody would want that even for educational purposes.
Hm... I considered _heap_page_items really very dangerous function, with big 
red Do not call it if you not sure what are you doing warning. Reading data 
with not proper attribute descriptor is dangerous any way. But when I wrote 
that code, I did not have that checks at first, and it was really interesting 
to subst one data with another and see what will happen. And I thought that 
may be other explorers will like to do the same. And it is really possible 
only in warning mode. So I left warnings only in _heap_page_items, and set 
errors for all other cases.


-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pageinspect patch, for showing tuple data

2015-08-02 Thread Nikolay Shaplov
Hi!

I've created a patch for pageinspect that allows to see data stored in the 
tuple.

This patch has two main purposes:

1. Practical: Make manual DB recovery more simple
2. Educational: Seeing what data is actually stored in tuple, allows to get 
better understanding of how does postgres actually works.

This patch adds several new functions, available from SQL queries. All these 
functions are based on heap_page_items, but accept slightly different 
arguments and has one additional column at the result set:

heap_page_tuples - accepts relation name, and bulkno, and returns usual 
heap_page_items set with additional column that contain snapshot of tuple data 
area stored in bytea.

heap_page_tuples_attributes - same as heap_page_tuples, but instead of single 
tuple data bytea snapshot, it has array of bytea values, that were splitted 
into attributes as they would be spitted by nocachegetattr function (I 
actually reimplemented this function main algorithm to get this done)

heap_page_tuples_attrs_detoasted - same as heap_page_tuples_attrs, but all 
varlen attributes values that were compressed or TOASTed, are replaced with 
unTOASTed and uncompressed values.


There is also one strange function: _heap_page_items it is useless for 
practical purposes. As heap_page_items it accepts page data as bytea, but also 
get a relation name. It tries to apply tuple descriptor of that relation to 
that page data. 
This would allow you to try to read page data from one table using tuple 
descriptor from anther. A strange idea, one should say. But this will allow 
you: a) See how wrong data can be interpreted (educational purpose).
b) I have plenty of sanity check while reading parsing that tuple, for this 
function I've changed error level from ERROR to WARNING. This function will 
allow to write proper tests that all these checks work as they were designed 
(I hope to write these tests sooner or later)

I've also added raw tuple data output to original heap_page_items function, 
thought I am not sure if it is good idea. I just can add it there so I did it. 
May be it would be better to change it back for better backward compatibility.

Attached patched is in almost ready state. It has some formatting issues. 
I'd like to hear HACKER's opinion before finishing it and sending to 
commitfest.

-- 
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Companydiff --git a/contrib/pageinspect/Makefile b/contrib/pageinspect/Makefile
index aec5258..e4bc1af 100644
--- a/contrib/pageinspect/Makefile
+++ b/contrib/pageinspect/Makefile
@@ -5,7 +5,7 @@ OBJS		= rawpage.o heapfuncs.o btreefuncs.o fsmfuncs.o \
 		  brinfuncs.o ginfuncs.o $(WIN32RES)
 
 EXTENSION = pageinspect
-DATA = pageinspect--1.3.sql pageinspect--1.2--1.3.sql \
+DATA = pageinspect--1.4.sql pageinspect--1.3--1.4.sql pageinspect--1.2--1.3.sql \
 	pageinspect--1.1--1.2.sql pageinspect--1.0--1.1.sql \
 	pageinspect--unpackaged--1.0.sql
 PGFILEDESC = pageinspect - functions to inspect contents of database pages
diff --git a/contrib/pageinspect/heapfuncs.c b/contrib/pageinspect/heapfuncs.c
index 8d1666c..e296619 100644
--- a/contrib/pageinspect/heapfuncs.c
+++ b/contrib/pageinspect/heapfuncs.c
@@ -29,7 +29,12 @@
 #include funcapi.h
 #include utils/builtins.h
 #include miscadmin.h
+#include utils/array.h
+#include utils/rel.h
+#include catalog/namespace.h
+#include catalog/pg_type.h
 
+#include rawpage.h
 
 /*
  * bits_to_text
@@ -53,6 +58,9 @@ bits_to_text(bits8 *bits, int len)
 	return str;
 }
 
+Datum heap_page_items_internal(FunctionCallInfo fcinfo, bytea *raw_page, text *relname, int error_level, bool do_detoast);
+void fill_header_data(FuncCallContext *fctx, Datum *values, bool *nulls, bool do_detoast);
+void split_tuple_data(FuncCallContext *fctx, HeapTupleHeader tuphdr, Datum *values, bool *nulls, bool do_detoast);
 
 /*
  * heap_page_items
@@ -66,38 +74,98 @@ typedef struct heap_page_items_state
 	TupleDesc	tupd;
 	Page		page;
 	uint16		offset;
+	TupleDesc	page_tuple_desc;
+	int		raw_page_size;
+	int		error_level;
 } heap_page_items_state;
 
 Datum
 heap_page_items(PG_FUNCTION_ARGS)
 {
 	bytea	   *raw_page = PG_GETARG_BYTEA_P(0);
+	text	   *relname = PG_NARGS()==2 ? PG_GETARG_TEXT_P(1) : NULL;
+
+	/*
+	 * Error level is only used while splitting tuple into array of attributes
+	 * This is done only for _heap_page_items function. _heap_page_items is intended
+	 * for educational and research purposes, so we should change all error checks
+	 * to warnings
+	 */
+	return heap_page_items_internal(fcinfo, raw_page, relname, WARNING, false);
+}
+
+PG_FUNCTION_INFO_V1(heap_page_tuples);
+Datum
+heap_page_tuples(PG_FUNCTION_ARGS)
+{
+	text   *relname = PG_GETARG_TEXT_P(0);
+	uint32 blkno= PG_GETARG_UINT32(1);
+	bytea *raw_page;
+
+	if (SRF_IS_FIRSTCALL())
+	{
+		raw_page = get_raw_page_internal(relname, MAIN_FORKNUM, blkno);
+	}
+	return heap_page_items_internal(fcinfo, raw_page, NULL, ERROR, false);
+}
+
+PG_FUNCTION_INFO_V1