Re: [HACKERS] Renaming more clearly SHA functions in pgcrypto/

2016-07-04 Thread Michael Paquier
On Tue, Jul 5, 2016 at 2:09 AM, Alvaro Herrera  wrote:
> Michael Paquier wrote:
>> Hi all,
>> (folks interested in that in CC)
>>
>> While looking at some facility in pgcrypto, I have noticed the stanza
>> created by 56f4478 to prevent conflicts with OpenSSL, like that:
>> +#define SHA256_Init pg_SHA256_Init
>> +#define SHA256_Update pg_SHA256_Update
>>
>> Wouldn't it be better to avoid that, and just rename all those
>> functions as pg_shaXX_foo?
>
> Sure.

OK, so I guess I'll get something into a shape like that.
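
Concretely, the direction would be something like this (just a sketch; the
pg_sha256_* names and the context typedef are illustrative, not the final
patch):

    /* today in pgcrypto's sha2.h: keep the OpenSSL-style names and hide the
     * real symbols behind #defines */
    #define SHA256_Init   pg_SHA256_Init
    #define SHA256_Update pg_SHA256_Update

    /* proposed: rename the functions themselves, so no macro layer is needed
     * and the symbols can never collide with OpenSSL's */
    extern void pg_sha256_init(pg_sha256_ctx *ctx);
    extern void pg_sha256_update(pg_sha256_ctx *ctx,
                                 const uint8 *data, size_t len);
    extern void pg_sha256_final(pg_sha256_ctx *ctx, uint8 *digest);
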
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Declarative partitioning

2016-07-04 Thread Amit Langote
On 2016/07/04 21:31, Ashutosh Bapat wrote:
> Hi Amit,
> I observed that the ChangeVarNodes call at line 1229 in
> get_relation_constraints() (code below) changes the varno in the cached
> copy of the partition key expression, which is undesirable. The reason for
> this seems to be that RelationGetPartitionCheckQual() returns the cached
> copy of the partition key expression directly. This is mostly because
> get_check_qual_for_range() works directly on the cached copy of the
> partition key expressions, which it should never do.

Yes, a copyObject() on key->partexprs items seems necessary. Will fix that.
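
Roughly, something like the following in the patch (just a sketch;
RelationGetPartitionCheckQual and key->partexprs belong to the patch under
review, not to committed PostgreSQL code):

    /* when building the check quals, never hand out the cached trees */
    ListCell   *lc;
    List       *exprs = NIL;

    foreach(lc, key->partexprs)
        exprs = lappend(exprs, copyObject(lfirst(lc)));

    /* use "exprs" (a fresh copy) from here on, so that a later
     * ChangeVarNodes() in get_relation_constraints() cannot scribble on
     * the relcache's copy */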

> 1223 /* Append partition check quals, if any */
> 1224 pcqual = RelationGetPartitionCheckQual(relation);
> 1225 if (pcqual)
> 1226 {
> 1227 /* Fix Vars to have the desired varno */
> 1228 if (varno != 1)
> 1229 ChangeVarNodes((Node *) pcqual, 1, varno, 0);
> 1230
> 1231 result = list_concat(result, pcqual);
> 1232 }
> 
> Because of this, the partition key expressions are valid the first time
> through, but from then on they are stamped with the varno of the first
> partition.
> 
> Please add testcases to your patch to catch such types of issues.

I will integrate tests into the patch(es) and add some more.

Thanks,
Amit




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [CF2016-9] Allow spaces in working path on tap-tests

2016-07-04 Thread Michael Paquier
On Mon, Jul 4, 2016 at 4:44 PM, Michael Paquier
 wrote:
> And as is the command built for zic.exe in Install.pm, no? $target is
> normally an absolute path per the call of Install().

Attached is the patch I have in mind. After more investigation, the
zic.exe call is indeed broken: $target can be a full path, and if it
contains a space things blow up. The commands of vcregress's
upgradecheck need a cleanup as well. I have merged both patches
together, and the src/tools/msvc part needs a backpatch. Separating the
two is trivial anyway, as the MSVC and TAP changes touch two completely
separate paths.
-- 
Michael
diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index 636dfec..5767919 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -475,7 +475,8 @@ sub backup
 	my $name= $self->name;
 
 	print "# Taking pg_basebackup $backup_name from node \"$name\"\n";
-	TestLib::system_or_bail("pg_basebackup -D $backup_path -p $port -x");
+	TestLib::system_or_bail('pg_basebackup', '-D', $backup_path,
+			'-p', $port, '-x');
 	print "# Backup finished\n";
 }
 
@@ -763,7 +764,7 @@ sub enable_restoring
 	my $copy_command =
 	  $TestLib::windows_os
 	  ? qq{copy "$path%f" "%p"}
-	  : qq{cp $path/%f %p};
+	  : qq{cp "$path/%f" "%p"};
 
 	$self->append_conf(
 		'recovery.conf', qq(
@@ -791,7 +792,7 @@ sub enable_archiving
 	my $copy_command =
 	  $TestLib::windows_os
 	  ? qq{copy "%p" "$path%f"}
-	  : qq{cp %p $path/%f};
+	  : qq{cp "%p" "$path/%f"};
 
 	# Enable archive_mode and archive_command on node
 	$self->append_conf(
diff --git a/src/tools/msvc/Install.pm b/src/tools/msvc/Install.pm
index 031719d..29a4c3e 100644
--- a/src/tools/msvc/Install.pm
+++ b/src/tools/msvc/Install.pm
@@ -383,10 +383,19 @@ sub GenerateTimezoneFiles
 	$mf =~ /^TZDATA\s*:?=\s*(.*)$/m
 	  || die "Could not find TZDATA row in timezone makefile\n";
 	my @tzfiles = split /\s+/, $1;
-	unshift @tzfiles, '';
+
 	print "Generating timezone files...";
-	system("$conf\\zic\\zic -d \"$target/share/timezone\" "
-		  . join(" src/timezone/data/", @tzfiles));
+
+	my @args = ("$conf/zic/zic",
+'-d',
+"$target/share/timezone");
+	foreach (@tzfiles)
+	{
+		my $tzfile = $_;
+		push(@args, "src/timezone/data/$tzfile")
+	}
+
+	system(@args);
 	print "\n";
 }
 
@@ -634,9 +643,10 @@ sub CopyIncludeFiles
 		next unless (-d "src/include/$d");
 
 		EnsureDirectories("$target/include/server/$d");
-		system(
-qq{xcopy /s /i /q /r /y src\\include\\$d\\*.h "$ctarget\\include\\server\\$d\\"}
-		) && croak("Failed to copy include directory $d\n");
+		my @args = ('xcopy', '/s', '/i', '/q', '/r', '/y',
+ "src\\include\\$d\\*.h",
+ "$ctarget\\include\\server\\$d\\");
+		system(@args) && croak("Failed to copy include directory $d\n");
 	}
 	closedir($D);
 
@@ -689,9 +699,11 @@ sub GenerateNLSFiles
 
 			EnsureDirectories($target, "share/locale/$lang",
 "share/locale/$lang/LC_MESSAGES");
-			system(
-"\"$nlspath\\bin\\msgfmt\" -o \"$target\\share\\locale\\$lang\\LC_MESSAGES\\$prgm-$majorver.mo\" $_"
-			) && croak("Could not run msgfmt on $dir\\$_");
+			my @args = ("$nlspath\\bin\\msgfmt",
+			   '-o',
+			   "$target\\share\\locale\\$lang\\LC_MESSAGES\\$prgm-$majorver.mo",
+			   $_);
+			system(@args) && croak("Could not run msgfmt on $dir\\$_");
 			print ".";
 		}
 	}
diff --git a/src/tools/msvc/vcregress.pl b/src/tools/msvc/vcregress.pl
index 075279a..0a95bcb 100644
--- a/src/tools/msvc/vcregress.pl
+++ b/src/tools/msvc/vcregress.pl
@@ -412,35 +412,42 @@ sub upgradecheck
 	print "\nRunning initdb on old cluster\n\n";
 	standard_initdb() or exit 1;
 	print "\nStarting old cluster\n\n";
-	system("pg_ctl start -l $logdir/postmaster1.log -w") == 0 or exit 1;
+	my @args = ('pg_ctl', 'start', '-l', "$logdir/postmaster1.log", '-w');
+	system(@args) == 0 or exit 1;
 	print "\nSetting up data for upgrading\n\n";
 	installcheck();
 
 	# now we can chdir into the source dir
 	chdir "$topdir/src/bin/pg_upgrade";
 	print "\nDumping old cluster\n\n";
-	system("pg_dumpall -f $tmp_root/dump1.sql") == 0 or exit 1;
+	@args = ('pg_dumpall', '-f', "$tmp_root/dump1.sql");
+	system(@args) == 0 or exit 1;
 	print "\nStopping old cluster\n\n";
 	system("pg_ctl -m fast stop") == 0 or exit 1;
 	$ENV{PGDATA} = "$data";
 	print "\nSetting up new cluster\n\n";
 	standard_initdb() or exit 1;
 	print "\nRunning pg_upgrade\n\n";
-	system("pg_upgrade -d $data.old -D $data -b $bindir -B $bindir") == 0
+	@args = ('pg_upgrade', '-d', "$data.old", '-D', $data, '-b', $bindir,
+			 '-B', $bindir);
+	system(@args) == 0
 	  or exit 1;
 	print "\nStarting new cluster\n\n";
-	system("pg_ctl -l $logdir/postmaster2.log -w start") == 0 or exit 1;
+	@args = ('pg_ctl', '-l', "$logdir/postmaster2.log", '-w', 'start');
+	system(@args) == 0 or exit 1;
 	print "\nSetting up stats on new cluster\n\n";
 	system(".\\analyze_new_cluster.bat") == 0 or exit 1;
 	print "\nDumping new cluster\n\n";
-	

Re: [HACKERS] to_date_valid()

2016-07-04 Thread Pavel Stehule
2016-07-05 2:39 GMT+02:00 Andreas 'ads' Scherbaum :

> On 04.07.2016 18:37, Pavel Stehule wrote:
>
>>
>> I don't know if the name "strict" is best, but the name "validate" is
>> not good too. Current to_date does some validations too.
>>
>
> Obviously not enough, because it allows invalid dates. I'd say that the
> current to_date() merely validates the input format for string parsing, and
> that the date is in range. But there is not much validation on the date
> itself.
>
> So the name can't be "strict" because of the conflict with "NULL"
> handling, and you don't like "valid" - what other options do you offer?


I have no better one - so a third option looks best to me - it could be a
long name: "only_correct_date", "only_valid_date", "only_valid_date_on_input" ...

Pavel

>
>
> --
> Andreas 'ads' Scherbaum
> German PostgreSQL User Group
> European PostgreSQL User Group - Board of Directors
> Volunteer Regional Contact, Germany - PostgreSQL Project
>


Re: [HACKERS] to_date_valid()

2016-07-04 Thread Pavel Stehule
2016-07-05 4:33 GMT+02:00 David G. Johnston :

> On Mon, Jul 4, 2016 at 8:39 PM, Andreas 'ads' Scherbaum <
> adsm...@wars-nicht.de> wrote:
>
>> On 04.07.2016 18:37, Pavel Stehule wrote:
>>
>>>
>>> I don't know if the name "strict" is best, but the name "validate" is
>>> not good too. Current to_date does some validations too.
>>>
>>
>> Obviously not enough, because it allows invalid dates. I'd say that the
>> current to_date() merely validates the input format for string parsing, and
>> that the date is in range. But there is not much validation on the date
>> itself.
>>
>> So the name can't be "strict" because of the conflict with "NULL"
>> handling, and you don't like "valid" - what other options do you offer?
>
>
> We don't have to change the name...we could do something like how
> RegularExpressions work - like (?i) - and just add a new modifier code.
>
> '~-MI-DD' --that's a leading tilde, could be anything - even
> something like "HM-MI-DD" for "historical mode"
>
> Also, see this thread of a few weeks ago for related material:
>
>
> https://www.postgresql.org/message-id/1873520224.1784572.1465833145330.JavaMail.yahoo%40mail.yahoo.com
>
> It seems that fixing it is back on the table, possibly even for 9.6 since
> this is such a hideous bug - one that closely resembles a cockroach ;)
>
> WRT to the 2014 "reject out-of-range dates" thread, I'm kinda surprised
> that we didn't just set the date to be the minimum or maximum allowable
> date back in 9.2 instead of rejecting it...
>
> I'd be more inclined to add another argument if it was basically an enum.
>
> to_date(text, format, format_style)
>
> This at least opens the door for future enhancements that are associated
> with named styles.  I could imagine an "exact" format that also allows for
> something like regexp character classes so that one can write:
>  "[:SEP:]MM[:SEP:]DD" to keep exact matches but accommodate variations
> on what people type as a separator e.g. [.-/]
>
> format_style=>historical would invoke today's behaviors
>

this is a compatibility break - the question is: "can we break it?"

Pavel


>
> David J.
>
>


Re: [HACKERS] to_date_valid()

2016-07-04 Thread David G. Johnston
On Mon, Jul 4, 2016 at 8:39 PM, Andreas 'ads' Scherbaum <
adsm...@wars-nicht.de> wrote:

> On 04.07.2016 18:37, Pavel Stehule wrote:
>
>>
>> I don't know if the name "strict" is best, but the name "validate" is
>> not good too. Current to_date does some validations too.
>>
>
> Obviously not enough, because it allows invalid dates. I'd say that the
> current to_date() merely validates the input format for string parsing, and
> that the date is in range. But there is not much validation on the date
> itself.
>
> So the name can't be "strict" because of the conflict with "NULL"
> handling, and you don't like "valid" - what other options do you offer?


We don't have to change the name...we could do something like how
RegularExpressions work - like (?i) - and just add a new modifier code.

'~-MI-DD' --that's a leading tilde, could be anything - even something
like "HM-MI-DD" for "historical mode"

Also, see this thread of a few weeks ago for related material:

https://www.postgresql.org/message-id/1873520224.1784572.1465833145330.JavaMail.yahoo%40mail.yahoo.com

It seems that fixing it is back on the table, possibly even for 9.6 since
this is such a hideous bug - one that closely resembles a cockroach ;)

WRT to the 2014 "reject out-of-range dates" thread, I'm kinda surprised
that we didn't just set the date to be the minimum or maximum allowable
date back in 9.2 instead of rejecting it...

I'd be more inclined to add another argument if it was basically an enum.

to_date(text, format, format_style)

This at least opens the door for future enhancements that are associated
with named styles.  I could imagine an "exact" format that also allows for
something like regexp character classes so that one can write:
 "[:SEP:]MM[:SEP:]DD" to keep exact matches but accommodate variations
on what people type as a separator e.g. [.-/]

format_style=>historical would invoke today's behaviors

David J.


Re: [HACKERS] to_date_valid()

2016-07-04 Thread Andreas 'ads' Scherbaum

On 04.07.2016 18:37, Pavel Stehule wrote:


I don't know if the name "strict" is best, but the name "validate" is
not good too. Current to_date does some validations too.


Obviously not enough, because it allows invalid dates. I'd say that the 
current to_date() merely validates the input format for string parsing, 
and that the date is in range. But there is not much validation on the 
date itself.


So the name can't be "strict" because of the conflict with "NULL" 
handling, and you don't like "valid" - what other options do you offer?


--
Andreas 'ads' Scherbaum
German PostgreSQL User Group
European PostgreSQL User Group - Board of Directors
Volunteer Regional Contact, Germany - PostgreSQL Project


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] to_date_valid()

2016-07-04 Thread Andreas Karlsson

On 07/04/2016 10:55 PM, Pavel Stehule wrote:

2016-07-04 22:15 GMT+02:00 Andreas Karlsson >:
I do not see a clear conclusion in the linked threads. For example
Bruce calls it a bug in one of the emails

(https://www.postgresql.org/message-id/201107200103.p6K13ix10517%40momjian.us).

I think we should fix to_date() to throw an error. Personally I
would be happy if my code broke due to this kind of change since the
exception would reveal an old bug which has been there a long time
eating my data. I cannot see a case where I would have wanted the
current behavior.


If I remember correctly, this implementation is based on Oracle's behavior.


In the thread I linked above they claim that Oracle (at least 10g) does 
not work like this.


Thomas Kellerer wrote:
> Oracle throws an error for the above example:
>
> SQL> select to_date('20110231', 'YYYYMMDD') from dual;
> select to_date('20110231', 'YYYYMMDD') from dual
> *
> ERROR at line 1:
> ORA-01839: date not valid for month specified

I do not have access to an Oracle installation so I cannot confirm this 
myself.



It is
pretty old and documented - so it is hard to call it a bug. I
understand that the behavior is strange, but it was designed in a
different time. You can enter a string that is not 100% valid and still
get a correct date:

postgres=# select to_date('2016-12-40','YYYY-MM-DD');

┌────────────┐
│  to_date   │
╞════════════╡
│ 2017-01-09 │
└────────────┘
(1 row)


It can be used for some date calculations. In the old days applications
were designed in this style. I am against changing the default behavior.
We can introduce a new function with a different name and
better-designed format rules, or we can add new options to this
function, or we can live with the current state.


While clever, I think this behavior is a violation of the principle of 
least surprise. It is not obvious to someone reading a query that 
to_date() would behave like this. Especially when Oracle's to_date() 
works differently.


> Now, the to_date function should not be used - the functions make_date
> and make_timestamp are faster and safe.

Yeah, I personally know of this behavior and therefore would never use 
to_date(), but I am far from the average PostgreSQL user.


Andreas


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] to_date_valid()

2016-07-04 Thread Pavel Stehule
2016-07-04 22:15 GMT+02:00 Andreas Karlsson :

> On 07/03/2016 12:36 PM, Andreas 'ads' Scherbaum wrote:
>
>> On 03.07.2016 07:05, Jaime Casanova wrote:
>>
>>> Shouldn't we fix this instead? Sounds like a bug to me. We don't usually
>>> want to be bug compatible so it doesn't matter if we break something.
>>>
>>
>> There are previous discussions about such a change, and this was rejected:
>>
>> https://www.postgresql.org/message-id/lbjf1v%24a2v%241%40ger.gmane.org
>>
>> https://www.postgresql.org/message-id/A737B7A37273E048B164557ADEF4A58B17C9140E%40ntex2010i.host.magwien.gv.at
>>
>>
>> Hence the new function, which does not collide with the existing
>> implementation.
>>
>
> I do not see a clear conclusion in the linked threads. For example Bruce
> calls it a bug in one of the emails (
> https://www.postgresql.org/message-id/201107200103.p6K13ix10517%40momjian.us
> ).
>
> I think we should fix to_date() to throw an error. Personally I would be
> happy if my code broke due to this kind of change since the exception would
> reveal an old bug which has been there a long time eating my data. I cannot
> see a case where I would have wanted the current behavior.
>

If I remember correctly, this implementation is based on Oracle's
behavior. It is pretty old and documented - so it is hard to call it a
bug. I understand that the behavior is strange, but it was designed in a
different time. You can enter a string that is not 100% valid and still
get a correct date:

postgres=# select to_date('2016-12-40','YYYY-MM-DD');

┌────────────┐
│  to_date   │
╞════════════╡
│ 2017-01-09 │
└────────────┘
(1 row)


It can be used for some date calculations. In the old days applications
were designed in this style. I am against changing the default behavior.
We can introduce a new function with a different name and better-designed
format rules, or we can add new options to this function, or we can live
with the current state.

Now, the to_date function should not be used - the functions make_date
and make_timestamp are faster and safe.

postgres=# select make_date(2017,01,40);
ERROR:  date field value out of range: 2017-01-40

Regards

Pavel




> If there is any legitimate use for the current behavior then we can add it
> back as another function.
>
> Andreas
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: [HACKERS] to_date_valid()

2016-07-04 Thread Andreas Karlsson

On 07/03/2016 12:36 PM, Andreas 'ads' Scherbaum wrote:

On 03.07.2016 07:05, Jaime Casanova wrote:

Shouldn't we fix this instead? Sounds like a bug to me. We don't usually
want to be bug compatible so it doesn't matter if we break something.


There are previous discussions about such a change, and this was rejected:

https://www.postgresql.org/message-id/lbjf1v%24a2v%241%40ger.gmane.org
https://www.postgresql.org/message-id/A737B7A37273E048B164557ADEF4A58B17C9140E%40ntex2010i.host.magwien.gv.at


Hence the new function, which does not collide with the existing
implementation.


I do not see a clear conclusion in the linked threads. For example Bruce 
calls it a bug in one of the emails 
(https://www.postgresql.org/message-id/201107200103.p6K13ix10517%40momjian.us).


I think we should fix to_date() to throw an error. Personally I would be 
happy if my code broke due to this kind of change since the exception 
would reveal an old bug which has been there a long time eating my data. 
I cannot see a case where I would have wanted the current behavior.


If there is any legitimate use for the current behavior then we can add 
it back as another function.


Andreas


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]

2016-07-04 Thread Andrew Borodin
Thank you, Alvaro.

Yes, now I see. I'll update gistRedoPageUpdateRecord() to work in
accordance with the patched gistplacetopage().
Also, I've noticed that PageAddItem uses the MAXALIGN() macro to
calculate the tuple size, so PageIndexTupleOverwrite should behave the
same way.
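
Roughly, I expect the redo change to mirror the insert path, something like
this (only a sketch; record decoding is omitted, and the local variable
names are assumptions rather than the actual gistxlogPageUpdate fields):

    /* in gistRedoPageUpdateRecord(), once the record has been decoded into
     * "todelete"/"ntodelete" and the tuples to insert ("itvec"/"ninserted") */
    if (ntodelete == 1 && ninserted == 1)
    {
        /* same shape as the patched gistplacetopage(): replace in place */
        PageIndexTupleOverwrite(page, todelete[0], itvec[0]);
    }
    else
    {
        /* keep the existing behaviour: delete old entries, then append */
        /* ... unchanged PageIndexTupleDelete()/gistfillbuffer() code ... */
    }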

About PageIndexDeleteNoCompact() in BRIN: I do not see why
PageIndexDeleteNoCompact() is not a good solution instead of
PageIndexTupleOverwrite. Will it mark the tuple as unused until the next
PageAddItem reuses its place?

Best regards, Andrey Borodin, Octonica & Ural Federal University.

2016-07-04 22:16 GMT+05:00 Alvaro Herrera :
> Andrew Borodin wrote:
>> Thank you, Amit.
>>
>> Currently, if WAL reconstructs the page in another order, it won't be a problem.
>
> We require that replay of WAL leads to bytes identical to the
> original.  Among other things, this enables use of a tool that verifies
> that WAL is working correctly simply by comparing page images.  So
> please fix it instead of just verifying that this works for GiST.
>
> By the way, BRIN indexes have a need of this operation too.  The current
> approach is to call PageIndexDeleteNoCompact followed by PageAddItem.
> I suppose it would be beneficial to use your new routine there too.
>
> --
> Álvaro Herrera                http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]

2016-07-04 Thread Alvaro Herrera
Andrew Borodin wrote:
> Thank you, Amit.
> 
> Currently, if WAL reconstructs the page in another order, it won't be a problem.

We require that replay of WAL leads to bytes identical to the
original.  Among other things, this enables use of a tool that verifies
that WAL is working correctly simply by comparing page images.  So
please fix it instead of just verifying that this works for GiST.

By the way, BRIN indexes have a need of this operation too.  The current
approach is to call PageIndexDeleteNoCompact followed by PageAddItem.
I suppose it would be beneficial to use your new routine there too.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Renaming more clearly SHA functions in pgcrypto/

2016-07-04 Thread Alvaro Herrera
Michael Paquier wrote:
> Hi all,
> (folks interested in that in CC)
> 
> While looking at some facility in pgcrypto, I have noticed the stanza
> created by 56f4478 to prevent conflicts with OpenSSL, like that:
> +#define SHA256_Init pg_SHA256_Init
> +#define SHA256_Update pg_SHA256_Update
> 
> Wouldn't it be better to avoid that, and just rename all those
> functions as pg_shaXX_foo?

Sure.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]

2016-07-04 Thread Andrew Borodin
Thank you, Amit.

Currently, if WAL reconstructs the page in another order, it won't be a
problem. How can I check that it works? Would it be sufficient if I
kill -9 the backend and postmaster after a commit, and the server then
starts normally and executes SELECT queries after restart?

I'll make a patch with the missing spaces later. I also found some more
missing spaces in the PageIndexTupleOverwrite call and a typo in the
comments ("on-place").

Best regards, Andrey Borodin, Octonica & Ural Federal University.

2016-07-04 18:58 GMT+05:00 Amit Kapila :

> On Mon, Jul 4, 2016 at 4:52 PM, Andrew Borodin 
> wrote:
> > Here is a patch with corrections from Alexander Korotkov.
> > My tests, check and check-world show no problems against Ubuntu LTS 14
> x64 VM.
> >
> >
>
> - PageIndexTupleDelete(page, oldoffnum);
> - gistfillbuffer(page, itup, ntup, InvalidOffsetNumber);
> + {
> + /*if we have just one tuple to update we replace it on-place on page*/
> + if(ntup == 1)
> + {
> + PageIndexTupleOverwrite(page,oldoffnum,*itup);
> + }
> + else
> + {
> + /*this corner case is here to support mix calls case (see comment
> above)*/
> + PageIndexTupleDelete(page, oldoffnum);
> + gistfillbuffer(page, itup, ntup, InvalidOffsetNumber);
> + }
>
>
> Here, you have changed the way the tuple is added on a page, but the
> WAL record is still the same as before.  I think WAL replay should
> perform the action in the same way as it was done when the original
> operation was performed.  If not, it can change the organization of the
> page, which might lead to failure in the replay of consecutive WAL
> records.
>
> + /*if we have just one tuple to update we replace it on-place on page*/
>
> In comments, we always leave a space both in the beginning and at the
> end of a comment.  See comments in nearby code.
>
>
> --
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com
>


Re: [HACKERS] to_date_valid()

2016-07-04 Thread Pavel Stehule
2016-07-04 18:24 GMT+02:00 Andreas 'ads' Scherbaum :

> On 04.07.2016 05:51, Pavel Stehule wrote:
>
>>
>>
>> 2016-07-04 5:19 GMT+02:00 Pavel Stehule > >:
>>
>>
>>
>> 2016-07-04 4:25 GMT+02:00 Craig Ringer > >:
>>
>> On 3 July 2016 at 09:32, Euler Taveira > > wrote:
>>
>> On 02-07-2016 22:04, Andreas 'ads' Scherbaum wrote:
>> > The attached patch adds a new function "to_date_valid()" which will
>> > validate the date and return an error if the input and output date do
>> > not match. Tests included, documentation update as well.
>> >
>> Why don't you add a third parameter (say, validate = true |
>> false)
>> instead of creating another function? The new parameter
>> could default to
>> false to not break compatibility.
>>
>>
>> because
>>
>>
>> SELECT to_date('blah', 'pattern', true)
>>
>> is less clear to read than
>>
>> SELECT to_date_valid('blah', 'pattern')
>>
>> and offers no advantage. It's likely faster to use a separate
>> function too.
>>
>>
>> personally I prefer first variant - this is same function with
>> stronger check.
>>
>>
>> Currently we probably do not have two similar functions - one fault
>> tolerant and a second, stricter one. There is only one example of
>> similar behavior - parse_ident with its "strict" option.
>>
>> Three parameters are still OK - so I don't see a reason why we have to
>> implement a new function. If you need to emphasize that the behavior
>> should be strict, you can use named parameters:
>>
>> select to_date('blah', 'pattern', strict => true)
>>
>
> The new function is not "strict", it just adds a validation step:
>

I understand - I know that this has zero relation to the function's
STRICT flag.

I don't know if the name "strict" is best, but the name "validate" is
not good either. The current to_date does some validation too.

Regards

Pavel


>
> postgres=# select to_date_valid(NULL, NULL);
>  to_date_valid
> ---------------
>
> (1 row)
>
> --
> Andreas 'ads' Scherbaum
> German PostgreSQL User Group
> European PostgreSQL User Group - Board of Directors
> Volunteer Regional Contact, Germany - PostgreSQL Project
>


Re: [HACKERS] to_date_valid()

2016-07-04 Thread Andreas 'ads' Scherbaum

On 04.07.2016 05:51, Pavel Stehule wrote:



2016-07-04 5:19 GMT+02:00 Pavel Stehule >:



2016-07-04 4:25 GMT+02:00 Craig Ringer >:

On 3 July 2016 at 09:32, Euler Taveira > wrote:

On 02-07-2016 22:04, Andreas 'ads' Scherbaum wrote:
> The attached patch adds a new function "to_date_valid()" which will
> validate the date and return an error if the input and output date do
> not match. Tests included, documentation update as well.
>
Why don't you add a third parameter (say, validate = true |
false)
instead of creating another function? The new parameter
could default to
false to not break compatibility.


because


SELECT to_date('blah', 'pattern', true)

is less clear to read than

SELECT to_date_valid('blah', 'pattern')

and offers no advantage. It's likely faster to use a separate
function too.


personally I prefer first variant - this is same function with
stronger check.


Currently we probably do not have two similar functions - one fault
tolerant and a second, stricter one. There is only one example of
similar behavior - parse_ident with its "strict" option.

Three parameters are still OK - so I don't see a reason why we have to
implement a new function. If you need to emphasize that the behavior
should be strict, you can use named parameters:

select to_date('blah', 'pattern', strict => true)


The new function is not "strict", it just adds a validation step:

postgres=# select to_date_valid(NULL, NULL);
 to_date_valid
---------------

(1 row)

--
Andreas 'ads' Scherbaum
German PostgreSQL User Group
European PostgreSQL User Group - Board of Directors
Volunteer Regional Contact, Germany - PostgreSQL Project


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] to_date_valid()

2016-07-04 Thread Andreas 'ads' Scherbaum

On 04.07.2016 16:33, Amit Kapila wrote:

On Sun, Jul 3, 2016 at 6:34 AM, Andreas 'ads' Scherbaum
 wrote:


Hello,

we have customers complaining that to_date() accepts invalid dates, and
returns a different date instead. This is a known issue:

http://sql-info.de/postgresql/notes/to_date-to_timestamp-gotchas.html

On the other hand this leads to wrong dates when loading dates into the
database, because the database happily accepts invalid dates and ends up
writing something completely different into the table.

The attached patch adds a new function "to_date_valid()" which will validate
the date and return an error if the input and output date do not match.
Tests included, documentation update as well.



It seems that you are making many additional function calls
(date_out, timestamp_in, etc.) to validate the date.  Won't the
additional function calls make to_date much costlier than its current
implementation?  I don't know if there is a better way, but I think it
is worth considering whether we can find a cheaper way to detect the
validity of the date.


It certainly is costlier, and I'm open to suggestions on how to improve
the performance. That is one of the reasons why I considered a separate
function, instead of adding this to to_date() and adding even more
overhead there.




Note - Your patch is small enough (~13KB) that it doesn't need to be zipped.


It's not about the small size, it's about websites like Gmail which
mangle the linebreaks so that git fails. I ran into this problem on the
pgAdminIII list a while ago; ever since, I just zip it and avoid the
trouble.


--
Andreas 'ads' Scherbaum
German PostgreSQL User Group
European PostgreSQL User Group - Board of Directors
Volunteer Regional Contact, Germany - PostgreSQL Project


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] oddity in initdb probing of max_connections/shared_buffers

2016-07-04 Thread Tom Lane
Greg Stark  writes:
> I happened to notice a bit of an inconsistency in the way initdb
> probes max_connections and shared_buffers.

> This line in the shared_buffers test:

> /* Use same amount of memory, independent of BLCKSZ */
> test_buffs = (trial_bufs[i] * 8192) / BLCKSZ;

> has no equivalent in the max_connections test. As a result
> max_connections is tested with 10 buffers per connection regardless of
> BLCKSZ.

> Is this intentional? Is the idea that Postgres can't function properly
> without being able to read from 10 files concurrently regardless of
> block size? Or is it an unintentional holdover from before the line
> above was added for the shared_buffers tests?

I think it's intentional; the minimum number of buffers needed per
session doesn't really vary with BLCKSZ, but rather with code structure
(ie, how many buffer pins a query might take at once).  Still, some
comments documenting that a little better wouldn't be a bad thing.
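
Something like this next to the max_connections probe would probably
suffice (only a sketch of the comment; the surrounding variable names are
approximate, not an actual diff):

    /*
     * Unlike the shared_buffers probe below, no BLCKSZ scaling is applied
     * here: the ~10 buffers needed per connection reflect how many buffer
     * pins a query might take at once, which is a property of the code,
     * not of the block size.
     */
    test_conns = trial_conns[i];
    test_buffs = test_conns * 10;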

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cluster on NAS and data center.

2016-07-04 Thread Matt Kelly
As someone who has been bitten by index corruption due to collation
changes between the glibc versions shipped with CentOS 6 and CentOS 7,
don't even try to do this with anything other than the C collation. The
default collation _will_ deterministically leave you with a silently
corrupt database if you store anything other than ASCII text. Windows
and Linux are going to implement en_US.utf-8 slightly differently, and
Postgres currently relies on the OS to provide collation
implementations. Go search for my mailing list post about the dangers of
running across versions of glibc for more info.
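
If you want to see how little it takes, here is a tiny standalone
illustration (plain C, not Postgres code): the byte-order "C" locale is the
only ordering you can count on being identical everywhere, while linguistic
orderings come from the OS and are free to differ between libc
implementations and versions:

    #include <locale.h>
    #include <stdio.h>
    #include <string.h>

    int
    main(void)
    {
        const char *a = "a-b";
        const char *b = "ab";

        /* byte order: what the "C" locale (and strcmp) gives you everywhere */
        setlocale(LC_COLLATE, "C");
        printf("C:           strcoll=%d strcmp=%d\n", strcoll(a, b), strcmp(a, b));

        /* linguistic order: supplied by the OS, free to differ per libc */
        if (setlocale(LC_COLLATE, "en_US.UTF-8") != NULL)
            printf("en_US.UTF-8: strcoll=%d\n", strcoll(a, b));
        return 0;
    }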

I'm going to echo everyone else's sentiment, though, and assert that what
you are trying to do is really an insane idea. You might be able to make
it appear like it's working, but as a DBA I would have absolutely no
confidence in using that server for disaster recovery.

If your company is saving money by not getting Windows licenses for your DR
environment, you are far better off just saving one more license and making
both your production and DR server be Linux builds.

- Matt K.


Re: [HACKERS] Cluster on NAS and data center.

2016-07-04 Thread Tom Lane
Craig Ringer  writes:
> On 4 July 2016 at 17:33, Krzysztof Kaczkowski  wrote:
>> We know that standard PostgreSQL is not able to use a cluster created on
>> a different OS. We think that recompiling PostgreSQL with some specific
>> flags should give us cluster compatibility across different systems.
>> We see small differences in the cluster files at the binary level. Can
>> you help us pick the proper compilation flags?

> I wouldn't recommend that, and it might be pretty tricky.

Indeed.  Aside from architectural differences, I'd be very afraid of
differences in collation order (resulting in incompatible indexes on
textual columns).  You might be able to make it work if you only ever
use "C" locale.  But really, this is not and will never be considered
supported usage, and if it fails, no one is going to say anything
except "we warned you not to do that".

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] to_date_valid()

2016-07-04 Thread Amit Kapila
On Sun, Jul 3, 2016 at 6:34 AM, Andreas 'ads' Scherbaum
 wrote:
>
> Hello,
>
> we have customers complaining that to_date() accepts invalid dates, and
> returns a different date instead. This is a known issue:
>
> http://sql-info.de/postgresql/notes/to_date-to_timestamp-gotchas.html
>
> On the other hand this leads to wrong dates when loading dates into the
> database, because the database happily accepts invalid dates and ends up
> writing something completely different into the table.
>
> The attached patch adds a new function "to_date_valid()" which will validate
> the date and return an error if the input and output date do not match.
> Tests included, documentation update as well.
>

It seems that you are making many additional function calls (date_out,
timestamp_in, etc.) to validate the date.  Won't the additional function
calls make to_date much costlier than its current implementation?  I
don't know if there is a better way, but I think it is worth considering
whether we can find a cheaper way to detect the validity of the date.
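
For illustration, one cheaper shape such a check could take (a standalone
sketch, not the patch): validate the parsed day-of-month directly instead
of round-tripping the value through date_out()/timestamp_in():

    #include <stdbool.h>

    bool
    day_of_month_is_valid(int year, int month, int day)
    {
        static const int days_in_month[12] =
        {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
        int     max_day;

        if (month < 1 || month > 12 || day < 1)
            return false;
        max_day = days_in_month[month - 1];
        if (month == 2 && year % 4 == 0 && (year % 100 != 0 || year % 400 == 0))
            max_day = 29;       /* leap year */
        return day <= max_day;
    }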

Note - Your patch is small enough (~13KB) that it doesn't need to be zipped.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]

2016-07-04 Thread Amit Kapila
On Mon, Jul 4, 2016 at 4:52 PM, Andrew Borodin  wrote:
> Here is a patch with corrections from Alexander Korotkov.
> My tests, check and check-world show no problems against Ubuntu LTS 14 x64 VM.
>
>

- PageIndexTupleDelete(page, oldoffnum);
- gistfillbuffer(page, itup, ntup, InvalidOffsetNumber);
+ {
+ /*if we have just one tuple to update we replace it on-place on page*/
+ if(ntup == 1)
+ {
+ PageIndexTupleOverwrite(page,oldoffnum,*itup);
+ }
+ else
+ {
+ /*this corner case is here to support mix calls case (see comment above)*/
+ PageIndexTupleDelete(page, oldoffnum);
+ gistfillbuffer(page, itup, ntup, InvalidOffsetNumber);
+ }


Here, you have changed the way the tuple is added on a page, but the WAL
record is still the same as before.  I think WAL replay should perform
the action in the same way as it was done when the original operation
was performed.  If not, it can change the organization of the page,
which might lead to failure in the replay of consecutive WAL records.

+ /*if we have just one tuple to update we replace it on-place on page*/

In comments, we always leave a space both in the beginning and at the
end of a comment.  See comments in nearby code.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PERFORM] 9.4 -> 9.5 regression with queries through pgbouncer on RHEL 6

2016-07-04 Thread Vladimir Borodin

> On 13 June 2016, at 21:58, Vladimir Borodin wrote:
> 
>> 
>> On 13 June 2016, at 0:51, Andres Freund wrote:
>> 
>> Hi Vladimir,
>> 
>> Thanks for these reports.
>> 
>> On 2016-06-13 00:42:19 +0300, Vladimir Borodin wrote:
>>> perf report -g -i pg9?_all.data >/tmp/pg9?_perf_report.txt
>> 
>> Any chance you could redo the reports with --no-children --call-graph=fractal
>> added? The mode that includes child overheads unfortunately makes the
>> output hard to interpet/compare.
> 
> Of course. Not sure if that is important but I upgraded perf for that 
> (because --no-children option was introduced in ~3.16), so perf record and 
> perf report were done with different perf versions.
> 
> 
> 
> 
> 
> Also, I've done the same test on the same host (RHEL 6) but with a 4.6
> kernel/perf and writing perf data to /dev/shm so as not to lose events.
> The perf report output is also attached, but the important thing is that
> the regression is not as significant:
> 
> root@pgload05g ~ # uname -r
> 4.6.0-1.el6.elrepo.x86_64
> root@pgload05g ~ # cat /proc/sys/kernel/sched_autogroup_enabled
> 1
> root@pgload05g ~ # /tmp/run.sh
> RHEL 6    9.4    71634    0.893
> RHEL 6    9.5    54005    1.185
> RHEL 6    9.6    65550    0.976
> root@pgload05g ~ # echo 0 >/proc/sys/kernel/sched_autogroup_enabled
> root@pgload05g ~ # /tmp/run.sh
> RHEL 6    9.4    73041    0.876
> RHEL 6    9.5    60105    1.065
> RHEL 6    9.6    67984    0.941
> root@pgload05g ~ #
> 
> 
> 
> 

Andres, is there any chance that you would find time to look at those results? 
Are they actually useful?

> 
> 
>> 
>>> The results from pg9?_perf_report.txt are attached. Note that in all cases 
>>> some events were lost, i.e.:
>>> 
>>> root@pgload05g ~ # perf report -g -i pg94_all.data 
>>> >/tmp/pg94_perf_report.txt
>>> Failed to open [vsyscall], continuing without symbols
>>> Warning:
>>> Processed 537137 events and lost 7846 chunks!
>> 
>> You can reduce the overhead by reducing the sampling frequency, e.g. by
>> specifying -F 300.
>> 
>> Greetings,
>> 
>> Andres Freund
>> 
>> 
>> -- 
>> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org 
>> )
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-hackers 
>> 
> 
> 
> --
> May the force be with you…
> https://simply.name 

--
May the force be with you…
https://simply.name



Re: [HACKERS] Statistics Injection

2016-07-04 Thread Victor Giannakouris - Salalidis
Hello,

pg_dbms_stats worked well for my case, thank you all for your quick
responses and for the support!

Victor

On Sat, Jul 2, 2016 at 8:28 PM, Vladimir Sitnikov <
sitnikov.vladi...@gmail.com> wrote:

>
>
> > The problem is that, even if I set the reltuples and relpages of my
> choice, when I run the EXPLAIN clause for a query in which the 'newTable'
>  is involved in (e.g. EXPLAIN SELECT * FROM newTable), I get the same cost
> and row estimation.
>
> >Could anyone help me with that?
>
>
> There's a pg_dbms_stats extension that enables you to override/freeze the
> statistics: https://github.com/ossc-db/pg_dbms_stats
>
> Vladimir
> ‎
>



-- 
Victor Giannakouris - Salalidis

Researcher
Computing Systems Laboratory (CSLab),
National Technical University of Athens

LinkedIn:
http://gr.linkedin.com/pub/victor-giannakouris-salalidis/69/585/b23/
Personal Page: http://gsvic.github.io


Re: [HACKERS] Reviewing freeze map code

2016-07-04 Thread Amit Kapila
On Mon, Jul 4, 2016 at 2:31 PM, Masahiko Sawada  wrote:
> On Sat, Jul 2, 2016 at 12:17 PM, Amit Kapila  wrote:
>> On Sat, Jul 2, 2016 at 12:53 AM, Andres Freund  wrote:
>>> On 2016-07-01 15:18:39 -0400, Robert Haas wrote:
>>>
 Should we just clear all-visible and call it good enough?
>>>
>>> Given that we need to do that in heap_lock_tuple, which entirely
>>> preserves all-visible (but shouldn't preserve all-frozen), ISTM we
>>> better find something that doesn't invalidate all-visible.
>>>
>>
>> Sounds logical, considering that we have a way to set all-frozen and
>> vacuum does that as well.  So probably either we need to have a new
>> API or add a new parameter to visibilitymap_clear() to indicate the
>> same.  If we want to go that route, isn't it better to have
>> PD_ALL_FROZEN as well?
>>
>
> Can't we call visibilitymap_set with all-visible but not all-frozen
> bits instead of clearing flags?
>

That doesn't sound like a good way to deal with it.  First,
visibilitymap_set logs the action itself, which will generate two WAL
records (one for the visibility map and another for the lock tuple).
Second, it doesn't look consistent to me.
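
A rough sketch of the new-parameter idea (the bit values below are the
existing ones from visibilitymap.h; the flags argument itself is only the
proposal, not committed API):

    /* existing visibility-map bits, two per heap page */
    #define VISIBILITYMAP_ALL_VISIBLE   0x01
    #define VISIBILITYMAP_ALL_FROZEN    0x02

    /* proposed: let callers clear only the bits they want */
    extern void visibilitymap_clear(Relation rel, BlockNumber heapBlk,
                                    Buffer vmbuf, uint8 flags);

    /*
     * heap_lock_tuple (and its redo routine) could then do
     *   visibilitymap_clear(relation, block, vmbuffer, VISIBILITYMAP_ALL_FROZEN);
     * clearing all-frozen while leaving all-visible untouched.
     */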

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Declarative partitioning

2016-07-04 Thread Ashutosh Bapat
Hi Amit,
I observed that the ChangeVarNodes call at line 1229 in
get_relation_constraints() (code below) changes the varno in the cached
copy of the partition key expression, which is undesirable. The reason
for this seems to be that RelationGetPartitionCheckQual() returns the
cached copy of the partition key expression directly. This is mostly
because get_check_qual_for_range() works directly on the cached copy of
the partition key expressions, which it should never do.

1223 /* Append partition check quals, if any */
1224 pcqual = RelationGetPartitionCheckQual(relation);
1225 if (pcqual)
1226 {
1227 /* Fix Vars to have the desired varno */
1228 if (varno != 1)
1229 ChangeVarNodes((Node *) pcqual, 1, varno, 0);
1230
1231 result = list_concat(result, pcqual);
1232 }

Because of this, the partition key expressions are valid the first time
through, but from then on they are stamped with the varno of the first
partition.

Please add testcases to your patch to catch such types of issues.

On Mon, Jun 27, 2016 at 3:56 PM, Amit Langote  wrote:

>
> Hi Ashutosh,
>
> On 2016/06/24 23:08, Ashutosh Bapat wrote:
> > Hi Amit,
> > I tried creating 2-level partitioned table and tried to create simple
> table
> > using CTAS from the partitioned table. It gives a cache lookup error.
> > Here's the test
> > CREATE TABLE pt1_l (a int, b varchar, c int) PARTITION BY RANGE(a);
> > CREATE TABLE pt1_l_p1 PARTITION OF pt1_l FOR VALUES START (1) END (250)
> > INCLUSIVE PARTITION BY RANGE(b);
> > CREATE TABLE pt1_l_p2 PARTITION OF pt1_l FOR VALUES START (251) END (500)
> > INCLUSIVE PARTITION BY RANGE(((a+c)/2));
> > CREATE TABLE pt1_l_p3 PARTITION OF pt1_l FOR VALUES START (501) END (600)
> > INCLUSIVE PARTITION BY RANGE(c);
> > CREATE TABLE pt1_l_p1_p1 PARTITION OF pt1_l_p1 FOR VALUES START
> ('01')
> > END ('000125') INCLUSIVE;
> > CREATE TABLE pt1_l_p1_p2 PARTITION OF pt1_l_p1 FOR VALUES START
> ('000126')
> > END ('000250') INCLUSIVE;
> > CREATE TABLE pt1_l_p2_p1 PARTITION OF pt1_l_p2 FOR VALUES START (251) END
> > (375) INCLUSIVE;
> > CREATE TABLE pt1_l_p2_p2 PARTITION OF pt1_l_p2 FOR VALUES START (376) END
> > (500) INCLUSIVE;
> > CREATE TABLE pt1_l_p3_p1 PARTITION OF pt1_l_p3 FOR VALUES START (501) END
> > (550) INCLUSIVE;
> > CREATE TABLE pt1_l_p3_p2 PARTITION OF pt1_l_p3 FOR VALUES START (551) END
> > (600) INCLUSIVE;
> > INSERT INTO pt1_l SELECT i, to_char(i, 'FM00'), i FROM
> > generate_series(1, 600, 2) i;
> > CREATE TABLE upt1_l AS SELECT * FROM pt1_l;
> >
> > The last statement gives error "ERROR:  cache lookup failed for function
> > 0". Let me know if this problem is reproducible.
>
> Thanks for the test case.  I can reproduce the error.  It has to do with
> partition key column (b) being of type varchar whereas the default
> operator family for the type being text_ops and there not being an
> =(varchar, varchar) operator in text_ops.
>
> When creating a plan for SELECT * FROM pt1_l, which is an Append plan,
> partition check quals are generated internally to be used for constraint
> exclusion - such as, b >= '01' AND b < '000125'.  Individual OpExpr's
> are generated by using information from PartitionKey of the parent
> (PartitionKey) and pg_partition entry of the partition.  When choosing the
> operator to use for =, >=, <, etc., opfamily and typid of corresponding
> columns are referred.  As mentioned above, in this case, they happened to
> be text_ops and varchar, respectively, for column b.  There doesn't exist
> an operator =(varchar, varchar) in text_ops, so InvalidOid is returned by
> get_opfamily_member which goes unchecked until the error in question
> occurs.
>
> I have tried to fix this in attached updated patch by using operator class
> input type for operator resolution in cases where column type didn't help.
>
> Thanks,
> Amit
>



-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


[HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]

2016-07-04 Thread Andrew Borodin
Here is a patch with corrections from Alexander Korotkov.
My tests, check and check-world show no problems against Ubuntu LTS 14 x64 VM.


Best regards, Andrey Borodin, Octonica & Ural Federal University.

2016-07-04 10:41 GMT+05:00 Andrew Borodin :
> Hi!
>>I think you should implement PageReplaceItem() version and add it to the 
>>commitfest.
> Here is the patch.
> I've called the function PageIndexTupleOverwrite() because it's suitable
> only for indexes. It works in my tests, and performance is the same as
> in the proof of concept (despite some sanity checks copied from
> PageIndexTupleDelete).
> Next weekend I'll do more testing, and then add it to the commitfest.
>
> Best regards, Andrey Borodin, Octonica & Ural Federal University.
>
> 2016-07-03 15:21 GMT+05:00 Alexander Korotkov :
>> Hi!
>>
>> On Sun, Jul 3, 2016 at 12:24 PM, Andrew Borodin 
>> wrote:
>>>
>>> I think there is some room for improving GiST inserts. What follows is
>>> a description of what I think is a performance problem.
>>>
>>> The function gistplacetopage in the file /src/backend/access/gist/gist.c
>>> is responsible for updating or appending a new tuple on a GiST page.
>>> Currently, after checking the necessity of a page split due to overflow,
>>> it essentially executes the following:
>>>
>>>if (OffsetNumberIsValid(oldoffnum))
>>>PageIndexTupleDelete(page, oldoffnum);
>>>gistfillbuffer(page, itup, ntup, InvalidOffsetNumber);
>>>
>>> That is: remove the old tuple if it’s there, then place the updated
>>> tuple at the end.
>>>
>>> Half of the old data has to be shifted by memmove inside the
>>> PageIndexTupleDelete() call, and half of the linp-s have to be corrected.
>>>
>>> If the updated tuple has the same size as the one already residing on
>>> the page, we can just overwrite it. The attached patch demonstrates that
>>> concept. The attached test.sql inserts a million rows into a GiST index
>>> based on the cube extension. My machine is a Hyper-V VM running Ubuntu
>>> on an i5-2500 CPU with SSD storage. Before the patch, the insert part of
>>> the test executes in 159 seconds on average; after applying the patch,
>>> it executes in 77 seconds on average. That is almost twice as fast (for
>>> CPU/memory-bound inserts; a disk-constrained test will show no
>>> improvement). But it works only for fixed-size tuple inserts.
>>
>>
>> Very promising results!
>>
>>> I know that the code in the patch is far from beautiful: it operates at
>>> three different levels of abstraction within 5 lines of code. Those
>>> are the low-level memmove(), the system-wide PageAddItem() and the
>>> GiST-private gistfillbuffer().
>>>
>>> By the way, PageAddItem() has an overwrite flag, but it only works with
>>> unused ItemIds. Marking the old ItemId as unused before PageAddItem()
>>> didn’t work for me. Unfortunately, the bufpage.c routines do not contain
>>> one for updating (replacing with a new) tuple on a page. This is
>>> important for me because I’m working on an advanced GiST page layout (
>>>
>>> https://www.postgresql.org/message-id/CAJEAwVE0rrr%2BOBT-P0gDCtXbVDkBBG_WcXwCBK%3DGHo4fewu3Yg%40mail.gmail.com
>>> ); the current approach is to use skip-tuples which can skip many
>>> following tuples with one key check. Obviously, this design cares about
>>> tuple order, and an update in the fashion of “place the updated tuple at
>>> the end” won’t work for me.
>>>
>>> So, I think it would be better to implement PageReplaceItem()
>>> functionality to make code better, to make existing GiST inserts
>>> faster and to enable new advanced page layouts in indices.
>>
>>
>> +1 for PageReplaceItem()
>> Even if item is not the same size, we can move the tail of page once instead
>> of twice.
>> I think you should implement PageReplaceItem() version and add it to the
>> commitfest.
>>
>> --
>> Alexander Korotkov
>> Postgres Professional: http://www.postgrespro.com
>> The Russian Postgres Company


PageIndexTupleOverwrite v2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] oddity in initdb probing of max_connections/shared_buffers

2016-07-04 Thread Greg Stark
I happened to notice a bit of an inconsistency in the way initdb
probes max_connections and shared_buffers.

This line in the shared_buffers test:

/* Use same amount of memory, independent of BLCKSZ */
test_buffs = (trial_bufs[i] * 8192) / BLCKSZ;

has no equivalent in the max_connections test. As a result
max_connections is tested with 10 buffers per connection regardless of
BLCKSZ.

Is this intentional? Is the idea that Postgres can't function properly
without being able to read from 10 files concurrently regardless of
block size? Or is it an unintentional holdover from before the line
above was added for the shared_buffers tests?


-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cluster on NAS and data center.

2016-07-04 Thread Krzysztof Kaczkowski
Hello,

About your question - please do not ask. This infrastructure is the outcome
of a very long talk with supervisors.

Back to the topic. I like the tricky part. Do you think there is some way
to achieve this at the compilation level, or should we search somewhere
else? For example UNIX libraries, or even changing and recompiling the
kernel?


Re: [HACKERS] Cluster on NAS and data center.

2016-07-04 Thread Craig Ringer
On 4 July 2016 at 17:33, Krzysztof Kaczkowski  wrote:

> Hello everyone,
>
> Right now we have PostgreSQL on Windows Server (our main data center) and
> the cluster is placed on a NAS. We have an emergency data center on a UNIX
> architecture. We want the emergency data center to be able to continue
> working with the PostgreSQL cluster that has been used by the Windows
> PostgreSQL.
>
> We know that standard PostgreSQL is not able to use a cluster created on
> a different OS. We think that recompiling PostgreSQL with some specific
> flags should give us cluster compatibility across different systems.
> We see small differences in the cluster files at the binary level. Can
> you help us pick the proper compilation flags?
>

I wouldn't recommend that, and it might be pretty tricky.

Windows is an LLP64 architecture, and *nix is usually LP64. Because
PostgreSQL's data directory format is directly tied to the size of data
types in the host operating system this might cause you problems. See
http://stackoverflow.com/a/384672/398670 . sizeof(long) will differ.
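
A trivial standalone illustration of that point (not PostgreSQL code):

    #include <stdio.h>

    int
    main(void)
    {
        /* LLP64 (64-bit Windows): long is 4 bytes; LP64 (typical 64-bit
         * *nix): long is 8 bytes.  Any on-disk structure that embeds a
         * 'long' therefore has a different layout on the two systems. */
        printf("sizeof(long) = %zu, sizeof(void *) = %zu\n",
               sizeof(long), sizeof(void *));
        return 0;
    }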

I'd love to break that particular assumption if/when an on-disk format
break is done, since it's very unfortunate that PostgreSQL on one
architecture/OS won't read data directories from another architecture/OS.
But for now we're fairly stuck with it AFAIK.

Why can't you just deploy a Windows backup cluster?

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


[HACKERS] Cluster on NAS and data center.

2016-07-04 Thread Krzysztof Kaczkowski
Hello everyone,

Right now we have PostgreSQL on Windows Server (our main data center) and
the cluster is placed on a NAS. We have an emergency data center on a UNIX
architecture. We want the emergency data center to be able to continue
working with the PostgreSQL cluster that has been used by the Windows
PostgreSQL.

We know that standard PostgreSQL is not able to use a cluster created on a
different OS. We think that recompiling PostgreSQL with some specific flags
should give us cluster compatibility across different systems. We see small
differences in the cluster files at the binary level. Can you help us pick
the proper compilation flags?


Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2016-07-04 Thread Heikki Linnakangas

On 18/03/16 19:19, Anastasia Lubennikova wrote:

Please, find the new version of the patch attached. Now it has WAL
functionality.

Detailed description of the feature you can find in README draft
https://goo.gl/50O8Q0

This patch is pretty complicated, so I ask everyone who is interested in
this feature to help with reviewing and testing it. I will be grateful for
any feedback. But please don't complain about code style; it is still a
work in progress.

Next things I'm going to do:
1. More debugging and testing. I'm going to attach in next message
couple of sql scripts for testing.
2. Fix NULLs processing
3. Add a flag into pg_index, that allows to enable/disable compression
for each particular index.
4. Recheck locking considerations. I tried to write code as less
invasive as possible, but we need to make sure that algorithm is still
correct.
5. Change BTMaxItemSize
6. Bring back microvacuum functionality.


I think we should pack the TIDs more tightly, like GIN does with the 
varbyte encoding. It's tempting to commit this without it for now, and 
add the compression later, but I'd like to avoid having to deal with 
multiple binary-format upgrades, so let's figure out the final on-disk 
format that we want, right from the beginning.


It would be nice to reuse the varbyte encoding code from GIN, but we 
might not want to use that exact scheme for B-tree. Firstly, an 
important criteria when we designed GIN's encoding scheme was to avoid 
expanding on-disk size for any data set, which meant that a TID had to 
always be encoded in 6 bytes or less. We don't have that limitation with 
B-tree, because in B-tree, each item is currently stored as a separate 
IndexTuple, which is much larger. So we are free to choose an encoding 
scheme that's better at packing some values, at the expense of using 
more bytes for other values, if we want to. Some analysis on what we 
want would be nice. (It's still important that removing a TID from the 
list never makes the list larger, for VACUUM.)
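
For concreteness, the basic building block would look something like this
(a sketch of plain base-128 varbyte coding; it is not the exact scheme GIN
uses, which additionally guarantees at most 6 bytes per TID):

    #include <stdint.h>
    #include <stddef.h>

    /* Encode a value as base-128 varbyte: 7 data bits per byte, high bit
     * set on all but the last byte.  Returns the number of bytes written. */
    size_t
    encode_varbyte(uint64_t val, unsigned char *out)
    {
        size_t      n = 0;

        do
        {
            unsigned char b = val & 0x7F;

            val >>= 7;
            if (val != 0)
                b |= 0x80;          /* continuation bit: more bytes follow */
            out[n++] = b;
        } while (val != 0);
        return n;
    }

    /* Packing a TID as (block << 16) | offset gives a 48-bit key; encoding
     * the deltas between successive sorted TIDs keeps most entries to a few
     * bytes, with a bounded worst case of 7 bytes for plain varbyte. */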


Secondly, to be able to just always enable this feature, without a GUC 
or reloption, we might need something that's faster for random access 
than GIN's posting lists. Or can we just add the setting, but it would 
be nice to have some more analysis on the worst-case performance before 
we decide on that.


I find the macros in nbtree.h in the patch quite confusing. They're 
similar to what we did in GIN, but again we might want to choose 
differently here. So some discussion on the desired IndexTuple layout is 
in order. (One clear bug is that using the high bit of BlockNumber for 
the BT_POSTING flag will fail for a table larger than 2^31 blocks.)


- Heikki





Re: [HACKERS] Reviewing freeze map code

2016-07-04 Thread Masahiko Sawada
On Sat, Jul 2, 2016 at 12:17 PM, Amit Kapila  wrote:
> On Sat, Jul 2, 2016 at 12:53 AM, Andres Freund  wrote:
>> On 2016-07-01 15:18:39 -0400, Robert Haas wrote:
>>> On Fri, Jul 1, 2016 at 10:22 AM, Masahiko Sawada  
>>> wrote:
>>> > Ah, you're right, I misunderstood.
>>> >
>>> > Attached updated patch incorporating your comments.
>>> > I've changed it so that heap_xlog_lock clears vm flags if page is
>>> > marked all frozen.
>>>
>>> I believe that this should be separated into two patches, since there
>>> are two issues here:
>>>
>>> 1. Locking a tuple doesn't clear the all-frozen bit, but needs to do so.
>>> 2. heap_update releases the buffer content lock without logging the
>>> changes it has made.
>>>
>>> With respect to #1, there is no need to clear the all-visible bit,
>>> only the all-frozen bit.  However, that's a bit tricky given that we
>>> removed PD_ALL_FROZEN.  Should we think about putting that back again?
>>
>> I think it's fine to just do the vm lookup.
>>
>>> Should we just clear all-visible and call it good enough?
>>
>> Given that we need to do that in heap_lock_tuple, which entirely
>> preserves all-visible (but shouldn't preserve all-frozen), ISTM we
>> better find something that doesn't invalidate all-visible.
>>
>
> Sounds logical, considering that we have a way to set all-frozen and
> vacuum does that as well.  So probably either we need to have a new
> API or add a new parameter to visibilitymap_clear() to indicate the
> same.  If we want to go that route, isn't it better to have
> PD_ALL_FROZEN as well?
>

Can't we call visibilitymap_set with the all-visible bit but not the
all-frozen bit, instead of clearing flags?
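
As a toy model of the two options under discussion (the bit values and
variable names below are assumptions for illustration, not PostgreSQL's
visibility map API), both paths end with all-visible set and all-frozen
cleared:

#include <stdint.h>
#include <stdio.h>

#define VM_ALL_VISIBLE	0x01
#define VM_ALL_FROZEN	0x02

int
main(void)
{
	uint8_t		vmbits = VM_ALL_VISIBLE | VM_ALL_FROZEN;

	/* Option A: clear only the all-frozen bit, keep all-visible intact. */
	uint8_t		after_clear = vmbits & ~VM_ALL_FROZEN;

	/* Option B: re-set the page with all-visible only, dropping all-frozen. */
	uint8_t		after_set = VM_ALL_VISIBLE;

	printf("clear-frozen: 0x%02X, set-visible-only: 0x%02X\n",
		   (unsigned) after_clear, (unsigned) after_set);
	return 0;
}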

Regards,

--
Masahiko Sawada




Re: [HACKERS] Reviewing freeze map code

2016-07-04 Thread Masahiko Sawada
On Sat, Jul 2, 2016 at 12:34 PM, Amit Kapila  wrote:
> On Fri, Jul 1, 2016 at 7:52 PM, Masahiko Sawada  wrote:
>> On Fri, Jul 1, 2016 at 11:12 AM, Amit Kapila  wrote:
>>>
>>> Why do you think IndexOnlyScan will return wrong result?  If the
>>> server crash in the way as you described, the transaction that has
>>> made modifications will anyway be considered aborted, so the result of
>>> IndexOnlyScan should not be wrong.
>>>
>>
>> Ah, you're right, I misunderstood.
>>
>> Attached updated patch incorporating your comments.
>> I've changed it so that heap_xlog_lock clears vm flags if page is
>> marked all frozen.
>>
>
> I think we should make a similar change in the heap_lock_tuple API as
> well.
> Also, currently heap_xlog_lock tries by default to clear the visibility
> flags; isn't it better to handle it as we do at all other places
> (e.g. see log_heap_update), by logging that information?

I will deal with them.

> Though it is important to get the patch right, I feel that in the
> meantime it might be better to start benchmarking.  AFAIU, even if we
> change some part of the information while WAL-logging it, the benchmark
> results won't be much different.

Okay, I will do the benchmark test as well.

Regards,

--
Masahiko Sawada




Re: [HACKERS] [CF2016-9] Allow spaces in working path on tap-tests

2016-07-04 Thread Michael Paquier
On Mon, Jul 4, 2016 at 4:29 PM, Michael Paquier
 wrote:
> On Mon, Jul 4, 2016 at 4:02 PM, Kyotaro HORIGUCHI
>  wrote:
>> Hello, this is just an entry mail for the next CF.
>>
>> The TAP tests fail when the source directory contains spaces. I
>> accidentally hit this with a Jenkins project named "test
>> project".
>>
>> The function system_log() is safe for such parameters, but
>> backup() uses it in the wrong way. On the other hand,
>> enable_restoring() and enable_archiving() forget to quote such a
>> path containing spaces, as is already done for Windows. I don't see
>> a similar problem in other places.
>
> Good catch, your fix looks good to me. I am noticing as well that the
> invocations of pg_ctl, pg_dumpall, pg_upgrade and diff are likely
> broken the same way in vcregress.pl.

And as is the command built for zic.exe in Install.pm, no? $target is
normally an absolute path per the call of Install().
-- 
Michael




Re: [HACKERS] [CF2016-9] Allow spaces in working path on tap-tests

2016-07-04 Thread Michael Paquier
On Mon, Jul 4, 2016 at 4:02 PM, Kyotaro HORIGUCHI
 wrote:
> Hello, this is just an entry mail for the next CF.
>
> The TAP tests fail when the source directory contains spaces. I
> accidentally hit this with a Jenkins project named "test
> project".
>
> The function system_log() is safe for such parameters, but
> backup() uses it in the wrong way. On the other hand,
> enable_restoring() and enable_archiving() forget to quote such a
> path containing spaces, as is already done for Windows. I don't see
> a similar problem in other places.

Good catch, your fix looks good to me. I am noticing as well that the
invocations of pg_ctl, pg_dumpall, pg_upgrade and diff are likely
broken the same way in vcregress.pl.
-- 
Michael




[HACKERS] Renaming more clearly SHA functions in pgcrypto/

2016-07-04 Thread Michael Paquier
Hi all,
(folks interested in that in CC)

While looking at some facility in pgcrypto, I have noticed the stanza
created by 56f4478 to prevent conflicts with OpenSSL, like that:
+#define SHA256_Init pg_SHA256_Init
+#define SHA256_Update pg_SHA256_Update

Wouldn't it be better to avoid that, and just rename all those
functions as pg_shaXX_foo? It seems to me that this would be more
in line with what's already in core. This renaming would be part of
the refactoring effort for SCRAM to have all the functions for SHA1,
SHA256, etc. in a single file sha.c in src/common with a reworked
interface, particularly for SHA1, where things are quite inconsistent
with the SHA2XX functions.
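
For illustration, one possible shape of such a unified interface is
sketched below; the names and context layout are assumptions, not an
existing header:

#include <stddef.h>
#include <stdint.h>

/* Interface sketch only: a self-contained context and three entry points. */
typedef struct pg_sha256_ctx
{
	uint32_t	state[8];
	uint64_t	bitcount;
	uint8_t		buffer[64];
} pg_sha256_ctx;

extern void pg_sha256_init(pg_sha256_ctx *ctx);
extern void pg_sha256_update(pg_sha256_ctx *ctx,
							 const uint8_t *data, size_t len);
extern void pg_sha256_final(pg_sha256_ctx *ctx, uint8_t dest[32]);

Callers in pgcrypto and the SCRAM code could then use such names directly,
and the #define indirection quoted above could be dropped.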

Opinions? As this is a matter largely independent of SCRAM, I
prefer to create a new thread to gather opinions.

Thanks,
-- 
Michael




[HACKERS] [CF2016-9] Allow spaces in working path on tap-tests

2016-07-04 Thread Kyotaro HORIGUCHI
Hello, this is just an entry mail for the next CF.

The TAP tests fail when the source directory contains spaces. I
accidentally hit this with a Jenkins project named "test
project".

The function system_log() is safe for such parameters, but
backup() uses it in the wrong way. On the other hand,
enable_restoring() and enable_archiving() forget to quote such a
path containing spaces, as is already done for Windows. I don't see
a similar problem in other places.

regards,
-- 
Kyotaro Horiguchi
NTT Open Source Software Center

From 4c38a3e95da1abb5360a80b9ad646c9c6309bd9b Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Mon, 4 Jul 2016 15:18:51 +0900
Subject: [PATCH] Allow spaces in working directory path on TAP-tests.

Several TAP tests fail when the source directory path contains
spaces. This patch fixes that.
---
 src/test/perl/PostgresNode.pm | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index 636dfec..5767919 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -475,7 +475,8 @@ sub backup
 	my $name= $self->name;
 
 	print "# Taking pg_basebackup $backup_name from node \"$name\"\n";
-	TestLib::system_or_bail("pg_basebackup -D $backup_path -p $port -x");
+	TestLib::system_or_bail('pg_basebackup', '-D', $backup_path,
+			'-p', $port, '-x');
 	print "# Backup finished\n";
 }
 
@@ -763,7 +764,7 @@ sub enable_restoring
 	my $copy_command =
 	  $TestLib::windows_os
 	  ? qq{copy "$path%f" "%p"}
-	  : qq{cp $path/%f %p};
+	  : qq{cp "$path/%f" "%p"};
 
 	$self->append_conf(
 		'recovery.conf', qq(
@@ -791,7 +792,7 @@ sub enable_archiving
 	my $copy_command =
 	  $TestLib::windows_os
 	  ? qq{copy "%p" "$path%f"}
-	  : qq{cp %p $path/%f};
+	  : qq{cp "%p" "$path/%f"};
 
 	# Enable archive_mode and archive_command on node
 	$self->append_conf(
-- 
1.8.3.1




Re: [HACKERS] An extra error for client disconnection on Windows

2016-07-04 Thread Kyotaro HORIGUCHI
Thank you for the suggestion, I've done it.

At Wed, 15 Jun 2016 12:15:07 -0400, Robert Haas  wrote 
in 
> On Thu, Jun 2, 2016 at 4:51 AM, Kyotaro HORIGUCHI
>  wrote:
> > After a client process terminates without calling PQfinish(), the
> > server emits the following log message, which is not seen on Linux boxes.
> >
> >> LOG:  could not receive data from client: An existing connection was 
> >> forcibly closed by the remote host.
> >
> > This is because pgwin32_recv returns an ECONNRESET error in this
> > situation instead of returning a non-error EOF as recv(2) does.
> >
> > This patch translates WSARecv's WSAECONNRESET into an EOF so that
> > pgwin32_recv behaves the same way as on Linux.
> >
> > The attached patch does this.
> 
> Please add this to the next CommitFest so it gets reviewed.
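
For readers without the attached patch, here is a rough portable sketch of
the translation described above; the names are stand-ins and this is not
the actual pgwin32_recv code:

#include <errno.h>
#include <stdio.h>

/* Stand-in for the platform recv(); here it always simulates a reset. */
static int
raw_recv(int sock, char *buf, int len)
{
	(void) sock; (void) buf; (void) len;
	errno = ECONNRESET;
	return -1;
}

/* Wrapper that maps "connection reset by peer" to a plain EOF return. */
static int
recv_eof_on_reset(int sock, char *buf, int len)
{
	int			r = raw_recv(sock, buf, len);

	if (r < 0 && errno == ECONNRESET)
		return 0;				/* translate the reset into EOF */
	return r;
}

int
main(void)
{
	char		buf[16];

	printf("result = %d (0 means EOF, as on Unix)\n",
		   recv_eof_on_reset(42, buf, sizeof(buf)));
	return 0;
}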

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



