Re: [HACKERS] GIN improvements part 1: additional information
Guys, before digging deep into the art of the comp/decomp world, I'd like to know whether you are familiar with the results of http://wwwconference.org/www2008/papers/pdf/p387-zhangA.pdf and with newer research. Do we agree on what we really want? Basically, there are three main factors: size, compression speed, and decompression speed - we get to pick two :) Should we design some sort of plugin interface that supports independent on-disk storage, so that users can apply different techniques depending on their data? What I want to say is that we can certainly play with this very challenging task, but we have limited time before 9.4 and should keep moving in a practical direction.

Oleg

On Wed, Dec 18, 2013 at 6:50 PM, Heikki Linnakangas wrote:
> On 12/18/2013 01:45 PM, Alexander Korotkov wrote:
>> On Tue, Dec 17, 2013 at 2:49 AM, Heikki Linnakangas wrote:
>>> On 12/17/2013 12:22 AM, Alexander Korotkov wrote:
>>>> 2) Storage would be easily extendable to hold additional information
>>>> as well. Better compression shouldn't block more serious improvements.
>>>
>>> I'm not sure I agree with that. For all the cases where you don't care
>>> about additional information - which covers all existing users, for
>>> example - reducing disk size is pretty important. How are you planning
>>> to store the additional information, and how does using another
>>> encoding get in the way of that?
>>
>> I planned to store the additional-information datums between the
>> varbyte-encoded TIDs. I expected that would be hard to do with PFOR.
>> However, I don't see significant problems in your implementation of
>> Simple9 encoding. I'm going to dig deeper into your version of the patch.
>
> Ok, thanks.
>
> I had another idea about the page format this morning. Instead of having
> the item-indexes at the end of the page, it would be more flexible to
> store a bunch of self-contained posting list "segments" on the page.
> So I propose that we get rid of the item-indexes, and instead store a
> bunch of independent posting lists on the page:
>
> typedef struct
> {
>     ItemPointerData first;  /* first item in this segment (unpacked) */
>     uint16          nwords; /* number of words that follow */
>     uint64          words[1];   /* var length */
> } PostingListSegment;
>
> Each segment can be encoded and decoded independently. When searching
> for a particular item (like on insertion), you skip over segments where
> 'first' > the item you're searching for.
>
> This format offers a lot more flexibility compared to the separate item
> indexes. First, we don't need to have another fixed-size area on the
> page, which simplifies the page format. Second, we can more easily
> re-encode only one segment on the page, on insertion or vacuum. The
> latter is particularly important with the Simple-9 encoding, which
> operates one word at a time rather than one item at a time; removing or
> inserting an item in the middle can require a complete re-encoding of
> everything that follows. Third, when a page is being inserted into and
> contains only uncompressed items, you don't waste any space on unused
> item indexes.
>
> While we're at it, I think we should use the above struct in the inline
> posting lists stored directly in entry tuples. That wastes a few bytes
> compared to the current approach in the patch (more alignment, and
> 'nwords' is redundant with the number of items stored in the tuple
> header), but it simplifies the functions handling these lists.
>
> - Heikki

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [9.3 bug] disk space in pg_xlog increases during archive recovery
On Mon, Dec 16, 2013 at 9:22 PM, MauMau wrote:
> Hi, Fujii san,
>
>> On Wed, Aug 7, 2013 at 7:03 AM, Fujii Masao wrote:
>>> On second thought, probably we cannot remove the restored WAL files early
>>> because they might be required for fast promotion, which is a new feature
>>> in 9.3. In fast promotion, an end-of-recovery checkpoint is not executed.
>>> After the end of recovery, a normal online checkpoint starts. If the
>>> server crashes before such an online checkpoint completes, the server
>>> needs to replay again all the WAL files which it replayed before the
>>> promotion. Since this is crash recovery, all those WAL files need to
>>> exist in the pg_xlog directory. So if we remove the restored WAL files
>>> from pg_xlog early, such a crash recovery might fail.
>>>
>>> So even if cascade replication is disabled, if standby_mode = on, i.e.,
>>> fast promotion can be performed, we cannot remove the restored WAL files
>>> early.
>
> Following Fujii-san's advice, I've made the attached patch.

Thanks for the patch!

!   if (source == XLOG_FROM_ARCHIVE && StandbyModeRequested)

Even when standby_mode is not enabled, we can use cascade replication, and it needs the accumulated WAL files. So I think that AllowCascadeReplication() should be added to this condition.

!   snprintf(recoveryPath, MAXPGPATH, XLOGDIR "/RECOVERYXLOG");
!   XLogFilePath(xlogpath, ThisTimeLineID, endLogSegNo);
!
!   if (restoredFromArchive)

Don't we need to check !StandbyModeRequested and !AllowCascadeReplication() here?

!   /*
!    * If the latest segment is not archival, but there's still a
!    * RECOVERYXLOG laying about, get rid of it.
!    */
!   unlink(recoveryPath);   /* ignore any error */

A similar line exists later in exitArchiveRecovery(), so ISTM that you can refactor that.

Regards,

-- 
Fujii Masao
Re: [HACKERS] ALTER SYSTEM SET command to change postgresql.conf parameters
> I found that the psql tab-completion for ALTER SYSTEM SET has not been
> implemented yet. Attached patch does that. Barring any objections, I
> will commit this patch.

Good catch!

Best regards,
-- 
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
Re: [HACKERS] New option for pg_basebackup, to specify a different directory for pg_xlog
On 19 December 2013 05:31 Bruce Momjian wrote:
> On Wed, Dec 11, 2013 at 10:22:32AM +, Haribabu kommi wrote:
>> Moving the make_absolute_path() function to port is done in a similar
>> way to Bruce Momjian's approach. psprintf() is used to build the error
>> string produced in the function. But psprintf() is not used for storing
>> the absolute path, because that causes problems when freeing the
>> allocated memory in SelectConfigFiles(): the same memory is allocated
>> from guc_malloc() in a different code branch.
>>
>> After adding make_absolute_path() with the psprintf() changes to the
>> path.c file, I got a linking problem when compiling ecpg. I am not able
>> to find the cause, so I added another file, abspath.c, in port, which
>> contains these two functions.
>
> What errors are you seeing?

If I move the make_absolute_path() function from abspath.c to path.c, I get the following linking errors while compiling "ecpg":

../../../../src/port/libpgport.a(path.o): In function `make_absolute_path':
/home/hari/postgres/src/port/path.c:795: undefined reference to `psprintf'
/home/hari/postgres/src/port/path.c:809: undefined reference to `psprintf'
/home/hari/postgres/src/port/path.c:818: undefined reference to `psprintf'
/home/hari/postgres/src/port/path.c:830: undefined reference to `psprintf'
collect2: ld returned 1 exit status
make[4]: *** [ecpg] Error 1
make[3]: *** [all-preproc-recurse] Error 2
make[2]: *** [all-ecpg-recurse] Error 2
make[1]: *** [all-interfaces-recurse] Error 2
make: *** [all-src-recurse] Error 2

Regards,
Hari babu.
Re: [HACKERS] ALTER SYSTEM SET command to change postgresql.conf parameters
On Thu, Dec 19, 2013 at 12:08 PM, Amit Kapila wrote:
> On Wed, Dec 18, 2013 at 8:25 PM, Tatsuo Ishii wrote:
>>> Is there any reason for the function returns int as it always returns
>>> 0 or 1. Maybe returns bool is better?
>>
>> I have committed your patches. Thanks.
>
> Thank you very much.

I found that the psql tab-completion for ALTER SYSTEM SET has not been implemented yet. Attached patch does that. Barring any objections, I will commit this patch.

Regards,

-- 
Fujii Masao

*** a/src/bin/psql/tab-complete.c
--- b/src/bin/psql/tab-complete.c
*************** static const SchemaQuery Query_for_list_of_matviews = {
*** 541,546 ****
--- 541,552 ----
  "SELECT pg_catalog.quote_ident(nspname) FROM pg_catalog.pg_namespace "\
  " WHERE substring(pg_catalog.quote_ident(nspname),1,%d)='%s'"
  
+ #define Query_for_list_of_alter_system_set_vars \
+ "SELECT name FROM "\
+ " (SELECT pg_catalog.lower(name) AS name FROM pg_catalog.pg_settings "\
+ "  WHERE context != 'internal') ss "\
+ " WHERE substring(name,1,%d)='%s'"
+ 
  #define Query_for_list_of_set_vars \
  "SELECT name FROM "\
  " (SELECT pg_catalog.lower(name) AS name FROM pg_catalog.pg_settings "\
*************** psql_completion(char *text, int start, int end)
*** 930,936 ****
  	{"AGGREGATE", "COLLATION", "CONVERSION", "DATABASE", "DEFAULT PRIVILEGES", "DOMAIN",
  		"EXTENSION", "FOREIGN DATA WRAPPER", "FOREIGN TABLE", "FUNCTION",
  		"GROUP", "INDEX", "LANGUAGE", "LARGE OBJECT", "MATERIALIZED VIEW", "OPERATOR",
! 		"ROLE", "RULE", "SCHEMA", "SERVER", "SEQUENCE", "TABLE", "TABLESPACE",
  		"TEXT SEARCH", "TRIGGER", "TYPE",
  		"USER", "USER MAPPING FOR", "VIEW", NULL};
--- 936,942 ----
  	{"AGGREGATE", "COLLATION", "CONVERSION", "DATABASE", "DEFAULT PRIVILEGES", "DOMAIN",
  		"EXTENSION", "FOREIGN DATA WRAPPER", "FOREIGN TABLE", "FUNCTION",
  		"GROUP", "INDEX", "LANGUAGE", "LARGE OBJECT", "MATERIALIZED VIEW", "OPERATOR",
! 		"ROLE", "RULE", "SCHEMA", "SERVER", "SEQUENCE", "SYSTEM SET", "TABLE", "TABLESPACE",
  		"TEXT SEARCH", "TRIGGER", "TYPE",
  		"USER", "USER MAPPING FOR", "VIEW", NULL};
*************** psql_completion(char *text, int start, int end)
*** 1263,1268 ****
--- 1269,1279 ----
  		COMPLETE_WITH_LIST(list_ALTER_SERVER);
  	}
+ 	/* ALTER SYSTEM SET */
+ 	else if (pg_strcasecmp(prev3_wd, "ALTER") == 0 &&
+ 			 pg_strcasecmp(prev2_wd, "SYSTEM") == 0 &&
+ 			 pg_strcasecmp(prev_wd, "SET") == 0)
+ 		COMPLETE_WITH_QUERY(Query_for_list_of_alter_system_set_vars);
  	/* ALTER VIEW */
  	else if (pg_strcasecmp(prev3_wd, "ALTER") == 0 &&
  			 pg_strcasecmp(prev2_wd, "VIEW") == 0)
Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
On Thu, Dec 12, 2013 at 4:18 PM, Peter Geoghegan wrote: > Both of these revisions have identical ad-hoc test cases included as > new files - see testcase.sh and upsert.sql. My patch doesn't have any > unique constraint violations, and has pretty consistent performance, > while yours has many unique constraint violations. I'd like to hear > your thoughts on the testcase, and the design implications. I withdraw the test-case. Both approaches behave similarly if you look for long enough, and that's okay. I also think that changes to HeapTupleSatisfiesUpdate() are made unnecessary by recent bug fixes to that function. The test case previously described [1] that broke that is no longer recreatable, at least so far. Do you think that we need to throw a serialization failure within ExecLockHeapTupleForUpdateSpec() iff heap_lock_tuple() returns HeapTupleInvisible and IsolationUsesXactSnapshot()? Also, I'm having a hard time figuring out a good choke point to catch MVCC snapshots availing of our special visibility rule where they should not due to IsolationUsesXactSnapshot(). It seems sufficient to continue to assume that Postgres won't attempt to lock any tid invisible under conventional MVCC rules in the first place, except within ExecLockHeapTupleForUpdateSpec(), but what do we actually do within ExecLockHeapTupleForUpdateSpec()? I'm thinking of a new tqual.c routine concerning the tuple being in the future that we re-check when IsolationUsesXactSnapshot(). That's not very modular, though. Maybe we'd go through heapam.c. I think it doesn't matter that what now constitute MVCC snapshots (with the new, special "reach into the future" rule) have that new rule, for the purposes of higher isolation levels, because we'll have a serialization failure within ExecLockHeapTupleForUpdateSpec() before this is allowed to become a problem. 
In order for the new rule to be relevant, we'd have to be the Xact to lock in the first place, and as an xact in non-read-committed mode, we'd be sure to call the new tqual.c "in the future" routine or whatever. Only upserters can lock a row in the future, so it is the job of upserters to care about this special case. Incidentally, I tried to rebase recently, and saw some shift/reduce conflicts due to 1b4f7f93b4693858cb983af3cd557f6097dab67b, "Allow empty target list in SELECT". The fix for that is not immediately obvious. So I think we should proceed with the non-conclusive-check-first approach (if only on pragmatic grounds), but even now I'm not really sure. I think there might be unprincipled deadlocking should ExecInsertIndexTuples() fail to be completely consistent about its ordering of insertion - the use of dirty snapshots (including as part of conventional !UNIQUE_CHECK_PARTIAL unique index enforcement) plays a part in this risk. Roughly speaking, heap_delete() doesn't render the tuple immediately invisible to some-other-xact's dirty snapshot [2], and I think that could have unpleasant interactions, even if it is also beneficial in some ways. Our old, dead tuples from previous attempts stick around, and function as "value locks" to everyone else, since for example _bt_check_unique() cares about visibility having merely been affected, which is grounds for blocking. More counter-intuitive still, we go ahead with "value locking" (i.e. btree UNIQUE_CHECK_PARTIAL tuple insertion originating from the main speculative ExecInsertIndexTuples() call) even though we already know that we will delete the corresponding heap row (which, as noted, still satisfies HeapTupleSatisfiesDirty() and so is value-lock-like). Empirically, retrying because ExecInsertIndexTuples() returns some recheckIndexes occurs infrequently, so maybe that makes all of this okay. 
Or maybe it happens infrequently *because* we don't give up on insertion when it looks like the current iteration is futile. Maybe just inserting into every unique index, and then blocking on an xid within ExecCheckIndexConstraints(), works out fairly well and performs reasonably in all common cases. It's pretty damn subtle, though, and I worry about the worst-case performance, and basic correctness issues for these reasons. The fact that deferred unique indexes also use UNIQUE_CHECK_PARTIAL is cold comfort -- that only ever has to throw an error on conflict, and only once. We haven't "earned the right" to lock *all* values in all unique indexes, but kind of do so anyway in the event of an "insertion conflicted after pre-check". Another concern that bears reiterating is: I think making the lock-for-update case work for exclusion constraints is a lot of additional complexity for a very small return. Do you think it's worth optimizing ExecInsertIndexTuples() to avoid futile non-unique/exclusion constrained index tuple insertion?

[1] http://www.postgresql.org/message-id/CAM3SWZS2--GOvUmYA2ks_aNyfesb0_H6T95_k8+wyx7Pi=c...@mail.gmail.com
[2] https://github.com/postgres/postgres/blob/94b899b829657332bda856ac3f06153d09077bd1/src/backend/utils/
Re: [HACKERS] Logging WAL when updating hintbit
On Wed, Dec 18, 2013 at 11:30 AM, Michael Paquier wrote: > On Wed, Dec 18, 2013 at 11:22 AM, Amit Kapila wrote: >> On Fri, Dec 13, 2013 at 7:57 PM, Heikki Linnakangas >> wrote: >>> Thanks, committed with some minor changes: >> >> Should this patch in CF app be moved to Committed Patches or is there >> something left for this patch? > Nothing has been forgotten for this patch. It can be marked as committed. Thanks for confirmation, I have marked it as Committed. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] row security roadmap proposal
On 12/18/2013 11:14 PM, Robert Haas wrote: > To be clear, I wasn't advocating for a declarative approach; I think > predicates are fine. There are usability issues to worry about either > way, and my concern is that we address those. A declarative approach > would certainly be valuable in that, for those people for whom it is > sufficiently flexible, it's probably quite a lot easier than writing > predicates. But I fear that some people will want a lot more > generality than a label-based system can accommodate. Having spent some time reading the HL7 spec, I now tend to agree that labels alone are not going to be sufficient. There are security decisions made based on: - Security labels - User/group/role memberships, both for accessor and target entity owner (down to the row level) - Black/white list ACLs, with inheritance and deny rules. - One-off or narrow authorizations. "I permit my GP to examine my mental health record details, but only over the last year, and the authorization stands only for today." - Authorization assertions. "I declare that the patient told me it is OK to access these, let me see them." - "Break glass" access. "This is an emergency. Give me access and I'll deal with the consequences later." So while security labels are an important part of the picture I'm forced to agree that they are not sufficient, and that a generalized row security mechanism actually is necessary. We don't have the time or resources to build all these sorts of things individually, and if we did it'd still be too inflexible for somebody. In the end, sometimes I guess there's no replacement for "WHERE call_some_procedure()" In particular, labels are totally orthogonal to entity identity in the data model, and being able to make row access decisions based on information already in the data model is important. FK relationships to owning entities and relationships between entities must be usable for security access policy decisions. 
So: I'm withdrawing the proposal to go straight to label security; I concede that it's insufficiently expressive to meet all possible needs, and we don't have the time to build anything declarative and user-friendly that would be. I do want to make sure that whatever we include this time around does not create problems for a future label security approach, but that's kind of required by the desire to add SEPostgreSQL RLS down the track anyway.

Given that: What are your specific usability concerns about the current approach proposed in KaiGai's patch? My main worry was that it requires the user to build everything manually, and is potentially error prone as a result. To address that we can build convenience features (label security, ACL types and operators, etc) on top of the same infrastructure later - it doesn't have to go in right away.

So putting that aside, the concerns I have are:

- The current RLS design is restricted to one predicate per table, with no contextual control over when that predicate applies. That means we can't implement anything like "policy groups" or overlapping sets of predicates; anything like that has to be built by the user. We don't need multiple predicates to start with, but I want to make sure there's room for them later, particularly so that different apps / extensions that use row-security don't have to fight with each other.

- You can't declare a predicate and then apply it to a bunch of tables with consistent security columns. Updating/changing predicates will be a pain. We might be able to get around that by recommending that people use an inlineable SQL function to declare their predicates, but "inlineable" can be hard to pin down sometimes. If not, we need something akin to GRANT ... ALL TABLES IN SCHEMA ... for row security, and ALTER DEFAULT PRIVILEGES ... too.

- It's too hard to see which tables have row security and what impact it has. Needs psql improvements.
- There's no way to control whether or not a client can see the row-security policy definition. The other one I want to deal with is secure session variables, but that's mostly a performance optimisation we can add later. What's your list? -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] clang's -Wmissing-variable-declarations shows some shoddy programming
On Sat, Dec 14, 2013 at 04:52:28PM +0100, Andres Freund wrote: > Hi, > > Compiling postgres with said option in CFLAGS really gives an astounding > number of warnings. Except some bison/flex generated ones, none of them > looks acceptable to me. > Most are just file local variables with a missing static and easy to > fix. Several other are actually shared variables, where people simply > haven't bothered to add the variable to a header. Some of them with > comments declaring that fact, others adding longer comments, even others > adding longer comments about that fact. > > I've attached the output of such a compilation run for those without > clang. Now that pg_upgrade has stabilized, I think it is time to centralize all the pg_upgrade_support control variables in a single C include file that can be used by the backend and by pg_upgrade_support. This will eliminate the compiler warnings too. The attached patch accomplishes this. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. 
+ 

diff --git a/contrib/pg_upgrade_support/pg_upgrade_support.c b/contrib/pg_upgrade_support/pg_upgrade_support.c
new file mode 100644
index 99e64c4..6e30deb
*** a/contrib/pg_upgrade_support/pg_upgrade_support.c
--- b/contrib/pg_upgrade_support/pg_upgrade_support.c
***************
*** 11,16 ****
--- 11,17 ----
  #include "postgres.h"
  
+ #include "catalog/binary_upgrade.h"
  #include "catalog/namespace.h"
  #include "catalog/pg_type.h"
  #include "commands/extension.h"
***************
*** 24,40 ****
  PG_MODULE_MAGIC;
  #endif
  
- extern PGDLLIMPORT Oid binary_upgrade_next_pg_type_oid;
- extern PGDLLIMPORT Oid binary_upgrade_next_array_pg_type_oid;
- extern PGDLLIMPORT Oid binary_upgrade_next_toast_pg_type_oid;
- 
- extern PGDLLIMPORT Oid binary_upgrade_next_heap_pg_class_oid;
- extern PGDLLIMPORT Oid binary_upgrade_next_index_pg_class_oid;
- extern PGDLLIMPORT Oid binary_upgrade_next_toast_pg_class_oid;
- 
- extern PGDLLIMPORT Oid binary_upgrade_next_pg_enum_oid;
- extern PGDLLIMPORT Oid binary_upgrade_next_pg_authid_oid;
- 
  Datum set_next_pg_type_oid(PG_FUNCTION_ARGS);
  Datum set_next_array_pg_type_oid(PG_FUNCTION_ARGS);
  Datum set_next_toast_pg_type_oid(PG_FUNCTION_ARGS);
--- 25,30 ----
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
new file mode 100644
index 6f2e142..032a20e
*** a/src/backend/catalog/heap.c
--- b/src/backend/catalog/heap.c
***************
*** 34,39 ****
--- 34,40 ----
  #include "access/sysattr.h"
  #include "access/transam.h"
  #include "access/xact.h"
+ #include "catalog/binary_upgrade.h"
  #include "catalog/catalog.h"
  #include "catalog/dependency.h"
  #include "catalog/heap.h"
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
new file mode 100644
index aa31429..7ad9720
*** a/src/backend/catalog/index.c
--- b/src/backend/catalog/index.c
***************
*** 30,35 ****
--- 30,36 ----
  #include "access/visibilitymap.h"
  #include "access/xact.h"
  #include "bootstrap/bootstrap.h"
+ #include "catalog/binary_upgrade.h"
  #include "catalog/catalog.h"
  #include "catalog/dependency.h"
  #include "catalog/heap.h"
diff --git a/src/backend/catalog/pg_enum.c b/src/backend/catalog/pg_enum.c
new file mode 100644
index 35899b4..23d2a41
*** a/src/backend/catalog/pg_enum.c
--- b/src/backend/catalog/pg_enum.c
***************
*** 17,22 ****
--- 17,23 ----
  #include "access/heapam.h"
  #include "access/htup_details.h"
  #include "access/xact.h"
+ #include "catalog/binary_upgrade.h"
  #include "catalog/catalog.h"
  #include "catalog/indexing.h"
  #include "catalog/pg_enum.h"
diff --git a/src/backend/catalog/pg_type.c b/src/backend/catalog/pg_type.c
new file mode 100644
index 23ac3dd..634915b
*** a/src/backend/catalog/pg_type.c
--- b/src/backend/catalog/pg_type.c
***************
*** 17,22 ****
--- 17,23 ----
  #include "access/heapam.h"
  #include "access/htup_details.h"
  #include "access/xact.h"
+ #include "catalog/binary_upgrade.h"
  #include "catalog/dependency.h"
  #include "catalog/indexing.h"
  #include "catalog/objectaccess.h"
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
new file mode 100644
index 385d64d..f58e434
*** a/src/backend/catalog/toasting.c
--- b/src/backend/catalog/toasting.c
***************
*** 16,21 ****
--- 16,22 ----
  #include "access/tuptoaster.h"
  #include "access/xact.h"
+ #include "catalog/binary_upgrade.h"
  #include "catalog/dependency.h"
  #include "catalog/heap.h"
  #include "catalog/index.h"
***************
*** 31,38 ****
  #include "utils/syscache.h"
  
  /* Potentially set by contrib/pg_upgrade_support functions */
- extern Oid binary_upgrade_next_toast_pg_class_oid;
- 
  Oid binary_upgrade_next_toast_pg_type_oid = InvalidOid;
  
  static bool create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid,
--- 32,37 ----
diff --git a/src/backend/commands/typecmds.c
Re: [HACKERS] ALTER SYSTEM SET command to change postgresql.conf parameters
On Wed, Dec 18, 2013 at 8:25 PM, Tatsuo Ishii wrote:
>>> Is there any reason for the function returns int as it always returns
>>> 0 or 1. Maybe returns bool is better?
>
> I have committed your patches. Thanks.

Thank you very much.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] preserving forensic information when we freeze
On Wed, Dec 18, 2013 at 5:54 PM, Andres Freund wrote: >> if (frz->frzflags & XLH_FREEZE_XVAC) >> + { >> HeapTupleHeaderSetXvac(tuple, FrozenTransactionId); >> + /* If we somehow haven't hinted the tuple previously, do it >> now. */ >> + HeapTupleHeaderSetXminCommitted(tuple); >> + } > > What's the reasoning behind adding HeapTupleHeaderSetXminCommitted() > here? I'm just copying the existing logic. See the final stanza of heap_prepare_freeze_tuple. > [ snip ] Will fix. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Autoconf 2.69 update
On Thu, 2013-11-14 at 22:00 -0500, Peter Eisentraut wrote:
> I'm proposing that we upgrade our Autoconf to 2.69

This has been done.
Re: [HACKERS] pg_rewarm status
On Wed, Dec 18, 2013 at 6:07 PM, Cédric Villemain wrote: > In the case of effective_io_concurrency, however, this may not work as well as > expected, IIRC it is used to prefetch heap blocks, hopefully the requested > blocks are contiguous but if there are too much holes it is enough to fill the > ring very quickly (with the current max value of effective_io_concurrency). Yeah, we'd need to figure out how big the ring would need to be for reasonable values of effective_io_concurrency. >> When the prefetch process starts up, it services requests from the >> queue by reading the requested blocks (or block ranges). When the >> queue is empty, it sleeps. If it receives no requests for some period >> of time, it unregisters itself and exits. This is sort of a souped-up >> version of the hibernation facility we already have for some auxiliary >> processes, in that we don't just make the process sleep for a longer >> period of time but actually get rid of it altogether. > > I'm just a bit skeptical about the starting time: backend will ReadBuffer very > soon after requesting the Prefetch... Yeah, absolutely. The first backend that needs a prefetch probably isn't going to get it in time. I think that's OK though. Once the background process is started, response times will be quicker... although possibly still not quick enough. We'd need to benchmark this to determine how quickly the background process can actually service requests. Does anybody have a good self-contained test case that showcases the benefits of prefetching? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PoC: Partial sort
On 12/18/2013 01:02 PM, Alexander Korotkov wrote:
>> My idea for a solution was to modify tuplesort to allow storing the
>> already sorted keys in either memtuples or the sort result file, but
>> setting a field so it does not sort the already sorted tuples again.
>> This would allow the rescan to work as it used to, but I am unsure how
>> clean or ugly this code would be. Was this something you considered?
>
> I'm not sure. I believe the best answer depends on the particular
> parameters: how much memory we have for the sort, how expensive the
> underlying node is and how it performs rescan, and how big the groups
> in the partial sort are.

Yes, if one does not need a rescan your solution will use less memory and about the same amount of CPU (if the tuplesort does not spill to disk). While if we keep all the already sorted tuples in the tuplesort, rescans will be cheap, but more memory will be used, with an increased chance of spilling to disk.

-- 
Andreas Karlsson
Re: [HACKERS] pg_rewarm status
Robert Haas escribió:
> I'm not inclined to wait for the next CommitFest to commit this,
> because it's a very simple patch and has already had a lot more field
> testing than most patches get before they're committed. And it's just
> a contrib module, so the damage it can do if there is in fact a bug is
> pretty limited. All that having been said, any review is appreciated.

Looks nice. Some really minor things I noticed while skimming are that you have some weird indentation using spaces in some ereport() calls; there's an extra call to RelationOpenSmgr() in read mode; and the copyright year is 2012. Please use &mdash; in sgml instead of plain dashes, and please avoid the "!strcmp()" idiom.

I didn't actually try it out ...

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] pg_rewarm status
Le mercredi 18 décembre 2013 18:40:09, Robert Haas a écrit : > Now that we have dynamic background workers, I've been thinking that > it might be possible to write a background worker to do asynchronous > prefetch on systems where we don't have OS support. We could store a > small ring buffer in shared memory, say big enough for 1k entries. > Each entry would consist of a relfilenode, a starting block number, > and a block count. We'd also store a flag indicating whether the > prefetch worker has been registered with the postmaster, and a pointer > to the PGPROC of any running worker. When a process wants to do a > prefetch, it locks the buffer, adds its prefetch request to the queue > (overwriting the oldest existing request if the queue is already > full), and checks the flag. If the flag is not set, it also registers > the background worker. Then, it releases the lock and sets the latch > of any running worker (whose PGPROC it remembered before releasing the > lock). Good idea. If the list is full it is probably that the system is busy, I suppose that in such case some alternative behavior can be interesting. Perhaps flush a part of the ring. Oldest entries are the less interesting, we're talking about prefetching after all. In the case of effective_io_concurrency, however, this may not work as well as expected, IIRC it is used to prefetch heap blocks, hopefully the requested blocks are contiguous but if there are too much holes it is enough to fill the ring very quickly (with the current max value of effective_io_concurrency). > When the prefetch process starts up, it services requests from the > queue by reading the requested blocks (or block ranges). When the > queue is empty, it sleeps. If it receives no requests for some period > of time, it unregisters itself and exits. 
> This is sort of a souped-up > version of the hibernation facility we already have for some auxiliary > processes, in that we don't just make the process sleep for a longer > period of time but actually get rid of it altogether. I'm just a bit skeptical about the start-up time: the backend will ReadBuffer very soon after requesting the prefetch... > All of this might be overkill; we could also do it with a permanent > auxiliary process. But it's sort of a shame to run an extra process > for a facility that might never get used, or might be used only > rarely. And I'm wary of cluttering things up with a thicket of > auxiliary processes each of which caters only to a very specific, very > narrow situation. Anyway, just thinking out loud here. For Windows, see the PrefetchVirtualMemory() function in the Windows API. I really wonder whether such a bgworker could improve prefetching on !windows too, if a ring insert is faster than a posix_fadvise() call. If so, then effective_io_concurrency could be revisited. Maybe Greg Stark already did some benchmarks of that... -- Cédric Villemain +33 (0)6 20 30 22 52 http://2ndQuadrant.fr/ PostgreSQL: Support 24x7 - Développement, Expertise et Formation
Re: [HACKERS] New option for pg_basebackup, to specify a different directory for pg_xlog
On Wed, Dec 11, 2013 at 10:22:32AM +, Haribabu kommi wrote: > The make_absolute_path() function moved to port is changed in a similar way to > Bruce Momjian's approach. psprintf is used to store the error string that > occurred in the function. But psprintf is not used for storing the absolute path, > because it causes problems when freeing the allocated memory in > SelectConfigFiles: the same memory is allocated in a different code branch from > guc_malloc. > > After adding the make_absolute_path() function with the psprintf stuff in the path.c file, > it gives a linking problem when compiling ecpg. I am not able to find > the problem, so I added another file, abspath.c, in port which contains these two functions. What errors are you seeing? -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] SQL assertions prototype
Josh Berkus wrote: > On 12/18/2013 11:26 AM, Jim Nasby wrote: >> Another possibility is to allow for two different types of >> assertions, one based on SSI and one based on locking. > > The locking version would have to pretty much lock on a table > basis (or even a whole-database basis) every time an assertion > executed, no? As far as I can see, if SSI is *not* used, there needs to be a mutually exclusive lock taken from somewhere inside the COMMIT code until the transaction is complete -- effectively serializing assertion processing for transactions which could affect a given assertion. Locking on tables would, as previously suggested, be very prone to deadlocks on the heavyweight locks. Locking on the assertions in a predictable order seems more promising, especially if there could be some way to only do that if the transaction really might have done something which could affect the truth of the assertion. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] array_length(anyarray)
On 12/19/13, 12:01 AM, David Johnston wrote: Marko Tiikkaja-4 wrote On 2013-12-18 22:32, Andrew Dunstan wrote: You're not really free to assume it - you'll need an exception handler for the other-than-1 case, or your code might blow up. This seems to be codifying a bad pattern, which should be using array_lower() and array_upper() instead. That's the entire point -- I *want* my code to blow up. If someone passes a multi-dimensional array to a function that assumes its input is one-dimensional and its indexes start from 1, I want it to be obvious that the caller did something wrong. Now I either copy-paste lines and lines of codes to always test for the weird cases or my code breaks in subtle ways. This is no different from an Assert() somewhere -- if the caller breaks the documented interface, it's his problem, not mine. And I don't want to waste my time coding around the fact that this simple thing is so hard to do in PG. 1) Why cannot we just make the second argument of the current function optional and default to 1? That still does the wrong thing for the empty array, multidimensional arrays and arrays that don't start from index 1. 2) How about providing a function that returns the "1-dim/lower=1" input array or raise/exception if the input array does not conform? CREATE FUNCTION array_normal(arr anyarray) RETURNS anyarray $$ begin if (empty(arr)) return arr; if (ndim(arr) > 1) raise exception; if (array_lower() <> 1) raise exception return arr; end; $$ With this, I would still have to do COALESCE(array_length(array_normal($1), 1), 0). That's pretty stupid for the most common use case of arrays, don't you think? I can also see wanting 1-dimensional enforced without having to require the lower-bound to be 1 so maybe a separate function for that. I really don't see the point. How often have you ever created a function that doesn't have a lower bound of 1 on purpose? What good did it serve you? 
Usage: SELECT array_length(array_normal(input_array)) I could see this being especially useful for a domain and/or column constraint definition and also allowing for a textbook case of separation of concerns. What would array_length() in this case be? With what you suggested above, you would still get NULL for an empty array. I am torn, but mostly opposed, to making an array_length(anyarray) function with these limitations enforced - especially if other similar functions are not created at the same time. I fully agree that array_length(anyarray) should be a valid call without requiring the user to specify ", 1" by rote. I'm specifically asking for something that is different from array_length(anyarray, int), because I personally think it's too full of caveats. Regards, Marko Tiikkaja -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stats for network traffic WIP
On Wed, Dec 18, 2013 at 03:41:24PM -0500, Robert Haas wrote: > On the other hand, there's not much value in adding monitoring > features that are going to materially harm performance, and a lot of > the monitoring features that get proposed die on the vine for exactly > that reason. I think the root of the problem is that our stats > infrastructure is a streaming pile of crap. A number of people have "streaming"? I can't imagine what that looks like. ;-) I think the larger point is that network is only one of many things we need to address, so this needs a holistic approach that looks at all needs and creates infrastructure to address it. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Assertion failure in base backup code path
Hi Magnus, It looks to me like the path to do_pg_start_backup() outside of a transaction context comes from your initial commit of the base backup facility. The problem is that you're not allowed to do anything leading to a syscache/catcache lookup in those contexts. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] preserving forensic information when we freeze
Hi, On 2013-12-17 16:00:14 -0500, Robert Haas wrote: > @@ -5874,19 +5858,27 @@ heap_prepare_freeze_tuple(HeapTupleHeader tuple, > TransactionId cutoff_xid, > void > heap_execute_freeze_tuple(HeapTupleHeader tuple, xl_heap_freeze_tuple *frz) > { > + tuple->t_infomask = frz->t_infomask; > + tuple->t_infomask2 = frz->t_infomask2; > + > if (frz->frzflags & XLH_FREEZE_XMIN) > - HeapTupleHeaderSetXmin(tuple, FrozenTransactionId); > + HeapTupleHeaderSetXminFrozen(tuple); > > HeapTupleHeaderSetXmax(tuple, frz->xmax); > > if (frz->frzflags & XLH_FREEZE_XVAC) > + { > HeapTupleHeaderSetXvac(tuple, FrozenTransactionId); > + /* If we somehow haven't hinted the tuple previously, do it > now. */ > + HeapTupleHeaderSetXminCommitted(tuple); > + } What's the reasoning behind adding HeapTupleHeaderSetXminCommitted() here? > @@ -823,14 +823,14 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats, >* NB: Like with per-tuple hint bits, > we can't set the >* PD_ALL_VISIBLE flag if the inserter > committed >* asynchronously. See SetHintBits for > more info. Check > - * that the HEAP_XMIN_COMMITTED hint > bit is set because of > - * that. > + * that the tuple is hinted > xmin-committed hint bit because > + * of that. >*/ Looks like there's one "hint bit" too many here. > @@ -65,6 +65,9 @@ manage to be a conflict it would merely mean that one > bit-update would > be lost and need to be done again later. These four bits are only hints > (they cache the results of transaction status lookups in pg_clog), so no > great harm is done if they get reset to zero by conflicting updates. > +Note, however, that a tuple is frozen by setting both HEAP_XMIN_INVALID > +and HEAP_XMIN_COMMITTED; this is a critical update and accordingly requires > +an exclusive buffer lock. I think it'd be appropriate to mention that this needs to be preserved via WAL logging; as written, it sounds like it would be sufficient to set a hint bit without persistence guarantees. (not sure if I already wrote this, but whatever) Looking good.
Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] array_length(anyarray)
Marko Tiikkaja-4 wrote: > On 2013-12-18 22:32, Andrew Dunstan wrote: >> You're not really free to assume it - you'll need an exception handler >> for the other-than-1 case, or your code might blow up. >> >> This seems to be codifying a bad pattern, which should be using >> array_lower() and array_upper() instead. > > That's the entire point -- I *want* my code to blow up. If someone > passes a multi-dimensional array to a function that assumes its input is > one-dimensional and its indexes start from 1, I want it to be obvious > that the caller did something wrong. Now I either copy-paste lines and > lines of codes to always test for the weird cases or my code breaks in > subtle ways. > > This is no different from an Assert() somewhere -- if the caller breaks > the documented interface, it's his problem, not mine. And I don't want > to waste my time coding around the fact that this simple thing is so > hard to do in PG. 1) Why can't we just make the second argument of the current function optional and default to 1? 2) How about providing a function that returns the "1-dim/lower=1" input array or raises an exception if the input array does not conform? Something like: CREATE FUNCTION array_normal(arr anyarray) RETURNS anyarray AS $$ begin if array_ndims(arr) is null then return arr; end if; if array_ndims(arr) > 1 then raise exception 'array is not one-dimensional'; end if; if array_lower(arr, 1) <> 1 then raise exception 'array lower bound is not 1'; end if; return arr; end; $$ LANGUAGE plpgsql; I can also see wanting 1-dimensional enforced without having to require the lower bound to be 1, so maybe a separate function for that. Usage: SELECT array_length(array_normal(input_array)) I could see this being especially useful for a domain and/or column constraint definition and also allowing for a textbook case of separation of concerns. I am torn about, but mostly opposed to, making an array_length(anyarray) function with these limitations enforced - especially if other similar functions are not created at the same time.
I fully agree that array_length(anyarray) should be a valid call without requiring the user to specify ", 1" by rote. Tangential Question: Is there any way to define a non-1-based array without using array-literal syntax but by using ARRAY[1,2,3] syntax? David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/array-length-anyarray-tp5783950p5783972.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] shared memory message queues
On 2013-12-18 15:23:23 -0500, Robert Haas wrote: > It sounds like most people who have looked at this stuff are broadly > happy with it, so I'd like to push on toward commit soon, but it'd be > helpful, Andres, if you could review the comment additions to > shm-mq-v2.patch and see whether those address your concerns. FWIW, I haven't looked carefully enough at the patch to consider myself having reviewed it. I am not saying that it's not ready for committer, just that I don't know whether it is. I will have a look at the comment improvements, and if you don't beat me to it, give the patch a read-over. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.3 regression with dbt2
Applying your patch plus adding -fno-omit-frame-pointer, I got 54526.90 notpm. The profile (part) below: # Samples: 610K of event 'cycles' # Event count (approx.): 6686532056450 # # Overhead Command Shared Object Symbol # .. . .. # 4.08%postgres postgres [.] hash_search_with_hash_value | --- hash_search_with_hash_value | --2.87%-- BufTableLookup | --2.86%-- ReadBuffer_common ReadBufferExtended | |--0.86%-- index_fetch_heap | | | --0.62%-- index_getnext | IndexNext | ExecScan | ExecProcNode | |--0.71%-- BitmapHeapNext | ExecScan | ExecProcNode | --0.57%-- _bt_relandgetbuf | --0.57%-- _bt_search 3.75%postgres postgres [.] heap_hot_search_buffer | --- heap_hot_search_buffer | |--1.92%-- BitmapHeapNext | ExecScan | ExecProcNode | | | --1.12%-- standard_ExecutorRun | _SPI_execute_plan | SPI_execute_plan | | | --0.70%-- payment |ExecMakeFunctionResult |ExecProject |ExecResult |ExecProcNode |standard_ExecutorRun |PortalRunSelect |PortalRun |PostgresMain |ServerLoop |PostmasterMain |main |__libc_start_main | --1.74%-- index_fetch_heap | --1.46%-- index_getnext | --1.45%-- IndexNext ExecScan ExecProcNode | --0.96%-- standard_ExecutorRun _SPI_execute_plan SPI_execute_plan | --0.50%-- new_order ExecMakeFunctionResult ExecProject ExecResult ExecProcNode standard_ExecutorRun PortalRunSelect PortalRunFetch PerformPortalFetch standard_ProcessUtility PortalRunUtility FillPortalStore PortalRun PostgresMain ServerLoop PostmasterMain main __libc_start_main 3.15%postgres postgres [.] LWLockAcquire | --- LWLockAcquire | --1.65%-- ReadBuffer_common ReadBufferExtended 2.74%postgres postgres [.] PinBuffer | --- PinBuffer | --2.72%-- ReadBuffer_common ReadBufferExtended | |--0.71%-- index_fetch_heap | | | --0.51%-- index_getnext | IndexNext | ExecScan | ExecProcNode | |--0.67%-- BitmapHeapNext | ExecScan | ExecProcNode | --0.60%
Re: [HACKERS] Extension Templates S03E11
Yeah I think this whole discussion is happening at the wrong level. The problem you're having, despite appearances, is not that people disagree about the best way to accomplish your goals. The problem is that not everyone is convinced your goals are a good idea. Either they just don't understand the goals or they do understand them but don't agree that they're a good idea. Personally I'm in the former category and would welcome a detailed explanation of the goals of the feature and what use cases those goals enable. I think Tom is in the latter category and needs a very good argument for why those goals are important enough to outweigh the downsides. I don't think loading shared libraries from RAM or a temp download directory is a *complete* show stopper the way Tom says but it would require a pretty compelling use case to make it worth the difficulties it would cause. -- greg
Re: [HACKERS] array_length(anyarray)
On 2013-12-18 22:32, Andrew Dunstan wrote: You're not really free to assume it - you'll need an exception handler for the other-than-1 case, or your code might blow up. This seems to be codifying a bad pattern, which should be using array_lower() and array_upper() instead. That's the entire point -- I *want* my code to blow up. If someone passes a multi-dimensional array to a function that assumes its input is one-dimensional and its indexes start from 1, I want it to be obvious that the caller did something wrong. Now I either copy-paste lines and lines of code to always test for the weird cases or my code breaks in subtle ways. This is no different from an Assert() somewhere -- if the caller breaks the documented interface, it's his problem, not mine. And I don't want to waste my time coding around the fact that this simple thing is so hard to do in PG. Regards, Marko Tiikkaja -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] array_length(anyarray)
On 12/18/2013 04:19 PM, Marko Tiikkaja wrote: On 2013-12-18 22:13, Andrew Dunstan wrote: On 12/18/2013 03:27 PM, Marko Tiikkaja wrote: Attached is a patch to add support for array_length(anyarray), which only works for one-dimensional arrays, returns 0 for empty arrays and complains if the array's lower bound isn't 1. In other words, does the right thing when used with the arrays people use 99% of the time. Why the heck would it complain if the lower bound isn't 1? Because then you're free to assume that the first element is [1] and the last one is [array_length()]. Which is what 99% of the code using array_length(anyarray, int) does anyway. You're not really free to assume it - you'll need an exception handler for the other-than-1 case, or your code might blow up. This seems to be codifying a bad pattern, which should be using array_lower() and array_upper() instead. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] array_length(anyarray)
On 2013-12-18 22:19, I wrote: On 2013-12-18 22:13, Andrew Dunstan wrote: On 12/18/2013 03:27 PM, Marko Tiikkaja wrote: Attached is a patch to add support for array_length(anyarray), which only works for one-dimensional arrays, returns 0 for empty arrays and complains if the array's lower bound isn't 1. In other words, does the right thing when used with the arrays people use 99% of the time. Why the heck would it complain if the lower bound isn't 1? Because then you're free to assume that the first element is [1] and the last one is [array_length()]. Which is what 99% of the code using array_length(anyarray, int) does anyway. Just to clarify, I mean that array_lower(.., 1) must be 1. Whatever that's called. "The lower bound of the only dimension of the array"? Regards, Marko Tiikkaja -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] array_length(anyarray)
On 2013-12-18 22:13, Andrew Dunstan wrote: On 12/18/2013 03:27 PM, Marko Tiikkaja wrote: Attached is a patch to add support for array_length(anyarray), which only works for one-dimensional arrays, returns 0 for empty arrays and complains if the array's lower bound isn't 1. In other words, does the right thing when used with the arrays people use 99% of the time. Why the heck would it complain if the lower bound isn't 1? Because then you're free to assume that the first element is [1] and the last one is [array_length()]. Which is what 99% of the code using array_length(anyarray, int) does anyway. Regards, Marko Tiikkaja -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] SQL assertions prototype
On 12/18/2013 11:26 AM, Jim Nasby wrote: > The flip-side is that now you can get serialization failures, and I > think there's a ton of software that has no clue how to deal with that. > So now you don't get to use assertions at all unless you re-engineer > your application (but see below). Well, the software will need to deal with an Assertion failure, which I doubt it's prepared to do right now either. >> This is consistent with how we treat the interaction of constraints and >> triggers; under some circumstances, we allow triggers to violate CHECK >> and FK constraints. > > We do? Under what circumstances? AFTER triggers are allowed to ignore constraints sometimes. For example, if you have a tree table with an FK to other rows in the same table, and you have an AFTER trigger on it, the AFTER trigger is allowed to violate the self-FK. That's the one I ran across, but I vaguely remember other cases, and there's some documentation on this in the order of application of triggers in the main docs. > Another possibility is to allow for two different types of assertions, > one based on SSI and one based on locking. The locking version would have to pretty much lock on a table basis (or even a whole-database basis) every time an assertion executed, no? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] SQL assertions prototype
On 12/18/2013 04:09 PM, Heikki Linnakangas wrote: On 12/18/2013 11:04 PM, Andrew Dunstan wrote: On 12/18/2013 02:45 PM, Andres Freund wrote: On 2013-12-18 16:39:58 -0300, Alvaro Herrera wrote: Andres Freund wrote: It would only force serialization for transactions that modify tables covered by the assert, that doesn't seem to bad. Anything covered by an assert shoulnd't be modified frequently, otherwise you'll run into major performance problems. Well, as presented there is no way (for the system) to tell which tables are covered by an assertion, is there? That's my point. Well, the patch's syntax seems to only allow to directly specify a SQL query to check - we could iterate over the querytree to gather all related tables and reject any function we do not understand. Umm, that's really a major limitation in utility. The query can be "SELECT is_my_assertion_true()", and the function can do anything. OK, but isn't that what Andres is suggesting we reject? cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] array_length(anyarray)
On 12/18/2013 03:27 PM, Marko Tiikkaja wrote: Hi, Attached is a patch to add support for array_length(anyarray), which only works for one-dimensional arrays, returns 0 for empty arrays and complains if the array's lower bound isn't 1. In other words, does the right thing when used with the arrays people use 99% of the time. Why the heck would it complain if the lower bound isn't 1? cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] SQL assertions prototype
On 12/18/2013 11:04 PM, Andrew Dunstan wrote: On 12/18/2013 02:45 PM, Andres Freund wrote: On 2013-12-18 16:39:58 -0300, Alvaro Herrera wrote: Andres Freund wrote: It would only force serialization for transactions that modify tables covered by the assert, that doesn't seem to bad. Anything covered by an assert shoulnd't be modified frequently, otherwise you'll run into major performance problems. Well, as presented there is no way (for the system) to tell which tables are covered by an assertion, is there? That's my point. Well, the patch's syntax seems to only allow to directly specify a SQL query to check - we could iterate over the querytree to gather all related tables and reject any function we do not understand. Umm, that's really a major limitation in utility. The query can be "SELECT is_my_assertion_true()", and the function can do anything. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stats for network traffic WIP
* Robert Haas (robertmh...@gmail.com) wrote: > On Wed, Dec 18, 2013 at 8:47 AM, Stephen Frost wrote: > > Agreed. My other thought on this is that there's a lot to be said for > > having everything you need available through one tool- kinda like how > > Emacs users rarely go outside of it.. :) And then there's also the > > consideration that DBAs may not have access to the host system at all, > > or not to the level needed to do similar analysis there. > > I completely agree with this, and yet I still think we should reject > the patch, because I think the overhead is going to be intolerable. That's a fair point and I'm fine with rejecting it on the grounds that the overhead is too much. Hopefully that encourages the author to go back and review Tom's comments and consider how the overhead could be reduced or eliminated. We absolutely need better monitoring and I have had many of the same strace-involving conversations. perf is nearly out of the question as it's often not even installed and can be terribly risky (I once had to get a prod box hard-reset after running perf on it for mere moments because it never came back enough to let us do a clean restart). > I think the root of the problem is that our stats > infrastructure is a streaming pile of crap. +1 Thanks, Stephen
Re: [HACKERS] [PATCH] SQL assertions prototype
On 12/18/2013 02:45 PM, Andres Freund wrote: On 2013-12-18 16:39:58 -0300, Alvaro Herrera wrote: Andres Freund wrote: It would only force serialization for transactions that modify tables covered by the assert, that doesn't seem to bad. Anything covered by an assert shoulnd't be modified frequently, otherwise you'll run into major performance problems. Well, as presented there is no way (for the system) to tell which tables are covered by an assertion, is there? That's my point. Well, the patch's syntax seems to only allow to directly specify a SQL query to check - we could iterate over the querytree to gather all related tables and reject any function we do not understand. Umm, that's really a major limitation in utility. We need to come up with a better answer than this, which would essentially hobble the facility. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stats for network traffic WIP
On Wed, Dec 18, 2013 at 8:47 AM, Stephen Frost wrote: > Agreed. My other thought on this is that there's a lot to be said for > having everything you need available through one tool- kinda like how > Emacs users rarely go outside of it.. :) And then there's also the > consideration that DBAs may not have access to the host system at all, > or not to the level needed to do similar analysis there. I completely agree with this, and yet I still think we should reject the patch, because I think the overhead is going to be intolerable. Now, the fact is, the monitoring facilities we have in PostgreSQL today are not nearly good enough. Other products do better. I cringe every time I tell someone to attach strace to a long-running autovac process to find out what block number it's currently on, so we can estimate when it will finish; or every time we need data about lwlock contention and the only way to get it is to use perf, or recompile with LWLOCK_STATS defined. These are not fun conversations to have with customers who are in production. On the other hand, there's not much value in adding monitoring features that are going to materially harm performance, and a lot of the monitoring features that get proposed die on the vine for exactly that reason. I think the root of the problem is that our stats infrastructure is a streaming pile of crap. A number of people have worked diligently to improve it and that work has not been fruitless, but the current situation is still not very good. In many ways, this situation reminds me of the situation with EXPLAIN a few years ago. People kept proposing useful extensions to EXPLAIN which we did not adopt because they required creating (and perhaps reserving) far too many keywords. Now that we have the extensible options syntax, EXPLAIN has options for COSTS, BUFFERS, TIMING, and FORMAT, all of which have proven to be worth their weight in code, at least IMHO. 
I am really not sure what a better infrastructure for stats collection should look like, but I know that until we get one, a lot of monitoring patches that would be really nice to have are going to get shot down because of concerns about performance, and specifically stats file bloat. Fixing that problem figures to be unglamorous, but I'll buy whoever does it a beer (or another beverage of your choice). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] SQL assertions prototype
Jim Nasby wrote: > On 12/18/13, 1:42 PM, Kevin Grittner wrote: >> Jim Nasby wrote: >> >>> This is another case where it would be very useful to restrict >>> what relations a transaction (or in this case, a substransaction) >>> can access. If we had the ability to make that restriction then >>> we could force assertions that aren't plain SQL to explicitly >>> specify what tables the assert is going to hit, and if the assert >>> tries to do something different then we throw an error. >>> >>> The ability to restrict object access within a transaction would >>> also benefit VACUUM and possibly the Changeset stuff. >> >> I'm pretty sure that SSI could also optimize based on that, >> although there are probably about 10 other optimizations that would >> be bigger gains before getting to that. > > Any ideas how hard this would be? If we had a list to check against, I think it would be possible to do this during parse analysis and AcquireRewriteLocks(). (One or the other happens before query rewrite.) The hard part seems to me to be defining a sane way to get the list. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] array_length(anyarray)
Hi, Attached is a patch to add support for array_length(anyarray), which only works for one-dimensional arrays, returns 0 for empty arrays and complains if the array's lower bound isn't 1. In other words, does the right thing when used with the arrays people use 99% of the time. I'll add this to the next commit fest, but feel free to discuss it before that. Regards, Marko Tiikkaja *** a/doc/src/sgml/array.sgml --- b/doc/src/sgml/array.sgml *** *** 338,343 SELECT array_length(schedule, 1) FROM sal_emp WHERE name = 'Carol'; --- 338,354 2 (1 row) + + The single-argument overload of array_length can be used + to get the length of a one-dimensional array: + + SELECT array_length(schedule) FROM sal_emp WHERE name = 'Carol'; + + array_length + -- + 2 + (1 row) + *** a/doc/src/sgml/func.sgml --- b/doc/src/sgml/func.sgml *** *** 11096,11101 SELECT NULLIF(value, '(none)') ... --- 11096,2 + array_length(anyarray) + + + int + returns the length of the array (array must be one-dimensional) + array_length(array[1,2,3]) + 3 + + + + array_lower(anyarray, int) *** a/src/backend/utils/adt/arrayfuncs.c --- b/src/backend/utils/adt/arrayfuncs.c *** *** 1740,1745 array_length(PG_FUNCTION_ARGS) --- 1740,1779 } /* + * array_length_single: + *Returns the length of a single-dimensional array. The array must be + *single-dimensional or empty and its lower bound must be 1. 
+ */
+ Datum
+ array_length_single(PG_FUNCTION_ARGS)
+ {
+ 	ArrayType  *v = PG_GETARG_ARRAYTYPE_P(0);
+ 	int			result;
+ 	int		   *lb;
+ 	int		   *dimv;
+
+ 	/* empty array */
+ 	if (ARR_NDIM(v) == 0)
+ 		PG_RETURN_INT32(0);
+
+ 	if (ARR_NDIM(v) != 1)
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
+ 				 errmsg("input array is not single-dimensional")));
+
+ 	lb = ARR_LBOUND(v);
+ 	if (lb[0] != 1)
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
+ 				 errmsg("lower bound of array is not 1")));
+
+ 	dimv = ARR_DIMS(v);
+ 	result = dimv[0];
+ 	PG_RETURN_INT32(result);
+ }
+
+
 /*
  * array_ref :
  *	  This routine takes an array pointer and a subscript array and returns
  *	  the referenced item as a Datum. Note that for a pass-by-reference

*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
*** 840,845 ****
DATA(insert OID = 2092 (  array_upper	PGNSP PGUID 12 1 0 0 0 f f f f t f i 2
--- 840,847 ----
DESCR("array upper dimension");
DATA(insert OID = 2176 (  array_length	PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 23 "2277 23" _null_ _null_ _null_ _null_ array_length _null_ _null_ _null_ ));
DESCR("array length");
+ DATA(insert OID = 3179 (  array_length	PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 23 "2277" _null_ _null_ _null_ _null_ array_length_single _null_ _null_ _null_ ));
+ DESCR("array length");
DATA(insert OID = 378 (  array_append	PGNSP PGUID 12 1 0 0 0 f f f f f f i 2 0 2277 "2277 2283" _null_ _null_ _null_ _null_ array_push _null_ _null_ _null_ ));
DESCR("append element onto end of array");
DATA(insert OID = 379 (  array_prepend	PGNSP PGUID 12 1 0 0 0 f f f f f f i 2 0 2277 "2283 2277" _null_ _null_ _null_ _null_ array_push _null_ _null_ _null_ ));

*** a/src/include/utils/array.h
--- b/src/include/utils/array.h
*** 204,209 ****
extern Datum array_dims(PG_FUNCTION_ARGS);
--- 204,210 ----
extern Datum array_lower(PG_FUNCTION_ARGS);
extern Datum array_upper(PG_FUNCTION_ARGS);
extern Datum array_length(PG_FUNCTION_ARGS);
+ extern Datum array_length_single(PG_FUNCTION_ARGS);
extern Datum array_larger(PG_FUNCTION_ARGS);
extern Datum array_smaller(PG_FUNCTION_ARGS);
extern Datum generate_subscripts(PG_FUNCTION_ARGS);

*** a/src/test/regress/expected/arrays.out
--- b/src/test/regress/expected/arrays.out
*** 1455,1460 ****
select array_length(array[[1,2,3], [4,5,6]], 3);
--- 1455,1482 ----
 (1 row)

+ select array_length(NULL::int[]);
+  array_length
+ --------------
+
+ (1 row)
+
+ select array_length(array[1,2,3]);
+  array_length
+ --------------
+             3
+ (1 row)
+
+ select array_length('{}'::int[]);
+  array_length
+ --------------
+             0
+ (1 row)
+
+ select array_length('[2:4]={5,6,7}'::int[]);
+ ERROR:  lower bound of array is not 1
+ select array_length('{{1,2}}'::int[]);
+ ERROR:  input array is not single-dimensional
 select array_agg(uni
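To make the proposed semantics easy to eyeball outside of the backend, here is a hypothetical stand-alone C model of the rules above (this is not the patch's code; `ereport(ERROR)` is modeled as a -1 return value):

```c
#include <assert.h>

/* Hypothetical stand-alone model of the proposed one-argument
 * array_length() semantics: 0 for an empty array, an error unless the
 * array is one-dimensional with lower bound 1. */
typedef struct
{
	int ndim;       /* number of dimensions; 0 means empty array */
	int lbound;     /* lower bound of the first dimension */
	int dim0;       /* number of elements in the first dimension */
} ModelArray;

/* Returns the length, or -1 standing in for ereport(ERROR). */
static int
model_array_length_single(const ModelArray *v)
{
	if (v->ndim == 0)
		return 0;       /* '{}'::int[] -> 0 */
	if (v->ndim != 1)
		return -1;      /* "input array is not single-dimensional" */
	if (v->lbound != 1)
		return -1;      /* "lower bound of array is not 1" */
	return v->dim0;     /* array[1,2,3] -> 3 */
}
```

This mirrors the regression tests in the patch: array[1,2,3] gives 3, '{}'::int[] gives 0, and both '[2:4]={5,6,7}'::int[] and '{{1,2}}'::int[] raise errors.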
Re: [HACKERS] 9.3 regression with dbt2
Hi,

Applied your patch (but not using -fno-omit-frame-pointer). It seems to recover the perf loss: 55659.72 notpm. FWIW, the profile is below. I will do a run with the flag.

Samples: 598K of event 'cycles', Event count (approx.): 6568957160059
+   4.03%  postgres  postgres            [.] hash_search_with_hash_value
+   3.74%  postgres  postgres            [.] heap_hot_search_buffer
+   3.17%  postgres  postgres            [.] LWLockAcquire
+   2.92%  postgres  postgres            [.] _bt_compare
+   2.64%  postgres  postgres            [.] PinBuffer
+   2.62%  postgres  postgres            [.] AllocSetAlloc
+   2.34%  postgres  postgres            [.] XLogInsert
+   2.27%  postgres  postgres            [.] SearchCatCache
+   1.81%  postgres  postgres            [.] HeapTupleSatisfiesMVCC
+   1.52%  postgres  postgres            [.] ExecInitExpr
+   1.45%  postgres  postgres            [.] heap_page_prune_opt
+   1.41%  swapper   [kernel.kallsyms]   [k] intel_idle
+   1.30%  postgres  postgres            [.] LWLockRelease
+   1.12%  postgres  postgres            [.] heapgetpage
+   0.96%  postgres  libc-2.17.so        [.] _int_malloc
+   0.86%  postgres  libc-2.17.so        [.] __memcpy_ssse3_back
+   0.84%  postgres  postgres            [.] hash_any
+   0.81%  postgres  postgres            [.] MemoryContextAllocZeroAligned
+   0.76%  postgres  postgres            [.] XidInMVCCSnapshot
+   0.74%  postgres  postgres            [.] _bt_checkkeys
+   0.70%  postgres  postgres            [.] fmgr_info_cxt_security
+   0.70%  postgres  postgres            [.] FunctionCall2Coll
+   0.66%  postgres  libc-2.17.so        [.] __strncpy_sse2_unaligned
+   0.63%  postgres  postgres            [.] tbm_iterate
+   0.58%  postgres  postgres            [.] base_yyparse
+   0.58%  postgres  libc-2.17.so        [.] __printf_fp
+   0.55%  postgres  postgres            [.] palloc
+   0.51%  postgres  postgres            [.] TransactionIdPrecedes
+   0.51%  postgres  postgres            [.] slot_deform_tuple
+   0.50%  postgres  postgres            [.] ReadBuffer_common
Re: [HACKERS] SQL objects UNITs
On 12/18/13, 4:22 AM, Dimitri Fontaine wrote:
> ALTER UNIT name SET SCHEMA ;

FWIW, with the "units" that we've developed we use schemas to differentiate between public objects and "internal" (private or protected) objects. So single-schema stuff becomes a PITA. Of course, since extensions already work that way I suppose that ship has sailed, but I thought I'd mention it.

For those who are curious... we make the distinction of public/protected/private via schemas because we don't want general users to need to wade through that stuff when looking at objects. So the convention we settled on is that public objects go in one schema, protected objects go in a schema of the same name that's prepended with "_", and private objects are in the protected schema but also prepend "_" to their names. IE:

CREATE SCHEMA awesome_feature;
CREATE VIEW awesome_feature.have_some_data

CREATE SCHEMA _awesome_feature; -- Protected / private stuff
CREATE VIEW _awesome_feature.stuff_for_database_code_to_see_but_not_users
CREATE FUNCTION _awesome_feature._do_not_run_this_function_anywhere_outside_of_awesome_feature()

--
Jim C. Nasby, Data Architect
j...@nasby.net
512.569.9461 (cell)
http://jim.nasby.net

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.3 regression with dbt2
HI On Wed, Dec 18, 2013 at 3:03 PM, Andres Freund wrote: > > That looks like a postgres compiled without > -fno-omit-frame-pointer. Without that hierarchical profiles are > meaningless. Very new perfs can also do it using dwarf, but it's unusabl > slow... > > Let me complete current test without the flag (to serve as a checkpoint as well). I will do a run with the flag after. Thanks, Dong
Re: [HACKERS] 9.3 regression with dbt2
Hi,

On 2013-12-18 14:59:45 -0500, Dong Ye wrote:
> ~20 minutes each run with binary.
> Try your patch now..
> You are right I used -g in perf record. But what I reported was flat (meant
> as a start).
>
> Expand GetMultiXactIdMembers:
>
> 3.82% postgres postgres [.] GetMultiXactIdMembers

That looks like a postgres compiled without -fno-omit-frame-pointer. Without that, hierarchical profiles are meaningless. Very new versions of perf can also do it using DWARF, but it's unusably slow...

Greetings,

Andres Freund

--
Andres Freund	http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
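For anyone reproducing the profile Andres is asking for, a rough sketch of the suggested workflow (the build flags and perf invocations below are illustrative assumptions, not commands from the thread):

```shell
# Rebuild PostgreSQL with frame pointers so perf can walk call stacks
./configure CFLAGS="-O2 -fno-omit-frame-pointer"
make && make install

# Record call graphs system-wide for a while, then inspect callers
perf record -g -a -- sleep 60
perf report            # expand GetMultiXactIdMembers to see its callers

# On newer perf, DWARF-based unwinding works without the rebuild,
# but is considerably slower:
# perf record -g --call-graph dwarf -a -- sleep 60
```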
Re: [HACKERS] 9.3 regression with dbt2
~20 minutes each run with binary. Try your patch now..

You are right, I used -g in perf record, but what I reported was flat (meant as a start).

Expand GetMultiXactIdMembers:

 3.82%  postgres  postgres  [.] GetMultiXactIdMembers
        |
        |--9.09%-- GetMultiXactIdMembers
        |
        |--0.84%-- 0x48fb894853f58948
        |
        |--0.74%-- 0x4296e0004296c
        |          GetMultiXactIdMembers
        |
        |--0.64%-- 0x52f8d00052f8d
        |          GetMultiXactIdMembers
        |
        |--0.64%-- 0xf6ce8000f6ce8
        |          GetMultiXactIdMembers
        |
        |--0.62%-- 0x41de300041de1
        |          GetMultiXactIdMembers
        |
        |--0.61%-- 0xf2c77000f2c71
        |          GetMultiXactIdMembers
        |
        |--0.60%-- 0x3127700031275
        |          GetMultiXactIdMembers
        |
        |--0.59%-- 0x10c98b0010c987
        |          GetMultiXactIdMembers
        |
        |--0.59%-- 0x31df31df0
        |          GetMultiXactIdMembers
        |
        |--0.59%-- 0xbefbd000befbd
        |          GetMultiXactIdMembers
        |
        |--0.58%-- 0xfe97c000fe976
        |          GetMultiXactIdMembers
        |
        |--0.58%-- 0x82501000824f9
        |          GetMultiXactIdMembers
        |
        |--0.58%-- 0x3a4410003a43c
        |          GetMultiXactIdMembers
        |
        |--0.58%-- 0x3b0cf0003b0c3
        |          GetMultiXactIdMembers
        |
        |--0.58%-- 0x5325f0005325b
        |          GetMultiXactIdMembers
        |
        |--0.58%-- 0x7b6b80007b6b8
        |          GetMultiXactIdMembers
        |
        |--0.57%-- 0x52e9b00052e9b
        |          GetMultiXactIdMembers
        |
        |--0.57%-- 0xf3d45000f3d40
        |          GetMultiXactIdMembers
        |
        |--0.57%-- 0x27afd00027afa
        |          GetMultiXactIdMembers
        |
        |--0.57%-- 0x3244d0003244d
        |          GetMultiXactIdMembers
        |
        |--0.56%-- 0x53e0d00053e06
        |          GetMultiXactIdMembers
        |
        |--0.56%-- 0xb64c6000b64bc
        |          GetMultiXactIdMembers
        |
        |--0.56%-- 0x423f1000423ef
        |          GetMultiXactIdMembers
        |
        |--0.56%-- 0xc18f2000c18ed
        |          GetMultiXactIdMembers
        |
        |--0.56%-- 0x6bdcf0006bdcd
        |          GetMultiXactIdMembers
        |
        |--0.55%-- 0xc6d25000c6d25
        |          GetMultiXactIdMembers
        |
        |--0.55%-- 0xf6534000f6534
        |          GetMultiXactIdMembers
        |
        |--0.55%-- 0x10bba80010bba0
        |          GetMultiXactIdMembers
        |
        |--0.55%-- 0xb5a76000b5a6e
        |          GetMultiXactIdMembers
        |
        |--0.55%-- 0x2d3c10002d3b5
        |          GetMultiXactIdMembers
        |
        |--0.55%-- 0xcc095000cc095
        |          GetMultiXactIdMembers
        |
Re: [HACKERS] [PATCH] SQL assertions prototype
On 12/18/13, 1:42 PM, Kevin Grittner wrote:
> Jim Nasby wrote:
>> This is another case where it would be very useful to restrict
>> what relations a transaction (or in this case, a subtransaction)
>> can access. If we had the ability to make that restriction then
>> we could force assertions that aren't plain SQL to explicitly
>> specify what tables the assert is going to hit, and if the assert
>> tries to do something different then we throw an error.
>>
>> The ability to restrict object access within a transaction would
>> also benefit VACUUM and possibly the Changeset stuff.
>
> I'm pretty sure that SSI could also optimize based on that,
> although there are probably about 10 other optimizations that would
> be bigger gains before getting to that.

Any ideas how hard this would be?

My thought is that we might be able to perform this check in the functions that do catalog lookups, but I'm worried that that wouldn't allow us to support subtransaction checks (which we'd need for assertions), and it runs the risk of long-lasting object references spanning the transaction (or subtransaction) and thereby thwarting the check.

Another option would be in heap accessor functions, but that means we could only restrict access to tables. For assertions, it would be nice to also disallow access to functions that could have unintended consequences that could violate the assertion (like dblink).

--
Jim C. Nasby, Data Architect
j...@nasby.net
512.569.9461 (cell)
http://jim.nasby.net

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] SQL assertions prototype
On 2013-12-18 16:39:58 -0300, Alvaro Herrera wrote:
> Andres Freund wrote:
>> It would only force serialization for transactions that modify tables
>> covered by the assert, that doesn't seem too bad. Anything covered by an
>> assert shouldn't be modified frequently, otherwise you'll run into major
>> performance problems.
>
> Well, as presented there is no way (for the system) to tell which tables
> are covered by an assertion, is there? That's my point.

Well, the patch's syntax seems to only allow directly specifying a SQL query to check - we could iterate over the query tree to gather all related tables and reject any function we do not understand.

Greetings,

Andres Freund

--
Andres Freund	http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
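The iteration Andres describes could look roughly like the following toy model. This is a hypothetical stand-alone sketch, not PostgreSQL's real Query/RangeTblEntry structures: walk the tree, collect plain relations, and bail out on any node kind (such as a function call) that we cannot prove safe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node kinds standing in for RTE_RELATION and friends. */
typedef enum { NODE_RELATION, NODE_SUBQUERY, NODE_FUNCTION } NodeKind;

typedef struct QueryNode
{
	NodeKind kind;
	unsigned relid;                 /* valid for NODE_RELATION */
	struct QueryNode *children[4];  /* NULL-terminated, for NODE_SUBQUERY */
} QueryNode;

/* Collect relation ids into relids[]; return false on any node we do
 * not understand, which would make the assertion rejected/misdeclared. */
static bool
collect_relids(const QueryNode *node, unsigned *relids, int *nrelids, int max)
{
	if (node == NULL)
		return true;
	switch (node->kind)
	{
		case NODE_RELATION:
			if (*nrelids < max)
				relids[(*nrelids)++] = node->relid;
			return true;
		case NODE_SUBQUERY:
			for (int i = 0; i < 4 && node->children[i] != NULL; i++)
				if (!collect_relids(node->children[i], relids, nrelids, max))
					return false;
			return true;
		default:
			return false;   /* reject functions we cannot analyze */
	}
}
```

A real implementation would instead recurse through the parsed query's range table and flag anything (like a dblink call) whose table footprint is unknowable.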
Re: [HACKERS] [PATCH] SQL assertions prototype
Jim Nasby wrote:

> This is another case where it would be very useful to restrict
> what relations a transaction (or in this case, a subtransaction)
> can access. If we had the ability to make that restriction then
> we could force assertions that aren't plain SQL to explicitly
> specify what tables the assert is going to hit, and if the assert
> tries to do something different then we throw an error.
>
> The ability to restrict object access within a transaction would
> also benefit VACUUM and possibly the Changeset stuff.

I'm pretty sure that SSI could also optimize based on that, although there are probably about 10 other optimizations that would be bigger gains before getting to that.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] SQL assertions prototype
Andres Freund wrote:
> On 2013-12-18 13:44:15 -0300, Alvaro Herrera wrote:
>> Heikki Linnakangas wrote:
>>
>>> Ah, I see. You don't need to block anyone else from modifying the
>>> table, you just need to block anyone else from committing a
>>> transaction that had modified the table. So the locks shouldn't
>>> interfere with regular table locks. A ShareUpdateExclusiveLock on
>>> the assertion should do it.
>>
>> Causing serialization of transaction commit just because a single
>> assertion exists in the database seems too much of a hit, so looking for
>> optimization opportunities seems appropriate.
>
> It would only force serialization for transactions that modify tables
> covered by the assert, that doesn't seem too bad. Anything covered by an
> assert shouldn't be modified frequently, otherwise you'll run into major
> performance problems.

Well, as presented there is no way (for the system) to tell which tables are covered by an assertion, is there? That's my point.

--
Álvaro Herrera	http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] SQL assertions prototype
On 12/18/13, 10:44 AM, Alvaro Herrera wrote:
> It might prove useful to note that any given assertion involves tables
> A, B, C. If a transaction doesn't modify any of those tables then
> there's no need to run the assertion when the transaction commits,
> skipping acquisition of the assertion lock. Probably this can only be
> implemented for SQL-language assertions, but even so it might be worth
> considering. (Assertions in any other language would be checked by
> every transaction.)

This is another case where it would be very useful to restrict what relations a transaction (or in this case, a subtransaction) can access. If we had the ability to make that restriction then we could force assertions that aren't plain SQL to explicitly specify what tables the assert is going to hit, and if the assert tries to do something different then we throw an error.

The ability to restrict object access within a transaction would also benefit VACUUM and possibly the Changeset stuff. From http://www.postgresql.org/message-id/52acaafd.6060...@nasby.net:

"This would be useful when you have some tables that see very frequent updates/deletes in a database that also has to support long-running transactions that don't hit those tables. You'd explicitly limit the tables your long-running transaction will touch and that way vacuum can ignore the long-running XID when calculating minimum tuple age for the heavy-hit tables."

--
Jim C. Nasby, Data Architect
j...@nasby.net
512.569.9461 (cell)
http://jim.nasby.net

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] SQL assertions prototype
On 12/18/13, 11:57 AM, Josh Berkus wrote:
>> On 12/18/2013 08:44 AM, Alvaro Herrera wrote:
>> Another thought: at the initial run of the assertion, note which tables
>> it locked, and record this as an OID array in the catalog row for the
>> assertion; consider running the assertion only when those tables are
>> touched. This doesn't work if the assertion code locks some tables when
>> run under certain conditions and other tables under different
>> conditions. But then this can be checked too: if an assertion lists in
>> its catalog row that it involves tables A, B, C and then, under
>> different conditions, it tries to acquire lock on table D, have the
>> whole thing fail indicating that the assertion is misdeclared.
>
> This sounds like you're re-inventing SSI. SERIALIZABLE mode *exists* in
> order to be able to enforce constraints which potentially involve more
> than one transaction. "Balance can never go below 0", for example. The
> whole reason we have this really cool and unique SSI mode is so that we
> can do such things without killing performance.

These sorts of requirements are ideally suited to Assertions, so it's logically consistent to require Serializable mode in order to use Assertions.

The flip-side is that now you can get serialization failures, and I think there's a ton of software that has no clue how to deal with that. So now you don't get to use assertions at all unless you re-engineer your application (but see below). Now, if Postgres could re-attempt serialization failures, maybe that would become a non-issue... (though, there's probably a lot of dangers in doing that...)

> This is consistent with how we treat the interaction of constraints and
> triggers; under some circumstances, we allow triggers to violate CHECK
> and FK constraints.

We do? Under what circumstances?

> Alternately, we add a GUC assertion_serializable_mode, which can be
> "off", "warn" or "error". If it's set to "error", and the user triggers
> an assertion while in READ COMMITTED mode, an exception occurs. If it's
> set to "off", then assertions are disabled, in order to deal with buggy
> assertions. Now, it would be even better if we could prevent users from
> switching transaction mode, but that's a MUCH bigger and more
> complicated patch.

Another possibility is to allow for two different types of assertions, one based on SSI and one based on locking.

--
Jim C. Nasby, Data Architect
j...@nasby.net
512.569.9461 (cell)
http://jim.nasby.net

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] SQL assertions prototype
On 2013-12-18 13:44:15 -0300, Alvaro Herrera wrote:
> Heikki Linnakangas wrote:
>
>> Ah, I see. You don't need to block anyone else from modifying the
>> table, you just need to block anyone else from committing a
>> transaction that had modified the table. So the locks shouldn't
>> interfere with regular table locks. A ShareUpdateExclusiveLock on
>> the assertion should do it.
>
> Causing serialization of transaction commit just because a single
> assertion exists in the database seems too much of a hit, so looking for
> optimization opportunities seems appropriate.

It would only force serialization for transactions that modify tables covered by the assert, that doesn't seem too bad. Anything covered by an assert shouldn't be modified frequently, otherwise you'll run into major performance problems.

Greetings,

Andres Freund

--
Andres Freund	http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.3 regression with dbt2
Hello,

On 2013-12-18 10:24:56 -0800, Dong Ye wrote:
> It seems that 0ac5ad5134f2769ccbaefec73844f8504c4d6182 is the culprit
> commit.

How long does a run take to verify the problem? Could you retry with the patch attached to http://www.postgresql.org/message-id/20131201114514.gg18...@alap2.anarazel.de ? Based on the theory that it creates many superfluous multixacts.

> Flat perf profiles of two such runs look like:

Those aren't really flat profiles tho ;)

> 0ac:
>
> Samples: 706K of event 'cycles', Event count (approx.): 6690377376522
>
> + 3.82% postgres postgres [.] GetMultiXactIdMembers

Could you expand that one some levels, so we see the callers?

Greetings,

Andres Freund

--
Andres Freund	http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] 9.3 regression with dbt2
Hi,

We recently observed a ~15% performance regression with dbt2 from PG 9.3. We narrowed it down by testing master between the 9.2 cut and the 9.3 cut. It seems that 0ac5ad5134f2769ccbaefec73844f8504c4d6182 is the culprit commit. We did several runs and perf profiling comparing it against its parent (f925c79b9f36c54b67053ade5ad225a75b8dc803). We also tested a 12/16 commit on the master (3b97e6823b949624afdc3ce4c92b29a80429715f) once; it performed similarly to 0ac.

Regards,
Dong

Results:
f92: 53k-56k'ish notpm
0ac: 47k-48k'ish notpm

Server SUT:
HP ML350 G6
Two Xeon E5520 (4c/p, 8 cores total, hyper-threading disabled)
12GB DRAM
HP P410i RAID controller (256MB battery-backed cache)
- three 10k-rpm SAS: /
- three 10k-rpm SAS: /pgdata
- one 15k-rpm SAS: /pgxlog
- ext4 (rw,relatime,data=ordered) on all mounts.
Fedora 19 (3.11.10-200.fc19.x86_64)

max_connections=100
shared_buffers=8192MB
effective_cache_size=10GB
temp_buffers=8186kB
work_mem=4093kB
maintenance_work_mem=399MB
wal_buffers=-1
checkpoint_segments=300
checkpoint_completion_target=0.9
logging_collector=on
log_timezone=UTC
datestyle='iso, mdy'
lc_messages=C
lc_monetary=C
lc_numeric=C
lc_time=C
default_text_search_config='pg_catalog.english'
listen_addresses='*'
log_destination=csvlog
log_directory=pg_log
log_filename='pg-%a'
log_rotation_age=1440
log_truncate_on_rotation=on

Client and workload:
Dell 390. Two core. Direct connect with the Server SUT.
dbt2 (ToT)
40 warehouse
8 terminals, 8 connections
zero think/key time
12-min run

Flat perf profiles of two such runs look like:

f92:

Samples: 608K of event 'cycles', Event count (approx.): 6679607097416
+   4.04%  postgres  postgres            [.] heap_hot_search_buffer
+   3.63%  postgres  postgres            [.] AllocSetAlloc
+   3.37%  postgres  postgres            [.] hash_search_with_hash_value
+   2.85%  postgres  postgres            [.] _bt_compare
+   2.67%  postgres  postgres            [.] SearchCatCache
+   2.46%  postgres  postgres            [.] LWLockAcquire
+   2.16%  postgres  postgres            [.] XLogInsert
+   2.08%  postgres  postgres            [.] PinBuffer
+   1.32%  postgres  postgres            [.] ExecInitExpr
+   1.31%  postgres  libc-2.17.so        [.] _int_malloc
+   1.29%  swapper   [kernel.kallsyms]   [k] intel_idle
+   1.23%  postgres  postgres            [.] MemoryContextAllocZeroAligned
+   1.13%  postgres  postgres            [.] heap_page_prune_opt
+   1.06%  postgres  libc-2.17.so        [.] __memcpy_ssse3_back
+   1.02%  postgres  postgres            [.] LWLockRelease
+   0.94%  postgres  postgres            [.] copyObject
+   0.89%  postgres  postgres            [.] fmgr_info_cxt_security
+   0.82%  postgres  postgres            [.] _bt_checkkeys
+   0.81%  postgres  postgres            [.] hash_any
+   0.73%  postgres  postgres            [.] FunctionCall2Coll
+   0.69%  postgres  libc-2.17.so        [.] __strncpy_sse2_unaligned
+   0.67%  postgres  postgres            [.] HeapTupleSatisfiesMVCC
+   0.66%  postgres  postgres            [.] MemoryContextAlloc
+   0.65%  postgres  postgres            [.] expression_tree_walker
+   0.59%  postgres  postgres            [.] check_stack_depth
+   0.57%  postgres  libc-2.17.so        [.] __printf_fp
+   0.56%  postgres  libc-2.17.so        [.] _int_free
+   0.52%  postgres  postgres            [.] base_yyparse

0ac:

Samples: 706K of event 'cycles', Event count (approx.): 6690377376522
+   3.82%  postgres  postgres            [.] GetMultiXactIdMembers
+   3.43%  postgres  postgres            [.] LWLockAcquire
+   3.31%  postgres  postgres            [.] hash_search_with_hash_value
+   3.09%  postgres  postgres            [.] heap_hot_search_buffer
+   3.00%  postgres  postgres            [.] AllocSetAlloc
+   2.56%  postgres  postgres            [.] _bt_compare
+   2.19%  postgres  postgres            [.] PinBuffer
+   2.13%  postgres  postgres            [.] SearchCatCache
+   1.99%  postgres  postgres            [.] XLo
Re: [HACKERS] [PATCH] SQL assertions prototype
On 12/18/2013 08:44 AM, Alvaro Herrera wrote:
> Another thought: at the initial run of the assertion, note which tables
> it locked, and record this as an OID array in the catalog row for the
> assertion; consider running the assertion only when those tables are
> touched. This doesn't work if the assertion code locks some tables when
> run under certain conditions and other tables under different
> conditions. But then this can be checked too: if an assertion lists in
> its catalog row that it involves tables A, B, C and then, under
> different conditions, it tries to acquire lock on table D, have the
> whole thing fail indicating that the assertion is misdeclared.

This sounds like you're re-inventing SSI. SERIALIZABLE mode *exists* in order to be able to enforce constraints which potentially involve more than one transaction. "Balance can never go below 0", for example. The whole reason we have this really cool and unique SSI mode is so that we can do such things without killing performance.

I'm leaning towards the alternative that Assertions require SERIALIZABLE mode, and throw a WARNING at the user and the log every time we create, modify, or trigger an assertion while not in SERIALIZABLE mode. And beyond that, we don't guarantee the integrity of Assertions if people choose to run in READ COMMITTED anyway. This is consistent with how we treat the interaction of constraints and triggers; under some circumstances, we allow triggers to violate CHECK and FK constraints.

Alternately, we add a GUC assertion_serializable_mode, which can be "off", "warn" or "error". If it's set to "error", and the user triggers an assertion while in READ COMMITTED mode, an exception occurs. If it's set to "off", then assertions are disabled, in order to deal with buggy assertions.

Now, it would be even better if we could prevent users from switching transaction mode, but that's a MUCH bigger and more complicated patch.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_rewarm status
On Tue, Dec 17, 2013 at 7:05 PM, Cédric Villemain wrote: > Le mardi 17 décembre 2013 21:14:44, Josh Berkus a écrit : >> On 12/17/2013 06:34 AM, Robert Haas wrote: >> > On Tue, Dec 17, 2013 at 12:09 AM, Amit Kapila > wrote: >> >> I have used pg_prewarm during some of work related to Buffer Management >> >> and other performance related work. It is quite useful utility. >> >> +1 for reviving this patch for 9.4 >> > >> > Any other votes? >> >> I still support this patch (as I did originally), and don't think that >> the overlap with pgFincore is of any consequence. pgFincore does more >> than pgrewarm ever will, but it's also platform-specific, so it still >> makes sense for both to exist. > > Just for information, pgFincore is NOT limited to linux (the most interesting > part, the memory snapshot, works also on BSD based kernels with mincore() > syscall). > Like for the PostgreSQL effective_io_concurrency (and pg_warm) it just doesn't > work when posix_fadvise is not available. This is a fair point, and I should not have implied that it was Linux-only. What I really meant was "does not support Windows", because that's a really big part of our user base. However, I shouldn't have phrased it in a way that slights BSD and other UNIX variants. Now that we have dynamic background workers, I've been thinking that it might be possible to write a background worker to do asynchronous prefetch on systems where we don't have OS support. We could store a small ring buffer in shared memory, say big enough for 1k entries. Each entry would consist of a relfilenode, a starting block number, and a block count. We'd also store a flag indicating whether the prefetch worker has been registered with the postmaster, and a pointer to the PGPROC of any running worker. When a process wants to do a prefetch, it locks the buffer, adds its prefetch request to the queue (overwriting the oldest existing request if the queue is already full), and checks the flag. 
If the flag is not set, it also registers the background worker. Then, it releases the lock and sets the latch of any running worker (whose PGPROC it remembered before releasing the lock). When the prefetch process starts up, it services requests from the queue by reading the requested blocks (or block ranges). When the queue is empty, it sleeps. If it receives no requests for some period of time, it unregisters itself and exits. This is sort of a souped-up version of the hibernation facility we already have for some auxiliary processes, in that we don't just make the process sleep for a longer period of time but actually get rid of it altogether. All of this might be overkill; we could also do it with a permanent auxiliary process. But it's sort of a shame to run an extra process for a facility that might never get used, or might be used only rarely. And I'm wary of cluttering things up with a thicket of auxiliary processes each of which caters only to a very specific, very narrow situation. Anyway, just thinking out loud here. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
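The queue behavior described above (a fixed ~1k-entry ring where a full queue overwrites the oldest request) can be sketched as follows. This is a hypothetical stand-alone model, leaving out the shared-memory placement, locking, worker registration, and latch signaling:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PREFETCH_QUEUE_SIZE 1024

/* One prefetch request: which relation, where to start, how much. */
typedef struct
{
	uint32_t relfilenode;   /* which relation */
	uint32_t start_block;   /* first block to read */
	uint32_t block_count;   /* number of consecutive blocks */
} PrefetchRequest;

typedef struct
{
	PrefetchRequest entries[PREFETCH_QUEUE_SIZE];
	int head;      /* next slot to write */
	int count;     /* valid entries, capped at PREFETCH_QUEUE_SIZE */
} PrefetchQueue;

/* Add a request; if the queue is full, the oldest request is lost. */
static void
prefetch_enqueue(PrefetchQueue *q, PrefetchRequest req)
{
	q->entries[q->head] = req;
	q->head = (q->head + 1) % PREFETCH_QUEUE_SIZE;
	if (q->count < PREFETCH_QUEUE_SIZE)
		q->count++;
}

/* Worker side: take the oldest request, or return false and sleep. */
static bool
prefetch_dequeue(PrefetchQueue *q, PrefetchRequest *out)
{
	if (q->count == 0)
		return false;
	/* The oldest entry sits 'count' slots behind the write position. */
	int tail = (q->head - q->count + PREFETCH_QUEUE_SIZE) % PREFETCH_QUEUE_SIZE;
	*out = q->entries[tail];
	q->count--;
	return true;
}
```

In the design sketched in the mail, a backend would take a lock before calling the enqueue step, set the worker's latch afterwards, and register the background worker if the "worker registered" flag is not yet set.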
Re: [HACKERS] Dynamic Shared Memory stuff
On Tue, Dec 10, 2013 at 6:26 PM, Tom Lane wrote:
> Noah Misch writes:
>> On Tue, Dec 10, 2013 at 07:50:20PM +0200, Heikki Linnakangas wrote:
>>> Let's not add more cases like that, if we can avoid it.
>>
>> Only if we can avoid it for a modicum of effort and feature compromise.
>> You're asking for PostgreSQL to reshape its use of persistent resources so you
>> can throw around "killall -9 postgres; rm -rf $PGDATA" without so much as a
>> memory leak. That use case, not PostgreSQL, has the defect here.
>
> The larger point is that such a shutdown process has never in the history
> of Postgres been successful at removing shared-memory (or semaphore)
> resources. I do not feel a need to put a higher recoverability standard
> onto the DSM code than what we've had for fifteen years with SysV shmem.
>
> But, by the same token, it shouldn't be any *less* recoverable. In this
> context that means that starting a new postmaster should recover the
> resources, at least 90% of the time (100% if we still have the old
> postmaster lockfile). I think the idea of keeping enough info in the
> SysV segment to permit recovery of DSM resources is a good one. Then,
> any case where the existing logic is able to free the old SysV segment
> is guaranteed to also work for DSM.

So I'm taking a look at this. There doesn't appear to be anything intrinsically intractable here, but it seems important to time the order of operations so as to minimize the chances that things fail in awkward places. The point where we remove the old shared memory segment from the previous postmaster invocation is here:

	/*
	 * The segment appears to be from a dead Postgres process, or from a
	 * previous cycle of life in this same process. Zap it, if possible.
	 * This probably shouldn't fail, but if it does, assume the segment
	 * belongs to someone else after all, and continue quietly.
	 */
	shmdt(memAddress);
	if (shmctl(shmid, IPC_RMID, NULL) < 0)
		continue;

My first thought was to remember the control segment ID from the header just before the shmdt() there, and then later when the DSM module is starting, do cleanup. But that doesn't seem like a good idea, because then if startup fails after we remove the old shared memory segment and before we get to the DSM initialization code, we'll have lost the information on what control segment needs to be cleaned up. A subsequent startup attempt won't see the old shm again, because it's already gone. I'm fairly sure that this would be a net reduction in reliability vs. the status quo.

So now what I'm thinking is that we ought to actually perform the DSM cleanup before detaching the old segment and trying to remove it. That shouldn't fail, but if it does, or if we get killed before completing it, the next run will hopefully find the same old shm and finish the cleanup. But that kind of flies in the face of the comment above: if we perform DSM cleanup and then discover that the segment wasn't ours after all, that means we just stomped all over somebody else's stuff. That's not too good. But trying to remove the segment first and then perform the cleanup creates a window where, if we get killed, the next restart will have lost information about how to finish cleaning up. So it seems that some kind of logic tweak is needed here, but I'm not sure exactly what. As I see it, the options are:

1. Make failure to remove the shared memory segment we thought was ours an error. This will definitely show up any problems, but only after we've burned down some other process's dynamic shared memory segments. The most likely outcome is creating failure-to-start problems that don't exist today.

2. Make it a warning. We'll still burn down somebody else's DSMs, but at least we'll still start ourselves. Sadly, "WARNING: You have just done a bad thing. It's too late to fix it. Sorry!" is not very appealing.

3. Remove the old shared memory segment first, then perform the cleanup immediately afterwards. If we get killed before completing the cleanup, we'll leak the un-cleaned-up stuff. Decide that's OK and move on.

4. Adopt some sort of belt-and-suspenders approach, keeping the state file we have now and backstopping it with this mechanism, so that we really only need this to work when $PGDATA has been blown away and recreated. This seems pretty inelegant, and I'm not sure who it'll benefit other than those (few?) people who kill -9 the postmaster (or it segfaults or otherwise dies without running the code to remove shared memory segments) and then remove $PGDATA and then re-initdb and then expect cleanup to find the old DSMs.

5. Give up on this approach. We could keep what we have now, or make the DSM control segment land at a well-known address as we do for the main segment.

What do people prefer?

The other thing I'm a bit squidgy about is injecting more code that runs before we get the new shared memory segment up and r
Re: [HACKERS] pg_rewarm status
Le mardi 17 décembre 2013 21:14:44, Josh Berkus a écrit : > On 12/17/2013 06:34 AM, Robert Haas wrote: > > On Tue, Dec 17, 2013 at 12:09 AM, Amit Kapila wrote: > >> I have used pg_prewarm during some of work related to Buffer Management > >> and other performance related work. It is quite useful utility. > >> +1 for reviving this patch for 9.4 > > > > Any other votes? > > I still support this patch (as I did originally), and don't think that > the overlap with pgFincore is of any consequence. pgFincore does more > than pgrewarm ever will, but it's also platform-specific, so it still > makes sense for both to exist. Just for information, pgFincore is NOT limited to linux (the most interesting part, the memory snapshot, works also on BSD based kernels with mincore() syscall). Like for the PostgreSQL effective_io_concurrency (and pg_warm) it just doesn't work when posix_fadvise is not available. -- Cédric Villemain +33 (0)6 20 30 22 52 http://2ndQuadrant.fr/ PostgreSQL: Support 24x7 - Développement, Expertise et Formation signature.asc Description: This is a digitally signed message part.
Re: [HACKERS] pg_rewarm status
Le mardi 17 décembre 2013 17:45:51, Robert Haas a écrit :
> On Tue, Dec 17, 2013 at 11:02 AM, Jim Nasby wrote:
>> On 12/17/13, 8:34 AM, Robert Haas wrote:
>>> On Tue, Dec 17, 2013 at 12:09 AM, Amit Kapila wrote:
>>>> I have used pg_prewarm during some work related to Buffer Management and
>>>> other performance related work. It is quite a useful utility.
>>>> +1 for reviving this patch for 9.4
>>>
>>> Any other votes?
>>
>> We've had to manually code something that runs EXPLAIN ANALYZE SELECT *
>> from a bunch of tables to warm our caches after a restart, but there are
>> numerous flaws to that approach obviously.
>>
>> Unfortunately, what we really need to warm isn't the PG buffers, it's the
>> FS cache, which I suspect this won't help. But I still see where just
>> pg_buffers would be useful for a lot of folks, so +1.
>
> It'll do either one. For the FS cache, on Linux, you can also use
> pgfincore.

On Linux, and *BSD (including OS X), like what's in PostgreSQL. Only Windows is out of scope so far. There is a solution for Windows too; there has just been no demand for it from pgfincore users. Maybe you can add Windows support in PostgreSQL now?

-- Cédric Villemain +33 (0)6 20 30 22 52 http://2ndQuadrant.fr/ PostgreSQL: Support 24x7 - Développement, Expertise et Formation
Re: [HACKERS] RFC: Async query processing
On 11/04/2013 02:51 AM, Claudio Freire wrote:
> On Sun, Nov 3, 2013 at 3:58 PM, Florian Weimer wrote:
>> I would like to add truly asynchronous query processing to libpq,
>> enabling command pipelining. The idea is to allow applications to
>> auto-tune to the bandwidth-delay product and reduce the number of
>> context switches when running against a local server.
>> ...
>> If the application is not interested in intermediate query results, it
>> would use something like this: ... If there is no need to exit from the
>> loop early (say, because errors are expected to be extremely rare), the
>> PQgetResultNoWait call can be left out.
>
> It doesn't seem wise to me to make such a distinction. It sounds like
> you're oversimplifying, and that's why you need "modes", to overcome the
> evidently restrictive limits of the simplified interface, and it would
> only be a matter of (a short) time before some other limitation requires
> some other mode.

I need modes because I want to avoid unbounded buffering, which means that result data has to be consumed in the order queries are issued.

>>     PGAsyncMode oldMode = PQsetsendAsyncMode(conn, PQASYNC_RESULT);
>>     bool more_data;
>>     do {
>>         more_data = ...;
>>         if (more_data) {
>>             int ret = PQsendQueryParams(conn,
>>                 "INSERT ... RETURNING ...", ...);
>>             if (ret == 0) {
>>                 // handle low-level error
>>             }
>>         }
>>         // Consume all pending results.
>>         while (1) {
>>             PGresult *res;
>>             if (more_data) {
>>                 res = PQgetResultNoWait(conn);
>>             } else {
>>                 res = PQgetResult(conn);
>>             }
>
> Somehow, that code looks backwards. I mean, really backwards. Wouldn't
> that be !more_data?

No, if more data is available to transfer to the server, the no-wait variant has to be used to avoid a needless synchronization with the server.

> In any case, pipelining like that, without a clear distinction, in the
> wire protocol, of which results pertain to which query, could be a recipe
> for trouble when subtle bugs, either in lib usage or implementation,
> mistakenly treat one query's result as another's.
We already use pipelining in libpq (see pqFlush, PQsendQueryGuts and pqParseInput3), the server is supposed to support it, and there is a lack of a clear tit-for-tat response mechanism anyway because of NOTIFY/LISTEN and the way certain errors are reported.

Instead of buffering the results, we could buffer the encoded command messages in PQASYNC_RESULT mode. This means that PQsendQueryParams would not block when it cannot send the (complete) command message, but store it in the connection object so that the subsequent PQgetResultNoWait and PQgetResult would send it.

This might work better with single-tuple result mode. We cannot avoid buffering either multiple queries or multiple responses if we want to utilize the link bandwidth, or we'd risk deadlocks.

This is a non-solution. Such an implementation, at least as described, would remove neither network latency nor context switches; it would be a pure API change with no externally visible behavior change.

Ugh, why?

An effective solution must include multi-command packets. Without knowing the wire protocol in detail, something like:

    PARSE: INSERT blah
    BIND: args
    EXECUTE with DISCARD
    PARSE: INSERT blah
    BIND: args
    EXECUTE with DISCARD
    PARSE: SELECT blah
    BIND: args
    EXECUTE with FETCH ALL

All in one packet, would be efficient and error-free (IMO).

No, because this doesn't scale automatically with the bandwidth-delay product. It also requires that the client buffer queries and their parameters even though the network has to do that anyway. In any case, I don't want to change the wire protocol, I just want to enable libpq clients to use more of its capabilities.

-- Florian Weimer / Red Hat Product Security Team

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.3 reference constraint regression
Alvaro Herrera wrote:
> Andres Freund wrote:
>
>> I have to say, it makes me really uncomfortable to take such
>> shortcuts. In other branches we are doing liveliness checks with
>> MultiXactIdIsRunning() et al. Why isn't that necessary here? And how
>> likely is that this won't cause breakage somewhere down the line because
>> somebody doesn't know of that subtlety?
>
> I came up with the attached last night, which should do the right thing
> in both cases.

That one had a silly bug, which I fixed and pushed now. Thanks for the feedback,

-- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] [PATCH] SQL assertions prototype
Heikki Linnakangas wrote:
> Ah, I see. You don't need to block anyone else from modifying the
> table, you just need to block anyone else from committing a
> transaction that had modified the table. So the locks shouldn't
> interfere with regular table locks. A ShareUpdateExclusiveLock on
> the assertion should do it.

Causing serialization of transaction commit just because a single assertion exists in the database seems too much of a hit, so looking for optimization opportunities seems appropriate. Here are some ideas for brainstorming.

It might prove useful to note that any given assertion involves tables A, B, C. If a transaction doesn't modify any of those tables then there's no need to run the assertion when the transaction commits, skipping acquisition of the assertion lock. Probably this can only be implemented for SQL-language assertions, but even so it might be worth considering. (Assertions in any other language would be checked by every transaction.)

Another thought: at the initial run of the assertion, note which tables it locked, and record this as an OID array in the catalog row for the assertion; consider running the assertion only when those tables are touched. This doesn't work if the assertion code locks some tables when run under certain conditions and other tables under different conditions. But then this can be checked too: if an assertion lists in its catalog row that it involves tables A, B, C and then, under different conditions, it tries to acquire lock on table D, have the whole thing fail indicating that the assertion is misdeclared.

-- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] [PATCH] SQL assertions prototype
Josh Berkus wrote: > Serializable or not, *what* do we lock for assertions? It's not > rows. Tables? Which tables? What if the assertion is an > interpreted language function? Does the SSI reference counter > really take care of all of this? The simple answer is that, without adding any additional blocking, SSI guarantees that the behavior of running any set of concurrent serializable transactions is consistent with some serial (one-at-a-time) execution of those transactions. If the assertion is run as part of the transaction, it is automatically handled, regardless of the issues you are asking about. The longer answer is in the README and the papers it references: http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/storage/lmgr/README-SSI;hb=master For practical examples of how it works, this Wiki page includes examples of maintaining a multi-table invariant using triggers: http://wiki.postgresql.org/wiki/SSI -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Logging WAL when updating hintbit
2013/12/14 0:14 "Tom Lane" :
> Heikki Linnakangas writes:
>> I'm not totally satisfied with the name of the GUC, wal_log_hintbits.
>
> Me either; at the very least, it's short an underscore: wal_log_hint_bits
> would be more readable. But how about just "wal_log_hints"?

+1 too, it's readable.

Masahiko Sawada
Re: [HACKERS] [PATCH] SQL assertions prototype
Heikki Linnakangas wrote: >On 12/18/2013 01:39 PM, Andres Freund wrote: >> On 2013-12-18 13:07:51 +0200, Heikki Linnakangas wrote: >>> Here's another idea that doesn't involve SSI: >>> >>> At COMMIT, take a new snapshot and check that the assertion still passes >>> with that snapshot. Now, there's a race condition, if another transaction is >>> committing at the same time, and performs the same check concurrently. That >>> race condition can be eliminated by holding an exclusive lock while running >>> the assertion, until commit, effectively allowing the assertion to be >>> checked by only one transaction at a time. >>> >>> I think that would work, and would be simple, although it wouldn't scale too >>> well. >> >> It probably would also be very prone to deadlocks. > > Hmm, since this would happen at commit, you would know all the > assertions that need to be checked at that point. You could check them > e.g in Oid order to avoid deadlocks. So you would actually serialize all COMMITs, or at least all which had done anything which could affect the truth of an assertion? That seems like it would work, but I suspect that it would be worse for performance than SSI in many workloads, and better than SSI in other workloads. Maybe a GUC to determine which strategy is used? -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.3 reference constraint regression
Andres Freund wrote:
> I have to say, it makes me really uncomfortable to take such
> shortcuts. In other branches we are doing liveliness checks with
> MultiXactIdIsRunning() et al. Why isn't that necessary here? And how
> likely is that this won't cause breakage somewhere down the line because
> somebody doesn't know of that subtlety?

I came up with the attached last night, which should do the right thing in both cases.

-- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services

*** a/src/backend/access/transam/multixact.c
--- b/src/backend/access/transam/multixact.c
***************
*** 1275,1280 **** retry:
--- 1275,1313 ----
  }
  
  /*
+  * MultiXactHasRunningRemoteMembers
+  *		Does the given multixact have still-live members from
+  *		transactions other than our own?
+  */
+ bool
+ MultiXactHasRunningRemoteMembers(MultiXactId multi)
+ {
+ 	MultiXactMember *members;
+ 	int			nmembers;
+ 	int			i;
+ 
+ 	nmembers = GetMultiXactIdMembers(multi, &members, true);
+ 	if (nmembers <= 0)
+ 		return false;
+ 
+ 	for (i = 0; i < nmembers; i++)
+ 	{
+ 		/* not interested in our own members */
+ 		if (TransactionIdIsCurrentTransactionId(members[i].xid))
+ 			continue;
+ 
+ 		if (TransactionIdIsInProgress(members[i].xid))
+ 		{
+ 			pfree(members);
+ 			return true;
+ 		}
+ 	}
+ 
+ 	pfree(members);
+ 	return false;
+ }
+ 
+ /*
  * mxactMemberComparator
  *		qsort comparison function for MultiXactMember
  *
*** a/src/backend/utils/time/tqual.c
--- b/src/backend/utils/time/tqual.c
***************
*** 686,693 **** HeapTupleSatisfiesUpdate(HeapTupleHeader tuple, CommandId curcid,
  	if (tuple->t_infomask & HEAP_XMAX_INVALID)	/* xid invalid */
  		return HeapTupleMayBeUpdated;
  
! 	if (HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask))	/* not deleter */
! 		return HeapTupleMayBeUpdated;
  
  	if (tuple->t_infomask & HEAP_XMAX_IS_MULTI)
  	{
--- 686,721 ----
  	if (tuple->t_infomask & HEAP_XMAX_INVALID)	/* xid invalid */
  		return HeapTupleMayBeUpdated;
  
! 	if (HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask))
! 	{
! 		TransactionId xmax;
! 
! 		xmax = HeapTupleHeaderGetRawXmax(tuple);
! 
! 		/*
! 		 * Careful here: even though this tuple was created by our own
! 		 * transaction, it might be locked by other transactions, if
! 		 * the original version was key-share locked when we updated
! 		 * it.
! 		 */
! 
! 		if (tuple->t_infomask & HEAP_XMAX_IS_MULTI)
! 		{
! 			if (MultiXactHasRunningRemoteMembers(xmax))
! 				return HeapTupleBeingUpdated;
! 			else
! 				return HeapTupleMayBeUpdated;
! 		}
! 
! 		/* if locker is gone, all's well */
! 		if (!TransactionIdIsInProgress(xmax))
! 			return HeapTupleMayBeUpdated;
! 
! 		if (!TransactionIdIsCurrentTransactionId(xmax))
! 			return HeapTupleBeingUpdated;
! 		else
! 			return HeapTupleMayBeUpdated;
! 	}
  
  	if (tuple->t_infomask & HEAP_XMAX_IS_MULTI)
  	{
***************
*** 700,706 **** HeapTupleSatisfiesUpdate(HeapTupleHeader tuple, CommandId curcid,
--- 728,738 ----
  
  		/* updating subtransaction must have aborted */
  		if (!TransactionIdIsCurrentTransactionId(xmax))
+ 		{
+ 			if (MultiXactHasRunningRemoteMembers(xmax))
+ 				return HeapTupleBeingUpdated;
  			return HeapTupleMayBeUpdated;
+ 		}
  		else
  		{
  			if (HeapTupleHeaderGetCmax(tuple) >= curcid)
*** a/src/include/access/multixact.h
--- b/src/include/access/multixact.h
***************
*** 89,94 **** extern bool MultiXactIdIsRunning(MultiXactId multi);
--- 89,95 ----
  extern void MultiXactIdSetOldestMember(void);
  extern int	GetMultiXactIdMembers(MultiXactId multi, MultiXactMember **xids,
  					  bool allow_old);
+ extern bool MultiXactHasRunningRemoteMembers(MultiXactId multi);
  extern bool MultiXactIdPrecedes(MultiXactId multi1, MultiXactId multi2);
  extern bool MultiXactIdPrecedesOrEquals(MultiXactId multi1,
  					MultiXactId multi2);
Re: [HACKERS] Logging WAL when updating hintbit
On Tue, Dec 17, 2013 at 10:22 PM, Amit Kapila wrote: >> Me either; at the very least, it's short an underscore: wal_log_hint_bits >> would be more readable. But how about just "wal_log_hints"? > > +1 for wal_log_hints, it sounds better. +1. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] row security roadmap proposal
On Wed, Dec 18, 2013 at 3:30 AM, Craig Ringer wrote: > In my view the proposed patch doesn't offer a significant improvement in > declarative security, beyond what we can get by just adding update > support to s.b. views and using search_path to control whether a user > sees the view or the base table. > > It's a lot like Oracle Virtual Private Database (VPD): A set of low > level building blocks you can build your own flexible row security model > with. One where you have to very carefully write security-sensitive > predicates and define all your security model tables, etc yourself. > > That's why I'm now of the opinion that we should make it possible to > achieve the same thing with s.b. views and the search_path (by adding > update support)... then build a declarative row-security system that > doesn't require the user fiddling with delicate queries and sensitive > scripts on top of that. To be clear, I wasn't advocating for a declarative approach; I think predicates are fine. There are usability issues to worry about either way, and my concern is that we address those. A declarative approach would certainly be valuable in that, for those people for whom it is sufficiently flexible, it's probably quite a lot easier than writing predicates. But I fear that some people will want a lot more generality than a label-based system can accommodate. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] row security roadmap proposal
On Tue, Dec 17, 2013 at 1:27 PM, Simon Riggs wrote: > Not sure I'd say required, but its certainly desirable to have > updateable security barrier views in themselves. And it comes across > to me as a cleaner and potentially more performant way of doing the > security checks for RLS. Yes, that's how I'm thinking of it. It's required in the sense that if we don't do it as a separate patch, we'll need to fold many of changes into the RLS patch, which IMV is not desirable. We'd end up with more complexity and less functionality with no real upside that I can see. But I think we are basically saying the same thing. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_rewarm status
On Tue, Dec 17, 2013 at 9:03 PM, KONDO Mitsumasa wrote:
> (2013/12/18 5:33), Robert Haas wrote:
>> Sounds like it might be worth dusting the patch off again...
>
> I'd like to request that you add an all_index option and a usage_count option.
> When the all_index option is selected, all indexes are prewarmed even though the
> user doesn't specify a relation name. And the usage_count option sets usage_count
> in shared_buffers, so useful buffers will stay resident longer and not be evicted
> easily. I think these are easy to implement and useful, so please add them if you
> like.

Prewarming indexes is useful, but I don't think we need to complicate the API for that. With the version I just posted, you can simply do something like this:

    SELECT pg_prewarm(indexrelid)
    FROM pg_index
    WHERE indrelid = 'pgbench_accounts'::regclass;

I seriously doubt whether being able to set the usage count is actually useful.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: SQL objects UNITs (was: [HACKERS] Extension Templates S03E11)
Stephen Frost escribió:
> * Dimitri Fontaine (dimi...@2ndquadrant.fr) wrote:
>> Basically with building `UNIT` we realise with hindsight that we failed to
>> build a proper `EXTENSION` system, and we send that message to our users.
>
> Little difficult to draw conclusions about what our 'hindsight' will
> look like.

I haven't been keeping very close attention to this, but I fail to see why extensions are so much of a failure. Surely we can invent a new "kind" of extension, one whose contents specifically are dumped by pg_dump. Regular extensions, the kind we have today, still wouldn't be, but we could have a flag, say "CREATE EXTENSION ... (WITH DUMP)" or something. That way you don't have to come up with UNIT at all (or whatever). A whole new set of catalogs just to fix up a minor issue with extensions sounds a bit too much to me; we can just add this new thing on top of the existing infrastructure.

I didn't much like the WITH UNIT/END UNIT thingy. What's wrong with CREATE foo; ALTER EXTENSION ADD foo? There's a bit of a problem in that if you create the object and die before being able to add it to the extension, it would linger unreferenced; but that's easily fixable by doing the creation in a transaction, I think.

(Alternatively, we could have a single command that creates the extension and the contained objects in one fell swoop, similar to how CREATE SCHEMA can do it; but I'm not sure that's all that much better, and from a grammar POV it probably sucks.)

-- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] pg_rewarm status
On Tue, Dec 17, 2013 at 12:35 PM, Jeff Janes wrote:
> Since it doesn't use directIO, you can't warm the PG buffers without also
> warming the FS cache as a side effect. That is why I like 'buffer' as the
> default--if the data fits in shared_buffers, it warms those, otherwise it at
> least warms the FS. If you want to only warm the FS cache, you can use
> either the 'prefetch' or 'read' modes instead.

All right, here is an updated patch. I swapped the second and third arguments, because I think overriding the prewarm mode will be a lot more common than overriding the relation fork. I also added defaults, so you can do this:

    SELECT pg_prewarm('pgbench_accounts');

Or this:

    SELECT pg_prewarm('pgbench_accounts', 'read');

I also fixed some oversights in the error checks. I'm not inclined to wait for the next CommitFest to commit this, because it's a very simple patch and has already had a lot more field testing than most patches get before they're committed. And it's just a contrib module, so the damage it can do if there is in fact a bug is pretty limited. All that having been said, any review is appreciated.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

diff --git a/contrib/Makefile b/contrib/Makefile
index 8a2a937..dd2683b 100644
--- a/contrib/Makefile
+++ b/contrib/Makefile
@@ -32,6 +32,7 @@ SUBDIRS = \
 	pg_archivecleanup \
 	pg_buffercache \
 	pg_freespacemap \
+	pg_prewarm \
 	pg_standby \
 	pg_stat_statements \
 	pg_test_fsync \
diff --git a/contrib/pg_prewarm/Makefile b/contrib/pg_prewarm/Makefile
new file mode 100644
index 000..176a29a
--- /dev/null
+++ b/contrib/pg_prewarm/Makefile
@@ -0,0 +1,18 @@
+# contrib/pg_prewarm/Makefile
+
+MODULE_big = pg_prewarm
+OBJS = pg_prewarm.o
+
+EXTENSION = pg_prewarm
+DATA = pg_prewarm--1.0.sql
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = contrib/pg_prewarm
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/contrib/pg_prewarm/pg_prewarm--1.0.sql b/contrib/pg_prewarm/pg_prewarm--1.0.sql
new file mode 100644
index 000..2bec776
--- /dev/null
+++ b/contrib/pg_prewarm/pg_prewarm--1.0.sql
@@ -0,0 +1,14 @@
+/* contrib/pg_prewarm/pg_prewarm--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION pg_prewarm" to load this file. \quit
+
+-- Register the function.
+CREATE FUNCTION pg_prewarm(regclass,
+		mode text default 'buffer',
+		fork text default 'main',
+		first_block int8 default null,
+		last_block int8 default null)
+RETURNS int8
+AS 'MODULE_PATHNAME', 'pg_prewarm'
+LANGUAGE C;
diff --git a/contrib/pg_prewarm/pg_prewarm.c b/contrib/pg_prewarm/pg_prewarm.c
new file mode 100644
index 000..10317f3
--- /dev/null
+++ b/contrib/pg_prewarm/pg_prewarm.c
@@ -0,0 +1,205 @@
+/*-
+ *
+ * pg_prewarm.c
+ *		prewarming utilities
+ *
+ * Copyright (c) 2010-2012, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *		contrib/pg_prewarm/pg_prewarm.c
+ *
+ *-
+ */
+#include "postgres.h"
+
+#include
+#include
+
+#include "access/heapam.h"
+#include "catalog/catalog.h"
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "storage/bufmgr.h"
+#include "storage/smgr.h"
+#include "utils/acl.h"
+#include "utils/builtins.h"
+#include "utils/lsyscache.h"
+#include "utils/rel.h"
+
+PG_MODULE_MAGIC;
+
+extern Datum pg_prewarm(PG_FUNCTION_ARGS);
+
+PG_FUNCTION_INFO_V1(pg_prewarm);
+
+typedef enum
+{
+	PREWARM_PREFETCH,
+	PREWARM_READ,
+	PREWARM_BUFFER
+} PrewarmType;
+
+static char blockbuffer[BLCKSZ];
+
+/*
+ * pg_prewarm(regclass, mode text, fork text,
+ *			  first_block int8, last_block int8)
+ *
+ * The first argument is the relation to be prewarmed; the second controls
+ * how prewarming is done; legal options are 'prefetch', 'read', and 'buffer'.
+ * The third is the name of the relation fork to be prewarmed.  The fourth
+ * and fifth arguments specify the first and last block to be prewarmed.
+ * If the fourth argument is NULL, it will be taken as 0; if the fifth argument
+ * is NULL, it will be taken as the number of blocks in the relation.  The
+ * return value is the number of blocks successfully prewarmed.
+ */
+Datum
+pg_prewarm(PG_FUNCTION_ARGS)
+{
+	Oid			relOid;
+	text	   *forkName;
+	text	   *type;
+	int64		first_block;
+	int64		last_block;
+	int64		nblocks;
+	int64		blocks_done = 0;
+	int64		block;
+	Relation	rel;
+	ForkNumber	forkNumber;
+	char	   *forkString;
+	char	   *ttype;
+	PrewarmType ptype;
+	AclResult	aclresult;
+
+	/* Basic sanity checking. */
+	if (PG_ARGISNULL(0))
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("relation cannot be null")));
+	relOid = PG_GETARG_OID(0);
+	if (PG_A
Re: [HACKERS] ALTER SYSTEM SET command to change postgresql.conf parameters
>> Is there any reason for the function to return int, as it always returns
>> 0 or 1? Maybe returning bool would be better?
>
> No, the return type should be bool; I have changed the same in the attached patch.

Confirmed.

>> 2) initdb.c
>>
>> + strcpy(tempautobuf, "# Do not edit this file manually! \n");
>> + autoconflines[0] = pg_strdup(tempautobuf);
>> + strcpy(tempautobuf, "# It will be overwritten by the ALTER SYSTEM command. \n");
>> + autoconflines[1] = pg_strdup(tempautobuf);
>>
>> Is there any reason to use "tempautobuf" here? I think we can simply change to this:
>>
>> + autoconflines[0] = pg_strdup("# Do not edit this file manually! \n");
>> + autoconflines[1] = pg_strdup("# It will be overwritten by the ALTER SYSTEM command. \n");
>
> You are right, I have changed the code as per your suggestion.

Confirmed.

>> 3) initdb.c
>>
>> It seems the memory allocated for autoconflines[0] and
>> autoconflines[1] by pg_strdup is never freed.
>
> I think it gets freed in writefile() in the code below.

Oh, I see. Sorry for the noise.

I have committed your patches. Thanks.

-- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp
Re: [HACKERS] GIN improvements part 1: additional information
On 12/18/2013 01:45 PM, Alexander Korotkov wrote:
> On Tue, Dec 17, 2013 at 2:49 AM, Heikki Linnakangas wrote:
>> On 12/17/2013 12:22 AM, Alexander Korotkov wrote:
>>> 2) Storage would be easily extendable to hold additional information as
>>> well. Better compression shouldn't block more serious improvements.
>>
>> I'm not sure I agree with that. For all the cases where you don't care
>> about additional information - which covers all existing users, for
>> example - reducing disk size is pretty important. How are you planning to
>> store the additional information, and how does using another encoding get
>> in the way of that?
>
> I planned to store the additional information datums between the
> varbyte-encoded TIDs. I expected it would be hard to do with PFOR.
> However, I don't see significant problems in your implementation of
> Simple-9 encoding. I'm going to dig deeper into your version of the patch.

Ok, thanks.

I had another idea about the page format this morning. Instead of having the item-indexes at the end of the page, it would be more flexible to store a bunch of self-contained posting list "segments" on the page. So I propose that we get rid of the item-indexes, and instead store a bunch of independent posting lists on the page:

    typedef struct
    {
        ItemPointerData first;    /* first item in this segment (unpacked) */
        uint16          nwords;   /* number of words that follow */
        uint64          words[1]; /* var length */
    } PostingListSegment;

Each segment can be encoded and decoded independently. When searching for a particular item (like on insertion), you skip over segments where 'first' > the item you're searching for. This format offers a lot more flexibility compared to the separate item indexes. First, we don't need to have another fixed-size area on the page, which simplifies the page format. Second, we can more easily re-encode only one segment on the page, on insertion or vacuum.
The latter is particularly important with the Simple-9 encoding, which operates one word at a time rather than one item at a time; removing or inserting an item in the middle can require a complete re-encoding of everything that follows. Third, when a page is being inserted into and contains only uncompressed items, you don't waste any space for unused item indexes. While we're at it, I think we should use the above struct in the inline posting lists stored directly in entry tuples. That wastes a few bytes compared to the current approach in the patch (more alignment, and 'words' is redundant with the number of items stored on the tuple header), but it simplifies the functions handling these lists. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: SQL objects UNITs (was: [HACKERS] Extension Templates S03E11)
* Dimitri Fontaine (dimi...@2ndquadrant.fr) wrote:
> Here's my attempt:
>
> # Inline Extension, Extension Templates
>
> The problem with *Inline Extension* is the dump and restore policy. The
> contents of an extension are not to be found in a `pg_dump` script, ever.

You keep coming back to this and I think you're taking too narrow a view of the comments made on the prior threads. No, we don't really want extensions which have .sql files out on disk somewhere as part of them to be dumped out through pg_dump, because then it becomes unclear which set of scripts should be used during restore. What we're talking about here is intended to not have that issue by using a different namespace, a flag, something which identifies these extensions as being defined through the catalog instead.

> # The new thing™
>
> A set of SQL objects that can be managed wholesale, with a version string
> attached to it. Objects are part of `pg_dump` output, the whole set can be
> relocatable, and has a version string attached.

I'd like to see more than just a single version string included, and I think that'd be beneficial for extensions too.

> Name: [...]
> I'll pick UNIT here. We can figure that later.
> Commands:
>
> CREATE UNIT name [ SCHEMA ... ] [ [ NOT ] RELOCATABLE ] [ REQUIRE ...];
>
> WITH UNIT name;
>
> END UNIT name;

Interesting approach - I had considered something similar by having a 'fake' schema created into which you built up the 'UNIT'. The reason I was thinking schema instead of begin/end style commands, as you have above, is because of questions around transactions. Do you think the syntax you have here would require the definition to be all inside of a single transaction? Do we feel that would even be an issue, or perhaps that it *should* be done that way? I don't currently have any strong feelings one way or the other on this and I'm curious what others think.

> The `UPDATE TO` command only sets a new version string.
So, one of the things I had been wondering about is if we could provide a 'diff' tool. Using your 'WITH UNIT' syntax above, an author might need to only write their initial implementation, build up a 'UNIT' based on it, then adjust that implementation with another 'WITH UNIT' clause and then ask PG for the differences. It's not clear if we could make that work, but there is definitely a set of desirable capabilities out there, which some other databases have, around automated upgrade script building and doing schema differences.

> # Implementation details
> # Event Trigger support

Not sure we're really ready to get into these yet.

> The main drawback is that rather than building on extensions, both in a
> technical way and in building user trust, we are basically going to
> deprecate extensions entirely, giving them a new name and an incompatible way
> to manage them.

I don't see this as ending up deprecating extensions, even if we build a new thing with a new name. I would argue that properly supported extensions, such as those in contrib and the other 'main' ones, like PostGIS, and others that have any external dependencies (eg: FDWs) would almost certainly continue as extensions and would be packaged through the normal OS packaging systems. While you plan to use the event trigger mechanism to build something on top of this which tries to act like extensions-but-not, I think that's an extremely narrow and limited use-case that very few people would have any interest in or use.

> Basically with building `UNIT` we realise with hindsight that we failed to
> build a proper `EXTENSION` system, and we send that message to our users.

Little difficult to draw conclusions about what our 'hindsight' will look like.

Thanks,

Stephen
Re: [HACKERS] [PATCH] SQL assertions prototype
On 12/18/2013 06:00 AM, Heikki Linnakangas wrote:
> If you don't force everything to run in SSI mode, then you have to somehow
> figure out what parts do need to run in SSI mode to enforce the assertion.
> For example, if the assertion refers to tables A and B, perhaps you can get
> away without predicate locks on table C? But the assertion might simply run
> a function.

For non-trivial cases that's what I would expect people to do.

cheers

andrew
Re: [HACKERS] ALTER SYSTEM SET command to change postgresql.conf parameters
On 12/18/2013 03:35 AM, Tatsuo Ishii wrote: 3) initdb.c It seems the memory allocated for autoconflines[0] and autoconflines[1] by pg_strdup is never freed. (I think there's similar problem with "conflines" as well, though it was not introduced by the patch). Why would we care? initdb doesn't run for very long, and the memory will be freed when it exits, usually within a few seconds. My recollection from back when I originally rewrote initdb in C was that cleaning up the memory leaks wasn't worth the trouble. cheers andrew
Re: [HACKERS] stats for network traffic WIP
* Craig Ringer (cr...@2ndquadrant.com) wrote: > On 12/12/2013 02:51 AM, Tom Lane wrote: > > The thing that I'm wondering is why the database would be the right place > > to be measuring it at all. If you've got a network usage problem, > > aggregate usage across everything on the server is probably what you > > need to be worried about, and PG can't tell you that. > > I suspect this feature would be useful for when you want to try to drill > down and figure out what's having network issues - specifically, to > associate network behaviour with individual queries, individual users, > application_name, etc. > > One sometimes faces the same issue with I/O: I know PostgreSQL is doing > lots of I/O, but what exactly is causing the I/O? Especially if you > can't catch it at the time it happens, it can be quite tricky to go from > "there's lots of I/O" to "this query changed from using synchronized > seqscans to doing an index-only scan that's hammering the cache". Agreed. My other thought on this is that there's a lot to be said for having everything you need available through one tool- kinda like how Emacs users rarely go outside of it.. :) And then there's also the consideration that DBAs may not have access to the host system at all, or not to the level needed to do similar analysis there. Thanks, Stephen
Re: [HACKERS] sepgsql: label regression test failed
# semodule -l | grep sepgslq sepgsql-regtest 1.07 Full list of modules is in attachment. 2013/12/18 Kohei KaiGai > Could you show me semodule -l on your environment? > I believe security policy has not been changed between F19 and F20... > > Thanks, > > 2013/12/18 Sergey Muraviov : > > Hi > > > > I've tried to test postgres 9.3.2 and 9.4devel with selinux on Fedora 20 > and > > met with a label regression test failure. > > > > PS > > I've got some warning during build process. > > > > -- > > Best regards, > > Sergey Muraviov > -- > KaiGai Kohei > -- Best regards, Sergey Muraviov
Re: [HACKERS] sepgsql: label regression test failed
Could you show me semodule -l on your environment? I believe security policy has not been changed between F19 and F20... Thanks, 2013/12/18 Sergey Muraviov : > Hi > > I've tried to test postgres 9.3.2 and 9.4devel with selinux on Fedora 20 and > met with a label regression test failure. > > PS > I've got some warning during build process. > > -- > Best regards, > Sergey Muraviov -- KaiGai Kohei
[HACKERS] sepgsql: label regression test failed
Hi

I've tried to test postgres 9.3.2 and 9.4devel with selinux on Fedora 20 and met with a label regression test failure.

PS: I've got some warnings during the build process.

--
Best regards,
Sergey Muraviov
Re: [HACKERS] ALTER SYSTEM SET command to change postgresql.conf parameters
On Wed, Dec 18, 2013 at 2:05 PM, Tatsuo Ishii wrote:
> Hi,
>
> I have looked into this because it's marked as "ready for committer".

Thank you.

> It looks like it's working as advertised.

Great!

> However I have noticed a few minor issues.
>
> 1) validate_conf_option
>
> +/*
> + * Validates configuration parameter and value, by calling check hook functions
> + * depending on record's vartype. It validates if the parameter
> + * value given is in range of expected predefined value for that parameter.
> + *
> + * freemem - true indicates memory for newval and newextra will be
> + *           freed in this function, false indicates it will be freed by caller.
> + * Return value:
> + *   1: the value is valid
> + *   0: the name or value is invalid
> + */
> +int
> +validate_conf_option(struct config_generic * record, const char *name,
> +                     const char *value, GucSource source, int elevel,
> +                     bool freemem, void *newval, void **newextra)
>
> Is there any reason for the function to return int, as it always returns
> 0 or 1? Maybe returning bool is better?

No, the return type should be bool; I have changed that in the attached patch.

> 2) initdb.c
>
> +   strcpy(tempautobuf, "# Do not edit this file manually! \n");
> +   autoconflines[0] = pg_strdup(tempautobuf);
> +   strcpy(tempautobuf, "# It will be overwritten by the ALTER SYSTEM command. \n");
> +   autoconflines[1] = pg_strdup(tempautobuf);
>
> Is there any reason to use "tempautobuf" here? I think we can simply change to this:
>
> +   autoconflines[0] = pg_strdup("# Do not edit this file manually! \n");
> +   autoconflines[1] = pg_strdup("# It will be overwritten by the ALTER SYSTEM command. \n");

You are right; I have changed the code as per your suggestion.

> 3) initdb.c
>
> It seems the memory allocated for autoconflines[0] and
> autoconflines[1] by pg_strdup is never freed.

I think it gets freed in writefile(), in the code below.
for (line = lines; *line != NULL; line++)
{
    if (fputs(*line, out_file) < 0)
    {
        fprintf(stderr, _("%s: could not write file \"%s\": %s\n"),
                progname, path, strerror(errno));
        exit_nicely();
    }
    free(*line);
}

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] commit fest 2013-11 final report
On Tue, Dec 17, 2013 at 7:14 PM, Peter Eisentraut wrote: > On 12/17/13, 10:19 AM, Tom Lane wrote: >> Perhaps we should just move all the Needs Review and RFC patches forward >> to the next fest, so we don't forget about them? > > This was done the last few times, but it has caused some controversy. > One problem was that a number of patches arrived in this commit fest > without either the author or the reviewers knowing about it, which > caused the already somewhat stale patch to become completely abandoned. > > I think what I'll do is send an email to each of the affected patch > threads describing the situation. But I'd like someone involved in the > patch, either author or reviewer, to make the final call about moving > the patch forward. +1. And thanks for your work on this CommitFest. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] PoC: Partial sort
On Sat, Dec 14, 2013 at 11:47 PM, Andreas Karlsson wrote:
> On 12/14/2013 10:59 AM, Alexander Korotkov wrote:
>> This patch allows using an index for order-by if the order-by clause and the
>> index have a non-empty common prefix. So, the index gives the right ordering
>> for the first n order-by columns. In order to provide the right order for the
>> remaining m columns, a sort node is inserted. This sort node sorts groups of
>> tuples where the values of the first n order-by columns are equal.
>
> I recently looked at the same problem. I see that you solved the
> rescanning problem by simply forcing the sort to be redone on
> ExecReScanSort if you have done a partial sort.

Naturally, I'm not sure I solved it at all :) I just got a version of the patch working for very limited use-cases.

> My idea for a solution was to modify tuplesort to allow storing the
> already sorted keys in either memtuples or the sort result file, but
> setting a field so it does not sort the already sorted tuples again. This
> would allow the rescan to work as it used to, but I am unsure how clean or
> ugly this code would be. Was this something you considered?

I'm not sure. I believe the best answer depends on several parameters: how much memory we have for the sort, how expensive the underlying node is and how it performs rescan, and how big the groups in the partial sort are.

--
With best regards,
Alexander Korotkov.
Re: [HACKERS] [PATCH] SQL assertions prototype
On 12/18/2013 01:50 PM, Andres Freund wrote: On 2013-12-18 13:46:59 +0200, Heikki Linnakangas wrote: On 12/18/2013 01:39 PM, Andres Freund wrote: On 2013-12-18 13:07:51 +0200, Heikki Linnakangas wrote: Here's another idea that doesn't involve SSI: At COMMIT, take a new snapshot and check that the assertion still passes with that snapshot. I think that would work, and would be simple, although it wouldn't scale too well. It probably would also be very prone to deadlocks. Hmm, since this would happen at commit, you would know all the assertions that need to be checked at that point. You could check them e.g. in Oid order to avoid deadlocks. I think the real problem is the lock upgrades, because any earlier DML will have acquired less heavy locks. Ah, I see. You don't need to block anyone else from modifying the table, you just need to block anyone else from committing a transaction that had modified the table. So the locks shouldn't interfere with regular table locks. A ShareUpdateExclusiveLock on the assertion should do it. - Heikki
Re: [HACKERS] PoC: Partial sort
On Sat, Dec 14, 2013 at 7:04 PM, Andres Freund wrote: > Hi, > > > > > Limit (cost=69214.06..69214.08 rows=10 width=16) (actual > > > > time=0.097..0.099 rows=10 loops=1) > > > >-> Sort (cost=69214.06..71714.06 rows=100 width=16) (actual > > > > time=0.096..0.097 rows=10 loops=1) > > > > Sort Key: v1, v2 > > > > Sort Method: top-N heapsort Memory: 25kB > > > > -> Index Scan using test_v1_idx on test > (cost=0.42..47604.42 > > > > rows=100 width=16) (actual time=0.017..0.066 rows=56 loops=1) > > > > Total runtime: 0.125 ms > > > > (6 rows) > > > > > > Is that actually all that beneficial when sorting with a bog standard > > > qsort() since that doesn't generally benefit from data being > pre-sorted? > > > I think we might need to switch to a different algorithm to really > > > benefit from mostly pre-sorted input. > > > > > > > In this patch I don't do full sort of dataset. For instance, index > returns > > data ordered by first column and we need to order them also by second > > column. > > Ah, that makes sense. > > > But, I don't think we should expect pre-sorted values of second column > > inside a group. > > Yes, if you do it that way, there doesn't seem to any need to assume > that any more than we usually do. > > I think you should make the explain output reflect the fact that we're > assuming v1 is presorted and just sorting v2. I'd be happy enough with: > Sort Key: v1, v2 > Partial Sort: v2 > or even just > "Partial Sort Key: [v1,] v2" > but I am sure others disagree. > Sure, I just didn't change explain output yet. It should look like what you propose. > > > *partial-knn-1.patch* > > > > Rechecking from the heap means adding a sort node though, which I don't > > > see here? Or am I misunderstanding something? > > > KNN-GiST contain RB-tree of scanned items. In this patch item is > rechecked > > inside GiST and reinserted into same RB-tree. It appears to be much > easier > > implementation for PoC and also looks very efficient. 
I'm not sure what is actually the right design for it. This is what I'd like to discuss.

> I don't have enough clue about gist to say whether it's the right design,
> but it doesn't look wrong to my eyes. It'd probably be useful to export
> the knowledge that we are rechecking and how often that happens to the
> outside.
> While I didn't really look into the patch, I noticed in passing that you
> pass an all_dead variable to heap_hot_search_buffer without using the
> result - just pass NULL instead, that performs a bit less work.

A useful observation, thanks.

--
With best regards,
Alexander Korotkov.
Re: [HACKERS] PoC: Partial sort
On Sat, Dec 14, 2013 at 6:39 PM, Martijn van Oosterhout wrote:
> On Sat, Dec 14, 2013 at 06:21:18PM +0400, Alexander Korotkov wrote:
>>> Is that actually all that beneficial when sorting with a bog standard
>>> qsort() since that doesn't generally benefit from data being pre-sorted?
>>> I think we might need to switch to a different algorithm to really
>>> benefit from mostly pre-sorted input.
>>
>> In this patch I don't do a full sort of the dataset. For instance, the index
>> returns data ordered by the first column and we need to order them also by the
>> second column. Then this node sorts groups (assumed to be small) where the
>> values of the first column are the same, by the value of the second column. And
>> with a limit clause only the required number of such groups will be processed.
>> But, I don't think we should expect pre-sorted values of the second column
>> inside a group.
>
> Nice. I imagine this would be mostly beneficial for fast-start plans,
> since you no longer need to sort the whole table prior to returning the
> first tuple.
>
> Reduced memory usage might be a factor, especially for large sorts
> where you otherwise might need to spool to disk.
>
> You can now use an index on (a) to improve sorting for (a,b).
>
> Cost of sorting n groups of size l goes from O(nl log nl) to just
> O(nl log l), useful for large n.

Agree. Your reasoning looks correct.

> Minor comments:
>
> I find cmpTuple a bad name. That's what it's doing but perhaps
> cmpSkipColumns would be clearer.
>
> I think it's worthwhile adding a separate path for the skipCols = 0
> case, to avoid extra copies.

Thanks. I'll take care of that.

--
With best regards,
Alexander Korotkov.
Re: [HACKERS] [PATCH] SQL assertions prototype
On 2013-12-18 13:46:59 +0200, Heikki Linnakangas wrote: > On 12/18/2013 01:39 PM, Andres Freund wrote: > >On 2013-12-18 13:07:51 +0200, Heikki Linnakangas wrote: > >>Here's another idea that doesn't involve SSI: > >> > >>At COMMIT, take a new snapshot and check that the assertion still passes > >>with that snapshot. > >>I think that would work, and would be simple, although it wouldn't scale too > >>well. > > > >It probably would also be very prone to deadlocks. > > Hmm, since this would happen at commit, you would know all the assertions > that need to be checked at that point. You could check them e.g. in Oid order > to avoid deadlocks. I think the real problem is the lock upgrades, because any earlier DML will have acquired less heavy locks. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] hstore ng index for <@ ?
Hi!

On Wed, Dec 18, 2013 at 2:35 PM, Kaare Rasmussen wrote:
> In many ways the new hstore (and perhaps json) format looks very exciting.
> But will there ever be (GIN/GIST) index support for <@ ?

It doesn't look hard to do with GiST. For GIN I don't have promising ideas: we likely can't use GIN effectively for this kind of query. I believe it's not implemented because it wasn't requested yet. Could you share your use-case?

--
With best regards,
Alexander Korotkov.
Re: [HACKERS] [PATCH] SQL assertions prototype
On 12/18/2013 01:39 PM, Andres Freund wrote: On 2013-12-18 13:07:51 +0200, Heikki Linnakangas wrote: Here's another idea that doesn't involve SSI: At COMMIT, take a new snapshot and check that the assertion still passes with that snapshot. Now, there's a race condition, if another transaction is committing at the same time, and performs the same check concurrently. That race condition can be eliminated by holding an exclusive lock while running the assertion, until commit, effectively allowing the assertion to be checked by only one transaction at a time. I think that would work, and would be simple, although it wouldn't scale too well. It probably would also be very prone to deadlocks. Hmm, since this would happen at commit, you would know all the assertions that need to be checked at that point. You could check them e.g in Oid order to avoid deadlocks. - Heikki
Re: [HACKERS] GIN improvements part 1: additional information
On Tue, Dec 17, 2013 at 2:49 AM, Heikki Linnakangas wrote:
> On 12/17/2013 12:22 AM, Alexander Korotkov wrote:
>> On Mon, Dec 16, 2013 at 3:30 PM, Heikki Linnakangas <hlinnakan...@vmware.com> wrote:
>>> On 12/12/2013 06:44 PM, Alexander Korotkov wrote:
>>>> When values are packed into small groups, we have to either insert an
>>>> inefficiently encoded value or re-encode the whole right part of the values.
>>>
>>> It would probably be simplest to store newly inserted items uncompressed,
>>> in a separate area in the page. For example, grow the list of uncompressed
>>> items downwards from pg_upper, and the compressed items upwards from
>>> pg_lower. When the page fills up, re-encode the whole page.
>
> I hacked together an implementation of a variant of Simple9, to see how it
> performs. Insertions are handled per the above scheme.
>
> In a limited pg_trgm test case I've been using a lot for this, this
> reduces the index size about 20%, compared to varbyte encoding. It might be
> possible to squeeze it a bit more; I handcrafted the "selectors" in the
> encoding algorithm to suit our needs, but I don't actually have a good
> idea of how to choose them optimally. Also, the encoding can encode 0
> values, but we never need to do that, so you could take advantage of that
> to pack items tighter.
>
> Compression and decompression speed seems to be about the same.
>
> Patch attached if you want to play with it. WAL replay is still broken,
> and there are probably bugs.
>
>> Good idea. But:
>> 1) We'll still need item indexes in the end of the page for fast scan.
>
> Sure.
>
>> 2) Storage would be easily extendable to hold additional information as well.
>> Better compression shouldn't block more serious improvements.
>
> I'm not sure I agree with that. For all the cases where you don't care
> about additional information - which covers all existing users, for example
> - reducing disk size is pretty important.
How are you planning to store the additional information, and how does using another encoding get in the way of that?

I planned to store the additional-information datums between the varbyte-encoded TIDs. I expected it would be hard to do with PFOR. However, I don't see significant problems in your implementation of Simple9 encoding. I'm going to dig deeper into your version of the patch.

--
With best regards,
Alexander Korotkov.
Re: [HACKERS] Example query causing param_info to be set in plain rel path
I got an example where paths for a plain rel require param_info, i.e. plain rel scans need to take care of the lateral references. Here's the example from the PG regression tests:

explain verbose select v.* from (int8_tbl x left join (select q1,(select coalesce(q2,0)) q2 from int8_tbl) y on x.q2 = y.q1) left join int4_tbl z on z.f1 = x.q2, lateral (select x.q1,y.q1 from dual union all select x.q2,y.q2 from dual) v(vx,vy);

There is a note in create_scan_plan(), which says,

324  * If it's a parameterized otherrel, there might be lateral references
325  * in the tlist, which need to be replaced with Params. This cannot
326  * happen for regular baserels, though. Note use_physical_tlist()
327  * always fails for otherrels, so we don't need to check this above.
328  */

although it doesn't say why this cannot happen for regular baserels. So, we do have a testcase which tests this code path as well.

On Mon, Oct 28, 2013 at 12:30 PM, Ashutosh Bapat <ashutosh.ba...@enterprisedb.com> wrote:
> No, adding OFFSET there didn't give the expected result either. The lateral
> was handled in the subquery and passed as a param to the underlying table scan.
>
> I am particularly interested in tables (unlike functions or subqueries)
> since the table scans are shipped to the datanodes and I wanted to test
> the effect of lateral in such cases. OTOH, functions involving access to the
> tables or subqueries are initiated on the coordinators, where lateral gets
> executed in the same way as in PostgreSQL.
>
> If it's so hard to come up with an example query which would cause
> lateral_relids to be set in the RelOptInfo of a table, then it's very likely
> that the relevant code is untested in PostgreSQL.
>
> On Fri, Oct 25, 2013 at 7:11 PM, Tom Lane wrote:
>> Ashutosh Bapat writes:
>>> In order to test various cases of LATERAL join in Postgres-XC, I am trying
>>> to find a query where RelOptInfo->lateral_relids would get set for plain
>>> base relations.
>> >> I think you need a lateral reference in a function or VALUES FROM-item. >> As you say, plain sub-selects are likely to get flattened. (Possibly >> if you stuck in a flattening fence such as OFFSET 0, you could get the >> case to happen with a sub-select FROM item, but I'm too lazy to check.) >> >> regards, tom lane >> > > > > -- > Best Wishes, > Ashutosh Bapat > EnterpriseDB Corporation > The Postgres Database Company > -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] [PATCH] SQL assertions prototype
On 2013-12-18 13:07:51 +0200, Heikki Linnakangas wrote: > Here's another idea that doesn't involve SSI: > > At COMMIT, take a new snapshot and check that the assertion still passes > with that snapshot. Now, there's a race condition, if another transaction is > committing at the same time, and performs the same check concurrently. That > race condition can be eliminated by holding an exclusive lock while running > the assertion, until commit, effectively allowing the assertion to be > checked by only one transaction at a time. > > I think that would work, and would be simple, although it wouldn't scale too > well. It probably would also be very prone to deadlocks. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] [PATCH] SQL assertions prototype
On 12/18/2013 02:59 AM, Josh Berkus wrote:
> On 12/17/2013 01:42 PM, Kevin Grittner wrote:
>> It works fine as long as you set default_transaction_isolation =
>> 'serializable' and never override that. :-) Of course, it sure would be
>> nice to have a way to prohibit overrides, but that's another issue.
>> Otherwise it is hard to see how to make it work in a general way without
>> a mutually exclusive lock mode on the table for the duration of any
>> transaction which modifies the table.
>
> Serializable or not, *what* do we lock for assertions? It's not rows.
> Tables? Which tables? What if the assertion is an interpreted language
> function? Does the SSI reference counter really take care of all of this?

Here's another idea that doesn't involve SSI:

At COMMIT, take a new snapshot and check that the assertion still passes with that snapshot. Now, there's a race condition, if another transaction is committing at the same time, and performs the same check concurrently. That race condition can be eliminated by holding an exclusive lock while running the assertion, until commit, effectively allowing the assertion to be checked by only one transaction at a time.

I think that would work, and would be simple, although it wouldn't scale too well.

- Heikki
Re: [HACKERS] [PATCH] SQL assertions prototype
On 12/18/2013 02:59 AM, Josh Berkus wrote:
> On 12/17/2013 01:42 PM, Kevin Grittner wrote:
>> Josh Berkus wrote:
>>> Going back over this patch, I haven't seen any further discussion of
>>> the point Heikki raises above, which seems like a bit of a showstopper.
>>> Heikki, did you have specific ideas on how to solve this? Right now my
>>> mind boggles.
>>
>> It works fine as long as you set default_transaction_isolation =
>> 'serializable' and never override that. :-) Of course, it sure would be
>> nice to have a way to prohibit overrides, but that's another issue.
>> Otherwise it is hard to see how to make it work in a general way without
>> a mutually exclusive lock mode on the table for the duration of any
>> transaction which modifies the table.
>
> Serializable or not, *what* do we lock for assertions? It's not rows.
> Tables? Which tables? What if the assertion is an interpreted language
> function? Does the SSI reference counter really take care of all of this?

SSI does make everything rosy, as long as all the transactions participate in it. The open questions revolve around what happens if a transaction is not running in SSI mode.

If you force everyone to run in SSI as soon as you create even a single assertion in your database, that's kind of crappy for performance. Also, if one user creates an assertion, it becomes a funny kind of partial denial-of-service attack on other users, if you force other users to also run in SSI mode.

If you don't force everything to run in SSI mode, then you have to somehow figure out what parts do need to run in SSI mode to enforce the assertion. For example, if the assertion refers to tables A and B, perhaps you can get away without predicate locks on table C?

- Heikki
Re: [HACKERS] Extension Templates S03E11
Tom Lane writes: > I keep telling you this, and it keeps not sinking in. How can you say that? I've been spending a couple of years on designing and implementing and arguing for a complete feature set where dealing with modules is avoided at all costs. The problem we have now is that I'm being told that the current feature is rejected if it includes anything about modules, and not interesting enough if it's not dealing with modules. I tried my best to make it so that nothing in-core changes wrt modules, yet finding out-of-core solutions to still cope with that. It's a failure, ok. I think we need a conclusion on this thread: Extension specs are frozen. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
[HACKERS] hstore ng index for <@ ?
Hi

In many ways the new hstore (and perhaps json) format looks very
exciting. But will there ever be (GIN/GiST) index support for <@ ?
SQL objects UNITs (was: [HACKERS] Extension Templates S03E11)
Simon Riggs writes:
> On 17 December 2013 23:42, Tom Lane wrote:
>>> We aim to have the simplest implementation that meets the stated need
>>> and reasonable extrapolations of that. Text in a catalog table is the
>>> simplest implementation. That is not a reason to reject it, especially
>>> when we aren't suggesting a viable alternative.
>>
>> The first part of this assertion is debatable, and the claim that no
>> viable alternative has been suggested is outright wrong.

With due respect, it's only wrong when you buy into implementing
something new rather than improving extensions.

> Sounds like we have a way forward for this feature then, just not with
> the current patch.
>
> Can someone attempt to summarise the way forward, with any caveats and
> necessary restrictions? It would save further column inches of debate.

Here's my attempt:

# Inline Extension, Extension Templates

The problem with *Inline Extension* is the dump and restore policy: the
contents of an extension are never to be found in a `pg_dump` script.

The problem with *Extension Templates* is that we store the extension
scripts (plain text blobs) in the catalogs, where we already have the
full SQL objects and the tools (such as `pg_dump` and `pg_depend`) to
manipulate and introspect them.

# The new thing™

A set of SQL objects that can be managed wholesale, with a version
string attached to it. The objects are part of the `pg_dump` output,
and the whole set can be relocatable.

Name:

  - not `PACKAGE`, Oracle
  - not `MODULE`, that's already the name of a .so file
  - not `SYSTEM`, already something else
  - `BUNDLE`
  - `LIBRARY`
  - `UNIT`

I'll pick UNIT here.

Commands:

    CREATE UNIT name [ SCHEMA ...
        ] [ [ NOT ] RELOCATABLE ] [ REQUIRE ... ];
    WITH UNIT name;
    END UNIT name;

    ALTER UNIT name OWNER TO <role>;
    ALTER UNIT name ADD <object>;
    ALTER UNIT name DROP <object>;
    ALTER UNIT name SET SCHEMA <schema>;
    ALTER UNIT name UPDATE TO <version>;
    ALTER UNIT name SET [ NOT ] RELOCATABLE;
    ALTER UNIT name REQUIRE a, b, c;

    COMMENT ON UNIT name IS '';

    DROP UNIT name [ CASCADE ];

The `UPDATE TO` command only sets a new version string.

# Implementation details

We need a new `pg_unit` catalog that looks almost exactly like the
`pg_extension` one, except for the `extconfig` and `extcondition`
fields. We need a way to `recordDependencyOnCurrentUnit()`, so another
pair of static variables `creating_unit` and `CurrentUnitObject`. Each
and every command we support for creating objects must be made aware of
the new `UNIT` concept, including `CREATE EXTENSION`.

The `pg_dump` dependencies have to be set so that all the objects are
restored independently first, as of today, and only then issue `CREATE
UNIT` and a bunch of `ALTER UNIT ADD` commands, one per object.

# Event Trigger support

Event Triggers are to be provided for all the `UNIT` commands.

# Life with Extensions and Units

PostgreSQL now includes two different ways to package SQL objects, with
about the same feature set. The only difference is the `pg_restore`
behavior: *Extensions* are re-created from external resources, *Units*
are re-created from what's in the dump.

The smarts about `ALTER EXTENSION ... UPDATE` are not available when
dealing with *UNITs*, leaving the user or the client scripts to care
about that entirely on their own. In principle, a client can prepare a
SQL script from a PGXN distribution and apply it surrounded by `WITH
UNIT` and `END UNIT` commands. Upgrade scripts, once identified, can be
run as straight SQL, adding a simple `ALTER UNIT ... UPDATE TO ...`
command before the `COMMIT` at the end of the script.
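To make the workflow above concrete, installing and then upgrading a
distribution might look like the following. This is an entirely
hypothetical sketch: none of these commands exist, the syntax is only
what this proposal suggests, and the `pair` objects are invented for
illustration.

```sql
-- Hypothetical: install a PGXN distribution's script inside a unit.
BEGIN;
CREATE UNIT pair SCHEMA pair NOT RELOCATABLE;
WITH UNIT pair;
  -- the distribution's install script runs here; every object it
  -- creates becomes part of the unit
  CREATE TYPE pair.pair AS (k text, v text);
  CREATE FUNCTION pair.pair(text, text) RETURNS pair.pair
    LANGUAGE sql AS 'SELECT ROW($1, $2)::pair.pair';
END UNIT pair;
COMMIT;

-- Later, an upgrade script is applied as straight SQL, and the version
-- string is bumped by hand before the COMMIT:
BEGIN;
ALTER FUNCTION pair.pair(text, text) COST 1;   -- the upgrade script
ALTER UNIT pair UPDATE TO '1.1';
COMMIT;
```

The point of the sketch is that the client drives the update logic; the
server only records membership and the version string.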
Identifying the upgrade script(s) may require implementing the current
extension-update smarts in whatever client-side program is going to be
built to support installing from PGXN etc.

# Conclusion

The main advantage of the `UNIT` proposal is that it copes very well
with relations and other usual schema objects, as the data are
preserved at `pg_restore` time.

A `UNIT` can also entirely replace an `EXTENSION`, including when it
needs a *module*, provided that the *module* is made available on the
server's file system before creating the functions in `LANGUAGE C` that
depend on it.

It is possible to write a *UNIT distribution network* where a client
software drives the installation of SQL objects within a UNIT, and this
client software needs to include UNIT update smarts too. It's also
possible to build that software as a set of Event Triggers on the
`CREATE UNIT` and `ALTER UNIT UPDATE TO` commands.

# Analysis

The main drawback is that rather than building on extensions, both in a
technical way and in building user trust, we are basically going to
deprecate extensions entirely, giving them a new name and an
incompatible way to manage them. Only *contribs* are going to be
shipped as extensions, as they are basically the only known extensions
following the same delivery rules as the PostgreSQL core p
Re: [HACKERS] stats for network traffic WIP
On 12/12/2013 02:51 AM, Tom Lane wrote:
> The thing that I'm wondering is why the database would be the right place
> to be measuring it at all. If you've got a network usage problem,
> aggregate usage across everything on the server is probably what you
> need to be worried about, and PG can't tell you that.

I suspect this feature would be useful when you want to drill down and
figure out what's having network issues - specifically, to associate
network behaviour with individual queries, individual users,
application_name, etc.

One sometimes faces the same issue with I/O: I know PostgreSQL is doing
lots of I/O, but what exactly is causing it? Especially if you can't
catch it at the time it happens, it can be quite tricky to go from
"there's lots of I/O" to "this query changed from using synchronized
seqscans to doing an index-only scan that's hammering the cache".

--
Craig Ringer                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Problem with displaying "wide" tables in psql
Hello

2013/12/18 Sameer Thakur
> On Wed, Dec 11, 2013 at 11:13 PM, Sergey Muraviov wrote:
> > Hi.
> >
> > I've improved the patch.
> > It works in expanded mode when either format option is set to wrapped
> > (\pset format wrapped), or we have no pager, or pager doesn't chop
> > long lines (so you can still use the trick).
> > Target output width is taken from either columns option (\pset
> > columns 70), or environment variable $COLUMNS, or terminal size.
> > And it's also compatible with any border style (\pset border 0|1|2).
> >
> > Here are some examples:
> >
> > postgres=# \x 1
> > postgres=# \pset format wrapped
> > postgres=# \pset border 0
> > postgres=# select * from wide_table;
> > * Record 1
> > value afadsafasd fasdf asdfasd fsad fas df sadf sad f sadf sadf sa df
> > sadfsadfasd fsad fsa df sadf asd fa sfd sadfsadf asdf sad f sadf sad fadsf
> > * Record 2
> > value afadsafasd fasdf asdfasd
> >
> > postgres=# \pset border 1
> > postgres=# \pset columns 70
> > postgres=# select * from wide_table;
> > -[ RECORD 1 ]-
> > value | afadsafasd fasdf asdfasd fsad fas df sadf sad f sadf sadf sa
> >       | df sadfsadfasd fsad fsa df sadf asd fa sfd sadfsadf asdf sad f
> >       | sadf sad fadsf
> > -[ RECORD 2 ]-
> > value | afadsafasd fasdf asdfasd
> >
> > postgres=# \pset border 2
> > postgres=# \pset columns 60
> > postgres=# select * from wide_table;
> > +-[ RECORD 1 ]-+
> > | value | afadsafasd fasdf asdfasd fsad fas df sadf sad f |
> > |       | sadf sadf sa df sadfsadfasd fsad fsa df sadf as |
> > |       | d fa sfd sadfsadf asdf sad f sadf sad fadsf     |
> > +-[ RECORD 2 ]-+
> > | value | afadsafasd fasdf asdfasd                        |
> > +---+--+
> >
> > Regards,
> > Sergey
>
> The patch applies and compiles cleanly. I tried the following:
> \pset format wrapped
> \pset columns 70
> Not in expanded mode,
> select * from wide_table works fine.
> select * from pg_stats has problems in viewing. Is it that pg_stats
> can be viewed easily only in expanded mode, i.e. if columns displayed
> are wrapped then there is no way to view results in non expanded mode?
> regards
> Sameer

The problem with non expanded mode is that all column headers have to
be displayed on one line. Otherwise, it is difficult to bind values to
columns. And I have no idea how to solve this problem.

--
Best regards,
Sergey Muraviov
Re: [HACKERS] Proposal: variant of regclass
>> For the pgpool-II use case, I'm happy to follow you because pgpool-II
>> always does a grammatical check (using the raw parser) on a query
>> first and such syntax problems will be pointed out, thus it never
>> reaches the state where toregclass is called.
>>
>> One concern is, other users may expect toregclass to detect such
>> syntax problems. Anybody?
>
> It seems fine to me if the new function ignores the specific error of
> "relation does not exist" while continuing to throw other errors.

Thanks. Here is the revised conceptual patch. I'm going to post a
concrete patch and register it in the 2014-01 CommitFest.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

diff --git a/src/backend/utils/adt/regproc.c b/src/backend/utils/adt/regproc.c
index c24a2c1..0406c30 100644
--- a/src/backend/utils/adt/regproc.c
+++ b/src/backend/utils/adt/regproc.c
@@ -45,6 +45,7 @@ static char *format_operator_internal(Oid operator_oid, bool force_qualify);
 static char *format_procedure_internal(Oid procedure_oid, bool force_qualify);
 static void parseNameAndArgTypes(const char *string, bool allowNone,
 					 List **names, int *nargs, Oid *argtypes);
+static Datum regclass_guts(char *class_name_or_oid, bool raiseError);
 
 
 /*
@@ -804,21 +805,50 @@
 Datum
 regclassin(PG_FUNCTION_ARGS)
 {
 	char	   *class_name_or_oid = PG_GETARG_CSTRING(0);
+	Oid			result;
+
+	result = regclass_guts(class_name_or_oid, true);
+	PG_RETURN_OID(result);
+}
+
+/*
+ * toregclass	- converts "classname" to class OID
+ *
+ * Difference from regclassin is that this does not throw an error if the
+ * classname does not exist.
+ * Note: this is not an I/O function.
+ */
+Datum
+toregclass(PG_FUNCTION_ARGS)
+{
+	char	   *class_name_or_oid = PG_GETARG_CSTRING(0);
+	Oid			result;
+
+	result = regclass_guts(class_name_or_oid, false);
+	PG_RETURN_OID(result);
+}
+
+/*
+ * Guts of regclassin and toregclass.
+ * If raiseError is false, returns InvalidOid upon error.
+ */
+static Datum
+regclass_guts(char *class_name_or_oid, bool raiseError)
+{
 	Oid			result = InvalidOid;
 	List	   *names;
 
 	/* '-' ? */
 	if (strcmp(class_name_or_oid, "-") == 0)
-		PG_RETURN_OID(InvalidOid);
+		return result;
 
 	/* Numeric OID? */
 	if (class_name_or_oid[0] >= '0' &&
 		class_name_or_oid[0] <= '9' &&
 		strspn(class_name_or_oid, "0123456789") == strlen(class_name_or_oid))
 	{
-		result = DatumGetObjectId(DirectFunctionCall1(oidin,
+		result = DatumGetObjectId(DirectFunctionCall1(oidin,
 									  CStringGetDatum(class_name_or_oid)));
-		PG_RETURN_OID(result);
+		return result;
 	}
 
 	/* Else it's a name, possibly schema-qualified */
@@ -848,16 +878,19 @@ regclassin(PG_FUNCTION_ARGS)
 		if (HeapTupleIsValid(tuple = systable_getnext(sysscan)))
 			result = HeapTupleGetOid(tuple);
 		else
-			ereport(ERROR,
-					(errcode(ERRCODE_UNDEFINED_TABLE),
-					 errmsg("relation \"%s\" does not exist", class_name_or_oid)));
+			if (raiseError)
+				ereport(ERROR,
+						(errcode(ERRCODE_UNDEFINED_TABLE),
+						 errmsg("relation \"%s\" does not exist", class_name_or_oid)));
+			else
+				return InvalidOid;
 
 		/* We assume there can be only one match */
 
 		systable_endscan(sysscan);
 		heap_close(hdesc, AccessShareLock);
 
-		PG_RETURN_OID(result);
+		return result;
 	}
 
 	/*
@@ -865,11 +898,16 @@
 	 * pg_class entries in the current search path.
 	 */
 	names = stringToQualifiedNameList(class_name_or_oid);
+	if (names == NIL)
+		return InvalidOid;
 
 	/* We might not even have permissions on this relation; don't lock it. */
-	result = RangeVarGetRelid(makeRangeVarFromNameList(names), NoLock, false);
+	if (raiseError)
+		result = RangeVarGetRelid(makeRangeVarFromNameList(names), NoLock, false);
+	else
+		result = RangeVarGetRelid(makeRangeVarFromNameList(names), NoLock, true);
 
-	PG_RETURN_OID(result);
+	return result;
 }
 
 /*
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 0117500..472ccad 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -3304,13 +3304,14 @@
 DATA(insert OID = 2218 (  regclassin		PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 2205 "2275" _null_ _null_ _null_ _null_ regclassin _null_ _null_ _null_ ));
 DESCR("I/O");
 DATA(insert OID = 2219 (  regclassout		PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 2275 "2205" _null_ _null_ _null_ _null_ regclassout _null_ _null_ _null_ ));
 DESCR("I/O");
+DATA(insert OID = 3179 (  toregclass		PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 2205 "2275" _null_ _null_ _null_ _null_ toregclass _null_ _null_ _null_ ));
+DESCR("convert classname to regclass");
 DATA(insert OID = 2220 (  regtypein		PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 2206 "2275" _null_ _null_ _null_ _null_ regtypein _null_ _null_ _null_ ));
 DESCR("I/O");
 DATA(insert OID = 2221 (  regtypeout		PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 2275 "2206" _null_ _null_ _null_ _null_ regtypeout _null_ _null_ _null_ ));
 DESCR(
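If a concrete patch keeps this shape, the behavioral difference could be
exercised as in the sketch below. This is hypothetical: `toregclass`
exists only in this proposal, and the displayed results follow from the
patch's stated intent (return InvalidOid instead of raising an error;
since the return type is regclass, InvalidOid would print as "-").

```sql
-- A regclass cast errors out on a missing relation:
SELECT 'no_such_table'::regclass;
-- ERROR:  relation "no_such_table" does not exist

-- The proposed toregclass returns InvalidOid instead, which the
-- regclass output function displays as "-":
SELECT toregclass('no_such_table');   -- "-"
SELECT toregclass('pg_class');        -- pg_class
```

This makes it safe for clients such as pgpool-II to probe for a
relation's existence without wrapping the query in error handling.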