Re: [HACKERS] Archive recovery won't be completed on some situation.

2014-03-20 Thread Kyotaro HORIGUCHI
Hi, I confirmed that commit 82233ce7ea4 is indeed what changed this.

At Wed, 19 Mar 2014 09:35:16 -0300, Alvaro Herrera wrote
 Fujii Masao wrote:
  On Wed, Mar 19, 2014 at 7:57 PM, Heikki Linnakangas
  hlinnakan...@vmware.com wrote:
 
   9.4 cancels backup mode even on immediate shutdown so the
   operation causes no problem, but 9.3 and earlier do not.
  
   Hmm, I don't think we've changed that behavior in 9.4.
  
  ISTM 82233ce7ea42d6ba519aaec63008aff49da6c7af changed immediate
  shutdown that way.
 
 Uh, interesting.  I didn't see that secondary effect.  I hope it's not
 for ill?

The crucial factor in the behavior change is that pmdie() no longer
exits immediately on SIGQUIT. Before the patch, the 'case SIGQUIT:'
branch in pmdie() ended with ExitPostmaster(0), but now it ends with
'PostmasterStateMachine(); break;', so the postmaster keeps running
with pmState = PM_WAIT_BACKENDS, just as for SIGINT (fast shutdown).

Eventually pmState reaches PM_NO_CHILDREN via PM_WAIT_DEAD_END as
SIGCHLDs arrive from the non-significant processes, and then
CancelBackup() is called.

Focusing on that point, the small patch below restores the behavior of
9.3 and earlier, but I am not sure whether that is appropriate given
the intention of the original patch.

diff --git a/src/backend/postmaster/postmaster.c 
b/src/backend/postmaster/postmaster.c
index e9072b7..f87c25c 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -2498,16 +2498,7 @@ pmdie(SIGNAL_ARGS)
				(errmsg("received immediate shutdown request")));
 
TerminateChildren(SIGQUIT);
-   pmState = PM_WAIT_BACKENDS;
-
-   /* set stopwatch for them to die */
-   AbortStartTime = time(NULL);
-
-   /*
-* Now wait for backends to exit.  If there are none,
-* PostmasterStateMachine will take the next step.
-*/
-   PostmasterStateMachine();
+   ExitPostmaster(0);
break;
}

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [HACKERS] four minor proposals for 9.5

2014-03-20 Thread Mark Kirkwood

On 20/03/14 13:28, Josh Berkus wrote:


3. relation limit - possibility to set a session limit for the maximum size of
relations. No relation can be extended over this limit in the session when
this value is higher than zero. Motivation - we use a lot of queries like
CREATE TABLE AS SELECT ..., and some very big results reduced the free disk
space too much. That was a high risk in our multi-user environment. The
motivation is similar to temp_files_limit.


I'd think the size of the relation you were creating would be difficult
to measure.  Also, would this apply to REINDEX/VACUUM FULL/ALTER?  Or
just CREATE TABLE AS/SELECT INTO?



Also I think this would probably only make sense for TEMPORARY tables - 
otherwise you can get this sort of thing going on:


- you create a table and you have set a relation size limit
- you commit and keep working
- I add a whole lot of rows to your new table (taking it over the limit)
- you go to add some more rows to this table...

Should you now be stopped from working? Does this feature need to track *who* 
added which chunks of a table (I suspect that would be very difficult to do sensibly)?


Regards

Mark




Re: [HACKERS] Archive recovery won't be completed on some situation.

2014-03-20 Thread Kyotaro HORIGUCHI
Hello,

At Wed, 19 Mar 2014 19:34:10 +0900, Fujii Masao wrote
  Agreed. Attached patches do that and I could recover the
  database state with following steps,
 
 Adding a new option looks like a new feature rather than a bug fix.
 I'm afraid that back-patching such a change to 9.3 or before
 is not acceptable.

I agree. But on the other hand it is simply a way to recover from the
consequences of the server's behavior (although the operation itself was
ill-advised :), and it is especially needed for 9.1, which apparently
cannot be recovered without it. Plus it has no impact at all on the
server's behavior in any of the affected versions. So I hope it can be
accepted.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [HACKERS] Archive recovery won't be completed on some situation.

2014-03-20 Thread Kyotaro HORIGUCHI
Hello,

 On 03/19/2014 10:28 AM, Kyotaro HORIGUCHI wrote:
  The *problematic* operation sequence I saw was performed by
  pgsql-RA/Pacemaker. It stops the server with immediate mode, then
  starts the former master as a standby first and promotes it afterwards.
  Focusing on this situation, it would be reasonable to reset the backup
  positions.
 
 Well, that's scary. I would suggest doing a fast shutdown instead. But
 if you really want to do an immediate shutdown, you should delete the
 backup_label file after the shutdown

We have also told them the former on several occasions. They answered
that they hate the shutdown checkpoint taking a long time before the
shutdown completes. The latter had not occurred to me and seems
promising. Thank you for the suggestion.

 When restarting after immediate shutdown and a backup_label file is
 present, the system doesn't know if the system crashed during a
 backup, and it needs to perform crash recovery, or if you're trying
 restore from a backup. It makes a compromise, and starts recovery from
 the checkpoint given in the backup_label, as if it was restoring from
 a backup, but if it doesn't see a backup-end WAL record, it just
 starts up anyway (which would be wrong if you are indeed restoring
 from a backup). But if you create a recovery.conf file, that indicates
 that you are definitely restoring from a backup, so it's more strict
 and insists that the backup-end record must be replayed.
 
  9.4 cancels backup mode even on
  immediate shutdown so the operation causes no problem, but 9.3
  and earlier do not.
 
 Hmm, I don't think we've changed that behavior in 9.4.

pmdie now behaves in a similar manner for fast and immediate shutdown
after 82233ce7ea42d6ba519. It is a side effect of a change to immediate
shutdown that makes it wait for the children to die on SIGQUIT.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [HACKERS] four minor proposals for 9.5

2014-03-20 Thread Pavel Stehule
2014-03-20 5:36 GMT+01:00 Amit Kapila amit.kapil...@gmail.com:

 On Wed, Mar 19, 2014 at 9:04 PM, Pavel Stehule pavel.steh...@gmail.com
 wrote:
  Hello
 
  I wrote a few patches that we use in our production. These patches are
  small, but I hope they can be interesting for upstream:
 
  1. cancel time - we log the execution time of cancelled statements
 
  2. fatal verbose - this patch ensures a verbose log for fatal errors. It
  simplifies investigating the reasons for an error.
 
  3. relation limit - possibility to set a session limit for the maximum size
  of relations. No relation can be extended over this limit in the session
  when this value is higher than zero. Motivation - we use a lot of queries
  like CREATE TABLE AS SELECT ..., and some very big results reduced the free
  disk space too much. That was a high risk in our multi-user environment.
  The motivation is similar to temp_files_limit.

 So if there is an error on reaching the maximum threshold size, won't it
 lose all the data, or is that expected?


An exception is expected.



 Also I think it might not be applicable to all table inserts, as normally
 the checkpointer/bgwriter flushes the data, so you might not be able to
 estimate the size immediately while your SQL statement is executing.

 Wouldn't it be better to have a LIMIT on rows in the SELECT statement, or
 generically a table-level option for a maximum number of rows?


It is significantly harder to estimate N for a LIMIT - there are lots of
varlena types, so sometimes you would be too strict, sometimes too tolerant.




 With Regards,
 Amit Kapila.
 EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] four minor proposals for 9.5

2014-03-20 Thread Pavel Stehule
2014-03-20 7:25 GMT+01:00 Mark Kirkwood mark.kirkw...@catalyst.net.nz:

 On 20/03/14 13:28, Josh Berkus wrote:

 3. relation limit - possibility to set a session limit for the maximum size
 of relations. No relation can be extended over this limit in the session
 when this value is higher than zero. Motivation - we use a lot of queries
 like CREATE TABLE AS SELECT ..., and some very big results reduced the free
 disk space too much. That was a high risk in our multi-user environment.
 The motivation is similar to temp_files_limit.


 I'd think the size of the relation you were creating would be difficult
 to measure.  Also, would this apply to REINDEX/VACUUM FULL/ALTER?  Or
 just CREATE TABLE AS/SELECT INTO?


 Also I think this would probably only make sense for TEMPORARY tables -
 otherwise you can get this sort of thing going on:

 - you create a table and you have set a relation size limit
 - you commit and keep working
 - I add a whole lot of rows to your new table (taking it over the limit)
 - you go to add some more rows to this table...


You cannot go over the session limit, and it does not matter whether you do
the inserts in several statements or in one.

This patch is very simple - it is only a safeguard against running out of
disk space in a very dynamic multi-user environment.

--- ./src/backend/storage/smgr/md.c 2014-02-26 17:29:36.864189192 +0100
***
*** 27,32 
--- 27,33 
  #include "storage/bufmgr.h"
  #include "storage/relfilenode.h"
  #include "storage/smgr.h"
+ #include "utils/guc.h"
  #include "utils/hsearch.h"
  #include "utils/memutils.h"
  #include "pg_trace.h"
***
*** 180,185 
--- 181,191 
  static BlockNumber _mdnblocks(SMgrRelation reln, ForkNumber forknum,
   MdfdVec *seg);

+ /*
+  *  limits for relations size
+  */
+ int max_blocks;

  /*
   *mdinit() -- Initialize private state for magnetic disk storage
manager.
***
*** 475,480 
--- 481,494 
Assert(blocknum = mdnblocks(reln, forknum));
  #endif

+   if (max_blocks != -1 && blocknum > (BlockNumber) max_blocks)
+       ereport(ERROR,
+               (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+                errmsg("cannot extend file beyond %u blocks",
+                       max_blocks),
+                errhint("Session file limit defined by \"hard_relation_limit\" (%s) is over.",
+                        GetConfigOptionByName("hard_relation_limit", NULL))));
+
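
For illustration, a session using the proposed setting might look like this
(a hypothetical sketch: the parameter name follows the hard_relation_limit
GUC referenced above, but the value syntax and the table names are
assumptions, not part of the posted patch):

-- hypothetical: cap how far any one relation may be extended in this session
SET hard_relation_limit = '1GB';

-- a runaway Cartesian-product result now errors out instead of filling the disk
CREATE TABLE report_cache AS
SELECT a.*, b.*
FROM dim_a a CROSS JOIN dim_b b;
-- ERROR:  cannot extend file beyond ... blocks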





 Should you now be stopped from working? Does this feature need to track *who*
 added which chunks of a table (I suspect that would be very difficult to do sensibly)?

 Regards

 Mark



Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe

2014-03-20 Thread Vik Fearing
On 03/18/2014 11:39 AM, Simon Riggs wrote:
 On 8 March 2014 11:14, Simon Riggs si...@2ndquadrant.com wrote:
 On 7 March 2014 09:04, Simon Riggs si...@2ndquadrant.com wrote:

 The right thing to do here is to not push to the extremes. If we mess
 too much with the ruleutil stuff it will just be buggy. A more
 considered analysis in a later release is required for a full and
 complete approach. As I indicated earlier, an 80/20 solution is better
 for this release.

 Slimming down the patch, I've removed changes to lock levels for
 almost all variants. The only lock levels now reduced are those for
 VALIDATE, plus setting of relation and attribute level options.

 VALIDATE is implemented by calling pg_get_constraintdef_mvcc(), a
 slightly modified variant of pg_get_constraintdef that uses the
 transaction snapshot. I propose this rather than Noah's solution
 solely because this will allow any user to request the MVCC data,
 rather than implement a hack that only works for pg_dump. I will post
 the patch later today.
 Implemented in attached patch, v22

 The following commands (only) are allowed with
 ShareUpdateExclusiveLock, patch includes doc changes.

 ALTER TABLE ... VALIDATE CONSTRAINT constraint_name
 covered by isolation test, plus verified manually with pg_dump

 ALTER TABLE ... ALTER COLUMN ... SET STATISTICS
 ALTER TABLE ... ALTER COLUMN ... SET (...)
 ALTER TABLE ... ALTER COLUMN ... RESET (...)

 ALTER TABLE ... CLUSTER ON ...
 ALTER TABLE ... SET WITHOUT CLUSTER
 ALTER TABLE ... SET (...)
 covered by isolation test

 ALTER TABLE ... RESET (...)

 ALTER INDEX ... SET (...)
 ALTER INDEX ... RESET (...)

 All other ALTER commands take AccessExclusiveLock

 I commend this patch to you for final review; I would like to commit
 this in a few days.
 I'm planning to commit this today at 1500UTC barring objections or
 negative reviews.


At my current level of competence, this patch looks good to me.  I'm
looking forward to reading Noah's review to see what I may have missed.

The attached patch fixes two typos in the code comments.

-- 
Vik

*** a/src/backend/commands/tablecmds.c
--- b/src/backend/commands/tablecmds.c
***
*** 2939,2945  AlterTableGetLockLevel(List *cmds)
  
  /*
   * These subcommands affect implicit row type conversion. They
!  * have affects similar to CREATE/DROP CAST on queries.
   * don't provide for invalidating parse trees as a result of
   * such changes, so we keep these at AccessExclusiveLock.
   */
--- 2939,2945 
  
  /*
   * These subcommands affect implicit row type conversion. They
!  * have affects similar to CREATE/DROP CAST on queries.  We
   * don't provide for invalidating parse trees as a result of
   * such changes, so we keep these at AccessExclusiveLock.
   */
*** a/src/backend/utils/cache/relcache.c
--- b/src/backend/utils/cache/relcache.c
***
*** 1889,1895  RelationDestroyRelation(Relation relation, bool remember_tupdesc)
  	if (--relation->rd_att->tdrefcount == 0)
  	{
  		/*
! 		 * If we Rebuilt a relcache entry during a transaction then its
  		 * possible we did that because the TupDesc changed as the result
  		 * of an ALTER TABLE that ran at less than AccessExclusiveLock.
  		 * It's possible someone copied that TupDesc, in which case the
--- 1889,1895 
  	if (--relation->rd_att->tdrefcount == 0)
  	{
  		/*
! 		 * If we Rebuilt a relcache entry during a transaction then it's
  		 * possible we did that because the TupDesc changed as the result
  		 * of an ALTER TABLE that ran at less than AccessExclusiveLock.
  		 * It's possible someone copied that TupDesc, in which case the



Re: [HACKERS] Review: plpgsql.extra_warnings, plpgsql.extra_errors

2014-03-20 Thread Marko Tiikkaja

On 3/20/14, 12:32 AM, Tom Lane wrote:

Isn't the entire point to create a framework in which more tests will
be added later?

Also, adding GUC_LIST_INPUT later is not really cool since it changes
the parsing behavior for the GUC.  If it's going to be a list, it should
be one from day zero.


I'm not sure what exactly you mean by this.  If the only allowed values 
are "none", "variable_shadowing" and "all", how is the behaviour for 
those going to change if we make it a list for 9.5?
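
For concreteness, those settings are given like this (the check name
shadowed_variables below is the one from Petr's doc patch later in the
thread; whether a comma-separated list such as the last line is accepted is
exactly what is being debated, and some_future_check is purely hypothetical):

SET plpgsql.extra_warnings = 'none';
SET plpgsql.extra_warnings = 'shadowed_variables';
SET plpgsql.extra_errors = 'all';
SET plpgsql.extra_warnings = 'shadowed_variables, some_future_check';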



Regards,
Marko Tiikkaja




Re: [HACKERS] four minor proposals for 9.5

2014-03-20 Thread Mark Kirkwood

On 20/03/14 20:08, Pavel Stehule wrote:




2014-03-20 7:25 GMT+01:00 Mark Kirkwood mark.kirkw...@catalyst.net.nz
Also I think this would probably only make sense for TEMPORARY
tables - otherwise you can get this sort of thing going on:

- you create a table and you have set a relation size limit
- you commit and keep working
- I add a whole lot of rows to your new table (taking it over the limit)
- you go to add some more rows to this table...


You cannot go over the session limit, and it does not matter whether you do
the inserts in several statements or in one.



Sorry Pavel - what you have said above is difficult for me to understand 
- if the limit is intended as a *session* limit then concurrent activity 
from multiple sessions makes it behave - well - strangely to say the 
least, as tables are essentially shared resources.


Regards

Mark





Re: [HACKERS] four minor proposals for 9.5

2014-03-20 Thread Pavel Stehule
2014-03-20 9:47 GMT+01:00 Mark Kirkwood mark.kirkw...@catalyst.net.nz:

 On 20/03/14 20:08, Pavel Stehule wrote:




 2014-03-20 7:25 GMT+01:00 Mark Kirkwood mark.kirkw...@catalyst.net.nz
 Also I think this would probably only make sense for TEMPORARY
 tables - otherwise you can get this sort of thing going on:

 - you create a table and you have set a relation size limit
 - you commit and keep working
 - I add a whole lot of rows to your new table (taking it over the
 limit)
 - you go to add some more rows to this table...


  You cannot go over the session limit, and it does not matter whether you
  do the inserts in several statements or in one.


 Sorry Pavel - what you have said above is difficult for me to understand -
 if the limit is intended as a *session* limit then concurrent activity from
 multiple sessions makes it behave - well - strangely to say the least, as
 tables are essentially shared resources.


I am sorry, I should have explained our use case first. Our product supports
multidimensional modelling - usually we have a few (fewer than 1000)
unlimited user data tables. When a user views some report, our engine
generates 10 - 100 queries, and the results of these queries are stored in
tables. The result of one calculation can then be shared between reports and
users. These tables (caches) are semi-temporary - their life cycle is about
an hour, a few days at most. Some queries in multidimensional analysis are
Cartesian products - we are not able to estimate the sizes of these tables
well, because the schema is free-form - users can create their own logical
model (and fill in the data freely) - and the variability of the generated
queries is too great.

So we need some safeguard in the background.

Regards

Pavel





 Regards

 Mark




Re: [HACKERS] Review: plpgsql.extra_warnings, plpgsql.extra_errors

2014-03-20 Thread Petr Jelinek

On 20/03/14 00:32, Tom Lane wrote:


TBH, if I thought this specific warning was the only one that would ever
be there, I'd probably be arguing to reject this patch altogether.


Of course, nobody assumes that it will be the only one.



Also, adding GUC_LIST_INPUT later is not really cool since it changes
the parsing behavior for the GUC.  If it's going to be a list, it should
be one from day zero.



Actually it does not, since it all has to be handled in the check/assign
hooks anyway.

Nevertheless, I made V6 with the doc change suggested by Alvaro and also
added this list-handling framework for the GUC parameters.
In the end it is probably less confusing now that the implementation
uses a bitmask instead of a bool, given that the user-facing functionality
talks about a list...

This obviously needs code review again (I haven't changed the tests, since
nothing changed from the user's perspective).



--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml
index bddd458..d1e6c9f 100644
--- a/doc/src/sgml/plpgsql.sgml
+++ b/doc/src/sgml/plpgsql.sgml
@@ -4711,6 +4711,52 @@ a_output := a_output || $$ if v_$$ || referrer_keys.kind || $$ like '$$
   /variablelist
 
   /sect2
+  <sect2 id="plpgsql-extra-checks">
+   <title>Additional compile-time checks</title>
+
+   <para>
+    To aid the user in finding instances of simple but common problems before
+    they cause harm, <application>PL/PgSQL</> provides additional
+    <replaceable>checks</>. When enabled, depending on the configuration, they
+    can be used to emit either a <literal>WARNING</> or an <literal>ERROR</>
+    during the compilation of a function.
+   </para>
+
+ <para>
+  These additional checks are enabled through the configuration variables
+  <varname>plpgsql.extra_warnings</> for warnings and
+  <varname>plpgsql.extra_errors</> for errors. Both can be set either to
+  a comma-separated list of checks, <literal>none</> or <literal>all</>.
+  The default is <literal>none</>. Currently the list of available checks
+  includes only one:
+  <variablelist>
+   <varlistentry>
+    <term><varname>shadowed_variables</varname></term>
+    <listitem>
+     <para>
+      Checks if a declaration shadows a previously defined variable.
+     </para>
+    </listitem>
+   </varlistentry>
+  </variablelist>
+
+  The following example shows the effect of <varname>plpgsql.extra_warnings</>
+  set to <varname>shadowed_variables</>:
+<programlisting>
+CREATE FUNCTION foo(f1 int) RETURNS int AS $$
+DECLARE
+f1 int;
+BEGIN
+RETURN f1;
+END
+$$ LANGUAGE plpgsql;
+WARNING:  variable "f1" shadows a previously defined variable
+LINE 3: f1 int;
+        ^
+CREATE FUNCTION
+</programlisting>
+ </para>
+ </sect2>
  /sect1
 
   !--  Porting from Oracle PL/SQL  --
diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c
index 5afc2e5..12ac964 100644
--- a/src/pl/plpgsql/src/pl_comp.c
+++ b/src/pl/plpgsql/src/pl_comp.c
@@ -352,6 +352,9 @@ do_compile(FunctionCallInfo fcinfo,
 	function->out_param_varno = -1;		/* set up for no OUT param */
 	function->resolve_option = plpgsql_variable_conflict;
 	function->print_strict_params = plpgsql_print_strict_params;
+	/* only promote extra warnings and errors at CREATE FUNCTION time */
+	function->extra_warnings = forValidator ? plpgsql_extra_warnings : 0;
+	function->extra_errors = forValidator ? plpgsql_extra_errors : 0;
 
 	if (is_dml_trigger)
 		function->fn_is_trigger = PLPGSQL_DML_TRIGGER;
@@ -849,6 +852,9 @@ plpgsql_compile_inline(char *proc_source)
 	function->out_param_varno = -1;		/* set up for no OUT param */
 	function->resolve_option = plpgsql_variable_conflict;
 	function->print_strict_params = plpgsql_print_strict_params;
+	/* don't do extra validation for inline code as we don't want to add spam at runtime */
+	function->extra_warnings = 0;
+	function->extra_errors = 0;
 
 	plpgsql_ns_init();
 	plpgsql_ns_push(func_name);
diff --git a/src/pl/plpgsql/src/pl_gram.y b/src/pl/plpgsql/src/pl_gram.y
index c0cb585..91186c6 100644
--- a/src/pl/plpgsql/src/pl_gram.y
+++ b/src/pl/plpgsql/src/pl_gram.y
@@ -727,6 +727,21 @@ decl_varname	: T_WORD
 			  $1.ident, NULL, NULL,
 			  NULL) != NULL)
 			yyerror("duplicate declaration");
+
+		if (plpgsql_curr_compile->extra_warnings & PLPGSQL_XCHECK_SHADOWVAR ||
+			plpgsql_curr_compile->extra_errors & PLPGSQL_XCHECK_SHADOWVAR)
+		{
+			PLpgSQL_nsitem *nsi;
+			nsi = plpgsql_ns_lookup(plpgsql_ns_top(), false,
+									$1.ident, NULL, NULL, NULL);
+			if (nsi != NULL)
+				ereport(plpgsql_curr_compile->extra_errors & PLPGSQL_XCHECK_SHADOWVAR ? ERROR : WARNING,
+						(errcode(ERRCODE_DUPLICATE_ALIAS),
+						 errmsg("variable \"%s\" shadows a previously defined variable",
+								$1.ident),
+						 parser_errposition(@1)));
+		}
+
 	}
 | unreserved_keyword
 	{
@@ -740,6 +755,21 @@ decl_varname	: T_WORD
 			  $1, NULL, NULL,
 			  NULL) != NULL)
 			yyerror(duplicate 

Re: [HACKERS] effective_cache_size cannot be changed by a reload

2014-03-20 Thread Fujii Masao
On Thu, Mar 20, 2014 at 2:34 AM, Jeff Janes jeff.ja...@gmail.com wrote:
 In 9.4dev, if the server is started with effective_cache_size = -1, then it
 cannot be changed away from that without a restart.  If you change the
 config file and do a reload or pg_reload_conf(), it ignores the change
 without comment in the logs.

 If you start the server with a value other than -1, then you can change the
 value by editing the file and doing a reload.  You can even change it to -1,
 and then change it back away from -1 again.

I think that's a bug. Patch attached.

 I don't know if bugs reports (without patches) against pre-release versions
 are supposed to go to hackers or to bugs.

Either works at least for me.
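
For anyone trying to reproduce Jeff's report, the sequence is roughly this
(sketch only; the replacement value is just an example):

-- server started with effective_cache_size = -1 in postgresql.conf
SHOW effective_cache_size;
-- now set effective_cache_size = '8GB' in postgresql.conf, then:
SELECT pg_reload_conf();
SHOW effective_cache_size;   -- per the report, the new value is not picked up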

Regards,

-- 
Fujii Masao
*** a/src/backend/optimizer/path/costsize.c
--- b/src/backend/optimizer/path/costsize.c
***
*** 4145,4151  set_default_effective_cache_size(void)
  		effective_cache_size = 0;
  		/* and let check_effective_cache_size() compute the setting */
 		SetConfigOption("effective_cache_size", "-1",
! 		PGC_POSTMASTER, PGC_S_OVERRIDE);
  	}
 	Assert(effective_cache_size > 0);
  }
--- 4145,4151 
  		effective_cache_size = 0;
  		/* and let check_effective_cache_size() compute the setting */
 		SetConfigOption("effective_cache_size", "-1",
! 		PGC_POSTMASTER, PGC_S_FILE);
  	}
 	Assert(effective_cache_size > 0);
  }



Re: [HACKERS] jsonb and nested hstore

2014-03-20 Thread Alexander Korotkov
I've noticed two commits on github.

commit b8199ee3c2506ab81b47a0b440363fc90c0d6956
Author: Peter Geoghegan p...@heroku.com
Date:   Wed Mar 19 02:02:16 2014 -0700

For jsonb_hash_ops, hash less

By limiting the GIN entries to the least-nested level, the delicious.com
sample JSON dataset index shrinks in size from 382MB to 255MB without
any apparent downside.

commit 2cea5213dba011625fc0d5c6b447e838080087b1
Author: Peter Geoghegan p...@heroku.com
Date:   Wed Mar 19 02:13:42 2014 -0700

Revert For jsonb_hash_ops, hash less

This might be workable with another approach, but leave it for now. This
reverts commit b8199ee3c2506ab81b47a0b440363fc90c0d6956.

Besides the implementation, what was the idea here? To me, it seems
impossible to skip any single element, because a query might reference only
that element. If we skip that element, we can no longer answer the
corresponding query.
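
For example, this is the sort of containment query that has to be answerable
from the index entries alone (a hypothetical sketch; the table and column are
made up, and the opclass spelling follows the jsonb_hash_ops name used in the
commits above):

CREATE INDEX docs_doc_idx ON docs USING gin (doc jsonb_hash_ops);

-- references only a nested element; if nested levels were not indexed,
-- the index could not answer it
SELECT * FROM docs WHERE doc @> '{"address": {"city": "Oakland"}}';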

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] inherit support for foreign tables

2014-03-20 Thread Etsuro Fujita

(2014/03/18 18:38), Kyotaro HORIGUCHI wrote:

By the way, Can I have a simple script to build an environment to
run this on?


I built a test environment and ran a simple test using
postgres_fdw, and got a parameterized path from the v3 patch for the
following operation as shown there; v6 also gives one, but I
haven't seen the reparameterization in the v6 patch actually work.

# How could I have got it to work before?

Do you have any idea how to make postgresReparameterizeForeignPath
work effectively on foreign (child) tables?



=# explain analyze select pu1.*
 from pu1 join rpu1 on (pu1.c = rpu1.c) where pu1.a = 3;


ISTM postgresReparameterizeForeignPath() cannot be called in this query 
in principle.  Here is a simple example for the case where the 
use_remote_estimate option is true:


# On mydatabase

mydatabase=# CREATE TABLE mytable (id INTEGER, x INTEGER);
CREATE TABLE
mydatabase=# INSERT INTO mytable SELECT x, x FROM generate_series(0, 
) x;

INSERT 0 1

# On postgres

postgres=# CREATE TABLE inttable (id INTEGER);
CREATE TABLE
postgres=# INSERT INTO inttable SELECT x FROM generate_series(0, ) x;
INSERT 0 1
postgres=# ANALYZE inttable;
ANALYZE

postgres=# CREATE TABLE patest0 (id INTEGER, x INTEGER);
CREATE TABLE
postgres=# CREATE TABLE patest1 () INHERITS (patest0);
CREATE TABLE
postgres=# INSERT INTO patest1 SELECT x, x FROM generate_series(0, ) x;
INSERT 0 1
postgres=# CREATE INDEX patest1_id_idx ON patest1(id);
CREATE INDEX
postgres=# CREATE SERVER myserver FOREIGN DATA WRAPPER postgres_fdw 
OPTIONS (host 'localhost', dbname 'mydatabase');

CREATE SERVER
postgres=# CREATE USER MAPPING FOR PUBLIC SERVER myserver OPTIONS (user 
'pgsql');

CREATE USER MAPPING
postgres=# CREATE FOREIGN TABLE patest2 () INHERITS (patest0) SERVER 
myserver OPTIONS (table_name 'mytable');

CREATE FOREIGN TABLE
postgres=# ANALYZE patest0;
ANALYZE
postgres=# ANALYZE patest1;
ANALYZE
postgres=# ANALYZE patest2;
ANALYZE
postgres=# EXPLAIN VERBOSE SELECT * FROM patest0 join (SELECT id FROM 
inttable LIMIT 1) ss ON patest0.id = ss.id;

                                  QUERY PLAN
------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..478.36 rows=2 width=12)
   Output: patest0.id, patest0.x, inttable.id
   ->  Limit  (cost=0.00..0.01 rows=1 width=4)
         Output: inttable.id
         ->  Seq Scan on public.inttable  (cost=0.00..145.00 rows=1 width=4)
               Output: inttable.id
   ->  Append  (cost=0.00..478.31 rows=3 width=8)
         ->  Seq Scan on public.patest0  (cost=0.00..0.00 rows=1 width=8)
               Output: patest0.id, patest0.x
               Filter: (inttable.id = patest0.id)
         ->  Index Scan using patest1_id_idx on public.patest1  (cost=0.29..8.30 rows=1 width=8)
               Output: patest1.id, patest1.x
               Index Cond: (patest1.id = inttable.id)
         ->  Foreign Scan on public.patest2  (cost=100.00..470.00 rows=1 width=8)
               Output: patest2.id, patest2.x
               Remote SQL: SELECT id, x FROM public.mytable WHERE (($1::integer = id))
 Planning time: 0.233 ms
(17 rows)

I revised the patch.  Patch attached, though I plan to update the
documentation further early next week.


Thanks,

Best regards,
Etsuro Fujita
*** a/contrib/file_fdw/file_fdw.c
--- b/contrib/file_fdw/file_fdw.c
***
*** 117,122  static void fileGetForeignRelSize(PlannerInfo *root,
--- 117,126 
  static void fileGetForeignPaths(PlannerInfo *root,
RelOptInfo *baserel,
Oid foreigntableid);
+ static ForeignPath *fileReparameterizeForeignPath(PlannerInfo *root,
+                                                   RelOptInfo *baserel,
+                                                   Path *path,
+                                                   Relids required_outer);
  static ForeignScan *fileGetForeignPlan(PlannerInfo *root,
   RelOptInfo *baserel,
   Oid foreigntableid,
***
*** 145,150  static bool check_selective_binary_conversion(RelOptInfo 
*baserel,
--- 149,155 
  static void estimate_size(PlannerInfo *root, RelOptInfo *baserel,
  FileFdwPlanState *fdw_private);
  static void estimate_costs(PlannerInfo *root, RelOptInfo *baserel,
+  List *join_conds,
   FileFdwPlanState *fdw_private,
   Cost *startup_cost, Cost *total_cost);
  static int file_acquire_sample_rows(Relation onerel, int elevel,
***
*** 163,168  file_fdw_handler(PG_FUNCTION_ARGS)
--- 168,174 
  
 	fdwroutine->GetForeignRelSize = fileGetForeignRelSize;
 	fdwroutine->GetForeignPaths 

Re: [HACKERS] Portability issues in shm_mq

2014-03-20 Thread Robert Haas
On Tue, Mar 18, 2014 at 4:14 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Tue, Mar 18, 2014 at 12:15 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Meh.  I think you're putting a bit too much faith in your ability to
 predict the locus of bugs that you think aren't there.

 Well, I'm open to suggestions.

 As a suggestion: it'd be worth explicitly testing zero-byte and one-byte
 messages, those being obvious edge cases.  Then, say, randomly chosen
 lengths in the range 100-1000; this would help ferret out odd-length
 issues.  And something with message sizes larger than the queue size.

All right, done.  Let's see if that tickles any edge cases we haven't
hit before.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Archive recovery won't be completed on some situation.

2014-03-20 Thread Alvaro Herrera
Kyotaro HORIGUCHI wrote:
 Hi, I confirmed that commit 82233ce7ea4 is indeed what changed this.
 
 At Wed, 19 Mar 2014 09:35:16 -0300, Alvaro Herrera wrote
  Fujii Masao wrote:
   On Wed, Mar 19, 2014 at 7:57 PM, Heikki Linnakangas
   hlinnakan...@vmware.com wrote:
  
     9.4 cancels backup mode even on immediate shutdown so the
     operation causes no problem, but 9.3 and earlier do not.
   
Hmm, I don't think we've changed that behavior in 9.4.
   
    ISTM 82233ce7ea42d6ba519aaec63008aff49da6c7af changed immediate
    shutdown that way.
  
  Uh, interesting.  I didn't see that secondary effect.  I hope it's not
  for ill?
 
 The crucial factor in the behavior change is that pmdie() no longer
 exits immediately on SIGQUIT. Before the patch, the 'case SIGQUIT:'
 branch in pmdie() ended with ExitPostmaster(0), but now it ends with
 'PostmasterStateMachine(); break;', so the postmaster keeps running
 with pmState = PM_WAIT_BACKENDS, just as for SIGINT (fast shutdown).

 Eventually pmState reaches PM_NO_CHILDREN via PM_WAIT_DEAD_END as
 SIGCHLDs arrive from the non-significant processes, and then
 CancelBackup() is called.

Judging from what was being said on the thread, it seems that running
CancelBackup() after an immediate shutdown is better than not doing it,
correct?

 Focusing on that point, the small patch below restores the behavior of
 9.3 and earlier, but I am not sure whether that is appropriate given
 the intention of the original patch.

I see.  Obviously your patch would, in effect, revert 82233ce7ea
completely, which is not something we want.  I think if we want to go
back to the previous behavior of not stopping the backup, some other
method should be used.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Risk Estimation WAS: Planner hints in Postgresql

2014-03-20 Thread Robert Haas
On Tue, Mar 18, 2014 at 2:41 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Atri Sharma atri.j...@gmail.com writes:
 One of the factors that leads to bad estimates is that the histogram of the
 values of a column maintained by the planner gets old by time and the data
 in the column changes. So, the histogram is no longer a quite accurate view
 of the data and it leads to bad selectivity.

 TBH, this is so far down the list of problems that it'll be a long time
 before we need to worry about it.  It's certainly not the number one
 priority for any project to model risk in the planner.

 The thing that I think is probably the number one problem is estimates
 that depend on an assumption of uniform distribution of sought-after rows
 among those encountered by a scan.  This is usually where bad plans for
 LIMIT queries are coming from.  We could certainly add some sort of fudge
 factor to those costs, but I'd like to have a more-or-less principled
 framework for doing so.

I think the problem is, in some sense, more basic than that.  I think
the kind of query we're talking about here is:

SELECT * FROM foo WHERE unlikely ORDER BY indexed_column LIMIT 1

Assume for the sake of argument that there are 100 rows that would be
returned in the absence of the limit.  Let SC and TC be the startup
cost and total cost of the index scan.  As a matter of general policy,
we're going to say that the cost of this is SC + 0.01 * (TC - SC).
What makes this path look appealing to the planner is that SC is small
relative to TC.  If we knew, for example, that we weren't going to
find the first match until 90% of the way through the index scan, then
we could set SC = 90% * TC and, all else being equal, the planner
would make the right decision.
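
Plugging assumed numbers into that formula makes the asymmetry concrete
(TC = 10000, 100 matching rows, LIMIT 1, i.e. the 0.01 fraction above;
purely illustrative arithmetic, not planner output):

-- cost charged to the limited scan when SC is effectively 0:
SELECT 0 + 0.01 * (10000 - 0) AS cost_with_sc_zero;                       -- 100
-- cost it would charge if it knew the first match comes 90% of the way in:
SELECT 0.9 * 10000 + 0.01 * (10000 - 0.9 * 10000) AS cost_with_sc_90pct;  -- 9010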

So you might think that the problem here is that we're assuming
uniform density.  Let's say there are a million rows in the table, and
there are 100 that match our criteria, so the first one is going to
happen 1/10,000'th of the way through the table.  Thus we set SC =
0.0001 * TC, and that turns out to be an underestimate if the
distribution isn't as favorable as we're hoping.  However, that is NOT
what we are doing.  What we are doing is setting SC = 0.  I mean, not
quite 0, but yeah, effectively 0. Essentially we're assuming that no
matter how selective the filter condition may be, we assume that it
will match *the very first row*.

So we're not assuming the average case and getting hosed when things
come out worse than average.  We're assuming the *best* case.  So
unless things happen to really swing in our favor, we got hosed.

Now it might be that a fudge factor of 2 or 1.5 or 10 or 3 or 17 is
appropriate, so that we actually assume we're going to have to scan a
little more of the index than we expect.  That can perhaps be
justified by the possibility that there may actually be NO rows
matching the filter condition, and we'll have to try scanning the
entire index to get off the ground.  We could also try to come up with
a mathematical model for that.  But that fudge factor would presumably
be a multiplier on the effort of finding the first tuple.  And right
now we assume that finding the first tuple will be trivial.  So I
think we should fix THAT problem first, and then if that turns out to
be insufficient, we can worry about what further fudging is required.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Risk Estimation WAS: Planner hints in Postgresql

2014-03-20 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 So you might think that the problem here is that we're assuming
 uniform density.  Let's say there are a million rows in the table, and
 there are 100 that match our criteria, so the first one is going to
 happen 1/10,000'th of the way through the table.  Thus we set SC =
 0.0001 * TC, and that turns out to be an underestimate if the
 distribution isn't as favorable as we're hoping.  However, that is NOT
 what we are doing.  What we are doing is setting SC = 0.  I mean, not
 quite 0, but yeah, effectively 0. Essentially we're assuming that no
 matter how selective the filter condition may be, we assume that it
 will match *the very first row*.

I think this is wrong.  Yeah, the SC may be 0 or near it, but the time to
fetch the first tuple is estimated as SC + (TC-SC)/N.

regards, tom lane




Re: [HACKERS] Review: plpgsql.extra_warnings, plpgsql.extra_errors

2014-03-20 Thread Tom Lane
Marko Tiikkaja ma...@joh.to writes:
 On 3/20/14, 12:32 AM, Tom Lane wrote:
 Also, adding GUC_LIST_INPUT later is not really cool since it changes
 the parsing behavior for the GUC.  If it's going to be a list, it should
 be one from day zero.

 I'm not sure what exactly you mean by this.  If the only allowed values 
 are none, variable_shadowing and all, how is the behaviour for 
 those going to change if we make it a list for 9.5?

If we switch to using SplitIdentifierString later, which is the typical
implementation of parsing list GUCs, that will do things like case-fold,
remove double quotes, remove white space.  It's possible that that's
completely upward-compatible with what happens if you don't do that ...
but I'm not sure about it.

In any case, if the point of this patch is to provide a framework for
extra error detection, I'm not sure why we'd arbitrarily say we're going
to leave the framework unfinished in the GUC department.

regards, tom lane




Re: [HACKERS] Risk Estimation WAS: Planner hints in Postgresql

2014-03-20 Thread Atri Sharma
On Thu, Mar 20, 2014 at 8:10 PM, Robert Haas robertmh...@gmail.com wrote:

 On Tue, Mar 18, 2014 at 2:41 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  Atri Sharma atri.j...@gmail.com writes:
  One of the factors that leads to bad estimates is that the histogram of
 the
  values of a column maintained by the planner gets old by time and the
 data
  in the column changes. So, the histogram is no longer a quite accurate
 view
  of the data and it leads to bad selectivity.
 
  TBH, this is so far down the list of problems that it'll be a long time
  before we need to worry about it.  It's certainly not the number one
  priority for any project to model risk in the planner.
 
  The thing that I think is probably the number one problem is estimates
  that depend on an assumption of uniform distribution of sought-after rows
  among those encountered by a scan.  This is usually where bad plans for
  LIMIT queries are coming from.  We could certainly add some sort of fudge
  factor to those costs, but I'd like to have a more-or-less principled
  framework for doing so.

 I think the problem is, in some sense, more basic than that.  I think
 the kind of query we're talking about here is:

 SELECT * FROM foo WHERE unlikely ORDER BY indexed_column LIMIT 1

 Assume for the sake of argument that there are 100 rows that would be
 returned in the absence of the limit.  Let SC and TC be the startup
 cost and total cost of the index scan.  As a matter of general policy,
 we're going to say that the cost of this is SC + 0.01 * (TC - SC).
 What makes this path look appealing to the planner is that SC is small
 relative to TC.  If we knew, for example, that we weren't going to
 find the first match until 90% of the way through the index scan, then
 we could set SC = 90% * TC and, all else being equal, the planner
 would make the right decision.

 So you might think that the problem here is that we're assuming
 uniform density.  Let's say there are a million rows in the table, and
 there are 100 that match our criteria, so the first one is going to
 happen 1/10,000'th of the way through the table.  Thus we set SC =
 0.0001 * TC, and that turns out to be an underestimate if the
 distribution isn't as favorable as we're hoping.  However, that is NOT
 what we are doing.  What we are doing is setting SC = 0.  I mean, not
 quite 0, but yeah, effectively 0. Essentially we're assuming that no
 matter how selective the filter condition may be, we assume that it
 will match *the very first row*.



Can we not reuse the same histogram we have in the planner right now for
this? I mean, AFAIK, the heuristic we have is that we divide the histogram
into equal-size buckets and then find the bucket in which our predicate
value lies, then take some part of that bucket and all of the buckets
before it, right?

So, suppose a query is SELECT * FROM table WHERE a < 10; we shall find the
bucket that 10 lies in, right?

Now, why cannot we take the estimate of all the buckets behind the bucket
in which our value is present? Will that estimate not give us the fraction
of tuples that are expected to be before the first matching row?
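
For reference, the histogram being discussed is visible through pg_stats;
for a made-up table foo with column a:

SELECT histogram_bounds
FROM pg_stats
WHERE schemaname = 'public' AND tablename = 'foo' AND attname = 'a';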

It's pretty wild, but I wanted to know whether my understanding of this
scenario is correct.

Regards,

Atri

-- 
Regards,

Atri
*l'apprenant*


Re: [HACKERS] Risk Estimation WAS: Planner hints in Postgresql

2014-03-20 Thread Tom Lane
Atri Sharma atri.j...@gmail.com writes:
 Now, why cannot we take the estimate of all the buckets behind the bucket
 in which our value is present? Will that estimate not give us the fraction
 of tuples that are expected to be before the first matching row?

Uh, no, not unless you assume that the table happens to be perfectly
sorted by the column's value.

regards, tom lane




Re: [HACKERS] Risk Estimation WAS: Planner hints in Postgresql

2014-03-20 Thread Atri Sharma
On Thu, Mar 20, 2014 at 8:51 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Atri Sharma atri.j...@gmail.com writes:
  Now, why cannot we take the estimate of all the buckets behind the bucket
  in which our value is present? Will that estimate not give us the
 fraction
  of tuples that are expected to be before the first matching row?

 Uh, no, not unless you assume that the table happens to be perfectly
 sorted by the column's value.




Yes, that is true. So, if an attribute has an index present, can we do this
somehow?

Regards,

Atri



-- 
Regards,

Atri
*l'apprenant*


Re: [HACKERS] effective_cache_size cannot be changed by a reload

2014-03-20 Thread Tom Lane
Fujii Masao masao.fu...@gmail.com writes:
 On Thu, Mar 20, 2014 at 2:34 AM, Jeff Janes jeff.ja...@gmail.com wrote:
 In 9.4dev, if the server is started with effective_cache_size = -1, then it
 cannot be changed away from that without a restart.

 I think that's a bug. Patch attached.

PGC_S_FILE is at least as bogus as the previous choice; for one thing,
such a source setting implies there should be a file and line number
recorded.

I think PGC_S_DYNAMIC_DEFAULT is the right thing, but I've not absorbed
much caffeine yet today.  Also, if that is the right thing, the section of
guc-file.l beginning at about line 284 needs to get taught about it; which
probably means that set_default_effective_cache_size needs a rethink so
that it can be applied and do something useful in that situation.

regards, tom lane




Re: [HACKERS] Re: [COMMITTERS] pgsql: libpq: change PQconndefaults() to ignore invalid service files

2014-03-20 Thread Bruce Momjian
On Sat, Mar  8, 2014 at 08:44:34PM -0500, Bruce Momjian wrote:
 [Just getting back to this.]
 
 Agreed.  I have developed the attached patch which passes the strdup()
 failure up from pg_fe_getauthname() and maps the failure to
 PQconndefaults(), which is now documented as being memory allocation
 failure.
 
 FYI, there was odd coding in PQconndefaults() where we set local
 variable 'name' to NULL, then we tested to see if it was NULL --- I
 removed that test.
 
  idea that we're ignoring failure returns from pqGetpwuid/GetUserName?
 
 If we want pqGetpwuid/GetUserName to be a special return value, we would
 need to modify PQconndefaults()'s API, which doesn't seem worth it.

Applied.  I added a C comment about why we ignore pqGetpwuid/GetUserName
failures.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Risk Estimation WAS: Planner hints in Postgresql

2014-03-20 Thread Robert Haas
On Thu, Mar 20, 2014 at 10:45 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 So you might think that the problem here is that we're assuming
 uniform density.  Let's say there are a million rows in the table, and
 there are 100 that match our criteria, so the first one is going to
 happen 1/10,000'th of the way through the table.  Thus we set SC =
 0.0001 * TC, and that turns out to be an underestimate if the
 distribution isn't as favorable as we're hoping.  However, that is NOT
 what we are doing.  What we are doing is setting SC = 0.  I mean, not
 quite 0, but yeah, effectively 0. Essentially we're assuming that no
 matter how selective the filter condition may be, we assume that it
 will match *the very first row*.

 I think this is wrong.  Yeah, the SC may be 0 or near it, but the time to
 fetch the first tuple is estimated as SC + (TC-SC)/N.

Hmm, you're right, and experimentation confirms that the total cost of
the limit comes out to about TC/selectivity.  So scratch that theory.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] QSoC proposal: date_trunc supporting intervals

2014-03-20 Thread Alexandr

Hello!
Here is the text of my proposal which I've applied to GSoC.
(and link 
https://docs.google.com/document/d/1vBjQzhFT_fgoIkoEP5TVeyFA6ggsYlLq76tghGVUD6A/edit?usp=sharing)

Any suggestions and comments are welcome.
Because I don't know the code of PostgreSQL well, I decided not to
participate in QSoC with my previous proposal (rewriting pg_dump and
pg_restore as libraries). But I'm very interested in participating in QSoC
2014 as a part of PostgreSQL. So this is my new proposal.


*

PostgreSQL GSoC 2014 proposal

Project name

date_trunc() supporting intervals

Short description

The function date_trunc() is conceptually similar to the trunc function
for numbers. But currently it doesn't have full functionality, because
intervals are not fully supported in date_trunc().


Name: Alexander Shvidchenko

E-mail: askel...@gmail.com mailto:askel...@gmail.com

Location: Rostov-on-Don, Russia (UTC +04.00)

Benefits to the PostgreSQL Community

This feature will expand the opportunities for working with time in
databases. It will make working with time more flexible and easier.


Quantifiable results

Support for, and correct handling of, intervals in date_trunc()

Project Schedule

until May 31

Review the code and settle architecture questions with the help of the community

1 June – 30 June

Detailed implementation of libraries.

1 July – 31 July

Finish the implementation of the libraries and begin testing.

1 August -15 August

Final refactoring, testing and commit.

Some details

In the period until May 31 I need to work out what kinds of intervals
can be passed in. Also I need to decide the format of the result.


For example:

date_trunc('week', '1 month 15 day'::interval)

would return either

'1 month 14 day'

or

'1 month 2 week'

It seems like this project idea isn't very difficult or large. So if I
have time after finishing this job, I'll be able to do more work: close some
bugs or implement some more features that will be useful for the community.


Academic experience

I entered university in 2013. Before that I finished college in 2012.
My graduation project in college was a client-server application. It was
an implementation of XMPP. The client was written in Qt. The client
worked with an SQLite database and the server with a MySQL database.


Why PostgreSQL?

- I'm interested in this idea and believe this project would be useful
for the community;

- PostgreSQL is a very respected community. I would be proud to be a
part of it;

- PostgreSQL is one of the best DBMSs and I would like to make it better.

Links

1) PostgreSQL 9.3.3 Documentation, date_trunc

http://www.postgresql.org/docs/9.3/static/functions-datetime.html#FUNCTIONS-DATETIME-TRUNC

*
With best wishes,
Alexander S.


[HACKERS]

2014-03-20 Thread Rajashree Mandaogane
While debugging any function in PostgreSQL, whenever I use the command
'bt', it doesn't give the entire list of functions used. Which command
should be used instead?


Re: [HACKERS]

2014-03-20 Thread Tom Lane
Rajashree Mandaogane rajashree@gmail.com writes:
 While debugging any function in PostgreSQL, whenever I use the command
 'bt', it doesn't give the entire list of functions used. Which command
 should be used instead?

It's probably omitting functions that have been inlined; if so, the fix
is to recompile with a lower -O level to prevent inlining.

regards, tom lane




Re: [HACKERS] jsonb and nested hstore

2014-03-20 Thread Peter Geoghegan
On Thu, Mar 20, 2014 at 5:32 AM, Alexander Korotkov
aekorot...@gmail.com wrote:
 Besides implementation, what the idea was here? For me, it's impossible to
 skip any single element, because it's possible for query to include only
 this element. If we skip that element, we can't answer corresponding query
 no more.

This had something to do with an alternative notion of containment. I
wouldn't have stuck with such a radical change without consulting you.
I reverted it, and am not going to argue for the idea right now.


-- 
Peter Geoghegan




Re: [HACKERS] GSoC 2014 - mentors, students and admins

2014-03-20 Thread Thom Brown
Hi all,

There is 1 day left to get submissions in, so students should ensure
that they submit their proposals as soon as possible.  No submissions
will be accepted beyond the deadline of 19:00 UTC tomorrow (Friday
21st March).

Regards

Thom




Re: [HACKERS] GSoC application: MADlib k-medoids clustering

2014-03-20 Thread Maxence Ahlouche
Hi,

My proposal is now available on Google melange website:
http://www.google-melange.com/gsoc/proposal/public/google/gsoc2014/viod/5668600916475904
There seems to be a formatting issue: half of the text is a link to the
page on my website that I mentioned during my registration. I don't know
how to fix it though.

Regards,
Maxence

-- 
Maxence Ahlouche
06 06 66 97 00


Re: [HACKERS] QSoC proposal: date_trunc supporting intervals

2014-03-20 Thread Josh Berkus
On 03/20/2014 09:56 AM, Alexandr wrote:
 Here is the text of my proposal which I've applied to GSoC.
 (and link
 https://docs.google.com/document/d/1vBjQzhFT_fgoIkoEP5TVeyFA6ggsYlLq76tghGVUD6A/edit?usp=sharing)
 
 Any suggestions and comments are welcome.
 Because I don't know the code of PostgreSQL well I decide not to
 participate is QSoC with previous proposal (rewrite pg_dump and
 pg_restore as libraries). But I'm very interested to participate in QSoC
 2014 as a part of PostgreSQL. So It's my new proposal.

Per my comments on the GSOC app, it looks good, but I'd like to see some
stretch goals if you are able to implement the new function before
GSOC is over.  For example, one thing which has been frequently
requested is functions to display intervals in the unit of your choice
... for example, convert 1 day to 86400 seconds.

Pick some stretch goals which work for you ... but I'd like to see some.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] HEAD seems to generate larger WAL regarding GIN index

2014-03-20 Thread Jesper Krogh

On 15/03/14 20:27, Heikki Linnakangas wrote:
That said, I didn't expect the difference to be quite that big when 
you're appending to the end of the table. When the new entries go to 
the end of the posting lists, you only need to recompress and WAL-log 
the last posting list, which is max 256 bytes long. But I guess that's 
still a lot more WAL than in the old format.


That could be optimized, but I figured we can live with it, thanks to 
the fastupdate feature. Fastupdate allows amortizing that cost over 
several insertions. But of course, you explicitly disabled that...


In a concurrent update environment, fastupdate as it is in 9.2 is not
really useful. You may be able to batch up insertions, but you have no
control over who ends up paying the debt. Doubling the amount of WAL
from GIN indexing would be pretty tough for us; on 9.2 we generate
roughly 1TB of WAL per day, and we keep it for some weeks to be able to
do PITR. The WAL is mainly due to GIN index updates as new data is added
and needs to be searchable by users. We do run gzip, which cuts it down
to 25-30% before we keep it for long, but doubling this is going to be a
migration challenge.


If fastupdate could be made to work in an environment where we both have
users searching the index and manually updating it, and 4+ backend
processes updating the index concurrently, then it would be a real
benefit.
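
For reference, fastupdate is a per-index storage parameter (standard
syntax; the index, table and column names here are made up):

-- build the GIN index with the pending-list mechanism disabled
CREATE INDEX docs_tsv_idx ON docs USING gin (tsv) WITH (fastupdate = off);

-- or toggle it later
ALTER INDEX docs_tsv_idx SET (fastupdate = on);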


The GIN index currently contains 70+ million records, with an average
tsvector of 124 terms.


--
Jesper .. trying to add some real-world info.




- Heikki








Re: [HACKERS] QSoC proposal: date_trunc supporting intervals

2014-03-20 Thread Thom Brown
On 20 March 2014 20:07, Josh Berkus j...@agliodbs.com wrote:
 On 03/20/2014 09:56 AM, Alexandr wrote:
 Here is the text of my proposal which I've applied to GSoC.
 (and link
 https://docs.google.com/document/d/1vBjQzhFT_fgoIkoEP5TVeyFA6ggsYlLq76tghGVUD6A/edit?usp=sharing)

 Any suggestions and comments are welcome.
 Because I don't know the code of PostgreSQL well I decide not to
 participate is QSoC with previous proposal (rewrite pg_dump and
 pg_restore as libraries). But I'm very interested to participate in QSoC
 2014 as a part of PostgreSQL. So It's my new proposal.

 Per my comments on the GSOC app, it looks good, but I'd like to see some
 stretch goals if you are able to implement the new function before
 GSOC is over.  For example, one thing which has been frequently
 requested is functions to display intervals in the unit of your choice
  ... for example, convert 1 day to 86400 seconds.

+1

This is definitely something I've wanted in the past, like getting the
number of minutes between 2 timestamps without converting to seconds
since epoch then doing a subtraction.

like:

date_diff(timestamptz, timestamptz, interval) returns decimal

# SELECT date_diff('2014-02-04 12:44:18+0'::timestamptz, '2014-02-08 20:10:05+0'::timestamptz, '1 second');
 date_diff
-----------
    372347
(1 row)

# SELECT date_diff('2014-02-04 12:44:18+0'::timestamptz, '2014-02-08 20:10:05+0'::timestamptz, '5 seconds');
 date_diff
-----------
     74469
(1 row)

# SELECT date_diff('2014-02-04 12:44:18+0'::timestamptz, '2014-02-08 20:10:05+0'::timestamptz, '1 day');
     date_diff
--------------------
 4.3095717592592593
(1 row)


Although perhaps there's a more flexible and useful way of doing this
than that.  One would probably want to convert an interval to such units
too, like '3 days' in seconds.
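
For reference, a rough sketch of the existing workaround via epoch extraction
(using the same timestamps as above):

SELECT extract(epoch from ('2014-02-08 20:10:05+0'::timestamptz
                         - '2014-02-04 12:44:18+0'::timestamptz)) / 60;
-- roughly 6205.78 minutes

SELECT extract(epoch from interval '3 days');
-- 259200 seconds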

-- 
Thom




Re: [HACKERS] QSoC proposal: date_trunc supporting intervals

2014-03-20 Thread Alexandr


21.03.2014 00:07, Josh Berkus wrote:
Per my comments on the GSOC app, it looks good, but I'd like to see 
some stretch goals if you are able to implement the new function 
before GSOC is over. For example, one thing which has been frequently 
requested is functions to display intervals in the unit of your choice 
... for example, convert 1 day to 86400 seconds. Pick some stretch 
goals which work for you ... but I'd like to see some. 

I looked through TODO and found only 2 ideas with intervals:
1) Allow infinite intervals just like infinite timestamps
2) Have timestamp subtraction not call justify_hours() (formatting 
intervals with to_chars)

I want to add these ideas as stretch goals:
1) extract_total() - allows conversion of the interval to a total 
number of the user's desired unit

2) Allow TIMESTAMP WITH TIME ZONE
3) add function to allow the creation of timestamps using parameters
4) Add function to detect if an array is empty
Josh, what do you think about them?


Re: [HACKERS] QSoC proposal: date_trunc supporting intervals

2014-03-20 Thread Josh Berkus
On 03/20/2014 01:26 PM, Alexandr wrote:
 
 21.03.2014 00:07, Josh Berkus wrote:
 Per my comments on the GSOC app, it looks good, but I'd like to see
 some stretch goals if you are able to implement the new function
 before GSOC is over. For example, one thing which has been frequently
 requested is functions to display intervals in the unit of your choice
 ... for example, convert 1 day to 86400 seconds. Pick some stretch
 goals which work for you ... but I'd like to see some. 
 I looked through TODO and found only 2 ideas with intervals:
 1) Allow infinite intervals just like infinite timestamps
 2) Have timestamp subtraction not call justify_hours() (formatting
 intervals with to_chars)
 I want to add these ideas as stretch goals:
 1) extract_total() - allows conversion of the interval to a total
 number of the user's desired unit
 2) Allow TIMESTAMP WITH TIME ZONE
 3) add function to allow the creation of timestamps using parameters
 4) Add function to detect if an array is empty
 Josh, what do you think about them?

Comments:
#2: I don't understand this one?

#3 is already a patch for version 9.4, but possibly you can
improve/expand it.

#4 has already been the subject of a LOT of debate, I think you don't
want to get into it.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] QSoC proposal: date_trunc supporting intervals

2014-03-20 Thread Steve Atkins

On Mar 20, 2014, at 1:24 PM, Thom Brown t...@linux.com wrote:

 On 20 March 2014 20:07, Josh Berkus j...@agliodbs.com wrote:
 On 03/20/2014 09:56 AM, Alexandr wrote:
 Here is the text of my proposal which I've applied to GSoC.
 (and link
 https://docs.google.com/document/d/1vBjQzhFT_fgoIkoEP5TVeyFA6ggsYlLq76tghGVUD6A/edit?usp=sharing)
 
 Any suggestions and comments are welcome.
 Because I don't know the code of PostgreSQL well, I decided not to
 participate in QSoC with the previous proposal (rewrite pg_dump and
 pg_restore as libraries). But I'm very interested to participate in QSoC
 2014 as a part of PostgreSQL. So it's my new proposal.
 
 Per my comments on the GSOC app, it looks good, but I'd like to see some
 stretch goals if you are able to implement the new function before
 GSOC is over.  For example, one thing which has been frequently
 requested is functions to display intervals in the unit of your choice
 ... for example, convert 1 day to 86400 seconds.
 
 +1
 
 This is definitely something I've wanted in the past, like getting the
 number of minutes between 2 timestamps without converting to seconds
 since epoch then doing a subtraction.

It’d be nice, but isn’t it impossible with anything similar to the existing
interval type (as you lose data when you convert to an interval that you
can’t get back)?

Subtracting to get an interval, then converting that interval to seconds
or minutes could give you a value that’s wildly different from the right
answer.

Cheers,
  Steve





Re: [HACKERS] QSoC proposal: date_trunc supporting intervals

2014-03-20 Thread Alexandr


21.03.2014 00:33, Josh Berkus wrote:

Comments:
#2: I don't understand this one?
#3 is already a patch for version 9.4, but possibly you can 
improve/expand it.
#4 has already been the subject of a LOT of debate, I think you don't 
want to get into it. 
I meant this one: Allow TIMESTAMP WITH TIME ZONE to store the original 
timezone information, either zone name or offset from UTC

And which ideas can you advise me to add to the proposal?

With best wishes,
Alexander S.


Re: [HACKERS] QSoC proposal: date_trunc supporting intervals

2014-03-20 Thread Josh Berkus

 I meant this one: Allow TIMESTAMP WITH TIME ZONE to store the original
 timezone information, either zone name or offset from UTC
 And which ideas can you advise me to add to the proposal?

That one has also been hotly debated.  You'd probably have to do it as
an extension, and that would be a fairly large stretch goal.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] QSoC proposal: date_trunc supporting intervals

2014-03-20 Thread Alvaro Herrera
Alexandr wrote:
 
 21.03.2014 00:33, Josh Berkus wrote:
 Comments:
 #2: I don't understand this one?
 #3 is already a patch for version 9.4, but possibly you can
 improve/expand it.
 #4 has already been the subject of a LOT of debate, I think you
 don't want to get into it.
 I meant this one: Allow TIMESTAMP WITH TIME ZONE to store the
 original timezone information, either zone name or offset from UTC
 And which ideas can you advise me to add to proposal?

This has been discussed previously.  I doubt it makes a good GSoC
project.  Maybe if you were to create a new datatype that stored the
timestamptz plus the original timezone separately, it'd work better;
however I vaguely remember we discussed this a long time ago.  One of
the challenges was how to store the timezone; we didn't want to spend as
much as the whole text representation, so we wanted a catalog that
attached an OID to each timezone. It got real messy from there, and we
dropped the idea.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] QSoC proposal: date_trunc supporting intervals

2014-03-20 Thread Alexandr


Subtracting to get an interval, then converting that interval to 
seconds or minutes could give you a value that’s wildly different from 
the right answer. 

Can you explain to me when that happens?


With best wishes,
Alexander S.




Re: [HACKERS] QSoC proposal: date_trunc supporting intervals

2014-03-20 Thread Claudio Freire
On Thu, Mar 20, 2014 at 5:55 PM, Alexandr askel...@gmail.com wrote:
 Subtracting to get an interval, then converting that interval to seconds
 or minutes could give you a value that's wildly different from the right
 answer.

 Can you explain me when it happens ?


'1 month'::interval

It's different depending on which month we're talking about.
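
A quick illustration of the ambiguity (epoch extraction has to assume a
30-day month, while adding the interval to a concrete date does not):

SELECT extract(epoch from interval '1 month');
-- 2592000, i.e. 30 days

SELECT timestamp '2014-02-01' + interval '1 month' - timestamp '2014-02-01';
-- 28 days

SELECT timestamp '2014-03-01' + interval '1 month' - timestamp '2014-03-01';
-- 31 days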




[HACKERS] Array of composite types returned from python

2014-03-20 Thread Behn, Edward (EBEHN)
I've endeavored to enable the return of arrays of composite types from code
written in PL/Python.  It seems that this can be accomplished through a very
minor change to the code:

 

On line 401 in the file src/pl/plpython/plpy_typeio.c, remove the error
report "PL/Python functions cannot return type" and replace it with the
command 

arg->func = PLyObject_ToComposite; 

 

From all that I can see, this does exactly what I want. A python list of
tuples is converted to an array of composite types in SQL. 
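
For illustration, a sketch of the behaviour being described, using a made-up
composite type and function (with an unpatched server this fails with the
error message mentioned above):

CREATE TYPE pair AS (a integer, b text);

CREATE FUNCTION make_pairs() RETURNS pair[] LANGUAGE plpythonu AS $$
    # a Python list of tuples, one tuple per composite value
    return [(1, 'one'), (2, 'two')]
$$;

SELECT make_pairs();
-- expected result with the change applied: {"(1,one)","(2,two)"}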

 

I ran the main and python regression suites for both python2 and python3
with assert enabled. The only discrepancies I got were ones that were due to
the output expecting an error. When I altered the .out files to the expected
behavior, it matched just fine. 

 

Am I missing anything (i.e. memory leak, undesirable behavior elsewhere)? 

 -Ed 

 

 

Ed Behn / Staff Engineer / Airline and Network Services
Information Management Services
2551 Riva Road, Annapolis, MD 21401 USA
Phone: 410.266.4426 / Cell: 240.696.7443
eb...@arinc.com
www.rockwellcollins.com



 



Re: [HACKERS] QSoC proposal: Rewrite pg_dump and pg_restore

2014-03-20 Thread Robert Haas
On Tue, Mar 18, 2014 at 8:41 PM, Alexandr askel...@gmail.com wrote:
 Rewrite (add) pg_dump and pg_restore utilities as libraries (.so, .dll & .dylib)

This strikes me as (1) pretty vague and (2) probably too hard for a
summer project.

I mean, getting the existing binaries to build libraries that you can
call with some trivial interface that mimics the existing command-line
functionality of pg_dump might be doable, but that's not all that
interesting.  What people are really going to want is a library with a
sophisticated API that lets you do interesting things
programmatically.  But that's going to be hard.  AFAIK, nobody's even
tried to figure out what that API should look like.  Even if we had
that worked out, a non-trivial task, the pg_dump source code is a
mess, so refactoring it to provide such an API is likely to be a job
and a half.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS]

2014-03-20 Thread Craig Ringer
On 03/21/2014 01:12 AM, Tom Lane wrote:
 Rajashree Mandaogane rajashree@gmail.com writes:
 While debugging any function in PostgreSQL, whenever I use the command
 'bt', it doesn't give the entire list of functions used. Which command
 should be used instead?
 
 It's probably omitting functions that have been inlined; if so, the fix
 is to recompile with a lower -O level to prevent inlining.

For more details, see
https://wiki.postgresql.org/wiki/Developer_FAQ#What_debugging_features_are_available.3F
.

If you want to completely prevent inlining, you can use -O0 instead of -Og.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] QSoC proposal: Rewrite pg_dump and pg_restore

2014-03-20 Thread Craig Ringer
On 03/21/2014 09:28 AM, Robert Haas wrote:
 On Tue, Mar 18, 2014 at 8:41 PM, Alexandr askel...@gmail.com wrote:
 Rewrite (add) pg_dump and pg_restore utilities as libraries (.so, .dll & .dylib)
 
 This strikes me as (1) pretty vague and (2) probably too hard for a
 summer project.
 
 I mean, getting the existing binaries to build libraries that you can
 call with some trivial interface that mimics the existing command-line
 functionality of pg_dump might be doable, but that's not all that
 interesting.  What people are really going to want is a library with a
 sophisticated API that lets you do interesting things
 programmatically.  But that's going to be hard.  AFAIK, nobody's even
 tried to figure out what that API should look like.  Even if we had
 that worked out, a non-trivial task, the pg_dump source code is a
 mess, so refactoring it to provide such an API is likely to be a job
 and a half.

... and still wouldn't solve one of the most frequently requested things
for pg_dump / pg_restore, which is the ability to use them *server-side*
over a regular PostgreSQL connection. It'd be useful progress toward
that, though.

Right now, we can't even get the PostgreSQL server to emit DDL for a
table, let alone do anything more sophisticated.

Here's how I think it needs to look:

- Design a useful API for pg_dump and pg_restore that is practical to
  use for pg_dump and pg_restore's current tasks (fast database
  dump/restore) and also useful for extracting specific objects
  from the database. When designing, consider that we'll want to
  expose this API or functions that use it over SQL later.

- Create a new libpqdump library.

- Implement the designed API in the new library, moving and
  adjusting code from pg_dump / pg_restore where possible, writing
  new code where not.

- Refactor (closer to rewrite) pg_dump and pg_restore to use libpqdump,
  removing as much knowledge of the system catalogs etc as possible from
  them.

- Make sure the result still performs OK

THEN, once that's settled in:

- Modify libpqdump to support compilation as a backend extension, with
  use of the SPI for queries and use of syscaches or direct scans
  where possible.

- Write a pg_dump extension that uses libpqdump in SPI mode
  to expose its API over SQL, or at least uses it to provide SQL
  functions to describe database objects. So you can dump a DB,
  or a subset of it, over SQL.
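
Purely as a hypothetical illustration of that last step; none of these
functions exist today, and the names are invented:

SELECT pg_dump_object_ddl('public.my_table'::regclass);

SELECT object_identity, ddl
  FROM pg_dump_schema('public')
 ORDER BY dump_order;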

After all, a libpqdump won't do much good for the large proportion of
PostgreSQL users who use Java/JDBC, who can't use a native library
(without hideous hacks with JNI). For the very large group who use libpq
via language-specific client interfaces like the Pg gem for Ruby,
psycopg2 for Python, DBD::Pg for Perl, etc., it'll require a lot of work
to wrap the API and maintain it. Whereas a server-side SQL-callable
interface would be useful and immediately usable for all of them.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] QSoC proposal: Rewrite pg_dump and pg_restore

2014-03-20 Thread Tom Lane
Craig Ringer cr...@2ndquadrant.com writes:
 Here's how I think it needs to look:
 [ move all the functionality to the backend ]

Of course, after you've done all that work, you've got something that is
of exactly zero use to its supposed principal use-case, pg_dump.  pg_dump
will still have to support server versions that predate all these fancy
new dump functions, and that pretty much ensures that most of pg_dump's
core functionality will still be on the client side.  Or, if you try to
finesse that problem by making sure the new server APIs correspond to
easily-identified pieces of pg_dump code, you'll probably end up with APIs
that nobody else wants to use :-(.

In any case, I quite agree with the sentiment that this is not a suitable
problem for a GSOC project.

regards, tom lane




Re: [HACKERS] QSoC proposal: Rewrite pg_dump and pg_restore

2014-03-20 Thread Craig Ringer
On 03/21/2014 11:09 AM, Tom Lane wrote:
 Craig Ringer cr...@2ndquadrant.com writes:
 Here's how I think it needs to look:
 [ move all the functionality to the backend ]
 
 Of course, after you've done all that work, you've got something that is
 of exactly zero use to its supposed principal use-case, pg_dump.  pg_dump
 will still have to support server versions that predate all these fancy
 new dump functions, and that pretty much ensures that most of pg_dump's
 core functionality will still be on the client side.  Or, if you try to
 finesse that problem by making sure the new server APIs correspond to
 easily-identified pieces of pg_dump code, you'll probably end up with APIs
 that nobody else wants to use :-(.

Yeah, that's why it's necessary to create a libpqdump that's usable
client-side even if you want server-side dump support.

So it's "allow the functionality to be used from the backend as well",
not just "move all the functionality to the backend".

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe

2014-03-20 Thread Noah Misch
On Sat, Mar 08, 2014 at 11:14:30AM +, Simon Riggs wrote:
 On 7 March 2014 09:04, Simon Riggs si...@2ndquadrant.com wrote:
  The right thing to do here is to not push to the extremes. If we mess
  too much with the ruleutil stuff it will just be buggy. A more
  considered analysis in a later release is required for a full and
  complete approach. As I indicated earlier, an 80/20 solution is better
  for this release.
 
  Slimming down the patch, I've removed changes to lock levels for
  almost all variants. The only lock levels now reduced are those for
  VALIDATE, plus setting of relation and attribute level options.

Good call.

 The following commands (only) are allowed with
 ShareUpdateExclusiveLock, patch includes doc changes.
 
 ALTER TABLE ... VALIDATE CONSTRAINT constraint_name
 covered by isolation test, plus verified manually with pg_dump

I found a pre-existing bug aggravated by this, which I will shortly report on
a new thread.

 ALTER TABLE ... ALTER COLUMN ... SET STATISTICS
 ALTER TABLE ... ALTER COLUMN ... SET (...)
 ALTER TABLE ... ALTER COLUMN ... RESET (...)
 
 ALTER TABLE ... CLUSTER ON ...
 ALTER TABLE ... SET WITHOUT CLUSTER

A comment at check_index_is_clusterable() still mentions exclusive lock.

 ALTER TABLE ... SET (...)
 covered by isolation test
 
 ALTER TABLE ... RESET (...)
 
 ALTER INDEX ... SET (...)
 ALTER INDEX ... RESET (...)

See discussion below.

 All other ALTER commands take AccessExclusiveLock

 --- a/doc/src/sgml/mvcc.sgml
 +++ b/doc/src/sgml/mvcc.sgml
 @@ -865,7 +865,8 @@ ERROR:  could not serialize access due to read/write dependencies among transact
      <para>
       Acquired by <command>VACUUM</command> (without <option>FULL</option>),
       <command>ANALYZE</>, <command>CREATE INDEX CONCURRENTLY</>, and
 -     some forms of <command>ALTER TABLE</command>.
 +     <command>ALTER TABLE VALIDATE</command> and other
 +     <command>ALTER TABLE</command> variants that set options.

ALTER TABLE's documentation covers the details, so the old text sufficed for
here.  I find "variants that set options" too vague, considering the nuances
of the actual list of affected forms.

      </para>
     </listitem>
    </varlistentry>

 @@ -951,7 +952,7 @@ ERROR:  could not serialize access due to read/write dependencies among transact
      </para>

      <para>
 -     Acquired by the <command>ALTER TABLE</>, <command>DROP TABLE</>,
 +     Acquired by the <command>ALTER TABLE</> for rewriting, <command>DROP TABLE</>,
       <command>TRUNCATE</command>, <command>REINDEX</command>,
       <command>CLUSTER</command>, and <command>VACUUM FULL</command>
       commands.

Forms that rewrite the table are just one class of examples.  I would write
"Acquired by DROP TABLE, ..., VACUUM FULL, and some forms of ALTER TABLE."

 --- a/doc/src/sgml/ref/alter_table.sgml
 +++ b/doc/src/sgml/ref/alter_table.sgml

 @@ -420,6 +439,9 @@ ALTER TABLE [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable>
        index specification from the table.  This affects
        future cluster operations that don't specify an index.
       </para>
 +     <para>
 +      Changing cluster options uses a <literal>SHARE UPDATE EXCLUSIVE</literal> lock.
 +     </para>
      </listitem>
     </varlistentry>

 @@ -467,6 +489,10 @@ ALTER TABLE [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable>
        FULL</>, <xref linkend="SQL-CLUSTER"> or one of the forms
        of <command>ALTER TABLE</> that forces a table rewrite.
       </para>
 +     <para>
 +      Changing storage parameters requires only a
 +      <literal>SHARE UPDATE EXCLUSIVE</literal> lock.
 +     </para>

Some places say "requires only", while others say "uses".  Please adopt one of
those consistently.  I somewhat prefer the latter.

 @@ -1075,6 +1105,14 @@ ALTER TABLE distributors ADD CONSTRAINT distfk FOREIGN KEY (address) REFERENCES
    </para>

    <para>
 +   To add a foreign key constraint to a table minimising impact on other work:

Our documentation has used the "minimize" spelling exclusively.

 --- a/src/backend/catalog/toasting.c
 +++ b/src/backend/catalog/toasting.c

 @@ -52,7 +53,7 @@ static bool needs_toast_table(Relation rel);
   * to end with CommandCounterIncrement if it makes any changes.
   */
  void
 -AlterTableCreateToastTable(Oid relOid, Datum reloptions)
 +AlterTableCreateToastTable(Oid relOid, Datum reloptions, LOCKMODE lockmode)
  {
      Relation    rel;

 @@ -63,10 +64,10 @@ AlterTableCreateToastTable(Oid relOid, Datum reloptions)
      * concurrent readers of the pg_class tuple won't have visibility issues,
      * so let's be safe.
      */

The comment ending right here is falsified by the change.

 - rel = heap_open(relOid, AccessExclusiveLock);
 + rel = heap_open(relOid, lockmode);

We now request whatever lock our caller has already taken.  If that is
AccessExclusiveLock, create_toast_table() could actually add a toast table.
Otherwise, it will either deduce that no change is required or raise an error.

 @@ 

[HACKERS] equalTupleDescs() ignores ccvalid/ccnoinherit

2014-03-20 Thread Noah Misch
We added these ConstrCheck fields for 9.2, but equalTupleDescs() did not get
the memo.  I looked for resulting behavior problems, and I found one in
RelationClearRelation() only.  Test case:

set constraint_exclusion = on;
drop table if exists ccvalid_test;
create table ccvalid_test (c int);
alter table ccvalid_test add constraint x check (c > 0) not valid;

begin;
-- constraint_exclusion won't use an invalid constraint.
explain (costs off) select * from ccvalid_test where c = 0;
-- Make it valid.
alter table ccvalid_test validate constraint x;
-- Local invalidation rebuilt the Relation and decided the TupleDesc hadn't
-- changed, so we're still not using the constraint.
explain (costs off) select * from ccvalid_test where c = 0;
commit;

-- At COMMIT, we destroyed the then-closed Relation in response to shared
-- invalidation.  Now constraint_exclusion sees the valid constraint.
explain (costs off) select * from ccvalid_test where c = 0;


Currently, the damage is limited to later commands in the transaction that
issued ALTER TABLE VALIDATE.  Changing ccvalid requires AccessExclusiveLock,
so no other backend will have an affected, open relcache entry to rebuild.
Shared invalidation will make the current backend destroy its affected
relcache entry before starting a new transaction.  However, the impact will
not be so limited once we allow ALTER TABLE VALIDATE to run with a mere
ShareUpdateExclusiveLock.  (I discovered this bug while reviewing the patch
implementing that very feature.)

I don't see a way to get trouble from the ccnoinherit omission.  You can't
change ccnoinherit except by dropping and recreating the constraint, and each
of the drop and create operations would make equalTupleDescs() detect a
change.  The same can be said of ccbin, but equalTupleDescs() does compare
that field.  For simplicity, I'll have it compare ccnoinherit.

CreateTupleDescCopyConstr() also skips ccnoinherit.  I don't see a resulting
live bug, but it's worth correcting.

Given the minor symptoms in released versions, I lean against a back-patch.

Thanks,
nm

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com
diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index db8cb82..74cfb64 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -204,6 +204,7 @@ CreateTupleDescCopyConstr(TupleDesc tupdesc)
 			if (constr->check[i].ccbin)
 				cpy->check[i].ccbin = pstrdup(constr->check[i].ccbin);
 			cpy->check[i].ccvalid = constr->check[i].ccvalid;
+			cpy->check[i].ccnoinherit = constr->check[i].ccnoinherit;
 		}
 	}
 
@@ -458,7 +459,9 @@ equalTupleDescs(TupleDesc tupdesc1, TupleDesc tupdesc2)
 		for (j = 0; j < n; check2++, j++)
 		{
 			if (strcmp(check1->ccname, check2->ccname) == 0 &&
-				strcmp(check1->ccbin, check2->ccbin) == 0)
+				strcmp(check1->ccbin, check2->ccbin) == 0 &&
+				check1->ccvalid == check2->ccvalid &&
+				check1->ccnoinherit == check2->ccnoinherit)
 				break;
 		}
 		if (j >= n)



[HACKERS] Optimized out tags

2014-03-20 Thread Rajashree Mandaogane
What can be done to get rid of the 'optimized out' tags while debugging?


Re: [HACKERS] Optimized out tags

2014-03-20 Thread Atri Sharma
On Fri, Mar 21, 2014 at 9:49 AM, Rajashree Mandaogane 
rajashree@gmail.com wrote:

 What can be done to get rid of the 'optimized out' tags while debugging?


Did you use the appropriate debugging flags when running ./configure?

Regards,

Atri
-- 
Regards,

Atri
*l'apprenant*