Re: [HACKERS] Press Release Draft - 2016-02-09 Cumulative Update

2017-02-07 Thread Jonathan S. Katz

> On Feb 7, 2017, at 6:40 PM, Alvaro Herrera <alvhe...@2ndquadrant.com> wrote:
> 
> Jonathan S. Katz wrote:
>> 
>>> On Feb 7, 2017, at 4:07 PM, Alvaro Herrera <alvhe...@2ndquadrant.com> wrote:
>>> 
>>> Jonathan S. Katz wrote:
>>> 
>>>> Thanks for the clarification.  I have updated the recipe along with Emre’s 
>>>> comments here:
>>>> 
>>>> [updated text not included in the email]
>>> 
>>> I still don't think the recipe is a very good one because it leaves you
>>> with a window where the affected columns are not indexed at all.
>> 
>> Okay, so I propose two options:
>> 
>>  1.  Does anyone have a recipe they recommend that might be better? OR
> 
> Do the CREATE INDEX CONCURRENTLY first, then DROP INDEX CONCURRENTLY the
> old index, then rename the new index to the old name.

Cool.  Updated:

https://git.postgresql.org/gitweb/?p=press.git;a=blob;f=update_releases/current/20170209updaterelease.txt;h=56551b56eb1f70e734c5762abf5ddf6bd6965fdb;hb=c1b7f9c0cfd0ff106409d14a36c3122a8ee460d0
 
<https://git.postgresql.org/gitweb/?p=press.git;a=blob;f=update_releases/current/20170209updaterelease.txt;h=56551b56eb1f70e734c5762abf5ddf6bd6965fdb;hb=c1b7f9c0cfd0ff106409d14a36c3122a8ee460d0>

I added a note to alert people to disk space usage utilizing this method.

Thanks for the help!

Jonathan.



Re: [HACKERS] Press Release Draft - 2016-02-09 Cumulative Update

2017-02-07 Thread Jonathan S. Katz

> On Feb 7, 2017, at 4:39 PM, Michael Banck <michael.ba...@credativ.de> wrote:
> 
> Hi,
> 
> Am Dienstag, den 07.02.2017, 15:58 -0500 schrieb Jonathan S. Katz:
> 
> 
>> https://git.postgresql.org/gitweb/?p=press.git;a=blob;f=update_releases/current/20170209updaterelease.txt;h=f90d4716f240dbea4cca647b099f79865545b633;hb=d85498c284275bcab4752b91476834de780648b8
> 
> It says "[...]then rows that were updated by transactions running at the
> same time as the CREATE INDEX CONCURRENTLY command could have been index
> incorrectly."
> 
> That sounds off to me, shouldn't it be "indexed incorrectly" or
> something?

Yes, passive voice :(  I’ve made the modification on my local copy and will 
push it up after the resolution on the CREATE INDEX recipe.

Jonathan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Press Release Draft - 2016-02-09 Cumulative Update

2017-02-07 Thread Jonathan S. Katz

> On Feb 7, 2017, at 4:07 PM, Alvaro Herrera <alvhe...@2ndquadrant.com> wrote:
> 
> Jonathan S. Katz wrote:
> 
>> Thanks for the clarification.  I have updated the recipe along with Emre’s 
>> comments here:
>> 
>> [updated text not included in the email]
> 
> I still don't think the recipe is a very good one because it leaves you
> with a window where the affected columns are not indexed at all.

Okay, so I propose two options:

1.  Does anyone have a recipe they recommend that might be better? OR
2.  We just leave out the recipe altogether (which is what I am leaning 
towards at the moment).

Thanks!

Jonathan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Press Release Draft - 2016-02-09 Cumulative Update

2017-02-07 Thread Jonathan S. Katz

> On Feb 7, 2017, at 12:44 PM, Alvaro Herrera <alvhe...@2ndquadrant.com> wrote:
> 
> Jonathan S. Katz wrote:
> 
>> Below is the draft of the press release for the update this Thursday:
>> 
>> https://git.postgresql.org/gitweb/?p=press.git;a=blob;f=update_releases/current/20170209updaterelease.md;h=0cccb8986c08527f65f13d704a78c36bb8de7fef;hb=afc01091dea8a1597e8e21430edc3908c581ce0c
>>  
>> <https://git.postgresql.org/gitweb/?p=press.git;a=blob;f=update_releases/current/20170209updaterelease.md;h=0cccb8986c08527f65f13d704a78c36bb8de7fef;hb=afc01091dea8a1597e8e21430edc3908c581ce0c>
>> 
>> As there are a lot of updates I did my best to consolidate some of the 
>> bullet points and as usual, people are directed to the release notes.  
>> Please let me know if there are any inaccuracies so I can fix them ASAP.
> 
> Please do post the proposed text on list for ease of review.  I wasn't
> looking at the text, so I wouldn't have noticed this if Emre hadn't
> replied:
> 
>  76 If you believe you have been affected by the aforementioned CREATE INDEX 
> CONCURRENTLY bug, you will have to rebuild the index.  An example of 
> rebuilding an index:
>  77 
>  78 BEGIN;
>  79 DROP INDEX bad_index_name;
>  80 CREATE INDEX CONCURRENTLY bad_index_name ON table_name (column_name); 
> /* replace names with your original index definition */
>  81 COMMIT;
> 
> This is not a good recipe, because using CREATE INDEX CONCURRENTLY in
> the same transaction that grabs an exclusive lock on the table for the
> DROP INDEX is pointless -- the access exclusive lock is held until the
> end of the transaction, and CIC does not work inside a transaction
> anyway so it'd raise an ERROR and rollback the DROP INDEX.  So the user
> would probably drop the BEGIN/COMMIT sequence in order for this to work
> in the first place.  (The other option is to use CREATE INDEX not
> concurrent, but that would lock the table for a very long time).

Thanks for the clarification.  I have updated the recipe along with Emre’s 
comments here:

https://git.postgresql.org/gitweb/?p=press.git;a=blob;f=update_releases/current/20170209updaterelease.txt;h=f90d4716f240dbea4cca647b099f79865545b633;hb=d85498c284275bcab4752b91476834de780648b8
 
<https://git.postgresql.org/gitweb/?p=press.git;a=blob;f=update_releases/current/20170209updaterelease.txt;h=f90d4716f240dbea4cca647b099f79865545b633;hb=d85498c284275bcab4752b91476834de780648b8>

Thanks!

Jonathan



Re: [HACKERS] Press Release Draft - 2016-02-09 Cumulative Update

2017-02-07 Thread Jonathan S. Katz

> On Feb 7, 2017, at 11:25 AM, Emre Hasegeli  wrote:
> 
>> As there are a lot of updates I did my best to consolidate some of the
>> bullet points and as usual, people are directed to the release notes.
>> Please let me know if there are any inaccuracies so I can fix them ASAP.
> 
> Just some minor points:
> 
>> * Several fixes for PostgreSQL operating in hot standby mode
> 
> It sounded unnatural to me.  Maybe it is better without "PostgreSQL".
> 
>> * Several vacuum and autovacuum fxies
> 
> Typo
> 
>> * Several Unicode fixes
> 
> It sounded alarming to me.  I see just one related item on the release
> notes.  Maybe we can clarify the problem.
> 
>> * Sync our copy of the timezone library with IANA release tzcode2016j
> 
> This is repeated on the following sentence.
> 
>> BEGIN;
>> DROP INDEX bad_index_name;
>> CREATE INDEX CONCURRENTLY bad_index_name ON table_name (column_name); /* 
>> replace names with your original index definition */
>> COMMIT;
> 
> Maybe you meant CREATE INDEX without CONCURRENTLY?

Thanks for the corrections / suggestions.  I have applied them and will push a 
new version shortly!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Press Release Draft - 2016-02-09 Cumulative Update

2017-02-07 Thread Jonathan S. Katz
Hi!

Below is the draft of the press release for the update this Thursday:

https://git.postgresql.org/gitweb/?p=press.git;a=blob;f=update_releases/current/20170209updaterelease.md;h=0cccb8986c08527f65f13d704a78c36bb8de7fef;hb=afc01091dea8a1597e8e21430edc3908c581ce0c
 


As there are a lot of updates I did my best to consolidate some of the bullet 
points and as usual, people are directed to the release notes.  Please let me 
know if there are any inaccuracies so I can fix them ASAP.

Thanks!

Jonathan




Re: [HACKERS] IS NOT DISTINCT FROM + Indexing

2014-07-22 Thread Jonathan S. Katz
On Jul 22, 2014, at 12:40 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 Jonathan S. Katz jonathan.k...@excoventures.com writes:
 On Jul 21, 2014, at 9:51 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 The short reason why not is that it's not an operator (where operator
 is defined as something with a pg_operator entry), and all our indexing
 infrastructure is built around the notion that indexable clauses are of
 the form indexed_column indexable_operator comparison_value.
 
 What got me thinking this initially problem is that I know IS NULL is 
 indexable and I was unsure of how adding IS NOT DISTINCT FROM would be too 
 different from that - of course, this is from my perspective from primarily 
 operating on the surface.  It sounds like the IS NULL work is in the btree 
 code?
 
 We hacked in IS [NOT] NULL as a potentially indexable construct, but the
 key thing that made that possible without major redesign is that IS [NOT]
 NULL is datatype independent, so there's no need to identify any
 particular underlying operator or opclass.  I'm not sure what we'd do to
 handle IS [NOT] DISTINCT FROM, but that particular approach ain't gonna
 cut it.
 
 Another point is that people are unlikely to be satisfied with planner
 optimization for IS NOT DISTINCT FROM that doesn't support it as a join
 clause (i.e., tab1.col1 IS NOT DISTINCT FROM tab2.col2); which is an issue
 that doesn't arise for IS [NOT] NULL, as it has only one argument.  So
 that brings you into not just indexability but hashing and merging
 support.  I hasten to say that that doesn't necessarily have to happen
 in a version-zero patch; but trying to make IS NOT DISTINCT FROM into
 a first-class construct is a big project.

Well that definitely answers how hard would it be. - before embarking on 
something laborious (as even just indexing is nontrivial), I think it would be 
good to figure out how people are using IS [NOT] DISTINCT FROM and if there is 
interest in having it be indexable, let alone used in a JOIN optimization.  It 
could become a handy tool to simplify the SQL in queries that are returning a 
lot of NULL / NOT NULL data mixed together.

Jonathan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] IS NOT DISTINCT FROM + Indexing

2014-07-21 Thread Jonathan S. Katz
Hi,

I'm curious if there is a reason why IS NOT DISTINCT FROM is not an indexable 
operation in a B-tree index, as it is effectively testing for equality albeit 
with some magic for NULLs?  Here is an example of what I mean, running tests 
on 9.3.4:

-- create a table of integers
CREATE TABLE numbers AS
SELECT x FROM generate_series(1,100) x;

-- create a b-tree index
CREATE INDEX numbers_x_idx ON numbers (x);

-- find x = 500
SELECT * FROM numbers WHERE x = 500;
  x  
-
 500
(1 row)

-- query plan
EXPLAIN SELECT * FROM numbers WHERE x = 500;
QUERY PLAN  
  

--
 Index Only Scan using numbers_x_idx on numbers  (cost=0.42..8.44 
rows=1 width=4)
   Index Cond: (x = 500)
(2 rows)


-- now find x IS NOT DISTINCT FROM 500
SELECT * FROM numbers WHERE x IS NOT DISTINCT FROM 500;
  x  
-
 500
(1 row)

-- but the query plan is...
EXPLAIN SELECT * FROM numbers WHERE x IS NOT DISTINCT FROM 500;
QUERY PLAN 
---
 Seq Scan on numbers  (cost=0.00..16925.00 rows=1 width=4)
   Filter: (NOT (x IS DISTINCT FROM 500))

With NULLs being indexable, I was wondering if there was some reason why IS NOT 
DISTINCT FROM could not use the index?

Thanks,

Jonathan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] IS NOT DISTINCT FROM + Indexing

2014-07-21 Thread Jonathan S. Katz
On Jul 21, 2014, at 9:51 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Jonathan S. Katz jonathan.k...@excoventures.com writes:
 I'm curious if there is a reason why IS NOT DISTINCT FROM is not an
 indexable operation in a B-tree index,
 
 The short reason why not is that it's not an operator (where operator
 is defined as something with a pg_operator entry), and all our indexing
 infrastructure is built around the notion that indexable clauses are of
 the form indexed_column indexable_operator comparison_value.
 
 You could certainly imagine ways to fix that, but nobody's put in the
 probably-nontrivial effort required to do so.  The btree code itself
 would likely be the easiest part to fix, as it sort of thinks nulls
 are real values already.

What got me thinking this initially problem is that I know IS NULL is 
indexable and I was unsure of how adding IS NOT DISTINCT FROM would be too 
different from that - of course, this is from my perspective from primarily 
operating on the surface.  It sounds like the IS NULL work is in the btree code?

Even if it is trivial, it would be tough for me personally to hack on without 
some hand-holding.  I did want to ask about it because it can be useful in 
simplifying some queries when you have do deal with NULLs (though in reality I 
tend to use IS DISTINCT FROM much more, though in things like triggers) and 
would be useful with exclusion constraints (though with those it sounds like it 
would have to be an operator?).

If it is a small project someone is interested, I would be happy to contribute 
by testing.

Jonathan



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] nested hstore patch

2013-12-23 Thread Jonathan S. Katz
On Dec 23, 2013, at 6:28 AM, Robert Haas wrote:

 On Fri, Dec 20, 2013 at 6:16 PM, David E. Wheeler da...@justatheory.com 
 wrote:
 * New operators:
  + `hstore - int`: Get string value at array index (starting at 0)
  + `hstore ^ text`:Get numeric value for key
  + `hstore ^ int`: Get numeric value at array index
  + `hstore ? text`:Get boolean value for key
  + `hstore ? int`: Get boolean value at array index
  + `hstore # text[]`:  Get string value for key path
  + `hstore #^ text[]`: Get numeric value for key path
  + `hstore #? text[]`: Get boolean value for key path
  + `hstore % text`:Get hstore value for key
  + `hstore % int`: Get hstore value at array index
  + `hstore #% text[]`: Get hstore value for key path
  + `hstore ? int`:  Does hstore contain array index
  + `hstore #? text[]`:  Does hstore contain key path
  + `hstore - int`:  Delete index from left operand
  + `hstore #- text[]`:  Delete key path from left operand
 
 Although in some ways there's a certain elegance to this, it also
 sorta looks like punctuation soup.  I can't help wondering whether
 we'd be better off sticking to function names.

The key thing is making it easy for people to easily chain calls to their 
nested hstore objects, and I think these operators accomplish that.

Some of them are fairly intuitive, and I think as a community if we have a) 
good docs, b)  good blog posts on how to use nested hstore, and c) provides 
clear instructions @ PG events on how to use it, it would be okay, though some 
things, i.e. extracting the key by a path, might be better being in a function 
anyway.  However, having it as an operator might encourage more usage, only 
because people tend to think that functions will slow my query down.

My only concern is the consistency with the generally accepted standard of JSON 
and with the upcoming jsonb type.   I'm not sure if the jsonb API has  been 
defined yet, but it would be great to keep consistency between nested hstore 
and jsonb so people don't have to learn two different access systems.  Data 
extraction from JSON is often done by the dot operator in implementations, and 
depending on the language you are in, there are ways to add / test existence / 
remove objects from the JSON blob.

Being able to extract objects from nested hstore / JSON using the dot operator 
would be simple and intuitive and general well-understood, but of course there 
are challenges with doing that in PG and well, proper SQL.

Jonathan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] RETURNING syntax for COPY

2013-05-08 Thread Jonathan S. Katz
On May 8, 2013, at 1:16 PM, Tom Lane wrote:

 Heikki Linnakangas hlinnakan...@vmware.com writes:
 On 08.05.2013 19:44, Tom Lane wrote:
 No there isn't; what you suggest would require FE/BE protocol
 extensions, making it several orders of magnitude more work than the
 other thing.
 
 I'd imagine that the flow would go something like this:
 
 BE   FE
 
 CopyInResponse
  CopyData
  CopyData
  ...
  CopyDone
 RowDescription
 DataRow
 DataRow
 CommandComplete
 
 That would require the backend to buffer the entire query response,
 which isn't a great idea.  I would expect that such an operation would
 need to interleave CopyData to the backend with DataRow responses.  Such
 a thing could possibly be built on COPY_BOTH mode, but it would be a lot
 of work (at both ends) for extremely debatable value.
 
 The general idea of COPY is to load data as fast as possible, so weighing
 it down with processing options seems like a pretty dubious idea even if
 the implementation were easy.

There are cases that I indeed want to load data very quickly, but I want to 
perform an operation on it immediately after, e.g. removing bad data that was 
immediately added from that copy.  For instance, I do have this scenario:

WITH new_data AS (
COPY FROM ...
RETURNING id, field_to_check
)
DELETE FROM table
USING new_data
WHERE
table.id = new_data.id AND
new_data.field_to_check ~* 'bad data';

Now I can take care of that all in one step, and I know I'm only removing 
fields I just added.  This comes up when I am importing external files from 
other sources where I may not necessarily want all of the rows or some of the 
rows contain bad data.

This also presumes that COPY works in a CTE, which I'm not sure it does (and I 
will do the TIAS test after I hit send on this message).

Jonathan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers