Re: [HACKERS] Use of non-restart-safe storage by temp_tablespaces

2017-06-06 Thread Sven R. Kunze

On 06.06.2017 14:31, Bruce Momjian wrote:

On Tue, Jun  6, 2017 at 04:39:50AM -0300, Alvaro Herrera wrote:

I was thinking that you could create the init fork for each unlogged
table in a permanent tablespace (probably the default one for the
database).

FWIW I don't think calling these tablespaces "temporary" is the right
word.  It's not the tablespaces that are temporary.  Maybe "evanescent".

I was thinking "transient".  Amazon uses "ephemeral".



Just dropping a translator result here:

[1] results of https://www.dict.cc/deutsch-englisch/fl%C3%BCchtig.html

What about VOLATILE like computer memory (RAM)?

Sven

[1]
volatile {adj}chem.
briefly {adv}
fleeting {adj}
fugitive {adj} [temporary, fleeing]
brief {adj}
cursory {adj}
transient {adj}
vaguely {adv} [look, interested]
transitory {adj}
perfunctory {adj}
hasty {adj}
ephemeral {adj}
fleetingly {adv}
momentary {adj}
evanescent {adj}
passing {adj}
absconded {adj}
casual {adj} [glance, acquaintance etc.]
desultory {adj} [reading]
tangential {adj}
ethereal {adj}chem.
etherial {adj}chem.
flighty {adj}
sketchy {adj} [not detailed]
quick {adj}
perfunctorily {adv}
sketchily {adv}
fugitively {adv}
volatilely {adv}
hit and run {adj}
ephemerally {adv}
non-permanent {adj}
fugacious {adj} [literary]



Re: [HACKERS] Surjective functional indexes

2017-05-29 Thread Sven R. Kunze

On 29.05.2017 21:25, Sven R. Kunze wrote:

[...] non-surjective functions.

non-injective of course


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Surjective functional indexes

2017-05-29 Thread Sven R. Kunze

On 29.05.2017 19:21, Christoph Berg wrote:

I think the term you were looking for is "projection".
https://en.wikipedia.org/wiki/Projection_(set_theory)


Maybe, I am seeing too much of a connection here but recently Raymond 
Hettinger held a very interesting talk [1] at PyCon 2017.


For those without the time or bandwidth to watch: it describes the 
history of the modern dict in Python in several steps.


1) avoiding having a database scheme with columns and rows and indexes
2) introducing hashing with bucket lists
3...6) several improvements
7) in the end looks like a database table with indexes again ;)

If you have the time, just go ahead and watch the 30 min video. He can 
explain things definitely better than me.



In order to draw the line back on-topic, if I am not completely 
mistaken, his talks basically shows that over time even datastructures 
with different APIs such as dicts (hashes, maps, sets, etc.) internally 
converge towards a relational-database-y design because of performance 
and resources reasons.



Thus let me think that also in the on-topic case, we might best be 
supporting the much narrow use-case of "Projection" (a term also used in 
relation database theory btw. ;-) ) instead of non-surjective functions.



Cheers,
Sven


[1] https://www.youtube.com/watch?v=npw4s1QTmPg


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Adding support for Default partition in partitioning

2017-05-10 Thread Sven R. Kunze

On 10.05.2017 17:59, Robert Haas wrote:

Well, I don't think it would be a HUGE problem, but I think the fact
that Amit chose to implement this with syntax similar to that of
Oracle is probably not a coincidence, but rather a goal, and I think
the readability problem that you're worrying about is really pretty
minor.  I think most people aren't going to subpartition their default
partition, and I think those who do will probably find the syntax
clear enough anyway.


I agree here.


Optional keywords may not be the root of ALL evil, but they're pretty
evil.  See my posting earlier on this same thread on this topic:

http://postgr.es/m/CA+TgmoZGHgd3vKZvyQ1Qx3e0L3n=voxy57mz9ttncvet-al...@mail.gmail.com

The issues here are more or less the same.


Ah, I see. I didn't draw the conclusion from the optionality of a 
keyword the other day but after re-reading your post, it's exactly the 
same issue.

Let's avoid optional keywords!

Sven


Re: [HACKERS] Adding support for Default partition in partitioning

2017-05-10 Thread Sven R. Kunze

On 09.05.2017 09:19, Rahila Syed wrote:
+1 for AS DEFAULT syntax if it helps in improving readability 
specially in following case


CREATE TABLE p1 PARTITION OF test AS DEFAULT PARTITION BY LIST(a);

Thank you,
Rahila Syed

On Tue, May 9, 2017 at 1:13 AM, Robert Haas <mailto:robertmh...@gmail.com>> wrote:


On Thu, May 4, 2017 at 4:40 PM, Sven R. Kunze mailto:srku...@mail.de>> wrote:
> It yields
>
> CREATE TABLE p1 PARTITION OF test DEFAULT PARTITION BY LIST(b);
>
> This reads to me like "DEFAULT PARTITION".
>
> I can imagine a lot of confusion when those queries are
encountered in the
> wild. I know this thread is about creating a default partition
but I were to
> propose a minor change in the following direction, I think
confusion would
> be greatly avoided:
>
> CREATE TABLE p1 PARTITION OF test AS DEFAULT PARTITIONED BY LIST(b);
>
> I know it's a bit longer but I think those 4 characters might serve
> readability in the long term. It was especially confusing to see
PARTITION
> in two positions serving two different functions.

Well, we certainly can't make that change just for default partitions.
I mean, that would be non-orthogonal, right?  You can't say that the
way to subpartition is to write "PARTITION BY strategy" when the table
unpartitioned or is a non-default partition but "PARTITIONED BY
strategy" when it is a default partition.  That would certainly not be
a good way of confusing users less, and would probably result in a
variety of special cases in places like ruleutils.c or pg_dump, plus
some weasel-wording in the documentation.  We COULD do a general
change from "CREATE TABLE table_name PARTITION BY strategy" to "CREATE
TABLE table_name PARTITIONED BY strategy".  I don't have any
particular arguments against that except that the current syntax is
more like Oracle, which might count for something, and maybe the fact
that we're a month after feature freeze.  Still, if we want to change
that, now would be the time; but I favor leaving it alone.



You are definitely right. Changing it here would require to change it 
everywhere AND thus to loose syntax parity with Oracle.


I am not in a position to judge this properly whether this would be a 
huge problem. Personally, I don't have an issue with that. But don't 
count me as most important opion on this.




So I guess I'm still in favor of the CREATE TABLE p1 PARTITION OF test
DEFAULT syntax, but if it ends up being AS DEFAULT instead, I can live
with that.



Is to make it optional an option?

Sven


Re: [HACKERS] WITH clause in CREATE STATISTICS

2017-05-04 Thread Sven R. Kunze

On 04.05.2017 23:13, Tom Lane wrote:

I'm not against what you've done here, because I had no love for USING
in this context anyway; it conveys approximately nothing to the mind
about what is in the list it's introducing.  But I'm concerned whether
we're boxing ourselves in by using ON.

Actually, "ON" doesn't seem all that mnemonic either.  Maybe "FOR"
would be a good substitute, if it turns out that "ON" has a problem?


The whole syntax reminds me of a regular SELECT clause. So, SELECT?


Also considering the most generic form of statistic support mentioned in 
[1], one could even thing about allowing aggregates, windowing functions 
etc, aka the full SELECT clause in the future.



Sven


[1] 
https://www.postgresql.org/message-id/CAEZATCUtGR+U5+QTwjHhe9rLG2nguEysHQ5NaqcK=vbj78v...@mail.gmail.com 




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Missing feature in Phrase Search?

2017-05-04 Thread Sven R. Kunze

Hi everybody,

On 21.04.2017 20:47, Josh Berkus wrote:

Oleg, Teodor, folks:

I was demo'ing phrase search for a meetup yesterday, and the user
feedback I got showed that there's a missing feature with phrase search.
  Let me explain by example:


'fix <-> error' will match 'fixed error', 'fixing error'
but not 'fixed language error' or 'fixed a small error'

'fix <2> error' will match 'fixed language error',
but not 'fixing error' or 'fixed a small error'

'fix <3> error' will match 'fixed a small error',
but not any of the other strings.


This is because the # in <#> is an exact match.

Seems like we could really use a way for users to indicate that they
want a range of word gaps.  Like, in the example above, users could
search on:

'fix <1:3> error'

... which would search for any phrase where "error" followed "fix" by
between 1 and 3 words.

Not wedded to any particular syntax for that, of course.


That could be useful. I would like to add another idea here about 
leaving out one side of the range.


'fix <:3> error'
'fix <2:> error'

To either indicate 1 (left) or unbounded (right).

Sven


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Adding support for Default partition in partitioning

2017-05-04 Thread Sven R. Kunze

Hi Rahila,

still thinking about the syntax (sorry):


On 04.05.2017 13:44, Rahila Syed wrote:

[...] The syntax implemented in this patch is as follows,

CREATE TABLE p11 PARTITION OF p1 DEFAULT;


Rewriting the following:

On Thu, May 4, 2017 at 4:02 PM, amul sul > wrote:


[...] CREATE TABLE p1 PARTITION OF test FOR VALUES IN  (DEFAULT)
PARTITION BY LIST(b); [...]



It yields

CREATE TABLE p1 PARTITION OF test DEFAULT PARTITION BY LIST(b);

This reads to me like "DEFAULT PARTITION".


I can imagine a lot of confusion when those queries are encountered in 
the wild. I know this thread is about creating a default partition but I 
were to propose a minor change in the following direction, I think 
confusion would be greatly avoided:


CREATE TABLE p1 PARTITION OF test*AS *DEFAULT PARTITION*ED* BY LIST(b);

I know it's a bit longer but I think those 4 characters might serve 
readability in the long term. It was especially confusing to see 
PARTITION in two positions serving two different functions.


Sven


Re: [HACKERS] Adding support for Default partition in partitioning

2017-04-30 Thread Sven R. Kunze

On 27.04.2017 22:21, Robert Haas wrote:

On Thu, Apr 27, 2017 at 3:15 PM, Sven R. Kunze  wrote:

Just to make sound a little rounder:

CREATE TABLE ... PARTITION OF ... AS DEFAULT
CREATE TABLE ... PARTITION OF ... AS FALLBACK

or

CREATE TABLE ... PARTITION OF ... AS DEFAULT PARTITION
CREATE TABLE ... PARTITION OF ... AS FALLBACK PARTITION

Could any of these be feasible?

FALLBACK wouldn't be a good choice because it's not an existing parser
keyword.  We could probably insert AS before DEFAULT and/or PARTITION
afterwards, but they sort of seem like noise words.


You are right. I just thought it would make this variant more acceptable 
as people expressed concerns about understandability of the command.



SQL seems to have
been invented by people who didn't have any trouble remembering really
long command strings, but brevity is not without some merit.


For me, it's exactly the thing I like about SQL. It makes for an easy 
learning curve.



Sven


Re: [HACKERS] Adding support for Default partition in partitioning

2017-04-27 Thread Sven R. Kunze

On 27.04.2017 15:07, Robert Haas wrote:

On Thu, Apr 27, 2017 at 8:49 AM, Rahila Syed  wrote:

+1 for CREATE TABLE..PARTITION OF...DEFAULT syntax.
I think substituting DEFAULT for FOR VALUES is appropriate as
both cases are mutually exclusive.


Just to make sound a little rounder:

CREATE TABLE ... PARTITION OF ... AS DEFAULT
CREATE TABLE ... PARTITION OF ... AS FALLBACK

or

CREATE TABLE ... PARTITION OF ... AS DEFAULT PARTITION
CREATE TABLE ... PARTITION OF ... AS FALLBACK PARTITION


Could any of these be feasible?


Sven


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] multivariate statistics (v25)

2017-04-04 Thread Sven R. Kunze

Thanks Tomas and David for hacking on this patch.

On 04.04.2017 20:19, Tomas Vondra wrote:
I'm not sure we still need the min_group_size, when evaluating 
dependencies. It was meant to deal with 'noisy' data, but I think it 
after switching to the 'degree' it might actually be a bad idea.


Consider this:

create table t (a int, b int);
insert into t select 1, 1 from generate_series(1, 1) s(i);
insert into t select i, i from generate_series(2, 2) s(i);
create statistics s with (dependencies) on (a,b) from t;
analyze t;

select stadependencies from pg_statistic_ext ;
  stadependencies

 [{1 => 2 : 0.44}, {2 => 1 : 0.44}]
(1 row)

So the degree of the dependency is just ~0.333 although it's obviously 
a perfect dependency, i.e. a knowledge of 'a' determines 'b'. The 
reason is that we discard 2/3 of rows, because those groups are only a 
single row each, except for the one large group (1/3 of rows).


Just for me to follow the comments better. Is "dependency" roughly the 
same as when statisticians speak about " conditional probability"?


Sven


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] few fts functions for jsonb

2017-04-03 Thread Sven R. Kunze

On 03.04.2017 21:30, Andrew Dunstan wrote:

On 04/03/2017 02:44 PM, Sven R. Kunze wrote:

On 01.04.2017 22:20, Andrew Dunstan wrote:

I added documentation when I committed it for the new functions, in the
FTS section. I'm not sure what we need to add to the JSON section if
anything.

Not sure, if this is related but the formatting of
https://www.postgresql.org/docs/devel/static/functions-textsearch.html
looks a bit strange.

Just 2 questions/notes:
1) in what order are the values of the JSON extracted?

In the order they exist in the underlying document.


Just asking as the order can have implications for fulltext searches. 
So, might be valuable for the docs.



Are these documents equally ordered in this sense?

srkunze=# select '{"a": "abc", "b": "def"}'::jsonb;
  jsonb
--
 {"a": "abc", "b": "def"}
(1 row)

srkunze=# select '{"b": "def", "a": "abc"}'::jsonb;
  jsonb
--
 {"a": "abc", "b": "def"}
(1 row)


Also what about non-ascii keys? Are they ordered by the default locale 
of the PostgreSQL cluster (say de_DE.utf-8)?



2) Regarding the additional line:
to_tsvector([ config regconfig , ] document json(b))tsvector
reduce document text to tsvectorto_tsvector('english', '{"a": "The
Fat Rats"}'::json)'fat':2 'rat':3

Maybe change "reduce document text to tsvector" to "extracting JSON
values  and reduce to tsvector"?




OK, I will do something along those lines.

cheers

andrew





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] few fts functions for jsonb

2017-04-03 Thread Sven R. Kunze

On 01.04.2017 22:20, Andrew Dunstan wrote:

I added documentation when I committed it for the new functions, in the
FTS section. I'm not sure what we need to add to the JSON section if
anything.


Not sure, if this is related but the formatting of 
https://www.postgresql.org/docs/devel/static/functions-textsearch.html 
looks a bit strange.


Just 2 questions/notes:
1) in what order are the values of the JSON extracted?

2) Regarding the additional line:
to_tsvector([ config regconfig , ] document json(b))tsvector reduce 
document text to tsvectorto_tsvector('english', '{"a": "The Fat 
Rats"}'::json)'fat':2 'rat':3


Maybe change "reduce document text to tsvector" to "extracting JSON 
values  and reduce to tsvector"?



Sven


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SQL/JSON in PostgreSQL

2017-03-13 Thread Sven R. Kunze

On 13.03.2017 07:24, Nico Williams wrote:

On Thu, Mar 09, 2017 at 07:12:07PM +0100, Sven R. Kunze wrote:

 From my day-to-day work I can tell, the date(time) type is the only missing
piece of JSON to make it perfect for business applications (besides, maybe,
a "currency" type).

And a binary type.  And a chunked-string type (to avoid having to escape
strings).  And an interval type.  And...


YMMV but I tend to say that those aren't the usual types of a business 
application where I come from.


Answering questions like "how many" (integer), "what" (text) and "when" 
(date) is far more common than "give me that binary blob" at least in 
the domain where I work. Never had the necessity for an interval type; 
usually had a start and end value where the "interval" was derived from 
those values.



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SQL/JSON in PostgreSQL

2017-03-13 Thread Sven R. Kunze

On 10.03.2017 20:28, Josh Berkus wrote:

On 03/09/2017 10:12 AM, Sven R. Kunze wrote:


SaltStack uses YAML for their tools, too. I personally can empathize
with them (as a user of configuration management) about this as writing
JSON would be nightmare with all the quoting, commas, curly braces etc.
But that's my own preference maybe.

Yes, but automated tools can easily convert between JSON and
newline-delimited YAML and back.


Sure. That wasn't point, though.


Sven


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SQL/JSON in PostgreSQL

2017-03-10 Thread Sven R. Kunze

On 10.03.2017 05:07, Petr Jelinek wrote:

The original complain was about JSON_VALUE extracting date but I don't
understand why there is problem with that, the SQL/JSON defines that
behavior. The RETURNING clause there is more or less just shorthand for
casting with some advanced options.


Thanks for clarifying. I mistook it as if JSON_VALUE itself returns a 
date value.


Sven


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] adding an immutable variant of to_date

2017-03-09 Thread Sven R. Kunze

On 08.03.2017 03:37, Robert Haas wrote:

The commit message for 64353640e87b54250f1b8a1d2a708d270dff4bfd has
some interesting perspective on this.


"""
Also, mark to_date() and to_char(interval) as stable; although these appear
not to depend on any GUC variables as of CVS HEAD, that seems a property
unlikely to survive future improvements.  It seems best to mark all the
formatting functions stable and be done with it.
"""

My take away from this commit is the following:

Historically, the immutability of "to_date(text, text)" was an emergent 
feature. Proven to be possibly mutable, the parsing feature had a higher 
priority, so the immutability needed to be removed.


The proposed variant on the other hand should be immutable first before 
everything else. Thus, future improvements cannot violate that. This 
might restrict the possible parsing functionality but that shouldn't be 
a problem in practice.


Regards,
Sven


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] adding an immutable variant of to_date

2017-03-09 Thread Sven R. Kunze

Thanks!

On 08.03.2017 17:37, Andreas Karlsson wrote:
The current code for to_char will on the first call to to_char build 
arrays with the localized names of the week days and the months. I 
suspect that you may need to build something similar but a set of 
arrays per locale.


Not sure what the most C-like way of solving this is. Coming from a 
Python background, I would tend to use a dict. The keys are the locales, 
the values are the arrays.


I would use something which is described here: 
http://stackoverflow.com/questions/3269881/how-to-represent-a-python-like-dictionary-in-c 
. Preferred libc-related, like hcreate and hsearch.



See the DCH_to_char function and its call to cache_locale_time.


Alright. Found the cache here:

 src/backend/utils/adt/pg_locale.c
/* lc_time localization cache */
char   *localized_abbrev_days[7];
char   *localized_full_days[7];
char   *localized_abbrev_months[12];
char   *localized_full_months[12];

/*  */

void
cache_locale_time(void)
{
char   *save_lc_time;
time_t  timenow;
struct tm  *timeinfo;
int i;

#ifdef WIN32
char   *save_lc_ctype;
#endif

/* did we do this already? */
if (CurrentLCTimeValid)
return;

I would replace the invalidation check with a hsearch lookup.



Grepping for the cache variables, I found:

~/src/postgres$ grep -r localized_abbrev_days *
src/backend/utils/adt/pg_locale.c:char *localized_abbrev_days[7];
src/backend/utils/adt/pg_locale.c: 
cache_single_time(&localized_abbrev_days[i], "%a", timeinfo);
src/backend/utils/adt/formatting.c:char *str = 
str_toupper_z(localized_abbrev_days[tm->tm_wday], collid);
src/backend/utils/adt/formatting.c:char *str = 
str_initcap_z(localized_abbrev_days[tm->tm_wday], collid);
src/backend/utils/adt/formatting.c:char *str = 
str_tolower_z(localized_abbrev_days[tm->tm_wday], collid);

src/include/utils/pg_locale.h:extern char *localized_abbrev_days[];

~/src/postgres$ grep -r localized_full_days *
src/backend/utils/adt/pg_locale.c:char *localized_full_days[7];
src/backend/utils/adt/pg_locale.c: 
cache_single_time(&localized_full_days[i], "%A", timeinfo);
src/backend/utils/adt/formatting.c:char *str = 
str_toupper_z(localized_full_days[tm->tm_wday], collid);
src/backend/utils/adt/formatting.c:char *str = 
str_initcap_z(localized_full_days[tm->tm_wday], collid);
src/backend/utils/adt/formatting.c:char *str = 
str_tolower_z(localized_full_days[tm->tm_wday], collid);

src/include/utils/pg_locale.h:extern char *localized_full_days[];

~/src/postgres$ grep -r localized_abbrev_months *
src/backend/utils/adt/pg_locale.c:char *localized_abbrev_months[12];
src/backend/utils/adt/pg_locale.c: 
cache_single_time(&localized_abbrev_months[i], "%b", timeinfo);
src/backend/utils/adt/formatting.c:char *str = 
str_toupper_z(localized_abbrev_months[tm->tm_mon - 1], collid);
src/backend/utils/adt/formatting.c:char *str = 
str_initcap_z(localized_abbrev_months[tm->tm_mon - 1], collid);
src/backend/utils/adt/formatting.c:char *str = 
str_tolower_z(localized_abbrev_months[tm->tm_mon - 1], collid);

src/include/utils/pg_locale.h:extern char *localized_abbrev_months[];

srkunze@mexico:~/src/postgres$ grep -r localized_full_months *
src/backend/utils/adt/pg_locale.c:char *localized_full_months[12];
src/backend/utils/adt/pg_locale.c: 
cache_single_time(&localized_full_months[i], "%B", timeinfo);
src/backend/utils/adt/formatting.c:char *str = 
str_toupper_z(localized_full_months[tm->tm_mon - 1], collid);
src/backend/utils/adt/formatting.c:char *str = 
str_initcap_z(localized_full_months[tm->tm_mon - 1], collid);
src/backend/utils/adt/formatting.c:char *str = 
str_tolower_z(localized_full_months[tm->tm_mon - 1], collid);

src/include/utils/pg_locale.h:extern char *localized_full_months[];


It seems to me, that I would need to make them point to the correct 
array address at the beginning of those functions by doing an hsearch.


That step is basically independent of the actual immutable to_date idea.


What do you think?

Best,
Sven



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SQL/JSON in PostgreSQL

2017-03-09 Thread Sven R. Kunze

On 09.03.2017 19:50, Peter van Hardenberg wrote:
Anecdotally, we just stored dates as strings and used a convention 
(key ends in "_at", I believe) to interpret them. The lack of support 
for dates in JSON is well-known, universally decried... and not a 
problem the PostgreSQL community can fix.


I completely agree here.

Regards,
Sven


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SQL/JSON in PostgreSQL

2017-03-09 Thread Sven R. Kunze

On 09.03.2017 18:58, Robert Haas wrote:

Also, even if the superset thing were true on a theoretical plane, I'm
not sure it would do us much good in practice.  If we start using
YAML-specific constructs, we won't have valid JSON any more.  If we
use only things that are legal in JSON, YAML's irrelevant.


That's true. I just wanted to share my view of the "date guessing" part 
of pgpro's commits.
I don't have a good solution for it either, I can only tell that where I 
work we do have same issues: either we guess by looking at the string 
value or we know that "this particular key" must be a date.
Unsatisfied with either solution, we tend to use YAML for our APIs if 
possible.


Regards,
Sven


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SQL/JSON in PostgreSQL

2017-03-09 Thread Sven R. Kunze

On 08.03.2017 20:52, Magnus Hagander wrote:
On Wed, Mar 8, 2017 at 11:48 AM, Peter van Hardenberg > wrote:


Small point of order: YAML is not strictly a super-set of JSON.

Editorializing slightly, I have not seen much interest in the
world for YAML support though I'd be interested in evidence to the
contrary.


The world of configuration management seems to for some reason run off 
YAML, but that's the only places I've seen it recently (ansible, 
puppet etc).


SaltStack uses YAML for their tools, too. I personally can empathize 
with them (as a user of configuration management) about this as writing 
JSON would be nightmare with all the quoting, commas, curly braces etc. 
But that's my own preference maybe.


(Btw. does "run off" mean like or avoid? At least my dictionaries tend 
to the latter.)


That said if we're introducing something new, it's usually better to 
copy from another format than to invite your own.


From my day-to-day work I can tell, the date(time) type is the only 
missing piece of JSON to make it perfect for business applications 
(besides, maybe, a "currency" type).


Regards,
Sven


Re: [HACKERS] SQL/JSON in PostgreSQL

2017-03-09 Thread Sven R. Kunze

On 08.03.2017 20:48, Peter van Hardenberg wrote:

Small point of order: YAML is not strictly a super-set of JSON.


I haven't read the whole standard, but from what I can see the standard 
considers JSON an official subset of itself: 
http://www.yaml.org/spec/1.2/spec.html


Regards,
Sven


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SQL/JSON in PostgreSQL

2017-03-07 Thread Sven R. Kunze

Hi,

about the datetime issue: as far as I know, JSON does not define a 
serialization format for dates and timestamps.


On the other hand, YAML (as a superset of JSON) already supports a 
language-independent date(time) serialization format 
(http://yaml.org/type/timestamp.html).


I haven't had a glance into the SQL/JSON standard yet and a quick search 
didn't reveal anything. However, reading your test case here 
https://github.com/postgrespro/sqljson/blob/5a8a241/src/test/regress/sql/sql_json.sql#L411 
it seems as if you intend to parse all strings in the form of 
"-MM-DD" as dates. This is problematic in case a string happens to 
look like this but is not intended to be a date.


Just for the sake of completeness: YAML solves this issue by omitting 
the quotation marks around the date string (just as JSON integers have 
no quotations marks around them).


Regards,
Sven


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] adding an immutable variant of to_date

2017-03-07 Thread Sven R. Kunze

On 07.03.2017 03:21, Andreas Karlsson wrote:
1) I do not think we currently allow setting the locale like this 
anywhere, so this will introduce a new concept to PostgreSQL. And you 
will probably need to add support for caching per locale.


Good to know. Could you explain what you mean by "caching per locale"?

2) As far as I can tell from reading the code to_date currently 
ignores the M suffix which indicates that you want localized month/day 
names, so i guess that to_date is currently immutable. Maybe it is 
stable due to the idea that we may want to support the M suffix in the 
future.


I think that's the case.

3) I do not like the to_date function. It is much too forgiving with 
invalid input. For example 2017-02-30 because 2017-03-02.


That's indeed a funny parsing result. Why does to_date perform this kind 
of error-correction?



Also just ignoring the M suffix in the format string seems pretty bad.

Personally I would rather see a new date parsing function which is 
easier to work with or somehow fix to_date without pissing too many 
users off, but I have no idea if this is a view shared with the rest 
of the community.


Neither do I.

Many business applications have to deal with date(times). I came from 
the practical issue of indexing json objects.


Regards,
Sven



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] adding an immutable variant of to_date

2017-03-03 Thread Sven R. Kunze

Hello everybody,

following up on this thread: 
https://www.postgresql.org/message-id/flat/d297e048-ac49-9bed-32e3-9dd4e65d0978%40mail.de
specifically on this mail: 
https://www.postgresql.org/message-id/baef819f-acf0-a64d-c1eb-d2c5da1e5030%40mail.de
I hope this idea fulfills the requirements. So let's see what you have 
to say.


to_date is not an immutable function but the best way to parse dates 
besides casting. Unfortunately, creating an functional index with 
to_date or casting will fail with an error.


Researching the topic in the past and currently, the situation hasn't 
improved. Many SO solutions or blog posts suggest to roll your own UDF 
to wrap up an mutable expression and marking it immutable by doing so. 
This approach has several disadvantages:


1) unawareness of the locale and timezone issues date/timestamp parsing 
involves
2) hiding errors in production when environment variables are 
missing/set differently than on dev machines

3) many re-implementations of the same problem

I consider 1) the biggest issue. Developers need to be aware of these 
kinds of issues to solve them properly. This is especially problematic 
when defining functional indexes where you don't have access to the 
actual date values.


For this purpose, I would like to adhere to the PostgreSQL roadmap and 
"scratch my own itch" by implementing the following solution which I 
think best solves the issue at hand:


** Idea **

An additional, immutable variant of to_date with a third parameter to 
specify the locale by which the date should be parsed. The additional 
documentation line 
(https://www.postgresql.org/docs/9.6/static/functions-formatting.html) 
would look like this:


to_date(text, text, text)dateconvert string to date given a 
specific localeto_date('05 Dez 2000', 'DD Mon ', 'de_DE')



I think this approach has the following advantages:

1) a single, recommended and safe way to parse dates
2) make people aware of the locale/timezone issue but give them a 
standard tool to solve it
3) eventually make all those and related Google entries 
(https://www.google.de/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=postgresql+to_date+immutable&*) 
point to the same and safe solution


** Rejected Ideas **
Geoff suggest to simply mark to_date as stable and being able to parse 
dates by trying all locales. Two arguments work against this ideas: 1) 
it would create prohibitively large lookup tables and 2) some mappings 
are ambiguous: 
https://www.postgresql.org/message-id/aba44f78-2b84-e752-9b6f-3784bd0f981c%40mail.de


Another suggestion from Geoff >>format string with an additional locale 
"{locale=en_US}"<< couldn't fly as well because a function cannot be set 
immutable for some inputs.



What do you think?


Regards,
Sven


Re: [HACKERS] Indirect indexes

2016-10-22 Thread Sven R. Kunze

On 21.10.2016 22:54, Jim Nasby wrote:

On 10/21/16 2:48 PM, Sven R. Kunze wrote:



You don't need that limitation (and vacuum will be simpler) if you add

the PK as another key, akin to:


CREATE INDIRECT INDEX idx ON tab (a, b, c);

turns into

CREATE INDEX idx ON tab (a, b, c, pk);



I know I am late to this point but I wanted to present my mere user's
point of view.

First I liked it, as does not introduce yet another syntax to learn.


I believe you mis-understood what Claudio was saying. He's not
suggesting an index with the PK on the end magically becomes an indirect
index; he was saying that a "simple" way to overcome the 6 byte index
TID limitation would be to store the PK as part of the index key. He
used existing DDL to illustrate that, but that was just for
illustration, not how this would actually be implemented.


Alright. Thanks for clarifying. :)

Cheers,
Sven


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Indirect indexes

2016-10-21 Thread Sven R. Kunze

On 2016-10-18 20:04:32, Claudio Freire wrote:

> You don't need that limitation (and vacuum will be simpler) if you add
the PK as another key, akin to:
>
> CREATE INDIRECT INDEX idx ON tab (a, b, c);
>
> turns into
>
> CREATE INDEX idx ON tab (a, b, c, pk);


I know I am late to this point but I wanted to present my mere user's 
point of view.


First I liked it, as does not introduce yet another syntax to learn. 
However, after following the discussion, I see that indirect indexes 
have their disadvantages/slowdowns as well. If adding "pk" to the end of 
the column list just converts the index to an indirect index, I am 
unable to use a direct index which might be better in certain cases.


So, from a "dumb" user's point of view, I wonder if PostgreSQL can make 
the right decision of direct/indirect reliably (which would be great). 
And if not, what would be the alternatives? Introducing CREATE DIRECT INDEX?


Cheers,
Sven

PS: I mot saying I would be affected by this but IIRC we have (..., pk) 
indexes in production which then would be converted to indirect ones. 
But I cannot tell whether indirect indexes would do good or harm there.



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers