Re: JSON query

2021-10-18 Thread Scott Ribe
> On Oct 18, 2021, at 10:02 PM, David G. Johnston  
> wrote:
> 
> (jsonb - text[]) = ‘{}’::jsonb …?

Aha, thank you!





JSON query

2021-10-18 Thread Scott Ribe
What's a good way to query jsonb column for

"no keys other than those in this list of keys"

in other words

"containing only keys from this list of keys"

--
Scott Ribe
scott_r...@elevated-dev.com
https://www.linkedin.com/in/scottribe/







Re: export to parquet

2020-08-26 Thread Scott Ribe
> On Aug 26, 2020, at 1:11 PM, Chris Travers  wrote:
> 
> For simple exporting, the simplest thing is a single-node instance of Spark.

Thanks.

> You can read parquet files in Postgres using 
> https://github.com/adjust/parquet_fdw if you so desire but it does not 
> support writing as parquet files are basically immutable.

Yep, that's the next step. Well, really it is what I am interested in testing, 
but first I need my data in parquet format (and confirmation that it gets 
decently compressed).



export to parquet

2020-08-26 Thread Scott Ribe
I have no Hadoop, no HDFS. Just looking for the easiest way to export some PG 
tables into Parquet format for testing--need to determine what kind of space 
reduction we can get before deciding whether to look into it more.

Any suggestions on particular tools? (PG 12, Linux)


--
Scott Ribe
scott_r...@elevated-dev.com
https://www.linkedin.com/in/scottribe/







Re: privileges oddity

2020-08-07 Thread Scott Ribe
So, one last follow-up, perhaps \du or \du+ should show when a role is mapped 
that way. If I'd seen a clue to this setting that had been made "before I got 
here" it would have been figured out sooner.

I realize ALTER ROLE... SET... can be used to set many more defaults, and there 
could be some debate about how much to display with \du[+], but the fact that a 
role abandons all its privs and adopts a different set seems like pretty 
important info to surface ;-)



Re: privileges oddity

2020-08-07 Thread Scott Ribe
> On Aug 7, 2020, at 1:32 PM, Tom Lane  wrote:
> 
> Yes, you are.  It looks like what you actually issued is
> 
> ALTER USER akanzler SET role confidential_read_only;
> 
> but that would have the effect that subsequent session starts would
> automatically do "SET ROLE confidential_read_only".

AHA! This is the correct answer, and it is solved now.

I *know* I executed grant role properly--it's right there in the script.

However, I think that SET ROLE had been accidentally misused instead of GRANT 
 at some point in the past, *AND* that the role contained privs to the 
user-specific schemas when it should not have. So revoking all privs from the 
role and adding back the proper ones resulted in inadvertently removing privs 
from users who'd properly had them explicitly granted.

(I've never even used SET ROLE and was unaware you could even do that!)

Anyway, thanks a million for being patient and sticking with this.



Re: privileges oddity

2020-08-07 Thread Scott Ribe
> On Aug 7, 2020, at 1:08 PM, Adrian Klaver  wrote:
> 
> "Using this command, it is possible to either add privileges or restrict 
> one's privileges. If the session user role has the INHERIT attribute, then it 
> automatically has all the privileges of every role that it could SET ROLE to; 
> in this case SET ROLE effectively drops all the privileges assigned directly 
> to the session user and to the other roles it is a member of, leaving only 
> the privileges available to the named role. On the other hand, if the session 
> user role has the NOINHERIT attribute, SET ROLE drops the privileges assigned 
> directly to the session user and instead acquires the privileges available to 
> the named role.
> "

So it would only have removed privs if I had used set role in the session, 
which I am not.






Re: privileges oddity

2020-08-07 Thread Scott Ribe
> On Aug 7, 2020, at 12:45 PM, Tom Lane  wrote:
> 
> If I'm reading this correctly, you have set things up so that any
> session logging in as akanzler will immediately do "SET ROLE
> confidential_read_only", after which it's the privileges of that
> role not akanzler that determine what happens.

YES, confidential_read_only has privs on everything *except* individual user's 
schemas, and rolinherit was accidentally set, that would certainly seem to be 
the problem. But I turned that off, and it still doesn't work--even in a new 
connection.





Re: privileges oddity

2020-08-07 Thread Scott Ribe
On Aug 7, 2020, at 12:27 PM, Adrian Klaver  wrote:
> 
> So what privileges does role 'confidential_read_only' have?

read on everything

I tried creating a new user without it, just doing the same grants otherwise as 
for akanzler, that worked. Then I added that user to confidential_read_only, 
still worked.





Re: privileges oddity

2020-08-07 Thread Scott Ribe
> On Aug 7, 2020, at 12:17 PM, Adrian Klaver  wrote:
> 
> Well if this for the same line as before it represents table privileges. The 
> problem is with schema access. Continuing grasping at straws:
> 
> select * from pg_roles where rolname = 'aakanzler';

rolname  | rolsuper | rolinherit | rolcreaterole | rolcreatedb | rolcanlogin | 
rolreplication | rolconnlimit | rolpassword | rolvaliduntil | rolbypassrls |
  rolconfig   |  oid
--+--++---+-+-++--+-+---+--+---+---
akanzler | f| t  | f | f   | t   | 
f  |   -1 | |   | f| 
{role=confidential_read_only} | 16391




Re: privileges oddity

2020-08-07 Thread Scott Ribe
> On Aug 7, 2020, at 11:31 AM, Scott Ribe  wrote:
> 
> Wondering if there's a code path somewhere that lets the default take 
> precedence???

So, I changed the defaults, now I see akanzler=arwdDxt/srv_risk, problem 
persists






Re: privileges oddity

2020-08-07 Thread Scott Ribe
> 
> What happens if you do?:
> 
> select has_schema_privilege('akanzler', 'zoewang', 'usage');

risk_oltp_prod=# select has_schema_privilege('akanzler', 'zoewang', 'usage');
has_schema_privilege
--
t
(1 row)

> In psql what does
> 
> \ddp
> 
> show?

risk_oltp_prod=# \ddp
Default access privileges
Owner  |  Schema  | Type  |   Access privileges
+--+---+
...
srv_risk   | zoewang  | table | akanzler=r/srv_risk   +
   |  |   | srv_risk=arwdD/srv_risk
...
(40 rows)


Wondering if there's a code path somewhere that lets the default take 
precedence???



Re: privileges oddity

2020-08-07 Thread Scott Ribe
Further update:

create a new user, grant all on schema & the table, works

reboot of server did not change anything, so the problem is in persistent state





Re: privileges oddity

2020-08-06 Thread Scott Ribe
FYI, REVOKE ALL ON SCHEMA... followed by GRANT ALL ON SCHEMA... did not change 
anything






Re: privileges oddity

2020-08-06 Thread Scott Ribe
> On Aug 6, 2020, at 12:53 PM, Stephen Frost  wrote:
> 
> Are you 110% sure that you're actually connecting to the same instance
> in both cases (I'd say database too, but hopefully psql isn't lying to
> you about that on your prompt, but maybe double-check anyway...).

yes--double checked

> Have you re-tried from the app (maybe someone fixed it in the
> meantime)?

still seeing it from psql

\




Re: privileges oddity

2020-08-06 Thread Scott Ribe
> On Aug 6, 2020, at 12:38 PM, Tom Lane  wrote:
> 
> Hmph.  Any chance of getting a stack trace from the point of the error?

possibly

> Also, which PG version is this?

12.3

It is probably relevant that we cleaned up roles & privs yesterday, lots of 
REVOKE & GRANT, and some DROP ROLE. I started out thinking I'd made a mistake 
with that, but now I'm starting to wonder if there's a bug I hit in some of 
that which fubar'd something in system catalog...

Maybe also relevant that the original creator & owner of the schema (zoewang) 
was dropped after schema owner was changed srv_risk. (Likewise, changed 
ownership of tables in the schema...)





Re: privileges oddity

2020-08-06 Thread Scott Ribe
> On Aug 6, 2020, at 12:36 PM, Adrian Klaver  wrote:
> 
> No triggers or FOREIGN KEYS?

No. No keys or indexes either--that was the entire table def.



Re: privileges oddity

2020-08-06 Thread Scott Ribe
On Aug 6, 2020, at 12:22 PM, Tom Lane  wrote:
> 
> Gonna need more context.  The session-level user seems to have the
> right privileges, but maybe something is happening inside a
> security-definer function that doesn't have privileges?

The only security definer function in the db is a simple pg_shadow lookup used 
by pgbouncer.

Hmm, I should check both direct to PG and through PG bouncer--even though he is 
getting connected as the correct user, per PG's error in the log.





Re: privileges oddity

2020-08-06 Thread Scott Ribe
> On Aug 6, 2020, at 12:22 PM, Adrian Klaver  wrote:
> 
> Schema for the table?


Nothing relevant:

  Column| Type  | Collation | Nullable | Default
-+---+---+--+-
curve_name  | character varying(30) |   |  |
curve_type  | character varying(15) |   |  |
tenor_name  | character varying(10) |   |  |
tenor_date  | date  |   |  |
value_date  | date  |   |  |
curve_value | numeric   |   |  |






privileges oddity

2020-08-06 Thread Scott Ribe
when user akanzler tries to run query "SELECT * FROM zoewang.sometable...", it 
triggers this error:

2020-08-06 17:27:27.664 UTC [15914]: [3] 
user=akanzler,db=risk_oltp_prod,app=[unknown],client=10.8.170.24: ERROR:  
permission denied for schema zoewang at character 15

--- YET ---

risk_oltp_prod=# \dn+ zoewang
 List of schemas
 Name   |  Owner   |   Access privileges| Description
-+--++-
zoewang | srv_risk | srv_risk=UC/srv_risk  +|
|  | akanzler=UC/srv_risk  +|
|  | srv_risk_ro=U/srv_risk |
(1 row)

HUH? (And the user also has all privs on all the tables in the schema...)

--
Scott Ribe
scott_r...@elevated-dev.com
https://www.linkedin.com/in/scottribe/







Re: is JIT available

2020-07-28 Thread Scott Ribe
> On Jul 27, 2020, at 9:33 PM, Tom Lane  wrote:
> 
> The general, non-hacker meaning of "jit is enabled" would seem to
> be pretty much what this function is already doing; and for that
> matter, the same can be said for "JIT compilation is available".
> We need something that's less tautological-looking.  Maybe along
> the lines of

My problem was that it says "is enabled", then calls out just one of the 
conditions for it to be available, but not the other one. Either calling out no 
conditions, or all of them, would be more clear.



Re: is JIT available

2020-07-28 Thread Scott Ribe
> On Jul 27, 2020, at 6:04 PM, David Rowley  wrote:
> 
> "returns true if jit is enabled and JIT compilation is available in
> this session (see Chapter 31)."

That is clearer. I didn't submit a suggestion myself because I'm not clear on 
the actual circumstances. I know it won't be available if:

- jit is not on in config
- PG was not compiled with JIT support

But does compilation with JIT enable and LLVM dev tools mean that all the LLVM 
compilation/optimization is built into the PG binaries, or does it require LLVM 
presence on the machine where deployed? And if so, does the function take that 
into account as well?

I would guess the function is telling the truth under all circumstances, but I 
don't know for sure.

Perhaps: "returns true if JIT (see Chapter 31) is available in this session. 
Availability of JIT requires that PG was compiled with JIT support, JIT is 
enabled in config, .



Re: bad JIT decision

2020-07-27 Thread Scott Ribe
> On Jul 27, 2020, at 4:00 PM, Alvaro Herrera  wrote:
> 
> I don't quite understand why is it that a table with 1000 partitions
> means that JIT compiles the thing 1000 times.  Sure, it is possible that
> some partitions have a different column layout, but it seems an easy bet
> that most cases are going to have identical column layout, and so tuple
> deforming can be shared.  (I'm less sure about sharing a compile of an
> expression, since the varno would vary. But presumably there's a way to
> take the varno as an input value for the compiled expr too?)  Now I
> don't actually know how this works so please correct if I misunderstand
> it.

I'm guessing it's because of inlining. You could optimize a function that takes 
parameters, no problem. But what's happening is inlining, with parameters, then 
optimizing.





Re: is JIT available

2020-07-27 Thread Scott Ribe
> On Jul 25, 2020, at 8:02 AM, Christoph Moench-Tegeder  
> wrote:
> 
> ## Scott Ribe (scott_r...@elevated-dev.com):
> 
>> So JIT is enabled in your conf, how can you tell from within a client
>> session whether it's actually available (PG compiled with it and
>> compiler available)?
> 
> pg_jit_available()  boolean  is JIT compilation available in this session
> 
> https://www.postgresql.org/docs/12/functions-info.html


Thanks, that seems to be exactly what I was looking for.

Even though the documentation is not clear, it does return false when jit = on 
but PG was not compiled with JIT.





Re: is JIT available

2020-07-25 Thread Scott Ribe
> On Jul 25, 2020, at 6:21 AM, Pavel Stehule  wrote:
> 
> It shows if Postgres was compiled with JIT support.
> 
> When you run EXPLAIN ANALYZE SELECT ... then you can see info about JIT 
> overhead. If you don't see notices about JIT in EXPLAIN, then JIT was not 
> used.

The presence of "jit = on" in the config file does not indicate whether the 
running PG was actually compiled with JIT. And I'm not sure whether beyond 
that, compiling with JIT requires presence of anything outside the PG install.



Re: is JIT available

2020-07-25 Thread Scott Ribe
> On Jul 24, 2020, at 9:55 PM, Pavel Stehule  wrote:
> 
> SELECT * FROM pg_config;

That doesn't tell me whether or not it can actually be used.





is JIT available

2020-07-24 Thread Scott Ribe
So JIT is enabled in your conf, how can you tell from within a client session 
whether it's actually available (PG compiled with it and compiler available)?

(In the other discussion I started, doing a dump and import of just the tables 
involved, onto a system where JIT was inadvertently not working for some 
reason, lead me down a dead end for a bit.)

--
Scott Ribe
scott_r...@elevated-dev.com
https://www.linkedin.com/in/scottribe/







Re: bad JIT decision

2020-07-24 Thread Scott Ribe
> On Jul 24, 2020, at 4:37 PM, Tom Lane  wrote:
> 
> Yeah.  I'm fairly convinced that the v12 defaults are far too low,
> because we are constantly seeing complaints of this sort.


They are certainly too low for our case; not sure if for folks who are not 
partitioning if they're way too low.

The passive-aggressive approach would really not be good general advice for 
you, but I'm actually glad that in our case they were low enough to get our 
attention early ;-)

I think I will disable optimization, because with our partitioning scheme we 
will commonly see blow ups of optimization time like this one.

The inlining time in this case is still much more than the query, but it is low 
enough to not be noticed by users, and I think that with different variations 
of the parameters coming in to the query, that for the slower versions (more 
partitions requiring actual scans), inlining will help. Slowing down the 
fastest while speeding up the slower is a trade off we can take.



Re: bad JIT decision

2020-07-24 Thread Scott Ribe
> On Jul 24, 2020, at 4:26 PM, David Rowley  wrote:
> 
> It does not really take into account the cost of jitting.

That is what I was missing.

I read about JIT when 12 was pre-release; in re-reading after my post I see 
that it does not attempt to estimate JIT cost. And in thinking about it, I 
realize that would be next to impossible to anticipate how expensive LLVM 
optimizstion was going to be.

In the case where a set of functions is replicated across partitions, it would 
be possible to do them once, then project the cost of the copies. Perhaps for 
PG 14 as better support for the combination of JIT optimization and 
highly-partitioned data ;-)






bad JIT decision

2020-07-24 Thread Scott Ribe
I have come across a case where PG 12 with default JIT settings makes a 
dramatically bad decision. PG11 without JIT, executes the query in <1ms, PG12 
with JIT takes 7s--and explain analyze attributes all that time to JIT. (The 
plan is the same on both 11 & 12, it's just the JIT.) 

It is a complex query, with joins to subqueries etc; there is a decent amount 
of data (~50M rows), and around 80 partitions (by date) on the main table. The 
particular query that I'm testing is intended as a sort of base case, in that 
it queries on a small set (4) of unique ids which will not match any rows, thus 
the complex bits never get executed, and this is reflected in the plan, where 
the innermost section is:

->  Index Scan using equities_rds_id on equities e0  (cost=0.42..33.74 rows=1 
width=37) (actual time=6751.892..6751.892 rows=0 loops=1)
   Index Cond: (rds_id = ANY ('{..., ..., ..., ...}'::uuid[]))
   Filter: (security_type = 'ETP'::text)
   Rows Removed by Filter: 4

And that is ultimately followed by a couple of sets of 80'ish scans of 
partitions, which show never executed, pretty much as expected since there are 
no rows left to check. The final bit is:

JIT:
  Functions: 683
  Options: Inlining true, Optimization true, Expressions true, Deforming true
  Timing: Generation 86.439 ms, Inlining 21.994 ms, Optimization 3900.318 ms, 
Emission 2561.409 ms, Total 6570.161 ms

Now I think the query is not so complex that there could possibly be 683 
distinct functions. I think this count must be the result of a smaller number 
of functions created per-partition. I can understand how that would make sense, 
and some testing in which I added conditions that would restrict the matches to 
a single partition seem to bear it out (JIT reports 79 functions in that case).

Given the magnitude of the miss in using JIT here, I am wondering: is it 
possible that the planner does not properly take into account the cost of 
JIT'ing a function for multiple partitions? Or is it that the planner doesn't 
have enough info about the restrictiveness of conditions, and is therefore 
anticipating running the functions against a great many rows?

--
Scott Ribe
scott_r...@elevated-dev.com
https://www.linkedin.com/in/scottribe/







Re: query, probably needs window functions

2020-05-23 Thread Scott Ribe
> On May 22, 2020, at 1:37 PM, Michael Lewis  wrote:
> 
> I believe something like this is what you want. You might be able to do it 
> without a sub-query by comparing the current name value to the lag value and 
> null it out if it's the same.
> ...

Thanks, that's what I needed! (And better than using lag in my case because 
there's 4 tables in the joins and many more columns involved. The repetition of 
"case when lag(...)..." would be really noisy, but it's good to know of that 
possibility anyway.)

One correction, just for posterity if someone else searches this question, the 
answer was missing "over", should have been:

select
case when row_number = 1 then id end AS id, 
case when row_number = 1 then name end as name,
phone.number
from(
  select person.id, person.name, phone.number, 
row_number() over partition by( phone.person_id order by phone.number ) as 
row_number
  from person
  join phone on person.id = phone.person_id
) AS sub
order by name, row_number;



query, probably needs window functions

2020-05-22 Thread Scott Ribe
given, let's say: 

create table person (id int not null, name varchar);
create table phone (id int not null, person_id int not null, number varchar);

select person.*, phone.number from person join phone on (person.id = 
phone.person_id) order by...

How would you get results where only the first row for a person was filled in, 
with rest of that person's phones showing blanks for those columns? I'm 
guessing that window functions provide this capability, but I don't know how.

--
Scott Ribe
scott_r...@elevated-dev.com
https://www.linkedin.com/in/scottribe/







Re: parameter limit

2020-04-23 Thread Scott Ribe
> On Apr 23, 2020, at 8:48 AM, Adrian Klaver  wrote:
> 
> No.

Ah, thanks a lot for that info.

It's not that I really normally want to have bazillions of params.

But we have some moderately large inserts, that are showing a sudden non-linear 
dropoff when scaling up, and a suspicion that the performance dropoff is NOT 
actually in PG. So, lots of spelunking...



parameter limit

2020-04-23 Thread Scott Ribe
In libpq, PQexecParams has nParams as type int. So on any reasonable platform, 
that's at least 4 bytes. My question then is: when I see documented limits of 
65535 params in various drivers and libraries, that is NOT a restriction of 
libpq nor of the protocol, but rather an arbitrary limit of the driver/library, 
correct?

--
Scott Ribe
scott_r...@elevated-dev.com
https://www.linkedin.com/in/scottribe/







logical replication protocol

2019-12-24 Thread Scott Ribe
I haven't been able to find documentation on the actual messages used in the 
logical replication protocol ('k' & 'w', lower case). I've figured things out 
mostly by reading pg_recvlogical.c, but "Read The Fine Source" doesn't seem in 
line with the way PG usually does it ;-)

Did I miss a doc somewhere in my searches???