Re: Increased storage size of jsonb in pg15

2023-12-29 Thread Junwang Zhao
On Fri, Dec 29, 2023 at 4:47 AM Adrian Klaver  wrote:
>
> On 12/28/23 12:41, Sean Flaherty wrote:
> > I'm rather new to the mailing list, are there any additional steps I
> > should take (i.e. posting to pgsql-hackers, etc.)?
> >
>
> For what purpose? You are seeing differences in compression strategies
> between lz4 and pglz. The 'fix' would be to go back to pglz.

Agreed, lz4 is known for its high compression speed, but lower
compression ratio, this is the trade off one should bear in mind.

>
> >
> > On Thu, Dec 28, 2023 at 11:23 AM Adrian Klaver
> > mailto:adrian.kla...@aklaver.com>> wrote:
> >
> > On 12/28/23 09:13, Sean Flaherty wrote:
> >  > Follow-up:
> >  > Working with AWS, we found that starting in RDS Postgres 15, the
> >  > default_toast_compression parameter is set to use lz4 compression
> >  > instead of pglz.  This resulted in the increased json storage
> > size we
> >  > were seeing.
> >  >
> >  > I have been able to reproduce the increased storage size on RDS
> > Postgres
> >  > and using my local docker instance of postgres 15.5 by changing the
> >  > local default_toast_compression value in postgresql.conf.
> >  >
> >  > I have attached the test script we use to create a table, insert
> > some
> >  > test records and a query to test the JSON data size on disk.
> >
> > I can confirm I see the same results using Postgres 16 installed from
> > the PGDG repo on Ubuntu 22.04. That the lz4 data size is greater then
> > the pglz data size.
> >
> >  >
> >  >
> >  > Kind regards,
> >  > Sean
> >  >
> >
> >
> > --
> > Adrian Klaver
> > adrian.kla...@aklaver.com <mailto:adrian.kla...@aklaver.com>
> >
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>
>


-- 
Regards
Junwang Zhao




Re: How to generate random bigint

2023-12-22 Thread Junwang Zhao
On Sat, Dec 23, 2023 at 8:36 AM Phillip Diffley  wrote:
>
> Thank you for all the suggestions! I ended up using pgcrypto's
> pg_random_bytes() to build the random int. I haven't fully tested the
> function yet, but it looks like this works.
>
> CREATE EXTENSION IF NOT EXISTS pgcrypto;
> CREATE OR REPLACE FUNCTION gen_random_int() RETURNS INT8 AS $$
> DECLARE
> bytes bytea;
> BEGIN
> bytes := gen_random_bytes(8);
> RETURN
> (get_byte(bytes,0)::int8 << 8*0) |
> (get_byte(bytes,1)::int8 << 8*1) |
> (get_byte(bytes,2)::int8 << 8*2) |
> (get_byte(bytes,3)::int8 << 8*3) |
> (get_byte(bytes,4)::int8 << 8*4) |
> (get_byte(bytes,5)::int8 << 8*5) |
> (get_byte(bytes,6)::int8 << 8*6) |
> (get_byte(bytes,7)::int8 << 8*7);
> END;
> $$ LANGUAGE plpgsql;
>

postgres=# explain analyze select gen_random_int() from
generate_series(1, 100);
Time: 4794.352 ms (00:04.794)
postgres=# explain analyze select ('x'||encode(gen_random_bytes(8),
'hex'))::bit(64)::int8 from generate_series(1, 100);
Time: 2816.014 ms (00:02.816)
postgres=# explain analyze select
('x'||encode(pg_read_binary_file('/dev/urandom', 0, 8),
'hex'))::bit(64)::bigint from generate_series(1, 100);
Time: 18947.639 ms (00:18.948)
postgres=# explain analyze select (random() * 2147483648)::int8 *
4294967296 + (random() * 4294967296)::int8 from generate_series(1,
100);
Time: 728.368 ms

Peter's way has the best performance.

>
> On Thu, Dec 21, 2023 at 6:14 AM Peter J. Holzer  wrote:
> >
> > On 2023-12-21 00:06:39 -0600, Phillip Diffley wrote:
> > > Postgres's random() function generates a random double. That can be 
> > > converted
> > > to a random int for smaller integers, but a double can't represent all of 
> > > the
> > > values in a bigint. Is there a recommended way to generate a random 
> > > bigint in
> > > Postgres?
> >
> > Call random() twice and add the results?
> >
> > Like this:
> >
> > select (random() * 2147483648)::int8 * 4294967296
> >+ (random() * 4294967296)::int8;
> >
> > (This assumes that random() actually returns at least 32 random bits.
> > If that's not the case you'll need more calls to random())
> >
> > hp
> >
> > --
> >_  | Peter J. Holzer| Story must make more sense than reality.
> > |_|_) ||
> > | |   | h...@hjp.at |-- Charles Stross, "Creative writing
> > __/   | http://www.hjp.at/ |   challenge!"
>
>


-- 
Regards
Junwang Zhao




Re: jsonb on-disk size calculation

2023-07-31 Thread Junwang Zhao
convertJsonbObject convert JsonValue to Jsonb.

Jsonb->vl_len_ is 4 byte
JsonbContainer->header is 4 bytes
JsonbContainer->children is an array of two elements, that's 8 bytes
following the k/v part, 20 bytes

In total 36 bytes.

When Jsonb is stored to disk, I guess some conversion reduces the
vl_len_ to 1 byte, hence total 33 bytes,
but I cannot find the conversion logic.

On Mon, Jul 31, 2023 at 7:10 PM jian he  wrote:
>
>
> hi.
>
> drop table x;
> create table x(js jsonb);
> insert into x select '{"Hello world":1}'::jsonb;
> select pg_column_size(js) from x; -- return 33.
>
> based on src/include/utils/jsonb.h
> The key and value part is 20 bytes (is it correct?), Jsonb->vl_len_  is 4 
> byte, JsonbContainer->header is 4 bytes. That's 28 bytes.
>
> but now on-disk is 33 bytes.
> so I am not sure where the remaining bytes are.
>
>
>
>
>


-- 
Regards
Junwang Zhao




Re: timetz need more bytes than timestamptz

2023-06-18 Thread Junwang Zhao
timetz stores only the time units within a day, it uses an extra field
to store the zone info.

typedef int64 TimeADT;
typedef struct
{
TimeADT time; /* all time units other than months and years */
int32 zone; /* numeric time zone, in seconds */
} TimeTzADT;

timestamp is a count that starts on January 1st, 1970 at UTC.

You can convert a timestamp to any timezone, but not timetz.

On Mon, Jun 19, 2023 at 11:53 AM jian he  wrote:
>
>
> Hi,
> https://www.postgresql.org/docs/current/datatype-datetime.html
> timetz, timestamptz:
> same resolution.
> fractional digits in the seconds field are also the same.
> >
> > All timezone-aware dates and times are stored internally in UTC. They are 
> > converted to local time in the zone specified by the TimeZone configuration 
> > parameter before being displayed to the client.
>
> Why does timetz need more bytes (8 vs 12) than timestamptz?



-- 
Regards
Junwang Zhao




Re: ECCN for PostgreSQL

2022-09-22 Thread Junwang Zhao
There is no ECCN restriction, there was a thread talking about this, see [0]

[0]: https://www.postgresql.org/message-id/17562.1329923596%40sss.pgh.pa.us

On Thu, Sep 22, 2022 at 3:45 PM YangYuping(杨瑜萍)
 wrote:
>
> Hello, this is Panasonic software development department.
>
> Is there any ECCN restriction for PostgreSQL.
> Thank you.
>
>
>
> **
>
> 楊 瑜萍(Yang YuPing) 
>
> 蘇州松下生産科技有限公司(PFSS)   R&Dセンター ソフト開発部
>
> 中国江蘇省蘇州工業園区唯亭鎮臨埠街1号
>
> Panasonic Factory Solutions Suzhou Co.,Ltd (PFSS)  R&D Center
>
> TEL:+86-(0)512-80991188-393  FAX:+86-(0)512-62957913
>
> **********
>
>



-- 
Regards
Junwang Zhao




Re: [EXT] Re: log_min_messages = warning

2022-09-11 Thread Junwang Zhao
 part.test2()
>  LANGUAGE plpgsql
>  AS $$
>  DECLARE
>
> BEGIN
>
> raise exception '* raise ERROR test *';
>
> END $$;
>
> When I execute that through PSQL this is returned:
>
> shgroup02s=> call part.test2();
> ERROR:  * raise ERROR test *
>
> And in the log file I see this written:
>
> 2022-09-07 12:58:00 
> UTC:10.210.119.217(42434):postgres@shgroup02s:[15409]:ERROR: * raise 
> ERROR test *
> 2022-09-07 12:58:00 
> UTC:10.210.119.217(42434):postgres@shgroup02s:[15409]:CONTEXT: PL/pgSQL 
> function part.test2() line 6 at RAISE
> 2022-09-07 12:58:00 
> UTC:10.210.119.217(42434):postgres@shgroup02s:[15409]:STATEMENT: call 
> part.test2();
>
> So the raise exception gets written to the log file when I execute it through 
> PSQL which I believe is expected.
>
> I believe this clearly points to an issue with pg_cron.  Would you agree?
>
> Regards
> Steve
>
>


-- 
Regards
Junwang Zhao