Re: how to add more than 1600 columns in a table?

2019-04-24 Thread Michel Pelletier
On Wed, Apr 24, 2019 at 3:11 PM pabloa98  wrote:

> We used tables because we have 2 types of queries on this table:
>
> SELECT * FROM table_wih_lots_of_columns WHERE condition involving a lot of
> columns.
> These type of queries read lot of rows.
>
> or
>
> SELECT columnX FROM table_wih_lots_of_columns WHERE condition involving a
> lot of columns
> These type of queries read very few rows.
>
>
Everyone else has had great advice on this, I'd like to add that arrays of
any dimension are limited to 1GB like all varlena objects.

You should check out pg-strom, it's highly optimized for running exactly
these kinds of queries on a GPU and comes with a native matrix type that
can exceed the 1GB limit.

http://heterodb.github.io/pg-strom/

-Michel

>
>>


Re: how to add more than 1600 columns in a table?

2019-04-24 Thread Gavin Flower

On 25/04/2019 10:55, Alvaro Herrera wrote:

On 2019-Apr-24, pabloa98 wrote:


Regarding to (2), We are good by adding a patch and recompile a patched
version for our server databases.

But we are open on helping to add thousands of columns support as a
compile-time parameter if there are other people interested.

It's hard to say what you're doing wrong when we don't know
what are you actually doing.

I think raising the limit requires changing ItemIdData, t_hoff, and a
few members of PageHeaderData at the very least.  Reading the three
header files involved carefully would probably point out areas I've
forgotten to mention.  I think if you enlarge t_hoff and lp_off/lp_len
to 16 bits, you can use 64kB blocks, which might be useful too.

Note that with pg12 you could have your own table AM that supported
wider ItemIds as a (small?) change on heapam, rather than supplant it
for all tables.  That way you would only pay the (probably considerable)
cost of the wider line pointers on all tables ...

I wonder if it might prove a killer feature for some niche uses! 
Stranger things have come to pass.


Suspect that going beyond 1600 columns would never be the default, even 
if the pg core devs, were happy to allow it as an official option 
(presumably at compile time?).  As I think it would have negative 
performance impacts on the most uses of pg - as Alvaro hinted at.  IMnsHO


Certainly the will be many people intrigued as to what you are trying to 
do, even if we never want to do the same ourselves.







Re: how to add more than 1600 columns in a table?

2019-04-24 Thread Ron

On 4/24/19 5:55 PM, Alvaro Herrera wrote:

On 2019-Apr-24, pabloa98 wrote:


Regarding to (2), We are good by adding a patch and recompile a patched
version for our server databases.

But we are open on helping to add thousands of columns support as a
compile-time parameter if there are other people interested.

It's hard to say what you're doing wrong when we don't know
what are you actually doing.

I think raising the limit requires changing ItemIdData, t_hoff, and a
few members of PageHeaderData at the very least.  Reading the three
header files involved carefully would probably point out areas I've
forgotten to mention.  I think if you enlarge t_hoff and lp_off/lp_len
to 16 bits, you can use 64kB blocks, which might be useful too.

Note that with pg12 you could have your own table AM that supported
wider ItemIds as a (small?) change on heapam, rather than supplant it
for all tables.  That way you would only pay the (probably considerable)
cost of the wider line pointers on all tables ...


Do views have the same 1600 column limit as tables?

--
Angular momentum makes the world go 'round.




Re: how to add more than 1600 columns in a table?

2019-04-24 Thread Alvaro Herrera
On 2019-Apr-24, Alvaro Herrera wrote:


> Note that with pg12 you could have your own table AM that supported
> wider ItemIds as a (small?) change on heapam, rather than supplant it
> for all tables.  That way you would only pay the (probably considerable)
> cost of the wider line pointers on all tables ...

"That way you would only pay the (probably considerable) cost of the
wider line pointers on the tables that need it rather than all of them."

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: how to add more than 1600 columns in a table?

2019-04-24 Thread Alvaro Herrera
On 2019-Apr-24, pabloa98 wrote:

> Regarding to (2), We are good by adding a patch and recompile a patched
> version for our server databases.
> 
> But we are open on helping to add thousands of columns support as a
> compile-time parameter if there are other people interested.

It's hard to say what you're doing wrong when we don't know
what are you actually doing.

I think raising the limit requires changing ItemIdData, t_hoff, and a
few members of PageHeaderData at the very least.  Reading the three
header files involved carefully would probably point out areas I've
forgotten to mention.  I think if you enlarge t_hoff and lp_off/lp_len
to 16 bits, you can use 64kB blocks, which might be useful too.

Note that with pg12 you could have your own table AM that supported
wider ItemIds as a (small?) change on heapam, rather than supplant it
for all tables.  That way you would only pay the (probably considerable)
cost of the wider line pointers on all tables ...

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: how to add more than 1600 columns in a table?

2019-04-24 Thread pabloa98
On Wed, Apr 24, 2019 at 3:28 PM Gavin Flower 
wrote:

The convention here is to bottom post, or to intersperse comments, like
> in all the replies to you.
>
> So it would be appreciated if you did that, rather than top post as you
> have been doing.
>
>
Thanks for the advice. I will follow the convention.



> I strongly suspect that:
>
> (1) making pg handle more than 1600 columns, would be way more
> complicated than you can imagine
>
> (2) suich a change would be unlikely to be accepted into the main
> line, which would mean you'd have to reapply your patch for every
> new version of pg you wanted to use!
>
>
Yes. Our 1st attempt showed us (1).

Regarding to (2), We are good by adding a patch and recompile a patched
version for our server databases.

But we are open on helping to add thousands of columns support as a
compile-time parameter if there are other people interested.

Something like --with_thousands_of_columns_support feature

We could add a node to the build farm to test this functionality on Ubuntu.
And helping and writing tests.

Pablo


>
> Cheers,
> Gavin
>
>


Re: how to add more than 1600 columns in a table?

2019-04-24 Thread Gavin Flower

On 25/04/2019 10:11, pabloa98 wrote:

Thank you Joe! I will take a look

Pablo

On Wed, Apr 24, 2019 at 1:47 PM Joe Conway > wrote:


[...]

Hi Pablo,

The convention here is to bottom post, or to intersperse comments, like 
in all the replies to you.


So it would be appreciated if you did that, rather than top post as you 
have been doing.


I strongly suspect that:

   (1) making pg handle more than 1600 columns, would be way more
   complicated than you can imagine

   (2) suich a change would be unlikely to be accepted into the main
   line, which would mean you'd have to reapply your patch for every
   new version of pg you wanted to use!


Cheers,
Gavin





Re: how to add more than 1600 columns in a table?

2019-04-24 Thread pabloa98
if there is some no-SQL database supporting ACID + generic queries like a
SQL database, I would consider it.

However, the column limit is the problem, no the database. 1200 cols are
reasonable for standard problems. Having a Postgresql supporting thousands
of columns will be useful in Machine Learning and similar domains.

Pablo

On Wed, Apr 24, 2019 at 1:23 PM Ron  wrote:

> On 4/24/19 3:17 PM, pabloa98 wrote:
> > Hello
> >
> > Sadly today we hit the 1600 columns limit of Postgresql 11.
> >
> > How could we add more columns?
> >
> > Note: Tables are OK. We truly have 2400 columns now. Each column
> > represents a value in a matrix.
> >
> > We have millions of rows so I would prefer not to transpose each row to
> > (x, y, column_value) triplets because it will make all our code more
> > difficult of what it is.
> >
> > Any help, very appreciated.
>
> Is a relational database the proper tool for the job?
>
> --
> Angular momentum makes the world go 'round.
>
>
>


Re: how to add more than 1600 columns in a table?

2019-04-24 Thread pabloa98
Arrays could work, but it will make our code less clear. It is good to read
the column name (meaningful) than a number. We could use constants, but
then we should maintain them...

Pablo

On Wed, Apr 24, 2019 at 1:24 PM Alvaro Herrera 
wrote:

> On 2019-Apr-24, pabloa98 wrote:
>
> > How could we add more columns?
>
> Sorry.
>
> > Note: Tables are OK. We truly have 2400 columns now. Each column
> represents
> > a value in a matrix.
>
> Maybe you could use arrays?
>
> --
> Álvaro Herrerahttps://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: how to add more than 1600 columns in a table?

2019-04-24 Thread pabloa98
Thank you Joe! I will take a look

Pablo

On Wed, Apr 24, 2019 at 1:47 PM Joe Conway  wrote:

> On 4/24/19 4:17 PM, pabloa98 wrote:
> > Sadly today we hit the 1600 columns limit of Postgresql 11.
> >
> > How could we add more columns?
> >
> > Note: Tables are OK. We truly have 2400 columns now. Each column
> > represents a value in a matrix.
>
> As everyone else has mentioned, your use case sounds like arrays are
> appropriate. Note that PostgreSQL supports 2 dimensional (actually more
> than 2d if you wanted) arrays which are essentially perfect for
> representing a matrix.
>
> If this works for your data model it will likely save a ton of storage
> space and perform much better than alternatives.
>
> FWIW, if you are needing to do matrix math, you might want to look into
> PL/R (https://github.com/postgres-plr/plr) as it supports 2d arrays as
> arguments which are converted directly into R matrices.
>
> I don't know for sure but likely PL/Python could be used to process
> matrices as well.
>
> HTH,
>
> Joe
>
> --
> Crunchy Data - http://crunchydata.com
> PostgreSQL Support for Secure Enterprises
> Consulting, Training, & Open Source Development
>
>


Re: how to add more than 1600 columns in a table?

2019-04-24 Thread pabloa98
Tom,

We are using currently 32kb pages. It was working fine with our 1200
columns tables. The new table design has around 2400 columns and it does
not fit in our modified postgresql11.

We are ready (meaning, all the ready we could be with java and python but
not too much C background) to modify the database source code if it is
possible. We tried that a few months ago and we did not go too far. :)
I believe we replace an 8 bits int by a 32 bytes int (perhaps in t_heap? I
am not sure) but it looked like we missed something because it did not
work. Several tests failed and some of them paged faulted the server.

I think we had to modify other datastructures and some code in places like
JIT to make it work.

A word about the database design: These tables are vectors (or matrixes) of
numbers, booleans, and other vectors. They are normalized and all the
columns are dependent on their PK column.

We used tables because we have 2 types of queries on this table:

SELECT * FROM table_wih_lots_of_columns WHERE condition involving a lot of
columns.
These type of queries read lot of rows.

or

SELECT columnX FROM table_wih_lots_of_columns WHERE condition involving a
lot of columns
These type of queries read very few rows.

UPDATES are at row level involving usually all the columns.

We like to use ACID semantic on these tables. So we chose PostgreSQL
because we can create transactions, joins with other (more normal) tables
and all that (compared with MongoDB it looks far better).

We considered several alternatives:

* Storing rows as JSONB, we would have to add a parsing stage in our code
that will generate a lot of temporal objects. Besides, it will need a LOT
more space.
* Storing rows as ARRAY[], some columns are arrays themselves. Other are
booleans (that could be converted to numbers). The worst part is that we
will have to reference a "column" using something like row_array[423] and
trust it has the right data type. It would make our far more complicated.
* Storing info in vertical columns using arrays (the same problem than
before, but adding much more complexity to the code).
* Storing outside as text files, no more ACID semantic :/

Having a plain table design let us write very clear code.

I think that PostgreSQL supporting long tables would give to the database a
definitely advantage compared to other technologies used in ML and
scientific programming (like MongoDB and Cassandra). Be

Pablo


On Wed, Apr 24, 2019 at 1:27 PM Tom Lane  wrote:

> pabloa98  writes:
> > Sadly today we hit the 1600 columns limit of Postgresql 11.
> > How could we add more columns?
>
> You can't, at least not without some pretty fundamental changes;
> that value is limited by field sizes within tuple headers.  You'd
> also, more than likely, find yourself hitting problems with the
> tuples-can't-cross-page-boundary restrictions.  (Our TOAST mechanism
> avoids that problem for large individual fields, but not for many small
> fields.)
>
> It seems pretty unlikely to me that any sane table design actually
> has thousands of truly-independent columns.  Consider using arrays,
> or perhaps composite sub-structures (JSON maybe?).
>
> regards, tom lane
>


Re: how to add more than 1600 columns in a table?

2019-04-24 Thread Joe Conway
On 4/24/19 4:17 PM, pabloa98 wrote:
> Sadly today we hit the 1600 columns limit of Postgresql 11.
> 
> How could we add more columns?
> 
> Note: Tables are OK. We truly have 2400 columns now. Each column
> represents a value in a matrix.

As everyone else has mentioned, your use case sounds like arrays are
appropriate. Note that PostgreSQL supports 2 dimensional (actually more
than 2d if you wanted) arrays which are essentially perfect for
representing a matrix.

If this works for your data model it will likely save a ton of storage
space and perform much better than alternatives.

FWIW, if you are needing to do matrix math, you might want to look into
PL/R (https://github.com/postgres-plr/plr) as it supports 2d arrays as
arguments which are converted directly into R matrices.

I don't know for sure but likely PL/Python could be used to process
matrices as well.

HTH,

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



signature.asc
Description: OpenPGP digital signature


Re: how to add more than 1600 columns in a table?

2019-04-24 Thread David G. Johnston
On Wed, Apr 24, 2019 at 1:17 PM pabloa98  wrote:

> Hello
>
> Sadly today we hit the 1600 columns limit of Postgresql 11.
>
> How could we add more columns?
>

Add a second table and relate them together via a one-to-one-required
relationship?

I'm unsure whether a final result set is limited or just tables...

David J.


Re: how to add more than 1600 columns in a table?

2019-04-24 Thread Tom Lane
pabloa98  writes:
> Sadly today we hit the 1600 columns limit of Postgresql 11.
> How could we add more columns?

You can't, at least not without some pretty fundamental changes;
that value is limited by field sizes within tuple headers.  You'd
also, more than likely, find yourself hitting problems with the
tuples-can't-cross-page-boundary restrictions.  (Our TOAST mechanism
avoids that problem for large individual fields, but not for many small
fields.)

It seems pretty unlikely to me that any sane table design actually
has thousands of truly-independent columns.  Consider using arrays,
or perhaps composite sub-structures (JSON maybe?).

regards, tom lane




Re: how to add more than 1600 columns in a table?

2019-04-24 Thread Alvaro Herrera
On 2019-Apr-24, pabloa98 wrote:

> How could we add more columns?

Sorry.

> Note: Tables are OK. We truly have 2400 columns now. Each column represents
> a value in a matrix.

Maybe you could use arrays?

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: how to add more than 1600 columns in a table?

2019-04-24 Thread Ron

On 4/24/19 3:22 PM, Adrian Klaver wrote:

On 4/24/19 1:17 PM, pabloa98 wrote:

Hello

Sadly today we hit the 1600 columns limit of Postgresql 11.

How could we add more columns?

Note: Tables are OK. We truly have 2400 columns now. Each column 
represents a value in a matrix.


Not sure how hit 1600 column limit = having 2400 columns?


I think he means that *the design* has 2400 columns, and Pg is failing to 
implement it.


--
Angular momentum makes the world go 'round.


Re: how to add more than 1600 columns in a table?

2019-04-24 Thread Adrian Klaver

On 4/24/19 1:17 PM, pabloa98 wrote:

Hello

Sadly today we hit the 1600 columns limit of Postgresql 11.

How could we add more columns?

Note: Tables are OK. We truly have 2400 columns now. Each column 
represents a value in a matrix.


Not sure how hit 1600 column limit = having 2400 columns?


Off hand I would say you are using the wrong tool for the job. More 
information on what you trying to achieve might help get you to the 
correct tool.




We have millions of rows so I would prefer not to transpose each row to 
(x, y, column_value) triplets because it will make all our code more 
difficult of what it is.


Any help, very appreciated.

Pablo



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: how to add more than 1600 columns in a table?

2019-04-24 Thread Ron

On 4/24/19 3:17 PM, pabloa98 wrote:

Hello

Sadly today we hit the 1600 columns limit of Postgresql 11.

How could we add more columns?

Note: Tables are OK. We truly have 2400 columns now. Each column 
represents a value in a matrix.


We have millions of rows so I would prefer not to transpose each row to 
(x, y, column_value) triplets because it will make all our code more 
difficult of what it is.


Any help, very appreciated.


Is a relational database the proper tool for the job?

--
Angular momentum makes the world go 'round.




Re: how to add more than 1600 columns in a table?

2019-04-24 Thread Thomas Kellerer

pabloa98 schrieb am 24.04.2019 um 22:17:

Sadly today we hit the 1600 columns limit of Postgresql 11.

How could we add more columns?

Note: Tables are OK. We truly have 2400 columns now. Each column
represents a value in a matrix.

We have millions of rows so I would prefer not to transpose each row
to (x, y, column_value) triplets because it will make all our code
more difficult of what it is.


What about using an array?






how to add more than 1600 columns in a table?

2019-04-24 Thread pabloa98
Hello

Sadly today we hit the 1600 columns limit of Postgresql 11.

How could we add more columns?

Note: Tables are OK. We truly have 2400 columns now. Each column represents
a value in a matrix.

We have millions of rows so I would prefer not to transpose each row to (x,
y, column_value) triplets because it will make all our code more difficult
of what it is.

Any help, very appreciated.

Pablo