Re: [GENERAL] COPY vs \COPY FROM PROGRAM $$ quoting difference?

2017-09-29 Thread Alexander Stoddard
On Fri, Sep 29, 2017 at 11:54 AM, David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Fri, Sep 29, 2017 at 9:27 AM, Alexander Stoddard <
> alexander.stodd...@gmail.com> wrote:
>
>> I found what seems to be an odd difference between COPY and \copy parsing.
>>
> [...]
>
>
>
>> COPY dest_table FROM PROGRAM $$ sed 's/x/y/' | etc... $$
>>
>> To my surprise this worked with COPY but not \COPY which failed with:
>> \copy: parse error at "$$"
>>
>> Is this an undocumented difference? Is this even the appropriate email
>> list to ask this kind of question or report such a difference?
>>
>
> This is the correct place for seeking such clarification.  The docs
> cannot cover every possible thing people might do and these lists (-general
> in particular) are here to fill in the gaps.
>
> The negative condition that "psql" itself doesn't understand
> dollar-quoting is not documented.  Dollar-quoting is documented as a
> server-interpreted SQL Syntax feature and only applies there.
>
> While the commands are similar, COPY is server-side SQL while \copy is a
> psql meta-command that psql converts to SQL, executes, obtains the results,
> and processes.  Note that the server would never see "PROGRAM $$" since the
> server would be unable to access the local program being referred to.  The
> server sees "FROM stdin" and psql feeds the results of the PROGRAM
> invocation to the server over that pipe.
>
> David J.
>
>
Thank you, David. That helps make sense of everything. There is also the
situation where psql is executed by a non-superuser on the server, but the
docs make clear that only STDOUT / STDIN, not named files or commands,
are allowed in that case. So I now realize I would just have been trading a
parse error for a security one had my dollar-quoting worked with \copy!
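
For the record, a minimal sketch of the client-side form that sidesteps both
the dollar-quoting parse error and the superuser restriction (untested; the
table name, program and path are placeholders):

\copy dest_table from program 'gunzip -c /tmp/data.csv.gz' with (format csv)

Because psql runs the program locally and streams its output to the server as
"FROM stdin", no server-side file or program access is needed.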


Re: [GENERAL] COPY vs \COPY FROM PROGRAM $$ quoting difference?

2017-09-29 Thread David G. Johnston
On Fri, Sep 29, 2017 at 9:27 AM, Alexander Stoddard <
alexander.stodd...@gmail.com> wrote:

> I found what seems to be an odd difference between COPY and \copy parsing.
>
​[...]
​


> COPY dest_table FROM PROGRAM $$ sed 's/x/y/' | etc... $$
>
> To my surprise this worked with COPY but not \COPY which failed with:
> \copy: parse error at "$$"
>
> Is this an undocumented difference? Is this even the appropriate email
> list to ask this kind of question or report such a difference?
>

​This is the correct place for seeking such clarification.​  The docs
cannot cover every possible thing people might do and these lists (-general
in particular) are here to fill in the gaps.

​The negative condition that "psql" itself doesn't understand
dollar-quoting​ is not documented.  Dollar-quoting is documented as a
server-interpreted SQL Syntax feature and only applies there.

While the commands are similar, COPY is server-side SQL while \copy is a
psql meta-command that psql converts to SQL, executes, obtains the results,
and processes.  Note that the server would never see "PROGRAM $$" since the
server would be unable to access the local program being referred to.  The
server sees "FROM stdin" and psql feeds the results of the PROGRAM
invocation to the server over that pipe.

David J.


Re: [GENERAL] COPY: row is too big

2017-05-27 Thread doganmeh
Yes, csvkit is what I decided to go with. Thank you all!







Re: [GENERAL] COPY: row is too big

2017-05-27 Thread doganmeh
Yes, the delimiter was indeed ",". I fixed my original post. It seems I
carelessly copied and pasted from Excel.







Re: [GENERAL] COPY: row is too big

2017-05-26 Thread Tom Lane
doganmeh  writes:
> I tried varchar(12) also, nothing changed. My questions is 1) I have
> 672x12=8,064 characters in the first row (which are actually the headers),
> why would it complain that it is 8760.

No, you have 672*13, because each varchar value will require a length
word (which is only 1 byte for short values like these).  Adding the
24-byte row header comes to 8760.

> 2) Is there anything I can do to work
> around this situation?

Maybe you could combine the strings into an array?  A large array would
be subject to compression and/or out-of-line storage, but 12-byte fields
are too small to benefit from either.
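
A rough sketch of that idea (untested; names are made up) - the 672 small
fields collapse into a single array value, which is then large enough to be
compressed or stored out of line:

create table temp_table_arr (vals text[]);  -- one array column instead of 672 varchar(12) columns
insert into temp_table_arr values (array['SA03ARE1015D', 'SA03ARE1S15N', 'SB03ARE1015D']);
select vals[1] as first_field, array_length(vals, 1) as n_fields from temp_table_arr;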

regards, tom lane




Re: [GENERAL] COPY: row is too big

2017-05-26 Thread Andreas Kretschmer



Am 26.05.2017 um 14:07 schrieb doganmeh:


I tried varchar(12) also, nothing changed. My questions is 1) I have
672x12=8,064 characters in the first row (which are actually the headers),
why would it complain that it is 8760. I am assuming here type `text`
occupies 1 byte for a character.


please consider special chars, a little example:

test=*# create table demo(id int, t text);
CREATE TABLE
test=*# insert into demo values (1, '123')
test-# ;
INSERT 0 1
test=*# insert into demo values (2, '€€€');
INSERT 0 1
test=*# select id, t, length(t), pg_column_size(t) from demo;
 id |  t  | length | pg_column_size
----+-----+--------+----------------
  1 | 123 |      3 |              4
  2 | €€€ |      3 |             10
(2 rows)



--
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com





Re: [GENERAL] COPY: row is too big

2017-05-26 Thread Charles Clavadetscher
Hello

> -Original Message-
> From: pgsql-general-ow...@postgresql.org 
> [mailto:pgsql-general-ow...@postgresql.org] On Behalf Of doganmeh
> Sent: Freitag, 26. Mai 2017 14:08
> To: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] COPY: row is too big
> 
> I am piggy-backing in this thread because I have the same issue as well. I 
> need to import a csv file that is 672
> columns long and each column consists of 12 alpha-numeric characters. Such as:
> 
> SA03ARE1015D  SA03ARE1S15NSB03ARE1015D  ...
> 356412275812  43106  ...
> 
> I am aware this is not normalized, however, we (or try to) keep source data 
> intact, and normalize after importing
> into our system.
> 
> While trying to import all columns to type `text` I get this error:
> 
> [54000] ERROR: row is too big: size 8760, maximum size 8160
> Where: COPY temp_table, line 3
> SQL statement "copy temp_table from
> '/home/edgleweb/data/raw/TX/TAPR/2015/ADV/SSTAAR1ADV.csv' with delimiter ','
> quote '"' csv "

Is the delimiter really ','? In the lines above it seems to be different.
Did you check line 3?

Regards,
Charles

> I tried varchar(12) also, nothing changed. My questions is 1) I have
> 672x12=8,064 characters in the first row (which are actually the headers), 
> why would it complain that it is 8760. I
> am assuming here type `text` occupies 1 byte for a character. 2) Is there 
> anything I can do to work around this
> situation?
> 
> Thanks in advance.
> 
> 
> 





Re: [GENERAL] COPY: row is too big

2017-05-26 Thread Adrian Klaver

On 05/26/2017 05:07 AM, doganmeh wrote:

I am piggy-backing in this thread because I have the same issue as well. I
need to import a csv file that is 672 columns long and each column consists
of 12 alpha-numeric characters. Such as:

SA03ARE1015DSA03ARE1S15NSB03ARE1015D  ...
356412  275812  43106  ...

I am aware this is not normalized, however, we (or try to) keep source data
intact, and normalize after importing into our system.

While trying to import all columns to type `text` I get this error:

[54000] ERROR: row is too big: size 8760, maximum size 8160
Where: COPY temp_table, line 3
SQL statement "copy temp_table from
'/home/edgleweb/data/raw/TX/TAPR/2015/ADV/SSTAAR1ADV.csv' with delimiter ','
quote '"' csv "

I tried varchar(12) also, nothing changed. My questions is 1) I have
672x12=8,064 characters in the first row (which are actually the headers),
why would it complain that it is 8760. I am assuming here type `text`
occupies 1 byte for a character. 2) Is there anything I can do to work


https://www.postgresql.org/docs/9.6/static/datatype-character.html

"The storage requirement for a short string (up to 126 bytes) is 1 byte 
plus the actual string, which includes the space padding in the case of 
character."



around this situation?


Use csvkit's csvcut tool to split the file?:

http://csvkit.readthedocs.io/en/1.0.2/scripts/csvcut.html



Thanks in advance.








--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] COPY: row is too big

2017-05-26 Thread doganmeh
BTW, we have pg 9.5 running on Ubuntu.







Re: [GENERAL] COPY: row is too big

2017-05-26 Thread doganmeh
I am piggy-backing in this thread because I have the same issue as well. I
need to import a csv file that is 672 columns long and each column consists
of 12 alpha-numeric characters. Such as:

SA03ARE1015DSA03ARE1S15NSB03ARE1015D  ...
356412  275812  43106  ...

I am aware this is not normalized, however, we (or try to) keep source data
intact, and normalize after importing into our system. 

While trying to import all columns to type `text` I get this error:

[54000] ERROR: row is too big: size 8760, maximum size 8160
Where: COPY temp_table, line 3
SQL statement "copy temp_table from
'/home/edgleweb/data/raw/TX/TAPR/2015/ADV/SSTAAR1ADV.csv' with delimiter ','
quote '"' csv "

I tried varchar(12) also, nothing changed. My questions are: 1) I have
672x12=8,064 characters in the first row (which are actually the headers),
so why would it complain that it is 8760? I am assuming here that type `text`
occupies 1 byte per character. 2) Is there anything I can do to work
around this situation?

Thanks in advance. 







Re: [GENERAL] COPY ... FROM stdin WITH FORMAT csv

2017-03-22 Thread Alexander Farber
Hi David,

On Tue, Mar 21, 2017 at 8:21 PM, David G. Johnston <
david.g.johns...@gmail.com> wrote:
>
> On Tuesday, March 21, 2017, Alexander Farber 
wrote:
>>
>> words=> COPY words_reviews (uid, author, nice, review, updated) FROM
stdin FORMAT csv;
>
>
> What did you read that led you to think the above should work?
>

OK thanks, I had misunderstood your text -

"The above means the entire "WITH" section is optional, as is the word
WITH.  However, if you want to add "with" options they must appear within
parentheses, those are not optional.  Multiple options can appear within
the single set of parentheses."

I think the root cause is that it is difficult to understand an English
sentence trying to describe a keyword "with" :-)

Regards
Alex


Re: [GENERAL] COPY ... FROM stdin WITH FORMAT csv

2017-03-21 Thread David G. Johnston
On Tue, Mar 21, 2017 at 1:45 PM, Adrian Klaver 
wrote:

> On 03/21/2017 12:11 PM, Alexander Farber wrote:
>
>> Thank you - this has worked:
>>
>> COPY words_reviews (uid, author, nice, review, updated) FROM stdin WITH
>> (FORMAT csv);
>> 1,2,1,'1 is nice by 2','2017-03-01'
>> 1,3,1,'1 is nice by 3','2017-03-02'
>> 1,4,1,'1 is nice by 4','2017-03-03'
>> 2,1,1,'2 is nice by 1','2017-03-01'
>> 2,3,1,'2 is nice by 3','2017-03-02'
>> 2,4,0,'2 is not nice by 4','2017-03-03'
>> \.
>>
>> but I am confused about the comments that I should use \copy and not
>> just COPY and also that I could leave out WITH and brackets.
>>
>
> The difference between COPY and \copy is explained here:
>
> https://www.postgresql.org/docs/9.6/static/app-psql.html
>
> \copy 
>
> Basically COPY runs as the server user and so the files it uses have to be
> accessible by the user the server runs as. \copy is a psql meta command
> that runs as local user so it can access files the local user can see and
> has privileges on.


​I take it, then, if one chooses not to use pre-existing files and instead
inline the content as shown here, the choice between \copy and COPY becomes
a matter of style and not correctness.  As a matter of style using \copy
makes it clear that everything that is needed to make the command work
exists locally.

The advantage to using "COPY FROM|TO stdin|stdout" is that the copy command
itself could (not tested) be spread out over multiple lines - which
especially for COPY TO can be advantageous.  The entire \copy meta-command
cannot be split up.
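
For example (a sketch using the column names from this thread):

-- the SQL COPY statement can be laid out freely over several lines:
COPY (SELECT uid, author, nice
      FROM words_reviews
      WHERE nice = 1)
TO STDOUT WITH (FORMAT csv);

-- the \copy meta-command has to stay on a single line:
\copy (SELECT uid, author, nice FROM words_reviews WHERE nice = 1) to stdout with (format csv)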

David J.


Re: [GENERAL] COPY ... FROM stdin WITH FORMAT csv

2017-03-21 Thread Adrian Klaver

On 03/21/2017 12:11 PM, Alexander Farber wrote:

Thank you - this has worked:

COPY words_reviews (uid, author, nice, review, updated) FROM stdin WITH
(FORMAT csv);
1,2,1,'1 is nice by 2','2017-03-01'
1,3,1,'1 is nice by 3','2017-03-02'
1,4,1,'1 is nice by 4','2017-03-03'
2,1,1,'2 is nice by 1','2017-03-01'
2,3,1,'2 is nice by 3','2017-03-02'
2,4,0,'2 is not nice by 4','2017-03-03'
\.

but I am confused about the comments that I should use \copy and not
just COPY and also that I could leave out WITH and brackets.


The difference between COPY and \copy is explained here:

https://www.postgresql.org/docs/9.6/static/app-psql.html

\copy 

Basically COPY runs as the server user and so the files it uses have to 
be accessible by the user the server runs as. \copy is a psql meta 
command that runs as local user so it can access files the local user 
can see and has privileges on.





Because please take a look at the 9.6.2 psql output (COPY with WITH and
brackets works, leaving them out does not):

words=> COPY words_reviews (uid, author, nice, review, updated) FROM
stdin WITH (FORMAT csv);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.

1,2,1,'1 is nice by 2','2017-03-01'

1,3,1,'1 is nice by 3','2017-03-02'
1,4,1,'1 is nice by 4','2017-03-03'
2,1,1,'2 is nice by 1','2017-03-01'
2,3,1,'2 is nice by 3','2017-03-02'
2,4,0,'2 is not nice by 4','2017-03-03'
\.

>> >> >> >> >> COPY 6

words=> COPY words_reviews (uid, author, nice, review, updated) FROM
stdin FORMAT csv;
ERROR:  syntax error at or near "FORMAT"
LINE 1: ...s (uid, author, nice, review, updated) FROM stdin FORMAT csv...
 ^




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] COPY ... FROM stdin WITH FORMAT csv

2017-03-21 Thread David G. Johnston
On Tue, Mar 21, 2017 at 12:45 PM, Paul Jungwirth <
p...@illuminatedcomputing.com> wrote:

> On 03/21/2017 12:21 PM, David G. Johnston wrote:
>
>> >   words=> COPY words_reviews (uid, author, nice, review, updated) FROM
>> >   stdin FORMAT csv;
>>
>> What did you read that led you to think the above should work?
>>
>
> I don't know about COPY FROM, but COPY TO works without parens (or
> FORMAT), like this:
>
> db=> copy (select 1+1, 2+2) to stdout with csv;
> 2,4
>
> (tested on pg 9.5)
>
> I never use parens, so I was surprised to see in the docs and the replies
> here that they are necessary. Am I just exploiting a bug in the parser?


That's documented backward compatibility syntax:

https://www.postgresql.org/docs/9.6/static/sql-copy.html

bottom of the page.
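
For example, both of these should behave the same (a sketch reusing Paul's
query; the second form is the pre-9.0 compatibility syntax):

COPY (SELECT 1+1, 2+2) TO STDOUT WITH (FORMAT csv);
COPY (SELECT 1+1, 2+2) TO STDOUT WITH CSV;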

David J.
​


Re: [GENERAL] COPY ... FROM stdin WITH FORMAT csv

2017-03-21 Thread Paul Jungwirth

On 03/21/2017 12:21 PM, David G. Johnston wrote:

>   words=> COPY words_reviews (uid, author, nice, review, updated) FROM
>   stdin FORMAT csv;

What did you read that led you to think the above should work?


I don't know about COPY FROM, but COPY TO works without parens (or 
FORMAT), like this:


db=> copy (select 1+1, 2+2) to stdout with csv; 


2,4

(tested on pg 9.5)

I never use parens, so I was surprised to see in the docs and the 
replies here that they are necessary. Am I just exploiting a bug in the 
parser?


Paul





Re: [GENERAL] COPY ... FROM stdin WITH FORMAT csv

2017-03-21 Thread David G. Johnston
On Tuesday, March 21, 2017, Alexander Farber 
wrote:
>
> words=> COPY words_reviews (uid, author, nice, review, updated) FROM stdin
> FORMAT csv;
>

What did you read that led you to think the above should work?

David J.


Re: [GENERAL] COPY ... FROM stdin WITH FORMAT csv

2017-03-21 Thread Alexander Farber
Thank you - this has worked:

COPY words_reviews (uid, author, nice, review, updated) FROM stdin WITH
(FORMAT csv);
1,2,1,'1 is nice by 2','2017-03-01'
1,3,1,'1 is nice by 3','2017-03-02'
1,4,1,'1 is nice by 4','2017-03-03'
2,1,1,'2 is nice by 1','2017-03-01'
2,3,1,'2 is nice by 3','2017-03-02'
2,4,0,'2 is not nice by 4','2017-03-03'
\.

but I am confused about the comments that I should use \copy and not just
COPY and also that I could leave out WITH and brackets.

Because please take a look at the 9.6.2 psql output (COPY with WITH and
brackets works, leaving them out does not):

words=> COPY words_reviews (uid, author, nice, review, updated) FROM stdin
WITH (FORMAT csv);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> 1,2,1,'1 is nice by 2','2017-03-01'
1,3,1,'1 is nice by 3','2017-03-02'
1,4,1,'1 is nice by 4','2017-03-03'
2,1,1,'2 is nice by 1','2017-03-01'
2,3,1,'2 is nice by 3','2017-03-02'
2,4,0,'2 is not nice by 4','2017-03-03'
\.
>> >> >> >> >> >> COPY 6
words=> COPY words_reviews (uid, author, nice, review, updated) FROM stdin
FORMAT csv;
ERROR:  syntax error at or near "FORMAT"
LINE 1: ...s (uid, author, nice, review, updated) FROM stdin FORMAT csv...
 ^


Re: [GENERAL] COPY ... FROM stdin WITH FORMAT csv

2017-03-21 Thread Francisco Olarte
Alexander:

On Tue, Mar 21, 2017 at 6:31 PM, Alexander Farber
 wrote:
> I keep rereading https://www.postgresql.org/docs/9.6/static/sql-copy.html
> but just can't figure the proper syntax to put some records into the table:

It's not that complex, let's see

> words=> COPY words_reviews (uid, author, nice, review, updated) FROM stdin
> WITH FORMAT 'csv';
> ERROR:  syntax error at or near "FORMAT"
> LINE 1: ...d, author, nice, review, updated) FROM stdin WITH FORMAT 'cs...
>  ^

Here you have an error ( more on this later ) so the next lines are
going to be interpreted as a new command

> words=> 1,2,1,'1 is nice by 2','2017-03-01',

Which you can clearly see because the prompt is => , initial, not ->,
continuation.

> words-> 1,3,1,'1 is nice by 3','2017-03-02',
> words-> 1,4,1,'1 is nice by 4','2017-03-03',
> words-> 2,1,1,'2 is nice by 1','2017-03-01',
> words-> 2,3,1,'2 is nice by 3','2017-03-02',

Also, you are putting an extra , at the end of the lines. This means
you have an empty string at the end, one extra field. I do not
remember now if it hurts, but better omit it.

> words-> 2,4,0,'2 is not nice by 4','2017-03-03'
> words-> \.
> Invalid command \.. Try \? for help.

All the lines up to here are considered part of the previous sql (
remember => vs -> ) command. You are not in copy mode, so psql tries
to interpret '\.' as a meta command ( like \d ) but fails.

> words-> ;
> ERROR:  syntax error at or near "1"
> LINE 1: 1,2,1,'1 is nice by 2','2017-03-01',
> ^

And here you terminate the SQL command, so it fails ( note it refers
to the first error, the initial line with => ).

> I am not sure if FORMAT csv or FORMAT 'csv' should be used.

That is easy, try both. BUT! If you read the manual with care you will
notice it is "with ( options )", not "with options", so you are
missing parentheses:

web_tarotpagos_staging=# create temporary table t(a varchar, b varchar);
CREATE TABLE

-- NO parens, no luck:
web_tarotpagos_staging=# copy t(a,b) from stdin with format csv;
ERROR:  syntax error at or near "format"
LINE 1: copy t(a,b) from stdin with format csv;
^
web_tarotpagos_staging=# copy t(a,b) from stdin with format 'csv';
ERROR:  syntax error at or near "format"
LINE 1: copy t(a,b) from stdin with format 'csv';
                                    ^

BUT, as soon as I put them:

web_tarotpagos_staging=# copy t(a,b) from stdin with (format csv);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> \.

Note the enter data approach. Also note copy is safe to try as you can
just abort it.

> And I have tried adding/removing commas at the end of lines too.
That is commented above.

> I have looked at pg_dump output, but it does not use csv.

pg_dump uses the default text format, a little more difficult but
vastly superior ( IMNSHO ) to CSV. It ( by default ) separates records
with newlines and fields with tab, and escapes newlines, tabs and
backslashes in data with backslash, so the transformation is
contextless, much easier than csv:

Copy out: Replace NULL with '\N', newline with '\n', tab with '\t',
backslash with '\\', join fields with tab, print with newline at end.

Copy In: Read till newline, split on tabs, replace '\n' with newline,
'\t' with tab, '\\' with backslash.

Much easier to get right than CSV ( how do you encode the C string ",;
\n\"\n\t;  \t\"\'" ? )

Francisco Olarte.




Re: [GENERAL] COPY ... FROM stdin WITH FORMAT csv

2017-03-21 Thread John R Pierce

On 3/21/2017 10:31 AM, Alexander Farber wrote:
words=> COPY words_reviews (uid, author, nice, review, updated) FROM 
stdin WITH FORMAT 'csv';

ERROR:  syntax error at or near "FORMAT"
LINE 1: ...d, author, nice, review, updated) FROM stdin WITH FORMAT 'cs...


it's just csv, not 'csv' ...


And I have tried adding/removing commas at the end of lines too.


if it's getting a syntax error on the COPY command, it has not yet read the
data...


note that COPY  FROM STDIN  doesn't actually read from stdin, it 
requires the data to be passed through to it with a special API 
(PQputCopyData() in libpq, or similar in other APIs).   you can use 
\copy in psql to stream the data from the same input.





--
john r pierce, recycling bits in santa cruz





Re: [GENERAL] COPY ... FROM stdin WITH FORMAT csv

2017-03-21 Thread David G. Johnston
On Tue, Mar 21, 2017 at 10:31 AM, Alexander Farber <
alexander.far...@gmail.com> wrote:

> Good evening,
>
> I keep rereading https://www.postgresql.org/docs/9.6/static/sql-copy.html
> but just can't figure the proper syntax to put some records into the table:
> [...]
>
> words=> COPY words_reviews (uid, author, nice, review, updated) FROM stdin
> WITH FORMAT 'csv';
> ERROR:  syntax error at or near "FORMAT"
> LINE 1: ...d, author, nice, review, updated) FROM stdin WITH FORMAT 'cs...
>
>
​​[ [ WITH ] ( option [, ...] ) ]

The above means the entire "WITH" section is optional, as is the word
WITH.  However, if you want to add "with" options they must appear within
parentheses, those are not optional.  Multiple options can appear within
the single set of parentheses.

"""
FORMAT
Selects the data format to be read or written: text, csv (Comma Separated
Values), or binary. The default is text.
"""

Valid values for format are as listed, no single quotes required (not sure
whether they are allowed)

Therefore:

WITH (FORMAT csv)
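
And since multiple options share the single set of parentheses, something
like this should also be accepted (a sketch):

COPY words_reviews (uid, author, nice, review, updated)
FROM stdin WITH (FORMAT csv, DELIMITER ',', NULL '');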

David J.


Re: [GENERAL] Copy database to another host without data from specific tables

2017-03-09 Thread Panagiotis Atmatzidis
Thanks for the replies, pg_dump --exclude-table will do for now.

Panagiotis (atmosx) Atmatzidis

email:  a...@convalesco.org
URL:http://www.convalesco.org
GnuPG ID: 0x1A7BFEC5
gpg --keyserver pgp.mit.edu --recv-keys 1A7BFEC5

"Everyone thinks of changing the world, but no one thinks of changing himself.” 
- Leo Tolstoy







Re: [GENERAL] Copy database to another host without data from specific tables

2017-03-07 Thread Vick Khera
On Tue, Mar 7, 2017 at 2:02 AM, Panagiotis Atmatzidis 
wrote:

> I want to make a clone of database1 which belongs to user1, to database2
> which belongs to user2. Database1 has 20+ tables. I want to avoid copying
> the DATA sitting on 5 tables on database1 (many Gigs).
>
> I've read one too many posts about how to perform the actions with
> "pg_dump" and "pg_restore" but I'm trying to figure out the easiest way to
> do this. The process I have in mind is this:
>
>
Your method will break down if you have FKs and you're not careful with the
order you copy your data.

pg_dump has an --exclude-table flag which I would suggest. Just apply it
multiple times to exclude your 5 tables.


Re: [GENERAL] Copy database to another host without data from specific tables

2017-03-07 Thread Achilleas Mantzios

On 07/03/2017 09:02, Panagiotis Atmatzidis wrote:

Hello,

I have 2 RDS instances on AWS running PSQL 9.4.7.

I want to make a clone of database1 which belongs to user1, to database2 which 
belongs to user2. Database1 has 20+ tables. I want to avoid copying the DATA 
sitting on 5 tables on database1 (many Gigs).

I've read one too many posts about how to perform the actions with "pg_dump" and 
"pg_restore" but I'm trying to figure out the easiest way to do this. The process I have 
in mind is this:

1) pg_dump the schema from DB1 to DB2 using --no-owner and pg_restore with 
--role=user2
2) pg_dump -Fc --no-owner --data-only -t 'table1' from DB1 and then restore with 
pg_restore -t 'table' --role=user2  to DB2
Are you talking about DBs on the same PgSQL cluster (RDS instance), or are you
trying to clone a DB from the 1st RDS instance to the 2nd? In either case, you
cannot avoid copying. But if we're talking about the whole cluster ...

This procedure though is very time consuming (although it could be scripted). 
Is there any better / faster / safer way to do this?

you could design something based on replication, have a warm/hot standby applying
changes from the primary, and then implement something like:
- promote (i.e. recover and start in a new timeline as a new primary)
- run a script to change ownership to user2.
So the idea is to have pre-copied the data, so that the whole final procedure is
very fast. Note, however, that after you do that, you would have to set up
replication again, and that would be costly (you can't avoid ultimately copying
data). Maybe it would help if you tell us the whole use case.



Thanks.

--
Panagiotis (atmosx) Atmatzidis

email:  a...@convalesco.org
URL:http://www.convalesco.org
GnuPG ID: 0x1A7BFEC5
gpg --keyserver pgp.mit.edu --recv-keys 1A7BFEC5

"Everyone thinks of changing the world, but no one thinks of changing himself.” 
- Leo Tolstoy









--
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt





Re: [GENERAL] Copy database to another host without data from specific tables

2017-03-06 Thread Condor

On 07-03-2017 09:02, Panagiotis Atmatzidis wrote:

Hello,

I have 2 RDS instances on AWS running PSQL 9.4.7.

I want to make a clone of database1 which belongs to user1, to
database2 which belongs to user2. Database1 has 20+ tables. I want to
avoid copying the DATA sitting on 5 tables on database1 (many Gigs).

I've read one too many posts about how to perform the actions with
"pg_dump" and "pg_restore" but I'm trying to figure out the easiest
way to do this. The process I have in mind is this:

1) pg_dump the schema from DB1 to DB2 using --no-owner and pg_restore
with --role=user2
2) pg_dump -Fc --no-owner --data-only -t 'table1' from DB1 and then
restore with pg_restore -t 'table' --role=user2  to DB2

This procedure though is very time consuming (although it could be
scripted). Is there any better / faster / safer way to do this?

Thanks.

--
Panagiotis (atmosx) Atmatzidis

email:  a...@convalesco.org
URL:http://www.convalesco.org
GnuPG ID: 0x1A7BFEC5
gpg --keyserver pgp.mit.edu --recv-keys 1A7BFEC5

"Everyone thinks of changing the world, but no one thinks of changing
himself.” - Leo Tolstoy




Hello,

I do it with a shell script; here it is:

#!/bin/sh

for table in a_tbl ab_tbl some_other_tbl
do
    echo $table
    psql -U data -h 192.168.1.152 second_db -c "TRUNCATE $table;"
    /usr/bin/pg_dump -U postgres --no-tablespaces --no-owner -a -b \
        -t $table first_db | psql -U data -h 192.168.1.152 second_db
done


# the line below removes some data that should not be in the backup.
psql -U data -h 192.168.1.152 second_db -c 'UPDATE a_tbl SET upss = DEFAULT;'




My servers are on a local network. Of course, for this approach you need to
create the table structure on second_db the same as on the master.
Because I use it for backup only, I don't have indexes on second_db and the
process is fast.



Regards,
Hristo S






Re: [GENERAL] COPY to question

2017-01-17 Thread Steve Crawford
On Tue, Jan 17, 2017 at 10:23 AM, Rich Shepard 
wrote:

>   Running -9.6.1. I have a database created and owned by me, but cannot
> copy
> a table to my home directory. Postgres tells me it cannot write to that
> directory. The only way to copy tables to files is by doing so as the
> superuser (postgres).
>
>   Why is this, and can I change something so I, as a user, can copy tables
> directly to ~/?
>
> To add to the other answers, more info is available at
https://www.postgresql.org/docs/current/static/app-psql.html#APP-PSQL-META-COMMANDS-COPY
and https://wiki.postgresql.org/wiki/COPY

Note that you can invoke SQL COPY to STDOUT as in: COPY (some arbitrary
query) TO STDOUT;

You would either pipe/redirect the output of psql as desired or use the
"\o" within psql to reroute the output to a file or pipe to a program, for
example, output to a CSV using a pipe as the delimiter and double-quote as
the quote character but change all "ma" to "pa" and put into myoutput.txt

\o | sed s/ma/pa/g > myoutput.txt
copy (some query) to stdout csv header delimiter '|' quote '"';
\o

Cheers,
Steve


Re: [GENERAL] COPY to question

2017-01-17 Thread David G. Johnston
On Tue, Jan 17, 2017 at 11:23 AM, Rich Shepard 
wrote:

>   Running -9.6.1. I have a database created and owned by me, but cannot
> copy
> a table to my home directory. Postgres tells me it cannot write to that
> directory. The only way to copy tables to files is by doing so as the
> superuser (postgres).
>

​When you ask the server to access the filesystem (e.g., via SQL COPY) it
does so on your behalf but using its own operating system user​.  It makes
no attempt to match the role that you are logged in as with a corresponding
operating system user.

As Tom noted if you want to do things as "you" and not "the server" you
need to perform them within a client (psql being the main one).  In psql
you can get COPY functionality via the \copy meta-command.  The server
sends its output to the client which then redirects it to some path on the
local machine.  If you run psql on the server you can access a home
directory on the server.


>
>   Why is this


​COPY naming a file or command is only allowed to database superusers,
since it allows reading or writing any file that the server has privileges
to access.​

See above for why its the sever's privileges that matter.

David J.


Re: [GENERAL] COPY to question

2017-01-17 Thread Steve Atkins

> On Jan 17, 2017, at 10:23 AM, Rich Shepard  wrote:
> 
>  Running -9.6.1. I have a database created and owned by me, but cannot copy
> a table to my home directory. Postgres tells me it cannot write to that
> directory. The only way to copy tables to files is by doing so as the
> superuser (postgres).
> 
>  Why is this, and can I change something so I, as a user, can copy tables
> directly to ~/?

You can use "\copy" from psql to do the same thing as the SQL copy command,
but writing files as the user running psql, rather than the postgresql superuser
role. That's probably what you need.
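
For example (a sketch; the file name is made up):

\copy my_table to 'my_table.csv' with (format csv, header)

The file is written relative to psql's current directory and owned by the
operating-system user running psql.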

Cheers,
  Steve





Re: [GENERAL] COPY to question

2017-01-17 Thread Pavel Stehule
2017-01-17 19:23 GMT+01:00 Rich Shepard :

>   Running -9.6.1. I have a database created and owned by me, but cannot
> copy
> a table to my home directory. Postgres tells me it cannot write to that
> directory. The only way to copy tables to files is by doing so as the
> superuser (postgres).
>
>   Why is this, and can I change something so I, as a user, can copy tables
> directly to ~/?


You cannot use server-side COPY to write directly to a client-side
directory.

If you use the psql console, there is a client-side backslash COPY statement.
With it you can write anywhere on the client side where you have access.

\COPY table TO ~/xxx.dta

is valid there.

Regards

Pavel


>
>
> Rich
>
>
>


Re: [GENERAL] COPY to question [ANSWERED]

2017-01-17 Thread Rich Shepard

On Tue, 17 Jan 2017, Tom Lane wrote:


Use psql's \copy instead.


  Thanks, Tom.

Rich




Re: [GENERAL] COPY to question

2017-01-17 Thread Tom Lane
Rich Shepard  writes:
>Running -9.6.1. I have a database created and owned by me, but cannot copy
> a table to my home directory. Postgres tells me it cannot write to that
> directory. The only way to copy tables to files is by doing so as the
> superuser (postgres).

>Why is this, and can I change something so I, as a user, can copy tables
> directly to ~/?

Use psql's \copy instead.

regards, tom lane




Re: [GENERAL] COPY value TO STDOUT

2017-01-14 Thread Tom Lane
Denisa Cirstescu  writes:
> I want to COPY a value to STDOUT from PL/pgSQL language.

You can't.  Maybe RAISE NOTICE would serve the purpose?
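
Something along these lines, if a notice per value would do (a sketch):

DO $do$
BEGIN
   RAISE NOTICE '%', '12';
   RAISE NOTICE '%', '34';
   RAISE NOTICE '%', '56';
END
$do$;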

> I saw that the STDOUT is not accessible from PL/pgSQL, but it is from SQL.
> This is why I am trying to create an auxiliary function declared as language 
> SQL and call that function from my PL/pgSQL code.

Oh, that's an oversight --- this case can't work either, but the SQL
function code fails to prevent it.  The COPY runs, but it completely
breaks the wire protocol, leading to weird errors on the client side,
eg

regression=# create or replace function printToStdout(abc text) returns void as 
$$
   copy (SELECT 42) to stdout;
$$ language sql;
CREATE FUNCTION
regression=# select printToStdout('z');
42
server sent data ("D" message) without prior row description ("T" message)
regression=# 

What this should produce is an error similar to the one you get in
plpgsql.  COPY to stdout/from stdin can't be executed from inside an
already-running query, because the wire protocol can't support
nesting those operations.

regards, tom lane




Re: [GENERAL] COPY value TO STDOUT

2017-01-14 Thread Pavel Stehule
Hi

2017-01-13 16:45 GMT+01:00 Denisa Cirstescu 
:

> I am not sure if this is the correct mailing list or if this is how you
> submit a question, but I am going to give it a try.
>
>
>
> I want to COPY a value to STDOUT from PL/pgSQL language.
>
>
>
> I saw that the STDOUT is not accessible from PL/pgSQL, but it is from SQL.
>
> This is why I am trying to create an auxiliary function declared as
> language SQL and call that function from my PL/pgSQL code.
>
>
>
> This is an example of PL/pgSQL code:
>
>
>
> DO
> $do$
> BEGIN
>    perform printToStdout('12');
>    perform printToStdout('34');
>    perform printToStdout('56');
> END
> $do$
>
>
>
> And this is the definition of the auxiliary function:
>
>
>
> create or replace function printToStdout(abc text) returns void as $$
>    copy (SELECT abc) to stdout;
> $$ language sql;
>
>
>
>
>
> However, this is not working, because COPY doesn’t know how to use the
> value of my variable named abc and it returns the error:
>
> ERROR:  column "abc" does not exist
>
> If I use a constant, instead of the abc variable everything works fine;
> the constant is printed to STDOUT.
>
>
>
> Is there a way to achieve this without using an auxiliary table?
>
> The below code does the job, but is not ok for me because of the auxiliary
> table that might cause performance problems in a concurrent environment
> with a lot of requests:
>
>
>
> create table if not exists printToStdoutTable(abc text);
>
> create or replace function printToStdout(abc text) returns void as $$
>    delete from printToStdoutTable;
>    insert into printToStdoutTable values(abc);
>    copy (SELECT * from printToStdoutTable) to stdout;
> $$ language sql;
>


You cannot do it in a plain SQL-language function.

A column or table identifier can never be a variable.

You can use dynamic SQL in PL/pgSQL - where the SQL command is built at
run time - and there you can do what you want.

Use plpgsql and EXECUTE statement
https://www.postgresql.org/docs/current/static/plpgsql-statements.html#PLPGSQL-STATEMENTS-EXECUTING-DYN
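
A minimal sketch of that EXECUTE pattern (hypothetical function; note that,
per Tom Lane's reply in this thread, it still cannot drive COPY ... TO STDOUT
from inside a function):

create or replace function count_rows(tbl text) returns bigint as $$
declare
   result bigint;
begin
   -- the table name arrives in a variable and is spliced in as an identifier at run time
   execute format('SELECT count(*) FROM %I', tbl) into result;
   return result;
end;
$$ language plpgsql;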

Regards

Pavel Stehule


Re: [GENERAL] COPY: row is too big

2017-01-05 Thread Rob Sargent



On 01/05/2017 11:46 AM, Adrian Klaver wrote:

On 01/05/2017 08:31 AM, Rob Sargent wrote:



On 01/05/2017 05:44 AM, vod vos wrote:

I finally figured it out as follows:

1. modified the corresponding data type of the columns to the csv file

2. if null values existed, defined the data type to varchar. The null
values cause problem too.

so 1100 columns work well now.

This problem wasted me three days. I have lots of csv data to COPY.



Yes, you cost yourself a lot of time by not showing the original table
definition into which you were trying to insert data.


Given that the table had 1100 columns I am not sure I wanted to see it:)

Still, the OP did give it to us in the description:

https://www.postgresql.org/message-id/15969913dd3.ea2ff58529997.7460368287916683127%40zoho.com 

"I create a table with 1100 columns with data type of varchar, and 
hope the COPY command will auto transfer the csv data that contains 
some character and date, most of which are numeric."


In retrospect, what I should have pressed for was a more complete
description of the data. I underestimated this description:


"And some the values in the csv file contain nulls, do this null 
values matter? "



My apologies for missing that.  Was sure there would be room for some 
normalization but so be it: OP's happy, I'm happy







Re: [GENERAL] COPY: row is too big

2017-01-05 Thread Adrian Klaver

On 01/05/2017 08:31 AM, Rob Sargent wrote:



On 01/05/2017 05:44 AM, vod vos wrote:

I finally figured it out as follows:

1. modified the corresponding data type of the columns to the csv file

2. if null values existed, defined the data type to varchar. The null
values cause problem too.

so 1100 columns work well now.

This problem wasted me three days. I have lots of csv data to COPY.



Yes, you cost yourself a lot of time by not showing the original table
definition into which you were trying to insert data.


Given that the table had 1100 columns I am not sure I wanted to see it:)

Still, the OP did give it to us in the description:

https://www.postgresql.org/message-id/15969913dd3.ea2ff58529997.7460368287916683127%40zoho.com
"I create a table with 1100 columns with data type of varchar, and hope 
the COPY command will auto transfer the csv data that contains some 
character and date, most of which are numeric."


In retrospect, what I should have pressed for was a more complete description
of the data. I underestimated this description:


"And some the values in the csv file contain nulls, do this null values 
matter? "



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] COPY: row is too big

2017-01-05 Thread Rob Sargent



On 01/05/2017 05:44 AM, vod vos wrote:

I finally figured it out as follows:

1. modified the corresponding data type of the columns to the csv file

2. if null values existed, defined the data type to varchar. The null 
values cause problem too.


so 1100 columns work well now.

This problem wasted me three days. I have lots of csv data to COPY.


Yes, you cost yourself a lot of time by not showing the original table 
definition into which you were trying to insert data.


Re: [GENERAL] COPY: row is too big

2017-01-05 Thread Adrian Klaver

On 01/05/2017 04:44 AM, vod vos wrote:

I finally figured it out as follows:

1. modified the corresponding data type of the columns to the csv file

2. if null values existed, defined the data type to varchar. The null
values cause problem too.


Did you change the NULLs to something else?

As Pavel said the type does not really matter for NULL:

https://www.postgresql.org/docs/9.6/static/storage-page-layout.html

See marked(<***>) up part

"All table rows are structured in the same way. There is a fixed-size 
header (occupying 23 bytes on most machines), followed by an optional 
null bitmap, an optional object ID field, and the user data. The header 
is detailed in Table 65-4. The actual user data (columns of the row) 
begins at the offset indicated by t_hoff, which must always be a 
multiple of the MAXALIGN distance for the platform. <***>The null bitmap 
is only present if the HEAP_HASNULL bit is set in t_infomask. If it is 
present it begins just after the fixed header and occupies enough bytes 
to have one bit per data column (that is, t_natts bits altogether). In 
this list of bits, a 1 bit indicates not-null, a 0 bit is a null. When 
the bitmap is not present, all columns are assumed not-null. <***> The 
object ID is only present if the HEAP_HASOID bit is set in t_infomask. 
If present, it appears just before the t_hoff boundary. Any padding 
needed to make t_hoff a MAXALIGN multiple will appear between the null 
bitmap and the object ID. (This in turn ensures that the object ID is 
suitably aligned.)"
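
As a rough check for the 1100-column case discussed here (a sketch, ignoring
alignment padding):

SELECT 23 AS fixed_header_bytes,
       ceil(1100 / 8.0) AS null_bitmap_bytes;  -- one bit per column whenever any column is NULL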


In this post:

https://www.postgresql.org/message-id/1595fd48444.ba3ec57e13739.3837934651947496063%40zoho.com

you said:

"And some the values in the csv file contain nulls, do this null values 
matter?"


It looks like there are a good deal of NULLs in a row. In your original
post COPY failed on the second line, so assuming the same data, what is
the NULL count in that line? Or can you provide some estimate of the
highest count of NULLs in your data rows?




so 1100 columns work well now.

This problem wasted me three days. I have lots of csv data to COPY.


You solved the problem so it was not entirely wasted and it provided 
information for future reference when folks hit this list with a similar 
issue.







 On Wednesday, 04 January 2017 08:39:42 -0800 Adrian Klaver wrote:

On 01/04/2017 08:32 AM, Steve Crawford wrote:
> ...
>
> Numeric is expensive type - try to use float instead, maybe double.
>
>
> If I am following the OP correctly the table itself has all the
> columns declared as varchar. The data in the CSV file is a mix of
> text, date and numeric, presumably cast to text on entry into the
table.
>
>
> But a CSV *is* purely text - no casting to text is needed.
Conversion is
> only needed when the strings in the CSV are text representations of
> *non*-text data.

Yeah, muddled thinking.

>
> I'm guessing that the OP is using all text fields to deal with
possibly
> flawed input data and then validating and migrating the data in
> subsequent steps. In that case, an ETL solution may be a better
> approach. Many options, both open- closed- and hybrid-source exist.
>
> Cheers,
> Steve


--
Adrian Klaver
adrian.kla...@aklaver.com 







--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] COPY: row is too big

2017-01-05 Thread Pavel Stehule
2017-01-05 13:44 GMT+01:00 vod vos :

> I finally figured it out as follows:
>
> 1. modified the corresponding data type of the columns to the csv file
>
> 2. if null values existed, defined the data type to varchar. The null
> values cause problem too.
>

int, float, double can be null too - null needs the same space (1 bit) for all
types

Regards

Pavel


> so 1100 columns work well now.
>
> This problem wasted me three days. I have lots of csv data to COPY.
>
>
>
>


Re: [GENERAL] COPY: row is too big

2017-01-05 Thread vod vos
I finally figured it out as follows:

1. modified the corresponding data types of the columns to match the csv file

2. if null values existed, defined the data type as varchar. The null values
cause problems too.

so 1100 columns work well now.

This problem wasted me three days. I have lots of csv data to COPY.









 On Wednesday, 04 January 2017 08:39:42 -0800 Adrian Klaver adrian.kla...@aklaver.com wrote:

On 01/04/2017 08:32 AM, Steve Crawford wrote:
 ...

 Numeric is expensive type - try to use float instead, maybe double.


 If I am following the OP correctly the table itself has all the
 columns declared as varchar. The data in the CSV file is a mix of
 text, date and numeric, presumably cast to text on entry into the table.


 But a CSV *is* purely text - no casting to text is needed. Conversion is
 only needed when the strings in the CSV are text representations of
 *non*-text data.

Yeah, muddled thinking.


 I'm guessing that the OP is using all text fields to deal with possibly
 flawed input data and then validating and migrating the data in
 subsequent steps. In that case, an ETL solution may be a better
 approach. Many options, both open- closed- and hybrid-source exist.

 Cheers,
 Steve


--
Adrian Klaver
adrian.kla...@aklaver.com








Re: [GENERAL] COPY: row is too big

2017-01-04 Thread Adrian Klaver

On 01/04/2017 08:32 AM, Steve Crawford wrote:

...

Numeric is expensive type - try to use float instead, maybe double.


If I am following the OP correctly the table itself has all the
columns declared as varchar. The data in the CSV file is a mix of
text, date and numeric, presumably cast to text on entry into the table.


But a CSV *is* purely text - no casting to text is needed. Conversion is
only needed when the strings in the CSV are text representations of
*non*-text data.


Yeah, muddled thinking.



I'm guessing that the OP is using all text fields to deal with possibly
flawed input data and then validating and migrating the data in
subsequent steps. In that case, an ETL solution may be a better
approach. Many options, both open- closed- and hybrid-source exist.

Cheers,
Steve



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] COPY: row is too big

2017-01-04 Thread Steve Crawford
...

> Numeric is expensive type - try to use float instead, maybe double.
>>
>
> If I am following the OP correctly the table itself has all the columns
> declared as varchar. The data in the CSV file is a mix of text, date and
> numeric, presumably cast to text on entry into the table.
>

But a CSV *is* purely text - no casting to text is needed. Conversion is
only needed when the strings in the CSV are text representations of
*non*-text data.

I'm guessing that the OP is using all text fields to deal with possibly
flawed input data and then validating and migrating the data in subsequent
steps. In that case, an ETL solution may be a better approach. Many
options, both open- closed- and hybrid-source exist.

Cheers,
Steve


Re: [GENERAL] COPY: row is too big

2017-01-04 Thread Adrian Klaver

On 01/04/2017 08:00 AM, rob stone wrote:

Hello,
On Wed, 2017-01-04 at 07:11 -0800, Adrian Klaver wrote:

On 01/04/2017 06:54 AM, Pavel Stehule wrote:

Hi

2017-01-04 14:00 GMT+01:00 vod vos >:

__
Now I am confused about I can create 1100 columns in a table in
postgresql, but I can't copy 1100 values into the table. And I
really dont want to split the csv file to pieces to avoid
mistakes
after this action.


The PostgreSQL limit is "Maximum Columns per Table250 - 1600
depending
on column types" - this limit is related to placing values or
pointers
to values to one page (8KB).

You can hit this limit not in CREATE TABLE time, but in INSERT
time.



I create a table with 1100 columns with data type of varchar,
and
hope the COPY command will auto transfer the csv data that
contains
some character and date, most of which are numeric.


Numeric is expensive type - try to use float instead, maybe double.


If I am following the OP correctly the table itself has all the
columns
declared as varchar. The data in the CSV file is a mix of text, date
and
numeric, presumably cast to text on entry into the table.



Regards

Pavel


I use the command: COPY rius FROM "/var/www/test/test.csv" WITH
DELIMITER ';' ;

Then it shows:

ERROR:  row is too big: size 11808, maximum size 8160





Assuming this is a brand new database instance and not an existing
application, could the OP not compile from source and specify the
 --with-blocksize=16384 so as to overcome the 8k default page size
limit?


Well I was thinking along those lines also, then I did a search on 
BLCKSZ in the docs and saw all the configuration parameters that are 
keyed off it. I know I would have to do a lot more homework to 
understand the implications to the database instance as a whole and 
whether it was worth it to accommodate a single table.





My 2 cents.
Rob




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] COPY: row is too big

2017-01-04 Thread Peter J. Holzer
On 2017-01-04 06:53:31 -0800, Adrian Klaver wrote:
> On 01/04/2017 05:00 AM, vod vos wrote:
> >Now I am confused about I can create 1100 columns in a table in
> >postgresql, but I can't copy 1100 values into the table. And I really
> 
> As pointed out previously:
> 
> https://www.postgresql.org/about/
> Maximum Columns per Table 250 - 1600 depending on column types
> 
> That being dependent on both the number of columns and the actual data in
> the columns.

I think this is confusingly phrased. In my mind "column type" is static
- the type is the same, independent of the values which are stored. So
  "250 - 1600 depending on column types" implies to me that there is
some type A of which I can have only 250 columns and another type B of
which I can have 1600 columns. But it doesn't imply to me that the
number of columns depends on the values which are put into those columns.

May I suggest these improvements?

In https://www.postgresql.org/about/:
Instead of
| 250 - 1600 depending on column types
write
| 250 - 1600 depending on column types and data

In https://www.postgresql.org/docs/9.6/static/ddl-basics.html:
Replace the sentence:
| Depending on the column types, it is between 250 and 1600.
with:
| For all columns in a row, some information (either the data itself or
| a pointer to the data) must be stored in a single block (8 kB).
| Because for some types this data is itself of variable length, the
| maximum number of columns depends not only on the types of the columns
| but also on the data (e.g., a NULL uses less space than a non-NULL
| value). Therefore there is no simple way to compute the maximum number
| of columns, and it is possible to declare a table with more columns
| than can be filled. Keeping all this in mind, the limit is between 250
| and 1600.

hp

-- 
   _  | Peter J. Holzer| A coding theorist is someone who doesn't
|_|_) || think Alice is crazy.
| |   | h...@hjp.at | -- John Gordon
__/   | http://www.hjp.at/ |http://downlode.org/Etext/alicebob.html




Re: [GENERAL] COPY: row is too big

2017-01-04 Thread rob stone
Hello,
On Wed, 2017-01-04 at 07:11 -0800, Adrian Klaver wrote:
> On 01/04/2017 06:54 AM, Pavel Stehule wrote:
> > Hi
> > 
> > 2017-01-04 14:00 GMT+01:00 vod vos  > >:
> > 
> > __
> > Now I am confused about I can create 1100 columns in a table in
> > postgresql, but I can't copy 1100 values into the table. And I
> > really dont want to split the csv file to pieces to avoid
> > mistakes
> > after this action.
> > 
> > 
> > The PostgreSQL limit is "Maximum Columns per Table250 - 1600
> > depending
> > on column types" - this limit is related to placing values or
> > pointers
> > to values to one page (8KB).
> > 
> > You can hit this limit not in CREATE TABLE time, but in INSERT
> > time.
> > 
> > 
> > 
> > I create a table with 1100 columns with data type of varchar,
> > and
> > hope the COPY command will auto transfer the csv data that
> > contains
> > some character and date, most of which are numeric.
> > 
> > 
> > Numeric is expensive type - try to use float instead, maybe double.
> 
> If I am following the OP correctly the table itself has all the
> columns 
> declared as varchar. The data in the CSV file is a mix of text, date
> and 
> numeric, presumably cast to text on entry into the table.
> 
> > 
> > Regards
> > 
> > Pavel
> > 
> > 
> > I use the command: COPY rius FROM "/var/www/test/test.csv" WITH
> > DELIMITER ';' ;
> > 
> > Then it shows:
> > 
> > ERROR:  row is too big: size 11808, maximum size 8160
> > 
> > 


Assuming this is a brand new database instance and not an existing
application, could the OP not compile from source and specify the
 --with-blocksize=16384 so as to overcome the 8k default page size
limit?
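
For what it's worth, the block size an existing server was compiled with can
be checked before deciding whether a rebuild is even needed; it is exposed as
a read-only setting:

-- Reports the compiled-in page size in bytes (8192 on a default build).
SHOW block_size;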

My 2 cents.
Rob


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] COPY: row is too big

2017-01-04 Thread Pavel Stehule
2017-01-04 16:11 GMT+01:00 Adrian Klaver :

> On 01/04/2017 06:54 AM, Pavel Stehule wrote:
>
>> Hi
>>
>> 2017-01-04 14:00 GMT+01:00 vod vos > >:
>>
>> __
>> Now I am confused about I can create 1100 columns in a table in
>> postgresql, but I can't copy 1100 values into the table. And I
>> really dont want to split the csv file to pieces to avoid mistakes
>> after this action.
>>
>>
>> The PostgreSQL limit is "Maximum Columns per Table250 - 1600 depending
>> on column types" - this limit is related to placing values or pointers
>> to values to one page (8KB).
>>
>> You can hit this limit not in CREATE TABLE time, but in INSERT time.
>>
>>
>>
>> I create a table with 1100 columns with data type of varchar, and
>> hope the COPY command will auto transfer the csv data that contains
>> some character and date, most of which are numeric.
>>
>>
>> Numeric is expensive type - try to use float instead, maybe double.
>>
>
> If I am following the OP correctly the table itself has all the columns
> declared as varchar. The data in the CSV file is a mix of text, date and
> numeric, presumably cast to text on entry into the table.
>

Table column type are important - Postgres enforces necessary
transformations.

Regards

Pavel


>
>
>> Regards
>>
>> Pavel
>>
>>
>> I use the command: COPY rius FROM "/var/www/test/test.csv" WITH
>> DELIMITER ';' ;
>>
>> Then it shows:
>>
>> ERROR:  row is too big: size 11808, maximum size 8160
>>
>>
>>
>>
>>
>>
>>
>>  On 星期二, 03 一月 2017 05:24:18 -0800 *John McKown
>> > >* wrote 
>>
>> On Mon, Jan 2, 2017 at 2:57 PM, Rob Sargent
>> >wrote:
>>
>>
>> Perhaps this is your opportunity to correct someone else's
>> mistake. You need to show the table definition to convince
>> us that it cannot be improved. That it may be hard work
>> really doesn't mean it's not the right path.
>>
>>
>> ​This may not be possible. The data might be coming in from an
>> external source. I imagine you've run into the old "well, _we_
>> don't have any problems, so it must be on your end!" scenario.
>>
>> Example: we receive CSV files from an external source. These
>> files are _supposed_ to be validated. But we have often received
>> files where NOT NULL fields have "nothing" in them them. E.g. a
>> customer bill which has _everything_ in it _except_ the customer
>> number (or an invalid one such as "123{"); or missing some other
>> vital piece of information.
>>
>> In this particular case, the OP might want to do what we did in
>> a similar case. We had way too many columns in a table. The
>> performance was horrible. We did an analysis and, as usual, the
>> majority of the selects were for a subset of the columns, about
>> 15% of the total. We "split" the table into the "high use"
>> columns table & the "low use" columns table. We then used
>> triggers to make sure that if we added a new / deleted an old
>> row from one table, the corresponding row in the other was
>> created / deleted.
>>
>>
>>
>>
>>
>> --
>> Sent via pgsql-general mailing list
>> (pgsql-general@postgresql.org
>> )
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-general
>> 
>>
>>
>>
>>
>> --
>> There’s no obfuscated Perl contest because it’s pointless.
>>
>> —Jeff Polk
>>
>> Maranatha! <><
>> John McKown
>>
>>
>>
>>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


Re: [GENERAL] COPY: row is too big

2017-01-04 Thread vod vos
OK, maybe the final solution is to split it into half.


 On 星期三, 04 一月 2017 06:53:31 -0800 Adrian Klaver 
adrian.kla...@aklaver.com wrote 




On 01/04/2017 05:00 AM, vod vos wrote: 

 Now I am confused about I can create 1100 columns in a table in 

 postgresql, but I can't copy 1100 values into the table. And I really 

 

As pointed out previously: 

 

https://www.postgresql.org/about/ 

Maximum Columns per Table 250 - 1600 depending on column types 

 

That being dependent on both the number of columns and the actual data 

in the columns. Empty columns are not the problem, it is when you start 

filling them that you get the error. 

 

 dont want to split the csv file to pieces to avoid mistakes after this 

 action. 

 

 I create a table with 1100 columns with data type of varchar, and hope 

 the COPY command will auto transfer the csv data that contains some 

 

I am afraid the solution is going to require more then hope. You are 

going to need to break the data up. I suspect that just splitting it 

into half would do the trick. So: 

 

Table 1 

column 1 for a primary key(assuming first column of your present data) 

columns 2-550 

 

Table 2 

column 1 for a primary key(assuming first column of your present data) 

columns 551-1100 

 

Using the program I mentioned previously: 

 

http://csvkit.readthedocs.io/en/1.0.1/scripts/csvcut.html 

 

That translates into: 

 

csvcut -c 1,2-550 your_big.csv > table_1.csv 

 

csvcut -c 1,551-1100 your_big.csv > table_2.csv 

 

 

 character and date, most of which are numeric. 

 

Is this a different data set? 

Previously you said: 

"The most of the data type are text or varhcar, ..." 

 

 I use the command: COPY rius FROM "/var/www/test/test.csv" WITH 

 DELIMITER ';' ; 

 

 Then it shows: 

 

 ERROR: row is too big: size 11808, maximum size 8160 

 

 

 

 

 

 

 

  On 星期二, 03 一月 2017 05:24:18 -0800 *John McKown 

 john.archie.mck...@gmail.com* wrote  

 

 On Mon, Jan 2, 2017 at 2:57 PM, Rob Sargent robjsarg...@gmail.com 

 mailto:robjsarg...@gmail.comwrote: 

 

 Perhaps this is your opportunity to correct someone else's 

 mistake. You need to show the table definition to convince us 

 that it cannot be improved. That it may be hard work really 

 doesn't mean it's not the right path. 

 

 

 ​This may not be possible. The data might be coming in from an 

 external source. I imagine you've run into the old "well, _we_ don't 

 have any problems, so it must be on your end!" scenario. 

 

 Example: we receive CSV files from an external source. These files 

 are _supposed_ to be validated. But we have often received files 

 where NOT NULL fields have "nothing" in them them. E.g. a customer 

 bill which has _everything_ in it _except_ the customer number (or 

 an invalid one such as "123{"); or missing some other vital piece of 

 information. 

 

 In this particular case, the OP might want to do what we did in a 

 similar case. We had way too many columns in a table. The 

 performance was horrible. We did an analysis and, as usual, the 

 majority of the selects were for a subset of the columns, about 15% 

 of the total. We "split" the table into the "high use" columns table 

  the "low use" columns table. We then used triggers to make sure 

 that if we added a new / deleted an old row from one table, the 

 corresponding row in the other was created / deleted. 

 

 

 

 

 

 -- 

 Sent via pgsql-general mailing list 

 (pgsql-general@postgresql.org mailto:pgsql-general@postgresql.org;) 

 To make changes to your subscription: 

 http://www.postgresql.org/mailpref/pgsql-general 

 

 

 

 

 -- 

 There’s no obfuscated Perl contest because it’s pointless. 

 

 —Jeff Polk 

 

 Maranatha!  

 John McKown 

 

 

 

 

-- 

Adrian Klaver 

adrian.kla...@aklaver.com 








Re: [GENERAL] COPY: row is too big

2017-01-04 Thread Adrian Klaver

On 01/04/2017 06:54 AM, Pavel Stehule wrote:

Hi

2017-01-04 14:00 GMT+01:00 vod vos >:

__
Now I am confused about I can create 1100 columns in a table in
postgresql, but I can't copy 1100 values into the table. And I
really dont want to split the csv file to pieces to avoid mistakes
after this action.


The PostgreSQL limit is "Maximum Columns per Table250 - 1600 depending
on column types" - this limit is related to placing values or pointers
to values to one page (8KB).

You can hit this limit not in CREATE TABLE time, but in INSERT time.



I create a table with 1100 columns with data type of varchar, and
hope the COPY command will auto transfer the csv data that contains
some character and date, most of which are numeric.


Numeric is expensive type - try to use float instead, maybe double.


If I am following the OP correctly the table itself has all the columns 
declared as varchar. The data in the CSV file is a mix of text, date and 
numeric, presumably cast to text on entry into the table.




Regards

Pavel


I use the command: COPY rius FROM "/var/www/test/test.csv" WITH
DELIMITER ';' ;

Then it shows:

ERROR:  row is too big: size 11808, maximum size 8160







 On 星期二, 03 一月 2017 05:24:18 -0800 *John McKown
>* wrote 

On Mon, Jan 2, 2017 at 2:57 PM, Rob Sargent
>wrote:

Perhaps this is your opportunity to correct someone else's
mistake. You need to show the table definition to convince
us that it cannot be improved. That it may be hard work
really doesn't mean it's not the right path.


​This may not be possible. The data might be coming in from an
external source. I imagine you've run into the old "well, _we_
don't have any problems, so it must be on your end!" scenario.

Example: we receive CSV files from an external source. These
files are _supposed_ to be validated. But we have often received
files where NOT NULL fields have "nothing" in them them. E.g. a
customer bill which has _everything_ in it _except_ the customer
number (or an invalid one such as "123{"); or missing some other
vital piece of information.

In this particular case, the OP might want to do what we did in
a similar case. We had way too many columns in a table. The
performance was horrible. We did an analysis and, as usual, the
majority of the selects were for a subset of the columns, about
15% of the total. We "split" the table into the "high use"
columns table & the "low use" columns table. We then used
triggers to make sure that if we added a new / deleted an old
row from one table, the corresponding row in the other was
created / deleted.





--
Sent via pgsql-general mailing list
(pgsql-general@postgresql.org
)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general





--
There’s no obfuscated Perl contest because it’s pointless.

—Jeff Polk

Maranatha! <><
John McKown






--
Adrian Klaver
adrian.kla...@aklaver.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] COPY: row is too big

2017-01-04 Thread Pavel Stehule
Hi

2017-01-04 14:00 GMT+01:00 vod vos :

> Now I am confused about I can create 1100 columns in a table in
> postgresql, but I can't copy 1100 values into the table. And I really dont
> want to split the csv file to pieces to avoid mistakes after this action.
>

The PostgreSQL limit is "Maximum Columns per Table250 - 1600 depending on
column types" - this limit is related to placing values or pointers to
values to one page (8KB).

You can hit this limit not in CREATE TABLE time, but in INSERT time.


>
> I create a table with 1100 columns with data type of varchar, and hope the
> COPY command will auto transfer the csv data that contains some character
> and date, most of which are numeric.
>

Numeric is expensive type - try to use float instead, maybe double.
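
As a rough way to compare the per-value footprint of candidate types, one can
ask the server directly (the sample literal is arbitrary):

-- pg_column_size() reports the on-disk size of a single value.
SELECT pg_column_size(12345.678::numeric)          AS numeric_bytes,
       pg_column_size(12345.678::double precision) AS float8_bytes,
       pg_column_size('12345.678'::varchar)        AS varchar_bytes;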

Regards

Pavel


> I use the command: COPY rius FROM "/var/www/test/test.csv" WITH DELIMITER
> ';' ;
>
> Then it shows:
>
> ERROR:  row is too big: size 11808, maximum size 8160
>
>
>
>
>
>
>
>  On 星期二, 03 一月 2017 05:24:18 -0800 *John McKown
> >* wrote 
>
> On Mon, Jan 2, 2017 at 2:57 PM, Rob Sargent  wrote:
>
> Perhaps this is your opportunity to correct someone else's mistake. You
> need to show the table definition to convince us that it cannot be
> improved. That it may be hard work really doesn't mean it's not the right
> path.
>
>
> ​This may not be possible. The data might be coming in from an external
> source. I imagine you've run into the old "well, _we_ don't have any
> problems, so it must be on your end!" scenario.
>
> Example: we receive CSV files from an external source. These files are
> _supposed_ to be validated. But we have often received files where NOT NULL
> fields have "nothing" in them them. E.g. a customer bill which has
> _everything_ in it _except_ the customer number (or an invalid one such as
> "123{"); or missing some other vital piece of information.
>
> In this particular case, the OP might want to do what we did in a similar
> case. We had way too many columns in a table. The performance was horrible.
> We did an analysis and, as usual, the majority of the selects were for a
> subset of the columns, about 15% of the total. We "split" the table into
> the "high use" columns table & the "low use" columns table. We then used
> triggers to make sure that if we added a new / deleted an old row from one
> table, the corresponding row in the other was created / deleted.
>
>
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>
>
>
>
> --
> There’s no obfuscated Perl contest because it’s pointless.
>
> —Jeff Polk
>
> Maranatha! <><
> John McKown
>
>
>


Re: [GENERAL] COPY: row is too big

2017-01-04 Thread Adrian Klaver

On 01/04/2017 05:00 AM, vod vos wrote:

Now I am confused about I can create 1100 columns in a table in
postgresql, but I can't copy 1100 values into the table. And I really


As pointed out previously:

https://www.postgresql.org/about/
Maximum Columns per Table   250 - 1600 depending on column types

That being dependent on both the number of columns and the actual data 
in the columns. Empty columns are not the problem, it is when you start 
filling them that you get the error.



dont want to split the csv file to pieces to avoid mistakes after this
action.

I create a table with 1100 columns with data type of varchar, and hope
the COPY command will auto transfer the csv data that contains some


I am afraid the solution is going to require more then hope. You are 
going to need to break the data up. I suspect that just splitting it 
into half would do the trick. So:


Table 1
column 1 for a primary key(assuming first column of your present data)
columns 2-550

Table 2
column 1 for a primary key(assuming first column of your present data)
columns 551-1100

Using the program I mentioned previously:

http://csvkit.readthedocs.io/en/1.0.1/scripts/csvcut.html

That translates into:

csvcut -c 1,2-550 your_big.csv > table_1.csv

csvcut -c 1,551-1100 your_big.csv > table_2.csv
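
If the split goes ahead, a rough sketch of the loading side, with invented
column names and placeholder paths (the real DDL would list the ~550 columns
present in each half, and the COPY options must match whatever csvcut
actually produced):

-- Hypothetical narrow versions of the two halves.
CREATE TABLE rius_part1 (id varchar PRIMARY KEY, col2 varchar, col3 varchar);
CREATE TABLE rius_part2 (id varchar PRIMARY KEY, col551 varchar, col552 varchar);

-- Server-side COPY needs absolute paths readable by the postgres process;
-- adjust FORMAT/DELIMITER/HEADER to match the split files.
COPY rius_part1 FROM '/var/www/test/table_1.csv' WITH (FORMAT csv, DELIMITER ';', HEADER);
COPY rius_part2 FROM '/var/www/test/table_2.csv' WITH (FORMAT csv, DELIMITER ';', HEADER);

-- A view over the join lets queries keep seeing one wide row.
CREATE VIEW rius_all AS
SELECT *
FROM rius_part1
JOIN rius_part2 USING (id);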



character and date, most of which are numeric.


Is this a different data set?
Previously you said:
"The most of the data type are text or varhcar, ..."


I use the command: COPY rius FROM "/var/www/test/test.csv" WITH
DELIMITER ';' ;

Then it shows:

ERROR:  row is too big: size 11808, maximum size 8160







 On 星期二, 03 一月 2017 05:24:18 -0800 *John McKown
* wrote 

On Mon, Jan 2, 2017 at 2:57 PM, Rob Sargent >wrote:

Perhaps this is your opportunity to correct someone else's
mistake. You need to show the table definition to convince us
that it cannot be improved. That it may be hard work really
doesn't mean it's not the right path.


​This may not be possible. The data might be coming in from an
external source. I imagine you've run into the old "well, _we_ don't
have any problems, so it must be on your end!" scenario.

Example: we receive CSV files from an external source. These files
are _supposed_ to be validated. But we have often received files
where NOT NULL fields have "nothing" in them them. E.g. a customer
bill which has _everything_ in it _except_ the customer number (or
an invalid one such as "123{"); or missing some other vital piece of
information.

In this particular case, the OP might want to do what we did in a
similar case. We had way too many columns in a table. The
performance was horrible. We did an analysis and, as usual, the
majority of the selects were for a subset of the columns, about 15%
of the total. We "split" the table into the "high use" columns table
& the "low use" columns table. We then used triggers to make sure
that if we added a new / deleted an old row from one table, the
corresponding row in the other was created / deleted.





--
Sent via pgsql-general mailing list
(pgsql-general@postgresql.org )
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general




--
There’s no obfuscated Perl contest because it’s pointless.

—Jeff Polk

Maranatha! <><
John McKown





--
Adrian Klaver
adrian.kla...@aklaver.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] COPY: row is too big

2017-01-04 Thread vod vos
Now I am confused that I can create 1100 columns in a table in postgresql, but 
I can't copy 1100 values into the table. And I really don't want to split the 
csv file into pieces to avoid mistakes after this action.



I create a table with 1100 columns with data type of varchar, and hope the COPY 
command will auto transfer the csv data that contains some character and date, 
most of which are numeric.

 

I use the command: COPY rius FROM "/var/www/test/test.csv" WITH DELIMITER ';' ;



Then it shows: 



ERROR:  row is too big: size 11808, maximum size 8160















 On 星期二, 03 一月 2017 05:24:18 -0800 John McKown 
john.archie.mck...@gmail.com wrote 




On Mon, Jan 2, 2017 at 2:57 PM, Rob Sargent robjsarg...@gmail.com wrote:

Perhaps this is your opportunity to correct someone else's mistake. You need to 
show the table definition to convince us that it cannot be improved. That it 
may be hard work really doesn't mean it's not the right path.




​This may not be possible. The data might be coming in from an external source. 
I imagine you've run into the old "well, _we_ don't have any problems, so it 
must be on your end!" scenario. 



Example: we receive CSV files from an external source. These files are 
_supposed_ to be validated. But we have often received files where NOT NULL 
fields have "nothing" in them them. E.g. a customer bill which has _everything_ 
in it _except_ the customer number (or an invalid one such as "123{"); or 
missing some other vital piece of information.



In this particular case, the OP might want to do what we did in a similar case. 
We had way too many columns in a table. The performance was horrible. We did an 
analysis and, as usual, the majority of the selects were for a subset of the 
columns, about 15% of the total. We "split" the table into the "high use" 
columns table  the "low use" columns table. We then used triggers to make 
sure that if we added a new / deleted an old row from one table, the 
corresponding row in the other was created / deleted.




 






--

 Sent via pgsql-general mailing list (pgsql-general@postgresql.org)

 To make changes to your subscription:

 http://www.postgresql.org/mailpref/pgsql-general










-- 

There’s no obfuscated Perl contest because it’s pointless.



—Jeff Polk




Maranatha! 

John McKown



Re: [GENERAL] COPY: row is too big

2017-01-03 Thread John McKown
On Mon, Jan 2, 2017 at 2:57 PM, Rob Sargent  wrote:

> Perhaps this is your opportunity to correct someone else's mistake. You
> need to show the table definition to convince us that it cannot be
> improved. That it may be hard work really doesn't mean it's not the right
> path.
>

​This may not be possible. The data might be coming in from an external
source. I imagine you've run into the old "well, _we_ don't have any
problems, so it must be on your end!" scenario.

Example: we receive CSV files from an external source. These files are
_supposed_ to be validated. But we have often received files where NOT NULL
fields have "nothing" in them them. E.g. a customer bill which has
_everything_ in it _except_ the customer number (or an invalid one such as
"123{"); or missing some other vital piece of information.

In this particular case, the OP might want to do what we did in a similar
case. We had way too many columns in a table. The performance was horrible.
We did an analysis and, as usual, the majority of the selects were for a
subset of the columns, about 15% of the total. We "split" the table into
the "high use" columns table & the "low use" columns table. We then used
triggers to make sure that if we added a new / deleted an old row from one
table, the corresponding row in the other was created / deleted.
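
A minimal sketch of that arrangement, with invented table and column names;
only the insert side is shown, and deletes are handled here by a cascading
foreign key rather than a second trigger:

-- "Hot" columns that most queries touch.
CREATE TABLE bill_hot (
    bill_id  bigint PRIMARY KEY,
    customer text NOT NULL,
    amount   numeric
);

-- Rarely-read columns live in a companion table sharing the key.
CREATE TABLE bill_cold (
    bill_id bigint PRIMARY KEY
            REFERENCES bill_hot (bill_id) ON DELETE CASCADE,
    note1   text,
    note2   text
);

-- Inserting a hot row creates its (initially empty) cold counterpart.
CREATE FUNCTION bill_hot_sync() RETURNS trigger AS $$
BEGIN
    INSERT INTO bill_cold (bill_id) VALUES (NEW.bill_id);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER bill_hot_sync_trg
    AFTER INSERT ON bill_hot
    FOR EACH ROW EXECUTE PROCEDURE bill_hot_sync();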



>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>



-- 
There’s no obfuscated Perl contest because it’s pointless.

—Jeff Polk

Maranatha! <><
John McKown


Re: [GENERAL] COPY: row is too big

2017-01-02 Thread Rob Sargent


> On Jan 2, 2017, at 10:13 AM, Adrian Klaver  wrote:
> 
>> On 01/02/2017 09:03 AM, vod vos wrote:
>> You know, the csv file was exported from other database of a machine, so
>> I really dont want to break it for it is a hard work. Every csv file
>> contains headers and values. If I redesign the table, then I have to cut
>> all the csv files into pieces one by one. 
> 
> If it helps:
> 
> http://csvkit.readthedocs.io/en/latest/tutorial/1_getting_started.html#csvcut-data-scalpel
>> 
>> 
>>  On 星期一, 02 一月 2017 08:21:29 -0800 *Tom Lane
>> * wrote 
>> 
>>vod vos > writes:
>>> When I copy data from csv file, a very long values for many
>>columns (about 1100 columns). The errors appears:
>>> ERROR: row is too big: size 11808, maximum size 8160
>> 
>>You need to rethink your table schema so you have fewer columns.
>>Perhaps you can combine some of them into arrays, for example.
>>JSON might be a useful option, too.
>> 
>>regards, tom lane
>> 
>> 
>>-- 
>>Sent via pgsql-general mailing list (pgsql-general@postgresql.org
>>)
>>To make changes to your subscription:
>>http://www.postgresql.org/mailpref/pgsql-general
>> 
>> 
> 
> 
> -- 
> Adrian Klaver
> adrian.kla...@aklaver.com
> 
> 
> -- 
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
Perhaps this is your opportunity to correct someone else's mistake. You need to 
show the table definition to convince us that it cannot be improved. That it 
may be hard work really doesn't mean it's not the right path. 

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] COPY: row is too big

2017-01-02 Thread Adrian Klaver
On 01/02/2017 09:03 AM, vod vos wrote:
> You know, the csv file was exported from other database of a machine, so
> I really dont want to break it for it is a hard work. Every csv file
> contains headers and values. If I redesign the table, then I have to cut
> all the csv files into pieces one by one. 

If it helps:

http://csvkit.readthedocs.io/en/latest/tutorial/1_getting_started.html#csvcut-data-scalpel
> 
> 
>  On 星期一, 02 一月 2017 08:21:29 -0800 *Tom Lane
> * wrote 
> 
> vod vos > writes:
> > When I copy data from csv file, a very long values for many
> columns (about 1100 columns). The errors appears:
> > ERROR: row is too big: size 11808, maximum size 8160
> 
> You need to rethink your table schema so you have fewer columns.
> Perhaps you can combine some of them into arrays, for example.
> JSON might be a useful option, too.
> 
> regards, tom lane
> 
> 
> -- 
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org
> )
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
> 
> 


-- 
Adrian Klaver
adrian.kla...@aklaver.com


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] COPY: row is too big

2017-01-02 Thread vod vos
You know, the csv file was exported from another database on another machine, so I 
really don't want to break it up, because that is hard work. Every csv file contains 
headers and values. If I redesign the table, then I have to cut all the csv 
files into pieces one by one. 





 On 星期一, 02 一月 2017 08:21:29 -0800 Tom Lane t...@sss.pgh.pa.us 
wrote 




vod vos vod...@zoho.com writes: 

 When I copy data from csv file, a very long values for many columns (about 
1100 columns). The errors appears: 

 ERROR: row is too big: size 11808, maximum size 8160 

 

You need to rethink your table schema so you have fewer columns. 

Perhaps you can combine some of them into arrays, for example. 

JSON might be a useful option, too. 

 

regards, tom lane 

 

 

-- 

Sent via pgsql-general mailing list (pgsql-general@postgresql.org) 

To make changes to your subscription: 

http://www.postgresql.org/mailpref/pgsql-general 








Re: [GENERAL] COPY: row is too big

2017-01-02 Thread Tom Lane
vod vos  writes:
> When I copy data from csv file, a very long values for many columns (about 
> 1100 columns). The errors appears:
> ERROR:  row is too big: size 11808, maximum size 8160

You need to rethink your table schema so you have fewer columns.
Perhaps you can combine some of them into arrays, for example.
JSON might be a useful option, too.

regards, tom lane
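
A minimal sketch of the JSON variant, with invented names: the long tail of
attributes collapses into a single jsonb value, which can be TOASTed out of
line instead of contributing 1100 separate datums to the row:

CREATE TABLE rius_json (
    id    varchar PRIMARY KEY,
    attrs jsonb NOT NULL
);

INSERT INTO rius_json (id, attrs)
VALUES ('row-1', '{"col2": "abc", "col3": "2017-01-02", "col4": "42"}');

-- Individual attributes remain queryable.
SELECT attrs ->> 'col3' AS col3
FROM rius_json
WHERE id = 'row-1';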


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] COPY: row is too big

2017-01-02 Thread vod vos
Most of the data types are text or varchar, and I use:



COPY rius FROM "/var/www/test/aa.csv" WITH DELIMITER ';' ;



And some of the values in the csv file contain nulls, do these null values matter? 



Thanks.




 On 星期一, 02 一月 2017 03:11:14 -0800 vod vos vod...@zoho.com wrote 





Hi everyone,



My postgresql is 9.6.1.



When I copy data from a csv file that has very long rows with many columns (about 1100 
columns), this error appears:





ERROR:  row is too big: size 11808, maximum size 8160
CONTEXT:  COPY rius, line 2



rius is the table.



I have searched the mailing list, but seems no solutions founded.



Thanks.









Re: [GENERAL] COPY: row is too big

2017-01-02 Thread Adrian Klaver

On 01/02/2017 03:11 AM, vod vos wrote:

Hi everyone,

My postgresql is 9.6.1.

When I copy data from csv file, a very long values for many columns
(about 1100 columns). The errors appears:


My guess is you are tripping this:

https://www.postgresql.org/about/
Maximum Columns per Table   250 - 1600 depending on column types

So what are you storing in table rius and can you give a general idea of 
its schema? Not all 1100 columns just a sampling of the data types involved.


Also what is the COPY command you are using?




ERROR:  row is too big: size 11808, maximum size 8160
CONTEXT:  COPY rius, line 2

rius is the table.

I have searched the mailing list, but seems no solutions founded.

Thanks.



--
Adrian Klaver
adrian.kla...@aklaver.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] COPY: row is too big

2017-01-02 Thread John McKown
On Mon, Jan 2, 2017 at 5:11 AM, vod vos  wrote:

> Hi everyone,
>
> My postgresql is 9.6.1.
>
> When I copy data from csv file, a very long values for many columns (about
> 1100 columns). The errors appears:
>
>
> ERROR:  row is too big: size 11808, maximum size 8160CONTEXT:
>
> COPY rius, line 2
>
> rius is the table.
>
> I have searched the mailing list, but seems no solutions founded.
>
> Thanks.
>

​I looked in the source code. That message _seems_ to be coming from the
file ./src/backend/heap/hio.c and relates to MaxHeapTupleSize. This is set,
indirectly, from the BLCKSZ set in the "configure" from when PostgreSQL was
originally compiled. That is, this is a "hard coded" limit which can only
be overridden by re-customizing PostgreSQL yourself using the source.
Apparently whomever did the PostgreSQL compilation setup took the default
BLCKSZ of 8192. So there is no solution other than "do it yourself" by
getting the PostgreSQL source code and configuring it yourself. I can give
you the first step. You can get the PostgreSQL source one of two ways. You
can go here: https://www.postgresql.org/ftp/source/v9.6.1/ - download the
proper file. Or, if you have and know "git", you can enter the command: git
clone git://git.postgresql.org/git/postgresql.git .

Oh, I assumed (bad me!) that you're running on Linux. I know _nothing_
about how to do the above on Windows.

I am not a PostgreSQL guru. Perhaps I made a stupid mistake in my analysis
and the truly knowledgeable will have a better answer for you.
​
-- 
There’s no obfuscated Perl contest because it’s pointless.

—Jeff Polk

Maranatha! <><
John McKown


Re: [GENERAL] COPY command & binary format

2016-05-10 Thread Pujol Mathieu



Le 10/05/2016 à 12:56, Nicolas Paris a écrit :

Hello,

What is the way to build a binary format (instead of a csv) ? Is there 
specification for this file ?

http://www.postgresql.org/docs/9.5/static/sql-copy.html

Could I create such format from java ?

I guess this would be far faster, and maybe safer than CSVs

Thanks by advance,

Hi
Making a driver that does what you want is not difficult. You will achieve 
better performance than loading data from CSV, and you also will 
have better precision for floating values (there is no text conversion).
In the link you provide there is a description of the file format in 
section Binary Format.

Mathieu Pujol
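
For reference, the server-side syntax for writing and reading that format
(paths are placeholders; the file layout is the one documented in the
Binary Format section of the COPY page):

COPY mytable TO '/tmp/mytable.pgcopy' WITH (FORMAT binary);
COPY mytable FROM '/tmp/mytable.pgcopy' WITH (FORMAT binary);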



Re: [GENERAL] COPY command & binary format

2016-05-10 Thread Sameer Kumar
On Tue, May 10, 2016 at 4:36 PM Sameer Kumar 
wrote:

> On Tue, May 10, 2016 at 4:26 PM Nicolas Paris  wrote:
>
>> Hello,
>>
>> What is the way to build a binary format (instead of a csv) ? Is there
>> specification for this file ?
>> http://www.postgresql.org/docs/9.5/static/sql-copy.html
>>
>
>>
>> Could I create such format from java ?
>>
>
> You can use COPY JDBC API to copy to STDOUT and then compress it before
> you use usual Java file operations to write it to a file. You will have to
> follow the reverse process while reading from this file and LOADING to a
> table.
>
> But why would you want to do that?
>
>
>>
>> I guess this would be far faster, and maybe safer than CSVs
>>
>
> I don't think that assumption is right. COPY is not meant for backup, it is for
> LOAD and UN-LOAD.
>
> What you probably need is pg_dump with -Fc format.
> http://www.postgresql.org/docs/current/static/app-pgdump.html
>
>

Like someone else suggested upthread you can use Binary format in COPY
command (default is text)


>
>> Thanks by advance,
>>
> --
> --
> Best Regards
> Sameer Kumar | DB Solution Architect
> *ASHNIK PTE. LTD.*
>
> 101 Cecil Street, #11-11 Tong Eng Building, Singapore 069 533
>
> T: +65 6438 3504 | M: +65 8110 0350 | www.ashnik.com
>
-- 
--
Best Regards
Sameer Kumar | DB Solution Architect
*ASHNIK PTE. LTD.*

101 Cecil Street, #11-11 Tong Eng Building, Singapore 069 533

T: +65 6438 3504 | M: +65 8110 0350 | www.ashnik.com


Re: [GENERAL] COPY command & binary format

2016-05-10 Thread Sameer Kumar
On Tue, May 10, 2016 at 4:26 PM Nicolas Paris  wrote:

> Hello,
>
> What is the way to build a binary format (instead of a csv) ? Is there
> specification for this file ?
> http://www.postgresql.org/docs/9.5/static/sql-copy.html
>

>
> Could I create such format from java ?
>

You can use COPY JDBC API to copy to STDOUT and then compress it before you
use usual Java file operations to write it to a file. You will have to
follow the reverse process while reading from this file and LOADING to a
table.

But why would you want to do that?


>
> I guess this would be far faster, and maybe safer than CSVs
>

I don't think that assumption is right. COPY is not meant for backup, it is for
LOAD and UN-LOAD.

What you probably need is pg_dump with -Fc format.
http://www.postgresql.org/docs/current/static/app-pgdump.html


>
> Thanks by advance,
>
-- 
--
Best Regards
Sameer Kumar | DB Solution Architect
*ASHNIK PTE. LTD.*

101 Cecil Street, #11-11 Tong Eng Building, Singapore 069 533

T: +65 6438 3504 | M: +65 8110 0350 | www.ashnik.com


Re: [GENERAL] COPY FROM STDIN

2016-01-10 Thread Jim Nasby

On 1/8/16 10:37 AM, Luke Coldiron wrote:

On 1/6/16 9:45 PM, Luke Coldiron wrote:

In the example above I'm not sure if I can use some sub struct of the
SPIPlanPtr and hand it off to the DoCopy function as the CopyStmt or
if I need to go about this entirely different. Any advice on the
matter would be much appreciated.


I don't know off-hand. I suggest you look at what psql does to implement

\copy (note the \).

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in

Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in
Treble! http://BlueTreble.com

I took a look at the \copy but as far as I can tell this is using the
frontend (client side) of postgresql and I need to be able to do this on the


Oops.


backend (server side). I don't see a way with this either to attach to the
connection that called the c function and execute the copy statement. The
DoCopy in commands/copy.h appears to me to be the server side copy command.
And I think I can get it to work if I can figure out how to attach to the
connection of the calling function and pass in the CopyStmt variable.


That sounds... bad.

I think the way to handle this is to add a new type to CopyDest and 
modify CopyGetData() and CopySendEndOfRow() accordingly.


It might be tempting to use CopyState->filename as the pointer to a 
StringInfoData (StringInfo), but I'm not sure that's a great idea. I 
think it'd be better to add a new field to CopyStateData.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] COPY FROM STDIN

2016-01-08 Thread Luke Coldiron
> On 1/6/16 9:45 PM, Luke Coldiron wrote:
> > In the example above I'm not sure if I can use some sub struct of the 
> > SPIPlanPtr and hand it off to the DoCopy function as the CopyStmt or 
> > if I need to go about this entirely different. Any advice on the 
> > matter would be much appreciated.
>
> I don't know off-hand. I suggest you look at what psql does to implement
\copy (note the \).
> --
> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in
Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in
Treble! http://BlueTreble.com

I took a look at the \copy but as far as I can tell this is using the
frontend (client side) of postgresql and I need to be able to do this on the
backend (server side). I don't see a way with this either to attach to the
connection that called the c function and execute the copy statement. The
DoCopy in commands/copy.h appears to me to be the server side copy command.
And I think I can get it to work if I can figure out how to attach to the
connection of the calling function and pass in the CopyStmt variable.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] COPY FROM STDIN

2016-01-06 Thread Jim Nasby

On 1/6/16 9:45 PM, Luke Coldiron wrote:

In the example above I'm not sure if I can use some sub struct of the
SPIPlanPtr and hand it off to the DoCopy function as the CopyStmt or if I
need to go about this entirely different. Any advice on the matter would be
much appreciated.


I don't know off-hand. I suggest you look at what psql does to implement 
\copy (note the \).

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] COPY FROM STDIN

2016-01-06 Thread Luke Coldiron
> On 1/4/16 12:18 PM, Luke Coldiron wrote:
> > Is there a way to achieve the performance of the COPY FROM STDIN command
> > within a C extension function connected to the db connection that called
> > the C function? I have text that I would like to receive as input to a C
> > function that contains many COPY command statements in the file that
> > would be parsed similar to how psql would handle the file but I don't
> > want to shell out to psql as I need to do all of this work on the db
> > connection that the function was called from as there are other commands
> > that I need to perform as well after before and after handling the COPY
> > commands on this same connection. I would like the unit of work to be
> > all or nothing and have the performance of the COPY FROM STDIN command
> > and not break things out into SELECT INTO or INSERT statements for
> > performance.
> >
> > Ideally I would like to be able to attach to the calling db connection
> > via SPI_connect() and then use the libpq library to issue the copy
> > commands via PQputCopyData, PQputCopyEnd.
>
> C functions can use SPI, so I'm not sure what the issue is?
>
> http://www.postgresql.org/docs/9.5/static/spi.html
>
> (BTW, you'll want to scroll to the bottom of that page...)
> -- 
> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
> Experts in Analytics, Data Architecture and PostgreSQL
> Data in Trouble? Get it in Treble! http://bluetreble.com/

I should probably back the boat up. I'm not too familiar with the
inter-workings of postgres. I have a general understanding of the SPI API
and realize that other c functions can be called. Part of the problem is I'm
not sure which ones. The first thing I am trying to figure out is if I can
perform a COPY FROM STDIN command via a C extension function using the SPI.
From what I read in the documentation it seemed to indicate that this may
not be possible
(http://www.postgresql.org/docs/9.3/static/spi-spi-execute.html). 

Here is what I have tried thus far.

#include "postgres.h"
#include "fmgr.h"
#include "funcapi.h"
#include "executor/spi.h"
#include "catalog/pg_type.h"
#include "commands/copy.h"

// Attempt #1
Datum copy_test(PG_FUNCTION_ARGS)
{
   SPI_connect();

   int spi_status = SPI_execute("COPY public.test_table(val1, val2) FROM
stdin;" /* command */,
false, /* read_only */
0 /* count */);
   if (spi_status != SPI_OK_SELECT)
   {
      ereport(ERROR, (errcode(ERRCODE_INTERNAL_ERROR),
              errmsg_internal("Failed: %s.", SPI_result_code_string(spi_status))));
   }

   SPI_finish();

   PG_RETURN_VOID();
}

When I run the function above I get this error:  ERROR:  Failed:
SPI_ERROR_COPY. Which is what I would expect from the documentation. 

However if I try something like this:

// Attempt #2
Datum copy_test(PG_FUNCTION_ARGS)
{
   SPI_connect();

   SPIPlanPtr plan = SPI_prepare("COPY public.test_table(val1, val2) FROM
stdin;" /* command */, 0 /* nargs */, NULL /* argtypes*/);
   if (plan == NULL)
   {
      ereport(ERROR, (errcode(ERRCODE_INTERNAL_ERROR),
              errmsg_internal("Failed: %s.", SPI_result_code_string(SPI_result))));
   }

   SPI_finish();

   PG_RETURN_VOID();
}

It works, although I really haven't got to the point of doing anything. It
looked like the "commands/copy.h" has the interface that I want to use but
I'm not sure if I am even going about it the correct way. Here roughly what
I am thinking I need to do.

// Attempt #3
Datum copy_test(PG_FUNCTION_ARGS)
{
   SPI_connect();

   SPIPlanPtr plan = SPI_prepare("COPY public.test_table(val1, val2) FROM
stdin;" /* command */, 0 /* nargs */, NULL /* argtypes*/);
   if (plan == NULL)
   {
      ereport(ERROR, (errcode(ERRCODE_INTERNAL_ERROR),
              errmsg_internal("Failed: %s.", SPI_result_code_string(SPI_result))));
   }

   uint64 processed;
   Oid table_oid = DoCopy(const CopyStmt *stmt,
  "COPY public.test_table(val1, val2) FROM stdin;"
/* queryString */,
   /* processed */);

   //CopyState cstate = BeginCopyFrom(table_oid /* rel */,
   //   NULL /* filename */,
   //   false /* is_program */,
   //   List *attnamelist,
   //   List *options);

  // Somehow  

   // End the copy command
   // EndCopyFrom(cstate);

// TODO: Make use of the callback
   // extern void CopyFromErrorCallback(void *arg);
   // extern DestReceiver *CreateCopyDestReceiver(void);

   SPI_finish();

   PG_RETURN_VOID();
}

In the example above I'm not sure if I can use some sub struct of the
SPIPlanPtr and hand it off to the DoCopy function as the CopyStmt or if I
need to go about this entirely different. Any advice on the matter would be
much appreciated.




-- 
Sent via pgsql-general mailing list 

Re: [GENERAL] COPY FROM STDIN

2016-01-05 Thread Jim Nasby

On 1/4/16 12:18 PM, Luke Coldiron wrote:

Is there a way to achieve the performance of the COPY FROM STDIN command
within a C extension function connected to the db connection that called
the C function? I have text that I would like to receive as input to a C
function that contains many COPY command statements in the file that
would be parsed similar to how psql would handle the file but I don't
want to shell out to psql as I need to do all of this work on the db
connection that the function was called from as there are other commands
that I need to perform as well after before and after handling the COPY
commands on this same connection. I would like the unit of work to be
all or nothing and have the performance of the COPY FROM STDIN command
and not break things out into SELECT INTO or INSERT statements for
performance.

Ideally I would like to be able to attach to the calling db connection
via SPI_connect() and then use the libpq library to issue the copy
commands via PQputCopyData, PQputCopyEnd.


C functions can use SPI, so I'm not sure what the issue is?

http://www.postgresql.org/docs/9.5/static/spi.html

(BTW, you'll want to scroll to the bottom of that page...)
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] COPY command file name encoding issue (UTF8/WIN1252)

2015-03-23 Thread Pujol Mathieu
Maybe a new option could be added to let caller specifies the file name 
encoding, it may know it because he create the source/destination file.
I tried to give him a WIN1252 text by doing COPY test TO 
convert_from(convert_to('C:/tmp/é.bin','UTF8'),'WIN1252') WITH BINARY 
but this call is not allowed. Sending him a text containing escaped 
WIN1252 hex value fails, because query parser detect invalid UTF8 
sequence (which is logical).

The problem is that I can't find any way to workaround this bug.

Regards

Mathieu Pujol

Le 23/03/2015 11:46, Albe Laurenz a écrit :

Pujol Mathieu wrote:

I have a problem using COPY command with a file name containing non
ASCII characters.
I use Postgres 9.3.5 x64 on a Windows 7.
OS local encoding is WIN1252.
My database is encoded in UTF8.
I initiate client connection with libpq, connection encoding is set to UTF8.
I build properly my file path taking care of encoding.

When I run COPY test TO 'C:/tmp/é.bin' WITH BINARY
it creates a file named é.bin which is utf8 name interpreted as local8.
It could be reproduced with convert_from(convert_to('é','UTF8'),'WIN1252').
é in UTF8: \303\251
é in WIN1252: \351

This command works on a database encoded in WIN1252 (same as OS) .
So it seems that COPY command don't take care of file name encoding.
Is it a bug ? a limitation ?
Thanks for your help

I didn't look at the code, but I'd say that the database encoding is
used for the file name, which is why it works when database encoding
and OS locale are the same.

I guess that it would be possible for PostgreSQL to figure out with what
OS locale the postmaster is running and to convert file names accordingly,
but it's probably not trivial since it is OS dependent.

Yours,
Laurenz Albe





--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] COPY command file name encoding issue (UTF8/WIN1252)

2015-03-23 Thread Albe Laurenz
Pujol Mathieu wrote:
 I have a problem using COPY command with a file name containing non
 ASCII characters.
 I use Postgres 9.3.5 x64 on a Windows 7.
 OS local encoding is WIN1252.
 My database is encoded in UTF8.
 I initiate client connection with libpq, connection encoding is set to UTF8.
 I build properly my file path taking care of encoding.
 
 When I run COPY test TO 'C:/tmp/é.bin' WITH BINARY
 it creates a file named é.bin which is utf8 name interpreted as local8.
 It could be reproduced with convert_from(convert_to('é','UTF8'),'WIN1252').
 é in UTF8: \303\251
 é in WIN1252: \351
 
 This command works on a database encoded in WIN1252 (same as OS) .
 So it seems that COPY command don't take care of file name encoding.
 Is it a bug ? a limitation ?
 Thanks for your help

I didn't look at the code, but I'd say that the database encoding is
used for the file name, which is why it works when database encoding
and OS locale are the same.

I guess that it would be possible for PostgreSQL to figure out with what
OS locale the postmaster is running and to convert file names accordingly,
but it's probably not trivial since it is OS dependent.

Yours,
Laurenz Albe

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Copy Data between different databases

2015-03-05 Thread Tim Semmelhaack
Hi Francisco,

 

thanks!

The solution with 

 

(cat q2.sql; ​ psql -h host1 -U user1 -d db1  -f /q1.sql) | psql -h host2 -U 
user2 -d db2

worked! First I have forgotten the semicolon at the end of q2.sql and got an 
error. 
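
For anyone reproducing this, a hedged sketch of what the two scripts contain
in this pattern (table and column names are invented): q1.sql runs against db1
and writes rows to psql's stdout, q2.sql runs against db2 and reads them from
stdin, and the trailing semicolon is the one that matters.

-- q1.sql (executed against db1 on host1)
COPY (SELECT id, payload FROM src_table) TO STDOUT;

-- q2.sql (executed against db2 on host2)
COPY dst_table (id, payload) FROM STDIN;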

 

Tim

 

Von: Francisco Olarte [mailto:fola...@peoplecall.com] 
Gesendet: Mittwoch, 4. März 2015 15:48
An: Adrian Klaver
Cc: Tim Semmelhaack; pgsql-general@postgresql.org
Betreff: Re: [GENERAL] Copy Data between different databases

 

Hi Adrian:

 

On Wed, Mar 4, 2015 at 1:03 AM, Adrian Klaver adrian.kla...@aklaver.com 
mailto:adrian.kla...@aklaver.com  wrote:

 

​As you pointed, my bet is in the -f case COPY FROM STDIN expects the
data on the file ( otherwise pg_dumps would not work ), but your
suggestion seems to have a problem of double redirection, let me elaborate:


Well according to here, they should be roughly equivalent:

http://www.postgresql.org/docs/9.3/interactive/app-psql.html


​   Yeah, they should​, I' was not discussing you. I was pointing the SHELL 
line was incorrect, 


 

Trying it showed they were and ended with the same result, the data was not 
copied over:(

 

​Of course, I'll be greatly surprissed if they did. ​

 



If I did this:

psql -h host1 -U user1 -d db1  -f /q1.sql | psql -h host2 -U user2 -d db2 -f -

I saw the stdout from my 'q1.sql' show up at the second command, where it threw 
an error because it was just the data without the COPY .. FROM statement. So 
the second command must eat the stdin before it actually runs q2.sql. Figured 
this would have been an easy fix. In my case for this sort of thing I use 
Python/psycopg2 and its COPY TO/FROM  commands and run it through a buffer. 
Though of late I have starting using Pandas also.

 

​Of course you end up with an error, I would have reported a bug otherwise. And 
also you are not using q2.sql so the result would have been wrong. I did send 
you a form ( what you've nicely quoted back ) :

 


(cat q2.sql; ​ psql -h host1 -U user1 -d db1  -f /q1.sql) | psql -h host2 -U 
user2 -d db2

 

​of putting q2.sql in front ​of the output from q1.sql in the same pipe, even 
with some samples of how these pipes work. Maybe you stopped reading too soon. 
I cannot try it as I do not have q1.sql or q2.sql, but given what I know about 
the reading/writing code of psql ( and that I have made these kinds of things 
before ) it should work. It's a classical shell construct, use a sub-shell ( 
parentheses ) to combine several commands and pipe its output to another one. 
The problem is that all the other constructs seem to be trying to do it with a 
pipe of single commands, which is much more difficult. Of course, if the 
problem is due to inadequate shells ( as, say, cmd.exe ) it may need to be done 
in other ways.




Regards.

   Francisco Olarte.

 



Re: [GENERAL] Copy Data between different databases

2015-03-05 Thread Jim Nasby

On 3/3/15 8:18 AM, Tim Semmelhaack wrote:

When I run a much simpler version of the query with the -c Select ..
option it works. Because the sql-scripts are quite long, I don't to do it
without the -f option.


When you say quite long... are you trying to do multiple commands in q1 
or q2? As in, do you have more than just a single COPY statement in 
either file? Because the odds of that working are not very good...

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Copy Data between different databases

2015-03-04 Thread Francisco Olarte
Hi Adrian:

On Wed, Mar 4, 2015 at 1:03 AM, Adrian Klaver adrian.kla...@aklaver.com
wrote:


 ​As you pointed, my bet is in the -f case COPY FROM STDIN expects the
 data on the file ( otherwise pg_dumps would not work ), but your
 suggestion seems to have a problem of double redirection, let me elaborate:


 Well according to here, they should be roughly equivalent:

 http://www.postgresql.org/docs/9.3/interactive/app-psql.html


​   Yeah, they should​, I' was not discussing you. I was pointing the
SHELL line was incorrect,



 Trying it showed they were and ended with the same result, the data was
 not copied over:(


​Of course, I'll be greatly surprissed if they did. ​




 If I did this:

 psql -h host1 -U user1 -d db1  -f /q1.sql | psql -h host2 -U user2 -d db2
 -f -

 I saw the stdout from my 'q1.sql' show up at the second command, where it
 threw an error because it was just the data without the COPY .. FROM
 statement. So the second command must eat the stdin before it actually runs
 q2.sql. Figured this would have been an easy fix. In my case for this sort
 of thing I use Python/psycopg2 and its COPY TO/FROM  commands and run it
 through a buffer. Though of late I have starting using Pandas also.


​Of course you end up with an error, I would have reported a bug otherwise.
And also you are not using q2.sql so the result would have been wrong. I
did send you a form ( what you've nicely quoted back ) :


 (cat q2.sql; ​ psql -h host1 -U user1 -d db1  -f /q1.sql) | psql -h host2
 -U user2 -d db2


​of putting q2.sql in front ​of the output from q1.sql in the same pipe,
even with some samples of how these pipes work. Maybe you stopped reading
too soon. I cannot try it as I do not have q1.sql or q2.sql, but given what
I know about the reading/writing code of psql ( and that I have made these
kinds of things before ) it should work. It's a classical shell construct,
use a sub-shell ( parentheses ) to combine several commands and pipe its
output to another one. The problem is that all the other constructs seem to
be trying to do it with a pipe of single commands, which is much more
difficult. Of course, if the problem is due to inadequate shells ( as, say,
cmd.exe ) it may need to be done in other ways.




Regards.
   Francisco Olarte.


Re: [GENERAL] Copy Data between different databases

2015-03-03 Thread Adrian Klaver

On 03/03/2015 06:18 AM, Tim Semmelhaack wrote:

Hi,

I want to copy data between two servers (Version 9.1 and 9.4)

I've tried

psql -h host1 -U user1 -d db1  -f /q1.sql | psql -h host2 -U user2 -d db2 -f
/q2.sql

Both sql-scripts include the COPY (SELECT ...) TO STDOUT or COPY ... FROM STDIN
As a result nothing is copied.

When I run a much simpler version of the query with the -c Select ..
option it works. Because the sql-scripts are quite long, I don't to do it
without the -f option.


Have you tried?:

psql -h host1 -U user1 -d db1 < /q1.sql | psql -h host2 -U user2 -d db2 
< /q2.sql




So where is the difference between the -c and the -f option?

Tim
-- Semmelhaack(at)gmx(dot).de







--
Adrian Klaver
adrian.kla...@aklaver.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Copy Data between different databases

2015-03-03 Thread Francisco Olarte
Hi Adrian:

On Tue, Mar 3, 2015 at 4:44 PM, Adrian Klaver adrian.kla...@aklaver.com
wrote:

 On 03/03/2015 06:18 AM, Tim Semmelhaack wrote:

 Hi,

 I want to copy data between two servers (Version 9.1 and 9.4)

 I've tried

 ​​
 psql -h host1 -U user1 -d db1  -f /q1.sql | psql -h host2 -U user2 -d db2
 -f
 /q2.sql

 Both sql-scripts include the COPY (SELECT ...) TO STDOUT or COPY (SELECT
 ...) TO STDIN
 As a result nothing is copied.

 When I run a much simpler version of the query with the -c Select ..
 option it works. Because the sql-scripts are quite long, I don't want to do it
 without the -f option.


 Have you tried?:

 psql -h host1 -U user1 -d db1 < /q1.sql | psql -h host2 -U user2 -d db2 
 < /q2.sql


​As you pointed, my bet is in the -f case COPY FROM STDIN expects the data
on the file ( otherwise pg_dumps would not work ), but your suggestion seems
to have a problem of double redirection, let me elaborate:

folarte@paqueton:~$ echo > A
folarte@paqueton:~$ echo | cat < A


( in this case the A file will simulate q2.sql, echo  is simulating the
first psql command and cat is simulating the second psql command ). You are
redirecting the second psql input twice, one with | other with . A simple
variant is:

folarte@paqueton:~$ (cat A; echo ) | cat



Which, translating back to psql, should be:

(cat q2.sql; ​
psql -h host1 -U user1 -d db1  -f /q1.sql) | psql -h host2 -U user2 -d db2

Regards.
Francisco Olarte.











 So where is the difference between the -c and the -f option?

 Tim
 -- Semmelhaack(at)gmx(dot).de






 --
 Adrian Klaver
 adrian.kla...@aklaver.com



 --
 Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-general



Re: [GENERAL] Copy Data between different databases

2015-03-03 Thread Ryan King
Have you considered using dblink() or foreign data wrappers to transfer the 
data? 
You can do a select from source, insert into target using one of these methods.

RC

 On Mar 3, 2015, at 12:09 PM, Francisco Olarte fola...@peoplecall.com wrote:
 
 Hi Adrian:
 
 On Tue, Mar 3, 2015 at 4:44 PM, Adrian Klaver adrian.kla...@aklaver.com 
 mailto:adrian.kla...@aklaver.com wrote:
 On 03/03/2015 06:18 AM, Tim Semmelhaack wrote:
 Hi,
 
 I want to copy data between two servers (Version 9.1 and 9.4)
 
 I've tried
 
 ​​psql -h host1 -U user1 -d db1  -f /q1.sql | psql -h host2 -U user2 -d db2 -f
 /q2.sql
 
 Both sql-scripts include the COPY (SELECT ...) TO STDOUT or COPY (SELECT
 ...) TO STDIN
 As a result nothing is copied.
 
 When I run a much simpler version of the query with the -c Select ..
 option it works. Because the sql-scripts are quite long, I don't to do it
 without the -f option.
 
 Have you tried?:
 
 psql -h host1 -U user1 -d db1  /q1.sql | psql -h host2 -U user2 -d db2  
 /q2.sql
 
 ​As you pointed, my bet is in the -f case COPY FROM STDIN expects the data on 
 the file ( otherwise pg_dumps would not work ), but your sugestion seems to 
 have a problem of double redirection, let me elaborate:
 
 folarte@paqueton:~$ echo   A
 folarte@paqueton:~$ echo  | cat  A
 
 
 ( in this case the A file will simulate q2.sql, echo  is simulating the 
 first psql command and cat is simulating the second psql command ). You are 
 redirecting the second psql input twice, one with | other with . A simple 
 variant is:
 
 folarte@paqueton:~$ (cat A; echo ) | cat
 
 
 
 Which, translating back to psql, should be:
 
 (cat q2.sql; ​psql -h host1 -U user1 -d db1  -f /q1.sql) | psql -h host2 -U 
 user2 -d db2 
 
 Regards.
 Francisco Olarte.
 
 
 
 
 
 
 
  
 
 
 So where is the difference between the -c and the -f option?
 
 Tim
 -- Semmelhaack(at)gmx(dot).de
 
 
 
 
 
 
 -- 
 Adrian Klaver
 adrian.kla...@aklaver.com mailto:adrian.kla...@aklaver.com
 
 
 
 



Re: [GENERAL] Copy Data between different databases

2015-03-03 Thread Adrian Klaver

On 03/03/2015 10:09 AM, Francisco Olarte wrote:

Hi Adrian:

On Tue, Mar 3, 2015 at 4:44 PM, Adrian Klaver adrian.kla...@aklaver.com
mailto:adrian.kla...@aklaver.com wrote:

On 03/03/2015 06:18 AM, Tim Semmelhaack wrote:

Hi,

I want to copy data between two servers (Version 9.1 and 9.4)

I've tried

​​
psql -h host1 -U user1 -d db1  -f /q1.sql | psql -h host2 -U
user2 -d db2 -f
/q2.sql

Both sql-scripts include the COPY (SELECT ...) TO STDOUT or COPY
(SELECT
...) TO STDIN
As a result nothing is copied.

When I run a much simpler version of the query with the -c
Select ..
option it works. Because the sql-scripts are quite long, I don't
to do it
without the -f option.


Have you tried?:

psql -h host1 -U user1 -d db1 < /q1.sql | psql -h host2 -U user2 -d db2 < /q2.sql


​As you pointed, my bet is in the -f case COPY FROM STDIN expects the
data on the file ( otherwise pg_dumps would not work ), but your
sugestion seems to have a problem of double redirection, let me elaborate:


Well according to here, they should be roughly equivalent:

http://www.postgresql.org/docs/9.3/interactive/app-psql.html


-f filename
--file=filename

Use the file filename as the source of commands instead of reading 
commands interactively. After the file is processed, psql terminates. 
This is in many ways equivalent to the meta-command \i.


If filename is - (hyphen), then standard input is read.

Using this option is subtly different from writing psql < filename. 
In general, both will do what you expect, but using -f enables some nice 
features such as error messages with line numbers. There is also a 
slight chance that using this option will reduce the start-up overhead. 
On the other hand, the variant using the shell's input redirection is 
(in theory) guaranteed to yield exactly the same output you would have 
received had you entered everything by hand.


Trying it showed they were equivalent and ended with the same result; the data was 
not copied over :(



If I did this:

psql -h host1 -U user1 -d db1  -f /q1.sql | psql -h host2 -U user2 -d 
db2 -f -


I saw the stdout from my 'q1.sql' show up at the second command, where 
it threw an error because it was just the data without the COPY .. FROM 
statement. So the second command must eat the stdin before it actually 
runs q2.sql. I figured this would have been an easy fix. In my case, for 
this sort of thing I use Python/psycopg2 and its COPY TO/FROM commands 
and run it through a buffer. Though of late I have started using Pandas 
also.




folarte@paqueton:~$ echo   A
folarte@paqueton:~$ echo  | cat  A


( in this case the A file will simulate q2.sql, echo  is simulating
the first psql command and cat is simulating the second psql command ).
You are redirecting the second psql input twice, one with | other with
. A simple variant is:

folarte@paqueton:~$ (cat A; echo ) | cat



Which, translating back to psql, should be:

(cat q2.sql; ​
psql -h host1 -U user1 -d db1  -f /q1.sql) | psql -h host2 -U user2 -d db2

Regards.
 Francisco Olarte.









So where is the difference between the -c and the -f option?

Tim
-- Semmelhaack(at)gmx(dot).de






--
Adrian Klaver
adrian.kla...@aklaver.com mailto:adrian.kla...@aklaver.com








--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] COPY data into a table with a SERIAL column?

2014-10-16 Thread Rob Sargent

On 10/16/2014 10:33 AM, Steve Wampler wrote:


Hi,

This is with Postgresql 9.3.5.

I'm looking at using a COPY command (via jdbc) to do bulk inserts into 
a table that

includes a BIGSERIAL column.   Is there a way to mark the data in that
column so it gets assigned a new value on entry - akin to the use of 
'default'
in an INSERT?   Some of the rows have values for the serial column, 
others

don't.

Or is the only way to use COPY for this task:

  COPY table_name (columnnameA, columnnameB, columnnameD) FROM source;

where the serial column name is omitted?  This wouldn't preserve the 
values

for the serial column on rows that have one already.

Thanks!
Steve

Doesn't this guarantee collision at some point?

I might add a column to the target table which would contain the 
foreign serial id and give all records the local serial. Update 
local to foreign iff safe and desired.
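
(A minimal sketch of that idea, assuming the target is the table_name from
Steve's COPY example and keeping his column list; the added column name is
hypothetical:

ALTER TABLE table_name ADD COLUMN foreign_id bigint;  -- nullable, holds the source's serial when present

COPY table_name (foreign_id, columnnameA, columnnameB, columnnameD) FROM source;

Every row then gets a fresh local serial value, rows that arrived with a
source serial keep it in foreign_id, and rows that had none simply carry a
NULL there.)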





Re: [GENERAL] COPY data into a table with a SERIAL column?

2014-10-16 Thread Steve Wampler

On 10/16/2014 09:42 AM, Rob Sargent wrote:

On 10/16/2014 10:33 AM, Steve Wampler wrote:

This is with Postgresql 9.3.5.

I'm looking at using a COPY command (via jdbc) to do bulk inserts into a table 
that
includes a BIGSERIAL column.   Is there a way to mark the data in that
column so it gets assigned a new value on entry - akin to the use of 'default'
in an INSERT?   Some of the rows have values for the serial column, others
don't.

Or is the only way to use COPY for this task:

  COPY table_name (columnnameA, columnnameB, columnnameD) FROM source;

where the serial column name is omitted?  This wouldn't preserve the values
for the serial column on rows that have one already.


Doesn't this guarantee collision at some point?


Depends - without the UNIQUE tag on that column it shouldn't matter.
Or, with a bigserial there's a lot of room to play with.  The rows with existing
serial values might all have negative values for that column, for example.


I might add a column to the target table which would contain the foreign serial id and 
give all records the local
serial. Update local to foreign iff safe and desired.


I don't think this addresses the problem of having entry rows with no serial 
column in them.

Let me generalize the problem a bit:  How can I specify that the default value 
of a column
is to be used with a COPY command when some rows have values for that column and
some don't?

--
Steve Wampler -- swamp...@noao.edu
The gods that smiled on your birth are now laughing out loud.



Re: [GENERAL] COPY data into a table with a SERIAL column?

2014-10-16 Thread David G Johnston
Steve Wampler wrote
 Let me generalize the problem a bit:  How can I specify that the default
 value of a column
 is to be used with a COPY command when some rows have values for that
 column and
 some don't?

If you provide a value for a column, including NULL, the default expression
is not evaluated.

COPY is dumb but fast.  If you need logic you need to add it yourself. 
Either before the copy or copy into a temporary UNLOGGED table and write
smart SQL to migrate from that to the live table.

You can also put smarts into a trigger.

Personally I would generally stage all the data then write two INSERT INTO
... SELECT statements; one for the known values and one where you omit the
column and let the system use the default.
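
A minimal sketch of that staging approach, reusing Steve's column names (the
staging table, its column types and the source file are hypothetical):

CREATE UNLOGGED TABLE staging (id bigint, columnnameA text, columnnameB text, columnnameD text);

\copy staging FROM 'source.dat'

-- rows that brought their own serial value keep it
INSERT INTO table_name (id, columnnameA, columnnameB, columnnameD)
SELECT id, columnnameA, columnnameB, columnnameD FROM staging WHERE id IS NOT NULL;

-- rows without one omit the column, so the default (the sequence) fills it in
INSERT INTO table_name (columnnameA, columnnameB, columnnameD)
SELECT columnnameA, columnnameB, columnnameD FROM staging WHERE id IS NULL;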

David J.

 



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/COPY-data-into-a-table-with-a-SERIAL-column-tp5823278p5823291.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.




Re: [GENERAL] COPY data into a table with a SERIAL column?

2014-10-16 Thread Rob Sargent

On 10/16/2014 11:04 AM, Steve Wampler wrote:

On 10/16/2014 09:42 AM, Rob Sargent wrote:

On 10/16/2014 10:33 AM, Steve Wampler wrote:

This is with Postgresql 9.3.5.

I'm looking at using a COPY command (via jdbc) to do bulk inserts 
into a table that

includes a BIGSERIAL column.   Is there a way to mark the data in that
column so it gets assigned a new value on entry - akin to the use of 
'default'
in an INSERT?   Some of the rows have values for the serial column, 
others

don't.

Or is the only way to use COPY for this task:

  COPY table_name (columnnameA, columnnameB, columnnameD) FROM source;

where the serial column name is omitted?  This wouldn't preserve the 
values

for the serial column on rows that have one already.


Doesn't this guarantee collision at some point?


Depends - without the UNIQUE tag on that column it shouldn't matter.
Or, with a bigserial there's a lot of room to play with.  The rows 
with existing
serial values might all have negative values for that column, for 
example.


I might add a column to the target table which would contain the 
foreign serial id and give all records the local

serial. Update local to foreign iff safe and desired.


I don't think this addresses the problem of having entry rows with no 
serial column in them.

No data in the column (null) or no column at all?

I appreciate the vastness of bigserial but I think it starts at 1. Are 
negative numbers even allowed?  To clarify my suggestion: all incoming 
records would get a new local bigserial, and those incoming records 
WITH a value would set the foreign id column, though that column would 
have to be typed as bigint nullable (this would allow negative values).
That said, according to my test, the supplied bigserial value would get 
inserted as supplied if not null (without the extra column I suggested).

My test

   postgres=# create table t (id bigserial, name text);
   CREATE TABLE
   postgres=# insert into t values('rjs');
   ERROR:  invalid input syntax for integer: rjs
   LINE 1: insert into t values('rjs');
 ^
   postgres=#
   postgres=# insert into t (name) values('rjs');
   INSERT 0 1
   postgres=# select * from t;
    id | name
   ----+------
     1 | rjs
   (1 row)

   postgres=# insert into t (id, name) values(777, 'rjs');
   INSERT 0 1
   postgres=# select * from t;
    id  | name
   -----+------
      1 | rjs
    777 | rjs
   (2 rows)






Let me generalize the problem a bit:  How can I specify that the 
default value of a column
is to be used with a COPY command when some rows have values for that 
column and

some don't?





Re: [GENERAL] COPY data into a table with a SERIAL column?

2014-10-16 Thread Rob Sargent

On 10/16/2014 11:38 AM, David G Johnston wrote:

Steve Wampler wrote

Let me generalize the problem a bit:  How can I specify that the default
value of a column
is to be used with a COPY command when some rows have values for that
column and
some don't?

If you provide a value for a column, including NULL, the default expression
is not evaluated.

COPY is dumb but fast.  If you need logic you need to add it yourself.
Either before the copy or copy into a temporary UNLOGGED table and write
smart SQL to migrate from that to the live table.

You can also put smarts into a trigger.

Personally I would generally stage all the data then write two INSERT INTO
... SELECT statements; one for the known values and one where you omit the
column and let the system use the default.

David J.

  




--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/COPY-data-into-a-table-with-a-SERIAL-column-tp5823278p5823291.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.



Yeah, part three of my test proves his point:

postgres=# insert into t (id, name) values(null, 'rjs');
ERROR:  null value in column id violates not-null constraint
DETAIL:  Failing row contains (null, rjs).




Re: [GENERAL] COPY data into a table with a SERIAL column?

2014-10-16 Thread David G Johnston
On Thu, Oct 16, 2014 at 11:44 AM, lup [via PostgreSQL] 
ml-node+s1045698n5823292...@n5.nabble.com wrote:



 I appreciate the vastness of bigserial but I think it starts at 1.  Are
 negative numbers even allowed?


​http://www.postgresql.org/docs/9.3/static/sql-createsequence.html

A DEFAULT sequence starts at one but it is able to generate any bigint
value.  Regardless, the value generated by the sequence and the allowed
values for the target column are distinct - which is why a sequence
attached to a normal integer will start throwing value out of bounds
errors before it runs out of values.
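
A quick sketch of that point on the 9.x releases discussed here, using the
default names a SERIAL column creates:

CREATE TABLE demo (id serial, v text);
SELECT setval('demo_id_seq', 2147483646);
INSERT INTO demo (v) VALUES ('fits');     -- gets 2147483647, the integer maximum
INSERT INTO demo (v) VALUES ('too big');  -- ERROR:  integer out of range

The sequence happily produced 2147483648; it is the integer column that
rejected it.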

Therefore, by default if one is able to live with disallowing half of the
bigint range for auto-generation using the negative half of the range for
manual assignment is a quick-and-simple solution to the problem.

David J.




--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/COPY-data-into-a-table-with-a-SERIAL-column-tp5823278p5823296.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.

Re: [GENERAL] COPY data into a table with a SERIAL column?

2014-10-16 Thread Rob Sargent

On 10/16/2014 11:52 AM, David G Johnston wrote:
On Thu, Oct 16, 2014 at 11:44 AM, lup [via PostgreSQL] wrote:




I appreciate the vastness of bigserial but I think it starts at
1.  Are negative numbers even allowed?


​http://www.postgresql.org/docs/9.3/static/sql-createsequence.html

A DEFAULT sequence starts at one but it is able to generate any 
biginteger value.​  Regardless, the value generated by the sequence 
and the allowed values for the target column are distinct - which is 
why a sequence attached to a normal integer will start throwing value 
out of bounds errors before it runs out of values.


Therefore, by default if one is able to live with disallowing half of 
the bigint range for auto-generation using the negative half of the 
range for manual assignment is a quick-and-simple solution to the problem.


David J.



View this message in context: Re: COPY data into a table with a SERIAL 
column? 
http://postgresql.1045698.n5.nabble.com/COPY-data-into-a-table-with-a-SERIAL-column-tp5823278p5823296.html
Sent from the PostgreSQL - general mailing list archive 
http://postgresql.1045698.n5.nabble.com/PostgreSQL-general-f1843780.html 
at Nabble.com.

As proven by part 4 :)
postgres=# insert into t (id, name) values(-777, 'rjs');
INSERT 0 1
postgres=# select * from t;
  id  | name
------+------
    1 | rjs
  777 | rjs
 -777 | rjs
(3 rows)



Re: [GENERAL] COPY data into a table with a SERIAL column?

2014-10-16 Thread Steve Wampler

On 10/16/2014 10:44 AM, Rob Sargent wrote:

On 10/16/2014 11:38 AM, David G Johnston wrote:

COPY is dumb but fast.  If you need logic you need to add it yourself.
Either before the copy or copy into a temporary UNLOGGED table and write
smart SQL to migrate from that to the live table.

You can also put smarts into a trigger.


Never thought about a trigger on a COPY before.  I'll look into that and
see what the hit is.
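
(For what it's worth, a minimal sketch of such a trigger, assuming Rob's test
table t and the t_id_seq sequence its bigserial column created; rows in the
COPY data would then carry \N for id whenever they want a generated value:

CREATE FUNCTION fill_default_id() RETURNS trigger AS $$
BEGIN
  IF NEW.id IS NULL THEN
    NEW.id := nextval('t_id_seq');
  END IF;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER t_fill_default_id
  BEFORE INSERT ON t
  FOR EACH ROW EXECUTE PROCEDURE fill_default_id();

BEFORE ROW triggers run before the NOT NULL check, so the NULLs never reach
the constraint.)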

I was kinda hoping there was the equivalent of \N for indicating the use
of a default value instead of a null value, but I accept that such a thing
might be too expensive for COPY's goal in life.

Maybe the best approach is to switch to a batched insert, which might be
fast enough for my needs.

Thanks for the suggestions!

--
Steve Wampler -- swamp...@noao.edu
The gods that smiled on your birth are now laughing out loud.



Re: [GENERAL] COPY data into a table with a SERIAL column?

2014-10-16 Thread Adrian Klaver

On 10/16/2014 11:17 AM, Steve Wampler wrote:

On 10/16/2014 10:44 AM, Rob Sargent wrote:

On 10/16/2014 11:38 AM, David G Johnston wrote:

COPY is dumb but fast.  If you need logic you need to add it yourself.
Either before the copy or copy into a temporary UNLOGGED table and write
smart SQL to migrate from that to the live table.

You can also put smarts into a trigger.


Never thought about a trigger on a COPY before.  I'll look into that and
see what the hit is.

I was kinda hoping there was the equivalent of \N for indicating the use
of a default value instead of a null value, but I accept that such a thing
might be too expensive for COPY's goal in life.

Maybe the best approach is to switch to a batched insert, which might be
fast enough for my needs.


Might want to take a look at pg_bulkload:

http://pgbulkload.projects.pgfoundry.org/pg_bulkload.html

in particular its FILTER function:

http://pgbulkload.projects.pgfoundry.org/pg_bulkload.html#filter



Thanks for the suggestions!




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] copy/dump database to text/csv files

2014-07-25 Thread Francisco Olarte
Hi William:


On Thu, Jul 24, 2014 at 9:04 PM, William Nolf bn...@xceleratesolutions.com
wrote:

 We have a postgres database that was used for an application we no longer
 use.  However, we wouldlike to copy/dump the tables to files, text or csv
 so we can post them to sharepoint.




How BIG is your public schema? As this is a one-shot I would recommend
first doing a custom format backup, and then working from it ( you can make
an sql format backup from a custom one with pg_restore ).

THEN I would suggest making a mini-script ( I'm partial to perl for that,
but this is due to 20+ years hacking with it ) through which I'd pipe a
data-only sql backup. You just wait for a 'copy
table(comma,separated,field,names) from stdin' line, open the file
table.whatever, write a header line if needed, read, transform and write
copy lines until the '\.' end-of-data marker and close the file, repeat
until input exhausted. Something like the following perl pseudo code:

BEWARE, UNTESTED CODE FOR EXAMPLE PURPOSES:
OUTER: # Loop around all the backup
while (<>) {
    # /i because pg_dump writes the COPY ... FROM stdin; line in upper case
    if (my ($table, $fields) = /^copy\s*(.*?)\s*\((.*?)\)\s*from stdin;$/i) {
        # Header line found, do headers etc
        my $fname = get_filename($table); # Dequote, add extensions.
        open (my $fh, '>', $fname) or die "open problem: $!";
        print $fh generate_header($fields); # If needed.
      INNER: # Loop around a single table's data.
        while (<>) {
            /^\\\.$/ and last;
            print $fh transform($_); # Chomp, split, quote, etc...
        }
        close($fh) or die "close problem: $!"; # Disk full, etc.
    }
}
# Yeah, I know print should be tested too.

One of this things should give you a file for each table as fast as your
database can send a backup, or your machine can run pg_restore.

Regards.
   Francisco Olarte.


Re: [GENERAL] copy/dump database to text/csv files

2014-07-25 Thread Marc Mamin
This is probably an easy one for most sql users but I don't use it very often.



We have a postgres database that was used for an application we no longer use. 
 However, we would

like to copy/dump the tables to files, text or csv so we can post them to 
sharepoint.



Copy seems to be what I want.  But it only does one table at a time.  There are 
204 tables with schemaname=public.  I need to copy the tables with data 
to files.  I need something like a for
loop which checks if the table is populated and, if so, copies it to a 
tablename.csv file



Any ideas?

You could do it with plpgsql, or if your schema is too large, generate a script 
with it.

something like:

DO $$
DECLARE
test boolean;
rec record;
BEGIN
  for rec in  select tablename from pg_tables where schemaname = 'public'
  LOOP
    EXECUTE 'select exists (select * from public.'||rec.tablename||')' into test;
    IF test THEN
      raise notice 'COPY public.% TO %.dump', rec.tablename, rec.tablename;
    END IF;
  END LOOP;
END;
$$ language plpgsql
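
A variant that runs the COPY directly instead of printing it could look like
the sketch below; it assumes the server account may write to /tmp, and on
these releases COPY to a server-side file needs superuser rights:

DO $$
DECLARE
test boolean;
rec record;
BEGIN
  for rec in  select tablename from pg_tables where schemaname = 'public'
  LOOP
    EXECUTE 'select exists (select * from public.'||rec.tablename||')' into test;
    IF test THEN
      EXECUTE format('COPY public.%I TO %L WITH (FORMAT csv, HEADER)',
                     rec.tablename, '/tmp/' || rec.tablename || '.csv');
    END IF;
  END LOOP;
END;
$$ language plpgsql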

regards,

Marc Mamin


Re: [GENERAL] copy/dump database to text/csv files

2014-07-24 Thread John R Pierce

On 7/24/2014 12:04 PM, William Nolf wrote:


This is probably an easy one for most sql users but I don't use it 
very often.


We have a postgres database that was used for an application we no 
longer use.  However, we would


like to copy/dump the tables to files, text or csv so we can post them 
to sharepoint.


Copy seems to be what I want.  But it only does one table at a time.  
There is 204 tables with a schemaname=public.  I need to be copy the 
tables with data to files.   I need something like a for


loop which checks if the table is populated if so, copy it to 
tablename.csv file


Any ideas?



what scripting/programming language are you best in?   I'd probably whip 
up a perl script to do that.


start with something like,

select table_schema||'.'||table_name from information_schema.tables 
where table_schema not in ('pg_catalog','information_schema')


which will output a list of all table names.

I suppose even bash scripting...

for table in $(psql -tc "select table_schema||'.'||table_name from
  information_schema.tables where table_schema not in
  ('pg_catalog','information_schema')" dbname); do
    psql -c "\copy $table to '$table.csv' with (format csv, header, ...)" dbname
done



--
john r pierce  37N 122W
somewhere on the middle of the left coast





Re: [GENERAL] copy/dump database to text/csv files

2014-07-24 Thread Thomas Kellerer

William Nolf wrote on 24.07.2014 21:04:

This is probably an easy one for most sql users but I don't use it
very often.

We have a postgres database that was used for an application we no
longer use.  However, we would

like to copy/dump the tables to files, text or csv so we can post
them to sharepoint.

Copy seems to be what I want.  But it only does one table at a time.
There is 204 tables with a schemaname=public.  I need to be copy the
tables with data to files.   I need something like a for

loop which checks if the table is populated if so, copy it to
tablename.csv file

Any ideas?


If you are not constrained to psql you might want to have a look at my tool SQL 
Workbench/J (http://www.sql-workbench.net/) which contains an export command to 
do just that: export all tables of a schema (or multiple schemas) in a defined 
format,  see here: http://www.sql-workbench.net/manual/command-export.html

Using that, this would be as easy as:

WbExport -outputDir=/some/export/directory
 -type=text
 -header=true
 -delimiter=','
 -sourceTable=public.*;

You also have more control over the format of the exported data than you would 
have with psql I dare say.

Disclosure: I am the author of that tool.

Regards
Thomas
 










Re: [GENERAL] \COPY from CSV ERROR: unterminated CSV quoted field

2014-06-19 Thread Tom Lane
ogromm alex.schiller1...@web.de writes:
 I get the error "unterminated CSV quoted field" when I try to copy text with
 new line \. new line

 For example:

 CREATE TABLE test (text TEXT);
 \COPY test FROM 'test.csv' WITH DELIMITER ',' CSV HEADER;

 test.csv:
 Text
 some text
 \.
 more text

Yeah, psql's \copy command doesn't know anything about CSV mode, and
will take \. as an EOF marker even though it shouldn't.  If you need
to copy data like this I'd suggest using a backend-side COPY.
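
(For reference, a minimal sketch of that server-side form; the path must be
readable by the server process and, on these releases, a file-reading COPY
needs superuser rights:

COPY test FROM '/path/on/server/test.csv' WITH (FORMAT csv, HEADER);

This avoids psql's client-side scan for the \. marker.)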

regards, tom lane




Re: [GENERAL] COPY TO returns ERROR: could not open file for writing: No such file or directory

2014-05-24 Thread Alban Hertroys
On 24 May 2014, at 8:21, David Noel david.i.n...@gmail.com wrote:

 COPY (SELECT * FROM page WHERE PublishDate between '2014-03-01' and
 '2014-04-01') TO
 '/home/ygg/sql/backup/pagedump.2014-03-01.to.2014-04-01.copy';
 
 gives me:
 
 ERROR:  could not open file
 /home/ygg/sql/backup/pagedump.2014-03-01.to.2014-04-01.copy for
 writing: No such file or directory

 FreeBSD. PostgreSQL 9.3.
 
 Has anyone run into this before?

Two possibilities I can think of:

1. You’re not writing that file on the database server.

2. It’s a permission issue. Can you cd to that directory as user pgsql and 
create a file there?

# su - pgsql
# cd /home/ygg/sql/backup
# touch test

It’s possible that you lack execute permission on the symlink from /home to 
/usr/home, for example.

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.





Re: [GENERAL] copy expensive local view to an RDS instance

2014-05-06 Thread bricklen
On Tue, May 6, 2014 at 5:52 AM, Marcus Engene meng...@engene.se wrote:

 Hi,

 I have a local db behind a firewall etc. Basically, I'd like to do what
 I'd locally would...

 create table abc
 as
 select
 *
 from
 local_expensive_view;

 abc - on RDS
 local_expensive_view - on local machine

 How would you go about doing this?

 Thanks,
 Marcus


A very quick search shows that rds supports dblink, so perhaps that would
work.
http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_PostgreSQL.html


Re: [GENERAL] copy expensive local view to an RDS instance

2014-05-06 Thread Marcus Engene

On 06/05/14 16:58, bricklen wrote:


A very quick search shows that rds supports dblink, so perhaps that 
would work.

http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_PostgreSQL.html


Then I'd need to open our servers to external visits. It would be lovely 
if dblink_exec could push a subselect of data instead of pulling 
from RDS. Does this make sense?


Thanks,
Marcus





Re: [GENERAL] copy expensive local view to an RDS instance

2014-05-06 Thread bricklen
On Tue, May 6, 2014 at 8:07 AM, Marcus Engene meng...@engene.se wrote:

 On 06/05/14 16:58, bricklen wrote:


 A very quick search shows that rds supports dblink, so perhaps that would
 work.
 http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/
 CHAP_PostgreSQL.html


 Then I'd need to open our servers to external visits. It would be lovely
 if dblink_exec could push a subselect of data instead instead of pull from
 RDS. Does this make sense?


Is the idea to pull data from the RDS to your local machine? If so, dblink
or possibly plproxy[1] should be able to do that.

[1] https://wiki.postgresql.org/wiki/PL/Proxy


Re: [GENERAL] copy expensive local view to an RDS instance

2014-05-06 Thread Marcus Engene

On 06/05/14 17:15, bricklen wrote:
On Tue, May 6, 2014 at 8:07 AM, Marcus Engene meng...@engene.se 
mailto:meng...@engene.se wrote:


On 06/05/14 16:58, bricklen wrote:


A very quick search shows that rds supports dblink, so perhaps
that would work.

http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_PostgreSQL.html


Then I'd need to open our servers to external visits. It would be
lovely if dblink_exec could push a subselect of data instead
instead of pull from RDS. Does this make sense?


Is the idea to pull data from the RDS to your local machine? If so, 
dblink or possibly plproxy[1] should be able to do that.


[1] https://wiki.postgresql.org/wiki/PL/Proxy
Sorry, no, I want to push data from my local machine, but I've only seen 
examples of pushing with dblink_exec and literal values rather than a 
subselect or something similar.
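
One way to do that, sketched here only as an illustration: build the INSERT
statements from the subselect locally and hand them to dblink_exec in one go.
It assumes the dblink extension is installed locally and the RDS endpoint is
reachable from the local machine; the connection string, target table abc and
its columns (a, b) are hypothetical, and it only makes sense for modest row
counts:

SELECT dblink_exec(
         'host=my-rds-endpoint dbname=db2 user=user2 password=...',
         string_agg(
           format('INSERT INTO abc (a, b) VALUES (%L, %L);', a, b),
           E'\n'))
FROM local_expensive_view;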


Best regards,
Marcus



Re: [GENERAL] copy expensive local view to an RDS instance

2014-05-06 Thread Paul Jungwirth
 A very quick search shows that rds supports dblink
 Then I'd need to open our servers to external visits.

This is sort of getting away from Postgres, but if the RDS instance is
in a VPC, you could put a VPN on the VPC so dblink wouldn't have to go
over the open Internet.

Paul


On Tue, May 6, 2014 at 8:07 AM, Marcus Engene meng...@engene.se wrote:
 On 06/05/14 16:58, bricklen wrote:


 A very quick search shows that rds supports dblink, so perhaps that would
 work.
 http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_PostgreSQL.html


 Then I'd need to open our servers to external visits. It would be lovely if
 dblink_exec could push a subselect of data instead instead of pull from RDS.
 Does this make sense?


 Thanks,
 Marcus






-- 
_
Pulchritudo splendor veritatis.




Re: [GENERAL] COPY v. java performance comparison

2014-04-03 Thread Rob Sargent

On 04/02/2014 08:40 PM, Adrian Klaver wrote:

On 04/02/2014 05:30 PM, Rob Sargent wrote:

On 04/02/2014 06:06 PM, Adrian Klaver wrote:

On 04/02/2014 02:27 PM, Rob Sargent wrote:

On 04/02/2014 03:11 PM, Adrian Klaver wrote:

On 04/02/2014 02:04 PM, Rob Sargent wrote:

On 04/02/2014 02:36 PM, Adrian Klaver wrote:

On 04/02/2014 01:14 PM, Rob Sargent wrote:

On 04/02/2014 01:56 PM, Steve Atkins wrote:




Have you tried moving the input file to the same disk as the server,
to factor out the controller?

I labour under the delusion that it is through the controller one 
reads

and writes and that there might be some slight advantage to not doing
both against one drive if avoidable. Wrong again?


Well there is one way to find out:)

Might try with something less than the whole file to come up with an
approximate rows/sec rate.






Well, things slow down over time, and lots of 'checkpoints occurring too frequently' warnings:

Have done 500 batches in 24219 ms
Have done 1000 batches in 52362 ms
Have done 1500 batches in 82256 ms
Have done 2000 batches in 113754 ms
Have done 2500 batches in 149637 ms
Have done 3000 batches in 211314 ms
Have done 3500 batches in 301989 ms
Have done 4000 batches in 430817 ms
Have done 4500 batches in 596043 ms
Have done 5000 batches in 804250 ms

where a batch is 500,000 lines.  This on the java side of course.


This is drive to drive or on single drive?










Same as first go round, drive to drive.



Re: [GENERAL] COPY v. java performance comparison

2014-04-03 Thread Thomas Kellerer
Rob Sargent, 02.04.2014 21:37:
 I loaded 37M+ records using jOOQ (batching every 1000 lines) in 12+
 hours (800+ records/sec).  Then I tried COPY and killed that after
 11.25 hours when I realised that I had added on non-unque index on
 the name fields after the first load. By that point is was on line
 28301887, so ~0.75 done which implies it would have take ~15hours to
 complete.
 
 Would the overhead of the index likely explain this decrease in
 throughput?
 
 Impatience got the better of me and I killed the second COPY.  This
 time it had done 54% of the file in 6.75 hours, extrapolating to
 roughly 12 hours to do the whole thing.
 
 That matches up with the java speed. Not sure if I should be elated
 with jOOQ or disappointed with COPY.
 

This is not what I see with COPY FROM STDIN

When I load 2million rows using a batch size of 1000 with plain JDBC that takes 
about 4 minutes

Loading the same file through Java and COPY FROM STDIN takes about 4 seconds

The table looks like this:

Table public.products
  Column   |  Type  | Modifiers
---++---
 product_id| integer| not null
 ean_code  | bigint | not null
 product_name  | character varying(100) | not null
 manufacturer_name | character varying  | not null
 price | numeric(10,2)  | not null
 publish_date  | date   | not null
Indexes:
products_pkey PRIMARY KEY, btree (product_id)
idx_publish_date btree (publish_date, product_id)


During the load both indexes are present.

Regards
Thomas
 






Re: [GENERAL] COPY v. java performance comparison

2014-04-03 Thread Andy Colson

On 4/2/2014 7:30 PM, Rob Sargent wrote:







Well things slow down over time, and lots of too frequents:

Have done 500 batches in 24219 ms
Have done 1000 batches in 52362 ms
Have done 1500 batches in 82256 ms
Have done 2000 batches in 113754 ms
Have done 2500 batches in 149637 ms
Have done 3000 batches in 211314 ms
Have done 3500 batches in 301989 ms
Have done 4000 batches in 430817 ms
Have done 4500 batches in 596043 ms
Have done 5000 batches in 804250 ms

where a batch is 500,000 lines.  This on the java side of course.






Have you watched cpu usage of the java process vs the PG process in top? 
 I had a perl program importing data that was the bottleneck because it 
was calling rand().  I'll bet generating uuid's is cpu intensive too.


Is the java app cpu bound?

Also watch vmstat 3 for a minute or two.  The last two numbers (wa and 
id) (some vmstat's have a steal column, ignore that) will tell you if you are 
io bound.


-Andy




Re: [GENERAL] COPY v. java performance comparison

2014-04-03 Thread Rob Sargent

On 04/03/2014 09:01 AM, Thomas Kellerer wrote:

Rob Sargent, 02.04.2014 21:37:

I loaded 37M+ records using jOOQ (batching every 1000 lines) in 12+
hours (800+ records/sec).  Then I tried COPY and killed that after
11.25 hours when I realised that I had added on non-unque index on
the name fields after the first load. By that point is was on line
28301887, so ~0.75 done which implies it would have take ~15hours to
complete.

Would the overhead of the index likely explain this decrease in
throughput?

Impatience got the better of me and I killed the second COPY.  This
time it had done 54% of the file in 6.75 hours, extrapolating to
roughly 12 hours to do the whole thing.

That matches up with the java speed. Not sure if I should be elated
with jOOQ or disappointed with COPY.


This is not what I see with COPY FROM STDIN

When I load 2million rows using a batch size of 1000 with plain JDBC that takes 
about 4 minutes

Loading the same file through Java and COPY FROM STDIN takes about 4 seconds

The table looks like this:

 Table public.products
   Column   |  Type  | Modifiers
---++---
  product_id| integer| not null
  ean_code  | bigint | not null
  product_name  | character varying(100) | not null
  manufacturer_name | character varying  | not null
  price | numeric(10,2)  | not null
  publish_date  | date   | not null
Indexes:
 products_pkey PRIMARY KEY, btree (product_id)
 idx_publish_date btree (publish_date, product_id)


During the load both indexes are present.

Regards
Thomas
  


Thomas thanks for these numbers.

I have to straighten out my environment, which I admit I was hoping to 
avoid. I reset checkpoint_segments to 12 and restarted my server.
I kicked off the COPY at 19:00. That generated a couple of the 'too 
frequent' checkpoint statements plus 52 WARNING:  pgstat wait timeout lines during 
the next 8 hours, starting at 00:37 (5 hours in), 'til finally keeling 
over at 03:04 on line 37363768.  That's the last line of the input, so 
obviously I didn't flush my last println properly. I'm beyond getting 
embarrassed at this point.


Is turning auto-vacuum off a reasonable way through this?



Re: [GENERAL] COPY v. java performance comparison

2014-04-03 Thread Rob Sargent

Is the java app cpu bound?



 Also watch vmstat 3 for a minute or two.  The last two numbers (wa
  id) (some vmstat's have a steal, ignore that) will tell you if you
 are io bound.

 -Andy


During COPY, with autovacuum off (server restarted, manual vacuum to 
get things going).  Immediately hit the 'too frequently' checkpoint gripe. 
checkpoint_segments still at 12.


Not cpu bound, so waiting for IO, but surely one expects that?

vmstat 3
procs ---memory-- ---swap-- -io --system-- 
-cpu-
 r  b   swpd   free   buff  cache   si   sobibo   in   cs us sy 
id wa st
 0  2 100148 3317248 763264 3373103200 0 33355 10555 20122  
1  1 84 14  0
 1  0 100148 3311040 763272 3373697200 0 18128 8 20601  
5  1 87  6  0
 1  0 100148 3277808 763272 3376900800 0 12536 11373 20551  
7  2 90  1  0
 1  1 100148 3267020 763272 3377873200 0 43125 10968 20848  
5  2 90  3  0
 0  1 100148 3250528 763272 3379280400 0 63567 10467 20121  
2  1 81 16  0
 0  1 100148 3258584 763276 3378430800 0 17121 10553 20146  
1  1 86 12  0
 1  2 100148 3237616 763276 3380202400 0 36327 11198 20151  
4  2 85  9  0
 1  0 100148 3223224 763276 3381655600 0 32189 10763 19900  
2  1 84 13  0
 0  0 100148 3206740 763276 3383158000 0 13233 11080 20469  
7  3 89  1  0
 0  2 100148 3204872 763276 3382958000 0 75205 10500 20912  
2  1 84 13  0
 0  2 100148 3188256 763276 3384754400 0 35448 11028 20788  
3  2 86  9  0
 0  2 100148 3190248 763276 3384488800 0 21808 11938 20848  
2  1 82 15  0
 0  1 100148 3184656 763280 3384896800 0 18592 11562 20574  
1  1 84 14  0




