[GENERAL] Re: The fastest way to update thousands of rows in moderately sized table

2015-07-23 Thread twoflower
林士博 wrote
> Can you post the execution plan of the original update SQL.
> EXPLAIN (ANALYZE ON, BUFFERS ON) update "TRANSLATION" set fk_assignment = NULL where fk_job = 1000;

Here it is:

Update on "TRANSLATION"  (cost=0.56..9645.13 rows=3113 width=391) (actual time=35091.036..35091.036 rows=0 loops=1)
  Buffers: shared hit=74842343 read=7242 dirtied=7513
  ->  Index Scan using "TRANSLATION_idx_composite_job_last_revision" on "TRANSLATION"  (cost=0.56..9645.13 rows=3113 width=391) (actual time=0.042..24.147 rows=8920 loops=1)
        Index Cond: (fk_job = 59004)
        Buffers: shared hit=626

Planning time: 0.362 ms
Execution time: 35091.192 ms




Re: [GENERAL] Re: The fastest way to update thousands of rows in moderately sized table

2015-07-23 Thread 林士博
Sorry, my mistake.
The execution plan of an UPDATE statement always seems to report rows=0 at the top node.

Can you post the execution plan of the original update SQL?
EXPLAIN (ANALYZE ON, BUFFERS ON) update "TRANSLATION" set fk_assignment = NULL where fk_job = 1000;


[GENERAL] Re: The fastest way to update thousands of rows in moderately sized table

2015-07-23 Thread twoflower
林士博 wrote
> Try creating an index on TRANSLATION (fk_job).

The index is already there.




Re: [GENERAL] Re: The fastest way to update thousands of rows in moderately sized table

2015-07-23 Thread 林士博
Try creating an index on TRANSLATION (fk_job).

From the execution plan you posted,
-> (actual time=43262.266..43262.266 rows=0 loops=1)
it appears there are no rows to be updated.

So if you have an index on TRANSLATION (fk_job), an update like the one below
should finish within several seconds.
-> update "TRANSLATION" set fk_assignment = NULL where fk_job = 1000;
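
A minimal sketch of that suggestion (the index name is hypothetical):

CREATE INDEX translation_fk_job_idx ON "TRANSLATION" (fk_job);

-- then re-check the plan:
EXPLAIN (ANALYZE ON, BUFFERS ON)
update "TRANSLATION" set fk_assignment = NULL where fk_job = 1000;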


Re: [GENERAL] Using the database to validate data

2015-07-23 Thread 林士博
Yes, you are right.
Doing it in a PostgreSQL procedure is faster than doing it in the application.


Re: [GENERAL] Using the database to validate data

2015-07-23 Thread Adrian Klaver

On 07/23/2015 08:20 PM, 林士博 wrote:

If I am following correctly, you can do it in your application as follows:
1. begin a transaction
2. insert each row; catch any db exception,
   and save the exception message and other information you need to an array.
3. at the end, you have all the information about the wrong data in the array,
   if there is any, and you can then decide whether to roll back or to commit.
By the way, this is about programming, not PostgreSQL.


Not sure I am following; the OP was asking the best way to catch any
errors Postgres might throw, using built-in database features as much as
possible. That seems to be Postgres related.




2015-07-24 5:58 GMT+09:00 Tim Clarke <tim.cla...@manifest.co.uk>:

Shouldn't be too difficult to import those new rows into one table,
write a procedure that inserts them into the real table one by one and
logs the validation failure if any - committing good rows and rolling
back bad. In fact you could then write the failures to a third table
with completely relaxed (or no) validation.

Tim Clarke

On 23/07/15 21:48, Adrian Klaver wrote:
 > On 07/23/2015 12:04 PM, Jon Lapham wrote:
 >> On 07/23/2015 03:02 PM, Adrian Klaver wrote:
 >>> http://pgloader.io/
 >>
 >> Ok, thanks, I'll look into pgloader's data validation abilities.
 >>
 >> However, my naive understanding of pgloader is that it is used to
 >> quickly load data into a database, which is not what I am
looking to do.
 >> I want to validate data integrity *before* putting it into the
database.
 >> If there is a problem with any part of the data, I don't want
any of it
 >> in the database.
 >
 > I misunderstood, I thought you just wanted information on the rows
 > that did not get in. pgloader does this by including the rejected data
 > in *.dat and the Postgres log of why it was rejected in *.log.
 >
 > 
 >
 > I could still see making use of this by using the --before <file_name>,
 > where file_name contains a CREATE TEMPORARY TABLE
 > some_table script that mimics the permanent table. Then it would load
 > against the temporary table, write out any errors and then drop the
 > table at the end. This would not put data into the permanent table on
 > complete success though. That would require some magic in AFTER LOAD
 > EXECUTE that I have not come up with yet:)
 >
 > 
 >>
 >> -Jon
 >>
 >
 >










--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] Using the database to validate data

2015-07-23 Thread 林士博
If I am following correctly, you can do it in your application as follows:
1. begin a transaction
2. insert each row; catch any db exception,
   and save the exception message and other information you need to an array.
3. at the end, you have all the information about the wrong data in the array,
   if there is any, and you can then decide whether to roll back or to commit.

By the way, this is about programming, not PostgreSQL.

2015-07-24 5:58 GMT+09:00 Tim Clarke :

> Shouldn't be too difficult to import those new rows into one table,
> write a procedure that inserts them into the real table one by one and
> logs the validation failure if any - committing good rows and rolling
> back bad. In fact if you could then write the failures to a third table
> with a completely relaxed (or no) validation?
>
> Tim Clarke
>
> On 23/07/15 21:48, Adrian Klaver wrote:
> > On 07/23/2015 12:04 PM, Jon Lapham wrote:
> >> On 07/23/2015 03:02 PM, Adrian Klaver wrote:
> >>> http://pgloader.io/
> >>
> >> Ok, thanks, I'll look into pgloader's data validation abilities.
> >>
> >> However, my naive understanding of pgloader is that it is used to
> >> quickly load data into a database, which is not what I am looking to do.
> >> I want to validate data integrity *before* putting it into the database.
> >> If there is a problem with any part of the data, I don't want any of it
> >> in the database.
> >
> > I misunderstood, I thought you just wanted information on the rows
> > that did not get in. pgloader does this by including the rejected data
> > in *.dat and the Postgres log of why it was rejected in *.log.
> >
> > 
> >
> > I could still see making use of this by using the --before <file_name>,
> > where file_name contains a CREATE TEMPORARY TABLE
> > some_table script that mimics the permanent table. Then it would load
> > against the temporary table, write out any errors and then drop the
> > table at the end. This would not put data into the permanent table on
> > complete success though. That would require some magic in AFTER LOAD
> > EXECUTE that I have not come up with yet:)
> >
> > 
> >>
> >> -Jon
> >>
> >
> >
>
>
>





Re: [GENERAL] Schema Help Needed To Get Unstuck

2015-07-23 Thread Rich Shepard

On Thu, 23 Jul 2015, Melvin Davidson wrote:


Does this help?


Melvin,

  Yep.

  After pondering David's response I recognized my error: I had the permit
table as the center of the relationships when it should be the conditions
table. As you wrote, permits, parameters, and locations all feed the
conditions table (where additional attributes are added), then a row from
that table is added to the monitoring table where quantities and other
attributes are entered.

  Sometimes we get so deep in a rut we can't see over the edge and find the
correct path.

Thanks very much,

Rich




Re: [GENERAL] Schema Help Needed To Get Unstuck

2015-07-23 Thread Melvin Davidson
Does this help?

CREATE TABLE permits
(
permit_number bigint,
...,
other_columns ,
...,
CONSTRAINT permits_pk PRIMARY KEY (permit_number)
);

CREATE TABLE parameters
(
permit_number bigint,
parameter varchar(50),
...,
other_columns ,
...,
CONSTRAINT parameters_pk PRIMARY KEY (permit_number, parameter),
CONSTRAINT parameters_permit_fk FOREIGN KEY (permit_number)
  REFERENCES permits (permit_number)
);

CREATE TABLE locations
(
permit_number bigint,
location varchar(50),
...,
other_columns ,
...,
CONSTRAINT locations_pk PRIMARY KEY (permit_number, location),
CONSTRAINT locations_permit_fk FOREIGN KEY (permit_number)
  REFERENCES permits (permit_number)
);

CREATE TABLE conditions
(
permit_number bigint,
condition varchar(50),
location varchar(50),
frequency varchar(10),
start_time timestamp,
end_time timestamp,
...,
other_columns ,
...,
CONSTRAINT conditions_pk PRIMARY KEY (permit_number, condition, location),
CONSTRAINT conditions_permit_fk FOREIGN KEY (permit_number)
  REFERENCES permits (permit_number),
CONSTRAINT conditions_location_fk FOREIGN KEY (permit_number, location)
  REFERENCES locations (permit_number, location)
);

CREATE TABLE monitoring
(
permit_number bigint,
monitor_date timestamp,
location varchar(50),
parameter varchar(50),
frequency varchar(10),
start_time timestamp,
end_time timestamp,
...,
other_columns ,
...,
CONSTRAINT monitoring_pk PRIMARY KEY (permit_number, monitor_date, location, parameter),
CONSTRAINT monitoring_permit_fk FOREIGN KEY (permit_number)
  REFERENCES permits (permit_number),
CONSTRAINT monitoring_location_fk FOREIGN KEY (permit_number, location)
  REFERENCES locations (permit_number, location),
CONSTRAINT monitoring_parameter_fk FOREIGN KEY (permit_number, parameter)
  REFERENCES parameters (permit_number, parameter)
);


Re: [GENERAL] Schema Help Needed To Get Unstuck

2015-07-23 Thread Rich Shepard

On Thu, 23 Jul 2015, David G. Johnston wrote:


Conditions <- (Permit + Location + Parameter + Frequency + Temporal (from,
until))
Monitoring <- (Condition + Event)

While you may require additional modelling tables to support your user
interface (i.e., picking valid combinations of PeLoPa when creating a new
condition) the fundamental concept is that each permit+location+parameter
combination needs to be monitored during a given period (temporal) at a
given frequency in that period.  Every time that combination generates a
number you combine the "Condition" with the event (data + time) and add the
result to the monitoring table.

Active combinations of permits, locations, and parameters can be directly
derived from the conditions table.  The temporal fields facilitate history
generation.


David,

  This is what I was working to achieve, but did not have the details
straight. I'll ponder this and make sure I set the proper FKs on the
appropriate tables.

Thanks,

Rich




Re: [GENERAL] Delete rule does not prevent truncate

2015-07-23 Thread Adrian Klaver

On 07/23/2015 05:37 PM, Rob Sargent wrote:

On 07/23/2015 06:27 PM, Adrian Klaver wrote:

On 07/23/2015 05:08 PM, Rob Sargent wrote:

On 07/23/2015 04:15 PM, Karsten Hilbert wrote:

On Thu, Jul 23, 2015 at 12:28:32PM -0600, Rob Sargent wrote:


I'm suggesting OP might find changing truncate statements to deletes
(without a where clause) a simpler solution. Something has to change.

Well, OP isn't looking for a solution to "delete all rows"
but rather to _prevent_ deletion.

Tim can't go forth and tell Blackhats to "please use DELETE
rather than TRUNCATE", right ?

AFAICT it'd be more useful to advise OP to revoke TRUNCATE
rights on tables.

Karsten

Not sure about Tim and the Blackhats (there's a band name in there
somewhere) but wouldn't the OP have the exact same code to fix, one way or
another?



I think the point was, the OP(Tim) might not have access to the code
that is trying to TRUNCATE. This could be because it is coming from
authorized users who are writing their own code or unauthorized
users(Blackhats) who are trying to sneak code in.



Fair enough but both blackhats and the authorized are just as likely to
drop the database as truncate something (intentionally or not) and
backups stashed everywhere is the first order of business.


Well that is a different crisis and not covered by rules or triggers:)


--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] Delete rule does not prevent truncate

2015-07-23 Thread Rob Sargent

On 07/23/2015 06:27 PM, Adrian Klaver wrote:

On 07/23/2015 05:08 PM, Rob Sargent wrote:

On 07/23/2015 04:15 PM, Karsten Hilbert wrote:

On Thu, Jul 23, 2015 at 12:28:32PM -0600, Rob Sargent wrote:


I'm suggesting OP might find changing truncate statements to deletes
(without a where clause) a simpler solution. Something has to change.

Well, OP isn't looking for a solution to "delete all rows"
but rather to _prevent_ deletion.

Tim can't go forth and tell Blackhats to "please use DELETE
rather than TRUNCATE", right ?

AFAICT it'd be more useful to advise OP to revoke TRUNCATE
rights on tables.

Karsten

Not sure about Tim and the Blackhats (there's a band name in there
somewhere) but wouldn't the OP have the exact same code to fix, one way or
another?




I think the point was, the OP(Tim) might not have access to the code 
that is trying to TRUNCATE. This could be because it is coming from 
authorized users who are writing their own code or unauthorized 
users(Blackhats) who are trying to sneak code in.



Fair enough but both blackhats and the authorized are just as likely to 
drop the database as truncate something (intentionally or not) and 
backups stashed everywhere is the first order of business.


Re: [GENERAL] Delete rule does not prevent truncate

2015-07-23 Thread Adrian Klaver

On 07/23/2015 05:08 PM, Rob Sargent wrote:

On 07/23/2015 04:15 PM, Karsten Hilbert wrote:

On Thu, Jul 23, 2015 at 12:28:32PM -0600, Rob Sargent wrote:


I'm suggesting OP might find changing truncate statements to deletes
(without a where clause) a simpler solution. Something has to change.

Well, OP isn't looking for a solution to "delete all rows"
but rather to _prevent_ deletion.

Tim can't go forth and tell Blackhats to "please use DELETE
rather than TRUNCATE", right ?

AFAICT it'd be more useful to advise OP to revoke TRUNCATE
rights on tables.

Karsten

Not sure about Tim and the Blackhats (there's a band name in there
somewhere) but wouldn't the OP have the exact same code to fix, one way or another?



I think the point was, the OP(Tim) might not have access to the code 
that is trying to TRUNCATE. This could be because it is coming from 
authorized users who are writing their own code or unauthorized 
users(Blackhats) who are trying to sneak code in.



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] Schema Help Needed To Get Unstuck

2015-07-23 Thread David G. Johnston
On Thu, Jul 23, 2015 at 3:06 PM, Rich Shepard 
wrote:

>   While designing the schema for a new application I have become
> high-centered
> and stuck relating some many-to-many tables. Fresh eyes and suggestions are
> needed on how to create many-to-many association tables among these five.
>
>   Table 'Permits': Contains information about each permit, PK is permit
> number.
>
>   Table 'Parameters': Contains all parameters (physical, chemical,
> biological, operational) by name and other attributes. This is a record of
> all parameters over the life of the permit. (Many parameters, many
> permits.)
>
>   Table 'Locations': Contains details about each monitoring location. This
> is a record of all locations over the life of the permit. (Many locations,
> many permits.)
>
>   Table 'Conditions': Contains permit numbers and which parameters are to
> be monitored at which locations and at what frequencies and over what time
> period. I'm thinking one row per permit that references the permit number,
> parameters, locations, frequencies, and active dates. Each location has
> multiple parameters, and each parameter is monitored at multiple locations.
> FK references permit number. (Many parameters, many locations, many
> frequencies for each of many permits.)
>
>   Table 'Monitoring': Contains columns for date, location, parameter,
> quantity, and other attributes. FK references permit number. (Many rows for
> each permit.)
>
>
Conditions <- (Permit + Location + Parameter + Frequency + Temporal (from,
until))
Monitoring <- (Condition + Event)

While you may require additional modelling tables to support your user
interface (i.e., picking valid combinations of PeLoPa when creating a new
condition) the fundamental concept is that each permit+location+parameter
combination needs to be monitored during a given period (temporal) at a
given frequency in that period.  Every time that combination generates a
number you combine the "Condition" with the event (data + time) and add the
result to the monitoring table.

Active combinations of permits, locations, and parameters can be directly
derived from the conditions table.  The temporal fields facilitate history
generation.
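
A minimal sketch of that derivation (the column names, especially the
temporal fields, are hypothetical):

SELECT DISTINCT permit_number, location, parameter
FROM conditions
WHERE current_date BETWEEN active_from AND active_until;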

David J.


Re: [GENERAL] Delete rule does not prevent truncate

2015-07-23 Thread Rob Sargent

On 07/23/2015 04:15 PM, Karsten Hilbert wrote:

On Thu, Jul 23, 2015 at 12:28:32PM -0600, Rob Sargent wrote:


I'm suggesting OP might find changing truncate statements to deletes
(without a where clause) a simpler solution. Something has to change.

Well, OP isn't looking for a solution to "delete all rows"
but rather to _prevent_ deletion.

Tim can't go forth and tell Blackhats to "please use DELETE
rather than TRUNCATE", right ?

AFAICT it'd be more useful to advise OP to revoke TRUNCATE
rights on tables.

Karsten
Not sure about Tim and the Blackhats (there's a band name in there 
somewhere) but wouldn't the OP have the exact same code to fix, one way or another?




[GENERAL] Schema Help Needed To Get Unstuck

2015-07-23 Thread Rich Shepard

  While designing the schema for a new application I have become high-centered
and stuck relating some many-to-many tables. Fresh eyes and suggestions are
needed on how to create many-to-many association tables among these five.

  Table 'Permits': Contains information about each permit, PK is permit
number.

  Table 'Parameters': Contains all parameters (physical, chemical,
biological, operational) by name and other attributes. This is a record of
all parameters over the life of the permit. (Many parameters, many permits.)

  Table 'Locations': Contains details about each monitoring location. This
is a record of all locations over the life of the permit. (Many locations,
many permits.)

  Table 'Conditions': Contains permit numbers and which parameters are to
be monitored at which locations and at what frequencies and over what time
period. I'm thinking one row per permit that references the permit number,
parameters, locations, frequencies, and active dates. Each location has
multiple parameters, and each parameter is monitored at multiple locations.
FK references permit number. (Many parameters, many locations, many
frequencies for each of many permits.)

  Table 'Monitoring': Contains columns for date, location, parameter,
quantity, and other attributes. FK references permit number. (Many rows for
each permit.)

TIA,

Rich






Re: [GENERAL] Delete rule does not prevent truncate

2015-07-23 Thread Karsten Hilbert
On Thu, Jul 23, 2015 at 12:28:32PM -0600, Rob Sargent wrote:

> I'm suggesting OP might find changing truncate statements to deletes
> (without a where clause) a simpler solution. Something has to change.

Well, OP isn't looking for a solution to "delete all rows"
but rather to _prevent_ deletion.

Tim can't go forth and tell Blackhats to "please use DELETE
rather than TRUNCATE", right ?

AFAICT it'd be more useful to advise OP to revoke TRUNCATE
rights on tables.
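
A minimal sketch of that approach (table and role names are hypothetical):

REVOKE TRUNCATE ON some_table FROM some_role;

TRUNCATE is a separately grantable table privilege, so it can be revoked
while DELETE stays allowed.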

Karsten
-- 
GPG key ID E4071346 @ eu.pool.sks-keyservers.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346






Re: [GENERAL] Backup fatal issue

2015-07-23 Thread Mephysto
Hi Tom,
I am using only pg_ctl to start and stop my database. At the moment I do
not use any script.

Bye.

Meph

On 17 July 2015 at 16:32, Tom Lane  wrote:

> mephysto  writes:
> > I have some problems in a postgres cluster when I try to execute a
> backup.
> > ...
> > FATAL:  semctl(983046, 3, SETVAL, 0) failed: Invalid argument
> > ...
> > LOG:  could not remove shared memory segment "/PostgreSQL.1804289383": No
> > such file or directory
> > LOG:  semctl(786432, 0, IPC_RMID, ...) failed: Invalid argument
> > LOG:  semctl(819201, 0, IPC_RMID, ...) failed: Invalid argument
> > LOG:  semctl(851970, 0, IPC_RMID, ...) failed: Invalid argument
> > LOG:  semctl(884739, 0, IPC_RMID, ...) failed: Invalid argument
> > LOG:  semctl(917508, 0, IPC_RMID, ...) failed: Invalid argument
> > LOG:  semctl(950277, 0, IPC_RMID, ...) failed: Invalid argument
> > LOG:  semctl(983046, 0, IPC_RMID, ...) failed: Invalid argument
> > LOG:  semctl(1015815, 0, IPC_RMID, ...) failed: Invalid argument
>
> Assuming you didn't manually destroy the shared memory segment and
> semaphores, the only very likely explanation for this is that another
> postmaster instance started up, used the same memory and semaphore
> IDs, and then removed them when it shut down.  This should be impossible
> given the interlocks around shared memory segment creation and use of
> a data directory; but it's possible to break those interlocks if you
> try hard enough.
>
> I speculate that you've got a postmaster-startup script that
> unconditionally removes the postmaster.pid lock file before starting the
> postmaster, and that your backup procedure or some other routine action
> is invoking that even when there might be a live postmaster already.
>
> If so, the answer is simple: get rid of the script's attempt to remove the
> lock file.  And have a word with whoever put it in.  *Nothing* except the
> postmaster's own code should *ever* remove postmaster.pid.
>
> regards, tom lane
>


Re: [GENERAL] Backup fatal issue

2015-07-23 Thread Mephysto
Hi Melvin,
I tried to execute pg_dump, and afterwards I tried to launch pg_start_backup
and pg_stop_backup. In both cases, I did not receive any error, so I decided
to investigate BarMan.

Thanks for your help.

Bye.

Meph

On 17 July 2015 at 16:16, Melvin Davidson  wrote:

> If this is occurring with Barman, try doing a manual backup.
> IE: pg_dump your_database > /tmp/somefile.sql
>
> If that works, then the problem is with Barman or its configuration.
>
> On Fri, Jul 17, 2015 at 10:07 AM, Mephysto 
> wrote:
>
>> Hi Melvin,
>> I am using Postgres 9.4.4 in a virtualized Debian Linux 8.0. I do not know
>> backup command because is launched by BARMAN 1.4.1. I have tried many times
>> to execute the backup. I have restarted and reinstalled Postgres and I also
>> rebooted OS many times.
>>
>> Thanks for your help.
>>
>> Meph
>>
>> On 17 July 2015 at 15:12, Melvin Davidson  wrote:
>>
>>> I think we need just a little bit more information.
>>>
>>> What is the O/S?
>>> What is the PostgreSQL version?
>>> What is the backup command?
>>> Does this always occur with the same command?
>>> Did you need to restart PostgreSQL after this started?
>>>
>>>
>>>
>>> On Fri, Jul 17, 2015 at 3:29 AM, mephysto 
>>> wrote:
>>>
 Hi there,
 I have some problems in a postgres cluster when I try to execute a
 backup.
 This is the log of entire operation:

 LOG:  connection received: host=91.121.182.110 port=54957
 LOG:  connection authorized: user=postgres database=postgres
 LOG:  statement: BEGIN
 LOG:  statement: SET application_name TO barman
 LOG:  statement: SHOW "data_directory"
 LOG:  connection received: host=91.121.182.110 port=54958
 LOG:  connection authorized: user=postgres database=postgres
 LOG:  statement: BEGIN
 LOG:  statement: SET application_name TO barman
 LOG:  statement: SELECT name, setting FROM pg_settings WHERE name IN
 ('config_file', 'hba_file', 'ident_file')
 LOG:  connection received: host=91.121.182.110 port=54959
 LOG:  connection authorized: user=postgres database=postgres
 LOG:  statement: BEGIN
 LOG:  statement: SET application_name TO barman
 LOG:  statement: SELECT spcname, oid, pg_tablespace_location(oid) AS
 spclocation FROM pg_tablespace WHERE pg_tablespace_location(oid) != ''
 LOG:  connection received: host=91.121.182.110 port=54960
 LOG:  connection authorized: user=postgres database=postgres
 LOG:  statement: BEGIN
 LOG:  statement: SET application_name TO barman
 LOG:  statement: select pg_is_in_recovery()
 LOG:  statement: SELECT xlog_loc, (pg_xlogfile_name_offset(xlog_loc)).*,
 now() FROM pg_start_backup('Barman backup redevodb_citrix
 20150717T091702',false) as xlog_loc
 LOG:  connection received: host=91.121.182.110 port=54988
 FATAL:  semctl(983046, 3, SETVAL, 0) failed: Invalid argument
 LOG:  server process (PID 1756) exited with exit code 1
 LOG:  terminating any other active server processes
 WARNING:  terminating connection because of crash of another server
 process
 DETAIL:  The postmaster has commanded this server process to roll back
 the
 current transaction and exit, because another server process exited
 abnormally and possibly corrupted shared memory.
 HINT:  In a moment you should be able to reconnect to the database and
 repeat your command.
 LOG:  all server processes terminated; reinitializing
 LOG:  could not remove shared memory segment "/PostgreSQL.1804289383":
 No
 such file or directory
 LOG:  semctl(786432, 0, IPC_RMID, ...) failed: Invalid argument
 LOG:  semctl(819201, 0, IPC_RMID, ...) failed: Invalid argument
 LOG:  semctl(851970, 0, IPC_RMID, ...) failed: Invalid argument
 LOG:  semctl(884739, 0, IPC_RMID, ...) failed: Invalid argument
 LOG:  semctl(917508, 0, IPC_RMID, ...) failed: Invalid argument
 LOG:  semctl(950277, 0, IPC_RMID, ...) failed: Invalid argument
 LOG:  semctl(983046, 0, IPC_RMID, ...) failed: Invalid argument
 LOG:  semctl(1015815, 0, IPC_RMID, ...) failed: Invalid argument
 LOG:  database system was interrupted; last known up at 2015-07-17
 09:18:10
 CEST
 LOG:  redo starts at 16/9F28
 LOG:  record with zero length at 16/9F000100
 LOG:  redo done at 16/9FC8
 LOG:  MultiXact member wraparound protections are now enabled
 LOG:  database system is ready to accept connections
 LOG:  autovacuum launcher started

 Can anyone help me to identify and to resolve this inconvenience?

 Thanks in advance.

 Meph






[GENERAL] Re: The fastest way to update thousands of rows in moderately sized table

2015-07-23 Thread twoflower
Adrian Klaver-4 wrote
> Have you tried wrapping the above in a BEGIN/COMMIT block?

Yes, I am running the tests inside a BEGIN TRANSACTION / ROLLBACK block.




Re: [GENERAL] Using the database to validate data

2015-07-23 Thread Tim Clarke
Shouldn't be too difficult to import those new rows into one table,
write a procedure that inserts them into the real table one by one and
logs the validation failure if any - committing good rows and rolling
back bad. In fact you could then write the failures to a third table
with completely relaxed (or no) validation.
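
A minimal PL/pgSQL sketch of that procedure (all table and column names are
hypothetical):

-- staging(id, col_a, col_b) holds the raw import,
-- real_table(col_a, col_b) is the validated target,
-- load_errors(staging_id, message) is the third table for failures.
CREATE OR REPLACE FUNCTION load_staging() RETURNS void AS $$
DECLARE
    r record;
BEGIN
    FOR r IN SELECT * FROM staging ORDER BY id LOOP
        BEGIN
            INSERT INTO real_table (col_a, col_b) VALUES (r.col_a, r.col_b);
        EXCEPTION WHEN others THEN
            -- each BEGIN/EXCEPTION block acts as an implicit savepoint,
            -- so only this row's insert is rolled back
            INSERT INTO load_errors (staging_id, message)
            VALUES (r.id, SQLERRM);
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;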

Tim Clarke

On 23/07/15 21:48, Adrian Klaver wrote:
> On 07/23/2015 12:04 PM, Jon Lapham wrote:
>> On 07/23/2015 03:02 PM, Adrian Klaver wrote:
>>> http://pgloader.io/
>>
>> Ok, thanks, I'll look into pgloader's data validation abilities.
>>
>> However, my naive understanding of pgloader is that it is used to
>> quickly load data into a database, which is not what I am looking to do.
>> I want to validate data integrity *before* putting it into the database.
>> If there is a problem with any part of the data, I don't want any of it
>> in the database.
>
> I misunderstood, I thought you just wanted information on the rows
> that did not get in. pgloader does this by including the rejected data
> in *.dat and the Postgres log of why it was rejected in *.log.
>
> 
>
> I could still see making use of this by using the --before <file_name>,
> where file_name contains a CREATE TEMPORARY TABLE
> some_table script that mimics the permanent table. Then it would load
> against the temporary table, write out any errors and then drop the
> table at the end. This would not put data into the permanent table on
> complete success though. That would require some magic in AFTER LOAD
> EXECUTE that I have not come up with yet:)
>
> 
>>
>> -Jon
>>
>
>





Re: [GENERAL] The fastest way to update thousands of rows in moderately sized table

2015-07-23 Thread Adrian Klaver

On 07/23/2015 01:17 PM, twoflower wrote:

Hello, I have a table with 30 million records in which I need to update
a single column for a couple of thousand rows, let's say 10 000. The
new column value is identical for all matching rows. Doing

update "TRANSLATION" set fk_assignment = NULL where fk_job = 1000;

takes 45 seconds. I understand that UPDATE is basically an INSERT
followed by DELETE but I was hoping I could do better than that. I
found a suggestion to use a temporary table to speed things up, so now I
have this:

create unlogged table "temp_table" as
select id, fk_assignment
from "TRANSLATION"
where fk_job = 1000;

update "temp_table" set fk_assignment = null;

update "TRANSLATION" _target
set fk_assignment = _source.fk_assignment
from "temp_table" _source
where _target.id = _source.id;

drop table "temp_table";


Have you tried wrapping the above in a BEGIN/COMMIT block?

So:

BEGIN;

create unlogged table "temp_table" as
select id, fk_assignment
from "TRANSLATION"
where fk_job = 1000;

update "temp_table" set fk_assignment = null;

update "TRANSLATION" _target
set fk_assignment = _source.fk_assignment
from "temp_table" _source
where _target.id = _source.id;

drop table "temp_table";

COMMIT;


This got me to about 37 seconds. Still pretty slow. The TRANSLATION table
has an index and a foreign key constraint on fk_assignment. Removing
the constraint brought very little benefit. Removing the index is
probably out of the question as these kinds of operations are very frequent
and the table itself is used heavily, including the index. Execution plan:

Update on "TRANSLATION" _target  (cost=0.56..116987.76 rows=13983 width=405) (actual time=43262.266..43262.266 rows=0 loops=1)
  ->  Nested Loop  (cost=0.56..116987.76 rows=13983 width=405) (actual time=0.566..146.084 rows=8920 loops=1)
        ->  Seq Scan on temp_segs _source  (cost=0.00..218.83 rows=13983 width=22) (actual time=0.457..13.994 rows=8920 loops=1)
        ->  Index Scan using "TRANSLATION_pkey" on "TRANSLATION" _target  (cost=0.56..8.34 rows=1 width=391) (actual time=0.009..0.011 rows=1 loops=8920)
              Index Cond: (id = _source.id)

Planning time: 1.167 ms
Execution time: 43262.577 ms
Is there anything else worth trying? Are these numbers something to be
expected, from your experience?

I have Postgres 9.4, the database is on SSD.

Thank you very much for any suggestions.

Standa




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] Using the database to validate data

2015-07-23 Thread Adrian Klaver

On 07/23/2015 12:04 PM, Jon Lapham wrote:

On 07/23/2015 03:02 PM, Adrian Klaver wrote:

http://pgloader.io/


Ok, thanks, I'll look into pgloader's data validation abilities.

However, my naive understanding of pgloader is that it is used to
quickly load data into a database, which is not what I am looking to do.
I want to validate data integrity *before* putting it into the database.
If there is a problem with any part of the data, I don't want any of it
in the database.


I misunderstood, I thought you just wanted information on the rows that 
did not get in. pgloader does this by including the rejected data in 
*.dat and the Postgres log of why it was rejected in *.log.




I could still see making use of this by using the --before <file_name>, 
where file_name contains a CREATE TEMPORARY TABLE some_table script that 
mimics the permanent table. Then it would load against the temporary 
table, write out any errors and then drop the table at the end. This 
would not put data into the permanent table on complete success though. 
That would require some magic in AFTER LOAD EXECUTE that I have not come 
up with yet:)
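
For what it's worth, the before-file could be as small as this (the table
names are hypothetical):

-- mimic the permanent table; LIKE copies defaults, NOT NULL, CHECK
-- constraints and indexes, but not foreign keys, so FK violations would
-- still need the real table to be caught.
CREATE TEMPORARY TABLE import_check (LIKE real_table INCLUDING ALL);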





-Jon




--
Adrian Klaver
adrian.kla...@aklaver.com




[GENERAL] The fastest way to update thousands of rows in moderately sized table

2015-07-23 Thread twoflower
Hello, I have a table with 30 million records in which I need to update a
single column for a couple of thousand rows, let's say 10 000. The new
column value is identical for all matching rows. Doing

update "TRANSLATION" set fk_assignment = NULL where fk_job = 1000;

takes 45 seconds. I understand that UPDATE is basically an INSERT followed
by DELETE but I was hoping I could do better than that. I found a suggestion
to use a temporary table to speed things up, so now I have this:

create unlogged table "temp_table" as
  select id, fk_assignment
  from "TRANSLATION"
  where fk_job = 1000;

update "temp_table" set fk_assignment = null;

update "TRANSLATION" _target
set fk_assignment = _source.fk_assignment
from "temp_table" _source
where _target.id = _source.id;

drop table "temp_table";


This got me to about 37 seconds. Still pretty slow. The TRANSLATION table has an
index and a foreign key constraint on fk_assignment. Removing the constraint
brought very little benefit. Removing the index is probably out of the question
as these kinds of operations are very frequent and the table itself is used
heavily, including the index. Execution plan:

Update on "TRANSLATION" _target  (cost=0.56..116987.76 rows=13983 width=405) (actual time=43262.266..43262.266 rows=0 loops=1)
  ->  Nested Loop  (cost=0.56..116987.76 rows=13983 width=405) (actual time=0.566..146.084 rows=8920 loops=1)
        ->  Seq Scan on temp_segs _source  (cost=0.00..218.83 rows=13983 width=22) (actual time=0.457..13.994 rows=8920 loops=1)
        ->  Index Scan using "TRANSLATION_pkey" on "TRANSLATION" _target  (cost=0.56..8.34 rows=1 width=391) (actual time=0.009..0.011 rows=1 loops=8920)
              Index Cond: (id = _source.id)

Planning time: 1.167 ms
Execution time: 43262.577 ms

Is there anything else worth trying? Are these numbers something to be
expected, from your experience?

I have Postgres 9.4, the database is on SSD.

Thank you very much for any suggestions.

Standa




Re: [GENERAL] Setting up HA postgresql

2015-07-23 Thread William Dunn
Thanks so much for lending your expertise, Tatsuo.

Aviel, I have found and linked below the relevant section of the official
pgpool-II docs on using pgpool-II for failover. It is available starting in
pgpool-II version 3.1:
http://www.pgpool.net/docs/latest/pgpool-en.html#stream

Will J. Dunn

On Tue, Jul 21, 2015 at 10:07 PM, Tatsuo Ishii  wrote:

> > But it appears that the fail condition for "watchdog" is the failure of a
> > pgpool-II instance. In the configuration described in the wiki you would
> > put a pgpool-II instance on each Postgres node, and if one of the
> pgpool-II
> > instances fails it executes a script (which can create the trigger file
> to
> > promote the standby to master). However, if the fail condition for
> watchdog
> > is a failure of the pgpool-II instance what happens if the DBMS has
> > availability issues but the pgpool-II process on that server is ok? The
> > fail condition would never be triggered, right?
>
> No. The condition (PostgreSQL failure) will be detected by one of the
> pgpool-II instances (not the watchdog process), and that pgpool-II will
> disconnect the failed PostgreSQL, then execute the failover script (if
> defined), which will usually promote the standby PostgreSQL (if the failed
> PostgreSQL was the primary) or just ignore the broken PostgreSQL if it was
> a standby.
>
> Best regards,
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese:http://www.sraoss.co.jp
>
> > Will J. Dunn
> > willjdunn.com
> >
> > On Tue, Jul 21, 2015 at 4:27 PM, Joshua D. Drake 
> > wrote:
> >
> >>
> >> On 07/21/2015 01:21 PM, William Dunn wrote:
> >>
> >>> That's pretty cool! But the intended use of watchdog is so you can have
> >>> multiple pgpool-II instances and failover among them
> >>> (http://www.pgpool.net/docs/latest/pgpool-en.html#watchdog) rather
> than
> >>> failure of Postgres. In the configuration described in that wiki what
> >>> happens when the DBMS goes down but pgpool-II is fine? The watchdog
> >>> appears to be monitoring the pgpool-II process, not the
> >>> postgres/postmaster process.
> >>>
> >>
> >> From what I see, it has the ability to execute a command/script based on a
> >> failed condition which could include, but not be limited to, a command to
> >> fail over a standby.
> >>
> >>
> >> JD
> >>
> >> --
> >> Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
> >> PostgreSQL Centered full stack support, consulting and development.
> >> Announcing "I'm offended" is basically telling the world you can't
> >> control your own emotions, so everyone else should do it for you.
> >>
>


Re: [GENERAL] Delete rule does not prevent truncate

2015-07-23 Thread Rob Sargent

On 07/23/2015 12:25 PM, Adrian Klaver wrote:

On 07/23/2015 11:15 AM, Rob Sargent wrote:

On 07/23/2015 12:09 PM, Adrian Klaver wrote:

On 07/23/2015 04:57 AM, Tim Smith wrote:

Andrew,

 From the manual:

It is important to realize that a rule is really a command
transformation mechanism, or command macro. The transformation happens
before the execution of the command starts. If you actually want an
operation that fires independently for each physical row, you probably
want to use a trigger, not a rule


Thus, I should not have to use a trigger for TRUNCATE because the "each
row" concept does not apply. Plus it makes perfect sense to want to
transform the truncate command into an ignore.



Just in case it has not been made obvious yet, rules are silently
deprecated. They still exist because views depend on them, but it is
generally considered best practices to not use them outside that
realm. So if you want the rule behavior to change for TRUNCATE(if that
is even possible) you are fighting an uphill battle. You may pursue
that fight of course, but I would think you will get a quicker return
on your time if you just forget about using a RULE and stick to a
TRIGGER instead.


Or change to using delete instead of truncate?



Well Tim has an ON DELETE rule:

http://www.postgresql.org/message-id/CA+HuS5G2bZYYOGTJrw+VosjUPO298swxuU=jorfav54ut7v...@mail.gmail.com 



His expectation was that would also catch a TRUNCATE based on this:

"... It has the same effect as an unqualified DELETE on each table, ..."

from here:

http://www.postgresql.org/docs/9.4/interactive/sql-truncate.html

It was then explained that while TRUNCATE had the same end result as 
'DELETE FROM some_table' it was actually a separate command and 
action. Tim wants to catch a TRUNCATE and turn it into an ignore.



I'm suggesting OP might find changing truncate statements to deletes 
(without a where clause) a simpler solution. Something has to change.





Re: [GENERAL] Delete rule does not prevent truncate

2015-07-23 Thread Scott Marlowe
On Thu, Jul 23, 2015 at 12:09 PM, Adrian Klaver
 wrote:
> On 07/23/2015 04:57 AM, Tim Smith wrote:
>>
>> Andrew,
>>
>>  From the manual:
>>
>> It is important to realize that a rule is really a command
>> transformation mechanism, or command macro. The transformation happens
>> before the execution of the command starts. If you actually want an
>> operation that fires independently for each physical row, you probably
>> want to use a trigger, not a rule
>>
>>
>> Thus, I should not have to use a trigger for TRUNCATE because the "each
>> row" concept does not apply. Plus it makes perfect sense to want to
>> transform the truncate command into an ignore.
>>
>
> Just in case it has not been made obvious yet, rules are silently
> deprecated. They still exist because views depend on them, but it is
> generally considered best practices to not use them outside that realm. So
> if you want the rule behavior to change for TRUNCATE(if that is even
> possible) you are fighting an uphill battle. You may pursue that fight of
> course, but I would think you will get a quicker return on your time if you
> just forget about using a RULE and stick to a TRIGGER instead.
>


Also, the OP needs to know that COPY commands are ignored by rules as well.
I agree: stick to triggers; they're faster and less error prone.




Re: [GENERAL] Delete rule does not prevent truncate

2015-07-23 Thread Adrian Klaver

On 07/23/2015 11:15 AM, Rob Sargent wrote:

On 07/23/2015 12:09 PM, Adrian Klaver wrote:

On 07/23/2015 04:57 AM, Tim Smith wrote:

Andrew,

 From the manual:

It is important to realize that a rule is really a command
transformation mechanism, or command macro. The transformation happens
before the execution of the command starts. If you actually want an
operation that fires independently for each physical row, you probably
want to use a trigger, not a rule


Thus, I should not have to use a trigger for TRUNCATE because the "each
row" concept does not apply. Plus it makes perfect sense to want to
transform the truncate command into an ignore.



Just in case it has not been made obvious yet, rules are silently
deprecated. They still exist because views depend on them, but it is
generally considered best practices to not use them outside that
realm. So if you want the rule behavior to change for TRUNCATE(if that
is even possible) you are fighting an uphill battle. You may pursue
that fight of course, but I would think you will get a quicker return
on your time if you just forget about using a RULE and stick to a
TRIGGER instead.


Or change to using delete instead of truncate?



Well Tim has an ON DELETE rule:

http://www.postgresql.org/message-id/CA+HuS5G2bZYYOGTJrw+VosjUPO298swxuU=jorfav54ut7v...@mail.gmail.com

His expectation was that would also catch a TRUNCATE based on this:

"... It has the same effect as an unqualified DELETE on each table, ..."

from here:

http://www.postgresql.org/docs/9.4/interactive/sql-truncate.html

It was then explained that while TRUNCATE had the same end result as 
'DELETE FROM some_table' it was actually a separate command and action. 
Tim wants to catch a TRUNCATE and turn it into an ignore.



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] Delete rule does not prevent truncate

2015-07-23 Thread Rob Sargent

On 07/23/2015 12:09 PM, Adrian Klaver wrote:

On 07/23/2015 04:57 AM, Tim Smith wrote:

Andrew,

 From the manual:

It is important to realize that a rule is really a command
transformation mechanism, or command macro. The transformation happens
before the execution of the command starts. If you actually want an
operation that fires independently for each physical row, you probably
want to use a trigger, not a rule


Thus, I should not have to use a trigger for TRUNCATE because the "each
row" concept does not apply. Plus it makes perfect sense to want to
transform the truncate command into an ignore.



Just in case it has not been made obvious yet, rules are silently 
deprecated. They still exist because views depend on them, but it is 
generally considered best practices to not use them outside that 
realm. So if you want the rule behavior to change for TRUNCATE(if that 
is even possible) you are fighting an uphill battle. You may pursue 
that fight of course, but I would think you will get a quicker return 
on your time if you just forget about using a RULE and stick to a 
TRIGGER instead.



Or change to using delete instead of truncate?



Re: [GENERAL] Delete rule does not prevent truncate

2015-07-23 Thread Adrian Klaver

On 07/23/2015 04:57 AM, Tim Smith wrote:

Andrew,

 From the manual:

It is important to realize that a rule is really a command
transformation mechanism, or command macro. The transformation happens
before the execution of the command starts. If you actually want an
operation that fires independently for each physical row, you probably
want to use a trigger, not a rule


Thus, I should not have to use a trigger for TRUNCATE because the "each
row" concept does not apply. Plus it makes perfect sense to want to
transform the truncate command into an ignore.



Just in case it has not been made obvious yet, rules are silently 
deprecated. They still exist because views depend on them, but it is 
generally considered best practices to not use them outside that realm. 
So if you want the rule behavior to change for TRUNCATE(if that is even 
possible) you are fighting an uphill battle. You may pursue that fight 
of course, but I would think you will get a quicker return on your time 
if you just forget about using a RULE and stick to a TRIGGER instead.
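
For example, a minimal statement-level trigger along those lines (the table
name is hypothetical):

CREATE OR REPLACE FUNCTION forbid_truncate() RETURNS trigger AS $$
BEGIN
    RAISE EXCEPTION 'TRUNCATE is not allowed on %', TG_TABLE_NAME;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER no_truncate
    BEFORE TRUNCATE ON some_table
    FOR EACH STATEMENT EXECUTE PROCEDURE forbid_truncate();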


--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] Delete rule does not prevent truncate

2015-07-23 Thread Andres Freund
On 2015-07-23 12:57:20 +0100, Tim Smith wrote:
> Thus, I should not have to use a trigger for TRUNCATE because the "each
> row" concept does not apply. Plus it makes perfect sense to want to
> transform the truncate command into an ignore.

That'd entirely defeat the point of TRUNCATE being fast.


Either way, this isn't going to change, so it seems a bit pointless to
continue arguing in circles around it.




Re: [GENERAL] Q: text query search and

2015-07-23 Thread Dane Foster
Kevin,

I am also interested in a deeper explanation of what you mean by, "The easy
way to do that is just to add a trigram index and search for similar
strings, and forget about full text search," because I need to make a
decision about whether to use full text search or other pattern-matching
facilities such as LIKE and/or regular expressions. For me, the reason I
don't just default to full text search is that the documents are relatively
small (i.e., HTML <= 128K) and number fewer than 10,000, so I'm not sure the
effort expended to learn the ins and outs of full text search will be
beneficial to my use case.

Regards,

Dane
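
For reference, a minimal sketch of the trigram approach Kevin describes
(table and column names are hypothetical):

CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX docs_body_trgm_idx ON docs USING gin (body gin_trgm_ops);

-- the index accelerates substring matching, e.g. partial words:
SELECT id FROM docs WHERE body ILIKE '%partial wor%';

-- for short columns it also supports similarity ("fuzzy") search:
SELECT id FROM docs WHERE title % 'serch trem';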

On Thu, Jul 23, 2015 at 12:17 PM, Jeff Janes  wrote:

> On Thu, Jul 23, 2015 at 2:55 AM, amihay gonen  wrote:
>
>> Hi, I'm trying to implement a text search in PG.
>>
>> My goal is to enable the user to search on several columns, also on
>> partial words.
>>
>> here is sample code :
>> create table test_test( text_data tsvector, text_a varchar,text_b
>> varchar);
>>
>> insert into test_test(text_a,text_b) select 'name 10.10.2.3 ','name3
>> name' from generate_series(1,500);
>> update test_test set text_data=to_tsvector(text_a||' '||text_b);
>> CREATE INDEX test_test_idx ON test_test USING gin(text_data);
>>
>> explain ANALYZE select * from test_test where text_data @@ plainto_tsquery('name');
>>
>> my questions are :
>> 1. why the index is not used (I guess it is related to the way the data
>> is generated)
>>
>
> It returns the entire table, so there is no point in using an index.  Yes,
> it is the way it is generated, the same data repeated over and over is not
> very realistic.  If you just want random text, I use md5(random()::text).
> But if you want text that looks vaguely like English, I don't have a nice
> trick for that.  Maybe load the sgml files into a table.
>
>
>> 2. how can I use pg_trgm with tsvector to answer queries like
>> 10.10 or nam?
>>    the idea is to use the gin index; maybe there are other options
>> without using pg_trgm?
>>
>
> Do you mean:
>
> WHERE text_a LIKE '%10.10%' or text_a LIKE '%nam%' ?
>
> With the or, that going to be hard to optimize.
>
> Anyway, pg_tgrm requires its own special GIN index, it can't piggy back on
> the tsvector GIN index.
>
> CREATE INDEX whatever ON test_test USING gin(text_a gin_trgm_ops);
>
> or
>
> CREATE INDEX whatever ON test_test USING gin((text_a||' '||text_b)
> gin_trgm_ops);
>
> But, LIKE '%10.10%' is going to be tough for a pg_trgm index to help with,
> unless you compile your own code after removing "#define KEEPONLYALNUM"
>
> Cheers,
>
> Jeff
>


Re: [GENERAL] Using the database to validate data

2015-07-23 Thread Adrian Klaver

On 07/23/2015 05:55 AM, JPLapham wrote:

Hello,

I have an application that occasionally performs large batch inserts of user
hand-generated data. Input is a tab delimited file with typically hundreds
to a thousand lines of data.

Because the data is generated by hand, there are always many
transaction-stopping errors in a typical input run. For example, missing
datum in a NOT NULL column, a duplicate value in a UNIQUE column, data type
mismatch, FOREIGN KEY reference to something non-existing, etc. Of course,
we chose PostgreSQL exactly because of these problems, because of the robust
transactional control, rollback on errors, etc.

My question is the following. I would like to *test* the data input for
integrity in such a way that I can create a report to the user informing
them of exactly where in their input file to correct the problems.

IDEA 1: My first attempt at this was to simply slurp the data into the
database, collect the errors, and then rollback. Of course (as I now know),
this doesn't work because after the first problem, the database reports,
"current transaction is aborted, commands ignored until end of transaction
block". This means that I can only report to the user the location of the
first problem, and then they run the data again, and keep looping through
the process until the data is good, a huge waste of time.

IDEA 2: My second idea on how to do this was to ROLLBACK after each INSERT.
This allows me to check for things like NOT NULL and data type issues, but
not violations of UNIQUE within the new data.

IDEA 3: Write my own data pre-conditioner. Ugh, what a nightmare, I feel
like I'm writing my own database! Checking for FKEY constraints, UNIQUE, etc
is not trivial. It seems ridiculous to do this when I have the *actual*
database available to test against!


That has already been done:

http://pgloader.io/




Has anyone dealt with this kind of issue before? What are your opinions on
best practice for this? Of course I do not want to actually COMMIT until the
data is perfect!

Thanks for your time!
-Jon








--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] Delete rule does not prevent truncate

2015-07-23 Thread Tim Smith
Andrew,

From the manual:

It is important to realize that a rule is really a command transformation
mechanism, or command macro. The transformation happens before the
execution of the command starts. If you actually want an operation that
fires independently for each physical row, you probably want to use a
trigger, not a rule


Thus, I should not have to use a trigger for TRUNCATE because the "each
row" concept does not apply. Plus it makes perfect sense to want to
transform the truncate command into an ignore.



On Thursday, 23 July 2015, Andrew Sullivan  wrote:

> On Thu, Jul 23, 2015 at 08:06:19AM +0100, Tim Smith wrote:
> > What exactly is was the design decision that lead to TRUNCATE being
> > supported by triggers but not by rules ?
>
> There are two things.  First, probably the design decision was, "I
> care about triggers."  TRUNCATE was added (I believe) in version 7.0,
> and even then there was some caution indicated about the use of rules.
> See for instance
> http://www.postgresql.org/docs/7.0/static/rules19784.htm.  So you
> might be partly right.
>
> But second, it isn't clear what it would mean for TRUNCATE to be
> supported by rules.  Rules do query parse tree rewriting.  That is,
> they rewrite the query on the way through the system before they can
> possibly have any effect, changing one SQL statement into
> (effectively) a different one by the time it executes.  There is only
> one possible effect from TRUNCATE, and that is to eliminate all the
> data in the table.  I don't know what rewriting such a query would
> mean.
>
> Best regards,
>
> A
>
> --
> Andrew Sullivan
> a...@crankycanuck.ca 
>


[GENERAL] Using the database to validate data

2015-07-23 Thread JPLapham
Hello,

I have an application that occasionally performs large batch inserts of user
hand-generated data. Input is a tab delimited file with typically hundreds
to a thousand lines of data.

Because the data is generated by hand, there are always many
transaction-stopping errors in a typical input run. For example, missing
datum in a NOT NULL column, a duplicate value in a UNIQUE column, data type
mismatch, FOREIGN KEY reference to something non-existing, etc. Of course,
we chose PostgreSQL exactly because of these problems, because of the robust
transactional control, rollback on errors, etc.

My question is the following. I would like to *test* the data input for
integrity in such a way that I can create a report to the user informing
them of exactly where in their input file to correct the problems. 

IDEA 1: My first attempt at this was to simply slurp the data into the
database, collect the errors, and then rollback. Of course (as I now know),
this doesn't work because after the first problem, the database reports,
"current transaction is aborted, commands ignored until end of transaction
block". This means that I can only report to the user the location of the
first problem, and then they run the data again, and keep looping through
the process until the data is good, a huge waste of time.

IDEA 2: My second idea on how to do this was to ROLLBACK after each INSERT.
This allows me to check for things like NOT NULL and data type issues, but
not violations of UNIQUE within the new data.

IDEA 3: Write my own data pre-conditioner. Ugh, what a nightmare, I feel
like I'm writing my own database! Checking for FKEY constraints, UNIQUE, etc
is not trivial. It seems ridiculous to do this when I have the *actual*
database available to test against!

Has anyone dealt with this kind of issue before? What are your opinions on
best practice for this? Of course I do not want to actually COMMIT until the
data is perfect!
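
A middle path between IDEA 1 and IDEA 2 is to keep everything in one
transaction but wrap each INSERT in its own subtransaction, so a failed row
is rolled back without poisoning the rest, while the good rows stay pending
(so duplicates within the batch still trip UNIQUE constraints). A minimal
sketch, assuming the file is first loaded into a staging table; "staging",
"target", and the column names are illustrative, not from the original post:

CREATE TEMP TABLE staging (line_no serial, col_a text, col_b text);
\copy staging (col_a, col_b) FROM 'input.tsv'

DO $$
DECLARE
    r staging%ROWTYPE;
BEGIN
    FOR r IN SELECT * FROM staging ORDER BY line_no LOOP
        BEGIN
            -- the target table's own constraints do the validation
            INSERT INTO target (col_a, col_b) VALUES (r.col_a, r.col_b);
        EXCEPTION WHEN OTHERS THEN
            -- each BEGIN/EXCEPTION block is an implicit savepoint:
            -- only this row is rolled back, earlier rows stay pending
            RAISE NOTICE 'line %: %', r.line_no, SQLERRM;
        END;
    END LOOP;
    -- abort deliberately so nothing is ever committed
    RAISE EXCEPTION 'validation run complete, rolling back';
END $$;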

Thanks for your time!
-Jon



--
View this message in context: 
http://postgresql.nabble.com/Using-the-database-to-validate-data-tp5859046.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.




Re: [GENERAL] Q: text query search and

2015-07-23 Thread Jeff Janes
On Thu, Jul 23, 2015 at 2:55 AM, amihay gonen  wrote:

> Hi, I'm trying to implement text search in PG.
>
> My goal is to enable the user to search on several columns, including on
> partial words.
>
> here is sample code :
> create table test_test( text_data tsvector, text_a varchar,text_b varchar);
>
> insert into test_test(text_a,text_b) select 'name 10.10.2.3 ','name3 name'
> from generate_series(1,500);
> update test_test set text_data=to_tsvector(text_a||' '||text_b);
> CREATE INDEX test_test_idx ON test_test USING gin(text_data);
>
> explain ANALYZE select * from test_test where text_data @@
> plainto_tsquery('name');
>
> my questions are:
> 1. why the index is not used (I guess it is related to the way the data is
> generated)
>

It returns the entire table, so there is no point in using an index.  Yes,
it is the way it is generated: the same data repeated over and over is not
very realistic.  If you just want random text, I use md5(random()::text).
But if you want text that looks vaguely like English, I don't have a nice
trick for that.  Maybe load the sgml files into a table.
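
For instance, a sketch reusing the test_test table from the original post:

insert into test_test (text_a, text_b)
select md5(random()::text), md5(random()::text)
from generate_series(1, 500);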


> 2. how can I use pg_trgm with tsvector to answer queries like
> 10.10 or nam?
>    the idea is to use the GIN index; maybe there are other options without
> using pg_trgm?
>

Do you mean:

WHERE text_a LIKE '%10.10%' or text_a LIKE '%nam%' ?

With the OR, that's going to be hard to optimize.

Anyway, pg_trgm requires its own special GIN index; it can't piggyback on
the tsvector GIN index.

CREATE INDEX whatever ON test_test USING gin(text_a gin_trgm_ops);

or

CREATE INDEX whatever ON test_test USING gin((text_a||' '||text_b)
gin_trgm_ops);

But, LIKE '%10.10%' is going to be tough for a pg_trgm index to help with,
unless you compile your own code after removing "#define KEEPONLYALNUM"

Cheers,

Jeff


Re: [GENERAL] Q: text query search and

2015-07-23 Thread Kevin Grittner
amihay gonen  wrote:

> explain ANALYZE select * from test_test
>   where text_data @@ plainto_tsquery('name');
>
> my questions are :
> 1. why the index is not used (I guess it is related to the way
>the data is generated)

Because the sequential scan has a lower cost based on available
statistics.  If I run all the example statements without any
delays, I actually see it using the index; it is only if I run
VACUUM ANALYZE (or give autovacuum enough time to do so) that I see
a sequential scan.  That's because it sees that it will need to
visit all the tuples in the heap anyway, so why bother also
visiting all the index tuples?  If you search for a word that's not
in the table, you'll see it use the index, too.
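
For example, a sketch against the test_test table from the original post
('nosuchword' is an arbitrary term that matches no rows):

VACUUM ANALYZE test_test;

-- every row matches 'name', so a sequential scan is cheapest:
EXPLAIN ANALYZE SELECT * FROM test_test
 WHERE text_data @@ plainto_tsquery('name');

-- a term with no matches makes the GIN index attractive again:
EXPLAIN ANALYZE SELECT * FROM test_test
 WHERE text_data @@ plainto_tsquery('nosuchword');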

> 2. how can I use pg_trgm with tsvector to answer queries
>    like 10.10 or nam?  the idea is to use the GIN index; maybe
>    there are other options without using pg_trgm?

The easy way to do that is just to add a trigram index and search
for similar strings, and forget about full text search.  If you
want to use full text search for this you may need to use special
dictionaries or parser code.  (I used both for a system to search
court document text, and it seemed to work well.)  The facilities
for custom full text search parsers seem pretty bad; I found what I
needed using regular expressions and casting to the appropriate ts
types.
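
A minimal sketch of that trigram route, assuming the pg_trgm extension is
available (the index name is illustrative):

CREATE EXTENSION IF NOT EXISTS pg_trgm;

CREATE INDEX test_test_trgm_idx ON test_test
    USING gin ((text_a || ' ' || text_b) gin_trgm_ops);

-- partial words can now use the index:
SELECT * FROM test_test
 WHERE (text_a || ' ' || text_b) ILIKE '%nam%';

-- or rank by trigram similarity:
SELECT *, similarity(text_a || ' ' || text_b, 'name') AS sim
  FROM test_test
 WHERE (text_a || ' ' || text_b) % 'name'
 ORDER BY sim DESC;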

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [GENERAL] plv8 package in PostgreSQL 9.4 Yum repositories

2015-07-23 Thread Devrim GÜNDÜZ


Hi,

On Thu, 2015-07-23 at 13:09 +0200, Jens Depuydt wrote:
> Is there a specific reason why plv8 doesn't get distributed anymore 
> with the Yum repositories for PostgreSQL 9.4?

I was not aware that 1.4.4 had been released and that it can be built cleanly.
Sorry about that.

> In the list of available/supported packages, plv8 is mentioned on
> http://yum.postgresql.org/news-packagelist.php and it was available 
> up to PostgreSQL 9.3.

Pushed 1.4.4 to 9.4 and 9.5 repositories.

Regards,

-- 
Devrim GÜNDÜZ
Principal Systems Engineer @ EnterpriseDB: http://www.enterprisedb.com
PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
Twitter: @DevrimGunduz , @DevrimGunduzTR






Re: [GENERAL] Delete rule does not prevent truncate

2015-07-23 Thread Andrew Sullivan
On Thu, Jul 23, 2015 at 12:57:20PM +0100, Tim Smith wrote:
> It is important to realize that a rule is really a command transformation
> mechanism, or command macro. The transformation happens before the
> execution of the command starts. If you actually want an operation that
> fires independently for each physical row, you probably want to use a
> trigger, not a rule

Well, yes, but the discussion of the rules system in earlier manuals
was actually, I thought, somewhat more detailed, and it outlined what
rules really did, which was to alter the command at the parse-tree level.
That's what I think the above is saying also, but it may not be quite
as plain.  So it's rather more like a statement-level trigger.  

> Thus, I should not have to use a trigger for TRUNCATE, because the "each
> row" concept does not apply. Plus, it makes perfect sense to want to
> transform the TRUNCATE command into a no-op.

Well, yes, but really in this case you want a per-statement trigger,
and there's not the same distinction in rules, either.

I can't believe that people would reject a patch (though you should
ask on -hackers, not here); but you asked what was behind the design
decision and I told you.  But in general, the experience seems to be
that triggers are easier to get right (novice or no, _pace_ section
38.7).

Best regards,

A

-- 
Andrew Sullivan
a...@crankycanuck.ca




Re: [GENERAL] Lots of stuck queries after upgrade to 9.4

2015-07-23 Thread Heikki Linnakangas

On 07/23/2015 11:31 AM, Spiros Ioannou wrote:

> Well, so far with commit_delay=0, no problems. I will report back of course
> if something happens, but I believe that the problem may indeed be
> solved/masked with that setting.
>
> Rough description of our setup, or how to reproduce:
> * Time-series data in a table, say "measurements"; size: 3-4 TB, about 1000
> inserts/second
> * table measurements also has a trigger on insert to also insert into
> measurements_a (for daily export purposes)
> Just the above would cause a stuck query after a few days.
>
> Now for exporting we run the following CTE query (measurements_b is an
> empty table, measurements_a has about 5GB)
> * WITH d_rows AS (DELETE FROM measurement_events_a RETURNING * )  INSERT
> INTO measurement_events_b SELECT * FROM d_rows;
> The above caused the problem to appear every time, after 10-20 minutes.


I was able to reproduce something like this with pgbench, by running a 
custom little module that calls WaitXLogInsertionsToFinish() in a
tight loop, and checks that the value it returns moves monotonically 
forward. With commit_delay on, once every minute or so, it moves backwards.


I'll investigate why that happens...

- Heikki





[GENERAL] plv8 package in PostgreSQL 9.4 Yum repositories

2015-07-23 Thread Jens Depuydt
Hi,

Is there a specific reason why plv8 doesn't get distributed anymore with
the Yum repositories for PostgreSQL 9.4?
In the list of available/supported packages, plv8 is mentioned on
http://yum.postgresql.org/news-packagelist.php and it was available up to
PostgreSQL 9.3.

As far as we see, recent sources of plv8 work fine with 9.4, but it's quite
some trouble to maintain and build every recent version, with the risk of
compatibility problems, when it's not available in the repositories.

Thanks for the info.

Jens Depuydt



Re: [GENERAL] Delete rule does not prevent truncate

2015-07-23 Thread Andrew Sullivan
On Thu, Jul 23, 2015 at 08:06:19AM +0100, Tim Smith wrote:
> What exactly was the design decision that led to TRUNCATE being
> supported by triggers but not by rules?

There are two things.  First, probably the design decision was, "I
care about triggers."  TRUNCATE was added (I believe) in version 7.0,
and even then there was some caution indicated about the use of rules.
See for instance
http://www.postgresql.org/docs/7.0/static/rules19784.htm.  So you
might be partly right.

But second, it isn't clear what it would mean for TRUNCATE to be
supported by rules.  Rules do query parse tree rewriting.  That is,
they rewrite the query on the way through the system before they can
possibly have any effect, changing one SQL statement into
(effectively) a different one by the time it executes.  There is only
one possible effect from TRUNCATE, and that is to eliminate all the
data in the table.  I don't know what rewriting such a query would
mean.
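
For comparison, the trigger route is short; a sketch (table and function
names are illustrative):

CREATE FUNCTION forbid_truncate() RETURNS trigger AS $$
BEGIN
    RAISE EXCEPTION 'TRUNCATE is not allowed on %', TG_TABLE_NAME;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER no_truncate
    BEFORE TRUNCATE ON mytable
    FOR EACH STATEMENT
    EXECUTE PROCEDURE forbid_truncate();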

Best regards,

A

-- 
Andrew Sullivan
a...@crankycanuck.ca




[GENERAL] Q: text query search and

2015-07-23 Thread amihay gonen
Hi, I'm trying to implement text search in PG.

My goal is to enable the user to search on several columns, including on
partial words.

here is sample code:
create table test_test( text_data tsvector, text_a varchar,text_b varchar);

insert into test_test(text_a,text_b) select 'name 10.10.2.3 ','name3 name'
from generate_series(1,500);
update test_test set text_data=to_tsvector(text_a||' '||text_b);
CREATE INDEX test_test_idx ON test_test USING gin(text_data);

explain ANALYZE select * from test_test where text_data @@
plainto_tsquery('name');

my questions are:
1. why the index is not used (I guess it is related to the way the data is
generated)
2. how can I use pg_trgm with tsvector to answer queries like
10.10 or nam?
   the idea is to use the GIN index; maybe there are other options without
using pg_trgm?

thanks
amihay


Re: [GENERAL] Lots of stuck queries after upgrade to 9.4

2015-07-23 Thread Spiros Ioannou
Well, so far with commit_delay=0, no problems. I will report back of course
if something happens, but I believe that the problem may indeed be
solved/masked with that setting.
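
For anyone following along, a sketch of applying that cluster-wide on 9.4,
assuming superuser access (ALTER SYSTEM writes postgresql.auto.conf, and the
reload applies it):

ALTER SYSTEM SET commit_delay = 0;
SELECT pg_reload_conf();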

Rough description of our setup, or how to reproduce:
* Time-series data in a table, say "measurements"; size: 3-4 TB, about 1000
inserts/second
* table measurements also has a trigger on insert to also insert into
measurements_a (for daily export purposes)
Just the above would cause a stuck query after a few days.

Now for exporting we run the following CTE query (measurements_b is an
empty table, measurements_a has about 5GB)
* WITH d_rows AS (DELETE FROM measurement_events_a RETURNING * )  INSERT
INTO measurement_events_b SELECT * FROM d_rows;
The above caused the problem to appear every time, after 10-20 minutes.

Regards,
-Spiros

Spiros Ioannou
IT Manager, inAccess | www.inaccess.com
M: +30 6973-903808 | T: +30 210-6802-358

On 20 July 2015 at 17:02, Andres Freund  wrote:

> On 2015-07-20 17:00:52 +0300, Spiros Ioannou wrote:
> > FYI we have a 9.3.5 with commit_delay = 4000 and commit_siblings = 5
> with
> > a 8TB dataset which seems fine. (Runs on different - faster hardware
> > though).
>
> 9.4 has a different xlog insertion algorithm (scaling much better), so
> that unfortunately doesn't say very much...
>


Re: [GENERAL] Delete rule does not prevent truncate

2015-07-23 Thread Guillaume Lelarge
2015-07-23 9:06 GMT+02:00 Tim Smith :

> So tell me guys, instead of bashing away at the fact I only quoted
> half a sentence or whatever, how about you answer the following:
>
> What exactly was the design decision that led to TRUNCATE being
> supported by triggers but not by rules?
>
>
Someone had time to implement it for triggers; no-one had time for rules.


> I suspect that TRUNCATE was added to triggers because some dev thought
> it would be a neat idea, and it was never implemented in rules as a
> result of an accidental omission rather than a deliberate design
> constraint.
>
>
It is a neat idea for triggers. Slony uses that to replicate TRUNCATE on
slaves of a Slony cluster.

It wouldn't be such a neat idea for rules as, IIRC, rules are only
supported because views are based on them. Without that, they would
probably be ripped out of the code.


-- 
Guillaume.
  http://blog.guillaume.lelarge.info
  http://www.dalibo.com


Re: [GENERAL] Delete rule does not prevent truncate

2015-07-23 Thread Tim Smith
So tell me guys, instead of bashing away at the fact I only quoted
half a sentence or whatever, how about you answer the following:

What exactly was the design decision that led to TRUNCATE being
supported by triggers but not by rules?

I suspect that TRUNCATE was added to triggers because some dev thought
it would be a neat idea, and it was never implemented in rules as a
result of an accidental omission rather than a deliberate design
constraint.

