Re: [GENERAL] Lost rows/data corruption?

2005-02-28 Thread Andrew Hall
"Keith C. Perry" <[EMAIL PROTECTED]> To: "Andrew Hall" <[EMAIL PROTECTED]> Cc: "Alban Hertroys" <[EMAIL PROTECTED]>; "Marco Colombo" <[EMAIL PROTECTED]>; Sent: Saturday, February 26, 2005 6:02 AM Subject: Re: [GENERAL] Lost rows/data corr

Re: [GENERAL] Lost rows/data corruption?

2005-02-25 Thread Keith C. Perry
Quoting Andrew Hall <[EMAIL PROTECTED]>: > > Do you happen to have the same type disks in all these systems? That could > > > point to a disk cache "problem" (f.e. the disks lying about having written > > > data from the cache to disk). > > > > Or do you use the same disk parameters on all these

Re: [GENERAL] Lost rows/data corruption?

2005-02-17 Thread Scott Marlowe
IL PROTECTED]> > Cc: "Scott Marlowe" <[EMAIL PROTECTED]>; "Alban Hertroys" > <[EMAIL PROTECTED]>; "Marco Colombo" <[EMAIL PROTECTED]>; > > Sent: Friday, February 18, 2005 1:56 AM > Subject: Re: [GENERAL] Lost rows/data corruption? >

Re: [GENERAL] Lost rows/data corruption?

2005-02-17 Thread Andrew Hall
c.). - Original Message - From: "Michael Fuhr" <[EMAIL PROTECTED]> To: "Andrew Hall" <[EMAIL PROTECTED]> Cc: "Scott Marlowe" <[EMAIL PROTECTED]>; "Alban Hertroys" <[EMAIL PROTECTED]>; "Marco Colombo" <[EMAIL PROTEC

Re: [GENERAL] Lost rows/data corruption?

2005-02-17 Thread Tom Lane
"Andrew Hall" <[EMAIL PROTECTED]> writes: > I can't be sure. We have an automated maintenance process that reboots all > our customers machines every 10 days at 2am. Why? Sounds like a decision made by someone who is used to Windows. I've never seen any variant of Unix that needed that.

Re: [GENERAL] Lost rows/data corruption?

2005-02-17 Thread Michael Fuhr
On Thu, Feb 17, 2005 at 07:40:25PM +1100, Andrew Hall wrote: > > We have an automated maintenance process that reboots all our > customers machines every 10 days at 2am. What's the purpose of doing this? If it's necessary then the reboots aren't really fixing anything. Is whatever problem that

Re: [GENERAL] Lost rows/data corruption?

2005-02-17 Thread Andrew Hall
I was wondering if this problem had ever shown up on a machine that HADN'T lost power abruptly or not. IFF the only machines that experience corruption have lost power beforehand sometime, then I would look towards either the drives, controller or file system or somewhere in there. I can't be sure

Re: [GENERAL] Lost rows/data corruption?

2005-02-17 Thread Andrew Hall
Do you happen to have the same type disks in all these systems? That could point to a disk cache "problem" (f.e. the disks lying about having written data from the cache to disk). Or do you use the same disk parameters on all these machines? Have you tried using the disks w/o write caching and/

Re: [GENERAL] Lost rows/data corruption?

2005-02-17 Thread Andrew Hall
I know this is a silly question, but when you write 'We do nothing with any indexes' do you mean indices are never, _never_ touched (I mean explicitly, as in drop/create index), i.e. they are created at schema creation time and then left alone? Just to make sure... Hi and thanks for your feedback,

Re: [GENERAL] Lost rows/data corruption?

2005-02-16 Thread Andrew Hall
- Original Message - From: "Tom Lane" <[EMAIL PROTECTED]> To: "Andrew Hall" <[EMAIL PROTECTED]> Cc: Sent: Tuesday, February 15, 2005 3:25 PM Subject: Re: [GENERAL] Lost rows/data corruption? "Andrew Hall" <[EMAIL PROTECTED]> writes: Here

Re: [GENERAL] Lost rows/data corruption?

2005-02-16 Thread Marco Colombo
On Wed, 16 Feb 2005, Scott Marlowe wrote: I know there are write modes in ext3 that will allow corruption on power loss (I think it's writeback). I know little of XFS in a production environment, as I run ext3, warts and all. Yeah, but even in writeback mode, ext3 doesn't lie on fsync. No FS does.

Re: [GENERAL] Lost rows/data corruption?

2005-02-16 Thread Scott Marlowe
On Wed, 2005-02-16 at 07:14, Alban Hertroys wrote: > Marco Colombo wrote: > > On Wed, 16 Feb 2005, Andrew Hall wrote: > > > >> fsync is on for all these boxes. Our customers run their own hardware > >> with many different specifications of hardware in use. Many of our > >> customers don't have UP

Re: [GENERAL] Lost rows/data corruption?

2005-02-16 Thread Alban Hertroys
Marco Colombo wrote: On Wed, 16 Feb 2005, Andrew Hall wrote: fsync is on for all these boxes. Our customers run their own hardware with many different specifications of hardware in use. Many of our customers don't have UPS, although their power is probably pretty reliable (normal city-based utili

Re: [GENERAL] Lost rows/data corruption?

2005-02-16 Thread Marco Colombo
On Wed, 16 Feb 2005, Andrew Hall wrote: fsync is on for all these boxes. Our customers run their own hardware with many different specifications of hardware in use. Many of our customers don't have UPS, although their power is probably pretty reliable (normal city-based utilities), but of course

Re: [GENERAL] Lost rows/data corruption?

2005-02-15 Thread Andrew Hall
fsync is on for all these boxes. Our customers run their own hardware with many different specifications of hardware in use. Many of our customers don't have UPS, although their power is probably pretty reliable (normal city-based utilities), but of course I can't guarantee they don't get an outa

Re: [GENERAL] Lost rows/data corruption?

2005-02-15 Thread Marco Colombo
On Tue, 15 Feb 2005, Andrew Hall wrote: It sounds like a mess, all right. Do you have a procedure to follow to replicate this havoc? Are you sure there's not a hardware problem underlying it all? regards, tom lane We haven't been able to isolate what causes it but it's unlikely to be hardware as

Re: [GENERAL] Lost rows/data corruption?

2005-02-15 Thread Scott Marlowe
On Tue, 2005-02-15 at 04:56, Geoffrey wrote: > Tom Lane wrote: > > "Andrew Hall" <[EMAIL PROTECTED]> writes: > > > >> We haven't been able to isolate what causes it but it's unlikely to be > >> hardware as it happens on quite a few of our customer's boxes. > > > > > > Okay, then not hardware; bu

Re: [GENERAL] Lost rows/data corruption?

2005-02-15 Thread Geoffrey
Tom Lane wrote: "Andrew Hall" <[EMAIL PROTECTED]> writes: We haven't been able to isolate what causes it but it's unlikely to be hardware as it happens on quite a few of our customer's boxes. Okay, then not hardware; but it seems like you ought to be in a position to create a test case for other p

Re: [GENERAL] Lost rows/data corruption?

2005-02-14 Thread Tom Lane
"Andrew Hall" <[EMAIL PROTECTED]> writes: > We haven't been able to isolate what causes it but it's unlikely to be > hardware as it happens on quite a few of our customer's boxes. Okay, then not hardware; but it seems like you ought to be in a position to create a test case for other people to p

Re: [GENERAL] Lost rows/data corruption?

2005-02-14 Thread Andrew Hall
It sounds like a mess, all right. Do you have a procedure to follow to replicate this havoc? Are you sure there's not a hardware problem underlying it all? regards, tom lane We haven't been able to isolate what causes it but it's unlikely to be hardware as it happens on quite a few of our custom

Re: [GENERAL] Lost rows/data corruption?

2005-02-14 Thread Tom Lane
"Andrew Hall" <[EMAIL PROTECTED]> writes: > Here is the data you requested. It took little while to gather it as this > kind of corruption doesn't happen all the time. It sounds like a mess, all right. Do you have a procedure to follow to replicate this havoc? Are you sure there's not a hardwar

Re: [GENERAL] Lost rows/data corruption?

2005-02-14 Thread Andrew Hall
<[EMAIL PROTECTED]> To: "Andrew Hall" <[EMAIL PROTECTED]> Cc: Sent: Friday, February 04, 2005 10:12 AM Subject: Re: [GENERAL] Lost rows/data corruption? "Andrew Hall" <[EMAIL PROTECTED]> writes: We have a long running DB application using PG7.4.6. We do a

Re: [GENERAL] Lost rows/data corruption?

2005-02-03 Thread Tom Lane
"Andrew Hall" <[EMAIL PROTECTED]> writes: > We have a long running DB application using PG7.4.6. We do a VACUUM FULL > every night and a normal 'maintenance' VACUUM every hour. We do nothing with > any indexes. Every now and then we get errors from the database whereby an > update will fail on a ta

[GENERAL] Lost rows/data corruption?

2005-02-03 Thread Andrew Hall
Hello, We have a long running DB application using PG7.4.6. We do a VACUUM FULL every night and a normal 'maintenance' VACUUM every hour. We do nothing with any indexes. Every now and then we get errors from the database whereby an update will fail on a table saying that there is duplicate violatio