date:20040907

[BUGS] BUG #1242: Major bug in pgSQL

2004-09-07 Thread PostgreSQL Bugs List


The following bug has been logged online:

Bug reference:  1242
Logged by:  

Email address:  [EMAIL PROTECTED]

PostgreSQL version: 7.4.3

Operating system:   Linux Debian 3.1

Description:        Major bug in pgSQL

Details: 

Hi,

apparently we have found a critical bug in pgSQL. At the present moment we 
can not reproduce it, but here is the description: 

Foreword: we have a very heavily loaded pgSQL-based application with thousands 
of simultaneous users and 20 high-performance servers overall. Another 
thing worth mentioning: we have very large transaction blocks, with 
several hundred SQL statements in each block. 

Problem: in some cases we experience the following problem - we have found 
in the database some _absolutely_ identical rows, despite the fact, that we 
have defined some unique (!) indexes on some of the fields and even primary 
(!) keys, we can see, that the rows are _exactly_ the same. In some cases we 
have seen up to 7 absolutely identical rows, with the same primary keys and 
the same unique indexed fields. 

In our eyes this problem is absolutely critical. We are even considering 
switching to another DBMS right now :(, even though we have always been 
very satisfied with pgSQL in the past...  

We look forward to hearing whether there are any known solutions for this 
kind of problem. Thank you very much! 


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

[BUGS] BUG #1243: Driver problem

2004-09-07 Thread PostgreSQL Bugs List


The following bug has been logged online:

Bug reference:  1243
Logged by:  Hanna Tapani

Email address:  [EMAIL PROTECTED]

PostgreSQL version: 7.3.2

Operating system:   Solaris

Description:        Driver problem

Details: 

SQLException: Something unusual has occured to cause the driver to fail.
Please report this exception: Exception: java.sql.SQLException: ERROR:
Conversion between UNICODE and LATIN1 is not supported
Stack Trace:

java.sql.SQLException: ERROR:  Conversion between UNICODE and LATIN1 is not supported
at org.postgresql.core.QueryExecutor.executeV2(QueryExecutor.java:288)
at org.postgresql.core.QueryExecutor.execute(QueryExecutor.java:104)
at org.postgresql.core.QueryExecutor.execute(QueryExecutor.java:43)
at org.postgresql.jdbc1.AbstractJdbc1Connection.execSQL(AbstractJdbc1Connection.java:887)
at org.postgresql.jdbc1.AbstractJdbc1Connection.openConnectionV2(AbstractJdbc1Connection.java:816)
at org.postgresql.jdbc1.AbstractJdbc1Connection.openConnectionV3(AbstractJdbc1Connection.java:334)
at org.postgresql.jdbc1.AbstractJdbc1Connection.openConnection(AbstractJdbc1Connection.java:214)
at org.postgresql.Driver.connect(Driver.java:139)
at java.sql.DriverManager.getConnection(DriverManager.java:512)
at java.sql.DriverManager.getConnection(DriverManager.java:171)
at Insert.dbInsert(Insert.java:20)
at Data$1$Dtextlistener1.keyTyped(Data.java:68)
at java.awt.Component.processKeyEvent(Component.java:4977)
at java.awt.Component.processEvent(Component.java:4831)
at java.awt.TextComponent.processEvent(TextComponent.java:624)
at java.awt.TextField.processEvent(TextField.java:544)
at java.awt.Component.dispatchEventImpl(Component.java:3527)
at java.awt.Component.dispatchEvent(Component.java:3368)
at java.awt.KeyboardFocusManager.redispatchEvent(KeyboardFocusManager.java:1700)
at java.awt.DefaultKeyboardFocusManager.dispatchKeyEvent(DefaultKeyboardFocusManager.java:568)
at java.awt.DefaultKeyboardFocusManager.preDispatchKeyEvent(DefaultKeyboardFocusManager.java:740)
at java.awt.DefaultKeyboardFocusManager.typeAheadAssertions(DefaultKeyboardFocusManager.java:673)
at java.awt.DefaultKeyboardFocusManager.dispatchEvent(DefaultKeyboardFocusManager.java:534)
at java.awt.Component.dispatchEventImpl(Component.java:3397)
at java.awt.Component.dispatchEvent(Component.java:3368)
at java.awt.EventQueue.dispatchEvent(EventQueue.java:445)
at java.awt.EventDispatchThread.pumpOneEventForHierarchy(EventDispatchThread.java:191)
at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:144)
at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:138)
at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:130)
at java.awt.EventDispatchThread.run(EventDispatchThread.java:98)
End of Stack Trace




Re: [BUGS] BUG #1242: Major bug in pgSQL

2004-09-07 Thread Gaetano Mendola

PostgreSQL Bugs List wrote:

> Problem: in some cases we experience the following problem - we have found 
> in the database some _absolutely_ identical rows, despite the fact, that we 
> have defined some unique (!) indexes on some of the fields and even primary 
> (!) keys, we can see, that the rows are _exactly_ the same. In some cases we 
> have seen up to 7 absolutely identical rows, with the same primary keys and 
> the same unique indexed fields. 

I had the same experience with the 7.3 release, and I concluded that it is due to
some interaction between vacuum, reindex and update on the same table. See
these posts:

http://archives.postgresql.org/pgsql-bugs/2003-05/msg00060.php
http://www.mail-archive.com/[EMAIL PROTECTED]/msg09025.html
http://archives.postgresql.org/pgsql-admin/2003-04/msg00407.php
http://archives.postgresql.org/pgsql-bugs/2003-11/msg00129.php

Unfortunately I was never able to reproduce it.

When you are experiencing this, show us the result of this query:

select cmax, cmin, xmax, xmin, * from <table> where <condition>;

where <condition> is a filter that returns the rows with the
duplicated primary key.

However, I'm sure that you don't have two rows with a duplicated primary
key but two versions of the same row; the result, however, is the same.

Are you reindexing your tables regularly?

Regards
Gaetano Mendola



Re: [BUGS] BUG #1242: Major bug in pgSQL

2004-09-07 Thread Tom Lane

Gaetano Mendola <[EMAIL PROTECTED]> writes:
> When you are experiencing this show us the result of this query:
> select cmax, cmin, xmax, xmin, * from <table> where <condition>;

Also, please, the ctid and oid columns (but leave out oid if you made
the table WITHOUT OIDS).

Also, if the condition is one that will normally use an index, try
the same query with and without "set enable_indexscan = off".  It
could be that a corrupted index would cause the query to visit the
same rows multiple times (or miss rows!).

It might be a good idea to REINDEX the primary-key index on the table,
but I would counsel not doing so until we have more data on what's
happening.  If the problem is index corruption then REINDEX would
destroy all the evidence ...

regards, tom lane


Re: [PATCHES] [pgsql-hackers-win32] [BUGS] Win32 deadlock detection not working for Postgres8beta1

2004-09-07 Thread Tom Lane

"Magnus Hagander" <[EMAIL PROTECTED]> writes:
>> How does this fix that case?

> It doesn't. This is why the second version of the patch was required,
> per http://archives.postgresql.org/pgsql-patches/2004-09/msg00039.php.

Okay, I've applied the right version of the patch now ;-)

regards, tom lane


[BUGS] BUG #1244: Postgres doesn't work on FAT

2004-09-07 Thread PostgreSQL Bugs List


The following bug has been logged online:

Bug reference:  1244
Logged by:  Fernando del Valle

Email address:  [EMAIL PROTECTED]

PostgreSQL version: 8.0 Beta

Operating system:   Windows XP

Description:        Postgres doesn't work on FAT

Details: 

Postgres installs fine on W2K + FAT32, but not on Windows XP + FAT32. The 
difference seems to be in CACLS.EXE, which is used by the installer to 
change the permissions of files. When used on a file in a FAT32 filesystem, 
in Windows 2000, CACLS.EXE returns ok (even though it does not perform the 
operation, as it can't). On XP, CACLS.EXE returns an error and the installer 
aborts. We have tested it with 7.5 devel and 8.0, and it fails with both of 
them. 

BTW, the link to the bug reporter (this page) is not working (it should 
point to /bugform.html and not to /bugs/bugs.php). 




Re: [BUGS] BUG #1244: Postgres doesn't work on FAT

2004-09-07 Thread Harald Armin Massa

Hello,

in the release notes there is a "FAT is not really supported" notice
for PostgreSQL.

I suggest the following workaround:

create a cacls.exe and put it first in the search path. That cacls.exe
should just return "OK" regardless of the input parameters.


Be warned that "FAT is not really supported" has good reasons.
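
A minimal sketch of such a stand-in, written in C. It assumes the installer only looks at the exit status of cacls.exe, which this thread does not confirm. The function name fake_cacls is hypothetical; a real stub would call it from main and be built as a Windows executable named cacls.exe.

```c
#include <stdio.h>

/*
 * Hypothetical stand-in for cacls.exe: ignores every argument, prints "OK"
 * and reports success. It does not change any file permissions.
 */
int fake_cacls(int argc, char **argv)
{
    (void)argc;   /* arguments are deliberately ignored */
    (void)argv;
    puts("OK");
    return 0;     /* exit status 0 means success */
}

/* In the real stub:
 *   int main(int argc, char **argv) { return fake_cacls(argc, argv); }
 */
```

This only gets past the installer's check; it does not make FAT32 a supported file system.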

Harald



On Tue,  7 Sep 2004 16:21:58 +0100 (BST), PostgreSQL Bugs List
<[EMAIL PROTECTED]> wrote:
> 
> The following bug has been logged online:
> 
> Bug reference:  1244
> Logged by:  Fernando del Valle
> 
> Email address:  [EMAIL PROTECTED] 
> 
> PostgreSQL version: 8.0 Beta
> 
> Operating system:   Windows XP
> 
> Description:Postgres doesn't work on FAT
> 
> Details:
> 
> Postgres installs fine on W2K + FAT32, but not on Windows XP + FAT32. The
> difference seems to be in CACLS.EXE, which is used by the installer to
> change the permissions of files. When used on a file in a FAT32 filesystem,
> in Windows 2000, CACLS.EXE returns ok (even though it does not perform the
> operation, as it can't). On XP, CACLS.EXE returns an error and the installer
> aborts. We have tested it with 7.5 devel and 8.0, and it fails with both of
> them.
> 
> BTW, the link to the bug reporter (this page) is not working (it should
> point to /bugform.html and not to /bugs/bugs.php).
> 



-- 
GHUM Harald Massa
Harald Armin Massa
Reinsburgstraße 202b
70197 Stuttgart
0173/9409607


Re: [BUGS] BUG #1244: Postgres doesn't work on FAT

2004-09-07 Thread Gaetano Mendola

Harald Armin Massa wrote:

> Hello,
>
> in the release notes there is a "FAT is not really supported" notice
> for PostgreSQL.
>
> I suggest the following workaround:
>
> create a cacls.exe and put it first in the search path. That cacls.exe
> should just return "OK" regardless of the input parameters.
>
> Be warned that "FAT is not really supported" has good reasons.

I think that we have to decide whether postgres is supposed to work on
FAT32 or not. If, as I expect, the answer is no, then we have to find
a reliable way to detect NTFS (also when a tablespace is created, BTW)
and refuse to start on FATXX, rather than relying on the failure of cacls.exe.

My 2 cents.

Regards
Gaetano Mendola



[BUGS] PosgreSQL is crashing with a signal 11 - Bug?

2004-09-07 Thread Rafael Martinez Guerrero

Hello

We have a problem with one of our central databases and we need your
help to find a solution.

--
* Description: 
--
"PostgreSQL is crashing with a signal 11 and we do not think it is a
hardware problem" :(

Since last week, we have been having a big problem with one of our PostgreSQL
installations. The database is not so big, but it is used intensively, with
different jobs running parallel transactions.

The first time the database crashed, we had been running 7.3.5 for a
long time without problems. Because of the signal 11, we thought it was a
problem with defective memory, so we changed the RAM in the server and restored
the database from the last backup. The memory was defective, and we thought
we had found the cause of our signal 11.

After some hours the database crashed again with the same error. We did
not take any chances and moved the database to a new server. This did
not help, and we got the same problem after some hours. We updated to
7.3.7 hoping for the best, but it did not help either.

Today it crashed again but this time we have logged more information
from the crash and we hope you can help us to find a solution to this
problem. 

Below, some relevant information about our system. Please do not
hesitate to ask for more information if you need it.


* OS/Machine/Filesystem: 

- Red Hat Enterprise Linux WS release 3 (Taroon Update 3)
  kernel 2.4.21-15.0.3.ELsmp

- Dell 2650: 
  2 x Intel(R)Xeon(TM)CPU 2.40GHz
  2GB RAM
  PERCRAID Mirror 2 x 73GB

- LVM - ext3

-
* Version / compiler / libc
-
- PostgreSQL 7.3.7

- Options given to 'configure' script when PostgreSQL was built:
'--prefix=/local/opt/postgresql' '--mandir=/local/share/man'
'--with-openssl=/local' '--with-perl' '--with-java' 'CC=cc-wrapper'
'LDFLAGS=-L/local/lib'

- gcc version 3.2.3 20030502 (Red Hat Linux 3.2.3-42)
- glibc-2.3.2-95.27 (RHEL3)

--
* Configuration / DBSize
--

- Relevant parameters that have been changed by us:

postgresql.conf:
max_connections = 600
superuser_reserved_connections = 2

shared_buffers = 8192
max_fsm_relations = 1000
max_fsm_pages = 2
wal_buffers = 64 

sort_mem = 2048
vacuum_mem = 32768 
fsync = true

effective_cache_size = 131072 
random_page_cost = 2

stats_start_collector = true
stats_command_string = true
stats_row_level = true
stats_block_level = true

autocommit=false


/etc/sysctl.conf:
kernel.shmall = 134217728
kernel.shmmax = 134217728


-bash-2.05b# du -h
4.2M    ./base/1
3.6M    ./base/16975
4.0K    ./base/63684339/pgsql_tmp
2.3G    ./base/63684339
2.3G    ./base
168K    ./global
129M    ./pg_xlog
8.6M    ./pg_clog
2.4G    .

---
* Information from CORE dump we got without --enable-debug.
---

This GDB was configured as "i386-redhat-linux-gnu"...(no debugging
symbols found)...Using host libthread_db library
"/lib/tls/libthread_db.so.1".

Program terminated with signal 11, Segmentation fault.

[..]
(gdb) bt
#0  0xb734d07c in memcpy () from /lib/tls/libc.so.6
#1  0x0806bba8 in DataFill ()
#2  0x0806c3ee in heap_formtuple ()
#3  0x080d1af1 in ExecTargetList ()
#4  0x080d1cdb in ExecProject ()
#5  0x080d1d7d in ExecScan ()
#6  0x080d5b5e in ExecIndexScan ()
#7  0x080cfd91 in ExecProcNode ()
#8  0x082fcd08 in ?? ()
#9  0x in ?? ()
#10 0x082ff120 in ?? ()
#11 0x in ?? ()
#12 0x082f9a78 in ?? ()
#13 0xbfff8028 in ?? ()
#14 0x080d6e5a in ExecMergeJoin ()
Previous frame inner to this frame (corrupt stack?)
-

---
* Information from CORE dump we got with --enable-debug. We have
compiled a new version of postgres and run it through gdb with the core
dump we had/got from postgres without --enable-debug. 
---

#0  0xb734d07c in memcpy () from /lib/tls/libc.so.6

#1  0x0806bba8 in DataFill (data=0xb7489000 <Address 0xb7489000 out of bounds>, tupleDesc=0x82fd554, value=0x82fd550, nulls=0xbfff7ec0 "  n  ",
infomask=0x836e904c, bit=0x836e904f "\003\f") at heaptuple.c:139

#2  0x0806c3ee in heap_formtuple (tupleDescriptor=0x82fd620,
value=0x82fd550, nulls=0xbfff7ec0 "  n  ") at heaptuple.c:623

#3  0x080d1af1 in ExecTargetList (targetlist=0x82fa250, nodomains=5,
targettype=0x82fd620, values=0x82fd550, econtext=0x82fd4e0,
isDone=0xbfff7f68) at execQual.c:2230

#4  0x080d1cdb in ExecScan (node=0x82fd528, accessMtd=0xbfff7f68) at
execScan.c:49

#5  0x080d1d7d in ExecScan (node=0x82fa140, accessMtd=0x80d58d4
<IndexNext>) at execScan.c:146

#6  0x080d5b5e in ExecIndexReScan (node=0x82fa140, exprCtxt=0xb72117d,
parent=0x0) at nodeIndexscan.c:284

#7  0x080d58d4 in IndexNext (node=0x0) at nodeIndexscan.c:87
Previous frame inner to this frame (corrupt stack?)


* Re

Re: [BUGS] PosgreSQL is crashing with a signal 11 - Bug?

2004-09-07 Thread Tom Lane

Rafael Martinez Guerrero <[EMAIL PROTECTED]> writes:
> * Information from CORE dump we got with --enable-debug. We have
> compiled a new version of postgres and run it through gdb with the core
> dump we had/got from postgres without --enable-debug.

Okay, theoretically that works, but it might be smarter to install the
debug build and get a fresh core dump that definitely corresponds to it.

> #0  0xb734d07c in memcpy () from /lib/tls/libc.so.6

> #1  0x0806bba8 in DataFill (data=0xb7489000 <Address 0xb7489000 out of bounds>, tupleDesc=0x82fd554, value=0x82fd550, nulls=0xbfff7ec0 "  n  ",
> infomask=0x836e904c, bit=0x836e904f "\003\f") at heaptuple.c:139

If accurate, that says it's crashing here:

/* fixed-length pass-by-reference */
Assert(att[i]->attlen > 0);
data_length = att[i]->attlen;
--> memcpy(data, DatumGetPointer(value[i]), data_length);

which suggests either that att[i]->attlen is corrupt, or that the
computed length for the preceding column was wacko (leading to the
data pointer being moved to a silly address), or that the provided
value[i] is wrong.  In the context at hand none of these seem especially
likely, but one of them must be the case.  Can you look with gdb to
see what the value of i is, and print out the contents of the *(att[i])
struct?  Also look at "data" and "value[i]" to see if they are sensible
pointers or not.
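
To illustrate why a bad length matters here, the following is a simplified sketch in C; it is not the PostgreSQL source. SketchAttr and sketch_fill are hypothetical names, only fixed-length columns are modelled, and the bounds check is added purely for illustration. The snippet quoted above copies with no such check, so a corrupt attlen, a misplaced data pointer or a bad value[i] goes straight into memcpy.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified column descriptor: only the declared length. */
typedef struct {
    int attlen;
} SketchAttr;

/*
 * Copy fixed-length column values back to back into `out`, advancing the
 * write offset by each column's declared length.
 * Returns the number of bytes written, or (size_t)-1 if a length is not
 * positive or would run past the end of `out`.
 */
size_t sketch_fill(char *out, size_t out_size,
                   const SketchAttr *att, const void *const *value, int natts)
{
    size_t off = 0;

    for (int i = 0; i < natts; i++) {
        if (att[i].attlen <= 0)
            return (size_t)-1;              /* corrupt or variable-length */

        size_t len = (size_t)att[i].attlen;
        if (len > out_size - off)
            return (size_t)-1;              /* would overrun the buffer */

        memcpy(out + off, value[i], len);
        off += len;                         /* next column starts here */
    }
    return off;
}
```

Because every column's start depends on the lengths before it, one wrong length shifts all later copies, which is the second of the three possibilities listed above.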

How reproducible is the crash --- does it happen every time you execute
this particular FETCH?

regards, tom lane


Re: [BUGS] PosgreSQL is crashing with a signal 11 - Bug?

2004-09-07 Thread Rafael Martinez

On Tue, 2004-09-07 at 19:58, Tom Lane wrote:

> Rafael Martinez Guerrero <[EMAIL PROTECTED]> writes:
> > * Information from CORE dump we got with --enable-debug. We have
> > compiled a new version of postgres and run it through gdb with the core
> > dump we had/got from postgres without --enable-debug.
> 
> Okay, theoretically that works, but it might be smarter to install the
> debug build and get a fresh core dump that definitely corresponds to it.
> 

It is late in Norway and we need to sleep; we will try this tomorrow
morning.


> > #0  0xb734d07c in memcpy () from /lib/tls/libc.so.6
> 
> > #1  0x0806bba8 in DataFill (data=0xb7489000 <Address 0xb7489000 out of bounds>, tupleDesc=0x82fd554, value=0x82fd550, nulls=0xbfff7ec0 "  n  ",
> > infomask=0x836e904c, bit=0x836e904f "\003\f") at heaptuple.c:139
> 
> If accurate, that says it's crashing here:
> 
> /* fixed-length pass-by-reference */
> Assert(att[i]->attlen > 0);
> data_length = att[i]->attlen;
> --> memcpy(data, DatumGetPointer(value[i]), data_length);
> 
> which suggests either that att[i]->attlen is corrupt, or that the
> computed length for the preceding column was wacko (leading to the
> data pointer being moved to a silly address), or that the provided
> value[i] is wrong.  In the context at hand none of these seem especially
> likely, but one of them must be the case.  Can you look with gdb to
> see what the value of i is, and print out the contents of the *(att[i])
> struct?  Also look at "data" and "value[i]" to see if they are sensible
> pointers or not.
> 

I got this from one of our developers (from the core dump generated by
7.3.7 without --enable-debug):
--
(gdb) inspect i
$1 = 1

(gdb) inspect att[i]
$2 = 0x82fd6e8

(gdb) inspect *att[i]
$3 = {attrelid = 0, attname = {data = '\0' ,
alignmentDummy = 0}, atttypid = 1700, attstattarget = -1, attlen = -1,
attnum = 2, attndims = 0, attcacheoff = -1, atttypmod = 393220, attbyval
= 0 '\0', attstorage = 109 'm', attisset = 0 '\0', attalign = 105 'i',
attnotnull = 0 '\0', atthasdef = 0 '\0', attisdropped = 0 '\0',
attislocal = 1 '\001', attinhcount = 0}

(gdb) inspect data
$4 = 0xb7489000 

(gdb) inspect value[i]
$5 = 3054556648


> How reproducible is the crash --- does it happen every time you execute
> this particular FETCH?
> 

We are not sure about this. We did not log as much as we should have in the
beginning. One thing is certain: the last time, it happened after this
FETCH. We have full logging on now, and we will be able to find out more
about this if/when it crashes again.



>   regards, tom lane


Thanks for your help. I hope we will be able to figure this out; right
now it is a big crisis for us.

-- 
 Rafael Martinez, <[EMAIL PROTECTED]>
 Center for Information Technology Services
 University of Oslo, Norway




Re: [BUGS] PosgreSQL is crashing with a signal 11 - Bug?

2004-09-07 Thread Rafael Martinez

On Tue, 2004-09-07 at 23:36, Tom Lane wrote:
> Rafael Martinez <[EMAIL PROTECTED]> writes:
> > I got this from one of our developers (from the core dump generated by
> > 7.3.7 without --enable-debug):
> 
> > (gdb) inspect *att[i]
> > $3 = {attrelid = 0, attname = {data = '\0' ,
> > alignmentDummy = 0}, atttypid = 1700, attstattarget = -1, attlen = -1,
> > attnum = 2, attndims = 0, attcacheoff = -1, atttypmod = 393220, attbyval
> > = 0 '\0', attstorage = 109 'm', attisset = 0 '\0', attalign = 105 'i',
> > attnotnull = 0 '\0', atthasdef = 0 '\0', attisdropped = 0 '\0',
> > attislocal = 1 '\001', attinhcount = 0}
> 
> That looks reasonable ...
> 
> > (gdb) inspect data
> > $4 = 0xb7489000 
> 
> > (gdb) inspect value[i]
> > $5 = 3054556648
> 
> Hmm, what do you get from "x/10 3054556648" ?  Also, it'd be worth
> looking at the contents of *att[0] to see if that's also sensible,
> as well as value[0] and wherever that points (if it's a pointer).
> 
>   regards, tom lane

(gdb) x/10 3054556648
0xb610d5e8: 0x2f0c  0x0002  0x3017 0x020c6172
0xb610d5f8: 0x  0x  0x00ae 0x0006002b
0xb610d608: 0x2f1c0913  0x0404b70b

(gdb) inspect att[0]
$1 = 0x82fd660

(gdb) inspect *att[0]
$2 = {attrelid = 0, attname = {data = '\0' ,
alignmentDummy = 0}, atttypid = 1700, attstattarget = -1, attlen = -1,
attnum = 1, attndims = 0, attcacheoff = 0, atttypmod = 786436, attbyval
= 0 '\0', attstorage = 109 'm', attisset = 0 '\0', attalign = 105 'i',
attnotnull = 0 '\0', atthasdef = 0 '\0', attisdropped = 0 '\0',
attislocal = 1 '\001', attinhcount = 0}

(gdb) inspect value[0]
$3 = 3054556612

(gdb) inspect *value[0]
$4 = 12


-- 
 Rafael Martinez, <[EMAIL PROTECTED]>
 Center for Information Technology Services
 University of Oslo, Norway




Re: [BUGS] PosgreSQL is crashing with a signal 11 - Bug?

2004-09-07 Thread Tom Lane

Rafael Martinez <[EMAIL PROTECTED]> writes:
>> Hmm, what do you get from "x/10 3054556648" ?

> (gdb) x/10 3054556648
> 0xb610d5e8: 0x2f0c  0x0002  0x3017 0x020c6172
> 0xb610d5f8: 0x  0x  0x00ae 0x0006002b
> 0xb610d608: 0x2f1c0913  0x0404b70b

Well, that's certainly not a sensible first word for a numeric field;
the first word should be a length and this obviously isn't.

A reasonable theory at this point is that the data on disk for this
table have gotten corrupted, probably in the way of a bad length value
for whatever field(s) lie between the two that are being extracted here.
That could result in a miscomputed address for the next field, which
seems to be what we're looking at.

What I would suggest doing next is backtracking to find out which
physical tuple this is on which disk page, and then dumping that out
with pg_filedump (or your tool of choice) so that we can verify or
disprove the hypothesis of bad stored data.  If it is bad data, we'll
want to examine the whole page anyway to see if we can see any pattern
of corruption.

You should be able to find out the physical tuple involved by looking at
the "ecxt_scantuple" field of ExecTargetList's econtext parameter.  Its
"val" field should point to something like this:

(gdb) p *econtext->ecxt_scantuple->val
$3 = {t_len = 276, t_self = {ip_blkid = {bi_hi = 0, bi_lo = 0}, ip_posid = 1},
  t_tableOid = 863135, t_datamcxt = 0x0, t_data = 0xc2c0fc48}

t_tableOid is the source table OID, ip_blkid is the page number (divided
into high and low 16-bit halves for arcane reasons), and ip_posid is the
tuple number on that page.  You can also look at *t_data for additional
confirmation of what you are dealing with:

(gdb) p *econtext->ecxt_scantuple->val->t_data
$4 = {t_choice = {t_heap = {t_xmin = 42833, t_cmin = 0, t_xmax = 863136,
  t_field4 = {t_cmax = 0, t_xvac = 0}}, t_datum = {datum_len = 42833,
  datum_typmod = 0, datum_typeid = 863136}}, t_ctid = {ip_blkid = {
  bi_hi = 0, bi_lo = 0}, ip_posid = 1}, t_natts = 16, t_infomask = 2320,
  t_hoff = 32 ' ', t_bits = ""}

I'm using CVS tip to prepare this example, so the field layout is not
the same as what you'll see in 7.3, but there will be a t_ctid field
and it will probably have the same contents as what you saw in the
scantuple struct.
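
As a small illustration of the page-number arithmetic described above, the sketch below recombines the two 16-bit halves of ip_blkid and computes where that page starts in the table's file. The function names are hypothetical, and the 8192-byte page size used in the usage note is an assumption (the default block size), so check your own build. Large tables are split across several segment files, which this sketch ignores.

```c
#include <stdint.h>

/* Recombine the high and low 16-bit halves into one 32-bit page number. */
uint32_t sketch_block_number(uint16_t bi_hi, uint16_t bi_lo)
{
    return ((uint32_t)bi_hi << 16) | (uint32_t)bi_lo;
}

/* Byte offset of the start of page `block` in a file of fixed-size pages. */
uint64_t sketch_page_offset(uint32_t block, uint32_t block_size)
{
    return (uint64_t)block * block_size;
}
```

For the example tuple above (bi_hi = 0, bi_lo = 0) the page number is 0, so with 8192-byte pages the page starts at offset 0 of the file.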

Once you have the table OID, discover its file node number:

regression=# select relfilenode from pg_class where oid = 863135;
 relfilenode
-------------
      863135
(1 row)

(These will often be the same, but don't assume so without verifying.)
And look up your database OID:

regression=# select oid from pg_database where datname = 'mydb';

Now the file you want to look at is $PGDATA/base/dboid/relfilenode.
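
For anyone scripting this lookup, a minimal sketch in C of assembling that path from its three parts. The function name sketch_relation_path is hypothetical, and the caller is assumed to pass in the value of $PGDATA and the two numbers obtained from the queries above.

```c
#include <stdio.h>

/*
 * Build "<pgdata>/base/<db_oid>/<relfilenode>" into buf.
 * Returns 0 on success, -1 if the result does not fit in buf.
 */
int sketch_relation_path(char *buf, size_t buflen, const char *pgdata,
                         unsigned int db_oid, unsigned int relfilenode)
{
    int n = snprintf(buf, buflen, "%s/base/%u/%u", pgdata, db_oid, relfilenode);

    /* snprintf returns the length it wanted to write; reject truncation. */
    return (n >= 0 && (size_t)n < buflen) ? 0 : -1;
}
```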

If you are using pg_filedump (see http://sources.redhat.com/rhdb/)
then I'd recommend a command along the lines of 

pg_filedump -i -f -R pagenum $PGDATA/base/dboid/relfilenode

to dump the page in the most useful format.

We'll need to know the table schema ("\d tabname") also to interpret
what's in the dump.

regards, tom lane


Re: [BUGS] PosgreSQL is crashing with a signal 11 - Bug?

2004-09-07 Thread Tom Lane

Rafael Martinez <[EMAIL PROTECTED]> writes:
> I got this from one of our developers (from the core dump generated by
> 7.3.7 without --enable-debug):

> (gdb) inspect *att[i]
> $3 = {attrelid = 0, attname = {data = '\0' ,
> alignmentDummy = 0}, atttypid = 1700, attstattarget = -1, attlen = -1,
> attnum = 2, attndims = 0, attcacheoff = -1, atttypmod = 393220, attbyval
> = 0 '\0', attstorage = 109 'm', attisset = 0 '\0', attalign = 105 'i',
> attnotnull = 0 '\0', atthasdef = 0 '\0', attisdropped = 0 '\0',
> attislocal = 1 '\001', attinhcount = 0}

That looks reasonable ...

> (gdb) inspect data
> $4 = 0xb7489000 

> (gdb) inspect value[i]
> $5 = 3054556648

Hmm, what do you get from "x/10 3054556648" ?  Also, it'd be worth
looking at the contents of *att[0] to see if that's also sensible,
as well as value[0] and wherever that points (if it's a pointer).

regards, tom lane
