Re: [PATCHES] Re: [HACKERS] Re: Tuple-valued datums on Alpha (was Re: 7.1 on DEC/Alpha)

2000-12-28 Thread Oliver Elphick

Tom Lane wrote:
  "Oliver Elphick" [EMAIL PROTECTED] writes:
   FATAL 2:  Checkpoint lock is busy while data base is shutting down

   It's not just on Alpha; I've seen that on my i386 Linux system.

  FWIW, I do *not* see this behavior on HPUX.  It seems perfectly
  reproducible on the Debian Alpha box.  Is it reproducible on your
  i386 box, or only sometimes?


Hmm. I'm just waking up a bit more.  Now that I'm thinking slightly more
clearly, I realise I saw the problem yesterday when I was doing an Alpha
build on faure.debian.org; so I think it was actually on Alpha, not i386
after all.  Sorry for the red herring.

-- 
Oliver Elphick[EMAIL PROTECTED]
Isle of Wight  http://www.lfix.co.uk/oliver
PGP: 1024R/32B8FAA1: 97 EA 1D 47 72 3F 28 47  6B 7E 39 CC 56 E4 C1 47
GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839  932A 614D 4C34 3E1D 0C1C
 
 "For God shall bring every work into judgment, 
  with every secret thing, whether it be good, or  
  whether it be evil."   Ecclesiastes 12:14 





Re: [HACKERS] Re: Tuple-valued datums on Alpha (was Re: 7.1 on DEC/Alpha)

2000-12-27 Thread Brent Verner

On 26 Dec 2000 at 23:41 (-0500), Tom Lane wrote:
| Brent Verner [EMAIL PROTECTED] writes:
|  | Please apply it locally and let me know what you find.
| 
|  what I'm seeing now is much the same.
| 
| Drat.  More to do, then.

after hours in the gdb-hole, I see this... maybe a clue? :)

src/include/access/common/heaptuple.c:

    {

        /*
         * Fix me when going to a machine with more than a four-byte
         * word!
         */
        off = att_align(off, att[j]->attlen, att[j]->attalign);

        att[j]->attcacheoff = off;

        off = att_addlength(off, att[j]->attlen, tp + off);
    }

I'm pretty sure I don't know best how to fix this, but I've got some
randomly entered code compiling now :)  If it passes the regression 
tests I'll send it along.

  brent 'glad the coffee shop in the backyard is open now :)'




Re: [HACKERS] Re: Tuple-valued datums on Alpha (was Re: 7.1 on DEC/Alpha)

2000-12-27 Thread Tom Lane

Brent Verner [EMAIL PROTECTED] writes:
 after hours in the gdb-hole, I see this... maybe a clue? :)

I don't think that comment means anything.  Possibly it's a leftover
from a time when there was something unportable there.  But if att_align
were broken on Alphas, you'd have a lot worse problems than what you're
seeing.

regards, tom lane
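
For readers following the att_align question: a minimal sketch of what
rounding an offset up to an alignment boundary looks like in general.
This is not PostgreSQL's att_align macro; demo_align_up and
demo_second_field_offset are hypothetical names, and the sketch assumes
power-of-two alignments only. It illustrates why a fault in this kind of
arithmetic would tend to disturb every multi-column tuple rather than
only tuple-valued datums.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not PostgreSQL's att_align: round off up to the
 * next multiple of alignment, which must be a power of two.
 */
static size_t demo_align_up(size_t off, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (off + alignment - 1) & ~(alignment - 1);
}

/* Walk two fields: a 1-byte field followed by an 8-byte-aligned field. */
static size_t demo_second_field_offset(void)
{
    size_t off = 0;

    off = demo_align_up(off, 1);   /* first field starts at offset 0 */
    off += 1;                      /* step past its single byte */
    off = demo_align_up(off, 8);   /* pad up to the next 8-byte boundary */
    return off;
}
```

The arithmetic itself does not depend on the machine word size, only on
the alignment value passed in for each attribute.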



Re: [HACKERS] Re: Tuple-valued datums on Alpha (was Re: 7.1 on DEC/Alpha)

2000-12-27 Thread Tom Lane

Brent Verner [EMAIL PROTECTED] writes:
 these are the steps leading up to the assignment of the fscked
 fcache->fcinfo.arg[i] at execQual.c:603, which is what will eventually
 blow up ExecEvalFieldSelect.

That looks OK as far as it goes.  Inside ExecEvalVar, you need to look
at the tuple_type data structure in more detail, specifically
p *tuple_type->attrs[0]
p *tuple_type->attrs[1]
(I think the leading * is correct here, try omitting it if gdb gets
unhappy.)

 (gdb) print *variable
 $57 = {type = T_Var, varno = 65001, varattno = 1, vartype = 21220, 
   vartypmod = 8, varlevelsup = 0, varnoold = 1, varoattno = 0}

That part looks promising --- vartypmod is sizeof(Pointer) not -1,
so the front-end part of my patch seems to be working.  What I suspect
we'll find is that the tupledesc doesn't show sizeof the first field to
be 8 the way we want.  Which would imply that I missed a place (or
multiple places :-() that needs to know about the convention for typmod
of a tuple datatype.

regards, tom lane
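
To make the concern about the recorded field size concrete: a minimal
sketch, assuming a 64-bit platform where a pointer-sized value occupies
8 bytes. demo_copy_field is a hypothetical helper, not anything in the
PostgreSQL sources, and the thread does not establish that this is the
exact failure inside ExecEvalFieldSelect. It only shows that copying an
8-byte value with a recorded length of 4 yields a different value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: copy the first len bytes of an 8-byte value into
 * a zeroed 8-byte slot, the way a field would be copied if its recorded
 * length were len.
 */
static uint64_t demo_copy_field(uint64_t value, size_t len)
{
    uint64_t out = 0;

    assert(len <= sizeof out);
    memcpy(&out, &value, len);
    return out;
}

/* Returns 1 if copying with the given length preserves the value. */
static int demo_length_preserves(uint64_t value, size_t len)
{
    return demo_copy_field(value, len) == value;
}
```

With a value whose upper and lower 32-bit halves are both nonzero, a
length of 8 preserves it and a length of 4 does not, on either byte
order.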



Re: [PATCHES] Re: [HACKERS] Re: Tuple-valued datums on Alpha (was Re: 7.1 on DEC/Alpha)

2000-12-27 Thread Tom Lane

Brent Verner [EMAIL PROTECTED] writes:
 | Hm.  I thought I'd fixed that.  Are you up to date on
 | src/backend/utils/adt/oid.c ?  Current CVS has rev 1.42.

 yup. got that version -- 1.42 2000/12/22 21:36:09 tgl

You're right, it was still broken :-(.  I think I've got it now, though.

Oliver Elphick was kind enough to arrange access to an Alpha running
Debian Linux, and I find that current-as-of-this-moment sources pass
all regression tests in either serial or parallel test mode on that
system.  Curiously, however, the system fails when you try to shut
it down:

Smart Shutdown request at Thu Dec 28 02:41:49 2000
DEBUG:  shutting down
FATAL 2:  Checkpoint lock is busy while data base is shutting down
Shutdown failed - abort

I have no idea why this should be.  Evidently there's something wrong
with the TAS() macro --- yet it seems to work fine elsewhere.  Ideas
anyone?

regards, tom lane
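
For context on what a TAS() primitive is expected to do: a minimal
sketch using C11 atomics, not the per-platform assembly in the
PostgreSQL sources. All demo_ names are hypothetical. The sketch
assumes the shutdown path tests the lock once and gives up if it is
held, rather than spinning; the thread does not show that code, so
treat it as an illustration of how a test-and-set that wrongly reports
"already set" would surface as a "lock is busy" failure.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type for illustration only. */
typedef struct { atomic_flag held; } demo_spinlock;

/* Test-and-set: returns true if the lock was ALREADY held (busy). */
static bool demo_tas(demo_spinlock *l)
{
    return atomic_flag_test_and_set(&l->held);
}

static void demo_unlock(demo_spinlock *l)
{
    atomic_flag_clear(&l->held);
}

/* Shutdown-style use: try once; report failure instead of waiting. */
static bool demo_shutdown_checkpoint(demo_spinlock *l)
{
    if (demo_tas(l))
        return false;           /* busy: caller would report the error */
    /* ... checkpoint work would happen here ... */
    demo_unlock(l);
    return true;
}

/* Walkthrough: a free lock is acquired; a held lock reports busy. */
static void demo_walkthrough(void)
{
    demo_spinlock lock = { ATOMIC_FLAG_INIT };
    bool ok, busy;

    ok = demo_shutdown_checkpoint(&lock);   /* free: succeeds, releases */
    assert(ok);
    busy = demo_tas(&lock);                 /* another holder takes it */
    assert(!busy);
    ok = demo_shutdown_checkpoint(&lock);   /* held: shutdown gives up */
    assert(!ok);
    demo_unlock(&lock);
    ok = demo_shutdown_checkpoint(&lock);   /* free again: succeeds */
    assert(ok);
}
```

A primitive with this contract can work under ordinary contention, where
callers retry, and still fail in a path that treats one busy result as
fatal.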



Re: [PATCHES] Re: [HACKERS] Re: Tuple-valued datums on Alpha (was Re: 7.1 on DEC/Alpha)

2000-12-27 Thread Brent Verner

On 27 Dec 2000 at 21:45 (-0500), Tom Lane wrote:
| Brent Verner [EMAIL PROTECTED] writes:
|  | Hm.  I thought I'd fixed that.  Are you up to date on
|  | src/backend/utils/adt/oid.c ?  Current CVS has rev 1.42.
| 
|  yup. got that version -- 1.42 2000/12/22 21:36:09 tgl
| 
| You're right, it was still broken :-(.  I think I've got it now, though.

i'll check it tomorrow.

| Oliver Elphick was kind enough to arrange access to an Alpha running
| Debian Linux, and I find that current-as-of-this-moment sources pass
| all regression tests in either serial or parallel test mode on that
| system.  Curiously, however, the system fails when you try to shut
| it down:

good. I'm glad you guys linked up :)

| Smart Shutdown request at Thu Dec 28 02:41:49 2000
| DEBUG:  shutting down
| FATAL 2:  Checkpoint lock is busy while data base is shutting down
| Shutdown failed - abort

I'm not seeing this with my latest revision of the TAS() asm.

Smart Shutdown request at Wed Dec 27 19:25:45 2000
DEBUG:  shutting down
DEBUG:  MoveOfflineLogs: remove 
DEBUG:  database system is shut down

| I have no idea why this should be.  Evidently there's something wrong
| with the TAS() macro --- yet it seems to work fine elsewhere.  Ideas
| anyone?

re-evaluating the asm stuff now.

thanks.
  brent



Re: [PATCHES] Re: [HACKERS] Re: Tuple-valued datums on Alpha (was Re: 7.1 on DEC/Alpha)

2000-12-27 Thread Oliver Elphick

Tom Lane wrote:
...
  system.  Curiously, however, the system fails when you try to shut
  it down:
  
  Smart Shutdown request at Thu Dec 28 02:41:49 2000
  DEBUG:  shutting down
  FATAL 2:  Checkpoint lock is busy while data base is shutting down
  Shutdown failed - abort
  
  I have no idea why this should be.  Evidently there's something wrong
  with the TAS() macro --- yet it seems to work fine elsewhere.  Ideas
  anyone?
 
It's not just on Alpha; I've seen that on my i386 Linux system.






Re: [PATCHES] Re: [HACKERS] Re: Tuple-valued datums on Alpha (was Re: 7.1 on DEC/Alpha)

2000-12-27 Thread Tom Lane

"Oliver Elphick" [EMAIL PROTECTED] writes:
 Smart Shutdown request at Thu Dec 28 02:41:49 2000
 DEBUG:  shutting down
 FATAL 2:  Checkpoint lock is busy while data base is shutting down
 Shutdown failed - abort
 
 It's not just on Alpha; I've seen that on my i386 Linux system.

Oooh, that's interesting.  I was just blindly assuming that it was
a problem with the Alpha spinlock code (we've sure heard plenty of
discussion of same).  But maybe there's an actual logic bug in the
checkpoint code.  I don't see one in a quick scan though.

FWIW, I do *not* see this behavior on HPUX.  It seems perfectly
reproducible on the Debian Alpha box.  Is it reproducible on your
i386 box, or only sometimes?

Vadim, any ideas?

regards, tom lane



Re: [HACKERS] Re: Tuple-valued datums on Alpha (was Re: 7.1 on DEC/Alpha)

2000-12-26 Thread Tom Lane

Brent Verner [EMAIL PROTECTED] writes:
 | Please apply it locally and let me know what you find.

 what I'm seeing now is much the same.

Drat.  More to do, then.

 i've been in circles trying to figure out where fcinfo->arg is filled.
 can you point me toward that?

See src/backend/utils/fmgr/README and src/backend/utils/fmgr/fmgr.c.
But fmgr is probably only the carrier of disease, not the source...

regards, tom lane



Re: [HACKERS] Re: Tuple-valued datums on Alpha (was Re: 7.1 on DEC/Alpha)

2000-12-26 Thread Brent Verner

On 26 Dec 2000 at 23:41 (-0500), Tom Lane wrote:
| Brent Verner [EMAIL PROTECTED] writes:
|  | Please apply it locally and let me know what you find.
| 
|  what I'm seeing now is much the same.

sorry, I sent the previous email w/o the details of the different 
behavior. Inside ExecEvalFieldSelect(), result is now 303, instead
of 110599844 (...or whatever it was). I'm not sure if this gives 
you any additional clues.

thanks.
  brent