Re: KGDB/i386 broken/supposed to work?

2015-06-25 Thread Timo Buhrmester
> I'll keep digging.

The problem is that KGDB isn't prepared to deal with the non-blocking serial 
port reads that matt's commit introduced in 2013 (am I really the first one to 
try this on bare metal since then?)

Where the read function `com_common_getc` used to block, it now returns -1 
when no data is pending, which KGDB happily takes for real input (an endless 
stream of 0xff) having arrived on the serial port.
This leads to excessively long input (from KGDB's perspective) which eventually 
makes it bail out of interpreting the received data.
Compare sys/kern/kgdb_stub.c:

268  while ((c = GETC()) != KGDB_END && len < maxlen) {
269  DPRINTF(("%c",c));
 [...]
273  len++;
274  }
 [...][`len` now has a significant value due to `c` repeatedly being -1]
 [...][so the following conditional is taken:]
279  if (len >= maxlen) {
280  DPRINTF(("Long- "));
281  PUTC(KGDB_BADP);
282  continue;
283  }

(It would have been easy to spot thanks to the DPRINTFs, but the one on line 
269 completely flooded the console, preventing me from catching the one on line 
280)
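
As a rough illustration of the failure described above (a minimal sketch, not the kernel code): `fake_getc` and `read_payload` are made-up names, and the loop is a stripped-down stand-in for the one at lines 268-274. It assumes only what is stated above, namely that an idle non-blocking read returns -1.

```c
#define KGDB_END '#'	/* packet terminator of the GDB remote protocol */

/*
 * Hypothetical stand-in for a non-blocking serial read on an idle line:
 * every call reports "no data" by returning -1.
 */
static int
fake_getc(void)
{
	return -1;
}

/*
 * Simplified receive loop: returns how many "bytes" were accepted
 * before the loop gave up. Checksum and buffer handling are omitted.
 */
static int
read_payload(int (*getc_fn)(void), int maxlen)
{
	int c, len = 0;

	/* -1 is never equal to KGDB_END, so each idle poll counts as data. */
	while ((c = getc_fn()) != KGDB_END && len < maxlen)
		len++;

	return len;
}
```

Calling `read_payload(fake_getc, 16)` returns 16 without a single real byte having arrived, so a following `len >= maxlen` check fires and the packet is rejected as too long.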

If I replace the `GETC` macro with a function that spins as long as -1 is 
read, as in the patch below, the problem disappears.

The patch below is of course just an ad-hoc fix that Works For Me(TM). I'm 
currently not sure how to address the problem in a more general manner -- 
perhaps by providing a blocking interface to the serial port on top of the 
non-blocking one, in a way similar to what the `GETC` function below does, and 
directing kgdb to use that instead?


--- sys/kern/kgdb_stub.c.orig   2015-06-26 01:49:32.0 +0200
+++ sys/kern/kgdb_stub.c   2015-06-26 01:51:31.0 +0200
@@ -85,7 +85,17 @@
 static u_char buffer[KGDB_BUFLEN];
 static kgdb_reg_t gdb_regs[KGDB_NUMREGS];
 
-#define GETC() ((*kgdb_getc)(kgdb_ioarg))
+static int
+GETC(void)
+{
+   int c;
+
+   while ((c = kgdb_getc(kgdb_ioarg)) == -1)
+       ;
+
+   return c;
+}
+//#define GETC()   ((*kgdb_getc)(kgdb_ioarg))
 #define PUTC(c)   ((*kgdb_putc)(kgdb_ioarg, c))
 
 /*
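
The more general idea floated above, a blocking interface layered on top of the non-blocking one, might look roughly like the following. This is only a sketch under assumptions: `nb_getc_fn`, `struct blocking_port`, `blocking_getc` and the `fake_dev` helper are all hypothetical names that do not exist in the NetBSD tree, and the only behavior assumed of the underlying read is that it returns -1 when nothing is pending.

```c
/*
 * Assumed shape of a non-blocking read: returns a byte (0..255),
 * or -1 if no data is pending.
 */
typedef int (*nb_getc_fn)(void *arg);

/* Hypothetical wrapper pairing a non-blocking read with its argument. */
struct blocking_port {
	nb_getc_fn getc;
	void *arg;
};

/* Poll the underlying read until it delivers a real byte. */
static int
blocking_getc(const struct blocking_port *bp)
{
	int c;

	while ((c = bp->getc(bp->arg)) == -1)
		continue;

	return c;
}

/*
 * Hypothetical test device: reports "no data" `idle` times,
 * then yields `byte`. `polls` counts how often it was asked.
 */
struct fake_dev {
	int idle;
	int byte;
	int polls;
};

static int
fake_dev_getc(void *arg)
{
	struct fake_dev *d = arg;

	d->polls++;
	if (d->idle > 0) {
		d->idle--;
		return -1;
	}
	return d->byte;
}
```

With a `fake_dev` that is idle for three polls, `blocking_getc` returns the byte on the fourth poll, and the caller never sees -1. A real implementation would probably also want some way to bound the spin, which this sketch leaves out.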


Re: KGDB/i386 broken/supposed to work?

2015-06-22 Thread Timo Buhrmester
> > There is no delay between ``kgdb waiting...'' and ``fatal breakpoint trap 
> > in supervisor mode''.
>
> Looks like that behavior was introduced by the following commit
> to src/sys/arch/i386/i386/trap.c:
>
>   revision 1.239
>   date: 2008-05-30 13:38:21 +0300;  author: ad;  state: Exp;  lines: +10 -11;
>   Since breakpoints don't work, dump basic info about the trap before
>   entering the debugger. Sometimes ddb only makes the situation worse.
>
Yes, I've arrived at that commit too, after a longish bisect session.  It's 
indeed the expected behavior.

I'm now bisecting 6-stable (it seems to work there indeed), vs -head; but this 
will take quite a while.  I'll report back once I've arrived somewhere.

Thanks for your replies,

Timo


Re: KGDB/i386 broken/supposed to work?

2015-06-22 Thread Andreas Gustafsson
Timo Buhrmester wrote:
> Using a GENERIC kernel with only the modifications required to enable KGDB 
> (see bottom for config diff), I get the following behavior on the TARGET 
> machine:
> | > boot netbsd -d
> | 15741968+590492+466076 [689568+730405]=0x1161fd4
> | kernel text is mapped with 4 large pages and 5 normal pages
> | Loaded initial symtab at 0xc110750c, strtab at 0xc11afaac, # entries 43075
> | kgdb waiting...fatal breakpoint trap in supervisor mode
> | trap type 1 code 0 eip c02a6744 cs 8 eflags 202 cr2 0 ilevel 8 esp c1265ea0
> | curlwp 0xc1078900 pid 0 lid 1 lowest kstack 0xc12632c0
>
> There is no delay between ``kgdb waiting...'' and ``fatal breakpoint trap in 
> supervisor mode''.
> I'm not sure whether or not this is the expected behavior, because eip 
> c02a6744 is in the `breakpoint` function so that would make sense; but the 
> documentation makes it sound like it should just say ``kgdb waiting...''.

Looks like that behavior was introduced by the following commit
to src/sys/arch/i386/i386/trap.c:

  
  revision 1.239
  date: 2008-05-30 13:38:21 +0300;  author: ad;  state: Exp;  lines: +10 -11;
  Since breakpoints don't work, dump basic info about the trap before
  entering the debugger. Sometimes ddb only makes the situation worse.
  

> Any idea whether a) KGDB is tested/supposed to work and b) what I
> might be doing wrong?

KGDB over serial on i386 worked for me in January 2008.  I don't know
if it is still working now; the kernel debugging I've done since then
has been using qemu's built-in gdb stub instead, as described in
https://wiki.netbsd.org/kernel_debugging_with_qemu/.
-- 
Andreas Gustafsson, g...@gson.org


Re: KGDB/i386 broken/supposed to work?

2015-06-22 Thread Greg Troxel

I am a bit fuzzy on the details, but definitely in 2010 on netbsd-5
remote kgdb on i386 worked.  I am 95% sure it still worked on netbsd-6
in 2011/2012.




Re: KGDB/i386 broken/supposed to work?

2015-06-19 Thread Christos Zoulas
In article <20150619201302.GA243@frozen.localdomain>,
Timo Buhrmester <fstd.l...@gmail.com> wrote:
I'm failing to get KGDB on i386 working for kernel debugging over a
serial (nullmodem) link, as described in
http://www.netbsd.org/docs/kernel/kgdb.html

The TARGET (to-be-debugged) system has two serial ports, com0 is the
boot console, com1 is what I set KGDB to operate on.
The REMOTE (debugger) system uses its com0 port to connect to the
target's com1.

Using a GENERIC kernel with only the modifications required to enable
KGDB (see bottom for config diff), I get the following behavior on the
TARGET machine:
| > boot netbsd -d
| 15741968+590492+466076 [689568+730405]=0x1161fd4
| kernel text is mapped with 4 large pages and 5 normal pages
| Loaded initial symtab at 0xc110750c, strtab at 0xc11afaac, # entries 43075
| kgdb waiting...fatal breakpoint trap in supervisor mode
| trap type 1 code 0 eip c02a6744 cs 8 eflags 202 cr2 0 ilevel 8 esp c1265ea0
| curlwp 0xc1078900 pid 0 lid 1 lowest kstack 0xc12632c0

There is no delay between ``kgdb waiting...'' and ``fatal breakpoint
trap in supervisor mode''.
I'm not sure whether or not this is the expected behavior, because eip
c02a6744 is in the `breakpoint` function so that would make sense; but
the documentation makes it sound like it should just say ``kgdb
waiting...''.


On the REMOTE (debugger) machine (serial port tty00) I get/do:
| # gdb -q netbsd.gdb
| Reading symbols from netbsd.gdb...done.
| (gdb) set remotebaud 38400 
| Warning: command 'set remotebaud' is deprecated.
| Use 'set serial baud'.
|
| (gdb) set serial baud 38400
| (gdb) set remotebreak 1
| Warning: command 'set remotebreak' is deprecated.
| Use 'set remote interrupt-sequence'.
|
| (gdb) set remote interrupt-sequence Ctrl-C 
| (gdb) set remotetimeout 5 
| (gdb) target remote /dev/tty00
| Remote debugging using /dev/tty00
| Ignoring packet error, continuing...
| warning: unrecognized item "timeout" in "qSupported" response
| Ignoring packet error, continuing...
| Ignoring packet error, continuing...
| Bogus trace status reply from target: timeout
| (gdb)

..which I presume is due to the target already having ceased execution.


Both machines run the same, recent -current build (7.99.18) on i386.
I have verified that the serial connection works in both directions,
using a non-KGDB GENERIC kernel.
I have also verified that kgdb is actually in the kernel and using the
right port (com1) when booting the KGDB kernel without -d:
| com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
| com0: console
| com1 at isa0 port 0x2f8-0x2ff irq 3: ns16550a, working fifo
| com1: kgdb


The difference between GENERIC and my KGDB-enabled version of it:
-#options  DEBUG   # expensive debugging checks/support
+options   DEBUG   # expensive debugging checks/support
 #options  LOCKDEBUG   # expensive locking checks/support
 #options  KMEMSTATS   # kernel memory statistics (vmstat -m)
-options   DDB # in-kernel debugger
+#options  DDB # in-kernel debugger
 #options  DDB_ONPANIC=1   # see also sysctl(7): `ddb.onpanic'
-options   DDB_HISTORY_SIZE=512    # enable history editing in DDB
+#options  DDB_HISTORY_SIZE=512    # enable history editing in DDB
 #options  DDB_VERBOSE_HELP
-#options  KGDB    # remote debugger
-#options  KGDB_DEVNAME="\"com\"",KGDB_DEVADDR=0x3f8,KGDB_DEVRATE=9600
-#makeoptions  DEBUG=-g  # compile full symbol table
+options   KGDB    # remote debugger
+options   KGDB_DEVNAME="\"com\"",KGDB_DEVADDR=0x2f8,KGDB_DEVRATE=38400
+makeoptions   DEBUG=-g  # compile full symbol table
 #options  SYSCALL_STATS   # per syscall counts
 #options  SYSCALL_TIMES   # per syscall times
 #options  SYSCALL_TIMES_HASCOUNTER    # use 'broken' rdtsc (soekris)


Any idea whether a) KGDB is tested/supposed to work and b) what I might
be doing wrong?
Is there any other relevant information I missed that would be useful
to provide?


No, but the explanation is that support for it has probably rotted out.
I would file a PR so this information is not lost.

christos