reading 512 bytes from raw device with 2048 sector size fails
Reading 512 bytes from a raw device with a 2048-byte sector size fails. If I read 512 bytes from the block device, the problem is not reproduced.

How to reproduce:

1. I plugged my iPod Nano (2nd generation) into the PC, then used the following code to reproduce the issue.

    #include <sys/types.h>
    #include <err.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int
    main(void)
    {
        char buf[512];
        ssize_t sz;
        int fd;

        fd = open("/dev/rsd0j", O_RDONLY);
        if (fd == -1)
            err(1, "open");
        sz = read(fd, buf, sizeof(buf));
        if (sz == -1)
            err(1, "read");
        close(fd);
        return (0);
    }

2. Another way to reproduce, and the way I actually spotted this problem, is fsck. `fsck /dev/sd0j` on an msdos device launches `fsck_msdos /dev/rsd0j`. If I run fsck_msdos on /dev/sd0j, the problem is not reproduced.

The failure comes from the kernel, in sys/kern/subr_disk.c, function bounds_check_with_label():

    /* Ensure transfer is a whole number of aligned sectors. */
    if ((bp->b_blkno % DL_BLKSPERSEC(lp)) != 0 ||
        (bp->b_bcount % lp->d_secsize) != 0)
        goto bad;
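The check quoted from bounds_check_with_label() can be mirrored in user space to see why a 512-byte request is rejected on a 2048-byte-sector device. This is only a sketch under stated assumptions: `transfer_is_aligned`, `blks_per_sec` and `BLK_UNIT` are hypothetical names, block numbers are assumed to be counted in 512-byte units, and none of this is the kernel's code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: disk block numbers are counted in 512-byte units. */
#define BLK_UNIT 512u

/* Hypothetical helper: number of 512-byte blocks in one device sector. */
uint32_t
blks_per_sec(uint32_t secsize)
{
	return secsize / BLK_UNIT;
}

/*
 * Hypothetical user-space mirror of the quoted check: a transfer passes
 * only if it starts on a sector boundary and its length is a whole
 * number of sectors.
 */
bool
transfer_is_aligned(int64_t blkno, uint32_t bcount, uint32_t secsize)
{
	/* Reject sector sizes this sketch does not model. */
	if (secsize < BLK_UNIT || secsize % BLK_UNIT != 0)
		return false;
	return blkno % blks_per_sec(secsize) == 0 && bcount % secsize == 0;
}
```

With a 2048-byte sector, `transfer_is_aligned(0, 512, 2048)` is false because 512 is not a multiple of 2048, while the same call with `bcount` set to 2048 is true.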
Re: reading 512 bytes from raw device with 2048 sector size fails
On Tue, Feb 21, 2012 at 03:15:29PM +0200, Alexey Vatchenko wrote:
> Reading 512 bytes from raw device with 2048 sector size fails. If i read
> 512 bytes from block device the problem is not reproduced.
>
> How to reproduce.
>
> 1. I plugged my iPod Nano (2nd generation) into PC. Then use the
> following code to reproduce the issue.
>
>     #include <sys/types.h>
>     #include <fcntl.h>
>     #include <stdio.h>
>     #include <unistd.h>
>
>     int
>     main(void)
>     {
>         char buf[512];
>         ssize_t sz;
>         int fd;
>
>         fd = open("/dev/rsd0j", O_RDONLY);
>         if (fd == -1)
>             err(1, "open");
>         sz = read(fd, buf, sizeof(buf));
>         if (sz == -1)
>             err(1, "read");
>         close(fd);
>         return (0);
>     }
>
> 2. Another way to reproduce and actually why i spot this problem is
> fsck. `fsck /dev/sd0j` on msdos device launches `fsck_msdos /dev/rsd0j`.
> If i run fsck_msdos on /dev/sd0j the problem is not reproduced.
>
> The problem goes into kernel in file sys/kern/subr_disk.c function
> bounds_check_with_label():
>
>     /* Ensure transfer is a whole number of aligned sectors. */
>     if ((bp->b_blkno % DL_BLKSPERSEC(lp)) != 0 ||
>         (bp->b_bcount % lp->d_secsize) != 0)
>         goto bad;

This is intentional. You cannot read from a raw device fewer bytes at a time than the minimum the device provides per i/o. On block devices the buffer cache does the correct size i/o and extracts just the number of bytes you requested.

The fsck behaviour is more interesting.

Ken
Re: reading 512 bytes from raw device with 2048 sector size fails
On Tue, Feb 21, 2012, Kenneth R Westerback wrote:
> On Tue, Feb 21, 2012 at 03:15:29PM +0200, Alexey Vatchenko wrote:
> > Reading 512 bytes from raw device with 2048 sector size fails. If i
> > read 512 bytes from block device the problem is not reproduced.
>
> This is intentional. You cannot read from a raw device fewer bytes at a
> time than the minimum the device provides per i/o. On block devices the
> buffer cache does the correct size i/o and extracts just the number of
> bytes you requested.

That's backwards from what I thought. The raw device should let you read byte by byte, the block device only lets you read block by block, as it were.
Re: reading 512 bytes from raw device with 2048 sector size fails
On Tue, Feb 21, 2012 at 12:55:36PM -0500, Ted Unangst wrote:
> On Tue, Feb 21, 2012, Kenneth R Westerback wrote:
> > On Tue, Feb 21, 2012 at 03:15:29PM +0200, Alexey Vatchenko wrote:
> > > Reading 512 bytes from raw device with 2048 sector size fails. If i
> > > read 512 bytes from block device the problem is not reproduced.
> >
> > This is intentional. You cannot read from a raw device fewer bytes at
> > a time than the minimum the device provides per i/o. On block devices
> > the buffer cache does the correct size i/o and extracts just the
> > number of bytes you requested.
>
> That's backwards from what I thought. The raw device should let you
> read byte by byte, the block device only lets you read block by block,
> as it were.

Block/sector based devices can only provide entire blocks/sectors, at block/sector addresses. The buffer cache and standard i/o routines provide the abstraction that you can start and stop at any byte. Doing i/o to raw devices means you are taking full responsibility for paying attention to the boundaries and sizes of the i/o.

This is how OpenBSD works. Whether that is 'correct' or not, I don't know. :-)

Hence the mad dancing over the last few years to make more devices with non-DEV_BSIZE sectors work with various bits of software that do 'raw' i/o and have always assumed DEV_BSIZE is and always will be 512 bytes.

Ken
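What "taking full responsibility for the boundaries and sizes" could look like in a program doing raw i/o is sketched here. It is not how the buffer cache or fsck_msdos is implemented: `read_unaligned` is a hypothetical helper, the caller is assumed to already know the sector size, and `malloc()` alignment is assumed to be adequate for the device. The helper widens the request to whole sectors, reads those, and copies out only the bytes that were asked for.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read len bytes at byte offset off from fd using
 * only one whole, sector-aligned transfer, then copy out the requested
 * bytes. Returns the number of bytes copied, or -1 with errno set.
 */
ssize_t
read_unaligned(int fd, void *dst, size_t len, off_t off, size_t secsize)
{
	off_t start;
	size_t skip, span;
	ssize_t n;
	char *bounce;
	int saved;

	if (secsize == 0 || off < 0) {
		errno = EINVAL;
		return -1;
	}
	if (len == 0)
		return 0;

	/* Round the start down and the length up to sector boundaries. */
	start = off - off % (off_t)secsize;
	skip = (size_t)(off - start);
	span = (skip + len + secsize - 1) / secsize * secsize;

	if ((bounce = malloc(span)) == NULL)
		return -1;
	n = pread(fd, bounce, span, start);
	if (n == -1) {
		saved = errno;	/* keep pread's errno across free() */
		free(bounce);
		errno = saved;
		return -1;
	}
	/* A short read near the end may cover fewer bytes than asked. */
	if ((size_t)n <= skip)
		len = 0;
	else if ((size_t)n - skip < len)
		len = (size_t)n - skip;
	memcpy(dst, bounce + skip, len);
	free(bounce);
	return (ssize_t)len;
}
```

Under these assumptions, the 512-byte read from the original report would become `read_unaligned(fd, buf, sizeof(buf), 0, 2048)`, which issues a single 2048-byte transfer and copies the first 512 bytes into `buf`.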
Stopped at cpu_idle_cycle+0xe: hlt
Box running pf/ospf/bgp/relayd entered ddb with

    Stopped at cpu_idle_cycle+0xe: hlt

Various of these seen in 'sh all pools':

    mcl9k: pool(0xd0a2df78:mcl9k): page inconsistency: page 0xd1cc6000; item ordinal 0; addr 0xd1cc6018 (p 0xd1cc4000)
    mcl9k: pool(0xd0a2df78:mcl9k): page inconsistency: page 0xd1cc2000; item ordinal 0; addr 0xd1cc2014 (p 0xd1cc)

Various pieces of ddb output below followed by a dmesg. Running i386 GENERIC from 9 Feb. Anyone have ideas or requests for more info to collect if it recurs? Thanks.

    [-- MARK -- Tue Feb 21 07:00:00 2012 .. Tue Feb 21 18:00:00 2012 -- MARK --]
    [-- Console down -- Tue Feb 21 18:05:52 2012]
    [-- Console up -- Tue Feb 21 18:05:52 2012]
    [-- MARK -- Tue Feb 21 19:00:00 2012 .. Tue Feb 21 21:00:00 2012 -- MARK --]
    Stopped at cpu_idle_cycle+0xe: hlt
    ddb> [read-only -- use ^E c ? for help]
    [read-only -- use ^E c ? for help]
    [read-only -- use ^E c ? for help]
    [attach to reopen]
    [bumped sthen@localhost]
    ddb> show panic
    the kernel did not panic
    ddb> tr
    cpu_idle_cycle(d0ae19a0) at cpu_idle_cycle+0xe
    Bad frame pointer: 0xd0b97e28
    ddb> ps
       PID   PPID   PGRP  UID  S       FLAGS  WAIT       COMMAND
     11086  11860  11860    0  3        0x80  nanosleep  newsyslog
     11860  25649  11860    0  3        0x88  pause      sh
     25649   1142   1142    0  3        0x80  piperd     cron
     18671      1  18671   77  3        0x80  poll       dhcpd
     30544      1  30544    0  3        0x80  ttyin      getty
     14942      1  14942    0  3        0x80  ttyin      getty
     16786      1  16786    0  3        0x80  ttyin      getty
     16944      1  16944    0  3        0x80  ttyin      getty
      8025      1   8025    0  3        0x80  ttyin      getty
     26943      1  26943    0  3        0x80  ttyin      getty
      1142      1   1142    0  3        0x80  select     cron
      5690      1   5690    0  3        0x80  select     sendmail
     24136      1  24136    0  3        0x80  nanosleep  sensorsd
      3775      1   3775  535  3        0x80  nanosleep  symon
      5698      1   5698   99  3        0x80  poll       sndiod
      2672      1   2672    0  3        0x80  select     inetd
      5804      1   5804   71  3        0x80  kqread     ftp-proxy
     11338   3713   3713   89  3        0x80  kqread     relayd
      2489   3713   3713   89  3        0x80  kqread     relayd
     29211   3713   3713   89  3        0x80  kqread     relayd
     19896   3713   3713   89  3        0x80  kqread     relayd
      3713   8594   3713   89  3        0x80  kqread     relayd
     20059   8594  20059   89  3        0x80  kqread     relayd
      5952   8594   5952   89  3        0x80  kqread     relayd
      8594      1   8594    0  3        0x80  kqread     relayd
     30614   7638   7638   75  3        0x80  poll       bgpd
     28779   7638   7638   75  3        0x80  poll       bgpd
      7638      1   7638    0  3        0x80  poll       bgpd
     23530  12413  12413   85  3        0x80  kqread     ospfd
      4844  12413  12413   85  3        0x80  kqread     ospfd
     12413      1  12413    0  3        0x80  kqread     ospfd
     19399  21681  21681   91  3        0x80  kqread     snmpd
     21681      1  21681    0  3        0x80  kqread     snmpd
     29693      1  29693    0  3        0x80  select     sshd
     17309  11222  13566   83  3        0x80  poll       ntpd
     11222  13566  13566   83  3        0x80  poll       ntpd
     13566      1  13566    0  3        0x80  poll       ntpd
      7424  24405  24405   74  3        0x80  bpf        pflogd
     24405      1  24405    0  3        0x80  netio      pflogd
      5067  30685  30685   73  3        0x80  poll       syslogd
     30685      1  30685    0  3        0x80  netio      syslogd
        14      0      0    0  3    0x100200  aiodoned   aiodoned
        13      0      0    0  3    0x100200  syncer     update
        12      0      0    0  3    0x100200  cleaner    cleaner
        11      0      0    0  3    0x100200  reaper     reaper
        10      0      0    0  3    0x100200  pgdaemon   pagedaemon
         9      0      0    0  3    0x100200  bored      crypto
         8      0      0    0  3    0x100200  pftm       pfpurge
         7      0      0    0  3    0x100200  usbtsk     usbtask
         6      0      0    0  3    0x100200  usbatsk    usbatsk
         5      0      0    0  3    0x100200  acpi0      acpi0
         4      0      0    0  3    0x100200  bored      syswq
        *3      0      0    0  7  0x40100200             idle0
         2      0      0    0  3    0x100200  kmalloc    kmthread
         1      0      1    0  3        0x80  wait       init
         0     -1      0    0  3       0x200  scheduler  swapper
    ddb> sh reg
    ds 0x10
    es 0x10
    fs
Re: Stopped at cpu_idle_cycle+0xe: hlt
On Tue, Feb 21, 2012, Stuart Henderson wrote:
> Box running pf/ospf/bgp/relayd entered ddb with
> Stopped at cpu_idle_cycle+0xe: hlt.
>
> Various of these seen in 'sh all pools':
>
> mcl9k: pool(0xd0a2df78:mcl9k): page inconsistency: page 0xd1cc6000; item ordinal 0; addr 0xd1cc6018 (p 0xd1cc4000)
> mcl9k: pool(0xd0a2df78:mcl9k): page inconsistency: page 0xd1cc2000; item ordinal 0; addr 0xd1cc2014 (p 0xd1cc)
>
> Various pieces of ddb output below followed by a dmesg. Running i386
> GENERIC from 9 Feb. Anyone have ideas or requests for more info to
> collect if it recurs?

Turn off jumbos?

Pool corruption in one pool means the consumers are doing it wrong. But the 9k pools are also the special bigger-than-one-page pools, which aren't used as much, so the bug could be in there.
/usr/include/sys/param.h breaks user code
Synopsis: /usr/include/sys/param.h breaks code
Category: system
Environment:
    System       : OpenBSD 5.0
    Details      : OpenBSD 5.0-stable (GENERIC.MP) #0: Thu Feb 16 01:38:28 EST 2012
                   root@aemilia.chuck:/usr/src/sys/arch/i386/compile/GENERIC.MP
    Architecture : OpenBSD.i386
    Machine      : i386

Description:

/usr/include/sys/param.h contains a convenience macro named nitems, which causes user code that uses the name nitems in an innocent context to fail mysteriously. The bug was spotted during compilation of C++ code that uses the FLTK port.

The macro is found in /usr/include/sys/param.h at line 196:

    191 /* Macros for calculating the offset of a field */
    192 #if !defined(offsetof) && defined(_KERNEL)
    193 #define offsetof(s, e) ((size_t)&((s *)0)->e)
    194 #endif
    195
    196 #define nitems(_a) (sizeof((_a)) / sizeof((_a)[0]))
    197

How-To-Repeat:

It was found in a C++ method; I imagine you can write sample code to exercise the bug. Any use of nitems(...) that one does not want expanded as per the macro will demonstrate the bug.

A snippet that ought to trigger it:

    #include <sys/param.h>

    extern int nitems(void);
    /* expands to: extern int (sizeof((void)) / sizeof((void)[0])); */

and later:

    foo = nitems();
    /* expands to: foo = (sizeof(()) / sizeof(()[0])); */

Fix:

Surround the nitems macro with #if defined(_KERNEL) and let any non-kernel code that has come to depend on it choke.

No dmesg needed.

Sent this to gnats... no response. The gnats check-your-PR webpage doesn't work. Sent to bugs@; I am not subscribed, so please CC me if there is some argument.

Dave
Re: /usr/include/sys/param.h breaks user code
On Tue, 21 Feb 2012, Woodchuck wrote:
> Synopsis: /usr/include/sys/param.h breaks code
> Category: system
> Environment:
>     System       : OpenBSD 5.0
>     Details      : OpenBSD 5.0-stable (GENERIC.MP) #0: Thu Feb 16 01:38:28 EST 2012
>                    root@aemilia.chuck:/usr/src/sys/arch/i386/compile/GENERIC.MP
>     Architecture : OpenBSD.i386
>     Machine      : i386
> Description:
> /usr/include/sys/param.h contains a convenience macro named nitems,
> which causes user code using nitems in an innocent context to fail
> mysteriously. The bug was spotted during compilation of c++ code that
> uses the FLTK port.

Whether this is a bug depends, IMO, on how <sys/param.h> is getting pulled in.

It's a bug that <netdb.h> unconditionally pulls in <sys/param.h>. It should not, at least not in a standards-conforming compilation mode. If that's how it's getting pulled into this program, then I agree it's a bug in the OpenBSD headers.

If the application code is #including <sys/param.h> itself, well, it's getting exactly what it asked for. That file is *not* standardized and may contain whatever the platform feels like, including a macro named nitems().

(Code that pulls in header files just in case (or worse, just because it's present on this system!) is Just Plain Wrong.)

Philip Guenther
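For application code that cannot avoid a header defining nitems(), one standard C technique is to parenthesize the function name: a function-like macro is only expanded when its name is immediately followed by an opening parenthesis. The following is a sketch rather than a recommendation from this thread. The macro is a stand-in copied from the definition quoted in the report, defined only if no header has supplied it, and `count_table` and `call_function` are hypothetical names.

```c
#include <stddef.h>

/*
 * Stand-in for the definition quoted from <sys/param.h>, so that the
 * example behaves the same on systems that lack the macro.
 */
#ifndef nitems
#define nitems(_a) (sizeof((_a)) / sizeof((_a)[0]))
#endif

/*
 * A function that happens to be called nitems. Writing the name as
 * (nitems) means it is followed by ')' rather than '(', so the
 * preprocessor leaves it alone.
 */
int
(nitems)(void)
{
	return 42;
}

/* Hypothetical caller that wants the macro: normal invocation. */
size_t
count_table(void)
{
	static const int table[] = { 1, 2, 3, 4, 5 };

	return nitems(table);
}

/* Hypothetical caller that wants the function: parenthesized name. */
int
call_function(void)
{
	return (nitems)();
}
```

An alternative is to `#undef nitems` after the offending #include, at the cost of losing the macro for the rest of the translation unit.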