reading 512 bytes from raw device with 2048 sector size fails

2012-02-21 Thread Alexey Vatchenko
Reading 512 bytes from a raw device with a 2048-byte sector size fails. If I
read 512 bytes from the block device, the problem does not occur.

How to reproduce.
1. I plugged my iPod Nano (2nd generation) into a PC,
then used the following code to reproduce the issue.

#include <sys/types.h>

#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int
main(void)
{
    char buf[512];
    ssize_t sz;
    int fd;

    /* Raw (character) device for partition j of sd0. */
    fd = open("/dev/rsd0j", O_RDONLY);
    if (fd == -1)
        err(1, "open");
    /* 512 bytes is smaller than the 2048-byte sector; this read fails. */
    sz = read(fd, buf, sizeof(buf));
    if (sz == -1)
        err(1, "read");
    close(fd);
    return (0);
}

2. Another way to reproduce it, and how I actually spotted the problem, is fsck.
Running `fsck /dev/sd0j` on an msdos device launches `fsck_msdos /dev/rsd0j`.
If I run fsck_msdos on /dev/sd0j directly, the problem does not occur.

The failure comes from the kernel, in the function bounds_check_with_label()
in sys/kern/subr_disk.c:

    /* Ensure transfer is a whole number of aligned sectors. */
    if ((bp->b_blkno % DL_BLKSPERSEC(lp)) != 0 ||
        (bp->b_bcount % lp->d_secsize) != 0)
            goto bad;
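
The check quoted above can be sketched in userland to show why a 512-byte request is rejected on a 2048-byte-sector device. This is a hypothetical mirror, not the kernel code: the name transfer_is_aligned() and the constant SKETCH_DEV_BSIZE are made up for illustration, and it assumes that b_blkno counts 512-byte DEV_BSIZE units and that DL_BLKSPERSEC() is the sector size divided by DEV_BSIZE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: DEV_BSIZE is 512 and block numbers count DEV_BSIZE units. */
#define SKETCH_DEV_BSIZE 512

/*
 * Hypothetical userland mirror of the kernel test: the transfer must start
 * on a sector boundary and cover a whole number of sectors.
 */
static bool
transfer_is_aligned(int64_t blkno, size_t bcount, uint32_t secsize)
{
    int64_t blks_per_sec;

    assert(secsize >= SKETCH_DEV_BSIZE && secsize % SKETCH_DEV_BSIZE == 0);
    /* Assumed equivalent of DL_BLKSPERSEC(lp). */
    blks_per_sec = secsize / SKETCH_DEV_BSIZE;

    return (blkno % blks_per_sec) == 0 && (bcount % secsize) == 0;
}
```

With these assumptions, transfer_is_aligned(0, 512, 2048) is false (the reported case), while transfer_is_aligned(0, 2048, 2048) is true.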



Re: reading 512 bytes from raw device with 2048 sector size fails

2012-02-21 Thread Kenneth R Westerback
On Tue, Feb 21, 2012 at 03:15:29PM +0200, Alexey Vatchenko wrote:
> Reading 512 bytes from raw device with 2048 sector size fails. If I read
> 512 bytes from block device the problem is not reproduced.
>
> How to reproduce.
> 1. I plugged my iPod Nano (2nd generation) into PC.
> Then use the following code to reproduce the issue.
>
> #include <sys/types.h>
>
> #include <err.h>
> #include <fcntl.h>
> #include <stdio.h>
> #include <unistd.h>
>
> int
> main(void)
> {
>     char buf[512];
>     ssize_t sz;
>     int fd;
>
>     fd = open("/dev/rsd0j", O_RDONLY);
>     if (fd == -1)
>         err(1, "open");
>     sz = read(fd, buf, sizeof(buf));
>     if (sz == -1)
>         err(1, "read");
>     close(fd);
>     return (0);
> }
>
> 2. Another way to reproduce and actually why I spot this problem is fsck.
> `fsck /dev/sd0j` on msdos device launches `fsck_msdos /dev/rsd0j`.
> If I run fsck_msdos on /dev/sd0j the problem is not reproduced.
>
> The problem goes into kernel in file sys/kern/subr_disk.c function
> bounds_check_with_label():
>
>     /* Ensure transfer is a whole number of aligned sectors. */
>     if ((bp->b_blkno % DL_BLKSPERSEC(lp)) != 0 ||
>         (bp->b_bcount % lp->d_secsize) != 0)
>             goto bad;

This is intentional. You cannot read from a raw device fewer bytes
at a time than the minimum the device provides per i/o. On block
devices the buffer cache does the correct size i/o and extracts
just the number of bytes you requested.

The fsck behaviour is more interesting.

 Ken



Re: reading 512 bytes from raw device with 2048 sector size fails

2012-02-21 Thread Ted Unangst
On Tue, Feb 21, 2012, Kenneth R Westerback wrote:
> On Tue, Feb 21, 2012 at 03:15:29PM +0200, Alexey Vatchenko wrote:
> > Reading 512 bytes from raw device with 2048 sector size fails. If i read
> > 512 bytes from block device the problem is not reproduced.
>
> This is intentional. You cannot read from a raw device fewer bytes
> at a time than the minimum the device provides per i/o. On block
> devices the buffer cache does the correct size i/o and extracts
> just the number of bytes you requested.

That's backwards from what I thought.  The raw device should let you
read byte by byte, the block device only lets you read block by block,
as it were.



Re: reading 512 bytes from raw device with 2048 sector size fails

2012-02-21 Thread Kenneth R Westerback
On Tue, Feb 21, 2012 at 12:55:36PM -0500, Ted Unangst wrote:
> On Tue, Feb 21, 2012, Kenneth R Westerback wrote:
> > On Tue, Feb 21, 2012 at 03:15:29PM +0200, Alexey Vatchenko wrote:
> > > Reading 512 bytes from raw device with 2048 sector size fails. If i read
> > > 512 bytes from block device the problem is not reproduced.
> >
> > This is intentional. You cannot read from a raw device fewer bytes
> > at a time than the minimum the device provides per i/o. On block
> > devices the buffer cache does the correct size i/o and extracts
> > just the number of bytes you requested.
>
> That's backwards from what I thought.  The raw device should let you
> read byte by byte, the block device only lets you read block by block,
> as it were.

Block/Sector based devices can only provide entire blocks/sectors,
at block/sector addresses. The buffer cache and standard i/o routines
provide the abstraction that you can start and stop at any byte.

Doing I/O to raw devices means you are taking full responsibility for
paying attention to the boundaries and sizes of the i/o.
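
As an illustration of that responsibility, here is a minimal sketch of a helper that serves an arbitrary byte range by issuing only one whole, sector-aligned transfer and copying out the requested bytes, roughly the service the buffer cache provides for block devices. The name read_unaligned() is hypothetical, the sector size is passed in by the caller rather than queried from the disklabel, and error handling is kept to the essentials.

```c
#include <sys/types.h>

#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy up to len bytes starting at byte offset off
 * into dst, using a single transfer that starts on a sector boundary and
 * spans a whole number of sectors. Returns the number of bytes copied
 * (short at end of device), or -1 with errno set.
 */
static ssize_t
read_unaligned(int fd, void *dst, size_t len, off_t off, size_t secsize)
{
    off_t start;
    size_t skip, total, avail;
    char *bounce;
    ssize_t n;

    assert(secsize > 0 && off >= 0);
    if (len == 0)
        return 0;

    start = off - (off % (off_t)secsize);   /* round offset down */
    skip = (size_t)(off - start);           /* unwanted leading bytes */
    /* Round the covered span up to a whole number of sectors. */
    total = (skip + len + secsize - 1) / secsize * secsize;

    bounce = malloc(total);
    if (bounce == NULL)
        return -1;

    n = pread(fd, bounce, total, start);
    if (n == -1) {
        int saved = errno;

        free(bounce);
        errno = saved;
        return -1;
    }

    /* Copy only what was requested and actually read. */
    avail = (size_t)n > skip ? (size_t)n - skip : 0;
    if (avail > len)
        avail = len;
    memcpy(dst, bounce + skip, avail);
    free(bounce);
    return (ssize_t)avail;
}
```

Under these assumptions, a 512-byte read at offset 0 from a 2048-byte-sector raw device would become read_unaligned(fd, buf, 512, 0, 2048), which issues one 2048-byte transfer.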

This is how OpenBSD works. Whether that is 'correct' or not, I don't
know. :-)

Hence the mad dancing over the last few years to make more devices with
non-DEV_BSIZE sectors work with various bits of software that do 'raw'
i/o and have always assumed DEV_BSIZE is and always will be 512 bytes.

 Ken



Stopped at cpu_idle_cycle+0xe: hlt

2012-02-21 Thread Stuart Henderson
Box running pf/ospf/bgp/relayd entered ddb with Stopped at
cpu_idle_cycle+0xe: hlt.

Various of these seen in 'sh all pools':-

mcl9k: pool(0xd0a2df78:mcl9k): page inconsistency: page 0xd1cc6000; item 
ordinal 0; addr 0xd1cc6018 (p 0xd1cc4000)
mcl9k: pool(0xd0a2df78:mcl9k): page inconsistency: page 0xd1cc2000; item 
ordinal 0; addr 0xd1cc2014 (p 0xd1cc)

Various pieces of ddb output below followed by a dmesg.
Running i386 GENERIC from 9 Feb.

Anyone have ideas or requests for more info to collect if it recurs?
Thanks.



[-- MARK -- Tue Feb 21 07:00:00 2012 .. Tue Feb 21 18:00:00 2012 -- MARK --]
[-- Console down -- Tue Feb 21 18:05:52 2012]
[-- Console up -- Tue Feb 21 18:05:52 2012]
[-- MARK -- Tue Feb 21 19:00:00 2012 .. Tue Feb 21 21:00:00 2012 -- MARK --]
Stopped at  cpu_idle_cycle+0xe: hlt
ddb [read-only -- use ^E c ? for help]
[read-only -- use ^E c ? for help]
[read-only -- use ^E c ? for help]
[attach to reopen]
[bumped sthen@localhost]

ddb> show panic
the kernel did not panic
ddb> tr
cpu_idle_cycle(d0ae19a0) at cpu_idle_cycle+0xe
Bad frame pointer: 0xd0b97e28
ddb> ps
   PID   PPID   PGRP   UID  S       FLAGS  WAIT       COMMAND
 11086  11860  11860     0  3        0x80  nanosleep  newsyslog
 11860  25649  11860     0  3        0x88  pause      sh
 25649   1142   1142     0  3        0x80  piperd     cron
 18671      1  18671    77  3        0x80  poll       dhcpd
 30544      1  30544     0  3        0x80  ttyin      getty
 14942      1  14942     0  3        0x80  ttyin      getty
 16786      1  16786     0  3        0x80  ttyin      getty
 16944      1  16944     0  3        0x80  ttyin      getty
  8025      1   8025     0  3        0x80  ttyin      getty
 26943      1  26943     0  3        0x80  ttyin      getty
  1142      1   1142     0  3        0x80  select     cron
  5690      1   5690     0  3        0x80  select     sendmail
 24136      1  24136     0  3        0x80  nanosleep  sensorsd
  3775      1   3775   535  3        0x80  nanosleep  symon
  5698      1   5698    99  3        0x80  poll       sndiod
  2672      1   2672     0  3        0x80  select     inetd
  5804      1   5804    71  3        0x80  kqread     ftp-proxy
 11338   3713   3713    89  3        0x80  kqread     relayd
  2489   3713   3713    89  3        0x80  kqread     relayd
 29211   3713   3713    89  3        0x80  kqread     relayd
 19896   3713   3713    89  3        0x80  kqread     relayd
  3713   8594   3713    89  3        0x80  kqread     relayd
 20059   8594  20059    89  3        0x80  kqread     relayd
  5952   8594   5952    89  3        0x80  kqread     relayd
  8594      1   8594     0  3        0x80  kqread     relayd
 30614   7638   7638    75  3        0x80  poll       bgpd
 28779   7638   7638    75  3        0x80  poll       bgpd
  7638      1   7638     0  3        0x80  poll       bgpd
 23530  12413  12413    85  3        0x80  kqread     ospfd
  4844  12413  12413    85  3        0x80  kqread     ospfd
 12413      1  12413     0  3        0x80  kqread     ospfd
 19399  21681  21681    91  3        0x80  kqread     snmpd
 21681      1  21681     0  3        0x80  kqread     snmpd
 29693      1  29693     0  3        0x80  select     sshd
 17309  11222  13566    83  3        0x80  poll       ntpd
 11222  13566  13566    83  3        0x80  poll       ntpd
 13566      1  13566     0  3        0x80  poll       ntpd
  7424  24405  24405    74  3        0x80  bpf        pflogd
 24405      1  24405     0  3        0x80  netio      pflogd
  5067  30685  30685    73  3        0x80  poll       syslogd
 30685      1  30685     0  3        0x80  netio      syslogd
    14      0      0     0  3    0x100200  aiodoned   aiodoned
    13      0      0     0  3    0x100200  syncer     update
    12      0      0     0  3    0x100200  cleaner    cleaner
    11      0      0     0  3    0x100200  reaper     reaper
    10      0      0     0  3    0x100200  pgdaemon   pagedaemon
     9      0      0     0  3    0x100200  bored      crypto
     8      0      0     0  3    0x100200  pftm       pfpurge
     7      0      0     0  3    0x100200  usbtsk     usbtask
     6      0      0     0  3    0x100200  usbatsk    usbatsk
     5      0      0     0  3    0x100200  acpi0      acpi0
     4      0      0     0  3    0x100200  bored      syswq
    *3      0      0     0  7  0x40100200             idle0
     2      0      0     0  3    0x100200  kmalloc    kmthread
     1      0      1     0  3        0x80  wait       init
     0     -1      0     0  3       0x200  scheduler  swapper
ddb> sh reg
ds  0x10
es  0x10
fs 

Re: Stopped at cpu_idle_cycle+0xe: hlt

2012-02-21 Thread Ted Unangst
On Tue, Feb 21, 2012, Stuart Henderson wrote:
> Box running pf/ospf/bgp/relayd entered ddb with Stopped at
> cpu_idle_cycle+0xe: hlt.
>
> Various of these seen in 'sh all pools':-
>
> mcl9k: pool(0xd0a2df78:mcl9k): page inconsistency: page 0xd1cc6000; item
> ordinal 0; addr 0xd1cc6018 (p 0xd1cc4000)
> mcl9k: pool(0xd0a2df78:mcl9k): page inconsistency: page 0xd1cc2000; item
> ordinal 0; addr 0xd1cc2014 (p 0xd1cc)
>
> Various pieces of ddb output below followed by a dmesg.
> Running i386 GENERIC from 9 Feb.
>
> Anyone have ideas or requests for more info to collect if it recurs?

Turn off jumbos?  Pool corruption in one pool means the consumers are
doing it wrong.  But the 9k pools are also the special bigger than one
page pools, which aren't used as much, so the bug could be in there.



/usr/include/sys/param.h breaks user code

2012-02-21 Thread Woodchuck
Synopsis:  /usr/include/sys/param.h breaks code
Category:  system
Environment:
System  : OpenBSD 5.0
Details : OpenBSD 5.0-stable (GENERIC.MP) #0: Thu Feb 16 01:38:28 EST 2012
          root@aemilia.chuck:/usr/src/sys/arch/i386/compile/GENERIC.MP

Architecture: OpenBSD.i386
Machine : i386
Description:
/usr/include/sys/param.h contains a convenience macro named
nitems, which causes user code using nitems in an innocent
context to fail mysteriously.  The bug was spotted during
compilation of c++ code that uses the FLTK port.

The macro is named nitems and is found in /usr/include/sys/param.h
at line 196: 

191 /* Macros for calculating the offset of a field */
192 #if !defined(offsetof) && defined(_KERNEL)
193 #define offsetof(s, e) ((size_t)&((s *)0)->e)
194 #endif
195
196 #define nitems(_a)  (sizeof((_a)) / sizeof((_a)[0]))
197

How-To-Repeat:

It was found in a C++ method; I imagine you can write
sample code to exercise the bug.

Any use of nitems(...) that one does not want expanded
as per the macro will demonstrate the bug.

Snippet that ought to trigger the bug:

#include <sys/param.h>

extern int nitems(void);

/* expands to  extern int (sizeof((void)) / sizeof((void)[0])); */

 later 

foo = nitems();

/* expands to  foo = (sizeof(()) / sizeof(()[0])); */

Fix:
Surround the nitems macro with #if defined(_KERNEL) and
let any non-kernel code that has come to depend on it
choke.

No dmesg needed.

Sent this to gnats... no response.  Gnats check-your-PR webpage doesn't
work. 

Sent to bugs@, not subscribed, please CC if there is some argument.

Dave 



Re: /usr/include/sys/param.h breaks user code

2012-02-21 Thread Philip Guenther
On Tue, 21 Feb 2012, Woodchuck wrote:
> Synopsis:     /usr/include/sys/param.h breaks code
> Category:     system
> Environment:
>   System  : OpenBSD 5.0
>   Details : OpenBSD 5.0-stable (GENERIC.MP) #0: Thu Feb 16 01:38:28 EST 2012
>             root@aemilia.chuck:/usr/src/sys/arch/i386/compile/GENERIC.MP
>
>   Architecture: OpenBSD.i386
>   Machine : i386
> Description:
>   /usr/include/sys/param.h contains a convenience macro named
>   nitems, which causes user code using nitems in an innocent
>   context to fail mysteriously.  The bug was spotted during
>   compilation of c++ code that uses the FLTK port.

Whether this is a bug depends, IMO, on how sys/param.h is getting pulled 
in.

It's a bug that <netdb.h> unconditionally pulls in <sys/param.h>.  It 
should not, at least not in a standards-conforming compilation mode.  If 
that's how it's getting pulled into this program, then I agree it's a bug 
in the OpenBSD headers.


If the application code is itself #including <sys/param.h>, well, 
it's getting exactly what it asked for.  That file is *not* standardized 
and may contain whatever the platform feels like, including a macro named 
nitems().
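
For code that does include the header and still wants its own nitems, one possible user-side workaround is to #undef the macro before declaring the identifier. The sketch below defines the macro itself, copying the definition quoted in the report, so that it does not depend on any particular platform header; table_len() and call_user_nitems() are hypothetical names used only for illustration.

```c
#include <stddef.h>

/* Assumption: same definition the report quotes from <sys/param.h>. */
#ifndef nitems
#define nitems(_a)  (sizeof((_a)) / sizeof((_a)[0]))
#endif

/* Intended use of the macro: number of elements in a true array. */
static size_t
table_len(void)
{
    static const int table[] = { 1, 2, 3, 4 };

    return nitems(table);
}

/* Drop the macro so the name is free for the program's own function. */
#undef nitems

static int
nitems(void)
{
    return 42;
}

static int
call_user_nitems(void)
{
    /* Without the #undef this call would expand to sizeof(()) / ... */
    return nitems();
}
```

The cost of this approach is that the macro is no longer available for the rest of the translation unit, so it only suits code that does not rely on the header's nitems afterwards.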

(Code that pulls in header files just in case (or worse, just because 
it's present on this system!) is Just Plain Wrong.)


Philip Guenther