Re: Phantom ACL-related xattrs on 3.14.4 NFS client

2014-06-17 Thread Philippe Troin
Hi Christoph,

On Wed, 2014-06-11 at 09:22 -0700, Christoph Hellwig wrote:
> On Wed, Jun 11, 2014 at 09:15:18AM -0700, Philippe Troin wrote:
> > So, the only regression remaining between 3.13.11 and 3.14.6 + your
> > patch is the one where listxattr(2) and friends do not NUL-terminate the
> > xattr names they return.  This is detailed in
> > <1402435203.24047.9.ca...@ceramic.home.fifi.org> I sent yesterday.
> 
> Oh, that's a bug in my patch.  The following incremental patch should
> fix it:
> 
> diff --git a/fs/nfs/nfs3acl.c b/fs/nfs/nfs3acl.c
> index e083827..8f854dd 100644
> --- a/fs/nfs/nfs3acl.c
> +++ b/fs/nfs/nfs3acl.c
> @@ -262,6 +262,7 @@ nfs3_list_one_acl(struct inode *inode, int type, const char *name, void *data,
>   posix_acl_release(acl);
>  
>   *result += strlen(name);
> + *result += 1;
>   if (!size)
>   return 0;
>   if (*result > size)

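The size accounting that the incremental patch restores can be illustrated in userspace. This is only a sketch, not the kernel code: `list_one_name` is a hypothetical helper, and the point it shows is that each listed name contributes strlen(name) + 1 bytes, so the terminating NUL is both counted and copied.

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical userspace analogue of a "list one name" step.
 * *result holds the number of bytes produced so far.  size == 0 means
 * the caller only wants to learn how much space is needed.
 */
static int list_one_name(const char *name, char *data, size_t size,
                         ssize_t *result)
{
    size_t len = strlen(name) + 1;      /* + 1 for the terminating NUL */
    size_t start = (size_t)*result;     /* where this name would go */

    *result += (ssize_t)len;
    if (size == 0)
        return 0;                       /* size query only, copy nothing */
    if ((size_t)*result > size)
        return -ERANGE;                 /* caller's buffer is too small */
    memcpy(data + start, name, len);    /* copies the NUL as well */
    return 0;
}
```

With "system.posix_acl_access" (23 characters) the size query yields 24, which matches the working 3.13 strace output later in this thread.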

I'm belatedly confirming that the two patches in
<20140607140414.ga26...@infradead.org> and
<20140611162238.ga28...@infradead.org>, applied together, fix the problem
I was seeing with NFSv3 client mounts on 3.14.x.

I've tried vanilla 3.14.6 + both patches and the regression is gone.
What remains is the issue where, on an NFSv3 client, system.posix_acl_access
is still present after clearing the ACLs with setfacl -b.

Phil.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Phantom ACL-related xattrs on 3.14.4 NFS client

2014-06-11 Thread Philippe Troin
Christoph,

On Wed, 2014-06-11 at 00:24 -0700, Christoph Hellwig wrote:
> On Tue, Jun 10, 2014 at 02:20:03PM -0700, Philippe Troin wrote:
> > Trond, Christoph,
> > 
> > Since my last email, I've been testing 3.14.6.
> > Stock 3.14.6 is still broken, and Christoph's patch does help, but does
> > not entirely cure the problem.
> 
> Can you send me the output of 
> 
> getfattr -n system.posix_acl_access -e hex 
> 
> for the working case, and the current kernel with my previous patch?

Here's the output on the broken kernel (vanilla 3.14.6 + your patch):

% mkdir x
% cd x
% getfacl .   
# file: .
# owner: phil
# group: phil
user::rwx
group::rwx
other::r-x

% getfattr -e hex -n system.posix_acl_access .
.: system.posix_acl_access: No such attribute
[2]1901 exit 1 getfattr -e hex -n system.posix_acl_access .
% setfacl -m u:root:r .   
% getfacl .   
# file: .
# owner: phil
# group: phil
user::rwx
user:root:r--
group::rwx
mask::rwx
other::r-x

% getfattr -e hex -n system.posix_acl_access .
# file: .

system.posix_acl_access=0x020001000700020004000400070017002500

% setfacl -b .
% getfacl .   
# file: .
# owner: phil
# group: phil
user::rwx
group::rwx
other::r-x

% getfattr -e hex -n system.posix_acl_access .
# file: .

system.posix_acl_access=0x020001000700040007002500

On a working system (3.13.11 + Fedora patches), the output is the same.
So there's no regression here between 3.13.11 and 3.14.6 + your patch.
I would argue that this behavior (system.posix_acl_access still present
after clearing the ACLs with setfacl -b) is wrong, and in fact there are no
traces of this xattr on the server, but it's not new.
I had missed that this counter-intuitive behavior was already in earlier
kernels.  My apologies.
Trond, what's your take on that one?

So, the only regression remaining between 3.13.11 and 3.14.6 + your
patch is the one where listxattr(2) and friends do not NUL-terminate the
xattr names they return.  This is detailed in
<1402435203.24047.9.ca...@ceramic.home.fifi.org> I sent yesterday.

Phil.



Re: Phantom ACL-related xattrs on 3.14.4 NFS client

2014-06-10 Thread Philippe Troin
Trond, Christoph,

Since my last email, I've been testing 3.14.6.
Stock 3.14.6 is still broken, and Christoph's patch does help, but does
not entirely cure the problem.

On Sat, 2014-06-07 at 19:48 -0700, Philippe Troin wrote:
> It's still broken, but in a different way.
> The phantom attrs are gone, but the attr/acl interaction is still
> uncertain.
> 
> I have tested vanilla  3.14.5 + this patch on x86_64.
> Mount options are the same as last time (NFSv3).
> 
> This is what I see on the client:
> 
> nfsv3client% mkdir x
> nfsv3client% cd x
> nfsv3client% getfattr -m '.*' .
> nfsv3client% getfacl .
> # file: .
> # owner: phil
> # group: phil
> user::rwx
> group::rwx
> other::r-x
> 
> OK so far: no more phantom attrs.
> This is where things get dodgy:
> 
> nfsv3client% setfacl -m u:root:r .   
> nfsv3client% getfacl .
> # file: .
> # owner: phil
> # group: phil
> user::rwx
> user:root:r--
> group::rwx
> mask::rwx
> other::r-x
> 
> nfsv3client% getfattr -m '.*' .
> [1]2123 segmentation fault  getfattr -m '.*' .
> % strace getfattr -m '.*' . 2>&1 | tail -n 20
> fstat(3, {st_mode=S_IFREG|0644, st_size=26254, ...}) = 0
> mmap(NULL, 26254, PROT_READ, MAP_SHARED, 3, 0) = 0x7f46a145
> close(3)= 0
> getrlimit(RLIMIT_NOFILE, {rlim_cur=1024, rlim_max=4*1024}) = 0
> lstat(".", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
> listxattr(".", NULL, 0) = 23
> listxattr(".", "system.posix_acl_access", 256) = 23
> brk(0)  = 0x1138000
> brk(0x1178000)  = 0x1178000
> brk(0)  = 0x1178000
> brk(0)  = 0x1178000
> brk(0x1159000)  = 0x1159000
> brk(0)  = 0x1159000
> mmap(NULL, 266240, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f46a140f000
> brk(0)  = 0x1159000
> brk(0)  = 0x1159000
> brk(0x1139000)  = 0x1139000
> brk(0)  = 0x1139000
> --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x11586e8} ---
> +++ killed by SIGSEGV +++
> [1]2311 segmentation fault  strace getfattr -m '.*' . 2>&1 |
>    2312 done  tail -n 20

I have since discovered that getfattr crashes because on an NFSv3 mount,
listxattr() does not NUL-terminate the attribute names.

Compare a broken 3.14.6:
listxattr(".", NULL, 0) = 23
listxattr(".", "system.posix_acl_access", 256) = 23
vs a working 3.13:
listxattr(".", NULL, 0)  = 24
listxattr(".", "system.posix_acl_access\0", 256) = 24

The above behavior happens with or without Christoph's patch.
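To illustrate why the missing terminator matters to a consumer of listxattr(2), here is a minimal sketch of a defensive walk over the returned buffer.  `count_xattr_names` is a hypothetical helper, not what getfattr actually does; it uses memchr() bounded by the returned length, so an unterminated last name is reported instead of being read past the end of the buffer.

```c
#include <stddef.h>
#include <string.h>

/*
 * Count the names in a listxattr()-style buffer of len bytes.
 * Returns -1 if a name is not NUL-terminated within the buffer,
 * instead of running off the end as an unbounded strlen() loop would.
 */
static int count_xattr_names(const char *buf, size_t len)
{
    int count = 0;
    size_t off = 0;

    while (off < len) {
        const char *end = memchr(buf + off, '\0', len - off);
        if (end == NULL)
            return -1;                   /* unterminated name */
        off = (size_t)(end - buf) + 1;   /* step past the NUL */
        count++;
    }
    return count;
}
```

Fed the 24-byte buffer from the working kernel it returns 1; fed the 23-byte buffer from the broken kernel it returns -1.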

Also, with Christoph's patch applied:

> On Sat, 2014-06-07 at 07:04 -0700, Christoph Hellwig wrote:
> > On Fri, Jun 06, 2014 at 04:37:03PM -0400, Trond Myklebust wrote:
> > > Christoph, what is the intended interface for telling
> > > posix_acl_xattr_list() that there are no acls on a particular file?
> > > Should there perhaps be a call to get_acl()?
> > 
> > The interface is to not call posix_acl_xattr_list unless you have ACLs.
> > Every implementation does this, except for generic_listxattr which is
> > only used by NFS.
> > 
> > Philippe, can you test the patch below?
> > 
> > 
> > diff --git a/fs/nfs/nfs3acl.c b/fs/nfs/nfs3acl.c
> > index 871d6ed..e083827 100644
> > --- a/fs/nfs/nfs3acl.c
> > +++ b/fs/nfs/nfs3acl.c
> > @@ -247,3 +247,45 @@ const struct xattr_handler *nfs3_xattr_handlers[] = {
> > &posix_acl_default_xattr_handler,
> > NULL,
> >  };
> > +
> > +static int
> > +nfs3_list_one_acl(struct inode *inode, int type, const char *name, void *data,
> > +   size_t size, ssize_t *result)
> > +{
> > +   struct posix_acl *acl;
> > +   char *p = data + *result;
> > +
> > +   acl = get_acl(inode, type);
> > +   if (!acl)
> > +   return 0;
> > +
> > +   posix_acl_release(a

Re: Phantom ACL-related xattrs on 3.14.4 NFS client

2014-06-09 Thread Philippe Troin
On Mon, 2014-06-09 at 10:46 -0400, J. Bruce Fields wrote:
> On Sat, Jun 07, 2014 at 07:48:21PM -0700, Philippe Troin wrote:
> > Hi Trond & Christoph,
> > 
> > It's still broken, but in a different way.
> > The phantom attrs are gone, but the attr/acl interaction is still
> > uncertain.
> > 
> > I have tested vanilla  3.14.5 + this patch on x86_64.
> > Mount options are the same as last time (NFSv3).
> > 
> > This is what I see on the client:
> > 
> > nfsv3client% mkdir x
> > nfsv3client% cd x
> > nfsv3client% getfattr -m '.*' .
> > nfsv3client% getfacl .
> > # file: .
> > # owner: phil
> > # group: phil
> > user::rwx
> > group::rwx
> > other::r-x
> > 
> > OK so far: no more phantom attrs.
> > This is where things get dodgy:
> > 
> > nfsv3client% setfacl -m u:root:r .   
> > nfsv3client% getfacl .
> > # file: .
> > # owner: phil
> > # group: phil
> > user::rwx
> > user:root:r--
> > group::rwx
> > mask::rwx
> > other::r-x
> > 
> > nfsv3client% getfattr -m '.*' .
> > [1]2123 segmentation fault  getfattr -m '.*' .
> 
> Is there a backtrace or anything in the system logs?

No, nothing but the SIGSEGV getting logged in dmesg.

Since I've tested on 3.14.5, 3.14.6 came out, and contains NFSd related
patches that look to address further ACL issues.  I'm going to be trying
that out.

Phil.



Re: Phantom ACL-related xattrs on 3.14.4 NFS client

2014-06-07 Thread Philippe Troin
Hi Trond & Christoph,

It's still broken, but in a different way.
The phantom attrs are gone, but the attr/acl interaction is still
uncertain.

I have tested vanilla  3.14.5 + this patch on x86_64.
Mount options are the same as last time (NFSv3).

This is what I see on the client:

nfsv3client% mkdir x
nfsv3client% cd x
nfsv3client% getfattr -m '.*' .
nfsv3client% getfacl .
# file: .
# owner: phil
# group: phil
user::rwx
group::rwx
other::r-x

OK so far: no more phantom attrs.
This is where things get dodgy:

nfsv3client% setfacl -m u:root:r .   
nfsv3client% getfacl .
# file: .
# owner: phil
# group: phil
user::rwx
user:root:r--
group::rwx
mask::rwx
other::r-x

nfsv3client% getfattr -m '.*' .
[1]2123 segmentation fault  getfattr -m '.*' .
% strace getfattr -m '.*' . 2>&1 | tail -n 20
fstat(3, {st_mode=S_IFREG|0644, st_size=26254, ...}) = 0
mmap(NULL, 26254, PROT_READ, MAP_SHARED, 3, 0) = 0x7f46a145
close(3)= 0
getrlimit(RLIMIT_NOFILE, {rlim_cur=1024, rlim_max=4*1024}) = 0
lstat(".", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
listxattr(".", NULL, 0) = 23
listxattr(".", "system.posix_acl_access", 256) = 23
brk(0)  = 0x1138000
brk(0x1178000)  = 0x1178000
brk(0)  = 0x1178000
brk(0)  = 0x1178000
brk(0x1159000)  = 0x1159000
brk(0)  = 0x1159000
mmap(NULL, 266240, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f46a140f000
brk(0)  = 0x1159000
brk(0)  = 0x1159000
brk(0x1139000)  = 0x1139000
brk(0)  = 0x1139000
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x11586e8} ---
+++ killed by SIGSEGV +++
[1]2311 segmentation fault  strace getfattr -m '.*' . 2>&1 |
   2312 done  tail -n 20

I have no idea why getfattr crashes right after the listxattr() syscall,
but it certainly doesn't on the NFSv3 server, nor with 3.13.
A quick check on the NFS server shows that the ACLs are correctly set:

nfsv3server% cd /path/to/x
nfsv3server% getfacl .
# file: .
# owner: phil
# group: phil
user::rwx
user:root:r--
group::rwx
mask::rwx
other::r-x

nfsv3server% getfattr -m '.*' .
# file: .
system.posix_acl_access

Back on the client, clearing the ACL confuses the client further:

nfsv3client% setfacl -b .   
nfsv3client% getfacl .  
# file: .
# owner: phil
# group: phil
user::rwx
group::rwx
other::r-x

nfsv3client% strace getfattr -m '.*' . 2>&1 | tail -n 20
fstat(3, {st_mode=S_IFREG|0644, st_size=26254, ...}) = 0
mmap(NULL, 26254, PROT_READ, MAP_SHARED, 3, 0) = 0x7fc7e3f9a000
close(3)= 0
getrlimit(RLIMIT_NOFILE, {rlim_cur=1024, rlim_max=4*1024}) = 0
lstat(".", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
listxattr(".", NULL, 0) = 23
listxattr(".", "system.posix_acl_access", 256) = 23
brk(0)  = 0x1655000
brk(0x1695000)  = 0x1695000
brk(0)  = 0x1695000
brk(0)  = 0x1695000
brk(0x1676000)  = 0x1676000
brk(0)  = 0x1676000
mmap(NULL, 266240, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fc7e3f59000
brk(0)  = 0x1676000
brk(0)  = 0x1676000
brk(0x1656000)  = 0x1656000
brk(0)  = 0x1656000
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x16756e8} ---
+++ killed by SIGSEGV +++
[1]2353 segmentation fault  strace getfattr -m '.*' . 2>&1 |
   2354 done  tail -n 20
nfsv3client% getfattr -n system.posix_acl_access .
# file: .
system.posix_acl_access=0sAgEABwD/BAAHAP8gAAUA/w==

See how:
  * getfacl says there's no ACLs
  * getfattr says there's still a system.posix_acl_access attr.
Interestingly, the server says othe

Phantom ACL-related xattrs on 3.14.4 NFS client

2014-06-06 Thread Philippe Troin
This happens on an NFS client running on:
Linux ceramic32 3.14.4 #1 SMP Fri May 30 00:52:07 PDT 2014 i686 i686 i386 GNU/Linux
(also happens on x86_64).

The NFS server can be either 3.14 or 3.13, it doesn't change a thing.

Mount options are:
(from /proc/mtab)
ceramic:/export/home/phil /home/phil nfs rw,nosuid,nodev,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=172.17.1.2,mountvers=3,mountport=20048,mountproto=tcp,local_lock=none,addr=172.17.1.2 0 0
(This is NFSv3)

The symptom:

Run getfacl on any NFS inode.  See there are no ACLs:

% getfacl .
# file: .
# owner: phil
# group: phil
user::rwx
group::r-x
other::r-x

Yet, getfattr says there are some acl-related xattrs:

% getfattr -m '.*' .
# file: .
system.posix_acl_access
system.posix_acl_default

But when I try to retrieve these phantom xattrs, I get errors:

% getfattr -n system.posix_acl_access . 
.: system.posix_acl_access: No such attribute
[1]1136 exit 1 getfattr -n system.posix_acl_access .
% getfattr -n system.posix_acl_default .
.: system.posix_acl_default: No such attribute
[1]1146 exit 1 getfattr -n system.posix_acl_default .

I noticed this because it breaks the patch utility.

This is a regression from 3.13, probably due to the 3.14 NFS ACL overhaul.

I'm ready & willing to try patches.

Phil.



Re: Write is not atomic?

2012-10-15 Thread Philippe Troin
On Tue, 2012-10-16 at 10:13 +1100, Dave Chinner wrote:
> On Mon, Oct 15, 2012 at 11:36:15PM +0200, Juliusz Chroboczek wrote:
> > Hi,
> > 
> > The Linux manual page for write(2) says:
> > 
> > The adjustment of the file offset and the write operation are
> > performed as an atomic step.
> 
> That's wrong. The file offset update is not synchronised at all with
> the write, and for a shared fd the update will race.

That's what O_APPEND or pread/pwrite are for.

> > This is apparently an extension to POSIX, which says
> > 
> > This volume of IEEE Std 1003.1-2001 does not specify behavior of
> > concurrent writes to a file from multiple processes. Applications
> > should use some form of concurrency control.
> 
> This is how Linux behaves.
> 
> > The following fragment of code
> > 
> > int fd;
> > fd = open("exemple", O_CREAT | O_WRONLY | O_TRUNC, 0666);
> > fork();
> > write(fd, "Ouille", 6);
> > close(fd);

can be replaced with:

int fd;
fd = open("exemple", O_CREAT | O_WRONLY | O_TRUNC | O_APPEND, 0666);
fork();
write(fd, "Ouille", 6);
close(fd);

or:

int fd;
fd = open("exemple", O_CREAT | O_WRONLY | O_TRUNC, 0666);
pid_t pid = fork();
pwrite(fd, "Ouille", 6, strlen("Ouille")*(pid == 0));
close(fd);

(both code fragments untested)
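For anyone who wants to try the O_APPEND variant, here is a self-contained sketch with error checking added.  The helper name `append_from_two_processes` and the path passed to it are made up for illustration; the child uses _exit() so that it does not return into the caller, and the parent reports the final file size.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Parent and child each write 6 bytes through the same O_APPEND
 * descriptor.  Returns the final file size, or -1 on error.
 */
static long append_from_two_processes(const char *path)
{
    int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC | O_APPEND, 0666);
    if (fd == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fd);
        return -1;
    }

    /* Executed by both processes, as in the original fragment. */
    ssize_t n = write(fd, "Ouille", 6);

    if (pid == 0)
        _exit(n == 6 ? 0 : 1);          /* child stops here */

    int status;
    struct stat st;
    long size = -1;

    if (waitpid(pid, &status, 0) == pid && WIFEXITED(status) &&
        WEXITSTATUS(status) == 0 && n == 6 && fstat(fd, &st) == 0)
        size = (long)st.st_size;

    close(fd);
    return size;
}
```

With O_APPEND each write lands at the current end of file, so neither process overwrites the other and the file ends up 12 bytes long.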

Phil.



mptsas drops then re-adds hard drive

2007-07-09 Thread Philippe Troin
System info:
Linux 2.6.20-1.2320.fc5 SMP x86_64

lspci:
00:06.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8111 PCI (rev 07)
00:07.0 ISA bridge: Advanced Micro Devices [AMD] AMD-8111 LPC (rev 05)
00:07.1 IDE interface: Advanced Micro Devices [AMD] AMD-8111 IDE (rev 03)
00:07.2 SMBus: Advanced Micro Devices [AMD] AMD-8111 SMBus 2.0 (rev 02)
00:07.3 Bridge: Advanced Micro Devices [AMD] AMD-8111 ACPI (rev 05)
00:07.5 Multimedia audio controller: Advanced Micro Devices [AMD] AMD-8111 AC97 Audio (rev 03)
00:0a.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8131 PCI-X Bridge (rev 12)
00:0a.1 PIC: Advanced Micro Devices [AMD] AMD-8131 PCI-X IOAPIC (rev 01)
00:0b.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8131 PCI-X Bridge (rev 12)
00:0b.1 PIC: Advanced Micro Devices [AMD] AMD-8131 PCI-X IOAPIC (rev 01)
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
00:19.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:19.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:19.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:19.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
01:00.0 USB Controller: Advanced Micro Devices [AMD] AMD-8111 USB (rev 0b)
01:00.1 USB Controller: Advanced Micro Devices [AMD] AMD-8111 USB (rev 0b)
01:0a.0 USB Controller: ALi Corporation USB 1.1 Controller (rev 03)
01:0a.1 USB Controller: ALi Corporation USB 1.1 Controller (rev 03)
01:0a.2 USB Controller: ALi Corporation USB 1.1 Controller (rev 03)
01:0a.3 USB Controller: ALi Corporation USB 2.0 Controller (rev 01)
01:0a.4 FireWire (IEEE 1394): ALi Corporation M5253 P1394 OHCI 1.1 Controller
01:0c.0 FireWire (IEEE 1394): Texas Instruments TSB43AB22/A IEEE-1394a-2000 Controller (PHY/Link)
02:07.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068 PCI-X Fusion-MPT SAS (rev 01)
02:09.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5703X Gigabit Ethernet (rev 02)
03:06.0 SCSI storage controller: Adaptec AIC-7892A U160/m (rev 02)
04:00.0 Host bridge: Advanced Micro Devices [AMD] AMD-8151 System Controller (rev 13)
04:01.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8151 AGP Bridge (rev 13)
05:00.0 VGA compatible controller: ATI Technologies Inc RV280 [Radeon 9200] (rev 01)
05:00.1 Display controller: ATI Technologies Inc RV280 [Radeon 9200] (Secondary) (rev 01)

% cat /proc/scsi/mptsas/1
ioc0: LSISAS1068, FwRev=0112h, Ports=1, MaxQ=511


I have four SATA drives attached to the SAS1068X-R controller, all
four are Seagate ST3750640AS (750 GB Barracuda 7200.10) with firmware
revision 3.AAE.

I was doing load-testing for a RAID array and running badblocks in a
loop over all four drives in parallel.  One drive was dropped after
about 24h.  By dropped I mean the device node /dev/sdb became
"unresponsive" and the drive was re-added with another device node
(sdj).

This is the kernel message:

mptbase: ioc0: LogInfo(0x31110d00): Originator={PL}, Code={Reset}, SubCode(0x0d00)
mptbase: ioc0: LogInfo(0x31110d00): Originator={PL}, Code={Reset}, SubCode(0x0d00)
mptbase: ioc0: LogInfo(0x3117): Originator={PL}, Code={IO Device Missing Delay Retry}, SubCode(0x)
mptbase: ioc0: LogInfo(0x3113): Originator={PL}, Code={IO Not Yet Executed}, SubCode(0x)
sd 1:0:0:0: SCSI error: return code = 0x0001
end_request: I/O error, dev sdb, sector 330342400
sd 1:0:0:0: SCSI error: return code = 0x0001
end_request: I/O error, dev sdb, sector 330342400
Buffer I/O error on device sdb, logical block 41292800
Buffer I/O error on device sdb, logical block 41292801
Buffer I/O error on device sdb, logical block 41292802
Buffer I/O error on device sdb, logical block 41292803
Buffer I/O error on device sdb, logical block 41292804
Buffer I/O error on device sdb, logical block 41292805
Buffer I/O error on device sdb, logical block 41292806
Buffer I/O error on device sdb, logical block 41292807
Buffer I/O error on device sdb, logical block 41292808
Buffer I/O error on device sdb, logical block 41292809
sd 1:0:0:0: SCSI error: return code = 0x0001
end_request: I/O error, dev sdb, sector 330342400

lots more of these, until finally:

mptbase: ioc0: LogInfo(0x3000): Originator={PL}, Code={Reset}, SubCode(0x1000)
mptbase: ioc0: LogInfo(0x3000): Originator={PL}, Code={Reset}, SubCode(0x1000)
mptsas: ioc0: attaching sata device, channel 0, id 13, phy 0
scsi 1:0:4:0: Direct-Access ATA  ST3750640AS  EPQ: 0 ANSI: 5
SCSI device sdj: 1465149168 512-byte hdwr sectors (750156 MB)
sdj: Write Protect is off
SCSI device sdj: write cache

Re: scsi: Devices offlined

2007-03-20 Thread Philippe Troin
Wakko Warner <[EMAIL PROTECTED]> writes:

> [84797.683873] sr 1:0:13:0: scsi: Device offlined - not ready after error 
> recovery
> 
> Is there anyway to make the kernel "online" a device that has done this? 
> I've had this happen on various devices (mostly on usb where I can
> unplug/replug), but this time, it's on a scsi controller and the driver is
> not a module.
> 
> If it's possible to do this w/o rebooting, I'd like to know for when I have
> this happen in the future.

Have you tried:

  echo "scsi remove-single-device HOST CHANNEL ID LUN" > /proc/scsi/scsi
  echo "scsi add-single-device HOST CHANNEL ID LUN" > /proc/scsi/scsi

?

Phil.


Re: O_NONBLOCK setting "leak" outside of a process??

2007-02-02 Thread Philippe Troin
Roland Kuhn <[EMAIL PROTECTED]> writes:

> Hi Guillaume!
> 
> On 2 Feb 2007, at 14:48, Guillaume Chazarain wrote:
> 
> > 2007/2/2, Roland Kuhn <[EMAIL PROTECTED]>:
> >
> >> That's a bug, right?
> >
> > No, if you want something like: (echo toto; date; echo titi) > file
> > to work in your shell, you'll be happy to have the seek position
> > shared in the processes.

Absolutely right.  This has been part of Unix since the beginning.

> As a naive user I'd probably expect that each of the above adds to
> the output, which perfectly fits the O_APPEND flag (to be set by the
> shell, of course).

No, no, O_APPEND has slightly different semantics.

> The immediate point was about the flags, though, and having
> O_NONBLOCK on or off certainly is a _design_ choice when writing a
> program. If I remove O_NONBLOCK, I have a right to expect that I/O
> functions do not return EAGAIN!

Generally you don't want to mess with shared resources like stdin,
stdout and stderr.

Phil.


Re: O_NONBLOCK setting "leak" outside of a process??

2007-02-01 Thread Philippe Troin
Denis Vlasenko <[EMAIL PROTECTED]> writes:

> What share the same file descriptor? MC and programs started from it?

All the processes started from your shell share at least fds 0, 1 and 2.
 
> I thought after exec() fds are either closed (if CLOEXEC) or
> become independent from the parent process
> (i.e. if you seek, close, etc. your fd, the parent would not notice that).
> 
> Am I wrong?

I'm afraid so.  Seek position and flags are still shared after an
exec.
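A small sketch of the shared seek position, under the assumption that a plain fork() is enough to show it (exec does not change which open file description a descriptor refers to, so it is left out here).  The helper name and the temporary file template are made up for illustration.

```c
#define _XOPEN_SOURCE 700
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * The child seeks on an inherited descriptor and exits; the parent then
 * asks for its own current offset.  Returns that offset, or -1 on error.
 */
static long offset_seen_by_parent_after_child_seek(void)
{
    char path[] = "/tmp/shared-offset-XXXXXX";
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);                       /* file goes away once fd is closed */

    pid_t pid = fork();
    if (pid == -1) {
        close(fd);
        return -1;
    }
    if (pid == 0)                       /* child: move the offset to 5 */
        _exit(lseek(fd, 5, SEEK_SET) == 5 ? 0 : 1);

    int status;
    off_t pos = -1;
    if (waitpid(pid, &status, 0) == pid && WIFEXITED(status) &&
        WEXITSTATUS(status) == 0)
        pos = lseek(fd, 0, SEEK_CUR);   /* parent: where are we now? */

    close(fd);
    return (long)pos;
}
```

The parent never seeks itself, yet it observes offset 5, because both descriptors refer to the same open file description.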

Phil.


Re: software read-only flag for rw partition or disk ?

2007-01-29 Thread Philippe Troin
"Yakov Lerner" <[EMAIL PROTECTED]> writes:

> Does /proc have any entries to flip the "software read-only flag"
> for a partition or disk (which are physically read-write) ?

No, but you can use blockdev --setro /dev/hdXX

Phil.


Re: O_NONBLOCK setting "leak" outside of a process??

2007-01-29 Thread Philippe Troin
Denis Vlasenko <[EMAIL PROTECTED]> writes:

> Hi,
> 
> I am currently on Linux 2.6.18, x86_64.
> I came across strange behavior while working on one
> of busybox applets. I narrowed it down to these two
> trivial testcases:
> 
> #include <unistd.h>
> #include <fcntl.h>
> int main() {
> fcntl(0, F_SETFL, fcntl(0, F_GETFL, 0) | O_NONBLOCK);
> return 0;
> }
> 
> #include <unistd.h>
> #include <fcntl.h>
> int main() {
> fcntl(0, F_SETFL, fcntl(0, F_GETFL, 0) & ~O_NONBLOCK);
> return 0;
> }
> 
> If I run "nonblock" in Midnight Commander in KDE's Konsole,
> screen redraw starts to work ~5 times slower. For example,
> Ctrl-O ("show/hide panels" in MC) takes ~0.5 sec to redraw.
> This persists after the program exits (which it
> does immediately as you see).
> Running "block" reverts things to normal.
> 
> I mean: how can O_NONBLOCK _issued in a process which
> already exited_ have any effect whatsoever on MC or Konsole?
> They can't even know that it did it, right?
> 
> Either I do not know something subtle about Unix or some sort
> of bug is at work.

Because they all share the same open file description for stdin, and
therefore the same file status flags (O_NONBLOCK included)?
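The effect can be reproduced without a terminal.  In this sketch a pipe stands in for the shared stdin (an assumption made to keep the example self-contained), and `nonblock_leaks_from_child` is a hypothetical helper: a child sets O_NONBLOCK on the inherited descriptor and exits, and the parent then reads back its own flags.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if O_NONBLOCK set by an already-exited child is visible in
 * the parent, 0 if it is not, -1 on error.
 */
static int nonblock_leaks_from_child(void)
{
    int p[2];
    if (pipe(p) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    if (pid == 0) {                     /* child: same call as "nonblock" */
        int fl = fcntl(p[0], F_GETFL);
        _exit(fl != -1 && fcntl(p[0], F_SETFL, fl | O_NONBLOCK) == 0 ? 0 : 1);
    }

    int status;
    int result = -1;
    if (waitpid(pid, &status, 0) == pid && WIFEXITED(status) &&
        WEXITSTATUS(status) == 0) {
        int fl = fcntl(p[0], F_GETFL);  /* parent never set the flag */
        if (fl != -1)
            result = (fl & O_NONBLOCK) != 0;
    }

    close(p[0]);
    close(p[1]);
    return result;
}
```

The function returns 1: the flag outlives the process that set it, which is the "leak" described in the report.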

Phil.


Re: xor as a lazy comparison

2005-07-25 Thread Philippe Troin
Lee Revell <[EMAIL PROTECTED]> writes:

> On Mon, 2005-07-25 at 13:55 -0400, Steven Rostedt wrote: 
> > Doesn't matter. The cycles saved for old compilers is not rational to
> > have obfuscated code.
> 
> Where do we draw the line with this?  Is x *= 2 preferable to x <<= 2 as
> well?

Depends if you want to multiply by 2 or 4 :-)

Phil.


Re: Lack of Documentation about SA_RESTART...

2005-07-11 Thread Philippe Troin
"Theodore Ts'o" <[EMAIL PROTECTED]> writes:

> On Mon, Jul 11, 2005 at 12:32:37PM +0200, Paolo Ornati wrote:
> > But what I'm looking for is a list of syscalls that are automatically
> > restarted when SA_RESTART is set, and especially in what conditions.
> > 
> > For example: read(), write(), open() are obviously restarted, but even
> > on non-blocking fd?
> > And what about connect() and select() for example?
> > 
> > There are a lot of syscalls that can fail with "EINTR"! What's the
> > advantage of using SA_RESTART if one doesn't know what syscalls are
> > restarted?
> 
> According to the Single Unix Specification V3, all functions that
> return EINTR are supposed to restart if a process receives a signal
> where signal handler has been installed with the SA_RESTART flag.  

Except for select() and poll(), which should always return EINTR even
when interrupted with a SA_RESTART signal.
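This is easy to check on Linux.  The sketch below installs a SIGALRM handler with SA_RESTART, arms a one-shot 50 ms timer, and then sleeps in select() with a 2 second timeout; the helper name and the timings are arbitrary choices for illustration.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <sys/time.h>

static void on_alarm(int sig)
{
    (void)sig;                          /* nothing to do, just interrupt */
}

/*
 * Returns 1 if select() failed with EINTR even though the handler was
 * installed with SA_RESTART, 0 if it returned some other way, -1 on
 * setup error.
 */
static int select_returns_eintr_with_sa_restart(void)
{
    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sa.sa_flags = SA_RESTART;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, &old) == -1)
        return -1;

    /* One-shot timer: SIGALRM after 50 ms, no repeat. */
    struct itimerval it = { .it_interval = { 0, 0 }, .it_value = { 0, 50000 } };
    if (setitimer(ITIMER_REAL, &it, NULL) == -1) {
        sigaction(SIGALRM, &old, NULL);
        return -1;
    }

    struct timeval timeout = { 2, 0 };  /* far longer than the timer */
    int rc = select(0, NULL, NULL, NULL, &timeout);
    int saved_errno = errno;

    sigaction(SIGALRM, &old, NULL);     /* restore previous disposition */
    return rc == -1 && saved_errno == EINTR;
}
```

On Linux this returns 1, so callers of select() and poll() still need their own EINTR loop regardless of SA_RESTART.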

Phil.


Re: Disturbing wide variation in execution time

2005-07-07 Thread Philippe Troin
Sheo Shanker Prasad <[EMAIL PROTECTED]> writes:

> I will appreciate your help in eliminating a disturbing wide
> variation (by a factors of 2 to 2.5) in the execution time of a test
> (execution benchmark) program under identical conditions even when
> the machine is freshly started (rebooted) and no other user program
> is running (not even e-mail or Internet browser).
> 
> I have a dual Opteron 250 (2.4 GHz) running SuSE 9.3 Pro & Linux
> version 2.6.11.4-21.7-smp ([EMAIL PROTECTED]) (gcc version 3.3.5
> 20050117 (prerelease) (SUSE Linux)) #1 SMP Thu Jun 2 14:23:14 UTC
> 2005. The motherboard is Tyan Thunder K8W (S2885 ANRF) with AMI BIOS
> 
> The machine has 4GB of PC3200 DDR RAM, two dimms on each CPU.
> 
> The original machine bought from a vendor about 6 months ago. At
> that time it was running SuSE 9.1 Pro and the execution time for the
> same test program was consistently the same (around 2m 37s +/- a few
> %). Then the mother board failed and the machine went totally
> dead. The vendor then replaced the failed motherboard with a new
> Tyan Thunder K8W and installed the SuSE 9.3. I am not sure whether
> or not the AMI BIOS was also replaced.
> 
> When the repaired machine was started, I began to notice the
> disturbing wide variation and the frequect significant slow down of
> the machine as exhibited by the factor of 2 to 2.5 increased
> execution time of the test program as described above.  Sometimes it
> would be quite fast (executing at the original 2m 40s) and sometime
> a factor of 2.5 slow, and sometimes with speed in between.

8< snip >8

 1. Are you running an i386 kernel or an x86_64 kernel?

 2. Which BIOS version?

 3. Is node interleaving enabled in the BIOS?

Phil.


Re: 2.4.29 sk98lin patch for Asus K8W SE Deluxe

2005-03-03 Thread Philippe Troin
Willy Tarreau <[EMAIL PROTECTED]> writes:

> On Wed, Mar 02, 2005 at 02:00:30PM -0800, Philippe Troin wrote:
>   
> > +   /* Asus K8V Se Deluxe bugfix. Correct VPD content */
> > +   /* MBo April 2004 */
> > +   if( ((unsigned char)pAC->vpd.vpd_buf[0x3f] == 0x38) &&
> > +   ((unsigned char)pAC->vpd.vpd_buf[0x40] == 0x3c) &&
> > +   ((unsigned char)pAC->vpd.vpd_buf[0x41] == 0x45) ) {
> > +   printk("sk98lin : humm... Asus mainboard with buggy VPD ? 
> > correcting data.\n");
>   ^
> Please, could you put some KERN_XXX here to avoid a buggy message level ?

Yes, of course.

Phil.

Signed-Off-By: Philippe Troin <[EMAIL PROTECTED]>

diff -ruN linux-2.4.29.orig/drivers/net/sk98lin/skvpd.c linux-2.4.29/drivers/net/sk98lin/skvpd.c
--- linux-2.4.29.orig/drivers/net/sk98lin/skvpd.c   Wed Apr 14 06:05:30 2004
+++ linux-2.4.29/drivers/net/sk98lin/skvpd.c   Mon Feb 21 02:03:00 2005
@@ -466,6 +466,15 @@

pAC->vpd.vpd_size = vpd_size;
 
+   /* Asus K8V Se Deluxe bugfix. Correct VPD content */
+   /* MBo April 2004 */
+   if( ((unsigned char)pAC->vpd.vpd_buf[0x3f] == 0x38) &&
+   ((unsigned char)pAC->vpd.vpd_buf[0x40] == 0x3c) &&
+   ((unsigned char)pAC->vpd.vpd_buf[0x41] == 0x45) ) {
+   printk(KERN_INFO "sk98lin : humm... Asus mainboard with buggy VPD ? correcting data.\n");
+   (unsigned char)pAC->vpd.vpd_buf[0x40] = 0x38;
+   }
+
/* find the end tag of the RO area */
if (!(r = vpd_find_para(pAC, VPD_RV, &rp))) {
SK_DBG_MSG(pAC, SK_DBGMOD_VPD, SK_DBGCAT_ERR | SK_DBGCAT_FATAL,


2.4.29 sk98lin patch for Asus K8W SE Deluxe

2005-03-02 Thread Philippe Troin
The EEPROM (or whatever that is) on Asus K8V SE Deluxe motherboards
contains buggy firmware.  This buggy firmware has one flipped bit, and
causes the sk98lin driver refuses to work correctly.  Please look at
this thread:

  http://www.ussg.iu.edu/hypermail/linux/kernel/0404.0/1439.html

It contains a patch for 2.6 that fixes the problem.  Enclosed is a copy
of this patch for 2.4.29.  Please consider applying.

Phil.

Signed-Off-By: Philippe Troin <[EMAIL PROTECTED]>

diff -ruN linux-2.4.29.orig/drivers/net/sk98lin/skvpd.c linux-2.4.29/drivers/net/sk98lin/skvpd.c
--- linux-2.4.29.orig/drivers/net/sk98lin/skvpd.c   Wed Apr 14 06:05:30 2004
+++ linux-2.4.29/drivers/net/sk98lin/skvpd.c   Mon Feb 21 02:03:00 2005
@@ -466,6 +466,15 @@

pAC->vpd.vpd_size = vpd_size;
 
+   /* Asus K8V Se Deluxe bugfix. Correct VPD content */
+   /* MBo April 2004 */
+   if( ((unsigned char)pAC->vpd.vpd_buf[0x3f] == 0x38) &&
+   ((unsigned char)pAC->vpd.vpd_buf[0x40] == 0x3c) &&
+   ((unsigned char)pAC->vpd.vpd_buf[0x41] == 0x45) ) {
+   printk("sk98lin : humm... Asus mainboard with buggy VPD ? correcting data.\n");
+   (unsigned char)pAC->vpd.vpd_buf[0x40] = 0x38;
+   }
+
/* find the end tag of the RO area */
if (!(r = vpd_find_para(pAC, VPD_RV, &rp))) {
SK_DBG_MSG(pAC, SK_DBGMOD_VPD, SK_DBGCAT_ERR | SK_DBGCAT_FATAL,


Static with esound (esd) and via82cxxx_audio on 2.4.29.

2005-03-02 Thread Philippe Troin
Running 

Linux 2.4.29 #1 SMP Mon Feb 21 02:11:56 PST 2005 i686 unknown

on an Asus K8V SE Deluxe, BIOS 1005, with an embedded via82cxxx audio
controller:

  00:11.5 Multimedia audio controller: VIA Technologies, Inc. AC97 Audio 
Controller (rev 60)
Subsystem: Asustek Computer, Inc.: Unknown device 80b0
Control: I/O+ Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- \
 ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium \
>TAbort- SERR- 
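   #include <errno.h>
   #include <fcntl.h>
   #include <stdio.h>
   #include <stdlib.h>
   #include <string.h>
   #include <unistd.h>
   #include <sys/ioctl.h>
   #include <sys/soundcard.h>

   /* Abort with a message if a system call returns -1. */
   #define CK(args)                                            \
     if ((args) == -1)                                         \
       {                                                       \
         fprintf(stderr, "%s: %s\n", #args, strerror(errno));  \
         exit(1);                                              \
       }

   int main()
   {
     int fd;

     /* Open the DSP device, post an empty buffer, then wait. */
     CK(fd = open("/dev/dsp", O_WRONLY));
     CK(ioctl(fd, SNDCTL_DSP_POST, 0));
     sleep(1);
     return 0;
   }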

I've tried backporting 1.9.1-ac4 from 2.6.x, but it has the same
problems.

Phil.


Unexpected I/O APIC on i386 2.4.29 / Tyan Thunder K8W (S2885)

2005-01-25 Thread Philippe Troin
Philippe Troin <[EMAIL PROTECTED]> writes:

[Please CC me on the replies]

This is seen on a dual Opteron 242 set-up with 2 GB of RAM running an i386
kernel (not x86_64).
2.4.27 and 2.4.28 also showed the problem.

Enclosed is the dmesg log and the lspci -vvv output.

Configuration available upon request (if it makes a difference).

Phil.

Linux version 2.4.29 ([EMAIL PROTECTED]) (gcc version 2.95.4 20011002 (Debian 
prerelease)) #1 SMP Mon Jan 24 12:48:51 PST 2005
BIOS-provided physical RAM map:
 BIOS-e820:  - 0009fc00 (usable)
 BIOS-e820: 0009fc00 - 000a (reserved)
 BIOS-e820: 000e - 0010 (reserved)
 BIOS-e820: 0010 - 7fff (usable)
 BIOS-e820: 7fff - 7000 (ACPI data)
 BIOS-e820: 7000 - 8000 (ACPI NVS)
 BIOS-e820: ff78 - 0001 (reserved)
1151MB HIGHMEM available.
896MB LOWMEM available.
found SMP MP-table at 000ff780
hm, page 000ff000 reserved twice.
hm, page 0010 reserved twice.
hm, page 000fa000 reserved twice.
hm, page 000fb000 reserved twice.
On node 0 totalpages: 524272
zone(0): 4096 pages.
zone(1): 225280 pages.
zone(2): 294896 pages.
ACPI: RSDP (v002 ACPIAM) @ 0x000f6a50
ACPI: XSDT (v001 A M I  OEMXSDT  0x09000423 MSFT 0x0097) @ 0x7fff0100
ACPI: FADT (v001 A M I  OEMFACP  0x09000423 MSFT 0x0097) @ 0x7fff0281
ACPI: MADT (v001 A M I  OEMAPIC  0x09000423 MSFT 0x0097) @ 0x7fff0380
ACPI: OEMB (v001 A M I  OEMBIOS  0x09000423 MSFT 0x0097) @ 0x7040
ACPI: SRAT (v001 A M I  OEMSRAT  0x09000423 MSFT 0x0097) @ 0x7fff3be0
ACPI: HPET (v001 A M I  OEMHPET  0x09000423 MSFT 0x0097) @ 0x7fff3cd0
ACPI: ASF! (v001 AMIASF AMDSTRET 0x0001 INTL 0x02002026) @ 0x7fff3d10
ACPI: DSDT (v001  0 0001 0x0001 INTL 0x02002026) @ 0x
ACPI: Local APIC address 0xfee0
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 Unknown CPU [15:5] APIC version 16
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
Processor #1 Unknown CPU [15:5] APIC version 16
ACPI: IOAPIC (id[0x02] address[0xfec0] global_irq_base[0x0])
IOAPIC[0]: Assigned apic_id 2
IOAPIC[0]: apic_id 2, version 17, address 0xfec0, IRQ 0-23
ACPI: IOAPIC (id[0x03] address[0xff4fe000] global_irq_base[0x18])
IOAPIC[1]: Assigned apic_id 3
IOAPIC[1]: apic_id 3, version 17, address 0xff4fe000, IRQ 24-27
ACPI: IOAPIC (id[0x04] address[0xff4ff000] global_irq_base[0x1c])
IOAPIC[2]: Assigned apic_id 4
IOAPIC[2]: apic_id 4, version 17, address 0xff4ff000, IRQ 28-31
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
Using ACPI (MADT) for SMP configuration information
Kernel command line: vga=0xfffe ramdisk=0 root=/dev/discs/disc0/part1 ro
Initializing CPU#0
Detected 1592.948 MHz processor.
Console: colour VGA+ 80x50
Calibrating delay loop... 3178.49 BogoMIPS
Memory: 2069576k/2097088k available (1313k kernel code, 27124k reserved, 554k 
data, 108k init, 1179584k highmem)
Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes)
Inode cache hash table entries: 131072 (order: 8, 1048576 bytes)
Mount cache hash table entries: 512 (order: 0, 4096 bytes)
Buffer cache hash table entries: 131072 (order: 7, 524288 bytes)
Page-cache hash table entries: 524288 (order: 9, 2097152 bytes)
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 1024K (64 bytes/line)
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU: After generic, caps: 078bfbff e1d3fbff  
CPU: Common caps: 078bfbff e1d3fbff  
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
mtrr: v1.40 (20010327) Richard Gooch ([EMAIL PROTECTED])
mtrr: detected mtrr type: Intel
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 1024K (64 bytes/line)
Intel machine check reporting enabled on CPU#0.
CPU: After generic, caps: 078bfbff e1d3fbff  
CPU: Common caps: 078bfbff e1d3fbff  
CPU0: AMD Opteron(tm) Processor 242 stepping 08
per-CPU timeslice cutoff: 2926.30 usecs.
enabled ExtINT on CPU#0
ESR value before enabling vector: 0004
ESR value after enabling vector: 
Booting processor 1/1 eip 3000
Initializing CPU#1
masked ExtINT on CPU#1
ESR value before enabling vector: 
ESR value after enabling vector: 
Calibrating delay loop... 3185.04 BogoMIPS
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 1024K (64 bytes/line)
Intel machine check reporting enabled on CPU#1.
CPU: After generic, caps: 078bfbff e1d3fbff  
CPU: Common caps: 078bfbff e1d3fbff  
CPU1: AMD Opteron(tm) Processor 242 steppi

Re: Lost O_NONBLOCK (Bug?)

2001-04-19 Thread Philippe Troin

Jason Gunthorpe <[EMAIL PROTECTED]> writes:

> On 12 Apr 2001, Philippe Troin wrote:
> 
> > Apt I guess ? It has a very strange behavior when backgrounded...
> 
> Not really, just want it tries to run dpkg it hangs.
> 
> > > The last read was after the process was forgrounded. The read waits
> > > forever, the non-block flag seems to have gone missing. It is also a
> > > little odd I think that it repeated to get SIGTTIN which was never
> > > actually delivered to the program.. Shouldn't SIGTTIN suspend the process?
>  
> > Strace can perturbate signal delivery, especially for terminal-related
> > signals, I wouldn't trust it...
> 
> I know, the problem still happens without strace.

Do you have a snippet that can reproduce the problem ? Does this
happen only with 2.4, or do both 2.2 and 2.4 have the problem ?

> > O_NONBLOCK is not lost... Attempting to read from the controlling tty
> > even from a O_NONBLOCK descriptor will trigger SIGTTIN.
> 
> I don't really care about the SIGTTIN, what bugs me is that the read that
> happens after the process has been foregrounded blocks - and that should
> not be.

True.

8< snip >8

Phil.



Re: Lost O_NONBLOCK (Bug?)

2001-04-12 Thread Philippe Troin

Jason Gunthorpe <[EMAIL PROTECTED]> writes:

> I've run into the following weird behavior on my system with 2.4.0. I have
> the following code:

Apt I guess ? It has a very strange behavior when backgrounded...

> if (fork() == 0)
> {
> int Flags,dummy;
> if ((Flags = fcntl(STDIN_FILENO,F_GETFL,dummy)) < 0)
> _exit(100);
> if (fcntl(STDIN_FILENO,F_SETFL,Flags | O_NONBLOCK) < 0)
>  _exit(100);
> while (read(STDIN_FILENO,&dummy,1) == 1);
> if (fcntl(STDIN_FILENO,F_SETFL,Flags & (~(long)O_NONBLOCK)) < 0)
>  _exit(100);
>  
> // exec something
> }
> 
> Which works fine, unless the parent process was backgrounded by the shell
> (^Z then bg).  If that is the case then the O_NONBLOCK seems to be lost. I
> straced this: 
> 
> fcntl(0, F_GETFL)   = 0x2 (flags O_RDWR)
> fcntl(0, F_SETFL, O_RDWR|O_NONBLOCK)= 0
> read(0, 0xbfffea38, 1)  = ? ERESTARTSYS (To be restarted)
> --- SIGTTIN (Stopped (tty input)) ---
> --- SIGTTIN (Stopped (tty input)) ---
> read(0, 0xbfffea38, 1)  = ? ERESTARTSYS (To be restarted)
> --- SIGTTIN (Stopped (tty input)) ---
> --- SIGTTIN (Stopped (tty input)) ---
> [.. etc, again and again in a tight loop ..]
> --- SIGTTIN (Stopped (tty input)) ---
> --- SIGTTIN (Stopped (tty input)) ---
> read(0,
> 
> The last read was after the process was forgrounded. The read waits
> forever, the non-block flag seems to have gone missing. It is also a
> little odd I think that it repeated to get SIGTTIN which was never
> actually delivered to the program.. Shouldn't SIGTTIN suspend the process?

Strace can perturb signal delivery, especially for terminal-related
signals, I wouldn't trust it...

O_NONBLOCK is not lost... Attempting to read from the controlling tty
even from a O_NONBLOCK descriptor will trigger SIGTTIN.

From the code, it looks like you're trying to flush stdin before
exec'ing.

Why not use tcflush(STDIN_FILENO, TCIFLUSH) rather than using
O_NONBLOCK ?

This will not prevent SIGTTIN from getting sent... You could catch it
or just ignore it...

But why would you want to flush stdin if you're in the background ?
Why not use:

  if (fork() == 0)
  {
    if (tcgetpgrp(STDIN_FILENO) == getpgrp())
    {
      /* We're the foreground process of the controlling tty */
      tcflush(STDIN_FILENO, TCIFLUSH);
    }

    exec(...);
  }

Here you simply don't bother flushing stdin if you're not the foreground
process (which is the *right* thing to do).

There's a race condition if the process is backgrounded between the
tcgetpgrp() and the tcflush(), but you'll have to live with it...

Phil.



2.2.17: TCP keepalive oops

2000-11-15 Thread Philippe Troin


Got this oops (captured by kmsgdump) today.
The machine was completely stuck.

Phil.

  ksymoops 2.3.4 on i686 2.2.17.  Options used
   -V (default)
   -k /proc/ksyms (default)
   -l /proc/modules (default)
   -o /lib/modules/2.2.17/ (default)
   -m /boot/System.map-2.2.17 (default)
  
  Warning: You did not tell me where to find symbol information.  I will
  assume that the log matches the kernel and modules that are running
  right now and I'll use the default options above for symbol resolution.
  If the current kernel and/or modules do not match the log, you can get
  more accurate output by telling me the kernel version and where to find
  map, modules, ksyms etc.  ksymoops -h explains the options.
  
  <1>Unable to handle kernel NULL pointer dereference at virtual address 0022
  <1>current->tss.cr3 = 00101000, %cr3 = 00101000
  <1>*pde = 
  <6>Oops: 
  <6>CPU:0
  <6>EIP:0010:[]
  Using defaults from ksymoops -t elf32-i386 -a i386
  <6>EFLAGS: 00010202
  <6>eax: cfd0   ebx: 0002   ecx:    edx: 
  <6>esi: 0369af3b   edi: 70a2   ebp: 0010   esp: cfffbe8c
  <6>ds: 0018   es: 0018   ss: 0018
  <6>Process swapper (pid: 0, process nr: 1, stackpage=cfffb000)
  <6>Stack: 0001 0010 cf782ec0 0001   cfffbf54 cfff1000 
  <6>   c0203700 c0169ce9 c0203f70 cfffbf54 cfffbedc c0115d42 c0203f34 c0169cb0 
  <6>   0001 cfffbf1c   cfffbf1c c011614d  0001 
  <6>Call Trace: [] [] [] [] [] 
[] [] 
  <6>   [] [] [] [] [] 
[] [] [] 
  <6>   [] 
  <6>Code: 8b 43 20 89 44 24 18 8b 43 30 85 c0 0f 85 09 01 00 00 8a 43 
  
  >>EIP; c016988c<=
  Trace; c0169ce9 
  Trace; c0115d42 
  Trace; c0169cb0 
  Trace; c011614d 
  Trace; c0111b14 
  Trace; c0111ac6 
  Trace; c011d451 
  Trace; c010ce39 
  Trace; c010ce50 
  Trace; c010b8a8 
  Trace; c0108ac9 
  Trace; c011d451 
  Trace; c0106000 
  Trace; c010ce39 
  Trace; c010ce50 
  Trace; c01cf660 
  Code;  c016988c 
   <_EIP>:
  Code;  c016988c<=
 0:   8b 43 20  mov0x20(%ebx),%eax   <=
  Code;  c016988f 
 3:   89 44 24 18   mov%eax,0x18(%esp,1)
  Code;  c0169893 
 7:   8b 43 30  mov0x30(%ebx),%eax
  Code;  c0169896 
 a:   85 c0 test   %eax,%eax
  Code;  c0169898 
 c:   0f 85 09 01 00 00 jne11b <_EIP+0x11b> c01699a7 

  Code;  c016989e 
12:   8a 43 00  mov0x0(%ebx),%al
  
  <6>Aiee, killing interrupt handler
  <0>Kernel panic: Attempted to kill the idle task!
  
  1 warning issued.  Results may not be reliable.



Re: 2.2.x BUG & PATCH: recvmsg() does not check msg_controllen correctly

2000-11-04 Thread Philippe Troin

"David S. Miller" <[EMAIL PROTECTED]> writes:

>From: Philippe Troin <[EMAIL PROTECTED]>
>Date: 03 Nov 2000 19:53:04 -0800
> 
>Yes I agree, mixing signed and unsigned arithmetic is evil... Doesn't
>gcc have a flag for unsafe signed/unsigned mixtures ?
> 
>Would you consider this patch (or a variant) for inclusion ?
> 
> I would accept a patch which made the code set fdmax <= 0 when
> (msg->msg_controllen < (sizeof(struct cmsghdr) + sizeof(int)))
> because it is the sole reason this bug exists at all.

How about this one ?

Phil.

 linux-2.2.17-scmrights.patch


2.2.x BUG & PATCH: recvmsg() does not check msg_controllen correctly

2000-11-03 Thread Philippe Troin

I found this in all 2.2.x kernels, and it might possibly be present in
2.4.x too...

When receiving file descriptors via recvmsg(), scm_detach_fds() in
net/core/scm.c can overflow user space data at msg_control if
msg_controllen is less than sizeof(struct cmsghdr).
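
For reference, here is a rough user-space sketch of passing one
descriptor with a correctly sized control buffer.  It is not the
attached check-anc.c, and the helper names send_fd() and recv_fd() are
invented for this illustration.  The receiver sizes msg_control with
CMSG_SPACE() and validates the returned header before trusting it.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer big enough for exactly one descriptor; the union
   forces the alignment that struct cmsghdr needs. */
union fd_ctl {
    struct cmsghdr align;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Hypothetical helper: send descriptor fd over Unix socket sock. */
int send_fd(int sock, int fd)
{
    char byte = 'x';                       /* one byte of real payload */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctl ctl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&msg, 0, sizeof(msg));
    memset(&ctl, 0, sizeof(ctl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one descriptor, or -1 on any error. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctl ctl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&msg, 0, sizeof(msg));
    memset(&ctl, 0, sizeof(ctl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;
    if (msg.msg_flags & MSG_CTRUNC)        /* control data was cut short */
        return -1;

    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

A well-behaved receiver like this never triggers the problem; the bug
matters because the kernel must also cope with callers that pass a
msg_controllen smaller than a single header.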

This is a security problem.

Attached is a patch to fix the problem and a little program to
demonstrate the problem.

Phil.

 linux-2.2.17-8-scmrights.patch
 check-anc.c


Re: 255.255.255.255 won't broadcast to multiple NICs

2000-11-03 Thread Philippe Troin

Rob Landley <[EMAIL PROTECTED]> writes:
> --- Philippe Troin <[EMAIL PROTECTED]> wrote:
> > All the code I've encountered which actually needed
> > to perform
> > broadcast on all interfaces was sending
> > subnet-directed broadcasts by
> > hand on all interfaces.
> 
> Bind to a socket to a local port and query that
> address you say?  Nope, too easy.  The address
> returned when I query a socket (rather than a
> connection) is 0.0.0.0 on any machine with multiple
> interfaces (even loopback), since the socket is bound
> to that port on ALL the interfaces.  Each incoming or
> outgoing connection does have a valid "from" IP
> address, but I have to wait for a connection to come
> in to get that.  (Unless I explicitly specify which IP
> to bind to when I create the socket, but if I knew
> that I'd already be there.)
> 
> Nope, making my own connection to a port on the same
> machine just means 127.0.0.1 is talking to 127.0.0.1. 
> Tried it.  Didn't work.
> 
> Nope, feeding the loopback address to getAllByName()
> doesn't help either.  I tried that too, it just
> returns a length 1 array containing just the loopback
> address.

The source IP address (as returned by getsockname()) is only set when
the socket is connected... It follows the same logic: for a multihomed
machine, we know which interface will be used only when we know who
we'll be talking to...
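
To illustrate, here is a small sketch (IPv4 only) that connects a UDP
socket to a peer and then asks getsockname() which local address was
picked.  The helper name local_addr_for() and the port number are
arbitrary choices for this example; connect() on a UDP socket sends no
packet, it only fixes the route.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: write into out the local IPv4 address the
   stack would use to reach peer.  Returns 0 on success, -1 on error. */
int local_addr_for(const char *peer, char *out, socklen_t outlen)
{
    struct sockaddr_in dst, src;
    socklen_t len = sizeof(src);
    int rc = -1;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;

    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_port = htons(9);    /* arbitrary port, nothing is sent */

    if (inet_pton(AF_INET, peer, &dst.sin_addr) == 1 &&
        connect(fd, (struct sockaddr *)&dst, sizeof(dst)) == 0 &&
        getsockname(fd, (struct sockaddr *)&src, &len) == 0 &&
        inet_ntop(AF_INET, &src.sin_addr, out, outlen) != NULL)
        rc = 0;

    close(fd);
    return rc;
}
```

Asking about a peer on each subnet should return the address of the
interface facing that subnet, but you still need to know one peer
address per subnet to start with.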

> Now you know why I'm resorting to 255.255.255.255. 
> I'm sort of faking things: when the server broadcasts
> to clients they know who it is, and when they
> broadcast to it, it knows who THEY are (it says in the
> UDP datagram header info).  And the way I've written
> it, that's all they really need to know (although when
> we reply to each other we can each find out the info
> we don't know: who WE are.  But by that point, we no
> longer need it. :)
> 
> I may just document "if you run this on a machine with
> more than one network card, you have to specify the
> broadcast addresses on the command line".  It's
> configuration, but the only machine likely to HAVE
> multiple interfaces is the server (which could be
> serving multiple subnets in a really BIG render farm),
> so I suppose it's tolerable...

You could use SIOCGIFCONF (from C) to get the address list. I'm not
sure if Java has an equivalent... Or maybe a very small native
method...

> > Broadcast is ugly anyways, why don't you use
> > multicast ?
> 
> I'm only passingly familiar with it, does it map well
> to this problem?
> 
> The only data I'm trying to transmit is "where is
> everybody", or "wake up".  The broadcast packets are
> only needed for clients to find the server on bootup
> (and vice versa if the server is rebooted).  They're
> also used to wake up clients if they go to sleep
> because the server has nothing for them to do at the
> moment, but that second part's a convenience, really. 
> The server could loop through and address them
> individually instead since it knows where they are by
> that point.
> 
> The actual heavy lifting of data is done by TCP/IP
> streams.  UDP broadcast is just for figuring out where
> to open the TCP/IP connections to.

Sounds like a good job for multicast... It's fairly simple to use,
but:

  1) I'm not sure if Java gives you access to the required ioctls
 (there are only five of them).

  2) You may need to run mrouted on all your gateways.

Phil.



Re: 255.255.255.255 won't broadcast to multiple NICs

2000-11-02 Thread Philippe Troin

Rob Landley <[EMAIL PROTECTED]> writes:

> --- Jeff Garzik <[EMAIL PROTECTED]> wrote:
> > Rob Landley wrote:
> > > Under 2.2.16, broadcast packets addressed to
> > > 255.255.255.255 do not go out to all interfaces in
> > a
> > > machine with multiple network cards.  They're
> > getting
> > > routed out the default gateway's interface
> > instead.
> > 
> > Are the network cards on the same network?
> 
> Two subnets.  (both martians: 10.blah and
> 192.168.blah).  Gateway's off of 10.blah (beyond which
> lives the internet), the 192 thing is the small
> cluster I'm putting together in my office to test the
> software.
> 
> I take it this makes a difference?  If there's some
> kind of "don't do that" here, I might be happy just
> documenting it.  (In theory, I could iterate through
> the NICs and send out a broadcast packet to each
> interface's broadcast address (although for reasons
> that are a bit complicated to go into right now unless
> you really want to know, that's not easy to do in this
> case).)  But that's just a workaround to cover up the
> fact that the IP stack isn't doing the obvious with
> global broadcasts.
> 
> So the question is, is the stack's behavior right?  If
> not, what's involved in fixing it, and if so, is it
> documented anywhere?

I think historically, BSD stacks were routing 255.255.255.255 to the
"primary interface" (whatever that means).

All the code I've encountered which actually needed to perform
broadcast on all interfaces was sending subnet-directed broadcasts by
hand on all interfaces.

Broadcast is ugly anyways, why don't you use multicast ?

Phil.



Re: [ADMIN] some list related topics ..

2000-10-19 Thread Philippe Troin

Matti Aarnio <[EMAIL PROTECTED]> writes:

> We ([EMAIL PROTECTED] -> me & DaveM) got just reports that
> somebody is diverting incoming email to some sort of auto-responding
> ticket system.
> 
> The thing does not carry original message "Received:" headers in replies,
> and is reporting invalid URL.
> 
> 
> Independent of that, people with supposedly working addresses
> are sometimes bouncing:
> 
>   1) because their backup MXes don't like the domain they have
>  (configuration problem somewhere, you can choose in between
>   the DNS data writer, and the backup MX admin)

8< snip >8

> For the subset  1  above, I am preparing to begin to run regularly
> (weekly very least, daily possibly) scanner which tries interactive
> testing of recipients address at all of user's domain's MX servers,
> and if any of the MX systems for user's domain gives a bad
> response for subscriber's address, that user gets the log report
> and some pointers on what to do.

It also sends a report for perfectly working systems... Just got two
of these... Is this a normal feature of your checking script ?

(Report available upon request)

Phil.



2.2.17 Stuck TCP ESTABLISHED sessions

2000-09-22 Thread Philippe Troin


I've seen that in the past, but never had time to investigate. For
some reason, TCP sessions get stuck.

Here's an example with an ssh session:

  1) Netstat says on tantale (note the non-zero Send-Q):
  tcp 0 38364 tantale:ssh   neptune:1022 ESTABLISHED

 Netstat says on neptune:
  tcp 0 0 neptune:1022  tantale:22   ESTABLISHED

  2) At this point the session is stuck in the tantale->neptune
 direction although the other direction is still active.

  3) Here's what tcpdump says when I send data (type in one character
 at the ssh session):

  neptune.1022 > tantale.ssh: P 560:580(20) ack 1 win 32120 (DF) [tos 0x10]
  tantale.ssh > neptune.1022: . ack 580 win 32120 (DF) [tos 0x10]

Both tantale and neptune are configured as firewalls. Both tantale and
neptune are configured to always defragment.
Neptune also does masquerading, but in that particular case the
session is not masqueraded (since started on neptune).

I fail to understand why tantale does not send back what's in its
outgoing queue since neptune reports an open window of 32120 bytes.

Or am I missing something ?

Phil.



Re: BPF in the kernel

2000-09-18 Thread Philippe Troin

kjh63 <[EMAIL PROTECTED]> writes:

> > How Linux Kernel and BPF relate to each other:
> >
> > a) linux has BPF (I don't think so).

It has LSF, the Linux Socket Filter.

> > b) Linux has own equivalent of BPF (part of NAT?)

Yes, the LSF.

> > c) linux does not have anything like BPF

BPF opcodes work on LSF. LSF has some extensions to BPF, like
fetching which interface the packet came from.

> > d) something else (if so, then what?)
> 
> a) The Documentation/networking/filters.txt may say so but i dont think so
> either:
> 
> [root@localhost networking]# ls -al /dev/bpf0
> ls: /dev/bpf0: No such file or directory
> [root@localhost networking]# cd /dev/
> [root@localhost /dev]# sh MAKEDEV bpf0
> MAKEDEV: don't know how to make device "bpf0"

LSF does not work with device nodes, it works with setsockopt:

struct sock_fprog fprog;   /* from <linux/filter.h> */
/* Fill fprog */
setsockopt(fd, SOL_SOCKET, SO_ATTACH_FILTER, &fprog, sizeof(fprog));

> How can I make ethereal (or libpcap) work with LSF?

Last time I checked libpcap was emulating BPF in user space. Which is
bad of course because all packets are copied...

Ideally, libpcap should be extended to support the LSFisms. An LSF
filtering some ethernet traffic does not have to be bound to an
interface. You can filter all the packets coming in.

I can send you an example if you need one.

Phil.