Re: Regression in 5.1.20: Reading long directory fails
> "JBF" == J Bruce Fields writes:

JBF> Those readdir changes were client-side, right? Based on that I'd
JBF> been assuming a client bug, but maybe it'd be worth getting a full
JBF> packet capture of the readdir reply to make sure it's legit.

I have been working with bcodding on IRC for the past couple of days on this. Fortunately I was able to come up with a way to fill up a directory in such a way that it will fail with certainty and as a bonus doesn't include any user data so I can feel OK about sharing packet captures. I have a capture alongside a kernel trace of the problematic operation in https://www.math.uh.edu/~tibbs/nfs/. Not that I can particularly tell anything useful from that, but bcodding says that it seems to point to some issue in sunrpc.

And because I can easily reproduce this, I was able to do a bisect:

2c94b8eca1a26cd46010d6e73a23da5f2e93a19d is the first bad commit
commit 2c94b8eca1a26cd46010d6e73a23da5f2e93a19d
Author: Chuck Lever
Date:   Mon Feb 11 11:25:41 2019 -0500

    SUNRPC: Use au_rslack when computing reply buffer size

    au_rslack is significantly smaller than (au_cslack << 2). Using
    that value results in smaller receive buffers. In some cases this
    eliminates an extra segment in Reply chunks (RPC/RDMA).

    Signed-off-by: Chuck Lever
    Signed-off-by: Anna Schumaker

:040000 040000 d4d1ce2fbe0035c5bd9df976b8c448df85dcb505 7011a792dfe72ff9cd70d66e45d353f3d7817e3e M	net

But of course, I can't say whether this is the actual bad commit or whether it just introduced a behavior change which alters the conditions under which the problem appears.

And just to make sure that the blame doesn't lie with the old RHEL7 kernel, I rsynced over the problematic directory to a machine running something slightly more modern (5.1.11, which I know I need to update, but it's already set up to do kerberised NFS) and the same problem exists, though the directory listing does fail at a different place.

 - J<
Re: Regression in 5.1.20: Reading long directory fails
I asked the XFS folks, who mentioned that the issues with 64 bit inodes are old, constrained to larger filesystems than what I'm using, not an issue with nfsv4, and not present on anything but 32bit clients with old userspace.

In any case, I have been experimenting a bit and somehow the issue seems to be related to exporting with sec=krb5i:krb5p or sec=krb5i. If I export with just sec=krb5p, things magically begin to work. So basically:

[root@ld00 ~]# ls -l ~tester|wc -l; grep tester /proc/mounts
7685
nas00:/export/misc-00/tester /home/tester nfs4 rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=krb5p,clientaddr=172.21.84.191,local_lock=none,addr=172.21.86.77 0 0

(unmount, then re-export with krb5i on the server)

[root@ld00 ~]# ls -l ~tester|wc -l; grep tester /proc/mounts
ls: reading directory '/home/tester': Input/output error
5623
nas00:/export/misc-00/tester /home/tester nfs4 rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=krb5i,clientaddr=172.21.84.191,local_lock=none,addr=172.21.86.77 0 0

(umount, then re-export with krb5i:krb5p on the server)

[root@ld00 ~]# ls -l ~tester|wc -l; grep tester /proc/mounts
ls: reading directory '/home/tester': Input/output error
5623
nas00:/export/misc-00/tester /home/tester nfs4 rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=krb5i,clientaddr=172.21.84.191,local_lock=none,addr=172.21.86.77 0 0

(umount, switch back to plain krb5p)

[root@ld00 ~]# ls -l ~tester|wc -l; grep tester /proc/mounts
7685
nas00:/export/misc-00/tester /home/tester nfs4 rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=krb5p,clientaddr=172.21.84.191,local_lock=none,addr=172.21.86.77 0 0

Sometimes the number of files it lists before it fails changes (and in this case has been as small as a few hundred) but I don't know what causes it to change.

Anyway, I hope this helps to pinpoint the problem. I now have a really easy way to reproduce this without having to kick people off of the server, and if the successes aren't just some kind of false positives then I guess I also have a workaround. I'm still at a loss as to why a revert of the readdir changes makes any difference at all here.

 - J<
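The krb5p/krb5i comparison above relies on "ls -l | wc -l" and on spotting the EIO message by eye. A minimal sketch of the same check in Python follows, for anyone who wants to script the comparison across remounts. The function name count_entries is made up for illustration, and the count will be one lower than the "wc -l" figures above because "ls -l" also prints a "total" line.

```python
import errno
import os


def count_entries(path):
    """Count directory entries, stopping at the first error.

    Returns (entries_seen, error_name). error_name is None when the whole
    directory was read, otherwise the symbolic errno (for example "EIO").
    """
    seen = 0
    try:
        with os.scandir(path) as entries:
            for _ in entries:
                seen += 1
    except OSError as exc:
        # A failing readdir surfaces here; keep the partial count.
        return seen, errno.errorcode.get(exc.errno, str(exc.errno))
    return seen, None
```

Calling count_entries("/home/tester") after each remount would give the partial count and the error name in one value, which makes it easier to log how the failure point moves between runs.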
Re: Regression in 5.1.20: Reading long directory fails
> "WW" == Wolfgang Walter writes:

WW> What filesystem do you use on the server? xfs?

Yeah, it's XFS.

WW> If yes, does it use 64bit inodes (or started to use them)?

These filesystems aren't super old, and were all created with the default RHEL7 options. I'm not sure how to check that 64 bit inodes are being used, though. xfs_info says:

meta-data=/dev/mapper/nas-faculty--08 isize=256    agcount=4, agsize=3276800 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0        finobt=0 spinodes=0
data     =                       bsize=4096   blocks=13107200, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal               bsize=4096   blocks=6400, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

WW> Do you set a fsid when you export the filesystem?

I have never done so on any server. And note that the servers are basically unchanged for quite some time, while the problem I'm having is new. I want to find some server-related cause for this but so far I haven't been able to do so.

It seems my best option now is to migrate all data off of this server and then wipe, reinstall and see if the problem reoccurs.

 - J<
Re: Regression in 5.1.20: Reading long directory fails
> "JLT" == Jason L Tibbitts writes:

JLT> Certainly a server reboot, or maybe even just
JLT> unmounting and remounting the filesystem or copying the data to
JLT> another filesystem would tell me that. In any case, as soon as I
JLT> am able to mess with that server, I'll know more.

Rebooting the server did not make any difference, and now more users are seeing the problem. At this point I'm in a state where NFS simply isn't reliable at all, and I'm not sure what to do.

If Centos 8 were out, I'd work on moving to that just so that the server was a little more modern. (Currently the server is Centos 7.) I guess I could try using Fedora, or installing one of the upstream kernels, just in case this has to do with some interaction between the client and the old RHEL7 kernel.

I do have a packet capture of a directory listing that fails with EIO, but I'm not sure if it's safe to simply post it, and I'm not sure what tshark options would be useful in decoding it.

I do know that I can rsync one of the problematic directories to a different server (running the same kernel) and it doesn't have the same problem. What I'll try next is rsyncing to a different filesystem on the same server, but again I'll have to wait until people log off to do proper testing.

 - J<
Re: Regression in 5.1.20: Reading long directory fails
> "BF" == J Bruce Fields writes:

BF> Looks like that's db531db951f950b8 upstream. (Do you know if it's
BF> reproduceable upstream as well?)

Yes, it's reproducible in the 5.3.0 RCs as well.

However, while trying to do some further bisecting I ran into an odd problem. Now kernels which were previously working (i.e. 5.1.19 and older) are returning errors, but at a different file count. This only gives me more questions. And so, just to be absolutely sure that there isn't some weird server issue involved, I'm going to try to schedule a reboot of the relevant server.

BF> Maybe it depends on having names of the right length to place some
BF> bit of xdr on a boundary. I wonder if it'd be possible to reproduce
BF> just by varying the name lengths randomly till you hit it.

I know I can't reproduce with loads of short names, nor with relatively long names (using sha256sum as filename generator).

BF> No clever debugging ideas off the top of my head, I'm afraid. I
BF> might start by patching the kernel or doing some tracing to figure
BF> out exactly where that EIO is being generated?

If I had any idea how to do that, I happily would. I'm certainly willing to learn. At least I can run strace to see where ls bombs:

getdents64(5, 0x7fc13afaf040, 262144) = -1 EIO (Input/output error)

bcodding on IRC mentioned that that is a rather large count.

Does make me wonder if the server is weirding out and sending the client bogus data. Certainly a server reboot, or maybe even just unmounting and remounting the filesystem or copying the data to another filesystem would tell me that. In any case, as soon as I am able to mess with that server, I'll know more.

 - J<
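The suggestion quoted above, varying the name lengths randomly until the failure appears, could be scripted along these lines. This is only a sketch: the function names, the default length range, and the alphabet are all assumptions chosen for illustration, and nothing here is known to trigger the bug.

```python
import os
import random
import string


def random_names(count, min_len=1, max_len=200, seed=0):
    """Return `count` distinct names with lengths drawn from [min_len, max_len].

    The seed makes a run repeatable, so a directory that does trigger the
    error can be recreated. The caller must leave enough room in the length
    range for `count` distinct names, otherwise the loop cannot finish.
    """
    rng = random.Random(seed)
    alphabet = string.ascii_lowercase + string.digits
    names = set()
    while len(names) < count:
        length = rng.randint(min_len, max_len)
        names.add("".join(rng.choice(alphabet) for _ in range(length)))
    return sorted(names)


def fill_directory(path, count, **kwargs):
    """Create `count` empty files with random-length names under `path`."""
    os.makedirs(path, exist_ok=True)
    for name in random_names(count, **kwargs):
        with open(os.path.join(path, name), "w"):
            pass
```

One possible loop would call fill_directory on the exported filesystem with a new seed each time, list the directory from the client, and keep the first seed that produces EIO. max_len is kept below the namlen=255 limit shown in the mount options earlier in the thread.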
Re: Regression in 5.1.20: Reading long directory fails
I now have another user reporting the same failure of readdir on a long directory which showed up in 5.1.20 and was traced to 3536b79ba75ba44b9ac1a9f1634f2e833bbb735c. I'm not sure what to do to get more traction besides reposting and adding some addresses to the CC list. If there is any information I can provide which might help to get to the bottom of this, please let me know.

To recap:

5.1.20 introduced a regression reading some large directories. In this case, the directory should have 7800 files or so in it:

[root@ld00 ~]# ls -l ~dblecher|wc -l
ls: reading directory '/home/dblecher': Input/output error
1844

[root@ld00 ~]# cat /proc/version
Linux version 5.1.20-300.fc30.x86_64 (mockbu...@bkernel04.phx2.fedoraproject.org) (gcc version 9.1.1 20190503 (Red Hat 9.1.1-1) (GCC)) #1 SMP Fri Jul 26 15:03:11 UTC 2019

(The server is a Centos 7 machine running kernel 3.10.0-957.12.2.el7.x86_64.)

Building a kernel which reverts commit 3536b79ba75ba44b9ac1a9f1634f2e833bbb735c:

    Revert "NFS: readdirplus optimization by cache mechanism" (memleak)

fixes the issue, but of course that revert was fixing a real issue so I'm not sure what to do.

I can trivially reproduce this by simply trying to list the problematic directories but I'm not sure how to construct such a directory; simply creating 1 files doesn't cause the problem for me.

I am willing to test patches and can build my own kernels, and I'm happy to provide any debugging information you might require. Unfortunately I don't know enough to dig in and figure out for myself what's going wrong.

I did file https://bugzilla.redhat.com/show_bug.cgi?id=1740954 just to have this in a bug tracker somewhere. I'm happy to file one somewhere else if that would help.

 - J<
Re: [PATCH] scsi: sg: only check for dxfer_len greater than 256M
> "MKP" == Martin K Petersen writes:

MKP> Applied to 4.13/scsi-fixes. Thanks!

My thanks as well to everyone who helped in getting this fixed.

 - J<
Re: [REGRESSION] 28676d869bbb (scsi: sg: check for valid direction before starting the request) breaks mtx tape library control
> "JT" == Johannes Thumshirn writes:

JT> It's probably best to just check for dxfer_len <= 2^28 to be valid
JT> as Doug suggested:

I can verify that patch on top of git head (as of a few hours ago) does function properly. It didn't apply directly on top of 4.12 but even I can handle fixing that up. The result (just deleting the function and changing the call to a check for hp->dxfer_len >= SZ_256M) works fine and is at the end.

So thanks. If this goes in, please CC to stable.

 - J<

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 82c33a6..aa6f1de 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -751,29 +751,6 @@ sg_new_write(Sg_fd *sfp, struct file *file, const char __user *buf,
 	return count;
 }
 
-static bool sg_is_valid_dxfer(sg_io_hdr_t *hp)
-{
-	switch (hp->dxfer_direction) {
-	case SG_DXFER_NONE:
-		if (hp->dxferp || hp->dxfer_len > 0)
-			return false;
-		return true;
-	case SG_DXFER_TO_DEV:
-	case SG_DXFER_FROM_DEV:
-	case SG_DXFER_TO_FROM_DEV:
-		if (!hp->dxferp || hp->dxfer_len == 0)
-			return false;
-		return true;
-	case SG_DXFER_UNKNOWN:
-		if ((!hp->dxferp && hp->dxfer_len) ||
-		    (hp->dxferp && hp->dxfer_len == 0))
-			return false;
-		return true;
-	default:
-		return false;
-	}
-}
-
 static int
 sg_common_write(Sg_fd * sfp, Sg_request * srp,
 		unsigned char *cmnd, int timeout, int blocking)
@@ -794,7 +771,7 @@ sg_common_write(Sg_fd * sfp, Sg_request * srp,
 			"sg_common_write: scsi opcode=0x%02x, cmd_size=%d\n",
 			(int) cmnd[0], (int) hp->cmd_len));
 
-	if (!sg_is_valid_dxfer(hp))
+	if (hp->dxfer_len >= SZ_256M)
 		return -EINVAL;
 
 	k = sg_start_req(srp, cmnd);
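To see why the removed function rejected the MOVE MEDIUM request while the size-only check lets it through, here is a small Python model of both checks. It is a sketch for reasoning only, not kernel code: the function names old_check_accepts and new_check_accepts are invented, has_buffer stands in for a non-NULL hp->dxferp, the SG_DXFER_* values are those defined in the kernel's include/scsi/sg.h, and SZ_256M is 2^28 as in linux/sizes.h.

```python
# Direction constants as defined in include/scsi/sg.h.
SG_DXFER_NONE = -1
SG_DXFER_TO_DEV = -2
SG_DXFER_FROM_DEV = -3
SG_DXFER_TO_FROM_DEV = -4
SG_DXFER_UNKNOWN = -5

SZ_256M = 0x10000000  # 2**28 bytes


def old_check_accepts(direction, has_buffer, dxfer_len):
    """Model of the removed sg_is_valid_dxfer(); has_buffer means dxferp != NULL."""
    if direction == SG_DXFER_NONE:
        return not (has_buffer or dxfer_len > 0)
    if direction in (SG_DXFER_TO_DEV, SG_DXFER_FROM_DEV, SG_DXFER_TO_FROM_DEV):
        return bool(has_buffer) and dxfer_len != 0
    if direction == SG_DXFER_UNKNOWN:
        mismatch = (not has_buffer and dxfer_len != 0) or (has_buffer and dxfer_len == 0)
        return not mismatch
    return False


def new_check_accepts(dxfer_len):
    """Model of the replacement: reject only transfers of 256M or more."""
    return dxfer_len < SZ_256M
```

The debug output elsewhere in this thread shows the failing request as dxfer_direction: -2, dxfer_len: 0. In this model old_check_accepts(SG_DXFER_TO_DEV, has_buffer, 0) is False whatever has_buffer is, while new_check_accepts(0) is True. Note that the patch as posted rejects a length of exactly 2^28, which is slightly stricter than the "<= 2^28" wording in the quoted suggestion.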
Re: [REGRESSION] 28676d869bbb (scsi: sg: check for valid direction before starting the request) breaks mtx tape library control
> "JT" == Johannes Thumshirn writes:

JT> Yes please (on top of the snippet I've sent you last).

OK, I'm at 4.12 with 68c59fcea1f2c6a54c62aa896cc623c1b5bc9b47 cherry picked, plus the fix patch and the debug patch you've sent previously. To make sure we're on the same page, I'll include the patch at the end.

Running "mtx -f /dev/sg7 status" gives proper output with this logged to the console:

[ 36.742905] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 56
[ 36.750036] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 136
[ 36.791673] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 4240
[ 37.339790] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 4240
[ 37.393597] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 4240

And running "mtx -f /dev/sg7 next 0" gives the following output:

[root@backup2 ~]# mtx -f /dev/sg7 next 0
Unloading drive 0 into Storage Element 1...mtx: Request Sense: Long Report=yes
mtx: Request Sense: Valid Residual=no
mtx: Request Sense: Error Code=0 (Unknown?!)
mtx: Request Sense: Sense Key=No Sense
mtx: Request Sense: FileMark=no
mtx: Request Sense: EOM=no
mtx: Request Sense: ILI=no
mtx: Request Sense: Additional Sense Code = 00
mtx: Request Sense: Additional Sense Qualifier = 00
mtx: Request Sense: BPV=no
mtx: Request Sense: Error in CDB=no
mtx: Request Sense: SKSV=no
MOVE MEDIUM from Element Address 1 to 1001 Failed

And the following is logged to the console:

[ 192.732294] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 56
[ 192.739492] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 136
[ 192.781507] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 4240
[ 193.392401] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 4240
[ 193.448970] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 4240
[ 193.495130] sg_is_valid_dxfer: dxfer_direction: -2, dxfer_len: 0

That's not any different than what I provided before, and honestly I wouldn't expect it to be.

Is there something else I can log or some debugging switch I can twiddle to give you any more information? I can also try to be more available to try and avoid the timezone-induced day-long cycle time. I'm available on IRC (tibbs on freenode and oftc) and can try to stay up late or get up early or something to try and avoid this time zone mismatch.

Here's what an strace of the last mtx call says:

open("/dev/sg7", O_RDWR) = 3
ioctl(3, SG_GET_VERSION_NUM, [30536]) = 0
ioctl(3, SG_SET_TIMEOUT, [3]) = 0
brk(NULL) = 0x55d65f68a000
brk(0x55d65f6ab000) = 0x55d65f6ab000
brk(NULL) = 0x55d65f6ab000
ioctl(3, SG_IO, {interface_id='S', dxfer_direction=SG_DXFER_FROM_DEV, cmd_len=6, cmdp="\x12\x00\x00\x00\x38\x00", mx_sb_len=20, iovec_count=0, dxfer_len=56, timeout=3, flags=0, dxferp="\x08\x80\x05\x02\x45\x00\x00\x02\x42\x44\x54\x20\x20\x20\x20\x20\x46\x6c\x65\x78\x53\x74\x6f\x72\x20\x49\x49\x20\x20\x20\x20\x20"..., status=0, masked_status=0, msg_status=0, sb_len_wr=0, sbp="", host_status=0, driver_status=0, resid=0, duration=1, info=0}) = 0
ioctl(3, SG_IO, {interface_id='S', dxfer_direction=SG_DXFER_FROM_DEV, cmd_len=6, cmdp="\x1a\x08\x1d\x00\x88\x00", mx_sb_len=20, iovec_count=0, dxfer_len=136, timeout=30, flags=0, dxferp="\x17\x00\x00\x00\x9d\x12\x00\x00\x00\x01\x03\xe9\x00\x30\x00\x65\x00\x00\x00\x01\x00\x01\x00\x00", status=0, masked_status=0, msg_status=0, sb_len_wr=0, sbp="", host_status=0, driver_status=0, resid=112, duration=61, info=0}) = 0
ioctl(3, CDROMAUDIOBUFSIZ or SCSI_IOCTL_GET_IDLUN, 0x7ffdbfaee7a0) = 0
ioctl(3, SG_IO, {interface_id='S', dxfer_direction=SG_DXFER_FROM_DEV, cmd_len=12, cmdp="\xb8\x32\x03\xe9\x00\x30\x00\x00\x10\x90\x00\x00", mx_sb_len=20, iovec_count=0, dxfer_len=4240, timeout=30, flags=0, dxferp="\x03\xe9\x00\x30\x00\x00\x09\xc8\x02\x80\x00\x34\x00\x00\x09\xc0\x03\xe9\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"..., status=0, masked_status=0, msg_status=0, sb_len_wr=0, sbp="", host_status=0, driver_status=0, resid=1728, duration=542, info=0}) = 0
ioctl(3, CDROMAUDIOBUFSIZ or SCSI_IOCTL_GET_IDLUN, 0x7ffdbfaee7a0) = 0
ioctl(3, SG_IO, {interface_id='S', dxfer_direction=SG_DXFER_FROM_DEV, cmd_len=12, cmdp="\xb8\x34\x00\x01\x00\x01\x00\x00\x10\x90\x00\x00", mx_sb_len=20, iovec_count=0, dxfer_len=4240, timeout=30, flags=0, dxferp="\x00\x01\x00\x01\x00\x00\x00\x3c\x04\x80\x00\x34\x00\x00\x00\x34\x00\x01\x09\x00\x00\x00\x00\x00\x00\x81\x03\xe9\x43\x30\x30\x30"..., status=0, masked_status=0, msg_status=0, sb_len_wr=0, sbp="", host_status=0, driver_status=0, resid=4172, duration=47, info=0}) = 0
ioctl(3, CDROMAUDIOBUFSIZ or SCSI_IOCTL_GET_IDLUN, 0x7ffdbfaee7a0) = 0
ioctl(3, SG_IO, {interface_id='S', dxfer_direction=SG_DXFER_FROM_DEV, cmd_len=12, cmdp="\xb8\x31\x00\x00\x00\x01\x00\x00\x10\x90\x00\x00", mx_sb_len=20, iovec_count=0, dxfer_len=4240, timeout=30, flags=0,
Re: [REGRESSION] 28676d869bbb (scsi: sg: check for valid direction before starting the request) breaks mtx tape library control
> "JT" == Johannes Thumshirn writes:

JT> Jason, can you try the above? If it works and Doug doesn't respond,
JT> I'm inclined yo submit this band aid.

Unfortunately it doesn't appear to work for me. Maybe I'm building the wrong thing, though. I checked out 4.12, cherry picked 68c59fcea1f2c6a54c62aa896cc623c1b5bc9b47 and then applied your one liner on top of that. There appears to be no change in behavior:

[root@backup2 ~]# mtx -f /dev/sg7 next 0
Unloading drive 0 into Storage Element 47...mtx: Request Sense: Long Report=yes
mtx: Request Sense: Valid Residual=no
mtx: Request Sense: Error Code=0 (Unknown?!)
mtx: Request Sense: Sense Key=No Sense
mtx: Request Sense: FileMark=no
mtx: Request Sense: EOM=no
mtx: Request Sense: ILI=no
mtx: Request Sense: Additional Sense Code = 00
mtx: Request Sense: Additional Sense Qualifier = 00
mtx: Request Sense: BPV=no
mtx: Request Sense: Error in CDB=no
mtx: Request Sense: SKSV=no
MOVE MEDIUM from Element Address 1 to 1047 Failed

I can also apply the debugging patch and try again if that would give you more useful information.

 - J<
Re: [REGRESSION] 28676d869bbb (scsi: sg: check for valid direction before starting the request) breaks mtx tape library control
> "JT" == Johannes Thumshirn writes:

JT> Can you please apply this debugging patch, so I can see what's going
JT> on.

Sure, no problem.

I generally run "mtx -f /dev/sg7 status" first just to make sure the library is there; this has always worked as expected. With the debug patch applied, this is sent to the console:

[ 33.933422] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 56
[ 33.940526] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 136
[ 33.982429] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 4240
[ 34.569986] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 4240
[ 34.623898] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 4240

Then running "mtx -f /dev/sg7 next 0" gives this as stdout/err:

Unloading drive 0 into Storage Element 46...mtx: Request Sense: Long Report=yes
mtx: Request Sense: Valid Residual=no
mtx: Request Sense: Error Code=0 (Unknown?!)
mtx: Request Sense: Sense Key=No Sense
mtx: Request Sense: FileMark=no
mtx: Request Sense: EOM=no
mtx: Request Sense: ILI=no
mtx: Request Sense: Additional Sense Code = 00
mtx: Request Sense: Additional Sense Qualifier = 00
mtx: Request Sense: BPV=no
mtx: Request Sense: Error in CDB=no
mtx: Request Sense: SKSV=no
MOVE MEDIUM from Element Address 1 to 1046 Failed

And this to the console:

[ 45.552524] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 56
[ 45.559626] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 136
[ 45.603544] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 4240
[ 46.204614] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 4240
[ 46.258463] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 4240
[ 46.304530] sg_is_valid_dxfer: dxfer_direction: -2, dxfer_len: 0

Would you also want to see the output from that patch applied to a functioning kernel?

 - J<
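The numeric dxfer_direction values in the console lines above are easier to read once mapped back to their names. A minimal decoding sketch follows; the decode helper and its regular expression are assumptions written against the log format shown in this message, and the name table uses the values from the kernel's include/scsi/sg.h.

```python
import re

# Values from include/scsi/sg.h.
DIRECTION_NAMES = {
    -1: "SG_DXFER_NONE",
    -2: "SG_DXFER_TO_DEV",
    -3: "SG_DXFER_FROM_DEV",
    -4: "SG_DXFER_TO_FROM_DEV",
    -5: "SG_DXFER_UNKNOWN",
}

# Matches the debug line format quoted in this thread.
LOG_RE = re.compile(r"sg_is_valid_dxfer: dxfer_direction: (-?\d+), dxfer_len: (\d+)")


def decode(line):
    """Return (direction_name, dxfer_len) for a debug line, or None if it does not match."""
    match = LOG_RE.search(line)
    if match is None:
        return None
    direction = int(match.group(1))
    name = DIRECTION_NAMES.get(direction, "unknown(%d)" % direction)
    return name, int(match.group(2))
```

Applied to the last console line above, decode returns ("SG_DXFER_TO_DEV", 0): a request marked as a transfer to the device but with no data. It is the only request in the log that is not a read, and it is the last one logged before mtx reports the MOVE MEDIUM failure.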
Re: [REGRESSION] 28676d869bbb (scsi: sg: check for valid direction before starting the request) breaks mtx tape library control
I have verified that building a clean v4.12 with 68c59fcea1f2c6a54c62aa896cc623c1b5bc9b47 cherry picked on top still shows the problem:

[root@backup2 ~]# mtx -f /dev/sg7 next 0
Unloading drive 0 into Storage Element 45...mtx: Request Sense: Long Report=yes
mtx: Request Sense: Valid Residual=no
mtx: Request Sense: Error Code=0 (Unknown?!)
mtx: Request Sense: Sense Key=No Sense
mtx: Request Sense: FileMark=no
mtx: Request Sense: EOM=no
mtx: Request Sense: ILI=no
mtx: Request Sense: Additional Sense Code = 00
mtx: Request Sense: Additional Sense Qualifier = 00
mtx: Request Sense: BPV=no
mtx: Request Sense: Error in CDB=no
mtx: Request Sense: SKSV=no
MOVE MEDIUM from Element Address 1 to 1045 Failed

Nothing appears to be logged; is there any kind of debugging information I can collect which might help to track this down? I'm not particularly good at this but I am pretty sure that I'm building everything properly and am actually booting the patched kernel.

 - J<
Re: [REGRESSION] 28676d869bbb (scsi: sg: check for valid direction before starting the request) breaks mtx tape library control
> "JT" == Johannes Thumshirn writes: JT> This is fixed with: commit 68c59fcea1f2c6a54c62aa896cc623c1b5bc9b47 Hmm, well, I just pulled and built mainline, which does appear to contain that patch (though it wasn't there when I first started investigating this last week) and the problem is still there. I'll try building clean 4.12 and applying just that patch over the top. - J<
[REGRESSION] 28676d869bbb (scsi: sg: check for valid direction before starting the request) breaks mtx tape library control
After updating my tape backup server to 4.12 I found that mtx had issues controlling the tape library. Good behavior: [root@backup2 ~]# mtx -f /dev/sg7 next 0 Unloading drive 0 into Storage Element 4...done Loading media from Storage Element 5 into drive 0...done Bad behavior: [root@backup2 ~]# mtx -f /dev/sg7 next 0 Unloading drive 0 into Storage Element 46...mtx: Request Sense: Long Report=yes mtx: Request Sense: Valid Residual=no mtx: Request Sense: Error Code=0 (Unknown?!) mtx: Request Sense: Sense Key=No Sense mtx: Request Sense: FileMark=no mtx: Request Sense: EOM=no mtx: Request Sense: ILI=no mtx: Request Sense: Additional Sense Code = 00 mtx: Request Sense: Additional Sense Qualifier = 00 mtx: Request Sense: BPV=no mtx: Request Sense: Error in CDB=no mtx: Request Sense: SKSV=no MOVE MEDIUM from Element Address 1 to 1046 Failed This was seen on a machine running Fedora 25 as well as an Ubuntu machine. Relevant tickets: https://bugzilla.redhat.com/show_bug.cgi?id=1471302 http://bugzilla.kernel.org/show_bug.cgi?id=196375 https://bugs.launchpad.net/bugs/1704512 mtx in all cases is 1.3.12; in the Fedora case that's mtx-1.3.12-14.fc24.x86_64. I see this with an Overland Neo T48s library but the Ubuntu user had a Dell ML6000 and we both have completely different HBAs and cabling (LSI3008 SAS and qla2462 FC). I bisected this down to: commit 28676d869bbb5257b5f14c0c95ad3af3a7019dd5 Author: Johannes Thumshirn Date: Fri Apr 7 09:34:15 2017 +0200 scsi: sg: check for valid direction before starting the request Check for a valid direction before starting the request, otherwise we risk running into an assertion in the scsi midlayer checking for valid requests. [mkp: fixed typo] Signed-off-by: Johannes Thumshirn Link: http://www.spinics.net/lists/linux-scsi/msg104400.html Reported-by: Dmitry Vyukov Signed-off-by: Hannes Reinecke Tested-by: Johannes Thumshirn Reviewed-by: Christoph Hellwig Signed-off-by: Martin K. Petersen and confirmed that clean unpatched 4.12 shows the problem, while reverting just that patch fixes the issue. Unfortunately I don't know enough to actually fix this, but I can easily test patches. - J<
Commit 6016af "[media] v4l2: use __u32 rather than enums in ioctl() structs" breaks C++ users of V4L2
I ran into problems compiling the program ZoneMinder on Fedora rawhide (currently using something around 3.5rc6) which do not appear with 3.4 kernels. With help this was traced to commit 6016af82eafcb6e086a8f2a2197b46029a843d68, "[media] v4l2: use __u32 rather than enums in ioctl() structs" which changed videodev2.h in a way which appears to be incompatible with C++. This results in code such as the following: enum v4l2_buf_type type = v4l2_data.fmt.type; failing to compile with: zm_local_camera.cpp:1523:49: error: invalid conversion from '__u32 {aka unsigned int}' to 'v4l2_buf_type' [-fpermissive] but only when compiled with the headers from a 3.5 kernel. I'm very far from a C++ expert. I talked with some people who do grok it and the issue comes down to restrictions on assignments of ints to enums and additionally that enums in C++ don't have defined size. - J< -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
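For illustration, the failure described above and one possible workaround can be reduced to a few lines. This is only a minimal sketch, not ZoneMinder's code and not the real videodev2.h: demo_buf_type, demo_format and buf_type_of are hypothetical stand-ins so that it builds without kernel headers, and it assumes the integer field always holds one of the declared enumerator values.

```cpp
#include <cstdint>

// Hypothetical stand-in for an enum such as v4l2_buf_type.
// Names and values are illustrative only.
enum demo_buf_type {
    DEMO_BUF_TYPE_VIDEO_CAPTURE = 1,
    DEMO_BUF_TYPE_VIDEO_OUTPUT = 2
};

// Hypothetical stand-in for an ioctl struct after the change:
// the field is a fixed-width integer rather than the enum type.
struct demo_format {
    std::uint32_t type;
};

// C++ has no implicit conversion from an integer to an enum, so
//     demo_buf_type t = fmt.type;
// is rejected by the compiler, while C accepts it. An explicit cast
// states the intent. ASSUMPTION: fmt.type holds a declared enumerator.
constexpr demo_buf_type buf_type_of(const demo_format &fmt)
{
    return static_cast<demo_buf_type>(fmt.type);
}
```

With these stand-ins, a caller would write demo_buf_type t = buf_type_of(fmt); instead of assigning the field directly. The explicit cast is also valid when the field is declared with the enum type, so the same source should build against either form of the header.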
Re: Major regression on hackbench with SLUB (more numbers)
> "IM" == Ingo Molnar <[EMAIL PROTECTED]> writes: IM> Distros will likely pick SLUB if there's no performance worries IM> and if it's the default. Fedora rawhide already uses SLUB. Actually, it seems to me that not only does Fedora rawhide use SLUB, but Fedora 8 and 7 use it as well. They don't have /proc/slabinfo and they all seem to have CONFIG_SLUB=y: > grep -r CONFIG_SLUB=y kernel kernel/devel/config-generic:CONFIG_SLUB=y kernel/F-7/configs/config-generic:CONFIG_SLUB=y kernel/F-8/config-generic:CONFIG_SLUB=y - J<
Re: ARC-1260: No space left on device, when there is (or should be) free space left
> "MN" == Magnus Naeslund <[EMAIL PROTECTED]> writes: MN> Inode count: 1430528 That's stunningly small; I created an 8TB partition here and made an ext3 filesystem in it with the default parameters; I got nearly 1000 times as many inodes as you have: Inode count: 1072414720 Block count: 2144799744 - J<
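The comparison in the message above can be checked with a little arithmetic. The sketch below uses only the three figures quoted there; the 4096-byte block size is an assumption (the message does not state it), and the constant and function names are made up for illustration. With those inputs the second filesystem has roughly 750 times as many inodes, which under the assumed block size is about one inode per 8 KiB of space.

```cpp
#include <cstdint>

// Figures quoted in the message.
constexpr std::uint64_t kReportedInodes = 1430528ULL;     // MN's filesystem
constexpr std::uint64_t kDefaultInodes  = 1072414720ULL;  // 8TB ext3, default parameters
constexpr std::uint64_t kDefaultBlocks  = 2144799744ULL;  // same filesystem

// ASSUMPTION: 4096-byte blocks. The message does not give the block size.
constexpr std::uint64_t kAssumedBlockSize = 4096ULL;

// How many times more inodes one filesystem has than another (rounded down).
constexpr std::uint64_t inode_ratio(std::uint64_t many, std::uint64_t few)
{
    return many / few;
}

// Average bytes of filesystem space per inode (rounded down).
constexpr std::uint64_t bytes_per_inode(std::uint64_t blocks,
                                        std::uint64_t block_size,
                                        std::uint64_t inodes)
{
    return blocks * block_size / inodes;
}
```

Calling inode_ratio(kDefaultInodes, kReportedInodes) and bytes_per_inode(kDefaultBlocks, kAssumedBlockSize, kDefaultInodes) gives the two numbers discussed above.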
Re: [PATCH -mm] PM: Separate hibernation code from suspend code
> "AM" == Andrew Morton <[EMAIL PROTECTED]> writes: AM> This causes the long-suffering Vaio to fail to power off during AM> suspend to disk. It says "Please power me down manually". Interesting; I'm seeing exactly the same thing on a Vaio TXN17P, which popped up somewhere between Fedora's 2.6.21-1.2116 and 2.6.21-1.2125 kernels, but the changelog there shows little more than wireless fixes and the 2.6.21.1 patch. AM> However `halt -p' still works OK. The same for me. It's odd to think that identical symptoms are caused by something completely different, but I guess this is the kernel. - J<
Re: [Patch] Support UTF-8 scripts
> "LR" == Lee Revell <[EMAIL PROTECTED]> writes: LR> Is Larry smoking crack? That's one of the worst ideas I've heard LR> in a long time. There's no easy way to enter those at the LR> keyboard! I know folks enjoy trashing Perl these days, but it's not justified in this case. From the Perl6-Bible - http://search.cpan.org/dist/Perl6-Bible/lib/Perl6/Bible/S03.pod: For those still living without the blessings of Unicode, that can also be written: << ... >>. - J<
Re: usb hotplug problems with 2.6.10
> "JH" == Jack Howarth <[EMAIL PROTECTED]> writes: JH> Alan, I had mentioned a couple weeks back that with kernel 2.6.10, JH> the ability to hotplug usb keys in Fedora Core 2 and 3 has been JH> broken. The true issue is a little more complicated than that. The kernel issue here is that under 2.6.9 (or Red Hat's versions thereof) usb-storage devices would be picked up by the SCSI layer immediately after the usb-storage module became aware of them. Under Red Hat's 2.6.10 (which currently includes -ac11) there is a delay of five or ten seconds between the device appearing in /sys/bus/usb/drivers/usb-storage and it showing up as a SCSI device. This delay has broken some stuff. Yes, Red Hat's userland hotplugging bits made incorrect assumptions and should be fixed. The question is whether this kernel delay is intended and, if not, how to fix it. Even after I've hacked around the userland problems I still have people asking why it takes so long for their USB keys to show up. - J<
Re: 3ware driver (3w-xxxx) in 2.6.10: procfs entry
> "PD" == Peter Daum <[EMAIL PROTECTED]> writes: PD> At least on my 8506-controllers there are also some minor problems PD> (e.g. "info" doesn't work during a verify) which I thought was due PD> to the fact that the program is intended exclusively for PD> 9000-controllers. You should report the problems you find to them. They do indicate (in the knowledge base on their web site) that you're going to need the in-engineering files to run on the latest kernels. It's only recently that the newer tools acquired the ability to control older controllers. According to a recent post from a 3ware employee on linux-ide-arrays, a proper release is expected in February. Obviously the best solution is that they just give us the source to these tools so that we can fix them ourselves. Knowing that isn't going to happen I'm happy they're at least giving us something while they catch up with the speed of kernel progress. I can verify the fact that info is busted when the controller is verifying the array; I'll gather some more info and pass this on to 3ware. - J<