Re: Regression in 5.1.20: Reading long directory fails

2019-09-06 Thread Jason L Tibbitts III
> "JBF" == J Bruce Fields  writes:

JBF> Those readdir changes were client-side, right?  Based on that I'd
JBF> been assuming a client bug, but maybe it'd be worth getting a full
JBF> packet capture of the readdir reply to make sure it's legit.

I have been working with bcodding on IRC for the past couple of days on
this.  Fortunately I was able to come up with a way to fill up a directory
in such a way that it will fail with certainty and as a bonus doesn't
include any user data so I can feel OK about sharing packet captures.  I
have a capture alongside a kernel trace of the problematic operation in
https://www.math.uh.edu/~tibbs/nfs/.  Not that I can particularly tell
anything useful from that, but bcodding says that it seems to point to
some issue in sunrpc.

And because I can easily reproduce this, I was able to do a bisect:

2c94b8eca1a26cd46010d6e73a23da5f2e93a19d is the first bad commit
commit 2c94b8eca1a26cd46010d6e73a23da5f2e93a19d
Author: Chuck Lever 
Date:   Mon Feb 11 11:25:41 2019 -0500

SUNRPC: Use au_rslack when computing reply buffer size

au_rslack is significantly smaller than (au_cslack << 2). Using
that value results in smaller receive buffers. In some cases this
eliminates an extra segment in Reply chunks (RPC/RDMA).

Signed-off-by: Chuck Lever 
Signed-off-by: Anna Schumaker 

:040000 040000 d4d1ce2fbe0035c5bd9df976b8c448df85dcb505 7011a792dfe72ff9cd70d66e45d353f3d7817e3e M	net

But of course, I can't say whether this is the actual bad commit or
whether it just introduced a behavior change which alters the conditions
under which the problem appears.

And just to make sure that the blame doesn't lie with the old RHEL7
kernel, I rsynced over the problematic directory to a machine running
something slightly more modern (5.1.11, which I know I need to update,
but it's already set up to do kerberised NFS) and the same problem
exists, though the directory listing does fail at a different place.

 - J<


Re: Regression in 5.1.20: Reading long directory fails

2019-09-03 Thread Jason L Tibbitts III
I asked the XFS folks, who mentioned that the issues with 64-bit inodes
are old, constrained to larger filesystems than what I'm using, not an
issue with nfsv4, and not present on anything but 32bit clients with old
userspace.

In any case, I have been experimenting a bit and somehow the issue seems
to be related to exporting with sec=krb5i:krb5p or sec=krb5i.  If I
export with just sec=krb5p, things magically begin to work.

So basically:

[root@ld00 ~]# ls -l ~tester|wc -l; grep tester /proc/mounts
7685
nas00:/export/misc-00/tester /home/tester nfs4 rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=krb5p,clientaddr=172.21.84.191,local_lock=none,addr=172.21.86.77 0 0

(unmount, then re-export with krb5i on the server)

[root@ld00 ~]# ls -l ~tester|wc -l; grep tester /proc/mounts
ls: reading directory '/home/tester': Input/output error
5623
nas00:/export/misc-00/tester /home/tester nfs4 rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=krb5i,clientaddr=172.21.84.191,local_lock=none,addr=172.21.86.77 0 0

(umount, then re-export with krb5i:krb5p on the server)

[root@ld00 ~]# ls -l ~tester|wc -l; grep tester /proc/mounts
ls: reading directory '/home/tester': Input/output error
5623
nas00:/export/misc-00/tester /home/tester nfs4 rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=krb5i,clientaddr=172.21.84.191,local_lock=none,addr=172.21.86.77 0 0

(umount, switch back to plain krb5p)

[root@ld00 ~]# ls -l ~tester|wc -l; grep tester /proc/mounts
7685
nas00:/export/misc-00/tester /home/tester nfs4 rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=krb5p,clientaddr=172.21.84.191,local_lock=none,addr=172.21.86.77 0 0

Sometimes the number of files it lists before it fails changes (and in
this case has been as small as a few hundred) but I don't know what
causes it to change.

Anyway, I hope this helps to pinpoint the problem.  I now have a really
easy way to reproduce this without having to kick people off of the
server, and if the successes aren't just some kind of false positives
then I guess I also have a workaround.  I'm still at a loss as to why a
revert of the readdir changes makes any difference at all here.

 - J<


Re: Regression in 5.1.20: Reading long directory fails

2019-09-03 Thread Jason L Tibbitts III
> "WW" == Wolfgang Walter  writes:

WW> What filesystem do you use on the server? xfs?

Yeah, it's XFS.

WW> If yes, does it use 64bit inodes (or started to use them)?

These filesystems aren't super old, and were all created with the
default RHEL7 options.  I'm not sure how to check that 64 bit inodes are
being used, though.  xfs_info says:

meta-data=/dev/mapper/nas-faculty--08 isize=256    agcount=4, agsize=3276800 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0        finobt=0 spinodes=0
data     =                       bsize=4096   blocks=13107200, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal               bsize=4096   blocks=6400, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

WW> Do you set a fsid when you export the filesystem?

I have never done so on any server.

And note that the servers are basically unchanged for quite some time,
while the problem I'm having is new.  I want to find some server-related
cause for this but so far I haven't been able to do so.  My best option
now seems to be to migrate all data off of this server and then wipe,
reinstall and see if the problem reoccurs.

 - J<


Re: Regression in 5.1.20: Reading long directory fails

2019-09-03 Thread Jason L Tibbitts III
> "JLT" == Jason L Tibbitts  writes:

JLT> Certainly a server reboot, or maybe even just
JLT> unmounting and remounting the filesystem or copying the data to
JLT> another filesystem would tell me that.  In any case, as soon as I
JLT> am able to mess with that server, I'll know more.

Rebooting the server did not make any difference, and now more users are
seeing the problem.  At this point I'm in a state where NFS simply isn't
reliable at all, and I'm not sure what to do.  If CentOS 8 were out,
I'd work on moving to that just so that the server was a little more
modern.  (Currently the server is CentOS 7.)  I guess I could try using
Fedora, or installing one of the upstream kernels, just in case this has
to do with some interaction between the client and the old RHEL7 kernel.

I do have a packet capture of a directory listing that fails with EIO,
but I'm not sure if it's safe to simply post it, and I'm not sure what
tshark options would be useful in decoding it.

I do know that I can rsync one of the problematic directories to a
different server (running the same kernel) and it doesn't have the same
problem.  What I'll try next is rsyncing to a different filesystem on
the same server, but again I'll have to wait until people log off to do
proper testing.

 - J<


Re: Regression in 5.1.20: Reading long directory fails

2019-08-28 Thread Jason L Tibbitts III
> "BF" == J Bruce Fields  writes:

BF> Looks like that's db531db951f950b8 upstream.  (Do you know if it's
BF> reproducible upstream as well?)

Yes, it's reproducible up in the 5.3.0 RCs as well.

However, while trying to do some further bisecting I ran into an odd
problem.  Now kernels which were previously working (i.e. 5.1.19 and
older) are returning errors, but at a different file count.  This only
gives me more questions.  And so, just to be absolutely sure that there
isn't some weird server issue involved, I'm going to try to schedule a
reboot of the relevant server.

BF> Maybe it depends on having names of the right length to place some
BF> bit of xdr on a boundary.  I wonder if it'd be possible to reproduce
BF> just by varying the name lengths randomly till you hit it.

I know I can't reproduce with loads of short names, nor with relatively
long names (using sha256sum as filename generator).
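
A rough sketch of the kind of test Bruce suggests above: fill a scratch
directory with empty files whose name lengths vary randomly within a
range.  This is not how the failing directories were actually produced;
the names make_name and fill_dir, the 8-character minimum and the seed
parameter are all invented for illustration.

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Write a unique name of exactly `len` characters into `out`: an
 * 8-digit hex index padded with 'x'.  `out` must hold len + 1 bytes.
 * Returns 0 on success, -1 if len is outside 8..255. */
int make_name(char *out, size_t len, unsigned idx)
{
    if (len < 8 || len > 255)
        return -1;
    snprintf(out, 9, "%08x", idx);   /* 8 hex digits plus NUL */
    memset(out + 8, 'x', len - 8);   /* pad up to the requested length */
    out[len] = '\0';
    return 0;
}

/* Create `count` empty files in `dir`, each with a name length picked
 * pseudo-randomly from min_len..max_len.  Returns 0 on success. */
int fill_dir(const char *dir, unsigned count,
             size_t min_len, size_t max_len, unsigned seed)
{
    char name[256], path[4096];

    if (min_len < 8 || max_len > 255 || min_len > max_len)
        return -1;
    srand(seed);
    for (unsigned i = 0; i < count; i++) {
        size_t len = min_len + (size_t)rand() % (max_len - min_len + 1);
        if (make_name(name, len, i) != 0)
            return -1;
        int n = snprintf(path, sizeof path, "%s/%s", dir, name);
        if (n < 0 || (size_t)n >= sizeof path)
            return -1;
        int fd = open(path, O_CREAT | O_WRONLY, 0644);
        if (fd < 0)
            return -1;
        close(fd);
    }
    return 0;
}
```

Calling fill_dir() on an exported scratch directory with different
seeds and length ranges, then listing it from a client, would be one
way to hunt for a name-length pattern that triggers the error.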

BF> No clever debugging ideas off the top of my head, I'm afraid.  I
BF> might start by patching the kernel or doing some tracing to figure
BF> out exactly where that EIO is being generated?

If I had any idea how to do that, I happily would.  I'm certainly
willing to learn.  At least I can run strace to see where ls bombs:

getdents64(5, 0x7fc13afaf040, 262144)   = -1 EIO (Input/output error)

bcodding on IRC mentioned that this is a rather large count.  Does make me
wonder if the server is weirding out and sending the client bogus data.
Certainly a server reboot, or maybe even just unmounting and remounting
the filesystem or copying the data to another filesystem would tell me
that.  In any case, as soon as I am able to mess with that server, I'll
know more.
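
For checking a directory without going through ls, here is a minimal
sketch in C that counts entries with readdir() and reports the errno
that ended the listing.  The function name count_entries is made up for
illustration.  On Linux with glibc, readdir() sits on top of
getdents64, so a failing directory would be expected to show the same
EIO here, though that is an assumption and not something tested in this
thread.

```c
#include <dirent.h>
#include <errno.h>
#include <stddef.h>

/* Count the entries in `path` (including "." and ".." where the
 * filesystem returns them).  Returns -1 if the directory cannot be
 * opened.  *err is set to 0 on a clean end-of-directory, or to the
 * errno that cut the listing short. */
long count_entries(const char *path, int *err)
{
    DIR *dir = opendir(path);
    long n = 0;

    if (dir == NULL) {
        *err = errno;
        return -1;
    }
    for (;;) {
        errno = 0;                 /* readdir returns NULL for both EOF and error */
        struct dirent *de = readdir(dir);
        if (de == NULL)
            break;
        n++;
    }
    *err = errno;                  /* save before closedir can change it */
    closedir(dir);
    return n;
}
```

A non-zero *err together with a count lower than expected would match
the partial listings shown by "ls -l | wc -l" earlier in the thread.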

 - J<


Re: Regression in 5.1.20: Reading long directory fails

2019-08-22 Thread Jason L Tibbitts III
I now have another user reporting the same failure of readdir on a long
directory which showed up in 5.1.20 and was traced to
3536b79ba75ba44b9ac1a9f1634f2e833bbb735c.  I'm not sure what to do to
get more traction besides reposting and adding some addresses to the CC
list.  If there is any information I can provide which might help to get
to the bottom of this, please let me know.

To recap:

5.1.20 introduced a regression reading some large directories.  In this
case, the directory should have 7800 files or so in it:

[root@ld00 ~]# ls -l ~dblecher|wc -l
ls: reading directory '/home/dblecher': Input/output error
1844
[root@ld00 ~]# cat /proc/version
Linux version 5.1.20-300.fc30.x86_64 (mockbu...@bkernel04.phx2.fedoraproject.org) (gcc version 9.1.1 20190503 (Red Hat 9.1.1-1) (GCC)) #1 SMP Fri Jul 26 15:03:11 UTC 2019

(The server is a CentOS 7 machine running kernel 3.10.0-957.12.2.el7.x86_64.)

Building a kernel which reverts commit 3536b79ba75ba44b9ac1a9f1634f2e833bbb735c:
  Revert "NFS: readdirplus optimization by cache mechanism" (memleak)
fixes the issue, but of course that revert was fixing a real issue so
I'm not sure what to do.

I can trivially reproduce this by simply trying to list the problematic
directories but I'm not sure how to construct such a directory; simply
creating 1 files doesn't cause the problem for me.  I am willing to
test patches and can build my own kernels, and I'm happy to provide any
debugging information you might require.  Unfortunately I don't know
enough to dig in and figure out for myself what's going wrong.

I did file https://bugzilla.redhat.com/show_bug.cgi?id=1740954 just to
have this in a bug tracker somewhere.  I'm happy to file one somewhere
else if that would help.

 - J<


Re: [PATCH] scsi: sg: only check for dxfer_len greater than 256M

2017-07-27 Thread Jason L Tibbitts III
> "MKP" == Martin K Petersen  writes:

MKP> Applied to 4.13/scsi-fixes. Thanks!

My thanks as well to everyone who helped in getting this fixed.

 - J<


Re: [REGRESSION] 28676d869bbb (scsi: sg: check for valid direction before starting the request) breaks mtx tape library control

2017-07-26 Thread Jason L Tibbitts III
> "JT" == Johannes Thumshirn  writes:

JT> It's probably best to just check for dxfer_len <= 2^28 to be valid
JT> as Doug suggested:

I can verify that patch on top of git head (as of a few hours ago) does
function properly.

It didn't apply directly on top of 4.12 but even I can handle fixing
that up.  The result (just deleting the function and changing the call
to a check for hp->dxfer_len >= SZ_256M) works fine and is at the end.
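
As a worked illustration of the bound, here is a small userspace sketch
of the comparison.  SZ_256M is redefined locally with the value the
kernel gives it in <linux/sizes.h> (0x10000000, i.e. 2^28), and the
function name dxfer_len_accepted is invented for the example.

```c
#include <stdbool.h>

/* Same value as the kernel's SZ_256M: 256 MiB = 2^28 bytes. */
#define SZ_256M 0x10000000u

/* Mirrors the patched check: the request is refused (the driver
 * returns -EINVAL) when dxfer_len >= SZ_256M, so it is accepted
 * only when dxfer_len is strictly below that. */
bool dxfer_len_accepted(unsigned int dxfer_len)
{
    return dxfer_len < SZ_256M;
}
```

Lengths such as 0, 56, 136 and 4240 all pass.  Note that a length of
exactly 2^28 is refused by this form, whereas the quoted wording
("dxfer_len <= 2^28 to be valid") would allow it; the two differ only
at that single boundary value.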

So thanks.  If this goes in, please CC to stable.

 - J<

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 82c33a6..aa6f1de 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -751,29 +751,6 @@ sg_new_write(Sg_fd *sfp, struct file *file, const char __user *buf,
 	return count;
 }
 
-static bool sg_is_valid_dxfer(sg_io_hdr_t *hp)
-{
-	switch (hp->dxfer_direction) {
-	case SG_DXFER_NONE:
-		if (hp->dxferp || hp->dxfer_len > 0)
-			return false;
-		return true;
-	case SG_DXFER_TO_DEV:
-	case SG_DXFER_FROM_DEV:
-	case SG_DXFER_TO_FROM_DEV:
-		if (!hp->dxferp || hp->dxfer_len == 0)
-			return false;
-		return true;
-	case SG_DXFER_UNKNOWN:
-		if ((!hp->dxferp && hp->dxfer_len) ||
-		    (hp->dxferp && hp->dxfer_len == 0))
-			return false;
-		return true;
-	default:
-		return false;
-	}
-}
-
 static int
 sg_common_write(Sg_fd * sfp, Sg_request * srp,
 		unsigned char *cmnd, int timeout, int blocking)
@@ -794,7 +771,7 @@ sg_common_write(Sg_fd * sfp, Sg_request * srp,
 			"sg_common_write:  scsi opcode=0x%02x, cmd_size=%d\n",
 			(int) cmnd[0], (int) hp->cmd_len));
 
-	if (!sg_is_valid_dxfer(hp))
+	if (hp->dxfer_len >= SZ_256M)
 		return -EINVAL;
 
 	k = sg_start_req(srp, cmnd);


Re: [REGRESSION] 28676d869bbb (scsi: sg: check for valid direction before starting the request) breaks mtx tape library control

2017-07-25 Thread Jason L Tibbitts III
> "JT" == Johannes Thumshirn  writes:

JT> Yes please (on top of the snippet I've sent you last).

OK, I'm at 4.12 with 68c59fcea1f2c6a54c62aa896cc623c1b5bc9b47 cherry
picked, plus the fix patch and the debug patch you've sent previously.
To make sure we're on the same page, I'll include the patch at the end.

Running "mtx -f /dev/sg7 status" gives proper output with this logged to
the console:

[   36.742905] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 56
[   36.750036] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 136
[   36.791673] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 4240
[   37.339790] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 4240
[   37.393597] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 4240

And running "mtx -f /dev/sg7 next 0" gives the following output:

[root@backup2 ~]# mtx -f /dev/sg7 next 0
Unloading drive 0 into Storage Element 1...mtx: Request Sense: Long Report=yes
mtx: Request Sense: Valid Residual=no
mtx: Request Sense: Error Code=0 (Unknown?!)
mtx: Request Sense: Sense Key=No Sense
mtx: Request Sense: FileMark=no
mtx: Request Sense: EOM=no
mtx: Request Sense: ILI=no
mtx: Request Sense: Additional Sense Code = 00
mtx: Request Sense: Additional Sense Qualifier = 00
mtx: Request Sense: BPV=no
mtx: Request Sense: Error in CDB=no
mtx: Request Sense: SKSV=no
MOVE MEDIUM from Element Address 1 to 1001 Failed

And the following is logged to the console:

[  192.732294] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 56
[  192.739492] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 136
[  192.781507] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 4240
[  193.392401] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 4240
[  193.448970] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 4240
[  193.495130] sg_is_valid_dxfer: dxfer_direction: -2, dxfer_len: 0

That's not any different than what I provided before, and honestly I
wouldn't expect it to be.
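
To make the last log line easier to read: in <scsi/sg.h> the direction
value -3 is SG_DXFER_FROM_DEV and -2 is SG_DXFER_TO_DEV.  The sketch
below is a userspace copy of the sg_is_valid_dxfer() logic as it
appears in the 4.12 diff posted in this thread, with renamed constants
and a made-up struct dxfer_req standing in for the three sg_io_hdr_t
fields it reads.  The kernel tested here also carried a fix patch and a
debug patch, so the check it actually ran may differ; this only shows
what the unpatched logic does with the logged values.

```c
#include <stdbool.h>
#include <stddef.h>

/* Direction values as defined in <scsi/sg.h>, renamed so the sketch
 * does not clash with the real header. */
enum {
    DXFER_NONE        = -1,
    DXFER_TO_DEV      = -2,
    DXFER_FROM_DEV    = -3,
    DXFER_TO_FROM_DEV = -4,
    DXFER_UNKNOWN     = -5
};

/* Hypothetical stand-in for the sg_io_hdr_t fields the check uses. */
struct dxfer_req {
    int dxfer_direction;
    unsigned int dxfer_len;
    const void *dxferp;
};

/* Same decisions as the sg_is_valid_dxfer() shown in the diff. */
bool is_valid_dxfer(const struct dxfer_req *hp)
{
    switch (hp->dxfer_direction) {
    case DXFER_NONE:
        /* no data phase: neither a buffer nor a length is allowed */
        return !(hp->dxferp || hp->dxfer_len > 0);
    case DXFER_TO_DEV:
    case DXFER_FROM_DEV:
    case DXFER_TO_FROM_DEV:
        /* data phase: both a buffer and a non-zero length required */
        return !(!hp->dxferp || hp->dxfer_len == 0);
    case DXFER_UNKNOWN:
        /* buffer and length must be both set or both unset */
        return !((!hp->dxferp && hp->dxfer_len) ||
                 (hp->dxferp && hp->dxfer_len == 0));
    default:
        return false;
    }
}
```

With these rules the five FROM_DEV requests with non-zero lengths are
accepted, while a TO_DEV request with dxfer_len 0, like the final one
logged for the failing "next 0", is rejected whether or not a buffer
pointer is supplied.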

Is there something else I can log or some debugging switch I can twiddle
to give you any more information?  I can also try to be more available
to avoid the timezone-induced day-long cycle time.  I'm on IRC (tibbs on
freenode and oftc) and can stay up late or get up early if that helps.

Here's what an strace of the last mtx call says:

open("/dev/sg7", O_RDWR)                = 3
ioctl(3, SG_GET_VERSION_NUM, [30536])   = 0
ioctl(3, SG_SET_TIMEOUT, [3])           = 0
brk(NULL)                               = 0x55d65f68a000
brk(0x55d65f6ab000)                     = 0x55d65f6ab000
brk(NULL)                               = 0x55d65f6ab000
ioctl(3, SG_IO, {interface_id='S', dxfer_direction=SG_DXFER_FROM_DEV, 
cmd_len=6, cmdp="\x12\x00\x00\x00\x38\x00", mx_sb_len=20, iovec_count=0, 
dxfer_len=56, timeout=3, flags=0, 
dxferp="\x08\x80\x05\x02\x45\x00\x00\x02\x42\x44\x54\x20\x20\x20\x20\x20\x46\x6c\x65\x78\x53\x74\x6f\x72\x20\x49\x49\x20\x20\x20\x20\x20"...,
 status=0, masked_status=0, msg_status=0, sb_len_wr=0, sbp="", host_status=0, 
driver_status=0, resid=0, duration=1, info=0}) = 0
ioctl(3, SG_IO, {interface_id='S', dxfer_direction=SG_DXFER_FROM_DEV, 
cmd_len=6, cmdp="\x1a\x08\x1d\x00\x88\x00", mx_sb_len=20, iovec_count=0, 
dxfer_len=136, timeout=30, flags=0, 
dxferp="\x17\x00\x00\x00\x9d\x12\x00\x00\x00\x01\x03\xe9\x00\x30\x00\x65\x00\x00\x00\x01\x00\x01\x00\x00",
 status=0, masked_status=0, msg_status=0, sb_len_wr=0, sbp="", host_status=0, 
driver_status=0, resid=112, duration=61, info=0}) = 0
ioctl(3, CDROMAUDIOBUFSIZ or SCSI_IOCTL_GET_IDLUN, 0x7ffdbfaee7a0) = 0
ioctl(3, SG_IO, {interface_id='S', dxfer_direction=SG_DXFER_FROM_DEV, 
cmd_len=12, cmdp="\xb8\x32\x03\xe9\x00\x30\x00\x00\x10\x90\x00\x00", 
mx_sb_len=20, iovec_count=0, dxfer_len=4240, timeout=30, flags=0, 
dxferp="\x03\xe9\x00\x30\x00\x00\x09\xc8\x02\x80\x00\x34\x00\x00\x09\xc0\x03\xe9\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"...,
 status=0, masked_status=0, msg_status=0, sb_len_wr=0, sbp="", host_status=0, 
driver_status=0, resid=1728, duration=542, info=0}) = 0
ioctl(3, CDROMAUDIOBUFSIZ or SCSI_IOCTL_GET_IDLUN, 0x7ffdbfaee7a0) = 0
ioctl(3, SG_IO, {interface_id='S', dxfer_direction=SG_DXFER_FROM_DEV, 
cmd_len=12, cmdp="\xb8\x34\x00\x01\x00\x01\x00\x00\x10\x90\x00\x00", 
mx_sb_len=20, iovec_count=0, dxfer_len=4240, timeout=30, flags=0, 
dxferp="\x00\x01\x00\x01\x00\x00\x00\x3c\x04\x80\x00\x34\x00\x00\x00\x34\x00\x01\x09\x00\x00\x00\x00\x00\x00\x81\x03\xe9\x43\x30\x30\x30"...,
 status=0, masked_status=0, msg_status=0, sb_len_wr=0, sbp="", host_status=0, 
driver_status=0, resid=4172, duration=47, info=0}) = 0
ioctl(3, CDROMAUDIOBUFSIZ or SCSI_IOCTL_GET_IDLUN, 0x7ffdbfaee7a0) = 0
ioctl(3, SG_IO, {interface_id='S', dxfer_direction=SG_DXFER_FROM_DEV, 
cmd_len=12, cmdp="\xb8\x31\x00\x00\x00\x01\x00\x00\x10\x90\x00\x00", 
mx_sb_len=20, iovec_count=0, dxfer_len=4240, timeout=30, flags=0, 

Re: [REGRESSION] 28676d869bbb (scsi: sg: check for valid direction before starting the request) breaks mtx tape library control

2017-07-21 Thread Jason L Tibbitts III
> "JT" == Johannes Thumshirn  writes:

JT> Jason, can you try the above? If it works and Doug doesn't respond,
JT> I'm inclined to submit this band aid.

Unfortunately it doesn't appear to work for me.  Maybe I'm building the
wrong thing, though.  I checked out 4.12, cherry picked
68c59fcea1f2c6a54c62aa896cc623c1b5bc9b47 and then applied your one liner
on top of that.  There appears to be no change in behavior:

[root@backup2 ~]# mtx -f /dev/sg7 next 0
Unloading drive 0 into Storage Element 47...mtx: Request Sense: Long Report=yes
mtx: Request Sense: Valid Residual=no
mtx: Request Sense: Error Code=0 (Unknown?!)
mtx: Request Sense: Sense Key=No Sense
mtx: Request Sense: FileMark=no
mtx: Request Sense: EOM=no
mtx: Request Sense: ILI=no
mtx: Request Sense: Additional Sense Code = 00
mtx: Request Sense: Additional Sense Qualifier = 00
mtx: Request Sense: BPV=no
mtx: Request Sense: Error in CDB=no
mtx: Request Sense: SKSV=no
MOVE MEDIUM from Element Address 1 to 1047 Failed

I can also apply the debugging patch and try again if that would give
you more useful information.

 - J<


Re: [REGRESSION] 28676d869bbb (scsi: sg: check for valid direction before starting the request) breaks mtx tape library control

2017-07-19 Thread Jason L Tibbitts III
> "JT" == Johannes Thumshirn  writes:

JT> Can you please apply this debugging patch, so I can see what's going
JT> on.

Sure, no problem.

I generally run "mtx -f /dev/sg7 status" first just to make sure the
library is there; this has always worked as expected.  With the debug
patch applied, this is sent to the console:
[   33.933422] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 56
[   33.940526] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 136
[   33.982429] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 4240
[   34.569986] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 4240
[   34.623898] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 4240

Then running "mtx -f /dev/sg7 next 0" gives this as stdout/err:

Unloading drive 0 into Storage Element 46...mtx: Request Sense: Long Report=yes
mtx: Request Sense: Valid Residual=no
mtx: Request Sense: Error Code=0 (Unknown?!)
mtx: Request Sense: Sense Key=No Sense
mtx: Request Sense: FileMark=no
mtx: Request Sense: EOM=no
mtx: Request Sense: ILI=no
mtx: Request Sense: Additional Sense Code = 00
mtx: Request Sense: Additional Sense Qualifier = 00
mtx: Request Sense: BPV=no
mtx: Request Sense: Error in CDB=no
mtx: Request Sense: SKSV=no
MOVE MEDIUM from Element Address 1 to 1046 Failed

And this to the console:
[   45.552524] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 56
[   45.559626] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 136
[   45.603544] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 4240
[   46.204614] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 4240
[   46.258463] sg_is_valid_dxfer: dxfer_direction: -3, dxfer_len: 4240
[   46.304530] sg_is_valid_dxfer: dxfer_direction: -2, dxfer_len: 0

Would you also want to see the output from that patch applied to a
functioning kernel?

 - J<


Re: [REGRESSION] 28676d869bbb (scsi: sg: check for valid direction before starting the request) breaks mtx tape library control

2017-07-18 Thread Jason L Tibbitts III
I have verified that building a clean v4.12 with
68c59fcea1f2c6a54c62aa896cc623c1b5bc9b47 cherry picked on top still
shows the problem:

[root@backup2 ~]# mtx -f /dev/sg7 next 0
Unloading drive 0 into Storage Element 45...mtx: Request Sense: Long Report=yes
mtx: Request Sense: Valid Residual=no
mtx: Request Sense: Error Code=0 (Unknown?!)
mtx: Request Sense: Sense Key=No Sense
mtx: Request Sense: FileMark=no
mtx: Request Sense: EOM=no
mtx: Request Sense: ILI=no
mtx: Request Sense: Additional Sense Code = 00
mtx: Request Sense: Additional Sense Qualifier = 00
mtx: Request Sense: BPV=no
mtx: Request Sense: Error in CDB=no
mtx: Request Sense: SKSV=no
MOVE MEDIUM from Element Address 1 to 1045 Failed

Nothing appears to be logged; is there any kind of debugging information
I can collect which might help to track this down?  I'm not particularly
good at this but I am pretty sure that I'm building everything properly
and am actually booting the patched kernel.

 - J<


Re: [REGRESSION] 28676d869bbb (scsi: sg: check for valid direction before starting the request) breaks mtx tape library control

2017-07-18 Thread Jason L Tibbitts III
> "JT" == Johannes Thumshirn  writes:

JT> This is fixed with: commit 68c59fcea1f2c6a54c62aa896cc623c1b5bc9b47

Hmm, well, I just pulled and built mainline, which does appear to
contain that patch (though it wasn't there when I first started
investigating this last week) and the problem is still there.  I'll try
building clean 4.12 and applying just that patch over the top.

 - J<


[REGRESSION] 28676d869bbb (scsi: sg: check for valid direction before starting the request) breaks mtx tape library control

2017-07-17 Thread Jason L Tibbitts III
After updating my tape backup server to 4.12 I found that mtx had issues
controlling the tape library.  Good behavior:

[root@backup2 ~]# mtx -f /dev/sg7 next 0
Unloading drive 0 into Storage Element 4...done
Loading media from Storage Element 5 into drive 0...done

Bad behavior:

[root@backup2 ~]# mtx -f /dev/sg7 next 0
Unloading drive 0 into Storage Element 46...mtx: Request Sense: Long Report=yes
mtx: Request Sense: Valid Residual=no
mtx: Request Sense: Error Code=0 (Unknown?!)
mtx: Request Sense: Sense Key=No Sense
mtx: Request Sense: FileMark=no
mtx: Request Sense: EOM=no
mtx: Request Sense: ILI=no
mtx: Request Sense: Additional Sense Code = 00
mtx: Request Sense: Additional Sense Qualifier = 00
mtx: Request Sense: BPV=no
mtx: Request Sense: Error in CDB=no
mtx: Request Sense: SKSV=no
MOVE MEDIUM from Element Address 1 to 1046 Failed

This was seen on a machine running Fedora 25 as well as an Ubuntu
machine.  Relevant tickets:
  https://bugzilla.redhat.com/show_bug.cgi?id=1471302
  http://bugzilla.kernel.org/show_bug.cgi?id=196375
  https://bugs.launchpad.net/bugs/1704512

mtx in all cases is 1.3.12; in the Fedora case that's
mtx-1.3.12-14.fc24.x86_64.  I see this with an Overland Neo T48s library
but the Ubuntu user had a Dell ML6000 and we both have completely
different HBAs and cabling (LSI3008 SAS and qla2462 FC).

I bisected this down to:

commit 28676d869bbb5257b5f14c0c95ad3af3a7019dd5
Author: Johannes Thumshirn 
Date:   Fri Apr 7 09:34:15 2017 +0200

scsi: sg: check for valid direction before starting the request

Check for a valid direction before starting the request, otherwise we
risk running into an assertion in the scsi midlayer checking for valid
requests.

[mkp: fixed typo]

Signed-off-by: Johannes Thumshirn 
Link: http://www.spinics.net/lists/linux-scsi/msg104400.html
Reported-by: Dmitry Vyukov 
Signed-off-by: Hannes Reinecke 
Tested-by: Johannes Thumshirn 
Reviewed-by: Christoph Hellwig 
Signed-off-by: Martin K. Petersen 

and confirmed that clean unpatched 4.12 shows the problem, while
reverting just that patch fixes the issue.  Unfortunately I don't know
enough to actually fix this, but I can easily test patches.

 - J<


Commit 6016af "[media] v4l2: use __u32 rather than enums in ioctl() structs" breaks C++ users of V4L2

2012-07-16 Thread Jason L Tibbitts III
I ran into problems compiling the program ZoneMinder on Fedora rawhide
(currently using something around 3.5rc6) which do not appear with 3.4
kernels.  With help this was traced to commit
6016af82eafcb6e086a8f2a2197b46029a843d68, "[media] v4l2: use __u32
rather than enums in ioctl() structs" which changed videodev2.h in a way
which appears to be incompatible with C++.

This results in code such as the following:
  enum v4l2_buf_type type = v4l2_data.fmt.type;
failing to compile with:
  zm_local_camera.cpp:1523:49: error: invalid conversion from '__u32
  {aka unsigned int}' to 'v4l2_buf_type' [-fpermissive]
but only when compiled with the headers from a 3.5 kernel.

I'm very far from a C++ expert.  I talked with some people who do grok
it and the issue comes down to restrictions on assignments of ints to
enums and additionally that enums in C++ don't have defined size.

 - J<
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Major regression on hackbench with SLUB (more numbers)

2007-12-22 Thread Jason L Tibbitts III
> "IM" == Ingo Molnar <[EMAIL PROTECTED]> writes:

IM> Distros will likely pick SLUB if there's no performance worries
IM> and if it's the default. Fedora rawhide already uses SLUB.

Actually, it seems to me that not only does Fedora rawhide use SLUB,
but Fedora 8 and 7 use it as well.  They don't have /proc/slabinfo and
they all seem to have CONFIG_SLUB=y:

> grep -r CONFIG_SLUB=y kernel
kernel/devel/config-generic:CONFIG_SLUB=y
kernel/F-7/configs/config-generic:CONFIG_SLUB=y
kernel/F-8/config-generic:CONFIG_SLUB=y

 - J<


Re: ARC-1260: No space left on device, when there is (or should be) free space left

2007-06-18 Thread Jason L Tibbitts III
> "MN" == Magnus Naeslund <[EMAIL PROTECTED]> writes:

MN> Inode count: 1430528

That's stunningly small; I created an 8TB partition here and made an
ext3 filesystem in it with the default parameters; I got roughly 750
times as many inodes as you have:

Inode count:  1072414720
Block count:  2144799744

 - J<


Re: [PATCH -mm] PM: Separate hibernation code from suspend code

2007-05-05 Thread Jason L Tibbitts III
> "AM" == Andrew Morton <[EMAIL PROTECTED]> writes:

AM> This causes the long-suffering Vaio to fail to power off during
AM> suspend to disk.  It says "Please power me down manually".

Interesting; I'm seeing exactly the same thing on a Vaio TXN17P, which
popped up somewhere between Fedora's 2.6.21-1.2116 and 2.6.21-1.2125
kernels, but the changelog there shows little more than wireless fixes
and the 2.6.21.1 patch.

AM> However `halt -p' still works OK.

The same for me.  It's odd to think that identical symptoms are caused
by something completely different, but I guess this is the kernel.

 - J<


Re: [Patch] Support UTF-8 scripts

2005-08-14 Thread Jason L Tibbitts III
> "LR" == Lee Revell <[EMAIL PROTECTED]> writes:

LR> Is Larry smoking crack?  That's one of the worst ideas I've heard
LR> in a long time.  There's no easy way to enter those at the
LR> keyboard!

I know folks enjoy trashing Perl these days, but it's not justified in
this case.  From the Perl6-Bible -
http://search.cpan.org/dist/Perl6-Bible/lib/Perl6/Bible/S03.pod:

 For those still living without the blessings of Unicode, that can
 also be written: << ... >>.

 - J<


Re: usb hotplug problems with 2.6.10

2005-02-03 Thread Jason L Tibbitts III
> "JH" == Jack Howarth <[EMAIL PROTECTED]> writes:

JH> Alan, I had mentioned a couple weeks back that with kernel 2.6.10,
JH> the ability to hotplug usb keys in Fedora Core 2 and 3 has been
JH> broken.

The true issue is a little more complicated than that.  The kernel
issue here is that under 2.6.9 (or Red Hat's versions thereof)
usb-storage devices would be picked up by the SCSI layer immediately
after the usb-storage module became aware of them.  Under Red Hat's
2.6.10 (which currently includes -ac11) there is a delay of
five or ten seconds between the device appearing in
/sys/bus/usb/drivers/usb-storage and it showing up as a SCSI device.

This delay has broken some stuff.  Yes, Red Hat's userland hotplugging
bits made incorrect assumptions and should be fixed.  The question is
whether this kernel delay is intended and, if not, how to fix it.
Even after I've hacked around the userland problems I still have
people asking why it takes so long for their USB keys to show up.

 - J<


Re: 3ware driver (3w-xxxx) in 2.6.10: procfs entry

2005-01-19 Thread Jason L Tibbitts III
> "PD" == Peter Daum <[EMAIL PROTECTED]> writes:

PD> At least on my 8506-controllers there are also some minor problems
PD> (e.g. "info" doesn't work during a verify) which I thought was due
PD> to the fact that the program is intended exclusively for
PD> 9000-controllers.

You should report the problems you find to 3ware.  They do indicate (in
the knowledge base on their web site) that you're going to need the
in-engineering files to run on the latest kernels.  It's only recently
that the newer tools acquired the ability to control older
controllers.

According to a recent post from a 3ware employee on linux-ide-arrays,
a proper release is expected in February.  Obviously the best solution
is that they just give us the source to these tools so that we can fix
them ourselves.  Knowing that isn't going to happen I'm happy they're
at least giving us something while they catch up with the speed of
kernel progress.

I can verify the fact that info is busted when the controller is
verifying the array; I'll gather some more info and pass this on to
3ware.

 - J<