Re: BUG at net/sunrpc/svc_xprt.c:921

2013-02-25 Thread Mark Lord
On 13-01-17 08:53 AM, J. Bruce Fields wrote:
> On Thu, Jan 17, 2013 at 08:11:52AM -0500, Mark Lord wrote:
>> On 13-01-14 11:17 AM, Mark Lord wrote:
>>>
>>> Here's the code with the BUG() at net/sunrpc/svc_xprt.c line 921:
>>>
>>> /*
>>>  * Remove a dead transport
>>>  */
>>> static void svc_delete_xprt(struct svc_xprt *xprt)
>>> {
>>>         struct svc_serv *serv = xprt->xpt_server;
>>>         struct svc_deferred_req *dr;
>>>
>>>         /* Only do this once */
>>>         if (test_and_set_bit(XPT_DEAD, &xprt->xpt_flags))
>>>                 BUG();
>>

Saw this again today on 3.7.9 -- dunno if your changes are in that kernel
yet though.
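
For anyone not familiar with the idiom, the check quoted above is a once-only
guard: the first caller sets the flag and proceeds, any later caller finds it
already set.  A rough userspace sketch of the same idea, with C11 atomics
standing in for test_and_set_bit() (the demo_* names are made up for
illustration and are not kernel code):

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the XPT_DEAD bit in xpt_flags. */
#define DEMO_DEAD_BIT 0x1u

struct demo_xprt {
	atomic_uint flags;
};

/*
 * Atomically set the DEAD bit and report whether it was already set,
 * which is the information test_and_set_bit() gives its caller.
 */
static bool demo_test_and_set_dead(struct demo_xprt *x)
{
	unsigned int old = atomic_fetch_or(&x->flags, DEMO_DEAD_BIT);

	return (old & DEMO_DEAD_BIT) != 0;
}

/*
 * Returns true if this call performed the teardown, false if an
 * earlier caller already had.  The kernel code quoted above calls
 * BUG() in the second case instead of returning.
 */
static bool demo_delete_xprt(struct demo_xprt *x)
{
	if (demo_test_and_set_dead(x))
		return false;
	/* ... one-time teardown would go here ... */
	return true;
}
```

So hitting the BUG() means the delete path ran twice for the same transport.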

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [GIT] Networking

2013-02-21 Thread Mark Lord
On 13-02-20 10:05 PM, Linus Torvalds wrote:
> On Wed, Feb 20, 2013 at 2:09 PM, David Miller  wrote:
>>
>> 15) Orphan and delete a bunch of pre-historic networking drivers from
>> Paul Gortmaker.
> 
> Nooo You killed the 3c501 and 3c503 drivers! Snif.
> 
> I wonder if they still worked..

I hope they're not really dead, because we still use them in several machines
here as secondary interfaces for test rigs and whatnot.

It's a tad early to be nuking support for such widespread devices.



Re: [GIT] Networking

2013-02-21 Thread Mark Lord
On 13-02-21 09:26 PM, Paul Gortmaker wrote:
> On Thu, Feb 21, 2013 at 9:37 AM, Mark Lord  wrote:
>> On 13-02-20 10:05 PM, Linus Torvalds wrote:
>>> On Wed, Feb 20, 2013 at 2:09 PM, David Miller  wrote:
..
>>> Nooo You killed the 3c501 and 3c503 drivers! Snif.
>>>
>>> I wonder if they still worked..
>>
>> I hope they're not really dead, because we still use them in several
>> machines here as secondary interfaces for test rigs and whatnot.
..
> Did you actually look at the drivers deleted?
..

Finally got to one of the boxes here to check.
And you're right, I was confusing drivers.

I always seem to get the 3c509 (ISA) stuff confused with the 3c59x (PCI).
Our boxes here have the 3c59x (PCI) cards.

R.I.P. 3c50x.  :)


Re: Drop support for x86-32

2012-08-29 Thread Mark Lord
On 12-08-26 10:15 AM, wbrana wrote:
> On 8/26/12, Mark Lord  wrote:
>> Here are a couple of real scenarios you don't seem to have thought about.
>> A 32-bit kernel on a legacy (or even new) system in 2017 will still need
>> regular kernel updates (not "long term" unmaintained kernels)
>> in order to work with new USB devices, new 4KB+ sector hard drives,
>> newer generations of SSDs, etc..
> 12-years-old machine is trash.

There you go making assumptions again.
Who said anything about a 12-year old machine?

Much more likely is a 5-year old software installation
that gets moved to a new box.




Re: 20000+ wake-ups/second in 2.6.24. Bug?

2008-02-04 Thread Mark Lord

Arjan van de Ven wrote:

On Mon, 04 Feb 2008 12:29:03 -0500
Mark Lord <[EMAIL PROTECTED]> wrote:


re:  http://bugzilla.kernel.org/show_bug.cgi?id=9489

This just happened here again.  Or at least I finally noticed that
the fan on my notebook seemed to be running hard for much longer
than usual.  :)

Powertop showed 2.6.24-final running with 1-36000 wakeups/sec,
with *nothing* significant running:  top showed 97+% idle on both
cores.

-   Device: Errors: Correctable- Non-Fatal- Fatal+ Unsupported-
+   Device: Errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
   Device: RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
   Device: MaxPayload 128 bytes, MaxReadReq 128 bytes
   Link: Supported Speed 2.5Gb/s, Width x1, ASPM L0s L1, Port 1
@@ -101,12 +101,12 @@
   Latency: 0, Cache Line Size: 64 bytes
   Bus: primary=00, secondary=0c, subordinate=0c, sec-latency=0
   Memory behind bridge: efc0-efcf
-   Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast

TAbort- 
+   Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast

TAbort- Reset- FastB2B-


this shows you're having various types of really bad things going on, like PCI
master aborts and the like. Those would certainly be a factor in waking the cpu
up; they're basically hardware exceptions, and I can totally believe (would need
to find out from hw guys how this works in practice) that this sort of serious
error would keep the cpu out of deep C states until resolved.

..

Or perhaps some initialization on the main-boot path
just doesn't happen on the resume-from-hibernate paths ?
(either in the BIOS or kernel or drivers ..)


Re: [BUG] 2.6.24 refuses to boot - ATA problem?

2008-02-04 Thread Mark Lord

Gene Heskett wrote:

On Sunday 03 February 2008, Ingo Molnar wrote:

* Gene Heskett <[EMAIL PROTECTED]> wrote:

I believe it's the same, but lemme paste it for sure, yes:
[   26.339926] ENABLING IO-APIC IRQs
[   26.340119] ..TIMER: vector=0x31 apic1=0 pin1=0 apic2=-1 pin2=-1
[   26.350129] ..MP-BIOS bug: 8254 timer not connected to IO-APIC
[   26.350182] ...trying to set up timer (IRQ0) through the 8259A ... failed.
[   26.350185] ...trying to set up timer as Virtual Wire IRQ... failed.
[   26.360186] ...trying to set up timer as ExtINT IRQ... works.

The third line is the only line that makes it to the screen during the
boot trace.

Now, what does this tell us?

the question would be:

- if you remove the acpi_use_timer_override boot flag
- and if you boot a kernel with this hack applied

=> do those weird PATA failures come back?

If the failures do _not_ come back then the problem is somehow
affected/worked-around by the IO-APIC code that generates the above 4
lines. If the failures are still the same then the above 4 lines are
really just an uninteresting side-effect of the acpi_use_timer_override
flag - and the real side-effects (that fixes PATA on your box) are to be
found elsewhere.

Sadly, the latter variant is the expected answer.

Ingo


And at this point, I can't tell.  This reboot was from a cold start, without 
the argument, and cold by long enough to make the rounds about the house and 
pick up a beer, but not take my evening pillbox.  A minute cold, maybe 2 max.  
The log is clean since except for a kudzu nag of some sort:

..

Just to muddy your observations:  it is quite possible that a cold (power-off)
reboot may be required to properly observe what happens here.

Cheers


Re: 20000+ wake-ups/second in 2.6.24. Bug?

2008-02-04 Thread Mark Lord

re:  http://bugzilla.kernel.org/show_bug.cgi?id=9489

This just happened here again.  Or at least I finally noticed that
the fan on my notebook seemed to be running hard for much longer
than usual.  :)

Powertop showed 2.6.24-final running with 1-36000 wakeups/sec,
with *nothing* significant running:  top showed 97+% idle on both cores.

/proc/interrupts didn't have anything interesting either.

I've put a snap of the powertop output into Bug 9489 (link above),
along with the kernel .config again.

This was after an uptime of many days, with lots of suspend/resume (RAM)
cycles and even a few hibernate/resume cycles.

lspci -vv  doesn't show much different from a fresh reboot
versus what was seen "during" the problem:

--- lspci.rebooted  2008-02-04 12:18:53.0 -0500
+++ lspci.during    2008-02-04 12:16:04.0 -0500
@@ -44,7 +44,7 @@
   Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
   Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- TAbort- Reset- FastB2B-
   Capabilities: [88] Subsystem: Dell Unknown device 01cd
   Capabilities: [80] Power Management version 2
@@ -24,7 +24,7 @@
   Capabilities: [a0] Express Root Port (Slot+) IRQ 0
   Device: Supported: MaxPayload 128 bytes, PhantFunc 0, ExtTag-
   Device: Latency L0s <64ns, L1 <1us
-   Device: Errors: Correctable- Non-Fatal- Fatal- Unsupported-
+   Device: Errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
   Device: RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
   Device: MaxPayload 128 bytes, MaxReadReq 128 bytes
   Link: Supported Speed 2.5Gb/s, Width x16, ASPM L0s L1, Port 2
@@ -69,12 +69,12 @@
   Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- TAbort- Reset- FastB2B-
   Capabilities: [40] Express Root Port (Slot+) IRQ 0
   Device: Supported: MaxPayload 128 bytes, PhantFunc 0, ExtTag-
   Device: Latency L0s unlimited, L1 unlimited
-   Device: Errors: Correctable- Non-Fatal- Fatal+ Unsupported-
+   Device: Errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
   Device: RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
   Device: MaxPayload 128 bytes, MaxReadReq 128 bytes
   Link: Supported Speed 2.5Gb/s, Width x1, ASPM L0s L1, Port 1
@@ -101,12 +101,12 @@
   Latency: 0, Cache Line Size: 64 bytes
   Bus: primary=00, secondary=0c, subordinate=0c, sec-latency=0
   Memory behind bridge: efc0-efcf
-   Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- TAbort- Reset- FastB2B-
   Capabilities: [40] Express Root Port (Slot+) IRQ 0
   Device: Supported: MaxPayload 128 bytes, PhantFunc 0, ExtTag-
   Device: Latency L0s unlimited, L1 unlimited
-   Device: Errors: Correctable- Non-Fatal- Fatal- Unsupported-
+   Device: Errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
   Device: RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
   Device: MaxPayload 128 bytes, MaxReadReq 128 bytes
   Link: Supported Speed 2.5Gb/s, Width x1, ASPM L0s L1, Port 2
@@ -140,7 +140,7 @@
   Capabilities: [40] Express Root Port (Slot+) IRQ 0
   Device: Supported: MaxPayload 128 bytes, PhantFunc 0, ExtTag-
   Device: Latency L0s unlimited, L1 unlimited
-   Device: Errors: Correctable- Non-Fatal- Fatal- Unsupported-
+   Device: Errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
   Device: RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
   Device: MaxPayload 128 bytes, MaxReadReq 128 bytes
   Link: Supported Speed 2.5Gb/s, Width x1, ASPM L0s L1, Port 4


Rafael ?


Re: kgdb in git-x86#mm review

2008-02-11 Thread Mark Lord

Andi Kleen wrote:

..
You should probably use simple_strtoul() instead of inventing your
own hex parser in kgdb.c. And sprintf instead of your own hex writer.
In general, more use of sprintf would probably shorten a lot of the parser
code.
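
As a rough userspace illustration of that suggestion, with libc's strtoul()
standing in for the kernel's simple_strtoul() (the demo_* helpers are
hypothetical and are not taken from kgdb.c):

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse a hex string; returns 0 on success, -1 on bad input. */
static int demo_parse_hex(const char *s, unsigned long *out)
{
	char *end;

	errno = 0;
	*out = strtoul(s, &end, 16);
	/* Reject overflow, empty input and trailing garbage. */
	if (errno != 0 || end == s || *end != '\0')
		return -1;
	return 0;
}

/* Write val as lowercase hex; returns 0 if it fit, -1 if truncated. */
static int demo_format_hex(char *buf, size_t len, unsigned long val)
{
	int n = snprintf(buf, len, "%lx", val);

	return (n >= 0 && (size_t)n < len) ? 0 : -1;
}
```

Both directions collapse to a single library call plus error checking, which
is the shortening being argued for.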

..

Speaking of which.. the kernel implementation of snprintf() seems
to have a bug somewhere, in that it returns an incorrect count in
some situations -- mostly around where the buffer is too small to
hold the data being written.  There's an off-by-one bug there somewhere,
but I have not had time yet to track it down more precisely.
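
One thing worth ruling out while chasing this: per C99, snprintf() returns the
length the output would have had with a big enough buffer, not the number of
characters actually stored, so the count looks "wrong" exactly when the buffer
is too small.  A small userspace sketch of that behaviour (demo_would_be_len()
is a made-up helper, and this is libc's snprintf(), not the kernel's):

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format s into buf (capacity 'size') and return what snprintf()
 * reports.  Per C99 this is the length of the full output, whether
 * or not it was truncated to fit.
 */
static int demo_would_be_len(char *buf, size_t size, const char *s)
{
	return snprintf(buf, size, "%s", s);
}
```

With a 4-byte buffer and the string "abcdef", the buffer ends up holding "abc"
plus the NUL, while the return value is 6.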

Cheers


Re: [BUG] rfcomm

2008-02-13 Thread Mark Lord

Andrew Morton wrote:

On Mon, 11 Feb 2008 17:57:54 +0200 Alon Bar-Lev <[EMAIL PROTECTED]> wrote:


On Tuesday 06 November 2007, Alon Bar-Lev wrote:

On 11/6/07, Dave Young <[EMAIL PROTECTED]> wrote:

Hi,
sorry to reply again, this seems to be a different issue ...

All that I do is running pppd over the rfcomm, suspending the system and resume.
I don't load any binary module.

..

Tried 2.6.24.1...

..

Feb 11 17:46:05 alon1 usb 3-1: new full speed USB device using uhci_hcd and address 8
Feb 11 17:46:05 alon1 usb 3-1: configuration #1 chosen from 1 choice
Feb 11 17:46:05 alon1 BUG: unable to handle kernel NULL pointer dereference at virtual address 0008
Feb 11 17:46:05 alon1 printing eip: c01b2da6 *pde =  
Feb 11 17:46:05 alon1 Oops:  [#1] PREEMPT 
Feb 11 17:46:05 alon1 Modules linked in: aes_generic crypto_algapi ieee80211_crypt_ccmp ppp_deflate zlib_deflate zlib_inflate bsd_comp ppp_async thinkpad_acpi hwmon nvram vmnet(P) vmmon(P) tun radeon drm autofs4 ipv6 nf_nat_irc nf_nat_ftp nf_conntrack_irc nf_conntrack_ftp ipt_MASQUERADE iptable_nat nf_nat ipt_REJECT xt_tcpudp ipt_LOG xt_limit xt_state nf_conntrack_ipv4 nf_conntrack iptable_filter ip_tables x_tables rfcomm l2cap snd_pcm_oss snd_mixer_oss snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device ppp_generic slhc ioatdma dca cfq_iosched cpufreq_powersave cpufreq_ondemand cpufreq_conservative acpi_cpufreq freq_table uinput fan af_packet nls_cp1255 nls_iso8859_1 nls_utf8 nls_base hci_usb bluetooth pcmcia snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm ipw2200 nsc_ircc snd_timer irda ieee80211 snd psmouse yenta_socket ehci_hcd pcspkr ieee80211_crypt e1000 rsrc_nonstatic uhci_hcd soundcore i2c_i801 intel_agp crc_ccitt thermal sr_mod pcmcia_core snd_page_alloc battery rtc firmware_class agpgart ac processor cdrom sg button unix usbcore evdev ext3 jbd ext2 mbcache loop ata_piix libata sd_mod scsi_mod
Feb 11 17:46:05 alon1 
Feb 11 17:46:05 alon1 Pid: 4, comm: events/0 Tainted: P(2.6.24-gentoo-r1 #1)

Feb 11 17:46:05 alon1 EIP: 0060:[] EFLAGS: 00010286 CPU: 0
Feb 11 17:46:05 alon1 EIP is at sysfs_get_dentry+0x26/0x80
Feb 11 17:46:05 alon1 EAX:  EBX:  ECX:  EDX: ebf21000
Feb 11 17:46:05 alon1 ESI: eab4e880 EDI: f713bb40 EBP: f713bb40 ESP: f7c49f00
Feb 11 17:46:05 alon1 DS: 007b ES: 007b FS:  GS:  SS: 0068
Feb 11 17:46:05 alon1 Process events/0 (pid: 4, ti=f7c48000 task=f7c3efc0 task.ti=f7c48000)
Feb 11 17:46:05 alon1 Stack: f7c97120 f7135a68 f7e71e10 c01b303d   fffe c030ba9c 
Feb 11 17:46:05 alon1 f7c97120 f7135a68 f2fefb40 f7c97120 f7135a68 f2fefb40 c030ba8e c01ce1fb 
Feb 11 17:46:05 alon1 f75f1b00 c030ba8e f2fefb40 f75f1b00 f75f1b00  f7135a00  
Feb 11 17:46:05 alon1 Call Trace:

Feb 11 17:46:05 alon1 [] sysfs_move_dir+0x3d/0x1f0
Feb 11 17:46:05 alon1 [] kobject_move+0x9b/0x120
Feb 11 17:46:05 alon1 [] device_move+0x51/0x110
Feb 11 17:46:05 alon1 [] del_conn+0x0/0x40 [bluetooth]
Feb 11 17:46:05 alon1 [] del_conn+0x10/0x40 [bluetooth]
Feb 11 17:46:05 alon1 [] run_workqueue+0x81/0x140
Feb 11 17:46:05 alon1 [] schedule+0x168/0x2e0
Feb 11 17:46:05 alon1 [] autoremove_wake_function+0x0/0x50
Feb 11 17:46:05 alon1 [] worker_thread+0x9b/0xf0
Feb 11 17:46:05 alon1 [] autoremove_wake_function+0x0/0x50
Feb 11 17:46:05 alon1 [] worker_thread+0x0/0xf0
Feb 11 17:46:05 alon1 [] kthread+0x42/0x70
Feb 11 17:46:05 alon1 [] kthread+0x0/0x70
Feb 11 17:46:05 alon1 [] kernel_thread_helper+0x7/0x18
Feb 11 17:46:05 alon1 ===
Feb 11 17:46:05 alon1 Code: 26 00 00 00 00 57 89 c7 a1 50 1b 3a c0 56 53 8b 70 38 85 f6 74 08 8b 0e 85 c9 74 58 ff 06 8b 56 50 39 fa 74 47 89 fb eb 02 89 c3 <8b> 43 08 39 c2 75 f7 8b 46 08 83 c0 68 e8 98 e7 10 00 8b 43 10 
Feb 11 17:46:05 alon1 EIP: [] sysfs_get_dentry+0x26/0x80 SS:ESP 0068:f7c49f00


A number of bluetooth fixes went into 2.6.25-rc1.  It would be interesting
to see if we fixed this.

..

I had a strange thing happen with 2.6.24[.0] the other day.
My bluetooth serial dongles stopped working.
Unloading/reloading modules and daemons had no effect.
A system reboot cured it (for now).

That's the first time I've had unfixable bluetooth trouble, well, ever,
I suppose.
Just another useless data tidbit on 2.6.24.

-ml


Re: [PATCHSET] printk: implement printk_header() and merging printk, take #3

2008-02-14 Thread Mark Lord

Tejun Heo wrote:

..
* For timeouts, result TF isn't available and thus res printout is
misleading.  res shouldn't be printed after timeouts.  This would
require allocating yet another temp buf and separating out res printing
into separate snprintf.

..

And snprintf() is buggy, by the way.  It does not always seem to return
the correct character fill counts.  I've given up trying to use it here
for anything in kernelspace that *must* work.

Someday I'll go back and try to figure out where it's screwing up.
I think it is doing so in the cases where it runs out of space in the buffer.
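
For code that builds up a message piece by piece and must never run its offset
past the end of the buffer, one common workaround is a wrapper that returns
the count actually stored rather than the would-be length.  A minimal
userspace sketch follows; demo_scnprintf() is a hypothetical helper written
for this mail, loosely modelled on what the kernel's scnprintf() provides:

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Like snprintf(), but returns the number of characters actually
 * stored in buf (excluding the terminating NUL), so the result is
 * always safe to add to a write offset.
 */
static size_t demo_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (size == 0)
		return 0;
	va_start(ap, fmt);
	n = vsnprintf(buf, size, fmt, ap);
	va_end(ap);
	if (n < 0)
		return 0;
	/* Clamp the would-be length to what fits in the buffer. */
	return (size_t)n < size ? (size_t)n : size - 1;
}
```

Typical use is off += demo_scnprintf(buf + off, len - off, ...), which keeps
off <= len - 1 no matter how much output gets truncated.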

-ml


Re: Spurious completions during NCQ

2008-02-16 Thread Mark Lord

Hugo Mills wrote:

   I'm getting these on my Dell Latitude D830:

Feb 15 13:06:00 willow kernel: ata1.00: exception Emask 0x2 SAct 0x4 SErr 0x0 action 0x2 frozen
Feb 15 13:06:00 willow kernel: ata1.00: spurious completions during NCQ issue=0x0 SAct=0x4 FIS=004040a1:0002
Feb 15 13:06:00 willow kernel: ata1.00: cmd 61/10:10:26:fb:c4/00:00:02:00:00/40 tag 2 cdb 0x0 data 8192 out
Feb 15 13:06:00 willow kernel:  res 40/00:10:26:fb:c4/00:00:02:00:00/40 Emask 0x2 (HSM violation)
Feb 15 13:06:00 willow kernel: ata1: soft resetting port
Feb 15 13:06:00 willow kernel: ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Feb 15 13:06:00 willow kernel: ata1.00: configured for UDMA/133
Feb 15 13:06:00 willow kernel: ata1: EH complete
Feb 15 13:06:00 willow kernel: sd 0:0:0:0: [sda] 312581808 512-byte hardware sectors (160042 MB)
Feb 15 13:06:00 willow kernel: sd 0:0:0:0: [sda] Write Protect is off
Feb 15 13:06:00 willow kernel: sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
Feb 15 13:06:00 willow kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

   In some cases, there are several cmd/res lines listed. It's
happening about once an hour or so (not correlated with any other
event that I can see). It doesn't seem to be affecting operation of
the machine, but it's making me nervous.

   Can anyone set my mind at rest? (Or suggest a fix?)

..

Tejun, have the spurious completion fixes been backported
to 2.6.23 / 2.6.22 yet ?  Those kernels will be in common use
for some time to come, and this fix is more or less essential.

???


Re: [PATCHSET] printk: implement printk_header() and merging printk, take #3

2008-02-16 Thread Mark Lord

Tejun Heo wrote:

Andrew Morton wrote:

So, I guess it's NACK w/o suggested alternatives, right?

I wouldn't nack without good reasons, and I have none here.  I don't have
very strong opinions either way.


I was just wondering whether I should just go with snprintf dancing in
eh_link_report, which does make sense if not many need merging printk.

..

Any chance you could poke through snprintf() and look for the off-by-one bug
on the return result?  (I think it happens when "n" is exceeded).

:)


hdparm-8.1 now available

2008-02-16 Thread Mark Lord

hdparm has now been updated to version 8.1,
and includes some significant new features of
potential interest to Linux kernel storage hackers:

The new --make-bad-sector flag can be used to deliberately
corrupt a sector on the media (creating a media error situation).
This can be very handy for testing error recovery strategies
and timeouts for devices and RAIDs.

It uses the new ATA WRITE_UNC_EXT command (designed for the purpose)
when the drive supports it, otherwise it will try and fall back on
the older WRITE_LONG command (which is limited to LBA28).

The manpage has more information on this option.

There is also a new --write-sector (aka. --repair-sector) flag
to *fix* a bad sector.  This can be used later to undo the bad
sectors created by the --make-bad-sector flag.

The new --read-sector flag can be used to test a sector
for media errors.  I generally use the following sequence here:

  hdparm --make-bad-sector nnn /dev/sdb  ## corrupt a sector
  hdparm --read-sector nnn /dev/sdb      ## verify that it is now bad

  test my device driver etc..

  hdparm --repair-sector nnn /dev/sdb    ## fix the bad sector
  hdparm --read-sector nnn /dev/sdb      ## verify that it is now fixed


hdparm also now has a new -N flag for dealing with Host-Protected-Areas (HPA),
and other, more minor, fixes and enhancements.

hdparm-8.1 is available at http://sourceforge.net/projects/hdparm/

Thanks to Bruce Allen for supplying me with test drives
which implement the new WRITE_UNC_EXT command.

Cheers

Mark Lord


Regression from 3.4.9 to 3.4.16 "stable" kernel

2012-10-28 Thread Mark Lord
My server here runs the 3.4.xx series of "stable" kernels.
Until today, it was running 3.4.9.
Today I tried to upgrade it to 3.4.16.
It hangs in setup.c.

I've isolated the fault down to this specific change
that was made between 3.4.9 and 3.4.16.
Reverting this change allows the system to boot/run normally again.


--- linux-3.4.9/arch/x86/kernel/setup.c 2012-08-15 11:17:17.0 -0400
+++ linux-3.4.16/arch/x86/kernel/setup.c        2012-10-28 13:36:33.0 -0400
@@ -927,8 +927,21 @@

 #ifdef CONFIG_X86_64
        if (max_pfn > max_low_pfn) {
-               max_pfn_mapped = init_memory_mapping(1UL<<32,
-                                                    max_pfn<<PAGE_SHIFT);
+               int i;
+               for (i = 0; i < e820.nr_map; i++) {
+                       struct e820entry *ei = &e820.map[i];
+
+                       if (ei->addr + ei->size <= 1UL << 32)
+                               continue;
+
+                       if (ei->type == E820_RESERVED)
+                               continue;
+
+                       max_pfn_mapped = init_memory_mapping(
+                               ei->addr < 1UL << 32 ? 1UL << 32 : ei->addr,
+                               ei->addr + ei->size);
+               }
+
                /* can we preseve max_low_pfn ?*/
                max_low_pfn = max_pfn;
        }
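
To make the behavioural change easier to reason about, here is a userspace
sketch of which ranges the new loop ends up handing to init_memory_mapping().
It only mirrors the filtering in the diff above; struct demo_entry, the
DEMO_* constants and demo_ranges_above_4gb() are simplified stand-ins invented
for this mail, not the kernel's definitions:

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_4GB       (UINT64_C(1) << 32)
#define DEMO_RAM       1
#define DEMO_RESERVED  2

/* Hypothetical, simplified stand-in for struct e820entry. */
struct demo_entry {
	uint64_t addr;
	uint64_t size;
	int type;
};

struct demo_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Collect the ranges the loop would map: skip entries that end at or
 * below 4 GB, skip reserved entries, and clamp the start of anything
 * straddling 4 GB up to 4 GB.  Returns the number of ranges written
 * to out (at most max_out).
 */
static size_t demo_ranges_above_4gb(const struct demo_entry *map, size_t n,
				    struct demo_range *out, size_t max_out)
{
	size_t i, count = 0;

	for (i = 0; i < n && count < max_out; i++) {
		const struct demo_entry *e = &map[i];

		if (e->addr + e->size <= DEMO_4GB)
			continue;
		if (e->type == DEMO_RESERVED)
			continue;
		out[count].start = e->addr < DEMO_4GB ? DEMO_4GB : e->addr;
		out[count].end = e->addr + e->size;
		count++;
	}
	return count;
}
```

The old code made one call covering everything from 4 GB up to max_pfn; the
new code makes one call per qualifying map entry, so holes and reserved
regions above 4 GB are no longer mapped.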


Re: Regression from 3.4.9 to 3.4.16 "stable" kernel

2012-10-29 Thread Mark Lord
On 12-10-29 02:46 AM, Willy Tarreau wrote:
> On Mon, Oct 29, 2012 at 12:03:55AM -0400, Mark Lord wrote:
>> My server here runs the 3.4.xx series of "stable" kernels.
>> Until today, it was running 3.4.9.
>> Today I tried to upgrade it to 3.4.16.
>> It hangs in setup.c.
>>
>> I've isolated the fault down to this specific change
>> that was made between 3.4.9 and 3.4.16.
>> Reverting this change allows the system to boot/run normally again.
>>
>>
>> --- linux-3.4.9/arch/x86/kernel/setup.c  2012-08-15 11:17:17.0 -0400
>> +++ linux-3.4.16/arch/x86/kernel/setup.c 2012-10-28 13:36:33.0 -0400
>> @@ -927,8 +927,21 @@
>>
>>  #ifdef CONFIG_X86_64
>>         if (max_pfn > max_low_pfn) {
>> -               max_pfn_mapped = init_memory_mapping(1UL<<32,
>> -                                                    max_pfn<<PAGE_SHIFT);
>> +               int i;
>> +               for (i = 0; i < e820.nr_map; i++) {
>> +                       struct e820entry *ei = &e820.map[i];
>> +
>> +                       if (ei->addr + ei->size <= 1UL << 32)
>> +                               continue;
>> +
>> +                       if (ei->type == E820_RESERVED)
>> +                               continue;
>> +
>> +                       max_pfn_mapped = init_memory_mapping(
>> +                               ei->addr < 1UL << 32 ? 1UL << 32 : ei->addr,
>> +                               ei->addr + ei->size);
>> +               }
>> +
>>                 /* can we preseve max_low_pfn ?*/
>>                 max_low_pfn = max_pfn;
>>         }
> 
> For the record, it is this commit introduced in 3.4.16 :
> 
> commit efd5fa0c1a1d1b46846ea6e8d1a783d0d8a6a721
> Author: Jacob Shin 
> Date:   Thu Oct 20 16:15:26 2011 -0500
> 
> x86: Exclude E820_RESERVED regions and memory holes above 4 GB from
> direct mapping.
> 
> commit 1e779aabe1f0768c2bf8f8c0a5583679b54a upstream.
> 
> On systems with very large memory (1 TB in our case), BIOS may report a
> reserved region or a hole in the E820 map, even above the 4 GB range.
> Exclude these from the direct mapping.
> 
> [ hpa: this should be done not just for > 4 GB but for everything above
>   the legacy region (1 MB), at the very least.  That, however, turns out
>   to require significant restructuring.  That work is well underway, but
>   is not suitable for rc/stable. ]
> 
> Signed-off-by: Jacob Shin 
> Link: http://lkml.kernel.org/r/1319145326-13902-1-git-send-email-jacob.s...@amd.com
> Signed-off-by: H. Peter Anvin 
> Signed-off-by: Greg Kroah-Hartman 
> 
> Willy


Thanks, Willy.

I've also now downloaded linux-3.7.0-rc3, and it boots/runs without need for
patching.  So there's a fix somewhere in between that perhaps could also get
backported to -stable.

-ml



Re: Regression from 3.4.9 to 3.4.16 "stable" kernel

2012-10-29 Thread Mark Lord
On 12-10-29 10:22 AM, Mark Lord wrote:
> On 12-10-29 02:46 AM, Willy Tarreau wrote:
>> On Mon, Oct 29, 2012 at 12:03:55AM -0400, Mark Lord wrote:
>>> My server here runs the 3.4.xx series of "stable" kernels.
>>> Until today, it was running 3.4.9.
>>> Today I tried to upgrade it to 3.4.16.
>>> It hangs in setup.c.
>>>
>>> I've isolated the fault down to this specific change
>>> that was made between 3.4.9 and 3.4.16.
>>> Reverting this change allows the system to boot/run normally again.
..
>> For the record, it is this commit introduced in 3.4.16 :
>>
>> commit efd5fa0c1a1d1b46846ea6e8d1a783d0d8a6a721
>> Author: Jacob Shin 
>> Date:   Thu Oct 20 16:15:26 2011 -0500
>>
>> x86: Exclude E820_RESERVED regions and memory holes above 4 GB from
>> direct mapping.
>> 
>> commit 1e779aabe1f0768c2bf8f8c0a5583679b54a upstream.
>> 
>> On systems with very large memory (1 TB in our case), BIOS may report a
>> reserved region or a hole in the E820 map, even above the 4 GB range.
>> Exclude these from the direct mapping.
>> 
>> [ hpa: this should be done not just for > 4 GB but for everything above
>>   the legacy region (1 MB), at the very least.  That, however, turns out
>>   to require significant restructuring.  That work is well underway, but
>>   is not suitable for rc/stable. ]
>> 
>> Signed-off-by: Jacob Shin 
>> Link: http://lkml.kernel.org/r/1319145326-13902-1-git-send-email-jacob.s...@amd.com
>> Signed-off-by: H. Peter Anvin 
>> Signed-off-by: Greg Kroah-Hartman 
..
> I've also now downloaded linux-3.7.0-rc3, and it boots/runs without need for
> patching.  So there's a fix somewhere in between that perhaps could also get
> backported to -stable.
..

Heh.. except that kernel has its own issues -- hangs in some kind of screen loop
in the Radeon code (?) when trying to shut down.  ctrl-alt-sysrq s+u+s+b gets
out of that, but it hangs in a similar fashion during the subsequent reboot.

A full power-off was required to get the Radeon video to behave so I could
reboot the system with 3.4.16 again.  I'm not going to pursue that issue for
now, though.




Re: Regression from 3.4.9 to 3.4.16 "stable" kernel

2012-10-29 Thread Mark Lord
There's something else very wrong when going from 3.4.9 to 3.4.16.
I've done it on two machines here, one the AMD-450 server (64-bit),
and the other my main notebook (Core2duo 32-bit-PAE).

Both systems feel much more sluggish than usual with 3.4.16 running.
Reverted them both back to earlier kernels (3.4.9, 3.4.4-PAE),
and the usual responsive feel has returned.

Vague, I know, but something bad happened in there somewhere.

Cheers


Re: Regression from 3.4.9 to 3.4.16 "stable" kernel

2012-10-29 Thread Mark Lord
On 12-10-29 07:03 PM, Greg Kroah-Hartman wrote:
> On Mon, Oct 29, 2012 at 07:00:54PM -0400, Mark Lord wrote:
>> There's something else very wrong when going from 3.4.9 to 3.4.16.
>> I've done it on two machines here, one the AMD-450 server (64-bit),
>> and the other my main notebook (Core2duo 32-bit-PAE).
>>
>> Both systems feel much more sluggish than usual with 3.4.16 running.
>> Reverted them both back to earlier kernels (3.4.9, 3.4.4-PAE),
>> and the usual responsive feel has returned.
>>
>> Vague, I know, but something bad happened in there somewhere.
> 
> That's too vague for me to do anything with, sorry.  Bisection would be
> good if you can figure out how to measure this.

Well, I'd bet Donkeys to Daisies that reverting the kernel/sched.c changes
will probably fix the responsiveness, but I haven't done that yet.
I've lost enough time already debugging the other issues.

This is more just an indication that perhaps -stable patches need better review
than they're getting.  Take the setup.c breakage: as soon as I pointed it out,
a few people jumped in with knowledge that it was broken, and that patches
existed to fix it.

That kind of thing should be happening before a -stable release,
though I don't know how you would get the Right People to look
at this stuff then rather than after the fact.  Maybe a topic
for a future kernel summit or something.

Best wishes.
-ml



Re: [PATCH v2 2/6] PCI/MSI: Factor out pci_get_msi_cap() interface

2013-10-01 Thread Mark Lord
On 13-09-26 09:03 AM, Alexander Gordeev wrote:
> On Thu, Sep 26, 2013 at 08:32:53AM -0400, Mark Lord wrote:
>> On 13-09-18 05:48 AM, Alexander Gordeev wrote:
>>> The last pattern makes the most sense to me and could be updated with a more
>>> clear sequence - a call to (a bit modified) pci_msix_table_size() followed
>>> by a call to pci_enable_msix(). I think this pattern can effectively
>>> supersede the currently recommended "loop" practice.
>>
>> The loop is still necessary, because there's a race between those two calls,
>> so that pci_enable_msix() can still fail due to lack of MSIX slots.
> 
> Moreover, the existing loop pattern is racy and could fail just as easily ;)

Yes, but it then loops again to correct things.

> But (1) that is something drivers should expect and (2) there is basically
> nothing to race against - that is probably the reason it has not been a
> problem for pSeries. So I think we should not care about this.

I always care about race conditions.
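
A minimal userspace sketch of the race discussed above, not driver code. It assumes a made-up vector pool (struct fake_pool) and a made-up fake_enable_msix() that only mimics the old pci_enable_msix() return convention (0 on success, a positive count of vectors that could have been allocated, or a negative error). It compares a one-shot "query, then enable" sequence with the retry loop.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the platform's pool of free MSI-X vectors. */
struct fake_pool {
	int free_vectors;	/* vectors currently available */
	int steal_once;		/* vectors another device grabs before the next request */
};

/*
 * Mimics the old pci_enable_msix() return convention:
 *    0  success, all nvec vectors were allocated
 *   >0  nothing allocated, but this many vectors could have been
 *   <0  hard failure (errno-style code)
 */
static int fake_enable_msix(struct fake_pool *pool, int nvec)
{
	int stolen;

	assert(nvec > 0);

	/* Simulate the race: a competitor takes vectors just before us. */
	stolen = pool->steal_once < pool->free_vectors ?
		 pool->steal_once : pool->free_vectors;
	pool->free_vectors -= stolen;
	pool->steal_once = 0;

	if (pool->free_vectors == 0)
		return -ENOSPC;
	if (nvec > pool->free_vectors)
		return pool->free_vectors;
	pool->free_vectors -= nvec;
	return 0;
}

/* One-shot pattern: ask how many vectors exist, then request that many. */
static int query_then_enable(struct fake_pool *pool, int wanted)
{
	int avail = pool->free_vectors;		/* step 1: query */
	int nvec = wanted < avail ? wanted : avail;

	if (nvec <= 0)
		return -ENOSPC;
	return fake_enable_msix(pool, nvec);	/* step 2: can still come up short */
}

/* Loop pattern: retry with the smaller count until success or hard error. */
static int alloc_with_retry(struct fake_pool *pool, int wanted)
{
	int nvec = wanted;
	int rc;

	do {
		rc = fake_enable_msix(pool, nvec);
		if (rc > 0)
			nvec = rc;	/* fewer available than requested */
	} while (rc > 0);

	return rc ? rc : nvec;		/* negative error, or vectors obtained */
}
```

With a pool of 8 vectors where 5 are taken between the query and the enable, query_then_enable() returns a positive value (a failure under the old convention), while alloc_with_retry() settles on the 3 vectors that remain.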




Re: [PATCH RFC 00/77] Re-design MSI/MSI-X interrupts enablement pattern

2013-10-08 Thread Mark Lord
On 13-10-02 06:29 AM, Alexander Gordeev wrote:
..
> This update converts pci_enable_msix() and pci_enable_msi_block()
> interfaces to canonical kernel functions and makes them return a
> error code in case of failure or 0 in case of success.

Rather than silently break dozens of drivers in mysterious ways,
please invent new function names for the replacements to the
existing pci_enable_msix() and pci_enable_msi_block() functions.

That way, both in-tree and out-of-tree drivers will notice the API change,
rather than having it go unseen and just failing for unknown reasons.

Thanks.
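
A minimal sketch of the concern, in userspace C. The function names and the AVAILABLE constant are hypothetical. It models a driver written against the old return convention and shows that the same caller compiles unchanged against a function following the proposed convention (0 or a negative error only), yet quietly ends up with no vectors at all.

```c
#include <assert.h>
#include <errno.h>

#define AVAILABLE 4	/* hypothetical: vectors the platform can provide */

/* Old convention: 0 = ok, >0 = how many could be had, <0 = error. */
static int enable_old_style(int nvec)
{
	assert(nvec > 0);
	return nvec <= AVAILABLE ? 0 : AVAILABLE;
}

/* Proposed convention: 0 = ok, <0 = error, never a positive count. */
static int enable_new_style(int nvec)
{
	assert(nvec > 0);
	return nvec <= AVAILABLE ? 0 : -ENOSPC;
}

/*
 * Driver code written for the old convention.
 * Returns the number of vectors in use, or 0 if MSI-X setup was abandoned.
 */
static int driver_setup(int (*enable)(int), int wanted)
{
	int nvec = wanted;
	int rc;

	while ((rc = enable(nvec)) > 0)
		nvec = rc;		/* retry with what is available */

	return rc == 0 ? nvec : 0;
}
```

Here driver_setup(enable_old_style, 16) ends up with 4 vectors, while driver_setup(enable_new_style, 16) returns 0 without any compiler diagnostic, because the prototype is identical. A renamed replacement function would turn that into a build failure instead.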


Re: [PATCH RFC 00/77] Re-design MSI/MSI-X interrupts enablement pattern

2013-10-10 Thread Mark Lord
Just to help us all understand "the loop" issue..

Here's an example of driver code which uses the existing MSI-X interfaces,
for a device which can work with either 16, 8, 4, 2, or 1 MSI-X interrupt.
This is from a new driver I'm working on right now:


static int xx_alloc_msix_irqs (struct xx_dev *dev, int nvec)
{
	xx_disable_all_irqs(dev);
	do {
		if (nvec < 2)
			xx_prep_for_1_msix_vector(dev);
		else if (nvec < 4)
			xx_prep_for_2_msix_vectors(dev);
		else if (nvec < 8)
			xx_prep_for_4_msix_vectors(dev);
		else if (nvec < 16)
			xx_prep_for_8_msix_vectors(dev);
		else
			xx_prep_for_16_msix_vectors(dev);
		nvec = pci_enable_msix(dev->pdev, dev->irqs, dev->num_vectors);
	} while (nvec > 0);

	if (nvec) {
		kerr(dev->name, "pci_enable_msix() failed, err=%d", nvec);
		dev->num_vectors = 0;
		return nvec;
	}
	return 0;	/* success */
}


Re: [PATCH v2 2/6] PCI/MSI: Factor out pci_get_msi_cap() interface

2013-09-26 Thread Mark Lord
On 13-09-18 05:48 AM, Alexander Gordeev wrote:
>
> The last pattern makes most of sense to me and could be updated with a more
> clear sequence - a call to (bit modified) pci_msix_table_size() followed
> by a call to pci_enable_msix(). I think this pattern can effectively
> supersede the currently recommended "loop" practice.

The loop is still necessary, because there's a race between those two calls,
so that pci_enable_msix() can still fail due to lack of MSIX slots.


Re: SATA hdd refuses to reallocate a sector?

2013-06-23 Thread Mark Lord
On 13-06-23 03:00 PM, Pavel Machek wrote:
>
> Thanks for the hint. (Insert rant about hdparm documentation
> explaining that it is bad idea, but not telling me _why_ is it bad
> idea. Can I expect cache consistency issues after that, or is it just
> simple "you are writing to the disk without any checks"? Plus, I guess
> documentation should mention what sector number is. I guess sectors
> are 512bytes for the old drives, but is it 512 or 4096 for new
> drives?)

For ATA, use the "logical sector size".
For all existing drives out there, that's a 512 byte unit.
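
A small sketch of what the 512-byte unit means when converting between a sector number and a byte offset. The macro and function names are made up for illustration and are not part of hdparm.

```c
#include <assert.h>
#include <stdint.h>

/* ATA logical sector size, per the explanation above. */
#define ATA_LOGICAL_SECTOR_SIZE 512u

/* Byte offset on the device at which logical sector `lba` starts. */
static uint64_t lba_to_byte_offset(uint64_t lba)
{
	/* Refuse values whose byte offset would overflow 64 bits. */
	assert(lba <= UINT64_MAX / ATA_LOGICAL_SECTOR_SIZE);
	return lba * ATA_LOGICAL_SECTOR_SIZE;
}

/* Logical sector that contains byte `offset`. */
static uint64_t byte_offset_to_lba(uint64_t offset)
{
	return offset / ATA_LOGICAL_SECTOR_SIZE;
}
```

For example, logical sector 8 starts at byte 4096, so on a drive with 4096-byte physical sectors it is the first logical sector of the second physical sector.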

> ...but it does not do the trick :-(. It behaves strangely as if it was
> still cached somewhere. Do I need to turn off the write back cache?

No, it works just fine.  You probably have more than one bad sector.
After you see a read failure, run "smartctl -a" and look at the error
logs to see what sector the drive is choking on.

Or just low-level format it all with "hdparm --security-erase".

Cheers
-- 
Mark Lord
Real-Time Remedies Inc.
ml...@pobox.com


Re: SATA hdd refuses to reallocate a sector?

2013-06-23 Thread Mark Lord
On 13-06-23 05:51 PM, Pavel Machek wrote:
> On Sun 2013-06-23 17:27:52, Mark Lord wrote:
>
>> For all existing drives out there, that's a 512 byte unit.
> 
> I guessed so. (It would be good to actually document it, as well as
> documenting exactly why it is dangerous. Is it okay to send patches?)

Absolutely.  Please, even!

> Well, I definitely have more than one bad sector, but I did try to
> read exactly the same sector and it failed. See below.
..
read failed.
write works.
read failed.
write works.
read works.
dd failed.
read works.
read works.
read failed.

Odd.  The drive must be furiously reshuffling sectors or something,
or more likely pushing a piece of dirt around scuffing up more bits.

hdparm generally talks directly to a drive, not through the block
or filesystem layers.  So the block, filesystem, and page-cache stuff
don't know anything about --read-sector and --write-sector.

Cheers
-- 
Mark Lord
Real-Time Remedies Inc.
ml...@pobox.com


Re: SATA hdd refuses to reallocate a sector?

2013-06-24 Thread Mark Lord
On 13-06-24 03:14 AM, Ondrej Zary wrote:
..
> Being tired of using hdparm manually, I created a simple hdd_realloc utility
> that reads the disk in big blocks (1 MB). When there's a read error, it reads
> the failed block sector-by-sector and tries to rewrite the sectors that fail
> to read. It work fine for disks with just a couple of pending sectors.

Something like that would work very well if it used the hdparm approach
(directly to the drive) for the sector-by-sector part.

Going through the block layer isn't always going to work,
because the kernel likes to do I/O in PAGE_SIZE multiples.

And the SCSI stack in Linux has rather atrocious error handling.
It lumps multiple requests together, and can fail the entire lot even
if only a single sector is bad.
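
A rough sketch of the scan strategy being discussed: read in large chunks, and fall back to single-sector reads only inside a chunk that failed. This is not the hdd_realloc utility itself. The read callback, the fake drive, and all names are hypothetical; a real tool would issue the single-sector reads directly to the drive, as suggested above, rather than through the block layer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical read callback: returns 0 if all `count` sectors starting
 * at `lba` could be read, nonzero if any of them failed.
 */
typedef int (*read_fn)(uint64_t lba, uint32_t count, void *ctx);

/*
 * Scan sectors [0, total) in `chunk`-sized reads.  When a chunk fails,
 * re-read it one sector at a time and record each failing LBA in bad[]
 * (at most max_bad entries are stored).  Returns the number of bad
 * sectors seen, which may exceed max_bad.
 */
static size_t find_bad_sectors(read_fn rd, void *ctx, uint64_t total,
			       uint32_t chunk, uint64_t *bad, size_t max_bad)
{
	size_t nbad = 0;

	assert(chunk > 0);

	for (uint64_t lba = 0; lba < total; lba += chunk) {
		uint32_t n = (total - lba < chunk) ?
			     (uint32_t)(total - lba) : chunk;

		if (rd(lba, n, ctx) == 0)
			continue;	/* whole chunk read fine */

		for (uint32_t i = 0; i < n; i++) {
			if (rd(lba + i, 1, ctx) != 0) {
				if (nbad < max_bad)
					bad[nbad] = lba + i;
				nbad++;
			}
		}
	}
	return nbad;
}

/* Fake drive for demonstration: ctx points at the single unreadable LBA. */
static int fake_read(uint64_t lba, uint32_t count, void *ctx)
{
	uint64_t bad_lba = *(const uint64_t *)ctx;

	return (bad_lba >= lba && bad_lba < lba + count) ? -1 : 0;
}
```

With 512-byte sectors, a 1 MB block corresponds to a chunk of 2048 sectors. A caller would then rewrite only the LBAs collected in bad[].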

Cheers
-- 
Mark Lord
Real-Time Remedies Inc.
ml...@pobox.com


Re: SATA hdd refuses to reallocate a sector?

2013-06-29 Thread Mark Lord
On 13-06-29 02:47 PM, Henrique de Moraes Holschuh wrote:
> You know, either the "long" or the "offline" SMART test routines do exactly
> that on any spinning rust device with a firmware that is not utterly broken.
> 
> The HDD's firmware will rewrite, and even reallocate any "weak" sectors
> found by the surface scan.
> 

The drives I have tried this on (smartctl -t long),
abort at the first bad sector.  Not useful.



Re: Drop support for x86-32

2012-08-26 Thread Mark Lord
On 12-08-24 12:45 PM, wbrana wrote:
> On 8/24/12, Alan Cox  wrote:
>> That doesn't work for a variety of reasons x86 hardware is still
>> changing, devices are still changing. So please exit cloud cuckoo land
>> and go do something useful.
> Hardware will be discontinued if no software will support it.

Here are a couple of real scenarios you don't seem to have thought about.
A 32-bit kernel on a legacy (or even new) system in 2017 will still need
regular kernel updates (not "long term" unmaintained kernels)
in order to work with new USB devices, new 4KB+ sector hard drives,
newer generations of SSDs, etc..

It's (mostly) all about drivers.

Cheers


Re: 2.6.25-rc1/2 CD/DVD burning broken

2008-02-16 Thread Mark Lord

Andreas Schwab wrote:

Jeff Garzik <[EMAIL PROTECTED]> writes:


Andreas Schwab wrote:

Since commit aaa04c28cb9a1efd42541fdb7ab648231c2a2263 [blk_end_request:
changing ide-cd (take 4)] I cannot burn any CD/DVD any more, getting the
following error from wodim:

Errno: 0 (Success), write_g1 scsi sendcmd: no error
CDB:  2A 00 00 00 00 00 00 00 1F 00
status: 0x2 (CHECK CONDITION)
Sense Bytes: 70 00 05 00 00 00 00 0E 00 00 00 00 21 02 00 00
Sense Key: 0x5 Illegal Request, Segment 0
Sense Code: 0x21 Qual 0x02 (invalid address for write) Fru 0x0
Sense flags: Blk 0 (not valid) resid: 63488

Does libata on the same hardware work?


There is no libata driver for ide-pmac.

..

What chipset is that..  CMD646 ?


Re: What's needed for a PCIe card to be recognized?

2008-02-17 Thread Mark Lord

Hans J. Koch wrote:

Am Sun, 17 Feb 2008 07:29:27 -0800
schrieb Arjan van de Ven <[EMAIL PROTECTED]>:


On Sun, 17 Feb 2008 13:37:53 +0100
"Hans J. Koch" <[EMAIL PROTECTED]> wrote:

Of course there's no driver for the wlan, but that's a different
story ;)

I replaced that unsupported Atheros 5007 card with an ipw3945, so I
haven't got that problem.

oh but then you have a MUCH bigger problem ;(
The bios of that animal is hardcoded to the 5007 (or at least
Atheros). If you stick your own card in, for FCC reasons, the bios
refuses the card.

..

s/FCC/brand protection by Atheros/


Really? Unbelievable what these guys do to make my life harder...
So, they might use some undocumented GPIO to turn the power on, and
refuse that if they don't find the original card? Looks like I can't
have WLAN on an EeePC (I won't run a tainted kernel). Stupid thing to
sell a PC with Linux preinstalled but with hardware not supported in
mainline.

..

Try it again with 2.6.25-rc2 and this module option:

   options pciehp pciehp_force=1

Just a thin hope, really, but it might work.

Cheers


Re: What's needed for a PCIe card to be recognized?

2008-02-17 Thread Mark Lord

Mark Lord wrote:

Hans J. Koch wrote:

..

Really? Unbelievable what these guys do to make my life harder...
So, they might use some undocumented GPIO to turn the power on, and

...

GPIO lines are not usually very difficult to trace,
and programming them is pretty easy, too ...

If I had an EeePC here, I'd do that for you (and everyone else),
but I'm waiting for a lower-power (fanless) unit to be introduced first.


refuse that if they don't find the original card? Looks like I can't
have WLAN on an EeePC (I won't run a tainted kernel). Stupid thing to
sell a PC with Linux preinstalled but with hardware not supported in
mainline.

..

Try it again with 2.6.25-rc2 and this module option:

   options pciehp pciehp_force=1

Just a thin hope, really, but it might work.



hdparm-8.2 now available

2008-02-18 Thread Mark Lord

hdparm for SATA/PATA has now been updated to version 8.2.

Upgrading is strongly recommended for all users.
This version fixes several bugs in earlier 7.x/8.x releases
that could cause issues when advanced features are attempted
on devices being driven by the old drivers/ide subsystem.

libata devices were unaffected, but drivers/ide devices could
misbehave or even be corrupted by some operations.

hdparm-8.2 is available at http://sourceforge.net/projects/hdparm/ 


Upgrading from any earlier 7.x or 8.x version
is strongly recommended for all users.

Cheers
--
Mark Lord
Real-Time Remedies Inc.
[EMAIL PROTECTED]


Re: very poor ext3 write performance on big filesystems?

2008-02-19 Thread Mark Lord

Mark Lord wrote:

Theodore Tso wrote:
..

The following ld_preload can help in some cases.  Mutt has this hack
encoded in for maildir directories, which helps.

..

Oddly enough, that same spd_readdir() preload craps out here too
when used with "rm -r" on largish directories.

I added a bit more debugging to it, and it always craps out like this:
opendir dir=0x805ad10((nil))
Readdir64 dir=0x805ad10 pos=0/289/290
Readdir64 dir=0x805ad10 pos=1/289/290
Readdir64 dir=0x805ad10 pos=2/289/290
Readdir64 dir=0x805ad10 pos=3/289/290
Readdir64 dir=0x805ad10 pos=4/289/290
...
Readdir64 dir=0x805ad10 pos=287/289/290
Readdir64 dir=0x805ad10 pos=288/289/290
Readdir64 dir=0x805ad10 pos=289/289/290
Readdir64 dir=0x805ad10 pos=0/289/290
Readdir64: dirstruct->dp=(nil)
Readdir64: ds=(nil)
Segmentation fault (core dumped)
   
Always.  The "rm -r" loops over the directory, as shown above,
and then tries to re-access entry 0 somehow, at which point
it discovers that it's been NULLed out.

Which is weird, because the local seekdir() was never called,
and the code never zeroed/freed that memory itself
(I've got printfs in there..).

Nulling out the qsort has no effect, and smaller/larger
ALLOC_STEPSIZE values don't seem to matter.

But.. when the entire tree is in RAM (freshly unpacked .tar),
it seems to have no problems with it.  As opposed to an uncached tree.

..

I take back that last point -- it also fails even when the tree *is* cached.


Re: very poor ext3 write performance on big filesystems?

2008-02-19 Thread Mark Lord

Theodore Tso wrote:
..

The following ld_preload can help in some cases.  Mutt has this hack
encoded in for maildir directories, which helps.

..

Oddly enough, that same spd_readdir() preload craps out here too
when used with "rm -r" on largish directories.

I added a bit more debugging to it, and it always craps out like this:

opendir dir=0x805ad10((nil))

Readdir64 dir=0x805ad10 pos=0/289/290
Readdir64 dir=0x805ad10 pos=1/289/290
Readdir64 dir=0x805ad10 pos=2/289/290
Readdir64 dir=0x805ad10 pos=3/289/290
Readdir64 dir=0x805ad10 pos=4/289/290
...
Readdir64 dir=0x805ad10 pos=287/289/290
Readdir64 dir=0x805ad10 pos=288/289/290
Readdir64 dir=0x805ad10 pos=289/289/290
Readdir64 dir=0x805ad10 pos=0/289/290
Readdir64: dirstruct->dp=(nil)
Readdir64: ds=(nil)
Segmentation fault (core dumped)



Always.  The "rm -r" loops over the directory, as shown above,
and then tries to re-access entry 0 somehow, at which point
it discovers that it's been NULLed out.

Which is weird, because the local seekdir() was never called,
and the code never zeroed/freed that memory itself
(I've got printfs in there..).

Nulling out the qsort has no effect, and smaller/larger
ALLOC_STEPSIZE values don't seem to matter.

But.. when the entire tree is in RAM (freshly unpacked .tar),
it seems to have no problems with it.  As opposed to an uncached tree.

Peculiar.. I wonder where the bug is ?


Re: very poor ext3 write performance on big filesystems?

2008-02-19 Thread Mark Lord

Paulo Marques wrote:

Mark Lord wrote:

Theodore Tso wrote:
..

The following ld_preload can help in some cases.  Mutt has this hack
encoded in for maildir directories, which helps.

..

Oddly enough, that same spd_readdir() preload craps out here too
when used with "rm -r" on largish directories.


From looking at the code, I think I've found at least one bug in opendir:
...
dnew = realloc(dirstruct->dp,
dirstruct->max * sizeof(struct dir_s));

...

Shouldn't this be: "...*sizeof(struct dirent_s));"?

..

Yeah, that's one bug.
Another is that ->fd is frequently left uninitialized, yet later used.

Fixing those didn't change the null pointer deaths, though.
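
A minimal sketch of how the two bugs mentioned above are commonly avoided. The struct layouts below are made-up stand-ins, not the real spd_readdir definitions. Sizing the realloc() from the pointer being assigned keeps the element type correct even if the struct names are confused, and the caller initializes fd explicitly.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the preload's structures. */
struct dirent_s {
	unsigned long long d_ino;
	long long d_off;
	char *d_name;
};

struct dir_s {
	struct dirent_s *dp;	/* growable array of entries */
	size_t num;		/* entries in use */
	size_t max;		/* entries allocated */
	int fd;			/* -1 when no descriptor is open */
};

#define ALLOC_STEPSIZE 100

/* Grow the entry array by one step.  Returns 0 on success, -1 on failure. */
static int dir_grow(struct dir_s *d)
{
	size_t new_max = d->max + ALLOC_STEPSIZE;
	/* sizeof(*dnew) is the element size, whatever the struct is called. */
	struct dirent_s *dnew = realloc(d->dp, new_max * sizeof(*dnew));

	if (!dnew)
		return -1;	/* the old array in d->dp is still valid */
	d->dp = dnew;
	d->max = new_max;
	return 0;
}
```

A caller would start from a fully initialized struct, for example `struct dir_s d = { NULL, 0, 0, -1 };`, so that fd never holds an indeterminate value.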




Re: hdparm-8.2 now available

2008-02-20 Thread Mark Lord

Greg Freemyer wrote:

Mark,

What kernel level is needed to support the new -N arg?

..

I believe it should work with 2.4.0 or newer.
But some kernels have a buggy implementation of it.


Tried it on a Suse 2.6.22 kernel (possibly not patched with all the
current security updates).

Failed with:

The Running Kernel Lack CONFIG_IDE_TASK_IOCTL Support.

..

That failure indicates that your drive is not using libata drivers.
USB/SD drives/drivers do not support this flag at all,
but drives using the ancient IDE drivers should work.

For drives using the ancient IDE drivers, you will need
to use a kernel configured with the CONFIG_IDE_TASK_IOCTL flag.
If you see the message above, then that means your kernel
was probably built without that configuration option selected.

Look in /proc/config.gz and see if the option shows up there,
or in "cat /boot/config-`uname -r`".  If you don't see it
in there, then the kernel would have to get rebuilt with
that config option to enable the feature.

Or just switch to the modern libata drivers for IDE/SATA drives.

Note also that hdparm has now been updated to version 8.4,
with some unrelated bug fixes.

Cheers
--
Mark Lord
Real-Time Remedies Inc.
[EMAIL PROTECTED]


Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.

2008-02-20 Thread Mark Lord

Jeff Chua wrote:



On Feb 20, 2008 2:19 PM, Jeff Chua

I'll try the "idle=poll" to see if that works and will try some printk


I don't know what exactly the i915_suspend() and i915_resume() are 
supposed to do because it works better without them.


After inserting "return 0;" right at the top of those two functions, 
suspend (and power-off properly), and resume (without green screen) 
works just fine.

..

Does this machine have more than one CPU core?  If so..
Does your kernel have CONFIG_HOTPLUG_CPU=y (if not, enable it).

??


Re: [PATCH] sata_mv: fix loop with last port

2008-02-07 Thread Mark Lord

Yinghai Lu wrote:

[PATCH] sata_mv: fix loop with last port

commit f351b2d638c3cb0b95adde3549b7bfaf3f991dfa
sata_mv: Support SoC controllers

cause panic:

scsi 4:0:0:0: Direct-Access ATA  HITACHI HDS7225S V44O PQ: 0 ANSI: 5
sd 4:0:0:0: [sde] 488390625 512-byte hardware sectors (250056 MB)
sd 4:0:0:0: [sde] Write Protect is off
sd 4:0:0:0: [sde] Mode Sense: 00 3a 00 00
sd 4:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support 
DPO or FUA
sd 4:0:0:0: [sde] 488390625 512-byte hardware sectors (250056 MB)
sd 4:0:0:0: [sde] Write Protect is off
sd 4:0:0:0: [sde] Mode Sense: 00 3a 00 00
sd 4:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support 
DPO or FUA
 sde:<1>BUG: unable to handle kernel NULL pointer dereference at 
001a
IP: [] mv_interrupt+0x21c/0x4cc
PGD 0
Oops:  [1] SMP
CPU 3
Modules linked in:
Pid: 0, comm: swapper Not tainted 2.6.24-smp-08636-g0afc2ed-dirty #26
RIP: 0010:[]  [] mv_interrupt+0x21c/0x4cc
RSP: :8102050bbec8  EFLAGS: 00010297
RAX: 0008 RBX:  RCX: 0003
RDX: 8000 RSI: 0286 RDI: 8102035180e0
RBP: 0001 R08: 0003 R09: 8102036613e0
R10: 0002 R11: 8061474c R12: 8102035bf828
R13: 0008 R14: 81020348ece8 R15: c20002cb2000
FS:  () GS:810405025700() knlGS:
CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
CR2: 001a CR3: 00201000 CR4: 06e0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400
Process swapper (pid: 0, threadinfo 810405094000, task 8102050b28c0)
Stack:  0001000c 000204220400 00110002 81020348eda8
 0001 8102035f2cc0  
 0018   80269ee8
Call Trace:
   [] ? handle_IRQ_event+0x25/0x53
 [] ? handle_fasteoi_irq+0x90/0xc8
 [] ? do_IRQ+0xf1/0x15f
 [] ? default_idle+0x0/0x55
 [] ? ret_from_intr+0x0/0xa
   [] ? lapic_next_event+0x0/0xa
 [] ? default_idle+0x31/0x55
 [] ? default_idle+0x2c/0x55
 [] ? default_idle+0x0/0x55
 [] ? cpu_idle+0x92/0xb8


Code: 41 14 85 c0 89 44 24 14 0f 84 9d 02 00 00 f7 d0 01 d6 41 89 d5 89 41 14 8b 41 
14 89 34 24 e9 7e 02 00 00 49 63 c5 49 8b 5c c6 48  43 1a 80 4c 8b a3 20 37 
00 00 0f 85 62 02 00 00 31 c9 41 83
RIP  [] mv_interrupt+0x21c/0x4cc
 RSP 
CR2: 001a
---[ end trace 2583b5f7a5350584 ]---
Kernel panic - not syncing: Aiee, killing interrupt handler!

last_port already include port0 base.
this patch change use last_port directly, and move pp assignment later.

Signed-off-by: Yinghai Lu <[EMAIL PROTECTED]>

..

Yup, obvious bug fixes, thanks.
Signed-off-by: Mark Lord <[EMAIL PROTECTED]>


Index: linux-2.6/drivers/ata/sata_mv.c
===
--- linux-2.6.orig/drivers/ata/sata_mv.c
+++ linux-2.6/drivers/ata/sata_mv.c
@@ -1716,14 +1716,16 @@ static void mv_host_intr(struct ata_host
 	VPRINTK("ENTER, hc%u relevant=0x%08x HC IRQ cause=0x%08x\n",
 		hc, relevant, hc_irq_cause);
 
-	for (port = port0; port < port0 + last_port; port++) {
+	for (port = port0; port < last_port; port++) {
 		struct ata_port *ap = host->ports[port];
-		struct mv_port_priv *pp = ap->private_data;
+		struct mv_port_priv *pp;
 		int have_err_bits, hard_port, shift;
 
 		if ((!ap) || (ap->flags & ATA_FLAG_DISABLED))
 			continue;
 
+		pp = ap->private_data;
+
 		shift = port << 1;		/* (port * 2) */
 		if (port >= MV_PORTS_PER_HC) {
 			shift++;	/* skip bit 8 in the HC Main IRQ reg */
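
A small userspace sketch of the loop-bound problem the patch addresses. It assumes, per the patch description, that last_port already includes the port0 base; the hc_range() helper and the exact way the bounds are computed are assumptions made for illustration, not code copied from sata_mv.c.

```c
#include <assert.h>

#define MV_PORTS_PER_HC 4	/* ports per host controller, as in the patch context */

/* Result of walking a port range. */
struct walk {
	int visited;	/* number of port indices the loop touched */
	int highest;	/* highest index touched, -1 if none */
};

/* Loop bound before the fix: adds port0 to a bound that already contains it. */
static struct walk walk_before_fix(int port0, int last_port)
{
	struct walk w = { 0, -1 };

	for (int port = port0; port < port0 + last_port; port++) {
		w.visited++;
		w.highest = port;
	}
	return w;
}

/* Loop bound after the fix: last_port is used directly. */
static struct walk walk_after_fix(int port0, int last_port)
{
	struct walk w = { 0, -1 };

	for (int port = port0; port < last_port; port++) {
		w.visited++;
		w.highest = port;
	}
	return w;
}

/* Hypothetical helper: port range for host controller `hc`. */
static void hc_range(int hc, int *port0, int *last_port)
{
	assert(hc >= 0);
	*port0 = hc * MV_PORTS_PER_HC;
	*last_port = *port0 + MV_PORTS_PER_HC;
}
```

Under these assumptions both loops agree for the first host controller (port0 is 0), but for the second one the old bound walks indices 4 through 11 instead of 4 through 7. Indices past the populated ports would yield a NULL or bogus ap, and the old code read ap->private_data before testing ap, which is consistent with the NULL pointer dereference in the report.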



Re: BUG at net/sunrpc/svc_xprt.c:921 (another one)

2013-01-20 Thread Mark Lord
Got it again, this time on a different system
running mostly the same software.

This time, I noticed when it happened:  on mounting another system's storage
over NFS.  I was doing a "mount" command when suddenly this happened.
Linux-3.7.3.

[ 3342.841487] [ cut here ]
[ 3342.841527] kernel BUG at net/sunrpc/svc_xprt.c:921!
[ 3342.841547] invalid opcode:  [#1] PREEMPT SMP
[ 3342.841579] Modules linked in: nfsv3 nfsv4 sha1_generic ppp_mppe ppp_async 
crc_ccitt ppp_generic
slhc btusb hid_generic arc4 usbhid hid b43 coretemp kvm_intel kvm mac80211 
cfg80211
snd_hda_codec_idt dell_wmi snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss 
snd_mixer_oss snd_pcm
dell_laptop dcdbas snd_seq_dummy snd_seq_oss microcode snd_seq_midi psmouse 
snd_rawmidi
snd_seq_midi_event snd_seq nouveau ttm drm_kms_helper ssb drm bnep i2c_algo_bit 
snd_timer i2c_core
mxm_wmi snd_seq_device rfcomm snd bluetooth parport_pc soundcore snd_page_alloc 
ppdev video nfsd
auth_rpcgss nfs lockd binfmt_misc sunrpc lp parport firewire_ohci tg3 
firewire_core crc_itu_t libphy
hwmon sdhci_pci sdhci
[ 3342.842104] CPU 1
[ 3342.842120] Pid: 4108, comm: nfsv4.0-svc Not tainted 3.7.3 #3 Dell Inc. 
Precision M6300
   /0JM680
[ 3342.842151] RIP: 0010:[]  [] 
svc_delete_xprt+0x23/0xf3 [sunrpc]
[ 3342.842206] RSP: 0018:8801a680be38  EFLAGS: 00010286
[ 3342.842226] RAX:  RBX: 880215103000 RCX: dead00200200
[ 3342.842247] RDX: dead00100100 RSI: 880215103038 RDI: 0006
[ 3342.842270] RBP: 880215726900 R08: 0606 R09: 88021fd11280
[ 3342.842293] R10: 004f9885 R11: 88021fd11280 R12: 880215726900
[ 3342.842315] R13: 880215103038 R14:  R15: 
[ 3342.842337] FS:  () GS:88021fd0() 
knlGS:
[ 3342.842361] CS:  0010 DS:  ES:  CR0: 8005003b
[ 3342.842380] CR2: 00b0d000 CR3: 0001cf402000 CR4: 07e0
[ 3342.842402] DR0:  DR1:  DR2: 
[ 3342.842425] DR3:  DR6: 0ff0 DR7: 0400
[ 3342.842448] Process nfsv4.0-svc (pid: 4108, threadinfo 8801a680a000, 
task 8802027cf080)
[ 3342.842476] Stack:
[ 3342.842488]  dead00200200 88020970a000 880215103000 
880215726900
[ 3342.842527]  880215726900 a007c960 8801a680bfd8 
00011280
[ 3342.842568]  8801a680bfd8 88020970a000 88020970a000 
8801a680bf08
[ 3342.842608] Call Trace:
[ 3342.842644]  [] ? svc_recv+0xcc/0x338 [sunrpc]
[ 3342.842678]  [] ? nfs_callback_authenticate+0x20/0x20 
[nfsv4]
[ 3342.842712]  [] ? nfs4_callback_svc+0x1d/0x3c [nfsv4]
[ 3342.842739]  [] ? kthread+0x81/0x89
[ 3342.842763]  [] ? posix_cpu_nsleep_restart+0x11/0x89
[ 3342.842787]  [] ? kthread_freezable_should_stop+0x31/0x31
[ 3342.842813]  [] ? ret_from_fork+0x7c/0xb0
[ 3342.842835]  [] ? kthread_freezable_should_stop+0x31/0x31
[ 3342.842856] Code: c2 84 d2 74 02 eb a0 c3 41 55 4c 8d 6f 38 41 54 4c 89 ee 
55 53 48 89 fb 51 48
8b 6f 40 bf 06 00 00 00 e8 62 fa ff ff 85 c0 74 02 <0f> 0b 48 8b 43 08 4c 8d 65 
10 48 89 df ff 50 38
4c 89 e7 e8 2a
[ 3342.843192] RIP  [] svc_delete_xprt+0x23/0xf3 [sunrpc]
[ 3342.843235]  RSP 
[ 3342.858471] ---[ end trace f0dbe9f9cd4029c3 ]---


Re: BUG at net/sunrpc/svc_xprt.c:921 (another one)

2013-02-13 Thread Mark Lord
On 13-02-12 03:52 PM, J. Bruce Fields wrote:
> On Sun, Jan 20, 2013 at 05:51:12PM -0500, Mark Lord wrote:
>> Got it again, this time on a different system
>> running mostly the same software.
> 
> Mark, Paweł, Tom, could any of you confirm whether this helps?
..

No, I cannot confirm one way or the other,
because I haven't noticed it again since the most recent
couple of occurrences I posted earlier here.

Cheers


BUG at net/sunrpc/svc_xprt.c:921

2013-01-14 Thread Mark Lord
Since upgrading to 3.7, and now 3.7.2, my AMD-450E based server
is getting these BUG complaints.  The .config file is gzip'd/attached.

[ cut here ]
kernel BUG at net/sunrpc/svc_xprt.c:921!
invalid opcode:  [#1] SMP
Modules linked in: nfsv4 xt_state xt_tcpudp xt_recent xt_LOG xt_limit 
iptable_mangle iptable_nat
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter 
ip_tables x_tables
sc520_wdt btusb snd_usb_audio snd_usbmidi_lib hid_generic ftdi_sio usbserial 
usbhid hid
snd_hda_codec_realtek psmouse snd_hda_codec_hdmi r8169 xhci_hcd mii 
snd_hda_intel snd_hda_codec
snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_midi snd_rawmidi 
snd_seq_midi_event snd_seq bnep
snd_timer rfcomm snd_seq_device bluetooth snd nfsd auth_rpcgss binfmt_misc 
radeon nfs lockd sunrpc
soundcore ttm snd_page_alloc drm_kms_helper drm i2c_algo_bit it87 hwmon_vid 
k10temp hwmon microcode
CPU 0
Pid: 29613, comm: nfsv4.0-svc Not tainted 3.7.2 #1 System manufacturer System 
Product Name/E45M1-I
DELUXE
RIP: 0010:[]  [] svc_delete_xprt+0x23/0xeb 
[sunrpc]
RSP: 0018:880234f05e38  EFLAGS: 00010286
RAX:  RBX: 8801b931b000 RCX: dead00200200
RDX: dead00100100 RSI: 8801b931b038 RDI: 0006
RBP: 880049125e40 R08: 0606 R09: 88023ec10fc0
R10: 88023ec10fc0 R11: 88023ec10fc0 R12: 880049125e40
R13: 8801b931b038 R14:  R15: 
FS:  7f5bef2fd700() GS:88023ec0() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 7f2f1800bfa0 CR3: 00015ba2e000 CR4: 07f0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400
Process nfsv4.0-svc (pid: 29613, threadinfo 880234f04000, task 
88021b51a280)
Stack:
 1cc7 8801a5f2e000 8801b931b000 880049125e40
 880049125e40 a016a56a 00010fc0 880234f05fd8
 880234f05fd8 8801a5f2e000 8801a5f2e000 880234f05f08
Call Trace:
 [] ? svc_recv+0xcc/0x338 [sunrpc]
 [] ? nfs_callback_authenticate+0x20/0x20 [nfsv4]
 [] ? nfs4_callback_svc+0x1d/0x3c [nfsv4]
 [] ? kthread+0x81/0x89
 [] ? kthread_freezable_should_stop+0x36/0x36
 [] ? ret_from_fork+0x7c/0xb0
 [] ? kthread_freezable_should_stop+0x36/0x36
Code: c2 84 d2 74 02 eb a0 c3 41 55 4c 8d 6f 38 41 54 4c 89 ee 55 53 48 89 fb 
50 48 8b 6f 40 bf 06
00 00 00 e8 77 fa ff ff 85 c0 74 02 <0f> 0b 48 8b 43 08 4c 8d 65 10 48 89 df ff 
50 38 4c 89 e7 e8 6d
RIP  [] svc_delete_xprt+0x23/0xeb [sunrpc]
 RSP 
---[ end trace 916f6471c0b47e1d ]---


Here's the code with the BUG() at net/sunrpc/svc_xprt.c line 921:

/*
 * Remove a dead transport
 */
static void svc_delete_xprt(struct svc_xprt *xprt)
{
struct svc_serv *serv = xprt->xpt_server;
struct svc_deferred_req *dr;

/* Only do this once */
if (test_and_set_bit(XPT_DEAD, &xprt->xpt_flags))
BUG();
...
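
A userspace sketch of the "only do this once" guard shown above, using C11 atomics in place of the kernel's test_and_set_bit(). The helper names are made up, and the bit number is chosen only for illustration. The point is that the second caller sees the bit already set, which is the condition that triggers BUG() in the kernel code.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

#define XPT_DEAD 6	/* bit number, chosen for illustration */

/*
 * Atomically set `bit` in *flags and report whether it was already set.
 * Userspace approximation of the kernel's test_and_set_bit().
 */
static bool test_and_set_bit_sketch(unsigned int bit, atomic_ulong *flags)
{
	unsigned long mask;

	assert(bit < sizeof(unsigned long) * CHAR_BIT);
	mask = 1UL << bit;
	return (atomic_fetch_or(flags, mask) & mask) != 0;
}

/*
 * Returns true if this caller won the right to tear the transport down,
 * false if another caller had already marked it dead.
 */
static bool delete_once(atomic_ulong *flags)
{
	if (test_and_set_bit_sketch(XPT_DEAD, flags))
		return false;	/* the kernel code calls BUG() at this point */
	return true;
}
```

The first delete_once() call on a given flags word returns true and every later call returns false, so hitting the BUG() means svc_delete_xprt() ran twice for the same transport.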






config.txt.gz
Description: GNU Zip compressed data


Re: BUG at net/sunrpc/svc_xprt.c:921

2013-01-14 Thread Mark Lord
On 13-01-14 03:37 PM, J. Bruce Fields wrote:
> Thanks for the report.
> 
> On Mon, Jan 14, 2013 at 11:17:09AM -0500, Mark Lord wrote:
>> Since upgrading to 3.7, and now 3.7.2, my AMD-450E based server
> 
> It's acting as an NFS client, right?

Client and server, with other Linux boxes all running 3.something kernels.

> What did you upgrade from?

3.4.something, I believe.

>> is getting these BUG complaints.  The .config file is gzip'd/attached.
> 
> Is this easy to reproduce?

So far, it seems to pop up within a day or so of any reboot.
I normally only reboot that system for a kernel upgrade,
but can do so a bit more often if there's useful info to collect.

Cheers


Re: BUG at net/sunrpc/svc_xprt.c:921

2013-01-16 Thread Mark Lord
On 13-01-16 12:20 AM, Stanislav Kinsbursky wrote:
> 
> Mark, could you provide any call traces?

Call traces from where/what?
There's this one, posted earlier in the BUG report:

kernel BUG at net/sunrpc/svc_xprt.c:921!
Call Trace:
 [] ? svc_recv+0xcc/0x338 [sunrpc]
 [] ? nfs_callback_authenticate+0x20/0x20 [nfsv4]
 [] ? nfs4_callback_svc+0x1d/0x3c [nfsv4]
 [] ? kthread+0x81/0x89
 [] ? kthread_freezable_should_stop+0x36/0x36
 [] ? ret_from_fork+0x7c/0xb0
 [] ? kthread_freezable_should_stop+0x36/0x36



Re: BUG at net/sunrpc/svc_xprt.c:921

2013-01-16 Thread Mark Lord
On 13-01-16 05:51 PM, Mark Lord wrote:
> On 13-01-16 12:20 AM, Stanislav Kinsbursky wrote:
>>
>> Mark, could you provide any call traces?
> 
> Call traces from where/what?
> There's this one, posted earlier in the BUG report:
> 
> kernel BUG at net/sunrpc/svc_xprt.c:921!
> Call Trace:
>  [] ? svc_recv+0xcc/0x338 [sunrpc]
>  [] ? nfs_callback_authenticate+0x20/0x20 [nfsv4]
>  [] ? nfs4_callback_svc+0x1d/0x3c [nfsv4]
>  [] ? kthread+0x81/0x89
>  [] ? kthread_freezable_should_stop+0x36/0x36
>  [] ? ret_from_fork+0x7c/0xb0
>  [] ? kthread_freezable_should_stop+0x36/0x36
..

This might be of some interest.
Here are the first few lines of the same BUG occurrence,
with timestamps and the dmesg lines that immediately preceded it.
Perhaps they might help indicate who's triggering the action
that results in the BUG(?).

Jan 14 10:58:05 zippy kernel: [66045.627952] NFS: Registering the id_resolver key type
Jan 14 10:58:05 zippy kernel: [66045.628014] Key type id_resolver registered
Jan 14 10:58:05 zippy kernel: [66045.628020] Key type id_legacy registered
Jan 14 10:58:05 zippy kernel: [66045.636302] ------------[ cut here ]------------
Jan 14 10:58:05 zippy kernel: [66045.648342] kernel BUG at net/sunrpc/svc_xprt.c:921!



Re: BUG at net/sunrpc/svc_xprt.c:921

2013-01-17 Thread Mark Lord
On 13-01-14 11:17 AM, Mark Lord wrote:
>
> Here's the code with the BUG() at net/sunrpc/svc_xprt.c line 921:
> 
> /*
>  * Remove a dead transport
>  */
> static void svc_delete_xprt(struct svc_xprt *xprt)
> {
> struct svc_serv *serv = xprt->xpt_server;
> struct svc_deferred_req *dr;
> 
> /* Only do this once */
> if (test_and_set_bit(XPT_DEAD, &xprt->xpt_flags))
> BUG();


Shouldn't there also be a return statement after the BUG() line,
inside the if-stmt ?

I mean, the comment says "only do this once", but it actually
appears to end up doing it twice, despite the test.

??
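
For anyone following along, the "only do this once" guard can be sketched in
userspace C11. This is only an illustration of the test-and-set idiom, not the
sunrpc code: the struct, the flag bit and the function name are all made up
here. The second caller is reported with a return value instead of BUG().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_DEAD_BIT 0x1u  /* hypothetical flag bit, standing in for XPT_DEAD */

struct demo_xprt {
    atomic_uint flags;      /* stands in for xprt->xpt_flags */
    int teardowns;          /* counts how often the teardown really ran */
};

/*
 * Returns true if this caller performed the teardown, false if the
 * object had already been marked dead by an earlier caller.
 */
static bool demo_delete_xprt(struct demo_xprt *x)
{
    assert(x != NULL);

    /* Atomically set the bit and fetch the flags as they were before. */
    unsigned int old = atomic_fetch_or(&x->flags, DEMO_DEAD_BIT);
    if (old & DEMO_DEAD_BIT)
        return false;       /* second caller: the kernel code hits BUG() here */

    x->teardowns++;         /* the one-time work */
    return true;
}
```

Calling demo_delete_xprt() twice on the same object runs the teardown once;
the second call only observes that the bit was already set.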


Re: BUG at net/sunrpc/svc_xprt.c:921

2013-01-17 Thread Mark Lord
On 13-01-17 08:53 AM, J. Bruce Fields wrote:
> On Thu, Jan 17, 2013 at 08:11:52AM -0500, Mark Lord wrote:
>> On 13-01-14 11:17 AM, Mark Lord wrote:
>>>
>>> Here's the code with the BUG() at net/sunrpc/svc_xprt.c line 921:
>>>
>>> /*
>>>  * Remove a dead transport
>>>  */
>>> static void svc_delete_xprt(struct svc_xprt *xprt)
>>> {
>>> struct svc_serv *serv = xprt->xpt_server;
>>> struct svc_deferred_req *dr;
>>>
>>> /* Only do this once */
>>> if (test_and_set_bit(XPT_DEAD, &xprt->xpt_flags))
>>> BUG();
>>
>>
>> Shouldn't there also be a return statement after the BUG() line,
>> inside the if-stmt ?
> 
> BUG() kills the thread that calls it

Oh, does it?  Well, taken care of then, I guess.
With a sledgehammer.

:)


Re: BUG at net/sunrpc/svc_xprt.c:921

2013-01-17 Thread Mark Lord
On 13-01-17 08:24 AM, Stanislav Kinsbursky wrote:
..
> This looks like the old issue I was trying to fix with "SUNRPC: protect
> service sockets lists during per-net shutdown".
> So, here is the problem as I see it: there is a transport which is being
> processed by a service thread, and its processing is racing with per-net
> service shutdown:
> 
> CPU#0:                                CPU#1:
> 
> svc_recv                              svc_close_net
> svc_get_next_xprt (list_del_init(xpt_ready))
>                                       svc_close_list (set XPT_BUSY and XPT_CLOSE)
>                                       svc_clear_pools (xprt was gained on CPU#0 already)
>                                       svc_delete_xprt (set XPT_DEAD)
> svc_handle_xprt (is XPT_CLOSE => svc_delete_xprt())
> BUG()
> 
> So, from my POV, we need some way to:
> 1) Skip such in-progress transports on the svc_close_net() call (there is no
> way to detect them, or at least I don't see one)
> 2) Delete the transport somewhere after svc_xprt_received()
> 
> But there is a problem with svc_xprt_received(): there is a call to
> svc_xprt_put() in it
> (svc_recv->svc_handle_xprt->svc_xprt_received->svc_xprt_put). And if we are
> the only user, then the transport will be destroyed. But the transport is
> dereferenced later in svc_recv(), after the svc_handle_xprt call.

Sounds like a reference count type of problem/solution (kref) (?)
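
Roughly what I mean, as a userspace sketch rather than a patch: the thread
that keeps using the object takes its own reference first, so a put done
somewhere deeper in the call chain cannot be the last one. All names below are
hypothetical; in the kernel one would presumably reach for struct kref with
kref_get()/kref_put() instead.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_obj {
    atomic_int refs;    /* number of outstanding references */
    bool released;      /* set on the final put; a real object would be freed */
};

static void demo_init(struct demo_obj *o)
{
    assert(o != NULL);
    atomic_init(&o->refs, 1);   /* the creator holds the first reference */
    o->released = false;
}

static void demo_get(struct demo_obj *o)
{
    assert(o != NULL);
    atomic_fetch_add(&o->refs, 1);
}

/* Drop one reference; returns true if this call released the object. */
static bool demo_put(struct demo_obj *o)
{
    assert(o != NULL);
    if (atomic_fetch_sub(&o->refs, 1) == 1) {
        o->released = true;     /* that was the last reference */
        return true;
    }
    return false;
}
```

With an extra demo_get() taken before the handler runs, the handler's
demo_put() leaves the object alive, and only the caller's own demo_put() at
the very end releases it.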



Re: BUG at net/sunrpc/svc_xprt.c:921

2013-01-18 Thread Mark Lord
On 13-01-18 12:37 AM, Stanislav Kinsbursky wrote:
>
> You have more than one NFS mount in different network namespaces, haven't you?
>

No, I don't (knowingly) use (multiple) namespaces at all.
Usually I disable them in the kernel .config,
though it appears the currently running kernel has this:

CONFIG_NAMESPACES=y
# CONFIG_UTS_NS is not set
# CONFIG_IPC_NS is not set
# CONFIG_PID_NS is not set
# CONFIG_NET_NS is not set

The full .config was attached to the first post in this thread.

Cheers



Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.

2008-02-22 Thread Mark Lord

Rafael J. Wysocki wrote:


No.  Again, if there are devices that wake us up from S4, but not from S5,
they need to be handled differently in the *enter S4* case (hibernation) and
in the *enter S5* case (powering off the system).

..

Something I've never understood, is why we would ever want to bother with *S4* at all?

I actually like hibernation (great for travelling), but I treat it as if
it were a complete power-off (S5?).  I pull batteries, unplug drives,
boot other operating systems, etc..

And when I put it all back together again with the Linux disk inserted,
I fully expect it to "resume" from the hibernation of 3 months ago.
And it does.

Why would I ever want anything less than a full poweroff for hibernation?

Thanks.


Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.

2008-02-22 Thread Mark Lord

[EMAIL PROTECTED] wrote:
..
I've been watching for kexec hibernate for a little while now, and the 
last I saw was that acpi was incompatible with the kexec hibernate (but 
the suspend folks were still claiming that devices needed to be put in 
the 'right mode' not just powered off. I've been waiting to see this 
resolved.

..

Yeah, exactly.  What's so special about poweroff on hibernation?
Why even bother with the special "S4" state there?
I want a real full poweroff, or at least I think I do.  Why wouldn't I?
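
As far as I understand it, the choice between the two is exposed through
/sys/power/disk, where the selected hibernation mode is shown in square
brackets ("platform" for the ACPI S4 path, "shutdown" for a plain power-off).
A small sketch of picking the selected mode out of such a line follows; the
function name is made up, and the sample strings in the usage note are
assumptions rather than output captured from a real machine.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the bracketed (currently selected) mode out of a line such as
 * "platform [shutdown] reboot" into out.  Returns 0 on success, or -1 if
 * there is no well-formed bracketed entry that fits in outlen bytes.
 */
static int selected_mode(const char *line, char *out, size_t outlen)
{
    assert(line != NULL && out != NULL);

    const char *start = strchr(line, '[');
    if (start == NULL)
        return -1;
    const char *end = strchr(start + 1, ']');
    if (end == NULL)
        return -1;

    size_t len = (size_t)(end - start - 1);
    if (len + 1 > outlen)
        return -1;              /* no room for the mode plus terminator */

    memcpy(out, start + 1, len);
    out[len] = '\0';
    return 0;
}
```

Fed "platform [shutdown] reboot", this yields "shutdown", which would be the
full power-off behaviour asked about above.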




Re: 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.

2008-02-22 Thread Mark Lord

Rafael J. Wysocki wrote:

On Friday, 22 of February 2008, Mark Lord wrote:

[EMAIL PROTECTED] wrote:
..
I've been watching for kexec hibernate for a little while now, and the 
last I saw was that acpi was incompatible with the kexec hibernate (but 
the suspend folks were still claiming that devices needed to be put in 
the 'right mode' not just powered off. I've been waiting to see this 
resolved.

..

Yeah, exactly.  What's so special about poweroff on hibernation?
Why even bother with the special "S4" state there?


(1) To be able to wake up with the help of devices that can't wake
the system up from S5 (power off)
(2) To handle some platform devices appropriately over the cycle

..

That's the theory.  I've read about it, but have yet to imagine
any real-life situation where it applies.

But this isn't my speciality, so.. do you have experience with any real examples?

Thanks!


I want a real full poweroff, or at least I think I do.  Why wouldn't I?




You may want that, some people may not want it.

We are supposed to handle S4, the BIOS/platform may expect us to do that, so
IMO this is a good enough reason to do it.  Especially that we can.

Thanks,
Rafael




Re: 2.6.25-rc2 + smartd = hang

2008-02-22 Thread Mark Lord

Anders Eriksson wrote:


[EMAIL PROTECTED] said:

The sysrq-e output is probably just standard ext3 journalling unrelated  to
the problem...  what does dmesg say?  lspci?  What's your hardware setup? 



dmesg ; smartd ; dmesg yields no new entries in dmesg. It seems disk
accesses are dead. It still routes packets fine.


This is an old PII-300 with 2 IDE disks and a DVD R/W.

...

Feb 22 17:38:49 tippex Uniform Multi-Platform E-IDE driver
Feb 22 17:38:49 tippex ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
Feb 22 17:38:49 tippex PIIX4: IDE controller (0x8086:0x7111 rev 0x01) at  PCI slot :00:07.1
Feb 22 17:38:49 tippex PIIX4: not 100% native mode: will probe irqs later
Feb 22 17:38:49 tippex ide0: BM-DMA at 0xffa0-0xffa7, BIOS settings: hda:DMA, hdb:PIO
Feb 22 17:38:49 tippex ide1: BM-DMA at 0xffa8-0xffaf, BIOS settings: hdc:PIO, hdd:PIO
Feb 22 17:38:49 tippex Probing IDE interface ide0...
Feb 22 17:38:49 tippex hdb: IC35L120AVV207-0, ATA DISK drive
Feb 22 17:38:49 tippex hda: IBM-DTTA-371010, ATA DISK drive
Feb 22 17:38:49 tippex hda: host max PIO4 wanted PIO255(auto-tune) selected PIO4
Feb 22 17:38:49 tippex hda: UDMA/33 mode selected
Feb 22 17:38:49 tippex hdb: host max PIO4 wanted PIO255(auto-tune) selected PIO4
Feb 22 17:38:49 tippex hdb: UDMA/33 mode selected
Feb 22 17:38:49 tippex Probing IDE interface ide1...
Feb 22 17:38:49 tippex hdd: Maxtor 6L250R0, ATA DISK drive
Feb 22 17:38:49 tippex hdc: AOPEN DUW1608/ARR, ATAPI CD/DVD-ROM drive
Feb 22 17:38:49 tippex hdc: host max PIO4 wanted PIO255(auto-tune) selected PIO4
Feb 22 17:38:49 tippex hdc: UDMA/33 mode selected
Feb 22 17:38:49 tippex hdd: host max PIO4 wanted PIO255(auto-tune) selected PIO4
Feb 22 17:38:49 tippex hdd: UDMA/33 mode selected
Feb 22 17:38:49 tippex ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Feb 22 17:38:49 tippex ide1 at 0x170-0x177,0x376 on irq 15

..

So that's using the old IDE drivers.
And the network and USB are sharing IRQ#11 with each other.

If you are going to be using newer kernels like this (2.6.23+),
then you might consider shifting those drives over to libata drivers.

This involves a little bit of work -- building a kernel with libata
and "ata_piix" built-in instead of the old IDE drivers,
and then rearranging /etc/fstab to match the new device names
(eg. /dev/sda instead of /dev/hda).

But at this point libata is working much better than the old IDE stuff,
and it really is worth moving things over if you can.

Cheers


Re: ata_ram driver

2008-02-22 Thread Mark Lord

Matthew Wilcox wrote:

I've ported the scsi_ram driver [1] to libata.  It could use a lot more
work -- there's a lot of stuff in the identify page that I haven't
filled in, and there's a lot of commands it doesn't even try to execute.

For example, when you unload the driver, you get the mildly disturbing
messages:

sd 12:0:0:0: [sdb] Stopping disk
sd 12:0:0:0: [sdb] START_STOP FAILED
sd 12:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK


..

I see messages like those with *established* libata drivers from time to time.
It could just be a bug in the shutdown sequence, somewhere between libata,
SCSI, block layer, and the device model in general.  Or not.

Cheers


Re: [PATCH] ide: remove stale comments from ide-dma.c

2008-02-22 Thread Mark Lord

Bartlomiej Zolnierkiewicz wrote:

- ide-dma.c is not a separate module

- ide-dma.c is not PCI specific anymore

- DMA is enabled by default nowadays

- link for Intel Zappa BIOS is dead

etc.

Signed-off-by: Bartlomiej Zolnierkiewicz <[EMAIL PROTECTED]>
---
 drivers/ide/ide-dma.c |   48 
 1 file changed, 48 deletions(-)

Index: b/drivers/ide/ide-dma.c
===
--- a/drivers/ide/ide-dma.c
+++ b/drivers/ide/ide-dma.c
@@ -11,49 +11,6 @@
  */
 
 /*

- * This module provides support for the bus-master IDE DMA functions
- * of various PCI chipsets, including the Intel PIIX (i82371FB for
- * the 430 FX chipset), the PIIX3 (i82371SB for the 430 HX/VX and 
- * 440 chipsets), and the PIIX4 (i82371AB for the 430 TX chipset)

- * ("PIIX" stands for "PCI ISA IDE Xcellerator").
- *
- * Pretty much the same code works for other IDE PCI bus-mastering chipsets.
- *
- * DMA is supported for all IDE devices (disk drives, cdroms, tapes, floppies).

..

Those top comments still look relevant, or at least as relevant
as the rest of the file (and subsystem) itself.  :)

Sigh.


Re: [ugly patch] Save .15W-.5W by AHCI powersaving

2008-02-25 Thread Mark Lord

Pavel Machek wrote:

Hi!

This is a patch (very ugly, assumes you have just one disk) to bring
powersaving to AHCI. You need Alan's SCSI autosuspend (attached) patch
as a base.

It saves .5W compared to config with disk spinning, and even .15W
compared to hdparm -y... on my thinkpad x60 anyway.

..

There was a discussion of this here today.
It makes good use of AHCI-specific features.

Has it been tested with a Port-Multiplier yet?

This is cool enough that we really ought to do a hardware-independent
version, so that all SATA interfaces could benefit.  Especially ata_piix,
but others too.

Cheers


Re: Sata-MV, Intergated Sata Device Support

2008-02-25 Thread Mark Lord

Jeff Garzik wrote:

Jon Li wrote:

Hello,

I am curious as to whether there are plans to add support for integrated
sata devices.  I personally want to add support for a 60x1C0 based
device (pci:id = 0x5182).  I think adding support should be relatively
simple, except for a few issues outlined below.

In the original mvSata.c (ver3.4) that has 0x5182 support, the config
space is as such:

case MV_SATA_DEVICE_ID_5182:
pAdapter->numberOfChannels = MV_SATA_5182_PORT_NUM;
pAdapter->numberOfUnits = 1;
pAdapter->portsPerUnit = 2;
pAdapter->sataAdapterGeneration = MV_SATA_GEN_IIE;
/*The integrated sata core chip based on 60x1 C0*/
pAdapter->chipIs60X1C0 = MV_TRUE;
pAdapter->hostInterface = MV_HOST_IF_INTEGRATED;
pAdapter->mainMaskOffset = 0x20024; /*the iobaseaddress is
0x6*/
pAdapter->mainCauseOffset = 0x20020;
break;

I have not yet figured out how all these values are defined in sata-mv.c
(ver 0.8).  Specifically, where do I define "numberOfChannels" which
should equal 2, and "numberOfUnits" which obviously equals 1?

I have a current config space (not completed) for sata-mv.c which is:

{  /* chip_5182 */
    .sht       = &mv_sht,
    .flags     = (MV_COMMON_FLAGS | MV_6XXX_FLAGS |
                  MV_FLAG_DUAL_HC),
    .pio_mask  = 0x1f,  /* pio0-4 */
    .udma_mask = 0x7f,  /* udma0-6 */
    .port_ops  = &mv6_ops,
},

...

Saeed:  isn't this what your SOC patches already implemented for us?
As near as I can tell, sata_mv now already has support for the 60x1C0.

-ml


Re: [ugly patch] Save .15W-.5W by AHCI powersaving

2008-02-26 Thread Mark Lord

Pavel Machek wrote:

Hi!


This is a patch (very ugly, assumes you have just one disk) to bring
powersaving to AHCI. You need Alan's SCSI autosuspend (attached) patch
as a base.

It saves .5W compared to config with disk spinning, and even .15W
compared to hdparm -y... on my thinkpad x60 anyway.

..

There was a discussion of this here today.


Real-life discussion, or something I could read? :-).


It makes good use of AHCI-specific features.

Has it been tested with a Port-Multiplier yet?


I do not know what port-multiplier is, sorry. But it was not really
tested. It is not expected to work on any other config than notebook
very similar to mine.


This is cool enough that we really ought to do a hardware-independent
version, so that all SATA interfaces could benefit.  Especially ata_piix,
but others too.


Well, it seems like it is 10 lines per driver once Alan's SCSI
autosuspend patches are in...

..

Cool (literally)!

I think I might have gotten your patch confused in my mind
with another AHCI patch, which uses features of the chip itself
to automatically negotiate/change link power status on the fly
(no s/w needed, other than to turn it on).

That one is very AHCI specific, though.




Re: Sata-MV, Intergated Sata Device Support

2008-02-26 Thread Mark Lord

saeed wrote:


On Mon, 25 Feb 2008, Jeff Garzik wrote:


...

Saeed:  isn't this what your SOC patches already implemented for us?
As near as I can tell, sata_mv now already has support for the 60x1C0.

Saeed's stuff didn't support PCI though, and Jon Li is definitely talking
about PCI...
yes, my patch added support for the SoC sata like in the 5182, and this 
is what Jon Li was concerned about. he mentioned the 60x1C0 pci device 
just to suggest to use its code for the SoC sata as it is very similar.

..

I don't think I understand your English there.

Does the current sata_mv driver work as-is with the chipset this person wants?
If not, then exactly what has to change to make it work?

Thanks


Re: Problems with Promise IDE controller under 2.4.1

2001-01-31 Thread Mark Lord

Simple solution is to have kernel fall-back to LBA style
translations instead of kernel "basic" translations.
This would make it match the first two "BIOS" drives
on most systems, and not really hurt anything in most cases.

Even better would be to add a stage in front of the fall-back,
which queries the BIOS (from kernel startup code) for translation
info on ALL drives.
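
To make "LBA style translation" concrete, here is a sketch of the usual
LBA-assist rule as I remember it: 63 sectors per track, a head count picked
from the drive size, and the cylinder count capped at 1024. It is not lifted
from the kernel or from any BIOS; the struct and function names are invented,
and the thresholds should be checked against a real BIOS before relying on
them.

```c
#include <assert.h>
#include <stdint.h>

struct chs_geom {
    uint32_t cylinders;
    uint32_t heads;
    uint32_t sectors;
};

/* Derive a translated geometry from the total number of 512-byte sectors. */
static struct chs_geom lba_assist_geometry(uint64_t total_sectors)
{
    struct chs_geom g;

    assert(total_sectors > 0);
    g.sectors = 63;

    /* Pick the smallest head count that keeps cylinders within 1024. */
    if (total_sectors <= 1024ULL * 16 * 63)
        g.heads = 16;
    else if (total_sectors <= 1024ULL * 32 * 63)
        g.heads = 32;
    else if (total_sectors <= 1024ULL * 64 * 63)
        g.heads = 64;
    else if (total_sectors <= 1024ULL * 128 * 63)
        g.heads = 128;
    else
        g.heads = 255;

    uint64_t cyl = total_sectors / ((uint64_t)g.heads * g.sectors);
    if (cyl > 1024)
        cyl = 1024;             /* anything larger cannot be expressed in CHS */
    g.cylinders = (uint32_t)cyl;
    return g;
}
```

Two identical drives fed through the same rule always come out with the same
geometry, regardless of which controller they happen to sit on.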
-- 
Mark Lord
Real-Time Remedies Inc.
[EMAIL PROTECTED]


Rupa Schomaker wrote:
> 
> Andre Hedrick <[EMAIL PROTECTED]> writes:
> 
> > > But there is no indication of what the problems could be,
> > > or what he thinks the geometry should be (and why).
> > > I see nothing very wrong in the posted data.
> >
> > We agree Andries, but the enduser wants to see stuff the same.
> 
> In my case, I have two identical Maxtor drives, but they reported
> different geometry.  How could that be?  Move the "virgin" drive to
> the motherboard IDE controller and suddenly the geometry is the same.
> Use fdisk and partition the disk, write it, and then move to the
> promise controller and the "correct" geometry was used (that is, it is
> now the same as when hooked up to the motherboard ide controller).
> 
> Why was it important to me?  I'm doing RAID1 and it is really nice to
> have the same geometry so that the partition info is the same between
> the two drives.   Makes life easier.
> 
> --
> -rupa



Re: security issue: hard disk lock

2005-04-13 Thread Mark Lord
hdparm-6.0 is currently winding through release channels,
and includes support for freezing/managing the security status.
Cheers


Re: ALPS psmouse_reset on reconnect confusing Tecra M2

2005-07-05 Thread Mark Lord

Ahh.. that might explain some weirdness observed here, as well.

Thanks!

Dmitry Torokhov wrote:


Please try the following patch:

http://www.ucw.cz/~vojtech/input/alps-suspend-typo



Re: [git patches] IDE update

2005-07-07 Thread Mark Lord

Note:

hdparm can also use O_DIRECT for the -t timing test.

Eg.  hdparm --direct -t /dev/hda


Re: [git patches] ide update

2005-08-18 Thread Mark Lord

Linus Torvalds wrote:


Btw, things like this:

+#define IDEFLOPPY_TICKS_DELAY  HZ/20   /* default delay for ZIP 100 (50ms) */

are just bugs waiting to happen.


Needs parentheses: ((HZ)/20)

Or one could just use the msecs_to_jiffies() macro.
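
To show why the unparenthesized form bites, here is a tiny standalone example.
DEMO_HZ and the two macro names are invented for the demonstration; the only
point is how the textual expansion interacts with a surrounding division.

```c
#include <assert.h>

#define DEMO_HZ 1000                    /* hypothetical tick rate */

#define DELAY_UNSAFE  DEMO_HZ/20        /* expands textually, no grouping */
#define DELAY_SAFE    (DEMO_HZ / 20)    /* always behaves as one value */

/* How many delay periods fit into `ticks`, using the unsafe macro. */
static int periods_unsafe(int ticks)
{
    assert(ticks >= 0);
    return ticks / DELAY_UNSAFE;        /* parses as (ticks / DEMO_HZ) / 20 */
}

/* The same calculation with the parenthesized macro. */
static int periods_safe(int ticks)
{
    assert(ticks >= 0);
    return ticks / DELAY_SAFE;          /* parses as ticks / (DEMO_HZ / 20) */
}
```

For ticks = 1000 the unsafe version computes (1000 / 1000) / 20, which is 0,
while the parenthesized one computes 1000 / 50, which is 20.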

Cheers


Re: [PATCH] libata: add ATAPI module option

2005-08-30 Thread Mark Lord

Jeff Garzik wrote:


-#ifndef ATA_ENABLE_ATAPI
-   if (unlikely(dev->class == ATA_DEV_ATAPI))
-   return NULL;
-#endif
+   if (atapi_enabled) {
+   if (unlikely(dev->class == ATA_DEV_ATAPI))
+   return NULL;
+   }

..

Is that if-stmt the right way around?
At first glance, I'd expect it to read:

 if (!atapi_enabled) {
 ...
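
Spelled out as a standalone sketch (types and names invented, this is not the
libata code): the old "#ifndef ATA_ENABLE_ATAPI" block rejected ATAPI devices
only when ATAPI support was compiled out, so I would expect the runtime
equivalent to reject them only when the option is off.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum demo_class { DEMO_DEV_ATA, DEMO_DEV_ATAPI };

struct demo_dev {
    enum demo_class dev_class;
};

/*
 * Return the device if it may be used, or NULL if it must be skipped.
 * ATAPI devices are skipped only while ATAPI support is switched off,
 * which mirrors what the compile-time #ifndef used to do.
 */
static struct demo_dev *demo_find_dev(struct demo_dev *dev, bool atapi_enabled)
{
    assert(dev != NULL);

    if (!atapi_enabled) {
        if (dev->dev_class == DEMO_DEV_ATAPI)
            return NULL;
    }
    return dev;
}
```

With the test written the other way around, enabling the option would be
exactly what makes ATAPI devices disappear.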

Cheers!


Re: APs from the Kernel Summit run Linux

2005-08-31 Thread Mark Lord

Mmm.. curious sequence in the first 512 bytes of
the DWL-G730AP firmware binary.  It has this
sequence of bytes repeated several times:

  81 40 20 10 08 04 02 81 40 20 10 08 04 02 ...

That should be recognizable to somebody, I think.

I'll try loading the works into another ARM
system I have here, and see (1) if it runs as-is,
and (2) what the disassembly shows.

I'd certainly like to get source for my 730AP here,
as it seems to be a bit buggy on the WEP implementation.

Cheers
--
Mark Lord
Real-Time Remedies Inc.
[EMAIL PROTECTED]



Re: APs from the Kernel Summit run Linux

2005-08-31 Thread Mark Lord

>Each of the first three large parts starts with this sequence of bytes

Actually, the byte structure of the first 0x100 bytes
of each section seems to be very similar.

Some kind of header.


Re: [SOLVED] USB Storage speed regression since 2.6.12

2005-09-01 Thread Mark Lord

DervishD wrote:
..

the new implementation seems to rewrite the fat on every single write
(that's the reason of the slowdown, probably), and since I'm not sure
about the quality of the flash memory present in the device, it is
very probable that it would wear the first sectors :( So I have to
mount it 'async' under 2.6.13; I didn't have to do that on older


Nearly all flashcard devices (CompactFlash, SD, MMC, ..)
have built-in wear-leveling in the on-card controller logic.
So continuously rewriting the FAT will NOT rewrite the same
on-card physical pages over and over, but rather it will
try to spread those writes out over the entire (available)
span of physical sectors on the device.

So no worries about "wearing out the FAT sectors",
but I'd still use "async" just to reduce the overall
wear and tear regardless.
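
A toy model of the idea, purely for illustration: real controllers use their
own (undocumented) schemes, and the page count and names here are made up.
Each rewrite of the one logical "FAT" sector is steered to whichever physical
page has been written the least so far.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PAGES 8    /* hypothetical number of physical pages */

struct demo_flash {
    unsigned wear[DEMO_PAGES];  /* write count per physical page */
    size_t fat_page;            /* physical page currently holding the "FAT" */
};

/* Rewrite the logical FAT sector: place it on the least-worn physical page. */
static void demo_rewrite_fat(struct demo_flash *f)
{
    assert(f != NULL);

    size_t best = 0;
    for (size_t i = 1; i < DEMO_PAGES; i++)
        if (f->wear[i] < f->wear[best])
            best = i;

    f->wear[best]++;
    f->fat_page = best;
}

/* Highest write count seen on any single physical page. */
static unsigned demo_max_wear(const struct demo_flash *f)
{
    assert(f != NULL);

    unsigned max = 0;
    for (size_t i = 0; i < DEMO_PAGES; i++)
        if (f->wear[i] > max)
            max = f->wear[i];
    return max;
}
```

In this model, 800 rewrites of the same logical sector leave every one of the
8 physical pages with 100 writes, rather than one page with 800.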

Cheers


Re: DVD+-R[W] regression in 2.6.12/13

2005-09-05 Thread Mark Lord

Oh, you *should* be able to get the results you
are looking for (hdparm -I) by trying it this way:

   hdparm --Istdin ...


Re: DVD+-R[W] regression in 2.6.12/13

2005-09-05 Thread Mark Lord

Oliver Tennert wrote:


"hdparm -I /dev/dvdrecorder" leads to the output:

/dev/dvdrecorder:
 HDIO_DRIVE_CMD(identify) failed: Input/output error

The kernel tells me:

[4296893.262000] hdd: drive_cmd: status=0x51 { DriveReady SeekComplete Error }
[4296893.262000] hdd: drive_cmd: error=0x04 { AbortedCommand }
[4296893.262000] ide: failed opcode was: 0xec


Those messages are "normal" for an ATAPI drive.

hdparm first tries the IDENTIFY opcode (0xec), and if that fails (above)
it then tries the PACKET_IDENTIFY opcode (0xa1), which should work for ATAPI.
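
The probing order can be sketched like this. It is not the hdparm source: the
callback type and the two fake devices are invented so the logic can be run
without any hardware, and only the two opcodes (0xec and 0xa1) come from the
description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CMD_IDENTIFY        0xEC    /* IDENTIFY, for ATA disks */
#define CMD_PACKET_IDENTIFY 0xA1    /* PACKET_IDENTIFY, for ATAPI devices */

/* Hypothetical callback: returns true if the device accepted the opcode. */
typedef bool (*issue_fn)(uint8_t opcode);

/*
 * Try IDENTIFY first and fall back to PACKET_IDENTIFY.
 * Returns the opcode that succeeded, or 0 if both were rejected.
 */
static uint8_t identify_with_fallback(issue_fn issue)
{
    assert(issue != NULL);

    if (issue(CMD_IDENTIFY))
        return CMD_IDENTIFY;
    if (issue(CMD_PACKET_IDENTIFY))
        return CMD_PACKET_IDENTIFY;
    return 0;
}

/* Simulated devices, for illustration only. */
static bool fake_atapi_drive(uint8_t opcode)
{
    return opcode == CMD_PACKET_IDENTIFY;   /* aborts plain IDENTIFY */
}

static bool fake_ata_disk(uint8_t opcode)
{
    return opcode == CMD_IDENTIFY;
}
```

On the simulated ATAPI drive the first attempt fails, which corresponds to the
"failed opcode was: 0xec" message, and the second attempt is the one that
should succeed.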

I'm not sure why the "failed: Input/output error" (-EIO) result is
being returned from the ATA layer in this case.  Driver bug, most likely.

Cheers


Re: DVD+-R[W] regression in 2.6.12/13

2005-09-05 Thread Mark Lord

Alan Cox wrote:

On Llu, 2005-09-05 at 12:24 -0400, Mark Lord wrote:


I'm not sure why the "failed: Input/output error" (-EIO) result is
being returned from the ATA layer in this case.  Driver bug, most likely.


Because the command failed an error was reported back instead of success
status/info.


Well, yes, that's -EIO, as expected from the IDENTIFY command.
But the PACKET_IDENTIFY should not be failing on the ATAPI drive.

-ml


Re: RFC: i386: kill !4KSTACKS

2005-09-06 Thread Mark Lord

Daniel Phillips wrote:
There are only two stacks involved, the normal kernel stack and your new ndis 
stack.  You save ESP of the kernel stack at the base of the ndis stack.  When 
the Windows code calls your api, you get the ndis ESP, load the kernel ESP 
from the base of the ndis stack, push the ndis ESP so you can get back to the 
ndis code later, and continue on your merry way.


With CONFIG_PREEMPT, this will still cause trouble due to lack
of "current" task info on the NDIS stack.

One option is to copy (duplicate) the bottom-of-stack info when
switching to the NDIS stack.

Another option is to stick a Mutex around any use of the NDIS stack
when calling into the foreign driver (might be done like this already??),
which will prevent PREEMPTion during the call.
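
The reason the task info goes missing: as I understand the i386 scheme, the
per-task info block is found by masking the stack pointer down to the start of
its aligned stack area, so a stack pointer inside some other allocation
resolves to a different (and meaningless) address. A sketch of just that
arithmetic, with an assumed stack size and made-up addresses:

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_THREAD_SIZE 8192u  /* assumed stack size; must be a power of two */

/*
 * Locate the task info block the way a stack-mask scheme does: round the
 * stack pointer down to the start of its aligned stack area.
 */
static uintptr_t demo_thread_info(uintptr_t sp)
{
    assert((DEMO_THREAD_SIZE & (DEMO_THREAD_SIZE - 1)) == 0);
    return sp & ~(uintptr_t)(DEMO_THREAD_SIZE - 1);
}
```

Any two stack pointers within the same kernel stack resolve to the same block,
while a pointer into a separately allocated NDIS stack does not, hence the
suggestion to duplicate the bottom-of-stack info there.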

Cheers


Re: [GIT PATCH] USB patches for 2.6.13

2005-09-08 Thread Mark Lord

Is someone actively working on USB Suspend/Resume support yet?

I ask because this is becoming more and more important as people
shift more to portable notebook computers with Linux.

Enabling CONFIG_USB_SUSPEND is currently a surefire way to
guarantee crashing my own notebook on suspend/resume,
whereas it *usually* (but not always) survives when that
config option is left unset.

Nothing complicated in the configuration -- just a USB mouse,
but that's enough to nuke it.

Anyone looking at that stuff right now?


Re: sata_nv + ADMA + Samsung disk problem

2008-01-03 Thread Mark Lord

Robert Hancock wrote:

Mark Lord wrote:

Robert Hancock wrote:
..
 From some of the traces I took previously (posted on LKML as 
"sata_nv ADMA controller lockup investigation" way back in Feb 07), 
what seems to occur is that when the second command is issued very 
rapidly (within less than 20 microseconds, or potentially longer) 
after the previous command's completion, the ADMA status changes from 
0x500 (STOPPED and IDLE) to 0x400 (just IDLE) as it typically does, 
but then it sticks there, no interrupt is ever raised, and CPB 
response flags remain at 0.

..

Assuming that NVidia got their ADMA core logic from Pacific Digital
(the inventors), then it may have some of the same bugs as the original.

One of those bugs is that the aGO trigger is sampled in a "racey" way,
such that it sometimes may miss a recent addition to the ring.

The *only* way to guarantee things with the original Pacific Digital core
was to (1) always retrigger aGO for a full ring scan with each new addition,
and (2) poll periodically (every half second or so) rather than relying
exclusively on the IRQ actually working..

Dunno about the NVidia version.


Theirs works rather differently - the GO bit is there, but there's 
another append register which is used to tell the controller that a new 
tag has been added to the CPB list.

..

The PacDigi core uses a "search count" register for that purpose,
but the buggy nature of the core required that it always be set
to "2 * ring_size" to ensure nothing got missed.

Here's some comments from the original ADMA driver.
Maybe something from here might help with the NV stuff, too.

   // There is a chance that the chip will skip over a CPB if a SERVICE interrupt
   // occurs while it's reading the CPB header.  This won't cause us to get
   // stuck anywhere, but it might slow down execution of the new CPB if
   // it has to wait for the next time we hit aGO.  So.. Dxxx/Dxxx suggest
   // that all we need to do is tell the chip to do two passes around the ring
   // from an aGO instead of one pass, so that it will find the "missed" CPB
   // on the second pass.  This isn't as bad as it first looks.
   //
   writew(channel->num_cpbs * 2, &adma_regs->cpb_search_count);

Or again, the NV stuff may be completely different (?).
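To show why a search count of twice the ring size covers the missed-CPB case, here is a toy userspace model. It is only a sketch: the function name, the `glitch_slot` parameter and the way the miss is simulated are all assumptions made for illustration, not PacDigi or NVidia driver code.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy model: the chip visits `search_count` slots of a ring of `ring_size`
 * CPBs, starting at `start` and wrapping around.  `glitch_slot` is missed
 * the first time it is visited (pass -1 for no glitch), imitating a CPB
 * header read that collides with a SERVICE interrupt.  Ready slots that
 * are seen get cleared; the return value is how many CPBs were executed.
 */
size_t adma_scan_ring(bool *ready, size_t ring_size, size_t start,
                      size_t search_count, long glitch_slot)
{
    size_t executed = 0;
    bool glitch_pending = glitch_slot >= 0;

    if (ring_size == 0)
        return 0;

    for (size_t i = 0; i < search_count; i++) {
        size_t slot = (start + i) % ring_size;

        if (glitch_pending && (long)slot == glitch_slot) {
            glitch_pending = false;   /* missed once, visible next visit */
            continue;
        }
        if (ready[slot]) {
            ready[slot] = false;
            executed++;
        }
    }
    return executed;
}
```

With a 4-entry ring, one ready CPB in slot 2 and the glitch on that same slot, a search count of 4 executes nothing, while a search count of 8 picks the CPB up on the second pass.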


Re: sata_nv + ADMA + Samsung disk problem

2008-01-03 Thread Mark Lord

Mark Lord wrote:

Robert Hancock wrote:

Mark Lord wrote:

Robert Hancock wrote:
..

 From some of the traces I took previously (posted on LKML as "sata_nv ADMA 
controller lockup investigation" way back in Feb 07), what seems to occur is that 
when the second command is issued very rapidly (within less than 20 microseconds, or 
potentially longer) after the previous command's completion, the ADMA status changes from 
0x500 (STOPPED and IDLE) to 0x400 (just IDLE) as it typically does, but then it sticks 
there, no interrupt is ever raised, and CPB response flags remain at 0.

..

Assuming that NVidia got their ADMA core logic from Pacific Digital
(the inventors), then it may have some of the same bugs as the original.

One of those bugs is that the aGO trigger is sampled in a "racey" way,
such that it sometimes may miss a recent addition to the ring.

The *only* way to guarantee things with the original Pacific Digital core
was to (1) always retrigger aGO for a full ring scan with each new addition,
and (2) poll periodically (every half second or so) rather than relying
exclusively on the IRQ actually working..

Dunno about the NVidia version.


Theirs works rather differently - the GO bit is there, but there's another 
append register which is used to tell the controller that a new tag has been 
added to the CPB list.

..

The PacDigi core uses a "search count" register for that purpose,
but the buggy nature of the core required that it always be set
to "2 * ring_size" to ensure nothing got missed.

Here's some comments from the original ADMA driver.
Maybe something from here might help with the NV stuff, too.

   // There is a chance that the chip will skip over a CPB if a SERVICE interrupt
   // occurs while it's reading the CPB header.  This won't cause us to get
   // stuck anywhere, but it might slow down execution of the new CPB if
   // it has to wait for the next time we hit aGO.  So.. Dxxx/Dxxx suggest
   // that all we need to do is tell the chip to do two passes around the ring
   // from an aGO instead of one pass, so that it will find the "missed" CPB
   // on the second pass.  This isn't as bad as it first looks.
   //
   writew(channel->num_cpbs * 2, &adma_regs->cpb_search_count);

Or again, the NV stuff may be completely different (?).

..

Another thing about the PacDigi core:  one has to be very careful
to avoid sequential accesses to sequential PCI locations when
programming the chip -- it cannot handle merged register writes.

So for any group of sequentially laid out registers, the code has
to ensure it never writes two adjacent registers in sequence..
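One way to picture the "never two adjacent registers in a row" rule is as an ordering problem over register indices. The sketch below is purely illustrative: the helper names are invented, and the odd-then-even ordering is just one simple choice, not what the original ADMA driver did.

```c
#include <stdbool.h>
#include <stddef.h>

/* True if no two consecutive entries of order[] are adjacent indices. */
bool write_order_is_safe(const size_t *order, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        size_t a = order[i - 1], b = order[i];

        if ((a > b ? a - b : b - a) == 1)
            return false;
    }
    return true;
}

/*
 * Fill order[] with a permutation of 0..n-1 that visits the odd indices
 * first and then the even ones (1, 3, 5, ..., 0, 2, 4, ...).  This works
 * for n <= 1 and n >= 4.  For n == 2 or n == 3 no such permutation exists,
 * so false is returned and the caller has to separate the writes some
 * other way (for example with an unrelated access in between).
 */
bool build_safe_write_order(size_t *order, size_t n)
{
    size_t k = 0;

    if (n == 2 || n == 3)
        return false;
    for (size_t i = 1; i < n; i += 2)
        order[k++] = i;
    for (size_t i = 0; i < n; i += 2)
        order[k++] = i;
    return true;
}
```

For six registers this yields 1, 3, 5, 0, 2, 4, so no write is immediately followed by a write to its neighbour and there is nothing adjacent for the bus to merge.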

-ml


Re: + restore-missing-sysfs-max_cstate-attr.patch added to -mm tree

2008-01-03 Thread Mark Lord

Venki Pallipadi wrote:

Reintroduce run time configurable max_cstate for !CPU_IDLE case.

Signed-off-by: Venkatesh Pallipadi <[EMAIL PROTECTED]>

Index: linux-2.6.24-rc/drivers/acpi/processor_idle.c
===
--- linux-2.6.24-rc.orig/drivers/acpi/processor_idle.c
+++ linux-2.6.24-rc/drivers/acpi/processor_idle.c
@@ -76,7 +76,11 @@ static void (*pm_idle_save) (void) __rea
 #define PM_TIMER_TICKS_TO_US(p)(((p) * 1000)/(PM_TIMER_FREQUENCY/1000))
 
 static unsigned int max_cstate __read_mostly = ACPI_PROCESSOR_MAX_POWER;
+#ifdef CONFIG_CPU_IDLE
 module_param(max_cstate, uint, 0000);
+#else
+module_param(max_cstate, uint, 0644);
+#endif
 static unsigned int nocst __read_mostly;
 module_param(nocst, uint, 0000);
 

..

I'll try and re-test with this on Friday.

Meanwhile, can you give a short summary of how behaviour differs
between CONFIG_CPU_IDLE and !CONFIG_CPU_IDLE  ??

I'm not at all clear on how this really affects things.

Thanks.


Re: + restore-missing-sysfs-max_cstate-attr.patch added to -mm tree

2008-01-04 Thread Mark Lord

Mark Lord wrote:

Venki Pallipadi wrote:

Reintroduce run time configurable max_cstate for !CPU_IDLE case.

Signed-off-by: Venkatesh Pallipadi <[EMAIL PROTECTED]>

Index: linux-2.6.24-rc/drivers/acpi/processor_idle.c
===
--- linux-2.6.24-rc.orig/drivers/acpi/processor_idle.c
+++ linux-2.6.24-rc/drivers/acpi/processor_idle.c
@@ -76,7 +76,11 @@ static void (*pm_idle_save) (void) __rea
 #define PM_TIMER_TICKS_TO_US(p)(((p) * 1000)/(PM_TIMER_FREQUENCY/1000))
 
 static unsigned int max_cstate __read_mostly = ACPI_PROCESSOR_MAX_POWER;
+#ifdef CONFIG_CPU_IDLE
 module_param(max_cstate, uint, 0000);
+#else
+module_param(max_cstate, uint, 0644);
+#endif
 static unsigned int nocst __read_mostly;
 module_param(nocst, uint, 0000);
 

..

I'll try and re-test with this on Friday.

..

Okay, with !CONFIG_CPU_IDLE, this works fine -- same as 2.6.23 and earlier.


Meanwhile, can you give a short summary of how behaviour differs
between CONFIG_CPU_IDLE and !CONFIG_CPU_IDLE  ??

I'm not at all clear on how this really affects things.


???


Re: + restore-missing-sysfs-max_cstate-attr.patch added to -mm tree

2008-01-05 Thread Mark Lord

Pallipadi, Venkatesh wrote:

-Original Message-
From: Mark Lord [mailto:[EMAIL PROTECTED] 

..
Okay, with !CONFIG_CPU_IDLE, this works fine -- same as 2.6.23 and earlier.




Good to know. At least we do not have a regression for 2.6.24 now.

..

Agreed.  We're happy here, for now.


Meanwhile, can you give a short summary of how behaviour differs
between CONFIG_CPU_IDLE and !CONFIG_CPU_IDLE  ??

I'm not at all clear on how this really affects things.


With CPU_IDLE, the C-state policy is removed from the ACPI driver. Ideally
policy should have nothing to do with ACPI, as ACPI only provides the
C-state mechanisms. So, with CPU_IDLE, it is not easy to control this
variable through an ACPI driver module at run time. Also, the latency
interface that was mentioned before is to serve the same purpose in a
clearer manner (based on the wakeup latency) instead of a C-state
number which may not mean much from the end user's point of view.

I will look at why latency does not work on a single core system
soon (was that with a UP kernel or an SMP kernel?). That way we will have
proper cover for this with CPU_IDLE in future.

..

That was with a UP kernel on a UP box.

The latency thingie really seemed to have little or no effect,
whereas setting max_cstate=1 has a quite noticeable positive impact.
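For reference, setting the parameter at run time boils down to writing a number into the module parameter file. A minimal sketch, assuming the parameter shows up at /sys/module/processor/parameters/max_cstate (that path is an assumption here, and the helper name is made up):

```c
#include <stdio.h>

/*
 * Write an unsigned value, followed by a newline, to a sysfs-style file.
 * Returns 0 on success, -1 if the file cannot be opened or written.
 */
int write_uint_param(const char *path, unsigned int value)
{
    FILE *f = fopen(path, "w");

    if (!f)
        return -1;
    if (fprintf(f, "%u\n", value) < 0) {
        fclose(f);
        return -1;
    }
    /* fclose() flushes; a failed flush means the write did not land. */
    return fclose(f) == 0 ? 0 : -1;
}
```

Called as root with the assumed path and a value of 1, this does the same job as an `echo 1 >` from a script. On a kernel where the parameter is not exported to sysfs, the open simply fails and -1 comes back.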

Things seemed okay (with the latency thingie) on the SMP machine,
but with two cores it is probably simply more forgiving.  


cheers


Re: 2.6.24-rc6-git12: Reported regressions from 2.6.23

2008-01-05 Thread Mark Lord

Rafael J. Wysocki wrote:

This message contains a list of some regressions from 2.6.23 reported since
2.6.24-rc1 was released, for which there are no fixes in the mainline I know
of.  If any of them have been fixed already, please let me know.

..

Subject : 2+ wake-ups/second in 2.6.24
Submitter   : Mark Lord <[EMAIL PROTECTED]>
Date: 2007-12-02 04:23
References  : http://lkml.org/lkml/2007/12/1/141
  http://bugzilla.kernel.org/show_bug.cgi?id=9489
Handled-By  : Arjan van de Ven <[EMAIL PROTECTED]>


..

I have only seen that one once, and I think it was Arjan who said
that it has been observed rarely by other people as well.
The bugzilla entry is mostly just to track the darned thing,
but it seems unlikely that anyone will find/fix it for 2.6.24.
No big deal, but it would be good to have somebody knowledgeable
in clocks/interrupts try and track it down.

I wonder if it's just a babbling IRQ on resume, before the driver
has run its resume code or something?

Cheers


Re: + restore-missing-sysfs-max_cstate-attr.patch added to -mm tree

2008-01-06 Thread Mark Lord

Venki Pallipadi wrote:

Reintroduce run time configurable max_cstate for !CPU_IDLE case.

Signed-off-by: Venkatesh Pallipadi <[EMAIL PROTECTED]>

Index: linux-2.6.24-rc/drivers/acpi/processor_idle.c
===
--- linux-2.6.24-rc.orig/drivers/acpi/processor_idle.c
+++ linux-2.6.24-rc/drivers/acpi/processor_idle.c
@@ -76,7 +76,11 @@ static void (*pm_idle_save) (void) __rea
 #define PM_TIMER_TICKS_TO_US(p)(((p) * 1000)/(PM_TIMER_FREQUENCY/1000))
 
 static unsigned int max_cstate __read_mostly = ACPI_PROCESSOR_MAX_POWER;
+#ifdef CONFIG_CPU_IDLE
 module_param(max_cstate, uint, 0000);
+#else
+module_param(max_cstate, uint, 0644);
+#endif
 static unsigned int nocst __read_mostly;
 module_param(nocst, uint, 0000);
 

..

Can we get this patch upstream so that a stock 2.6.24 will work here?

Thanks


Re: Linux 2.6.24-rc7

2008-01-06 Thread Mark Lord

Linus Torvalds wrote:
..
Both git trees and tar-balls/patches pushed out, should be mirroring out 
within minutes. So there are no excuses to not try it out, and see if your 
favorite regression has been fixed.

..

We're still missing the sata_qstor regression fix from Tejun,
and the patch from Venkatesh Pallipadi that reinstates max_cstate
in sysfs for !CPU_IDLE.

Cheers



Re: + restore-missing-sysfs-max_cstate-attr.patch added to -mm tree

2008-01-07 Thread Mark Lord

Andrew Morton wrote:

..
umm, OK, I queued it for 2.6.24.  I'll give people a day or so to comment
on this.

I had to invent some silly changelog for it.  Please review it for
accuracy and completeness?

..

From: Venki Pallipadi <[EMAIL PROTECTED]>

This was writeable in 2.6.23 but the cpuidle merge made it read-only.  But
some people's scripts (ie: Mark's) were writing to it.

..

Actually, the cpuidle changes made it not appear at all in sysfs.

Thanks, Andrew & Venki.


Re: + restore-missing-sysfs-max_cstate-attr.patch added to -mm tree

2008-01-07 Thread Mark Lord

Arjan van de Ven wrote:
..

if we take a step back; Mark afaics only wants to put 1 in there...
And that makes sense; either you want the "no latency" C1, or you want the lot
(esp given that C2 and deeper are at the whim of the bios, what they mean varies
over time. Actually even C1 does that on some AMD systems);

Longer term I'd suggest we make an option that basically is "C1 only",
(or technically, "use hlt only")
that solves Mark's VMware thing, and is a lot closer to what people really want.

..

Yeah, that makes sense.


Well, that and if VMWARE really can't deal with latency in their kernel module
they should use the proper code for that. It's also a ton easier to implement, since
it basically is "don't use the CPUIDLE idle loop, but use the traditional hlt one"

..

I don't think it's so much VMware itself, as it is the guest OS inside it.

Cheers



Re: + restore-missing-sysfs-max_cstate-attr.patch added to -mm tree

2008-01-07 Thread Mark Lord

Len Brown wrote:

1. Why does VMware need max_cstate=1 to load quickly?

..

Eh?  Nothing to do with "loading" anything,
but rather it's simple responsiveness to guest keyboard
input that we're experiencing trouble with.
The guest OS is probably "broken" in that regard,
but setting max_cstate=1 makes it usable here.


2. Why does the "max_cstate=1" workaround help only
   on the dual-core boxes, while the single-core
   boxes still fail to load quickly?

..

Eh?  Setting max_cstate=1 helps on both single/dual core
boxes/kernels here.  The alternative (newer) latency thing
(that requires a custom kernel module to change on the fly)
is the thing that had no effect at all on our single-core box,
but did seem to help the dual-core more (not verified completely
on dual-core though).

Cheers


Re: Do SATA tape drives work?

2008-01-08 Thread Mark Lord

Tejun Heo wrote:

[cc'ing linux-ide]

Jonathan Woithe wrote:

Hi guys

I was wondering whether anyone can shed any light on the status of SATA tape
drives.  There's very little info on the net about this at least in the
places I've checked; the only thing of any significance I've found thus far
is a note in a Bacula document dated April 2007 which states that drives
other than real SCSI units don't generally work with Bacula.

To put this into context, I'm looking at purchasing a Sony SDX470VRB SATA
AIT-1 tape drive for use with the SATA controller on an Intel DG31PR
mainboard.  The drive will be used primarily with tar/cpio.  Obviously
however I only want to make the purchase if there's a reasonable chance of
it working.

I would appreciate any information you can shed on this issue.


It's supposed to with recent updates.  Mark, right?

..

I wouldn't buy anything with "Sony" on it,
but Albert thinks ATAPI tapes should be working now
(he has my old drive now).




Re: Reproducible data corruption with sendfile+vsftp - splice regression?

2007-12-08 Thread Mark Lord

Francois Romieu wrote:

Holger Hoffstaette <[EMAIL PROTECTED]> :
[...]

Maybe turning off sendfile or NAPI just lead to random success - so far it
really looks like tso on the r8169 is the common cause.


TSO on the r8169 is the magic switch but the regression makes imvho more
sense from a VM pov:

- the corrupted file has the same size as the expected file
- the corrupted file exhibits holes which come as a multiple of 4096 bytes
  (8*4k, 2 places, there may be more)

...

That's interesting.  I had those exact same symptoms here
with copying data to/from a USB stick recently.
But that stick died completely shortly thereafter,
so this was written-off as "bad hardware".

Strange that you see the same symptoms from a different scenario.
Probably no relationship there, but ..
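For anyone wanting to check whether a corrupted copy shows the same page-sized pattern, a small comparison helper is enough. This is just a sketch; the function name is invented, and the 4096-byte page size is passed in by the caller rather than assumed.

```c
#include <stddef.h>
#include <string.h>

/*
 * Count the pages (chunks of page_size bytes) that differ between two
 * equally sized buffers.  A trailing partial page is compared as well.
 */
size_t count_corrupt_pages(const unsigned char *good, const unsigned char *bad,
                           size_t len, size_t page_size)
{
    size_t corrupt = 0;

    if (page_size == 0)
        return 0;

    for (size_t off = 0; off < len; off += page_size) {
        size_t chunk = len - off < page_size ? len - off : page_size;

        if (memcmp(good + off, bad + off, chunk) != 0)
            corrupt++;
    }
    return corrupt;
}
```

If the damage really comes in whole 4096-byte holes, the count for a page size of 4096 times 4096 should match the total number of differing bytes reasonably closely; scattered single-byte errors would give a very different ratio.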



Re: [PATCH] iwlwifi3945/4965 - fix rate control algo reference leak

2007-12-08 Thread Mark Lord

Zhu Yi wrote:

On Thu, 2007-12-06 at 12:39 +0300, Cyrill Gorcunov wrote:

From: Cyrill Gorcunov <[EMAIL PROTECTED]>
Subject: [PATCH] iwlwifi3945/4965 - fix rate control algo reference leak

..

Any chance of getting LEDs support re-added to this driver,
perhaps in the 2.6.25 timeframe?

With that in there, I could finally switch the machines around here
away from the earlier ipw3945 stuff.

Thanks


Re: [PATCH] scheduler: fix x86 regression in native_sched_clock

2007-12-08 Thread Mark Lord

Ingo Molnar wrote:

...
thanks. I do get the impression that most of this can/should wait until 
2.6.25. The patches look quite dangerous.

..

I confess to not really trying hard to understand everything in this thread,
but the implication seems to be that this bug might affect udelay()
and possibly jiffies ?

If so, then fixing it has to be a *must* for 2.6.24, as otherwise we'll get
all sorts of once-in-a-while odd driver bugs.. like maybe these two for starters:

[Bug 9492] 2.6.24:  false double-clicks from USB mouse
[Bug 9489] 2+ wake-ups/second in 2.6.24

Neither of which happens often enough to explain or debug,
but either of which *could* be caused by some weird jiffies thing maybe.

???


Re: Iomega ZIP-100 drive unsupported with jmicron JMB361 chip?

2007-12-11 Thread Mark Lord

trash can wrote:
..

Robert Hancock wrote:

That is rather curious. There's no sign of any libata error handling
going on.. Maybe the drive is actually returning that error code in the
ATAPI CDB, or at least we think it is?

You are sure that this drive still works with older kernels using
drivers/ide, and that the hardware didn't break at some point, I assume?


Thanks for your time. I get a kernel panic with the controller mode in BIOS
set to AHCI with Fedora Core 6. Once returned to IDE (setting used since
computer was built) I booted up. Added information: In Fedora 7 I can not
burn DVDs or CDROM using K3b with the Zip drive connected. Some
preformatting (I assume) is done rendering the CD/DVD useless then got an
I/O error. In Fedora 8 I am able to burn a CDROM using K3b but DVDs behave
as with Fedora 7. All is well when the Zip is totally removed. This Zip
drive also worked under Microsoft 2000 { which was removed over a year
ago ;-) }. Fedora Core 6, Fedora 7, and Fedora 8 are all installed on
this machine. This drive did not work with a Ubuntu 7.10 Live CD on this
machine. I intend to try this Zip drive on another motherboard with the
above live CD.

...

I missed the early part of this thread,
but here is a data point that may or may not be useful.

I have an ASUS mobo here with an onboard JM363 SATA/PATA controller
(verified by looking at the actual chip).

It works fine when in AHCI mode with a PATA ATAPI ZIP100 drive
all by itself.  No other configurations tested.
This is with kernel 2.6.24-rc4-git?.

Cheers


Re: Iomega ZIP-100 drive unsupported with jmicron JMB361 chip?

2007-12-11 Thread Mark Lord

Mark Lord wrote:


I missed the early part of this thread,
but here is a data point that may or may not be useful.

I have an ASUS mobo here with an onboard JM363 SATA/PATA controller
(verified by looking at the actual chip).

It works fine when in AHCI mode with a PATA ATAPI ZIP100 drive
all by itself.  No other configurations tested.
This is with kernel 2.6.24-rc4-git?.

..

Oh yeah.. that's with libata controlling all drives in the system.

This exact same mobo was used for a while with a PATA DVD-RW (no ZIP drive)
under older kernels using drivers/IDE, but was unreliable in that configuration
(Ubuntu Edgy).   Ditto when the PATA DVD-RW was replaced with a SATA DVD-RW.

-ml


[PATCH] sata_mv: improve warnings about Highpoint RocketRAID 23xx cards

2007-12-11 Thread Mark Lord

Improve the existing boot/load time warnings from sata_mv
for Highpoint RocketRAID 23xx cards, based on new knowledge
about where the BIOS likes to overwrite sectors with metadata.

Harmless to us, but very useful for end users.

Signed-off-by: Mark Lord <[EMAIL PROTECTED]>
---
This should ideally go upstream for 2.6.24.

--- old/drivers/ata/sata_mv.c   2007-12-10 18:14:09.0 -0500
+++ linux/drivers/ata/sata_mv.c 2007-12-11 12:51:51.0 -0500
@@ -2506,11 +2506,31 @@
if (pdev->vendor == PCI_VENDOR_ID_TTI &&
(pdev->device == 0x2300 || pdev->device == 0x2310))
{
-   printk(KERN_WARNING "sata_mv: Highpoint RocketRAID BIOS"
-   " will CORRUPT DATA on attached drives when"
-   " configured as \"Legacy\".  BEWARE!\n");
-   printk(KERN_WARNING "sata_mv: Use BIOS \"JBOD\" volumes"
-   " instead for safety.\n");
+   /*
+* Highpoint RocketRAID PCIe 23xx series cards:
+*
+* Unconfigured drives are treated as "Legacy"
+* by the BIOS, and it overwrites sector 8 with
+* a "Lgcy" metadata block prior to Linux boot.
+*
+* Configured drives (RAID or JBOD) leave sector 8
+* alone, but instead overwrite a high numbered
+* sector for the RAID metadata.  This sector can
+* be determined exactly, by truncating the physical
+* drive capacity to a nice even GB value.
+*
+* RAID metadata is at: (dev->n_sectors & ~0xfffff)
+*
+* Warn the user, lest they think we're just buggy.
+*/
+   printk(KERN_WARNING DRV_NAME ": Highpoint RocketRAID"
+   " BIOS CORRUPTS DATA on all attached drives,"
+   " regardless of if/how they are configured."
+   " BEWARE!\n");
+   printk(KERN_WARNING DRV_NAME ": For data safety, do not"
+   " use sectors 8-9 on \"Legacy\" drives,"
+   " and avoid the final two gigabytes on"
+   " all RocketRAID BIOS initialized drives.\n");
}
case chip_6042:
hpriv->ops = &mv6xxx_ops;
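The metadata location described in the comment is just the sector count with its low bits masked off. A small userspace sketch of that arithmetic follows; the helper name is made up, and the mask width is left as a parameter rather than hard-coded, since it is taken from the comment and not checked against hardware here.

```c
#include <stdint.h>

/* Round a sector count down by clearing its low `low_bits` bits. */
uint64_t truncate_sectors(uint64_t n_sectors, unsigned low_bits)
{
    if (low_bits >= 64)
        return 0;
    return n_sectors & ~((UINT64_C(1) << low_bits) - 1);
}
```

Clearing the low 20 bits (a mask of 0xfffff) rounds down to a multiple of 2^20 sectors, which is 512 MiB with 512-byte sectors, so the result always lands within the last 512 MiB of the drive.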


Re: Iomega ZIP-100 drive unsupported with jmicron JMB361 chip?

2007-12-11 Thread Mark Lord

trash can wrote:

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Thanks for the note. Zip drive as only device on the bus did not work
for me. kernel is correctly identifying the Jmicron chip.

..

So have you tried 2.6.24-rc* on that system yet, using only libata ?


Mark Lord wrote:

Mark Lord wrote:

I missed the early part of this thread,
but here is a data point that may or may not be useful.

I have an ASUS mobo here with an onboard JM363 SATA/PATA controller
(verified by looking at the actual chip).

It works fine when in AHCI mode with a PATA ATAPI ZIP100 drive
all by itself.  No other configurations tested.
This is with kernel 2.6.24-rc4-git?.

..

Oh yeah.. that's with libata controlling all drives in the system.

..


Re: sata_mv hotplug flaky?

2007-12-12 Thread Mark Lord

Jeff Garzik wrote:

Orion Poplawski wrote:
Not sure what the latest status of sata_mv hotplug should be, but it 
seems close.  I'm currently running 2.6.24-0.81.rc4.git7.fc9 with a 
MV88SX5081.  Pulled a couple drives and re-added.  One device got 
re-added, but the other did not.  It seems like I got the system to 
probe and re-add the first drive by doing a "mount -a", but haven't 
been able (no idea how) to re-add the second.




hotplug made great strides with the introduction of the new-EH code, but 
it still needs a bit of work.  Mark Lord was looking into that, so you 
can be sure your report has been noted...

..

I am not yet looking into sata_mv hotplug, but it is on the list
for early 2008.  There are some peculiarities and errata that need
to be addressed on some of the Marvell chips for this to work reliably.

Target kernel for it is now 2.6.26, with development/testing happening
during the 2.6.25 timeframe (but too late for inclusion there).

If the required tweaks are small and safe enough, then we might be
able to push a few of those out for 2.6.25.  We'll see.

The 2.6.26 kernel should also get full NCQ and PMP support for sata_mv,
and a smattering of other enhancements as well (eg. ATAPI support).

Cheers


Re: [PATCH] Intel Management Engine Interface

2007-12-12 Thread Mark Lord

Anas Nashif wrote:


Actually no TCP/IP is needed here. Basically the MEI driver writes and reads the
messages to/from the firmware. When communicating in-band using LMS, TCP/IP
terminates at LMS and the messages are copied using MEI driver.


To have a feel for all of this, with many examples, samples and documentation
you can download the AMT 3.0 SDK (google: intel amt sdk).

I would be more interested right now how the kernel can use this without
additional user space support. Any ideas on this? 


I will dig for some documents on that.

..

Sample code for storing a trace_back() would be much better.

-ml
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread Mark Lord

Jens,

I'm experimenting here with trying to generate large I/O through libata,
and not having much luck.

The limit seems to be the number of hardware PRD (SG) entries permitted
by the driver (libata:ata_piix), which is 128 by default.

The problem is, the block layer *never* sends an SG entry larger than 8192 bytes,
and even that size is exceptionally rare.  Nearly all I/O segments are 4096 bytes,
so I never see a single I/O larger than 512KB (128 * 4096).

If I patch various parts of block and SCSI, this limit doesn't budge,
but when I change the hardware PRD limit in libata, it scales by exactly
whatever I set the new value to.  This tells me that adjacent I/O segments
are not being combined.

I thought that QUEUE_FLAG_CLUSTER (aka. SCSI host .use_clustering=1) should
result in adjacent single pages being combined into larger physical segments?
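To make the expectation concrete, here is a toy model of what clustering is supposed to do with physically adjacent pages. It is a sketch only: the function and its parameters are invented for illustration and do not mirror the actual block layer merge code.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_BYTES 4096u

/*
 * Count the scatter/gather segments needed for `n` pages whose physical
 * addresses are given in pages[].  With clustering enabled, a page that
 * starts exactly where the previous one ended is folded into the current
 * segment, as long as the segment stays within max_seg_bytes.
 */
size_t count_sg_segments(const uint64_t *pages, size_t n,
                         int clustering, size_t max_seg_bytes)
{
    size_t segments = 0, seg_bytes = 0;

    for (size_t i = 0; i < n; i++) {
        int contiguous = i > 0 && pages[i] == pages[i - 1] + PAGE_BYTES;

        if (clustering && contiguous &&
            seg_bytes + PAGE_BYTES <= max_seg_bytes) {
            seg_bytes += PAGE_BYTES;      /* grow the current segment */
        } else {
            segments++;                   /* start a new segment */
            seg_bytes = PAGE_BYTES;
        }
    }
    return segments;
}
```

For 32 physically contiguous pages (128KB) and a 64KB segment limit, this gives 2 segments with clustering and 32 without. With only 128 PRD entries available, one page per entry is exactly where the 512KB ceiling comes from.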

This is x86-32 with latest 2.6.24-rc*.
I'll re-test on older kernels next.

???


QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread Mark Lord

(resending with corrected email address for Jens)

Jens,

I'm experimenting here with trying to generate large I/O through libata,
and not having much luck.

The limit seems to be the number of hardware PRD (SG) entries permitted
by the driver (libata:ata_piix), which is 128 by default.

The problem is, the block layer *never* sends an SG entry larger than 8192 bytes,
and even that size is exceptionally rare.  Nearly all I/O segments are 4096 bytes,
so I never see a single I/O larger than 512KB (128 * 4096).

If I patch various parts of block and SCSI, this limit doesn't budge,
but when I change the hardware PRD limit in libata, it scales by exactly
whatever I set the new value to.  This tells me that adjacent I/O segments
are not being combined.

I thought that QUEUE_FLAG_CLUSTER (aka. SCSI host .use_clustering=1) should
result in adjacent single pages being combined into larger physical segments?

This is x86-32 with latest 2.6.24-rc*.
I'll re-test on older kernels next.

???


Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread Mark Lord

Mark Lord wrote:

(resending with corrected email address for Jens)

Jens,

I'm experimenting here with trying to generate large I/O through libata,
and not having much luck.

The limit seems to be the number of hardware PRD (SG) entries permitted
by the driver (libata:ata_piix), which is 128 by default.

The problem is, the block layer *never* sends an SG entry larger than 8192 bytes,
and even that size is exceptionally rare.  Nearly all I/O segments are 4096 bytes,
so I never see a single I/O larger than 512KB (128 * 4096).

If I patch various parts of block and SCSI, this limit doesn't budge,
but when I change the hardware PRD limit in libata, it scales by exactly
whatever I set the new value to.  This tells me that adjacent I/O segments
are not being combined.

I thought that QUEUE_FLAG_CLUSTER (aka. SCSI host .use_clustering=1) should
result in adjacent single pages being combined into larger physical segments?


This is x86-32 with latest 2.6.24-rc*.
I'll re-test on older kernels next.

...

Problem confirmed.  2.6.23.8 regularly generates segments up to 64KB for libata,
but 2.6.24 uses only 4KB segments and a *few* 8KB segments.

???


Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread Mark Lord

Matthew Wilcox wrote:

On Thu, Dec 13, 2007 at 01:48:18PM -0500, Mark Lord wrote:
Problem confirmed.  2.6.23.8 regularly generates segments up to 64KB for libata,
but 2.6.24 uses only 4KB segments and a *few* 8KB segments.


Just a suspicion ... could this be slab vs slub?  ie check your configs
are the same / similar between the two kernels.

..

Mmmm.. a good thought, that one.
But I just rechecked, and both have CONFIG_SLAB=y

My guess is that something got changed around when Jens
reworked the block layer for 2.6.24.
I'm going to dig around in there now.



Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread Mark Lord

Jens Axboe wrote:
> On Thu, Dec 13 2007, Mark Lord wrote:
>> Matthew Wilcox wrote:
>>> On Thu, Dec 13, 2007 at 01:48:18PM -0500, Mark Lord wrote:
>>>> Problem confirmed.  2.6.23.8 regularly generates segments up to 64KB for libata,
>>>> but 2.6.24 uses only 4KB segments and a *few* 8KB segments.
>>>
>>> Just a suspicion ... could this be slab vs slub?  ie check your configs
>>> are the same / similar between the two kernels.
>> ..
>>
>> Mmmm.. a good thought, that one.
>> But I just rechecked, and both have CONFIG_SLAB=y
>>
>> My guess is that something got changed around when Jens
>> reworked the block layer for 2.6.24.
>> I'm going to dig around in there now.
>
> I didn't rework the block layer for 2.6.24 :-). The core block layer
> changes since 2.6.23 are:
>
> - Support for empty barriers. Not a likely candidate.
> - Shared tag queue fixes. Totally unlikely.
> - sg chaining support. Not likely.
> - The bio changes from Neil. Of the bunch, the most likely suspects in
>   this area, since it changes some of the code involved with merges and
>   blk_rq_map_sg().
> - Lots of simple stuff, again very unlikely.
>
> Anyway, it sounds odd for this to be a block layer problem if you do see
> occasional segments being merged. So it sounds more like the input data
> having changed.
>
> Why not just bisect it?
..

Because the early 2.6.24 series failed to boot on this machine
due to bugs in the block layer -- so the code that caused this regression
is probably in the stuff from before the kernels became usable here.

Cheers



Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread Mark Lord

Mark Lord wrote:
> Jens Axboe wrote:
>> On Thu, Dec 13 2007, Mark Lord wrote:
>>> Matthew Wilcox wrote:
>>>> On Thu, Dec 13, 2007 at 01:48:18PM -0500, Mark Lord wrote:
>>>>> Problem confirmed.  2.6.23.8 regularly generates segments up to 64KB for libata,
>>>>> but 2.6.24 uses only 4KB segments and a *few* 8KB segments.
>>>>
>>>> Just a suspicion ... could this be slab vs slub?  ie check your configs
>>>> are the same / similar between the two kernels.
>>> ..
>>>
>>> Mmmm.. a good thought, that one.
>>> But I just rechecked, and both have CONFIG_SLAB=y
>>>
>>> My guess is that something got changed around when Jens
>>> reworked the block layer for 2.6.24.
>>> I'm going to dig around in there now.
>>
>> I didn't rework the block layer for 2.6.24 :-). The core block layer
>> changes since 2.6.23 are:
>>
>> - Support for empty barriers. Not a likely candidate.
>> - Shared tag queue fixes. Totally unlikely.
>> - sg chaining support. Not likely.
>> - The bio changes from Neil. Of the bunch, the most likely suspects in
>>   this area, since it changes some of the code involved with merges and
>>   blk_rq_map_sg().
>> - Lots of simple stuff, again very unlikely.
>>
>> Anyway, it sounds odd for this to be a block layer problem if you do see
>> occasional segments being merged. So it sounds more like the input data
>> having changed.
>>
>> Why not just bisect it?
> ..
>
> Because the early 2.6.24 series failed to boot on this machine
> due to bugs in the block layer -- so the code that caused this regression
> is probably in the stuff from before the kernels became usable here.
..

That sounds more harsh than intended --> the earlier 2.6.24 kernels (up to
the first couple of -rc* ones) failed here because of incompatibilities
between the block/bio changes and libata.

That's better, I think! 


Cheers


Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread Mark Lord

Jens Axboe wrote:
> On Thu, Dec 13 2007, Mark Lord wrote:
>> Mark Lord wrote:
>>> Jens Axboe wrote:
>>>> On Thu, Dec 13 2007, Mark Lord wrote:
>>>>> Matthew Wilcox wrote:
>>>>>> On Thu, Dec 13, 2007 at 01:48:18PM -0500, Mark Lord wrote:
>>>>>>> Problem confirmed.  2.6.23.8 regularly generates segments up to 64KB for libata,
>>>>>>> but 2.6.24 uses only 4KB segments and a *few* 8KB segments.
>>>>>>
>>>>>> Just a suspicion ... could this be slab vs slub?  ie check your configs
>>>>>> are the same / similar between the two kernels.
>>>>> ..
>>>>>
>>>>> Mmmm.. a good thought, that one.
>>>>> But I just rechecked, and both have CONFIG_SLAB=y
>>>>>
>>>>> My guess is that something got changed around when Jens
>>>>> reworked the block layer for 2.6.24.
>>>>> I'm going to dig around in there now.
>>>>
>>>> I didn't rework the block layer for 2.6.24 :-). The core block layer
>>>> changes since 2.6.23 are:
>>>>
>>>> - Support for empty barriers. Not a likely candidate.
>>>> - Shared tag queue fixes. Totally unlikely.
>>>> - sg chaining support. Not likely.
>>>> - The bio changes from Neil. Of the bunch, the most likely suspects in
>>>>   this area, since it changes some of the code involved with merges and
>>>>   blk_rq_map_sg().
>>>> - Lots of simple stuff, again very unlikely.
>>>>
>>>> Anyway, it sounds odd for this to be a block layer problem if you do see
>>>> occasional segments being merged. So it sounds more like the input data
>>>> having changed.
>>>>
>>>> Why not just bisect it?
>>> ..
>>>
>>> Because the early 2.6.24 series failed to boot on this machine
>>> due to bugs in the block layer -- so the code that caused this regression
>>> is probably in the stuff from before the kernels became usable here.
>> ..
>>
>> That sounds more harsh than intended --> the earlier 2.6.24 kernels (up to
>> the first couple of -rc* ones) failed here because of incompatibilities
>> between the block/bio changes and libata.
>>
>> That's better, I think!
>
> No worries, I didn't pick it up as harsh just as an odd conclusion :-)
>
> If I were you, I'd just start from the first -rc that booted for you. If
> THAT has the bug, then we'll think of something else. If you don't get
> anywhere, I can run some tests tomorrow and see if I can reproduce it
> here.
..

I believe that *anyone* can reproduce it, since it's broken long before
the requests ever get to SCSI or libata.  Which also means that *anyone*
who wants to can bisect it, as well.

I don't do "bisects".

But I will dig a bit more and see if I can find the culprit.

Cheers



Re: QUEUE_FLAG_CLUSTER: not working in 2.6.24 ?

2007-12-13 Thread Mark Lord

Jens Axboe wrote:
> On Thu, Dec 13 2007, Mark Lord wrote:
>> Matthew Wilcox wrote:
>>> On Thu, Dec 13, 2007 at 01:48:18PM -0500, Mark Lord wrote:
>>>> Problem confirmed.  2.6.23.8 regularly generates segments up to 64KB for libata,
>>>> but 2.6.24 uses only 4KB segments and a *few* 8KB segments.
>>>
>>> Just a suspicion ... could this be slab vs slub?  ie check your configs
>>> are the same / similar between the two kernels.
>> ..
>>
>> Mmmm.. a good thought, that one.
>> But I just rechecked, and both have CONFIG_SLAB=y
>>
>> My guess is that something got changed around when Jens
>> reworked the block layer for 2.6.24.
>> I'm going to dig around in there now.
>
> I didn't rework the block layer for 2.6.24 :-). The core block layer
> changes since 2.6.23 are:
>
> - Support for empty barriers. Not a likely candidate.
> - Shared tag queue fixes. Totally unlikely.
> - sg chaining support. Not likely.
> - The bio changes from Neil. Of the bunch, the most likely suspects in
>   this area, since it changes some of the code involved with merges and
>   blk_rq_map_sg().
> - Lots of simple stuff, again very unlikely.
>
> Anyway, it sounds odd for this to be a block layer problem if you do see
> occasional segments being merged. So it sounds more like the input data
> having changed.
>
> Why not just bisect it?
..

CC'ing Neil Brown.


