Re: [PATCH 22/71] ncr5380: Eliminate selecting state

2015-11-29 Thread Finn Thain

On Sun, 29 Nov 2015, Geert Uytterhoeven wrote:

> If an ISA access takes 8 us, while the CPU runs at 1 GHz, i.e. 500M 
> loops/s, the difference will be huge.

Ondrej showed that an ISA access can take about 1.6 us. I don't know what 
to make of the "8 uS" comment in the mainline driver. Maybe it was an even 
slower ISA card.

Anyway, I made a measurement on my hardware and confirmed that lpj is a 
very bad proxy for device register access throughput. The "loops per 
access" gap is several orders of magnitude:

  lpj   HZaccess time (us)  lpa

Ondrej's 5380 ISA card (PC):  4797252   250   1.6   1.9k
My DMX3191D PCI card (PowerMac):   167079   100   0.42  7.0

> 
> Perhaps you can calibrate an NCR5380_read() loop at driver init time, 
> and use the calibration value later?

I had the same idea but I didn't think that the complexity was justified 
by the low precision requirement. But now that I have some timings I have 
to agree.

-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 22/71] ncr5380: Eliminate selecting state

2015-11-29 Thread Geert Uytterhoeven
Hi Finn,

On Sun, Nov 29, 2015 at 11:25 AM, Finn Thain  wrote:
> On Sun, 29 Nov 2015, Geert Uytterhoeven wrote:
>> This still heavily depends on the processing time spent in
>> NCR5380_read(). You should never use a value derived from
>> loops_per_jiffy for a non-empty loop,
>
> Sure but the time-out condition isn't supposed to be precise.
> Plus/minus a jiffy is no problem. Plus/minus a second is no good.

If an ISA access takes 8 µs, while the CPU runs at 1 GHz, i.e. 500M loops/s,
the difference will be huge.

>> as it may take much longer. Always compare with an maximum end time
>> instead.
>
> That can't work with interrupts disabled, which was the problem I was
> trying to solve by use of loops_per_jiffy.

Then you indeed can't use jiffies.
Perhaps you can calibrate an NCR5380_read() loop at driver init time,
and use the calibration value later?

> NCR5380_poll_politely() in mainline doesn't work with interrupts disabled
> either, hence patch 21.
>
>> E.g.:
>>
>> end = jiffies + 2;/* 1 jiffie + 1 safeguard */
>> do {
>>  if ((NCR5380_read(reg1) & bit1) == val1)
>>  return 0;
>>  cpu_relax();
>> } while (time_before(jiffies, end);
>>
>> And a similar loop for "Busy-wait for up to 20 ms".
>
> Interrupts may be disabled during that loop also. Please refer to (and/or
> respond to) patch 21, "ncr5380: Sleep when polling, if possible", in which
> these changes were made.

So the above loop may never terminate. Oops...

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 22/71] ncr5380: Eliminate selecting state

2015-11-29 Thread Finn Thain

On Sun, 29 Nov 2015, Geert Uytterhoeven wrote:

> This still heavily depends on the processing time spent in 
> NCR5380_read(). You should never use a value derived from 
> loops_per_jiffy for a non-empty loop,

Sure but the time-out condition isn't supposed to be precise.
Plus/minus a jiffy is no problem. Plus/minus a second is no good.

> as it may take much longer. Always compare with an maximum end time 
> instead.

That can't work with interrupts disabled, which was the problem I was 
trying to solve by use of loops_per_jiffy.

NCR5380_poll_politely() in mainline doesn't work with interrupts disabled 
either, hence patch 21.

> 
> E.g.:
> 
> end = jiffies + 2;/* 1 jiffie + 1 safeguard */
> do {
>  if ((NCR5380_read(reg1) & bit1) == val1)
>  return 0;
>  cpu_relax();
> } while (time_before(jiffies, end);
> 
> And a similar loop for "Busy-wait for up to 20 ms".

Interrupts may be disabled during that loop also. Please refer to (and/or 
respond to) patch 21, "ncr5380: Sleep when polling, if possible", in which 
these changes were made.

-- 

> 
> Gr{oetje,eeting}s,
> 
> Geert
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 22/71] ncr5380: Eliminate selecting state

2015-11-29 Thread Geert Uytterhoeven
Hi Finn,

On Thu, Nov 26, 2015 at 11:32 PM, Finn Thain  wrote:
>> The timekeeping warning does not appear when all 71 patches are applied.
>> Reverse-bisected it - the warning disappears after:
>> [PATCH 50/71] ncr5380: Change instance->host_lock to hostdata->lock
>>
>
> Makes sense. I think that this should solve the problem:
>
> Index: linux/drivers/scsi/NCR5380.c
> ===
> --- linux.orig/drivers/scsi/NCR5380.c   2015-11-27 09:21:40.0 +1100
> +++ linux/drivers/scsi/NCR5380.c2015-11-27 09:25:36.0 +1100
> @@ -230,7 +230,7 @@ static int NCR5380_poll_politely2(struct
> unsigned long n;
>
> /* Busy-wait for up to 1 jiffy */
> -   n = loops_per_jiffy;
> +   n = 1 + loops_per_jiffy / 10;
> do {
> if ((NCR5380_read(reg1) & bit1) == val1)
> return 0;

This still heavily depends on the processing time spent in NCR5380_read().
You should never use a value derived from loops_per_jiffy for a non-empty loop,
as it may take much longer. Always compare with an maximum end time instead.

E.g.:

end = jiffies + 2;/* 1 jiffie + 1 safeguard */
do {
 if ((NCR5380_read(reg1) & bit1) == val1)
 return 0;
 cpu_relax();
} while (time_before(jiffies, end);

And a similar loop for "Busy-wait for up to 20 ms".

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 22/71] ncr5380: Eliminate selecting state

2015-11-29 Thread Ondrej Zary
On Sunday 29 November 2015 01:46:03 Finn Thain wrote:
> 
> On Sun, 29 Nov 2015, Ondrej Zary wrote:
> 
> > > [...] I think that this should solve the problem:
> > 
> > Yes, it does!
> > 
> > [   48.119367] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x0, 
> > n_io_port 0, base 0xd8000, irq 0, can_queue 16, cmd_per_lun 2, sg_tablesize 
> > 128, this_id 7, flags { NO_DMA_FIXUP }, options { AUTOPROBE_IRQ PSEUDO_DMA }
> > [   49.715388] scsi 2:0:1:0: Direct-Access QUANTUM  LP240S GM240S01X 
> > 4.6  PQ: 0 ANSI: 2 CCS
> 
> That still takes about 1.6 seconds to scan a vacant bus ID. It should be 
> more like 0.25 seconds. Did you have the entire patch series applied?
> 
> The code presently in mainline spins for 500 iterations in the relevant 
> busy-wait loop, with the comment, "8uS a cycle for the cpu access". That 
> doesn't help me much. Does that refer to an ISA bus cycle? Or one 
> iteration of the loop? Electrical conductance?
> 
> If we knew how many loops_per_jiffy your machine gets, and your CONFIG_HZ 
> value, we could figure out how long a chip register access takes (which 
> might be 8 us).

loops_per_jiffy=4797252
CONFIG_HZ=250

# cat /proc/cpuinfo
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model   : 11
model name  : Intel(R) Celeron(TM) CPU1200MHz
stepping: 1
microcode   : 0x1c
cpu MHz : 1199.313
cache size  : 256 KB
physical id : 0
siblings: 1
core id : 0
cpu cores   : 1
apicid  : 0
initial apicid  : 0
fdiv_bug: no
f00f_bug: no
coma_bug: no
fpu : yes
fpu_exception   : yes
cpuid level : 2
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov 
pse36 mmx fxsr sse
bugs:
bogomips: 2398.62
clflush size: 32
cache_alignment : 32
address sizes   : 36 bits physical, 32 bits virtual
power management:


# cat iotest.c
#include 
#include 

int main(void) {
if (iopl(3)) {
perror("iopl");
return 1;
}
for (int i = 0; i < 1000; i++)
inb(0x280);
return 0;
}

# cc --std=c99 iotest.c -o iotest
# time ./iotest

real0m16.938s
user0m16.932s
sys 0m0.000s

outb() instead of inb():
# time ./iotest

real0m17.210s
user0m17.204s
sys 0m0.000s


> I'll try to figure out a similar timing for my Domex PCI card to see if 
> there is some sort of compromise to be found.


-- 
Ondrej Zary
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 22/71] ncr5380: Eliminate selecting state

2015-11-29 Thread Ondrej Zary
On Sunday 29 November 2015 01:46:03 Finn Thain wrote:
> 
> On Sun, 29 Nov 2015, Ondrej Zary wrote:
> 
> > > [...] I think that this should solve the problem:
> > 
> > Yes, it does!
> > 
> > [   48.119367] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x0, 
> > n_io_port 0, base 0xd8000, irq 0, can_queue 16, cmd_per_lun 2, sg_tablesize 
> > 128, this_id 7, flags { NO_DMA_FIXUP }, options { AUTOPROBE_IRQ PSEUDO_DMA }
> > [   49.715388] scsi 2:0:1:0: Direct-Access QUANTUM  LP240S GM240S01X 
> > 4.6  PQ: 0 ANSI: 2 CCS
> 
> That still takes about 1.6 seconds to scan a vacant bus ID. It should be 
> more like 0.25 seconds. Did you have the entire patch series applied?
> 
> The code presently in mainline spins for 500 iterations in the relevant 
> busy-wait loop, with the comment, "8uS a cycle for the cpu access". That 
> doesn't help me much. Does that refer to an ISA bus cycle? Or one 
> iteration of the loop? Electrical conductance?
> 
> If we knew how many loops_per_jiffy your machine gets, and your CONFIG_HZ 
> value, we could figure out how long a chip register access takes (which 
> might be 8 us).

loops_per_jiffy=4797252
CONFIG_HZ=250

# cat /proc/cpuinfo
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model   : 11
model name  : Intel(R) Celeron(TM) CPU1200MHz
stepping: 1
microcode   : 0x1c
cpu MHz : 1199.313
cache size  : 256 KB
physical id : 0
siblings: 1
core id : 0
cpu cores   : 1
apicid  : 0
initial apicid  : 0
fdiv_bug: no
f00f_bug: no
coma_bug: no
fpu : yes
fpu_exception   : yes
cpuid level : 2
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov 
pse36 mmx fxsr sse
bugs:
bogomips: 2398.62
clflush size: 32
cache_alignment : 32
address sizes   : 36 bits physical, 32 bits virtual
power management:


# cat iotest.c
#include 
#include 

int main(void) {
if (iopl(3)) {
perror("iopl");
return 1;
}
for (int i = 0; i < 1000; i++)
inb(0x280);
return 0;
}

# cc --std=c99 iotest.c -o iotest
# time ./iotest

real0m16.938s
user0m16.932s
sys 0m0.000s

outb() instead of inb():
# time ./iotest

real0m17.210s
user0m17.204s
sys 0m0.000s


> I'll try to figure out a similar timing for my Domex PCI card to see if 
> there is some sort of compromise to be found.


-- 
Ondrej Zary
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 22/71] ncr5380: Eliminate selecting state

2015-11-29 Thread Geert Uytterhoeven
Hi Finn,

On Thu, Nov 26, 2015 at 11:32 PM, Finn Thain  wrote:
>> The timekeeping warning does not appear when all 71 patches are applied.
>> Reverse-bisected it - the warning disappears after:
>> [PATCH 50/71] ncr5380: Change instance->host_lock to hostdata->lock
>>
>
> Makes sense. I think that this should solve the problem:
>
> Index: linux/drivers/scsi/NCR5380.c
> ===
> --- linux.orig/drivers/scsi/NCR5380.c   2015-11-27 09:21:40.0 +1100
> +++ linux/drivers/scsi/NCR5380.c2015-11-27 09:25:36.0 +1100
> @@ -230,7 +230,7 @@ static int NCR5380_poll_politely2(struct
> unsigned long n;
>
> /* Busy-wait for up to 1 jiffy */
> -   n = loops_per_jiffy;
> +   n = 1 + loops_per_jiffy / 10;
> do {
> if ((NCR5380_read(reg1) & bit1) == val1)
> return 0;

This still heavily depends on the processing time spent in NCR5380_read().
You should never use a value derived from loops_per_jiffy for a non-empty loop,
as it may take much longer. Always compare with an maximum end time instead.

E.g.:

end = jiffies + 2;/* 1 jiffie + 1 safeguard */
do {
 if ((NCR5380_read(reg1) & bit1) == val1)
 return 0;
 cpu_relax();
} while (time_before(jiffies, end);

And a similar loop for "Busy-wait for up to 20 ms".

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 22/71] ncr5380: Eliminate selecting state

2015-11-29 Thread Finn Thain

On Sun, 29 Nov 2015, Geert Uytterhoeven wrote:

> This still heavily depends on the processing time spent in 
> NCR5380_read(). You should never use a value derived from 
> loops_per_jiffy for a non-empty loop,

Sure but the time-out condition isn't supposed to be precise.
Plus/minus a jiffy is no problem. Plus/minus a second is no good.

> as it may take much longer. Always compare with an maximum end time 
> instead.

That can't work with interrupts disabled, which was the problem I was 
trying to solve by use of loops_per_jiffy.

NCR5380_poll_politely() in mainline doesn't work with interrupts disabled 
either, hence patch 21.

> 
> E.g.:
> 
> end = jiffies + 2;/* 1 jiffie + 1 safeguard */
> do {
>  if ((NCR5380_read(reg1) & bit1) == val1)
>  return 0;
>  cpu_relax();
> } while (time_before(jiffies, end);
> 
> And a similar loop for "Busy-wait for up to 20 ms".

Interrupts may be disabled during that loop also. Please refer to (and/or 
respond to) patch 21, "ncr5380: Sleep when polling, if possible", in which 
these changes were made.

-- 

> 
> Gr{oetje,eeting}s,
> 
> Geert
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 22/71] ncr5380: Eliminate selecting state

2015-11-29 Thread Geert Uytterhoeven
Hi Finn,

On Sun, Nov 29, 2015 at 11:25 AM, Finn Thain  wrote:
> On Sun, 29 Nov 2015, Geert Uytterhoeven wrote:
>> This still heavily depends on the processing time spent in
>> NCR5380_read(). You should never use a value derived from
>> loops_per_jiffy for a non-empty loop,
>
> Sure but the time-out condition isn't supposed to be precise.
> Plus/minus a jiffy is no problem. Plus/minus a second is no good.

If an ISA access takes 8 µs, while the CPU runs at 1 GHz, i.e. 500M loops/s,
the difference will be huge.

>> as it may take much longer. Always compare with an maximum end time
>> instead.
>
> That can't work with interrupts disabled, which was the problem I was
> trying to solve by use of loops_per_jiffy.

Then you indeed can't use jiffies.
Perhaps you can calibrate an NCR5380_read() loop at driver init time,
and use the calibration value later?

> NCR5380_poll_politely() in mainline doesn't work with interrupts disabled
> either, hence patch 21.
>
>> E.g.:
>>
>> end = jiffies + 2;/* 1 jiffie + 1 safeguard */
>> do {
>>  if ((NCR5380_read(reg1) & bit1) == val1)
>>  return 0;
>>  cpu_relax();
>> } while (time_before(jiffies, end);
>>
>> And a similar loop for "Busy-wait for up to 20 ms".
>
> Interrupts may be disabled during that loop also. Please refer to (and/or
> respond to) patch 21, "ncr5380: Sleep when polling, if possible", in which
> these changes were made.

So the above loop may never terminate. Oops...

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 22/71] ncr5380: Eliminate selecting state

2015-11-29 Thread Finn Thain

On Sun, 29 Nov 2015, Geert Uytterhoeven wrote:

> If an ISA access takes 8 us, while the CPU runs at 1 GHz, i.e. 500M 
> loops/s, the difference will be huge.

Ondrej showed that an ISA access can take about 1.6 us. I don't know what 
to make of the "8 uS" comment in the mainline driver. Maybe it was an even 
slower ISA card.

Anyway, I made a measurement on my hardware and confirmed that lpj is a 
very bad proxy for device register access throughput. The "loops per 
access" gap is several orders of magnitude:

  lpj   HZaccess time (us)  lpa

Ondrej's 5380 ISA card (PC):  4797252   250   1.6   1.9k
My DMX3191D PCI card (PowerMac):   167079   100   0.42  7.0

> 
> Perhaps you can calibrate an NCR5380_read() loop at driver init time, 
> and use the calibration value later?

I had the same idea but I didn't think that the complexity was justified 
by the low precision requirement. But now that I have some timings I have 
to agree.

-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 22/71] ncr5380: Eliminate selecting state

2015-11-28 Thread Finn Thain

On Sun, 29 Nov 2015, Ondrej Zary wrote:

> > [...] I think that this should solve the problem:
> 
> Yes, it does!
> 
> [   48.119367] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x0, 
> n_io_port 0, base 0xd8000, irq 0, can_queue 16, cmd_per_lun 2, sg_tablesize 
> 128, this_id 7, flags { NO_DMA_FIXUP }, options { AUTOPROBE_IRQ PSEUDO_DMA }
> [   49.715388] scsi 2:0:1:0: Direct-Access QUANTUM  LP240S GM240S01X 4.6  
> PQ: 0 ANSI: 2 CCS

That still takes about 1.6 seconds to scan a vacant bus ID. It should be 
more like 0.25 seconds. Did you have the entire patch series applied?

The code presently in mainline spins for 500 iterations in the relevant 
busy-wait loop, with the comment, "8uS a cycle for the cpu access". That 
doesn't help me much. Does that refer to an ISA bus cycle? Or one 
iteration of the loop? Electrical conductance?

If we knew how many loops_per_jiffy your machine gets, and your CONFIG_HZ 
value, we could figure out how long a chip register access takes (which 
might be 8 us).

I'll try to figure out a similar timing for my Domex PCI card to see if 
there is some sort of compromise to be found.

-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 22/71] ncr5380: Eliminate selecting state

2015-11-28 Thread Ondrej Zary
On Thursday 26 November 2015 23:32:46 Finn Thain wrote:
> 
> On Thu, 26 Nov 2015, Ondrej Zary wrote:
> 
> > 
> > [  156.014234] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x0, 
> > n_io_port 0, base 0xd8000, irq 0, can_queue 16, cmd_per_lun 2, sg_tablesize 
> > 128, this_id 7, flags { NO_DMA_FIXUP }, options { AUTOPROBE_IRQ PSEUDO_DMA }
> > [  156.020369] scsi host2: scsi_scan_host_selected: 
> > <4294967295:4294967295:18446744073709551615>
> > [  156.020777] scsi 2:0:0:0: scsi scan: INQUIRY pass 1 length 36
> > [  156.022046] scsi host2: main: dequeued de045600
> > [  156.022109] STATUS_REG: 00
> > BASR: 00
> > ICR: 00
> > MODE: 00
> > [  156.022451] scsi host2: starting arbitration, id = 7
> > [  156.022740] scsi host2: won arbitration
> > [  156.022806] scsi host2: selecting target 0
> > [  171.097450] scsi host2: target did not respond within 250ms
> > [  171.097508] scsi host2: main: select complete
> > [  171.099554] scsi 2:0:0:0: scsi scan: INQUIRY failed with code 0x4
> 
> This is very helpful, thanks.
> 
> The problem has to be this:
> 
> err = NCR5380_poll_politely(instance, STATUS_REG, SR_BSY, SR_BSY,
> msecs_to_jiffies(250));
> 
> which would suggest a problem with the busy wait loop. It seems that 5380 
> register accesses are much more expensive on PC than the platforms I've 
> tested.
> 
> > [  171.103200] scsi 2:0:1:0: scsi scan: INQUIRY pass 1 length 36
> > [  171.105246] scsi host2: main: dequeued df6b9f00
> > [  171.105350] STATUS_REG: 00
> > BASR: 08
> > ICR: 00
> > MODE: 00
> > [  171.105664] scsi host2: starting arbitration, id = 7
> > [  171.105752] scsi host2: won arbitration
> > [  171.105822] scsi host2: selecting target 1
> > [  171.105977] scsi host2: target 1 selected, going into MESSAGE OUT phase.
> > [  171.106042] scsi host2: nexus established.
> > [  171.106093] scsi host2: main: select complete
> > [  171.106145] scsi host2: main: performing information transfer
> > [  171.108531] scsi 2:0:1:0: scsi scan: INQUIRY successful with code 0x0
> > [  171.108639] scsi 2:0:1:0: scsi scan: INQUIRY pass 2 length 130
> > [  171.109401] scsi host2: main: dequeued df6b9f00
> > [  171.109485] STATUS_REG: 00
> > BASR: 08
> > ICR: 00
> > MODE: 00
> > [  171.109779] scsi host2: starting arbitration, id = 7
> > [  171.109878] scsi host2: won arbitration
> > [  171.109954] scsi host2: selecting target 1
> > [  171.110112] scsi host2: target 1 selected, going into MESSAGE OUT phase.
> > [  171.110181] scsi host2: nexus established.
> > [  171.110236] scsi host2: main: select complete
> > [  171.110286] scsi host2: main: performing information transfer
> > [  171.114811] scsi 2:0:1:0: scsi scan: INQUIRY successful with code 0x0
> > [  171.114928] scsi 2:0:1:0: Direct-Access QUANTUM  LP240S GM240S01X 
> > 4.6  PQ: 0 ANSI: 2 CCS
> > [  171.115011] scsi target2:0:1: scsi scan: Sequential scan
> > [  171.115856] scsi 2:0:1:1: scsi scan: INQUIRY pass 1 length 36
> > [  171.116008] scsi host2: main: dequeued df6b9800
> > [  171.116039] STATUS_REG: 00
> > BASR: 08
> > ICR: 00
> > MODE: 00
> > [  171.116736] scsi host2: starting arbitration, id = 7
> > [  171.116843] scsi host2: won arbitration
> > [  171.116898] scsi host2: selecting target 1
> > [  171.117117] scsi host2: target 1 selected, going into MESSAGE OUT phase.
> > [  171.117182] scsi host2: nexus established.
> > [  171.117234] scsi host2: main: select complete
> > [  171.117287] scsi host2: main: performing information transfer
> > [  171.119674] scsi 2:0:1:1: scsi scan: INQUIRY successful with code 0x0
> > [  171.119781] scsi 2:0:1:1: scsi scan: INQUIRY pass 2 length 130
> > [  171.119898] scsi host2: main: dequeued df6b9800
> > [  171.119962] STATUS_REG: 00
> > BASR: 08
> > ICR: 00
> > MODE: 00
> > [  171.121426] scsi host2: starting arbitration, id = 7
> > [  171.121537] scsi host2: won arbitration
> > [  171.121592] scsi host2: selecting target 1
> > [  171.121766] scsi host2: target 1 selected, going into MESSAGE OUT phase.
> > [  171.121834] scsi host2: nexus established.
> > [  171.121889] scsi host2: main: select complete
> > [  171.121940] scsi host2: main: performing information transfer
> > [  171.126361] scsi 2:0:1:1: scsi scan: INQUIRY successful with code 0x0
> > [  171.126470] scsi 2:0:1:1: scsi scan: peripheral qualifier of 3, device 
> > not added
> > [  171.127735] scsi 2:0:2:0: scsi scan: INQUIRY pass 1 length 36
> > [  171.127920] scsi host2: main: dequeued df6b9800
> > [  171.127982] STATUS_REG: 00
> > BASR: 08
> > ICR: 00
> > MODE: 00
> > [  171.128648] scsi host2: starting arbitration, id = 7
> > [  171.128754] scsi host2: won arbitration
> > [  171.128811] scsi host2: selecting target 2
> > [  186.203734] scsi host2: target did not respond within 250ms
> > [  186.203794] scsi host2: main: select complete
> > [  186.205355] scsi 2:0:2:0: scsi scan: INQUIRY failed with code 0x4
> > [  186.208989] scsi 2:0:3:0: scsi scan: INQUIRY pass 1 length 36
> > [  

Re: [PATCH 22/71] ncr5380: Eliminate selecting state

2015-11-28 Thread Ondrej Zary
On Thursday 26 November 2015 23:32:46 Finn Thain wrote:
> 
> On Thu, 26 Nov 2015, Ondrej Zary wrote:
> 
> > 
> > [  156.014234] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x0, 
> > n_io_port 0, base 0xd8000, irq 0, can_queue 16, cmd_per_lun 2, sg_tablesize 
> > 128, this_id 7, flags { NO_DMA_FIXUP }, options { AUTOPROBE_IRQ PSEUDO_DMA }
> > [  156.020369] scsi host2: scsi_scan_host_selected: 
> > <4294967295:4294967295:18446744073709551615>
> > [  156.020777] scsi 2:0:0:0: scsi scan: INQUIRY pass 1 length 36
> > [  156.022046] scsi host2: main: dequeued de045600
> > [  156.022109] STATUS_REG: 00
> > BASR: 00
> > ICR: 00
> > MODE: 00
> > [  156.022451] scsi host2: starting arbitration, id = 7
> > [  156.022740] scsi host2: won arbitration
> > [  156.022806] scsi host2: selecting target 0
> > [  171.097450] scsi host2: target did not respond within 250ms
> > [  171.097508] scsi host2: main: select complete
> > [  171.099554] scsi 2:0:0:0: scsi scan: INQUIRY failed with code 0x4
> 
> This is very helpful, thanks.
> 
> The problem has to be this:
> 
> err = NCR5380_poll_politely(instance, STATUS_REG, SR_BSY, SR_BSY,
> msecs_to_jiffies(250));
> 
> which would suggest a problem with the busy wait loop. It seems that 5380 
> register accesses are much more expensive on PC than the platforms I've 
> tested.
> 
> > [  171.103200] scsi 2:0:1:0: scsi scan: INQUIRY pass 1 length 36
> > [  171.105246] scsi host2: main: dequeued df6b9f00
> > [  171.105350] STATUS_REG: 00
> > BASR: 08
> > ICR: 00
> > MODE: 00
> > [  171.105664] scsi host2: starting arbitration, id = 7
> > [  171.105752] scsi host2: won arbitration
> > [  171.105822] scsi host2: selecting target 1
> > [  171.105977] scsi host2: target 1 selected, going into MESSAGE OUT phase.
> > [  171.106042] scsi host2: nexus established.
> > [  171.106093] scsi host2: main: select complete
> > [  171.106145] scsi host2: main: performing information transfer
> > [  171.108531] scsi 2:0:1:0: scsi scan: INQUIRY successful with code 0x0
> > [  171.108639] scsi 2:0:1:0: scsi scan: INQUIRY pass 2 length 130
> > [  171.109401] scsi host2: main: dequeued df6b9f00
> > [  171.109485] STATUS_REG: 00
> > BASR: 08
> > ICR: 00
> > MODE: 00
> > [  171.109779] scsi host2: starting arbitration, id = 7
> > [  171.109878] scsi host2: won arbitration
> > [  171.109954] scsi host2: selecting target 1
> > [  171.110112] scsi host2: target 1 selected, going into MESSAGE OUT phase.
> > [  171.110181] scsi host2: nexus established.
> > [  171.110236] scsi host2: main: select complete
> > [  171.110286] scsi host2: main: performing information transfer
> > [  171.114811] scsi 2:0:1:0: scsi scan: INQUIRY successful with code 0x0
> > [  171.114928] scsi 2:0:1:0: Direct-Access QUANTUM  LP240S GM240S01X 
> > 4.6  PQ: 0 ANSI: 2 CCS
> > [  171.115011] scsi target2:0:1: scsi scan: Sequential scan
> > [  171.115856] scsi 2:0:1:1: scsi scan: INQUIRY pass 1 length 36
> > [  171.116008] scsi host2: main: dequeued df6b9800
> > [  171.116039] STATUS_REG: 00
> > BASR: 08
> > ICR: 00
> > MODE: 00
> > [  171.116736] scsi host2: starting arbitration, id = 7
> > [  171.116843] scsi host2: won arbitration
> > [  171.116898] scsi host2: selecting target 1
> > [  171.117117] scsi host2: target 1 selected, going into MESSAGE OUT phase.
> > [  171.117182] scsi host2: nexus established.
> > [  171.117234] scsi host2: main: select complete
> > [  171.117287] scsi host2: main: performing information transfer
> > [  171.119674] scsi 2:0:1:1: scsi scan: INQUIRY successful with code 0x0
> > [  171.119781] scsi 2:0:1:1: scsi scan: INQUIRY pass 2 length 130
> > [  171.119898] scsi host2: main: dequeued df6b9800
> > [  171.119962] STATUS_REG: 00
> > BASR: 08
> > ICR: 00
> > MODE: 00
> > [  171.121426] scsi host2: starting arbitration, id = 7
> > [  171.121537] scsi host2: won arbitration
> > [  171.121592] scsi host2: selecting target 1
> > [  171.121766] scsi host2: target 1 selected, going into MESSAGE OUT phase.
> > [  171.121834] scsi host2: nexus established.
> > [  171.121889] scsi host2: main: select complete
> > [  171.121940] scsi host2: main: performing information transfer
> > [  171.126361] scsi 2:0:1:1: scsi scan: INQUIRY successful with code 0x0
> > [  171.126470] scsi 2:0:1:1: scsi scan: peripheral qualifier of 3, device 
> > not added
> > [  171.127735] scsi 2:0:2:0: scsi scan: INQUIRY pass 1 length 36
> > [  171.127920] scsi host2: main: dequeued df6b9800
> > [  171.127982] STATUS_REG: 00
> > BASR: 08
> > ICR: 00
> > MODE: 00
> > [  171.128648] scsi host2: starting arbitration, id = 7
> > [  171.128754] scsi host2: won arbitration
> > [  171.128811] scsi host2: selecting target 2
> > [  186.203734] scsi host2: target did not respond within 250ms
> > [  186.203794] scsi host2: main: select complete
> > [  186.205355] scsi 2:0:2:0: scsi scan: INQUIRY failed with code 0x4
> > [  186.208989] scsi 2:0:3:0: scsi scan: INQUIRY pass 1 length 36
> > [  

Re: [PATCH 22/71] ncr5380: Eliminate selecting state

2015-11-28 Thread Finn Thain

On Sun, 29 Nov 2015, Ondrej Zary wrote:

> > [...] I think that this should solve the problem:
> 
> Yes, it does!
> 
> [   48.119367] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x0, 
> n_io_port 0, base 0xd8000, irq 0, can_queue 16, cmd_per_lun 2, sg_tablesize 
> 128, this_id 7, flags { NO_DMA_FIXUP }, options { AUTOPROBE_IRQ PSEUDO_DMA }
> [   49.715388] scsi 2:0:1:0: Direct-Access QUANTUM  LP240S GM240S01X 4.6  
> PQ: 0 ANSI: 2 CCS

That still takes about 1.6 seconds to scan a vacant bus ID. It should be 
more like 0.25 seconds. Did you have the entire patch series applied?

The code presently in mainline spins for 500 iterations in the relevant 
busy-wait loop, with the comment, "8uS a cycle for the cpu access". That 
doesn't help me much. Does that refer to an ISA bus cycle? Or one 
iteration of the loop? Electrical conductance?

If we knew how many loops_per_jiffy your machine gets, and your CONFIG_HZ 
value, we could figure out how long a chip register access takes (which 
might be 8 us).

I'll try to figure out a similar timing for my Domex PCI card to see if 
there is some sort of compromise to be found.

-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 22/71] ncr5380: Eliminate selecting state

2015-11-26 Thread Finn Thain

On Thu, 26 Nov 2015, Ondrej Zary wrote:

> 
> [  156.014234] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x0, 
> n_io_port 0, base 0xd8000, irq 0, can_queue 16, cmd_per_lun 2, sg_tablesize 
> 128, this_id 7, flags { NO_DMA_FIXUP }, options { AUTOPROBE_IRQ PSEUDO_DMA }
> [  156.020369] scsi host2: scsi_scan_host_selected: 
> <4294967295:4294967295:18446744073709551615>
> [  156.020777] scsi 2:0:0:0: scsi scan: INQUIRY pass 1 length 36
> [  156.022046] scsi host2: main: dequeued de045600
> [  156.022109] STATUS_REG: 00
> BASR: 00
> ICR: 00
> MODE: 00
> [  156.022451] scsi host2: starting arbitration, id = 7
> [  156.022740] scsi host2: won arbitration
> [  156.022806] scsi host2: selecting target 0
> [  171.097450] scsi host2: target did not respond within 250ms
> [  171.097508] scsi host2: main: select complete
> [  171.099554] scsi 2:0:0:0: scsi scan: INQUIRY failed with code 0x4

This is very helpful, thanks.

The problem has to be this:

err = NCR5380_poll_politely(instance, STATUS_REG, SR_BSY, SR_BSY,
msecs_to_jiffies(250));

which would suggest a problem with the busy wait loop. It seems that 5380 
register accesses are much more expensive on PC than the platforms I've 
tested.

> [  171.103200] scsi 2:0:1:0: scsi scan: INQUIRY pass 1 length 36
> [  171.105246] scsi host2: main: dequeued df6b9f00
> [  171.105350] STATUS_REG: 00
> BASR: 08
> ICR: 00
> MODE: 00
> [  171.105664] scsi host2: starting arbitration, id = 7
> [  171.105752] scsi host2: won arbitration
> [  171.105822] scsi host2: selecting target 1
> [  171.105977] scsi host2: target 1 selected, going into MESSAGE OUT phase.
> [  171.106042] scsi host2: nexus established.
> [  171.106093] scsi host2: main: select complete
> [  171.106145] scsi host2: main: performing information transfer
> [  171.108531] scsi 2:0:1:0: scsi scan: INQUIRY successful with code 0x0
> [  171.108639] scsi 2:0:1:0: scsi scan: INQUIRY pass 2 length 130
> [  171.109401] scsi host2: main: dequeued df6b9f00
> [  171.109485] STATUS_REG: 00
> BASR: 08
> ICR: 00
> MODE: 00
> [  171.109779] scsi host2: starting arbitration, id = 7
> [  171.109878] scsi host2: won arbitration
> [  171.109954] scsi host2: selecting target 1
> [  171.110112] scsi host2: target 1 selected, going into MESSAGE OUT phase.
> [  171.110181] scsi host2: nexus established.
> [  171.110236] scsi host2: main: select complete
> [  171.110286] scsi host2: main: performing information transfer
> [  171.114811] scsi 2:0:1:0: scsi scan: INQUIRY successful with code 0x0
> [  171.114928] scsi 2:0:1:0: Direct-Access QUANTUM  LP240S GM240S01X 4.6  
> PQ: 0 ANSI: 2 CCS
> [  171.115011] scsi target2:0:1: scsi scan: Sequential scan
> [  171.115856] scsi 2:0:1:1: scsi scan: INQUIRY pass 1 length 36
> [  171.116008] scsi host2: main: dequeued df6b9800
> [  171.116039] STATUS_REG: 00
> BASR: 08
> ICR: 00
> MODE: 00
> [  171.116736] scsi host2: starting arbitration, id = 7
> [  171.116843] scsi host2: won arbitration
> [  171.116898] scsi host2: selecting target 1
> [  171.117117] scsi host2: target 1 selected, going into MESSAGE OUT phase.
> [  171.117182] scsi host2: nexus established.
> [  171.117234] scsi host2: main: select complete
> [  171.117287] scsi host2: main: performing information transfer
> [  171.119674] scsi 2:0:1:1: scsi scan: INQUIRY successful with code 0x0
> [  171.119781] scsi 2:0:1:1: scsi scan: INQUIRY pass 2 length 130
> [  171.119898] scsi host2: main: dequeued df6b9800
> [  171.119962] STATUS_REG: 00
> BASR: 08
> ICR: 00
> MODE: 00
> [  171.121426] scsi host2: starting arbitration, id = 7
> [  171.121537] scsi host2: won arbitration
> [  171.121592] scsi host2: selecting target 1
> [  171.121766] scsi host2: target 1 selected, going into MESSAGE OUT phase.
> [  171.121834] scsi host2: nexus established.
> [  171.121889] scsi host2: main: select complete
> [  171.121940] scsi host2: main: performing information transfer
> [  171.126361] scsi 2:0:1:1: scsi scan: INQUIRY successful with code 0x0
> [  171.126470] scsi 2:0:1:1: scsi scan: peripheral qualifier of 3, device not 
> added
> [  171.127735] scsi 2:0:2:0: scsi scan: INQUIRY pass 1 length 36
> [  171.127920] scsi host2: main: dequeued df6b9800
> [  171.127982] STATUS_REG: 00
> BASR: 08
> ICR: 00
> MODE: 00
> [  171.128648] scsi host2: starting arbitration, id = 7
> [  171.128754] scsi host2: won arbitration
> [  171.128811] scsi host2: selecting target 2
> [  186.203734] scsi host2: target did not respond within 250ms
> [  186.203794] scsi host2: main: select complete
> [  186.205355] scsi 2:0:2:0: scsi scan: INQUIRY failed with code 0x4
> [  186.208989] scsi 2:0:3:0: scsi scan: INQUIRY pass 1 length 36
> [  186.209178] scsi host2: main: dequeued df6b9f00
> [  186.209242] STATUS_REG: 00
> BASR: 08
> ICR: 00
> MODE: 00
> [  186.209570] scsi host2: starting arbitration, id = 7
> [  186.209660] scsi host2: won arbitration
> [  186.209731] scsi host2: selecting target 3
> [  

Re: [PATCH 22/71] ncr5380: Eliminate selecting state

2015-11-26 Thread Ondrej Zary
On Thursday 26 November 2015 11:23:01 Finn Thain wrote:
> 
> On Wed, 25 Nov 2015, Ondrej Zary wrote:
> 
> > On Wednesday 25 November 2015 04:17:09 Finn Thain wrote:
> > > 
> > > On Tue, 24 Nov 2015, Ondrej Zary wrote:
> > > 
> > > > 
> > > > Bisecting slow module initialization pointed to this commit.
> > > > 
> > > > Before this commit (2 seconds):
> > > > [   60.317374] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x0, 
> > > > n_io_port 0, base 0xd8000, irq 0, can_queue 16, cmd_per_lun 2, 
> > > > sg_tablesize 128, this_id 7, flags { NCR53C400 }, USLEEP_POLL 3, 
> > > > USLEEP_SLEEP 50, options { AUTOPROBE_IRQ PSEUDO_DMA }
> > > > [   60.780715] scsi 2:0:1:0: Direct-Access QUANTUM  LP240S 
> > > > GM240S01X 4.6  PQ: 0 ANSI: 2 CCS
> > > > [   62.606260] sd 2:0:1:0: Attached scsi generic sg1 type 0
> > > > 
> > > > 
> > > > After this commit (22 seconds):
> > > > [  137.511711] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x0, 
> > > > n_io_port 0, base 0xd8000, irq 0, can_queue 16, cmd_per_lun 2, 
> > > > sg_tablesize 128, this_id 7, flags { NCR53C400 }, USLEEP_POLL 3, 
> > > > USLEEP_SLEEP 50, options { AUTOPROBE_IRQ PSEUDO_DMA }
> > > > [  145.028532] clocksource: timekeeping watchdog: Marking clocksource 
> > > > 'tsc' as unstable because the skew is too large:
> > > > [  145.029767] clocksource:   'acpi_pm' wd_now: 
> > > > a49738 wd_last: f4da04 mask: ff
> > > > [  145.029828] clocksource:   'tsc' cs_now: 
> > > > 2ea624698e cs_last: 2c710aa17f mask: 
> > > > [  145.032733] clocksource: Switched to clocksource acpi_pm
> > > > [  145.236951] scsi 2:0:1:0: Direct-Access QUANTUM  LP240S 
> > > > GM240S01X 4.6  PQ: 0 ANSI: 2 CCS
> > > > [  159.959308] sd 2:0:1:0: Attached scsi generic sg1 type 0
> > > > 
> > > > 
> > > 
> > > This problem doesn't show up on my hardware, and I'd like to know where 
> > > those 22 seconds are being spent. Would you please apply the entire 
> > > series 
> > > and add,
> > > #define NDEBUG (NDEBUG_ARBITRATION | NDEBUG_SELECTION | NDEBUG_MAIN)
> > > to the top of g_NCR5380.c and send me the messages logged during modprobe?
> > 
> > [  397.014581] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x0, 
> > n_io_port 0, base 0xd8000, irq 0, can_queue 16, cmd_per_lun 2, sg_tablesize 
> > 128, this_id 7, flags { NO_DMA_FIXUP }, options { AUTOPROBE_IRQ PSEUDO_DMA }
> > [  412.099695] STATUS_REG: 00
> > BASR: 08
> > ICR: 00
> > MODE: 00
> > [  412.103625] STATUS_REG: 00
> > BASR: 08
> > ICR: 00
> > MODE: 00
> > [  412.110503] scsi 2:0:1:0: Direct-Access QUANTUM  LP240S GM240S01X 
> > 4.6  PQ: 0 ANSI: 2 CCS
> > [  412.110892] STATUS_REG: 00
> > BASR: 08
> > ICR: 00
> > MODE: 00
> > [  412.114154] STATUS_REG: 00
> > BASR: 08
> > ICR: 00
> > MODE: 00
> > [  412.119733] STATUS_REG: 00
> > BASR: 08
> > ICR: 00
> > MODE: 00
> > [  427.198108] STATUS_REG: 00
> > BASR: 08
> > ICR: 00
> > MODE: 00
> > [  442.276586] STATUS_REG: 00
> > BASR: 08
> > ICR: 00
> > MODE: 00
> > [  457.354592] STATUS_REG: 00
> > BASR: 08
> > ICR: 00
> > MODE: 00
> > [  472.432999] STATUS_REG: 00
> > BASR: 08
> > ICR: 00
> > MODE: 00
> > [  487.513027] sd 2:0:1:0: Attached scsi generic sg1 type 0
> > 
> > 
> 
> I see that the delay increased from 22 seconds to 90 seconds. Perhaps that 
> is because the remaining patches were applied.

The remaining patches were not applied. The delay varies, probably depending
on the state of the external drive.

> Anyway, it seems to confirm that this delay comes from scanning the six 
> unused SCSI bus IDs. Each one takes 15 seconds. Could be a five second 
> timeout and three retries. It's hard to tell because the dsprintk() 
> logging didn't show up. Can you try modprobe after,
> # echo 1 > /sys/module/printk/parameters/ignore_loglevel
> And maybe also,
> # scsi_logging_level -sS7

[  156.014234] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x0, 
n_io_port 0, base 0xd8000, irq 0, can_queue 16, cmd_per_lun 2, sg_tablesize 
128, this_id 7, flags { NO_DMA_FIXUP }, options { AUTOPROBE_IRQ PSEUDO_DMA }
[  156.020369] scsi host2: scsi_scan_host_selected: 
<4294967295:4294967295:18446744073709551615>
[  156.020777] scsi 2:0:0:0: scsi scan: INQUIRY pass 1 length 36
[  156.022046] scsi host2: main: dequeued de045600
[  156.022109] STATUS_REG: 00
BASR: 00
ICR: 00
MODE: 00
[  156.022451] scsi host2: starting arbitration, id = 7
[  156.022740] scsi host2: won arbitration
[  156.022806] scsi host2: selecting target 0
[  171.097450] scsi host2: target did not respond within 250ms
[  171.097508] scsi host2: main: select complete
[  171.099554] scsi 2:0:0:0: scsi scan: INQUIRY failed with code 0x4
[  171.103200] scsi 2:0:1:0: scsi scan: INQUIRY pass 1 length 36
[  171.105246] scsi host2: main: dequeued df6b9f00
[  171.105350] STATUS_REG: 00
BASR: 08
ICR: 00
MODE: 00
[  171.105664] scsi host2: starting arbitration, id = 7
[  171.105752] scsi host2: won arbitration
[  171.105822] scsi host2: 

Re: [PATCH 22/71] ncr5380: Eliminate selecting state

2015-11-26 Thread Finn Thain

On Wed, 25 Nov 2015, Ondrej Zary wrote:

> On Wednesday 25 November 2015 04:17:09 Finn Thain wrote:
> > 
> > On Tue, 24 Nov 2015, Ondrej Zary wrote:
> > 
> > > 
> > > Bisecting slow module initialization pointed to this commit.
> > > 
> > > Before this commit (2 seconds):
> > > [   60.317374] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x0, 
> > > n_io_port 0, base 0xd8000, irq 0, can_queue 16, cmd_per_lun 2, 
> > > sg_tablesize 128, this_id 7, flags { NCR53C400 }, USLEEP_POLL 3, 
> > > USLEEP_SLEEP 50, options { AUTOPROBE_IRQ PSEUDO_DMA }
> > > [   60.780715] scsi 2:0:1:0: Direct-Access QUANTUM  LP240S GM240S01X 
> > > 4.6  PQ: 0 ANSI: 2 CCS
> > > [   62.606260] sd 2:0:1:0: Attached scsi generic sg1 type 0
> > > 
> > > 
> > > After this commit (22 seconds):
> > > [  137.511711] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x0, 
> > > n_io_port 0, base 0xd8000, irq 0, can_queue 16, cmd_per_lun 2, 
> > > sg_tablesize 128, this_id 7, flags { NCR53C400 }, USLEEP_POLL 3, 
> > > USLEEP_SLEEP 50, options { AUTOPROBE_IRQ PSEUDO_DMA }
> > > [  145.028532] clocksource: timekeeping watchdog: Marking clocksource 
> > > 'tsc' as unstable because the skew is too large:
> > > [  145.029767] clocksource:   'acpi_pm' wd_now: 
> > > a49738 wd_last: f4da04 mask: ff
> > > [  145.029828] clocksource:   'tsc' cs_now: 
> > > 2ea624698e cs_last: 2c710aa17f mask: 
> > > [  145.032733] clocksource: Switched to clocksource acpi_pm
> > > [  145.236951] scsi 2:0:1:0: Direct-Access QUANTUM  LP240S GM240S01X 
> > > 4.6  PQ: 0 ANSI: 2 CCS
> > > [  159.959308] sd 2:0:1:0: Attached scsi generic sg1 type 0
> > > 
> > > 
> > 
> > This problem doesn't show up on my hardware, and I'd like to know where 
> > those 22 seconds are being spent. Would you please apply the entire series 
> > and add,
> > #define NDEBUG (NDEBUG_ARBITRATION | NDEBUG_SELECTION | NDEBUG_MAIN)
> > to the top of g_NCR5380.c and send me the messages logged during modprobe?
> 
> [  397.014581] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x0, 
> n_io_port 0, base 0xd8000, irq 0, can_queue 16, cmd_per_lun 2, sg_tablesize 
> 128, this_id 7, flags { NO_DMA_FIXUP }, options { AUTOPROBE_IRQ PSEUDO_DMA }
> [  412.099695] STATUS_REG: 00
> BASR: 08
> ICR: 00
> MODE: 00
> [  412.103625] STATUS_REG: 00
> BASR: 08
> ICR: 00
> MODE: 00
> [  412.110503] scsi 2:0:1:0: Direct-Access QUANTUM  LP240S GM240S01X 4.6  
> PQ: 0 ANSI: 2 CCS
> [  412.110892] STATUS_REG: 00
> BASR: 08
> ICR: 00
> MODE: 00
> [  412.114154] STATUS_REG: 00
> BASR: 08
> ICR: 00
> MODE: 00
> [  412.119733] STATUS_REG: 00
> BASR: 08
> ICR: 00
> MODE: 00
> [  427.198108] STATUS_REG: 00
> BASR: 08
> ICR: 00
> MODE: 00
> [  442.276586] STATUS_REG: 00
> BASR: 08
> ICR: 00
> MODE: 00
> [  457.354592] STATUS_REG: 00
> BASR: 08
> ICR: 00
> MODE: 00
> [  472.432999] STATUS_REG: 00
> BASR: 08
> ICR: 00
> MODE: 00
> [  487.513027] sd 2:0:1:0: Attached scsi generic sg1 type 0
> 
> 

I see that the delay increased from 22 seconds to 90 seconds. Perhaps that 
is because the remaining patches were applied.

Anyway, it seems to confirm that this delay comes from scanning the six 
unused SCSI bus IDs. Each one takes 15 seconds. Could be a five second 
timeout and three retries. It's hard to tell because the dsprintk() 
logging didn't show up. Can you try modprobe after,
# echo 1 > /sys/module/printk/parameters/ignore_loglevel
And maybe also,
# scsi_logging_level -sS7

It seems like the 53C400 chip doesn't arbitrate like a 53C80, but that 
doesn't make sense given it contains a 53C80 logic block. Perhaps some 
extra initialization is needed. Which card(s) are affected?

Aside from trying to figure out why the chip behaves oddly, we could also 
try to figure out whether the driver and its main loop is causing problems 
for other tasks i.e. timekeeping watchdog. Does this patch make any 
difference?

Index: linux/drivers/scsi/atari_NCR5380.c
===
--- linux.orig/drivers/scsi/atari_NCR5380.c 2015-11-26 15:42:01.0 
+1100
+++ linux/drivers/scsi/atari_NCR5380.c  2015-11-26 15:42:04.0 +1100
@@ -925,6 +925,7 @@ static void NCR5380_main(struct work_str
container_of(work, struct NCR5380_hostdata, main_task);
struct Scsi_Host *instance = hostdata->host;
struct scsi_cmnd *cmd;
+   unsigned long deadline = jiffies + msecs_to_jiffies(10);
int done;
 
/*
@@ -984,9 +985,12 @@ static void NCR5380_main(struct work_str
done = 0;
}
spin_unlock_irq(>lock);
-   if (!done)
-   cond_resched();
-   } while (!done);
+   if (done || time_is_before_jiffies(deadline))
+   break;
+   cond_resched();
+   } while (1);
+   if (!done)
+   queue_work(hostdata->work_q, >main_task);
 }
 
 

Re: [PATCH 22/71] ncr5380: Eliminate selecting state

2015-11-26 Thread Finn Thain

On Thu, 26 Nov 2015, Ondrej Zary wrote:

> 
> [  156.014234] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x0, 
> n_io_port 0, base 0xd8000, irq 0, can_queue 16, cmd_per_lun 2, sg_tablesize 
> 128, this_id 7, flags { NO_DMA_FIXUP }, options { AUTOPROBE_IRQ PSEUDO_DMA }
> [  156.020369] scsi host2: scsi_scan_host_selected: 
> <4294967295:4294967295:18446744073709551615>
> [  156.020777] scsi 2:0:0:0: scsi scan: INQUIRY pass 1 length 36
> [  156.022046] scsi host2: main: dequeued de045600
> [  156.022109] STATUS_REG: 00
> BASR: 00
> ICR: 00
> MODE: 00
> [  156.022451] scsi host2: starting arbitration, id = 7
> [  156.022740] scsi host2: won arbitration
> [  156.022806] scsi host2: selecting target 0
> [  171.097450] scsi host2: target did not respond within 250ms
> [  171.097508] scsi host2: main: select complete
> [  171.099554] scsi 2:0:0:0: scsi scan: INQUIRY failed with code 0x4

This is very helpful, thanks.

The problem has to be this:

err = NCR5380_poll_politely(instance, STATUS_REG, SR_BSY, SR_BSY,
msecs_to_jiffies(250));

which would suggest a problem with the busy wait loop. It seems that 5380 
register accesses are much more expensive on PC than the platforms I've 
tested.

> [  171.103200] scsi 2:0:1:0: scsi scan: INQUIRY pass 1 length 36
> [  171.105246] scsi host2: main: dequeued df6b9f00
> [  171.105350] STATUS_REG: 00
> BASR: 08
> ICR: 00
> MODE: 00
> [  171.105664] scsi host2: starting arbitration, id = 7
> [  171.105752] scsi host2: won arbitration
> [  171.105822] scsi host2: selecting target 1
> [  171.105977] scsi host2: target 1 selected, going into MESSAGE OUT phase.
> [  171.106042] scsi host2: nexus established.
> [  171.106093] scsi host2: main: select complete
> [  171.106145] scsi host2: main: performing information transfer
> [  171.108531] scsi 2:0:1:0: scsi scan: INQUIRY successful with code 0x0
> [  171.108639] scsi 2:0:1:0: scsi scan: INQUIRY pass 2 length 130
> [  171.109401] scsi host2: main: dequeued df6b9f00
> [  171.109485] STATUS_REG: 00
> BASR: 08
> ICR: 00
> MODE: 00
> [  171.109779] scsi host2: starting arbitration, id = 7
> [  171.109878] scsi host2: won arbitration
> [  171.109954] scsi host2: selecting target 1
> [  171.110112] scsi host2: target 1 selected, going into MESSAGE OUT phase.
> [  171.110181] scsi host2: nexus established.
> [  171.110236] scsi host2: main: select complete
> [  171.110286] scsi host2: main: performing information transfer
> [  171.114811] scsi 2:0:1:0: scsi scan: INQUIRY successful with code 0x0
> [  171.114928] scsi 2:0:1:0: Direct-Access QUANTUM  LP240S GM240S01X 4.6  
> PQ: 0 ANSI: 2 CCS
> [  171.115011] scsi target2:0:1: scsi scan: Sequential scan
> [  171.115856] scsi 2:0:1:1: scsi scan: INQUIRY pass 1 length 36
> [  171.116008] scsi host2: main: dequeued df6b9800
> [  171.116039] STATUS_REG: 00
> BASR: 08
> ICR: 00
> MODE: 00
> [  171.116736] scsi host2: starting arbitration, id = 7
> [  171.116843] scsi host2: won arbitration
> [  171.116898] scsi host2: selecting target 1
> [  171.117117] scsi host2: target 1 selected, going into MESSAGE OUT phase.
> [  171.117182] scsi host2: nexus established.
> [  171.117234] scsi host2: main: select complete
> [  171.117287] scsi host2: main: performing information transfer
> [  171.119674] scsi 2:0:1:1: scsi scan: INQUIRY successful with code 0x0
> [  171.119781] scsi 2:0:1:1: scsi scan: INQUIRY pass 2 length 130
> [  171.119898] scsi host2: main: dequeued df6b9800
> [  171.119962] STATUS_REG: 00
> BASR: 08
> ICR: 00
> MODE: 00
> [  171.121426] scsi host2: starting arbitration, id = 7
> [  171.121537] scsi host2: won arbitration
> [  171.121592] scsi host2: selecting target 1
> [  171.121766] scsi host2: target 1 selected, going into MESSAGE OUT phase.
> [  171.121834] scsi host2: nexus established.
> [  171.121889] scsi host2: main: select complete
> [  171.121940] scsi host2: main: performing information transfer
> [  171.126361] scsi 2:0:1:1: scsi scan: INQUIRY successful with code 0x0
> [  171.126470] scsi 2:0:1:1: scsi scan: peripheral qualifier of 3, device not 
> added
> [  171.127735] scsi 2:0:2:0: scsi scan: INQUIRY pass 1 length 36
> [  171.127920] scsi host2: main: dequeued df6b9800
> [  171.127982] STATUS_REG: 00
> BASR: 08
> ICR: 00
> MODE: 00
> [  171.128648] scsi host2: starting arbitration, id = 7
> [  171.128754] scsi host2: won arbitration
> [  171.128811] scsi host2: selecting target 2
> [  186.203734] scsi host2: target did not respond within 250ms
> [  186.203794] scsi host2: main: select complete
> [  186.205355] scsi 2:0:2:0: scsi scan: INQUIRY failed with code 0x4
> [  186.208989] scsi 2:0:3:0: scsi scan: INQUIRY pass 1 length 36
> [  186.209178] scsi host2: main: dequeued df6b9f00
> [  186.209242] STATUS_REG: 00
> BASR: 08
> ICR: 00
> MODE: 00
> [  186.209570] scsi host2: starting arbitration, id = 7
> [  186.209660] scsi host2: won arbitration
> [  186.209731] scsi host2: selecting target 3
> [  

Re: [PATCH 22/71] ncr5380: Eliminate selecting state

2015-11-26 Thread Finn Thain

On Wed, 25 Nov 2015, Ondrej Zary wrote:

> On Wednesday 25 November 2015 04:17:09 Finn Thain wrote:
> > 
> > On Tue, 24 Nov 2015, Ondrej Zary wrote:
> > 
> > > 
> > > Bisecting slow module initialization pointed to this commit.
> > > 
> > > Before this commit (2 seconds):
> > > [   60.317374] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x0, 
> > > n_io_port 0, base 0xd8000, irq 0, can_queue 16, cmd_per_lun 2, 
> > > sg_tablesize 128, this_id 7, flags { NCR53C400 }, USLEEP_POLL 3, 
> > > USLEEP_SLEEP 50, options { AUTOPROBE_IRQ PSEUDO_DMA }
> > > [   60.780715] scsi 2:0:1:0: Direct-Access QUANTUM  LP240S GM240S01X 
> > > 4.6  PQ: 0 ANSI: 2 CCS
> > > [   62.606260] sd 2:0:1:0: Attached scsi generic sg1 type 0
> > > 
> > > 
> > > After this commit (22 seconds):
> > > [  137.511711] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x0, 
> > > n_io_port 0, base 0xd8000, irq 0, can_queue 16, cmd_per_lun 2, 
> > > sg_tablesize 128, this_id 7, flags { NCR53C400 }, USLEEP_POLL 3, 
> > > USLEEP_SLEEP 50, options { AUTOPROBE_IRQ PSEUDO_DMA }
> > > [  145.028532] clocksource: timekeeping watchdog: Marking clocksource 
> > > 'tsc' as unstable because the skew is too large:
> > > [  145.029767] clocksource:   'acpi_pm' wd_now: 
> > > a49738 wd_last: f4da04 mask: ff
> > > [  145.029828] clocksource:   'tsc' cs_now: 
> > > 2ea624698e cs_last: 2c710aa17f mask: 
> > > [  145.032733] clocksource: Switched to clocksource acpi_pm
> > > [  145.236951] scsi 2:0:1:0: Direct-Access QUANTUM  LP240S GM240S01X 
> > > 4.6  PQ: 0 ANSI: 2 CCS
> > > [  159.959308] sd 2:0:1:0: Attached scsi generic sg1 type 0
> > > 
> > > 
> > 
> > This problem doesn't show up on my hardware, and I'd like to know where 
> > those 22 seconds are being spent. Would you please apply the entire series 
> > and add,
> > #define NDEBUG (NDEBUG_ARBITRATION | NDEBUG_SELECTION | NDEBUG_MAIN)
> > to the top of g_NCR5380.c and send me the messages logged during modprobe?
> 
> [  397.014581] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x0, 
> n_io_port 0, base 0xd8000, irq 0, can_queue 16, cmd_per_lun 2, sg_tablesize 
> 128, this_id 7, flags { NO_DMA_FIXUP }, options { AUTOPROBE_IRQ PSEUDO_DMA }
> [  412.099695] STATUS_REG: 00
> BASR: 08
> ICR: 00
> MODE: 00
> [  412.103625] STATUS_REG: 00
> BASR: 08
> ICR: 00
> MODE: 00
> [  412.110503] scsi 2:0:1:0: Direct-Access QUANTUM  LP240S GM240S01X 4.6  
> PQ: 0 ANSI: 2 CCS
> [  412.110892] STATUS_REG: 00
> BASR: 08
> ICR: 00
> MODE: 00
> [  412.114154] STATUS_REG: 00
> BASR: 08
> ICR: 00
> MODE: 00
> [  412.119733] STATUS_REG: 00
> BASR: 08
> ICR: 00
> MODE: 00
> [  427.198108] STATUS_REG: 00
> BASR: 08
> ICR: 00
> MODE: 00
> [  442.276586] STATUS_REG: 00
> BASR: 08
> ICR: 00
> MODE: 00
> [  457.354592] STATUS_REG: 00
> BASR: 08
> ICR: 00
> MODE: 00
> [  472.432999] STATUS_REG: 00
> BASR: 08
> ICR: 00
> MODE: 00
> [  487.513027] sd 2:0:1:0: Attached scsi generic sg1 type 0
> 
> 

I see that the delay increased from 22 seconds to 90 seconds. Perhaps that 
is because the remaining patches were applied.

Anyway, it seems to confirm that this delay comes from scanning the six 
unused SCSI bus IDs. Each one takes 15 seconds. Could be a five second 
timeout and three retries. It's hard to tell because the dsprintk() 
logging didn't show up. Can you try modprobe after,
# echo 1 > /sys/module/printk/parameters/ignore_loglevel
And maybe also,
# scsi_logging_level -sS7

It seems like the 53C400 chip doesn't arbitrate like a 53C80, but that 
doesn't make sense given it contains a 53C80 logic block. Perhaps some 
extra initialization is needed. Which card(s) are affected?

Aside from trying to figure out why the chip behaves oddly, we could also 
try to figure out whether the driver and its main loop is causing problems 
for other tasks i.e. timekeeping watchdog. Does this patch make any 
difference?

Index: linux/drivers/scsi/atari_NCR5380.c
===
--- linux.orig/drivers/scsi/atari_NCR5380.c 2015-11-26 15:42:01.0 
+1100
+++ linux/drivers/scsi/atari_NCR5380.c  2015-11-26 15:42:04.0 +1100
@@ -925,6 +925,7 @@ static void NCR5380_main(struct work_str
container_of(work, struct NCR5380_hostdata, main_task);
struct Scsi_Host *instance = hostdata->host;
struct scsi_cmnd *cmd;
+   unsigned long deadline = jiffies + msecs_to_jiffies(10);
int done;
 
/*
@@ -984,9 +985,12 @@ static void NCR5380_main(struct work_str
done = 0;
}
spin_unlock_irq(>lock);
-   if (!done)
-   cond_resched();
-   } while (!done);
+   if (done || time_is_before_jiffies(deadline))
+   break;
+   cond_resched();
+   } while (1);
+   if (!done)
+   queue_work(hostdata->work_q, >main_task);
 }
 
 

Re: [PATCH 22/71] ncr5380: Eliminate selecting state

2015-11-26 Thread Ondrej Zary
On Thursday 26 November 2015 11:23:01 Finn Thain wrote:
> 
> On Wed, 25 Nov 2015, Ondrej Zary wrote:
> 
> > On Wednesday 25 November 2015 04:17:09 Finn Thain wrote:
> > > 
> > > On Tue, 24 Nov 2015, Ondrej Zary wrote:
> > > 
> > > > 
> > > > Bisecting slow module initialization pointed to this commit.
> > > > 
> > > > Before this commit (2 seconds):
> > > > [   60.317374] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x0, 
> > > > n_io_port 0, base 0xd8000, irq 0, can_queue 16, cmd_per_lun 2, 
> > > > sg_tablesize 128, this_id 7, flags { NCR53C400 }, USLEEP_POLL 3, 
> > > > USLEEP_SLEEP 50, options { AUTOPROBE_IRQ PSEUDO_DMA }
> > > > [   60.780715] scsi 2:0:1:0: Direct-Access QUANTUM  LP240S 
> > > > GM240S01X 4.6  PQ: 0 ANSI: 2 CCS
> > > > [   62.606260] sd 2:0:1:0: Attached scsi generic sg1 type 0
> > > > 
> > > > 
> > > > After this commit (22 seconds):
> > > > [  137.511711] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x0, 
> > > > n_io_port 0, base 0xd8000, irq 0, can_queue 16, cmd_per_lun 2, 
> > > > sg_tablesize 128, this_id 7, flags { NCR53C400 }, USLEEP_POLL 3, 
> > > > USLEEP_SLEEP 50, options { AUTOPROBE_IRQ PSEUDO_DMA }
> > > > [  145.028532] clocksource: timekeeping watchdog: Marking clocksource 
> > > > 'tsc' as unstable because the skew is too large:
> > > > [  145.029767] clocksource:   'acpi_pm' wd_now: 
> > > > a49738 wd_last: f4da04 mask: ff
> > > > [  145.029828] clocksource:   'tsc' cs_now: 
> > > > 2ea624698e cs_last: 2c710aa17f mask: 
> > > > [  145.032733] clocksource: Switched to clocksource acpi_pm
> > > > [  145.236951] scsi 2:0:1:0: Direct-Access QUANTUM  LP240S 
> > > > GM240S01X 4.6  PQ: 0 ANSI: 2 CCS
> > > > [  159.959308] sd 2:0:1:0: Attached scsi generic sg1 type 0
> > > > 
> > > > 
> > > 
> > > This problem doesn't show up on my hardware, and I'd like to know where 
> > > those 22 seconds are being spent. Would you please apply the entire 
> > > series 
> > > and add,
> > > #define NDEBUG (NDEBUG_ARBITRATION | NDEBUG_SELECTION | NDEBUG_MAIN)
> > > to the top of g_NCR5380.c and send me the messages logged during modprobe?
> > 
> > [  397.014581] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x0, 
> > n_io_port 0, base 0xd8000, irq 0, can_queue 16, cmd_per_lun 2, sg_tablesize 
> > 128, this_id 7, flags { NO_DMA_FIXUP }, options { AUTOPROBE_IRQ PSEUDO_DMA }
> > [  412.099695] STATUS_REG: 00
> > BASR: 08
> > ICR: 00
> > MODE: 00
> > [  412.103625] STATUS_REG: 00
> > BASR: 08
> > ICR: 00
> > MODE: 00
> > [  412.110503] scsi 2:0:1:0: Direct-Access QUANTUM  LP240S GM240S01X 
> > 4.6  PQ: 0 ANSI: 2 CCS
> > [  412.110892] STATUS_REG: 00
> > BASR: 08
> > ICR: 00
> > MODE: 00
> > [  412.114154] STATUS_REG: 00
> > BASR: 08
> > ICR: 00
> > MODE: 00
> > [  412.119733] STATUS_REG: 00
> > BASR: 08
> > ICR: 00
> > MODE: 00
> > [  427.198108] STATUS_REG: 00
> > BASR: 08
> > ICR: 00
> > MODE: 00
> > [  442.276586] STATUS_REG: 00
> > BASR: 08
> > ICR: 00
> > MODE: 00
> > [  457.354592] STATUS_REG: 00
> > BASR: 08
> > ICR: 00
> > MODE: 00
> > [  472.432999] STATUS_REG: 00
> > BASR: 08
> > ICR: 00
> > MODE: 00
> > [  487.513027] sd 2:0:1:0: Attached scsi generic sg1 type 0
> > 
> > 
> 
> I see that the delay increased from 22 seconds to 90 seconds. Perhaps that 
> is because the remaining patches were applied.

The remaining patches were not applied. The delay varies, probably depending
on the state of the external drive.

> Anyway, it seems to confirm that this delay comes from scanning the six 
> unused SCSI bus IDs. Each one takes 15 seconds. Could be a five second 
> timeout and three retries. It's hard to tell because the dsprintk() 
> logging didn't show up. Can you try modprobe after,
> # echo 1 > /sys/module/printk/parameters/ignore_loglevel
> And maybe also,
> # scsi_logging_level -sS7

[  156.014234] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x0, 
n_io_port 0, base 0xd8000, irq 0, can_queue 16, cmd_per_lun 2, sg_tablesize 
128, this_id 7, flags { NO_DMA_FIXUP }, options { AUTOPROBE_IRQ PSEUDO_DMA }
[  156.020369] scsi host2: scsi_scan_host_selected: 
<4294967295:4294967295:18446744073709551615>
[  156.020777] scsi 2:0:0:0: scsi scan: INQUIRY pass 1 length 36
[  156.022046] scsi host2: main: dequeued de045600
[  156.022109] STATUS_REG: 00
BASR: 00
ICR: 00
MODE: 00
[  156.022451] scsi host2: starting arbitration, id = 7
[  156.022740] scsi host2: won arbitration
[  156.022806] scsi host2: selecting target 0
[  171.097450] scsi host2: target did not respond within 250ms
[  171.097508] scsi host2: main: select complete
[  171.099554] scsi 2:0:0:0: scsi scan: INQUIRY failed with code 0x4
[  171.103200] scsi 2:0:1:0: scsi scan: INQUIRY pass 1 length 36
[  171.105246] scsi host2: main: dequeued df6b9f00
[  171.105350] STATUS_REG: 00
BASR: 08
ICR: 00
MODE: 00
[  171.105664] scsi host2: starting arbitration, id = 7
[  171.105752] scsi host2: won arbitration
[  171.105822] scsi host2: 

Re: [PATCH 22/71] ncr5380: Eliminate selecting state

2015-11-25 Thread Ondrej Zary
On Wednesday 25 November 2015 04:17:09 Finn Thain wrote:
> 
> On Tue, 24 Nov 2015, Ondrej Zary wrote:
> 
> > On Wednesday 18 November 2015 09:35:17 Finn Thain wrote:
> > > Linux v2.1.105 changed the algorithm for polling for the BSY signal
> > > in NCR5380_select() and NCR5380_main().
> > > 
> > > Presently, this code has a bug. Back then, NCR5380_set_timer(hostdata, 1)
> > > meant reschedule main() after sleeping for 10 ms. Repeated 25 times this
> > > provided the recommended 250 ms selection time-out delay. This got broken
> > > when HZ became configurable.
> > > 
> > > We could fix this but there's no need to reschedule the main loop. This
> > > BSY polling presently happens when the NCR5380_main() work queue item
> > > calls NCR5380_select(), which in turn schedules NCR5380_main(), which
> > > calls NCR5380_select() again, and so on.
> > > 
> > > This algorithm is a deviation from the simpler one in atari_NCR5380.c.
> > > The extra complexity and state is pointless. There's no reason to
> > > stop selection half-way and return to to the main loop when the main
> > > loop can do nothing useful until selection completes.
> > > 
> > > So just poll for BSY. We can sleep while polling now that we have a
> > > suitable workqueue.
> > 
> > Bisecting slow module initialization pointed to this commit.
> 
> That's disappointing. This patch removed some nasty code. Anyway, thanks 
> for taking the trouble to bisect.
> 
> > 
> > Before this commit (2 seconds):
> > [   60.317374] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x0, 
> > n_io_port 0, base 0xd8000, irq 0, can_queue 16, cmd_per_lun 2, sg_tablesize 
> > 128, this_id 7, flags { NCR53C400 }, USLEEP_POLL 3, USLEEP_SLEEP 50, 
> > options { AUTOPROBE_IRQ PSEUDO_DMA }
> > [   60.780715] scsi 2:0:1:0: Direct-Access QUANTUM  LP240S GM240S01X 
> > 4.6  PQ: 0 ANSI: 2 CCS
> > [   62.606260] sd 2:0:1:0: Attached scsi generic sg1 type 0
> > 
> > 
> > After this commit (22 seconds):
> > [  137.511711] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x0, 
> > n_io_port 0, base 0xd8000, irq 0, can_queue 16, cmd_per_lun 2, sg_tablesize 
> > 128, this_id 7, flags { NCR53C400 }, USLEEP_POLL 3, USLEEP_SLEEP 50, 
> > options { AUTOPROBE_IRQ PSEUDO_DMA }
> > [  145.028532] clocksource: timekeeping watchdog: Marking clocksource 'tsc' 
> > as unstable because the skew is too large:
> > [  145.029767] clocksource:   'acpi_pm' wd_now: a49738 
> > wd_last: f4da04 mask: ff
> > [  145.029828] clocksource:   'tsc' cs_now: 2ea624698e 
> > cs_last: 2c710aa17f mask: 
> > [  145.032733] clocksource: Switched to clocksource acpi_pm
> 
> I figured that it was okay to sleep from an unbound CPU-intensive 
> workqueue but doing so seems to cause problems. (See also patch 66/71 
> "Fix soft lockups".)
> 
> Perhaps a kthread is needed instead of a workqueue? (This workqueue 
> already has it's own kthread, but top shows that it doesn't accrue CPU 
> time.)
> 
> > [  145.236951] scsi 2:0:1:0: Direct-Access QUANTUM  LP240S GM240S01X 
> > 4.6  PQ: 0 ANSI: 2 CCS
> > [  159.959308] sd 2:0:1:0: Attached scsi generic sg1 type 0
> > 
> > 
> 
> This problem doesn't show up on my hardware, and I'd like to know where 
> those 22 seconds are being spent. Would you please apply the entire series 
> and add,
> #define NDEBUG (NDEBUG_ARBITRATION | NDEBUG_SELECTION | NDEBUG_MAIN)
> to the top of g_NCR5380.c and send me the messages logged during modprobe?

 [  397.014581] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x0, 
n_io_port 0, base 0xd8000, irq 0, can_queue 16, cmd_per_lun 2, sg_tablesize 
128, this_id 7, flags { NO_DMA_FIXUP }, options { AUTOPROBE_IRQ PSEUDO_DMA }
[  412.099695] STATUS_REG: 00
BASR: 08
ICR: 00
MODE: 00
[  412.103625] STATUS_REG: 00
BASR: 08
ICR: 00
MODE: 00
[  412.110503] scsi 2:0:1:0: Direct-Access QUANTUM  LP240S GM240S01X 4.6  
PQ: 0 ANSI: 2 CCS
[  412.110892] STATUS_REG: 00
BASR: 08
ICR: 00
MODE: 00
[  412.114154] STATUS_REG: 00
BASR: 08
ICR: 00
MODE: 00
[  412.119733] STATUS_REG: 00
BASR: 08
ICR: 00
MODE: 00
[  427.198108] STATUS_REG: 00
BASR: 08
ICR: 00
MODE: 00
[  442.276586] STATUS_REG: 00
BASR: 08
ICR: 00
MODE: 00
[  457.354592] STATUS_REG: 00
BASR: 08
ICR: 00
MODE: 00
[  472.432999] STATUS_REG: 00
BASR: 08
ICR: 00
MODE: 00
[  487.513027] sd 2:0:1:0: Attached scsi generic sg1 type 0



-- 
Ondrej Zary
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 22/71] ncr5380: Eliminate selecting state

2015-11-25 Thread Ondrej Zary
On Wednesday 25 November 2015 04:17:09 Finn Thain wrote:
> 
> On Tue, 24 Nov 2015, Ondrej Zary wrote:
> 
> > On Wednesday 18 November 2015 09:35:17 Finn Thain wrote:
> > > Linux v2.1.105 changed the algorithm for polling for the BSY signal
> > > in NCR5380_select() and NCR5380_main().
> > > 
> > > Presently, this code has a bug. Back then, NCR5380_set_timer(hostdata, 1)
> > > meant reschedule main() after sleeping for 10 ms. Repeated 25 times this
> > > provided the recommended 250 ms selection time-out delay. This got broken
> > > when HZ became configurable.
> > > 
> > > We could fix this but there's no need to reschedule the main loop. This
> > > BSY polling presently happens when the NCR5380_main() work queue item
> > > calls NCR5380_select(), which in turn schedules NCR5380_main(), which
> > > calls NCR5380_select() again, and so on.
> > > 
> > > This algorithm is a deviation from the simpler one in atari_NCR5380.c.
> > > The extra complexity and state is pointless. There's no reason to
> > > stop selection half-way and return to to the main loop when the main
> > > loop can do nothing useful until selection completes.
> > > 
> > > So just poll for BSY. We can sleep while polling now that we have a
> > > suitable workqueue.
> > 
> > Bisecting slow module initialization pointed to this commit.
> 
> That's disappointing. This patch removed some nasty code. Anyway, thanks 
> for taking the trouble to bisect.
> 
> > 
> > Before this commit (2 seconds):
> > [   60.317374] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x0, 
> > n_io_port 0, base 0xd8000, irq 0, can_queue 16, cmd_per_lun 2, sg_tablesize 
> > 128, this_id 7, flags { NCR53C400 }, USLEEP_POLL 3, USLEEP_SLEEP 50, 
> > options { AUTOPROBE_IRQ PSEUDO_DMA }
> > [   60.780715] scsi 2:0:1:0: Direct-Access QUANTUM  LP240S GM240S01X 
> > 4.6  PQ: 0 ANSI: 2 CCS
> > [   62.606260] sd 2:0:1:0: Attached scsi generic sg1 type 0
> > 
> > 
> > After this commit (22 seconds):
> > [  137.511711] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x0, 
> > n_io_port 0, base 0xd8000, irq 0, can_queue 16, cmd_per_lun 2, sg_tablesize 
> > 128, this_id 7, flags { NCR53C400 }, USLEEP_POLL 3, USLEEP_SLEEP 50, 
> > options { AUTOPROBE_IRQ PSEUDO_DMA }
> > [  145.028532] clocksource: timekeeping watchdog: Marking clocksource 'tsc' 
> > as unstable because the skew is too large:
> > [  145.029767] clocksource:   'acpi_pm' wd_now: a49738 
> > wd_last: f4da04 mask: ff
> > [  145.029828] clocksource:   'tsc' cs_now: 2ea624698e 
> > cs_last: 2c710aa17f mask: 
> > [  145.032733] clocksource: Switched to clocksource acpi_pm
> 
> I figured that it was okay to sleep from an unbound CPU-intensive 
> workqueue but doing so seems to cause problems. (See also patch 66/71 
> "Fix soft lockups".)
> 
> Perhaps a kthread is needed instead of a workqueue? (This workqueue 
> already has it's own kthread, but top shows that it doesn't accrue CPU 
> time.)
> 
> > [  145.236951] scsi 2:0:1:0: Direct-Access QUANTUM  LP240S GM240S01X 
> > 4.6  PQ: 0 ANSI: 2 CCS
> > [  159.959308] sd 2:0:1:0: Attached scsi generic sg1 type 0
> > 
> > 
> 
> This problem doesn't show up on my hardware, and I'd like to know where 
> those 22 seconds are being spent. Would you please apply the entire series 
> and add,
> #define NDEBUG (NDEBUG_ARBITRATION | NDEBUG_SELECTION | NDEBUG_MAIN)
> to the top of g_NCR5380.c and send me the messages logged during modprobe?

 [  397.014581] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x0, 
n_io_port 0, base 0xd8000, irq 0, can_queue 16, cmd_per_lun 2, sg_tablesize 
128, this_id 7, flags { NO_DMA_FIXUP }, options { AUTOPROBE_IRQ PSEUDO_DMA }
[  412.099695] STATUS_REG: 00
BASR: 08
ICR: 00
MODE: 00
[  412.103625] STATUS_REG: 00
BASR: 08
ICR: 00
MODE: 00
[  412.110503] scsi 2:0:1:0: Direct-Access QUANTUM  LP240S GM240S01X 4.6  
PQ: 0 ANSI: 2 CCS
[  412.110892] STATUS_REG: 00
BASR: 08
ICR: 00
MODE: 00
[  412.114154] STATUS_REG: 00
BASR: 08
ICR: 00
MODE: 00
[  412.119733] STATUS_REG: 00
BASR: 08
ICR: 00
MODE: 00
[  427.198108] STATUS_REG: 00
BASR: 08
ICR: 00
MODE: 00
[  442.276586] STATUS_REG: 00
BASR: 08
ICR: 00
MODE: 00
[  457.354592] STATUS_REG: 00
BASR: 08
ICR: 00
MODE: 00
[  472.432999] STATUS_REG: 00
BASR: 08
ICR: 00
MODE: 00
[  487.513027] sd 2:0:1:0: Attached scsi generic sg1 type 0



-- 
Ondrej Zary
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 22/71] ncr5380: Eliminate selecting state

2015-11-24 Thread Finn Thain

On Tue, 24 Nov 2015, Ondrej Zary wrote:

> On Wednesday 18 November 2015 09:35:17 Finn Thain wrote:
> > Linux v2.1.105 changed the algorithm for polling for the BSY signal
> > in NCR5380_select() and NCR5380_main().
> > 
> > Presently, this code has a bug. Back then, NCR5380_set_timer(hostdata, 1)
> > meant reschedule main() after sleeping for 10 ms. Repeated 25 times this
> > provided the recommended 250 ms selection time-out delay. This got broken
> > when HZ became configurable.
> > 
> > We could fix this but there's no need to reschedule the main loop. This
> > BSY polling presently happens when the NCR5380_main() work queue item
> > calls NCR5380_select(), which in turn schedules NCR5380_main(), which
> > calls NCR5380_select() again, and so on.
> > 
> > This algorithm is a deviation from the simpler one in atari_NCR5380.c.
> > The extra complexity and state is pointless. There's no reason to
> > stop selection half-way and return to to the main loop when the main
> > loop can do nothing useful until selection completes.
> > 
> > So just poll for BSY. We can sleep while polling now that we have a
> > suitable workqueue.
> 
> Bisecting slow module initialization pointed to this commit.

That's disappointing. This patch removed some nasty code. Anyway, thanks 
for taking the trouble to bisect.

> 
> Before this commit (2 seconds):
> [   60.317374] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x0, 
> n_io_port 0, base 0xd8000, irq 0, can_queue 16, cmd_per_lun 2, sg_tablesize 
> 128, this_id 7, flags { NCR53C400 }, USLEEP_POLL 3, USLEEP_SLEEP 50, options 
> { AUTOPROBE_IRQ PSEUDO_DMA }
> [   60.780715] scsi 2:0:1:0: Direct-Access QUANTUM  LP240S GM240S01X 4.6  
> PQ: 0 ANSI: 2 CCS
> [   62.606260] sd 2:0:1:0: Attached scsi generic sg1 type 0
> 
> 
> After this commit (22 seconds):
> [  137.511711] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x0, 
> n_io_port 0, base 0xd8000, irq 0, can_queue 16, cmd_per_lun 2, sg_tablesize 
> 128, this_id 7, flags { NCR53C400 }, USLEEP_POLL 3, USLEEP_SLEEP 50, options 
> { AUTOPROBE_IRQ PSEUDO_DMA }
> [  145.028532] clocksource: timekeeping watchdog: Marking clocksource 'tsc' 
> as unstable because the skew is too large:
> [  145.029767] clocksource:   'acpi_pm' wd_now: a49738 
> wd_last: f4da04 mask: ff
> [  145.029828] clocksource:   'tsc' cs_now: 2ea624698e 
> cs_last: 2c710aa17f mask: 
> [  145.032733] clocksource: Switched to clocksource acpi_pm

I figured that it was okay to sleep from an unbound CPU-intensive 
workqueue but doing so seems to cause problems. (See also patch 66/71 
"Fix soft lockups".)

Perhaps a kthread is needed instead of a workqueue? (This workqueue 
already has it's own kthread, but top shows that it doesn't accrue CPU 
time.)

> [  145.236951] scsi 2:0:1:0: Direct-Access QUANTUM  LP240S GM240S01X 4.6  
> PQ: 0 ANSI: 2 CCS
> [  159.959308] sd 2:0:1:0: Attached scsi generic sg1 type 0
> 
> 

This problem doesn't show up on my hardware, and I'd like to know where 
those 22 seconds are being spent. Would you please apply the entire series 
and add,
#define NDEBUG (NDEBUG_ARBITRATION | NDEBUG_SELECTION | NDEBUG_MAIN)
to the top of g_NCR5380.c and send me the messages logged during modprobe?

-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 22/71] ncr5380: Eliminate selecting state

2015-11-24 Thread Ondrej Zary
On Wednesday 18 November 2015 09:35:17 Finn Thain wrote:
> Linux v2.1.105 changed the algorithm for polling for the BSY signal
> in NCR5380_select() and NCR5380_main().
> 
> Presently, this code has a bug. Back then, NCR5380_set_timer(hostdata, 1)
> meant reschedule main() after sleeping for 10 ms. Repeated 25 times this
> provided the recommended 250 ms selection time-out delay. This got broken
> when HZ became configurable.
> 
> We could fix this but there's no need to reschedule the main loop. This
> BSY polling presently happens when the NCR5380_main() work queue item
> calls NCR5380_select(), which in turn schedules NCR5380_main(), which
> calls NCR5380_select() again, and so on.
> 
> This algorithm is a deviation from the simpler one in atari_NCR5380.c.
> The extra complexity and state is pointless. There's no reason to
> stop selection half-way and return to to the main loop when the main
> loop can do nothing useful until selection completes.
> 
> So just poll for BSY. We can sleep while polling now that we have a
> suitable workqueue.

Bisecting slow module initialization pointed to this commit.

Before this commit (2 seconds):
[   60.317374] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x0, 
n_io_port 0, base 0xd8000, irq 0, can_queue 16, cmd_per_lun 2, sg_tablesize 
128, this_id 7, flags { NCR53C400 }, USLEEP_POLL 3, USLEEP_SLEEP 50, options { 
AUTOPROBE_IRQ PSEUDO_DMA }
[   60.780715] scsi 2:0:1:0: Direct-Access QUANTUM  LP240S GM240S01X 4.6  
PQ: 0 ANSI: 2 CCS
[   62.606260] sd 2:0:1:0: Attached scsi generic sg1 type 0


After this commit (22 seconds):
[  137.511711] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x0, 
n_io_port 0, base 0xd8000, irq 0, can_queue 16, cmd_per_lun 2, sg_tablesize 
128, this_id 7, flags { NCR53C400 }, USLEEP_POLL 3, USLEEP_SLEEP 50, options { 
AUTOPROBE_IRQ PSEUDO_DMA }
[  145.028532] clocksource: timekeeping watchdog: Marking clocksource 'tsc' as 
unstable because the skew is too large:
[  145.029767] clocksource:   'acpi_pm' wd_now: a49738 
wd_last: f4da04 mask: ff
[  145.029828] clocksource:   'tsc' cs_now: 2ea624698e 
cs_last: 2c710aa17f mask: 
[  145.032733] clocksource: Switched to clocksource acpi_pm
[  145.236951] scsi 2:0:1:0: Direct-Access QUANTUM  LP240S GM240S01X 4.6  
PQ: 0 ANSI: 2 CCS
[  159.959308] sd 2:0:1:0: Attached scsi generic sg1 type 0


-- 
Ondrej Zary
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 22/71] ncr5380: Eliminate selecting state

2015-11-24 Thread Ondrej Zary
On Wednesday 18 November 2015 09:35:17 Finn Thain wrote:
> Linux v2.1.105 changed the algorithm for polling for the BSY signal
> in NCR5380_select() and NCR5380_main().
> 
> Presently, this code has a bug. Back then, NCR5380_set_timer(hostdata, 1)
> meant reschedule main() after sleeping for 10 ms. Repeated 25 times this
> provided the recommended 250 ms selection time-out delay. This got broken
> when HZ became configurable.
> 
> We could fix this but there's no need to reschedule the main loop. This
> BSY polling presently happens when the NCR5380_main() work queue item
> calls NCR5380_select(), which in turn schedules NCR5380_main(), which
> calls NCR5380_select() again, and so on.
> 
> This algorithm is a deviation from the simpler one in atari_NCR5380.c.
> The extra complexity and state is pointless. There's no reason to
> stop selection half-way and return to to the main loop when the main
> loop can do nothing useful until selection completes.
> 
> So just poll for BSY. We can sleep while polling now that we have a
> suitable workqueue.

Bisecting slow module initialization pointed to this commit.

Before this commit (2 seconds):
[   60.317374] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x0, 
n_io_port 0, base 0xd8000, irq 0, can_queue 16, cmd_per_lun 2, sg_tablesize 
128, this_id 7, flags { NCR53C400 }, USLEEP_POLL 3, USLEEP_SLEEP 50, options { 
AUTOPROBE_IRQ PSEUDO_DMA }
[   60.780715] scsi 2:0:1:0: Direct-Access QUANTUM  LP240S GM240S01X 4.6  
PQ: 0 ANSI: 2 CCS
[   62.606260] sd 2:0:1:0: Attached scsi generic sg1 type 0


After this commit (22 seconds):
[  137.511711] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x0, 
n_io_port 0, base 0xd8000, irq 0, can_queue 16, cmd_per_lun 2, sg_tablesize 
128, this_id 7, flags { NCR53C400 }, USLEEP_POLL 3, USLEEP_SLEEP 50, options { 
AUTOPROBE_IRQ PSEUDO_DMA }
[  145.028532] clocksource: timekeeping watchdog: Marking clocksource 'tsc' as 
unstable because the skew is too large:
[  145.029767] clocksource:   'acpi_pm' wd_now: a49738 
wd_last: f4da04 mask: ff
[  145.029828] clocksource:   'tsc' cs_now: 2ea624698e 
cs_last: 2c710aa17f mask: 
[  145.032733] clocksource: Switched to clocksource acpi_pm
[  145.236951] scsi 2:0:1:0: Direct-Access QUANTUM  LP240S GM240S01X 4.6  
PQ: 0 ANSI: 2 CCS
[  159.959308] sd 2:0:1:0: Attached scsi generic sg1 type 0


-- 
Ondrej Zary
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 22/71] ncr5380: Eliminate selecting state

2015-11-24 Thread Finn Thain

On Tue, 24 Nov 2015, Ondrej Zary wrote:

> On Wednesday 18 November 2015 09:35:17 Finn Thain wrote:
> > Linux v2.1.105 changed the algorithm for polling for the BSY signal
> > in NCR5380_select() and NCR5380_main().
> > 
> > Presently, this code has a bug. Back then, NCR5380_set_timer(hostdata, 1)
> > meant reschedule main() after sleeping for 10 ms. Repeated 25 times this
> > provided the recommended 250 ms selection time-out delay. This got broken
> > when HZ became configurable.
> > 
> > We could fix this but there's no need to reschedule the main loop. This
> > BSY polling presently happens when the NCR5380_main() work queue item
> > calls NCR5380_select(), which in turn schedules NCR5380_main(), which
> > calls NCR5380_select() again, and so on.
> > 
> > This algorithm is a deviation from the simpler one in atari_NCR5380.c.
> > The extra complexity and state is pointless. There's no reason to
> > stop selection half-way and return to to the main loop when the main
> > loop can do nothing useful until selection completes.
> > 
> > So just poll for BSY. We can sleep while polling now that we have a
> > suitable workqueue.
> 
> Bisecting slow module initialization pointed to this commit.

That's disappointing. This patch removed some nasty code. Anyway, thanks 
for taking the trouble to bisect.

> 
> Before this commit (2 seconds):
> [   60.317374] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x0, 
> n_io_port 0, base 0xd8000, irq 0, can_queue 16, cmd_per_lun 2, sg_tablesize 
> 128, this_id 7, flags { NCR53C400 }, USLEEP_POLL 3, USLEEP_SLEEP 50, options 
> { AUTOPROBE_IRQ PSEUDO_DMA }
> [   60.780715] scsi 2:0:1:0: Direct-Access QUANTUM  LP240S GM240S01X 4.6  
> PQ: 0 ANSI: 2 CCS
> [   62.606260] sd 2:0:1:0: Attached scsi generic sg1 type 0
> 
> 
> After this commit (22 seconds):
> [  137.511711] scsi host2: Generic NCR5380/NCR53C400 SCSI, io_port 0x0, 
> n_io_port 0, base 0xd8000, irq 0, can_queue 16, cmd_per_lun 2, sg_tablesize 
> 128, this_id 7, flags { NCR53C400 }, USLEEP_POLL 3, USLEEP_SLEEP 50, options 
> { AUTOPROBE_IRQ PSEUDO_DMA }
> [  145.028532] clocksource: timekeeping watchdog: Marking clocksource 'tsc' 
> as unstable because the skew is too large:
> [  145.029767] clocksource:   'acpi_pm' wd_now: a49738 
> wd_last: f4da04 mask: ff
> [  145.029828] clocksource:   'tsc' cs_now: 2ea624698e 
> cs_last: 2c710aa17f mask: 
> [  145.032733] clocksource: Switched to clocksource acpi_pm

I figured that it was okay to sleep from an unbound CPU-intensive 
workqueue but doing so seems to cause problems. (See also patch 66/71 
"Fix soft lockups".)

Perhaps a kthread is needed instead of a workqueue? (This workqueue 
already has it's own kthread, but top shows that it doesn't accrue CPU 
time.)

> [  145.236951] scsi 2:0:1:0: Direct-Access QUANTUM  LP240S GM240S01X 4.6  
> PQ: 0 ANSI: 2 CCS
> [  159.959308] sd 2:0:1:0: Attached scsi generic sg1 type 0
> 
> 

This problem doesn't show up on my hardware, and I'd like to know where 
those 22 seconds are being spent. Would you please apply the entire series 
and add,
#define NDEBUG (NDEBUG_ARBITRATION | NDEBUG_SELECTION | NDEBUG_MAIN)
to the top of g_NCR5380.c and send me the messages logged during modprobe?

-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 22/71] ncr5380: Eliminate selecting state

2015-11-18 Thread Finn Thain
Linux v2.1.105 changed the algorithm for polling for the BSY signal
in NCR5380_select() and NCR5380_main().

Presently, this code has a bug. Back then, NCR5380_set_timer(hostdata, 1)
meant reschedule main() after sleeping for 10 ms. Repeated 25 times this
provided the recommended 250 ms selection time-out delay. This got broken
when HZ became configurable.

We could fix this but there's no need to reschedule the main loop. This
BSY polling presently happens when the NCR5380_main() work queue item
calls NCR5380_select(), which in turn schedules NCR5380_main(), which
calls NCR5380_select() again, and so on.

This algorithm is a deviation from the simpler one in atari_NCR5380.c.
The extra complexity and state is pointless. There's no reason to
stop selection half-way and return to to the main loop when the main
loop can do nothing useful until selection completes.

So just poll for BSY. We can sleep while polling now that we have a
suitable workqueue.

Signed-off-by: Finn Thain 

---
 drivers/scsi/NCR5380.c   |   74 ---
 drivers/scsi/NCR5380.h   |2 -
 drivers/scsi/atari_NCR5380.c |   49 
 3 files changed, 29 insertions(+), 96 deletions(-)

Index: linux/drivers/scsi/NCR5380.c
===
--- linux.orig/drivers/scsi/NCR5380.c   2015-11-18 19:33:31.0 +1100
+++ linux/drivers/scsi/NCR5380.c2015-11-18 19:33:33.0 +1100
@@ -981,7 +981,7 @@ static void NCR5380_main(struct work_str
do {
/* Lock held here */
done = 1;
-   if (!hostdata->connected && !hostdata->selecting) {
+   if (!hostdata->connected) {
dprintk(NDEBUG_MAIN, "scsi%d : not connected\n", 
instance->host_no);
/*
 * Search through the issue_queue for a command destined
@@ -1011,9 +1011,6 @@ static void NCR5380_main(struct work_str
 */
dprintk(NDEBUG_MAIN|NDEBUG_QUEUES, 
"scsi%d : main() : command for target %d lun %llu removed from issue_queue\n", 
instance->host_no, tmp->device->id, tmp->device->lun);

-   hostdata->selecting = NULL;
-   /* RvC: have to preset this to indicate 
a new command is being performed */
-
/*
 * REQUEST SENSE commands are issued 
without tagged
 * queueing, even on SCSI-II devices 
because the
@@ -1031,26 +1028,13 @@ static void NCR5380_main(struct work_str
done = 0;

dprintk(NDEBUG_MAIN|NDEBUG_QUEUES, "scsi%d : main(): select() failed, returned 
to issue_queue\n", instance->host_no);
}
-   if (hostdata->connected ||
-   hostdata->selecting)
+   if (hostdata->connected)
break;
/* lock held here still */
}   /* if target/lun is not busy */
}   /* for */
/* exited locked */
}   /* if (!hostdata->connected) */
-   if (hostdata->selecting) {
-   tmp = (struct scsi_cmnd *) hostdata->selecting;
-   /* Selection will drop and retake the lock */
-   if (!NCR5380_select(instance, tmp)) {
-   /* OK or bad target */
-   } else {
-   LIST(tmp, hostdata->issue_queue);
-   tmp->host_scribble = (unsigned char *) 
hostdata->issue_queue;
-   hostdata->issue_queue = tmp;
-   done = 0;
-   }
-   }   /* if hostdata->selecting */
if (hostdata->connected
 #ifdef REAL_DMA
&& !hostdata->dmalen
@@ -1169,7 +1153,6 @@ static irqreturn_t NCR5380_intr(int dumm
  * Returns : -1 if selection failed but should be retried.
  *  0 if selection failed and should not be retried.
  *  0 if selection succeeded completely (hostdata->connected == cmd).
- *  0 if selection in progress (hostdata->selecting == cmd).
  *
  * Side effects : 
  *  If bus busy, arbitration failed, etc, NCR5380_select() will exit 
@@ -1193,13 +1176,8 @@ static int NCR5380_select(struct Scsi_Ho
unsigned char tmp[3], phase;
unsigned char *data;
int len;
-   unsigned long timeout;
-   unsigned char value;
int err;
 
-   if 

[PATCH 22/71] ncr5380: Eliminate selecting state

2015-11-18 Thread Finn Thain
Linux v2.1.105 changed the algorithm for polling for the BSY signal
in NCR5380_select() and NCR5380_main().

Presently, this code has a bug. Back then, NCR5380_set_timer(hostdata, 1)
meant reschedule main() after sleeping for 10 ms. Repeated 25 times this
provided the recommended 250 ms selection time-out delay. This got broken
when HZ became configurable.

We could fix this but there's no need to reschedule the main loop. This
BSY polling presently happens when the NCR5380_main() work queue item
calls NCR5380_select(), which in turn schedules NCR5380_main(), which
calls NCR5380_select() again, and so on.

This algorithm is a deviation from the simpler one in atari_NCR5380.c.
The extra complexity and state is pointless. There's no reason to
stop selection half-way and return to to the main loop when the main
loop can do nothing useful until selection completes.

So just poll for BSY. We can sleep while polling now that we have a
suitable workqueue.

Signed-off-by: Finn Thain 

---
 drivers/scsi/NCR5380.c   |   74 ---
 drivers/scsi/NCR5380.h   |2 -
 drivers/scsi/atari_NCR5380.c |   49 
 3 files changed, 29 insertions(+), 96 deletions(-)

Index: linux/drivers/scsi/NCR5380.c
===
--- linux.orig/drivers/scsi/NCR5380.c   2015-11-18 19:33:31.0 +1100
+++ linux/drivers/scsi/NCR5380.c2015-11-18 19:33:33.0 +1100
@@ -981,7 +981,7 @@ static void NCR5380_main(struct work_str
do {
/* Lock held here */
done = 1;
-   if (!hostdata->connected && !hostdata->selecting) {
+   if (!hostdata->connected) {
dprintk(NDEBUG_MAIN, "scsi%d : not connected\n", 
instance->host_no);
/*
 * Search through the issue_queue for a command destined
@@ -1011,9 +1011,6 @@ static void NCR5380_main(struct work_str
 */
dprintk(NDEBUG_MAIN|NDEBUG_QUEUES, 
"scsi%d : main() : command for target %d lun %llu removed from issue_queue\n", 
instance->host_no, tmp->device->id, tmp->device->lun);

-   hostdata->selecting = NULL;
-   /* RvC: have to preset this to indicate 
a new command is being performed */
-
/*
 * REQUEST SENSE commands are issued 
without tagged
 * queueing, even on SCSI-II devices 
because the
@@ -1031,26 +1028,13 @@ static void NCR5380_main(struct work_str
done = 0;

dprintk(NDEBUG_MAIN|NDEBUG_QUEUES, "scsi%d : main(): select() failed, returned 
to issue_queue\n", instance->host_no);
}
-   if (hostdata->connected ||
-   hostdata->selecting)
+   if (hostdata->connected)
break;
/* lock held here still */
}   /* if target/lun is not busy */
}   /* for */
/* exited locked */
}   /* if (!hostdata->connected) */
-   if (hostdata->selecting) {
-   tmp = (struct scsi_cmnd *) hostdata->selecting;
-   /* Selection will drop and retake the lock */
-   if (!NCR5380_select(instance, tmp)) {
-   /* OK or bad target */
-   } else {
-   LIST(tmp, hostdata->issue_queue);
-   tmp->host_scribble = (unsigned char *) 
hostdata->issue_queue;
-   hostdata->issue_queue = tmp;
-   done = 0;
-   }
-   }   /* if hostdata->selecting */
if (hostdata->connected
 #ifdef REAL_DMA
&& !hostdata->dmalen
@@ -1169,7 +1153,6 @@ static irqreturn_t NCR5380_intr(int dumm
  * Returns : -1 if selection failed but should be retried.
  *  0 if selection failed and should not be retried.
  *  0 if selection succeeded completely (hostdata->connected == cmd).
- *  0 if selection in progress (hostdata->selecting == cmd).
  *
  * Side effects : 
  *  If bus busy, arbitration failed, etc, NCR5380_select() will exit 
@@ -1193,13 +1176,8 @@ static int NCR5380_select(struct Scsi_Ho
unsigned char tmp[3], phase;
unsigned char *data;
int len;
-   unsigned long timeout;
-   unsigned char value;
int err;