Re: ohci1394 broke 2.6.19 -> 2.6.20-rc1

2007-02-05 Thread Robert Crocombe

On 2/5/07, Stefan Richter <[EMAIL PROTECTED]> wrote:

It's my oversight, see patch.


Yes, this fixes things.  Thanks!

--
Robert Crocombe
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


ohci1394 broke 2.6.19 -> 2.6.20-rc1

2007-02-05 Thread Robert Crocombe

Prior to testing a patch for bugzilla bug 7569 (hosts lost on bus
reset), I wanted to reproduce the behavior.  I can under the noted
2.6.16-blah kernels, but moving to anything more recent than 2.6.19
means ohci1394 is non-functional (no 1394 hosts are detected) and the
module cannot be removed.

I have narrowed it down to 2.6.19 works, 2.6.20-rc1 doesn't.  Lots of detail at:

http://bugzilla.kernel.org/show_bug.cgi?id=7942

--
Robert Crocombe


Re: In-tree version of new FireWire drivers available

2007-01-26 Thread Robert Crocombe

On 1/25/07, Pieter Palmers <[EMAIL PROTECTED]> wrote:

I'd like to make one note here:
We should have a way to use smaller DMA buffers than one page size. If I
remember correctly, the page size on my system is 4096 bytes, being 1024
quadlets. If we assume a 4 channel audio stream, this corresponds to 256
audio samples. This means that the controller generates an interrupt
every 256 samples, making that we can achieve a latency of 512 samples
at best. This is unacceptable in a pro-audio environment.

The current stack exhibits this problem, and I solve it by recalculating
the max packet size, based upon the stream composition (i.e. expected
packet size) and the requested audio buffer size, such that the
interrupts are generated at a high enough frequency.

I'm not a kernel hacker, but when looking through the code I had the
impression that smaller DMA buffers were possible (aren't smaller
buffers used in packet-per-buffer mode?).
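
The buffer-size arithmetic in the note above can be sketched as follows. This is only an illustration: the helper names are hypothetical, and it assumes each audio sample occupies one 4-byte quadlet.

```c
#include <assert.h>

/* Hypothetical helper: audio frames (one sample per channel) that fit in
 * one DMA buffer, assuming each sample occupies one 4-byte quadlet. */
unsigned frames_per_buffer(unsigned buffer_bytes, unsigned channels)
{
    assert(channels > 0);
    return buffer_bytes / 4u / channels;
}

/* Time covered by a number of frames, in milliseconds. */
double frames_to_ms(unsigned frames, double sample_rate_hz)
{
    assert(sample_rate_hz > 0.0);
    return 1000.0 * frames / sample_rate_hz;
}
```

frames_per_buffer(4096, 4) gives the 256 samples mentioned above. At an assumed 48 kHz sample rate (the thread does not state one), the 512-sample figure works out to roughly 10.7 ms.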


I am using isochronous receive in RAW1394_DMA_PACKET_PER_BUFFER mode
because I am closing a simulation loop around the data that is
received/transmitted.  Just for giggles I cranked up a test
isochronous stream from a bus analyzer at 1kB per packet at 8kHz at
the S400 rate (i.e., one packet on each cycle start: 8MBps), set the
machine up to listen, and was able to maintain 8kHz interrupts at ~12%
CPU utilization on a 2.8GHz Opteron.

  1744719 interrupts in 218.112 seconds is 7999.193 ints/sec
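
The figures above can be cross-checked with a small sketch. The helper names are hypothetical, and it assumes "1kB" means 1000 bytes with one packet per 125 us cycle (8000 cycles per second).

```c
#include <assert.h>

/* Hypothetical helpers, not part of any 1394 library. */

/* Payload bandwidth in bytes/s when one packet is sent per cycle. */
double iso_bandwidth_bytes_per_sec(double packet_bytes, double cycles_per_sec)
{
    assert(packet_bytes >= 0.0 && cycles_per_sec > 0.0);
    return packet_bytes * cycles_per_sec;
}

/* Average interrupt rate from an interrupt count and elapsed time. */
double interrupts_per_sec(unsigned long count, double seconds)
{
    assert(seconds > 0.0);
    return (double)count / seconds;
}
```

With 1000-byte packets at 8000 cycles/s the first helper gives 8,000,000 bytes/s, and 1744719 interrupts over 218.112 s comes out just under 8000 per second, in line with the measured rate.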

I wasn't doing anything with the data for this test, but I have had
the aforementioned sim running steady at a somewhat lower rate.  This
test ran under 2.6.20-rc5-rt10, but the more "productiony" system is
on 2.6.16-rt29.

So hopefully you can get markedly lower latencies.  Myself, I'm
tickled pink by the performance that can be achieved.

--
Robert Crocombe


Re: 2.6.19.1-rt15: BUG in __tasklet_action at kernel/softirq.c:568

2006-12-20 Thread Robert Crocombe

On 12/19/06, Ingo Molnar <[EMAIL PROTECTED]> wrote:

yeah. This is something that triggers very rarely on certain boxes. Not
fixed yet, and it's been around for some time.


Is there anything you would like me to do to help diagnose this?

--
Robert Crocombe
[EMAIL PROTECTED]


2.6.19.1-rt15: BUG in __tasklet_action at kernel/softirq.c:568

2006-12-18 Thread Robert Crocombe

Almost exactly 24 hours after booting 2.6.19.1-rt15, I encountered the
following:

softirq-tasklet/49[CPU#3]: BUG in __tasklet_action at kernel/softirq.c:568

Call Trace:
[8027aca3] __WARN_ON+0x5c/0x74
[8027c6e6] __tasklet_action+0xae/0xf2
[8027cccd] ksoftirqd+0xfc/0x198
[8027cbd1] ksoftirqd+0x0/0x198
[8022ed5b] kthread+0xd1/0x101
[80257bb8] child_rip+0xa/0x12
[8022ec8a] kthread+0x0/0x101
[80257bae] child_rip+0x0/0x12

softirq-tasklet/36[CPU#2]: BUG in __tasklet_action at kernel/softirq.c:568

Call Trace:
[8027aca3] __WARN_ON+0x5c/0x74
[8027c6e6] __tasklet_action+0xae/0xf2
[8027cccd] ksoftirqd+0xfc/0x198
[8027cbd1] ksoftirqd+0x0/0x198
[8022ed5b] kthread+0xd1/0x101
[80257bb8] child_rip+0xa/0x12
[8022ec8a] kthread+0x0/0x101
[80257bae] child_rip+0x0/0x12

softirq-tasklet/49[CPU#3]: BUG in __tasklet_action at kernel/softirq.c:568

Call Trace:
[8027aca3] __WARN_ON+0x5c/0x74
[8027c6e6] __tasklet_action+0xae/0xf2
[8027cccd] ksoftirqd+0xfc/0x198
[8027cbd1] ksoftirqd+0x0/0x198
[8022ed5b] kthread+0xd1/0x101
[80257bb8] child_rip+0xa/0x12
[8022ec8a] kthread+0x0/0x101
[80257bae] child_rip+0x0/0x12

softirq-tasklet/49[CPU#3]: BUG in __tasklet_action at kernel/softirq.c:568

Call Trace:
[8027aca3] __WARN_ON+0x5c/0x74
[8027c6e6] __tasklet_action+0xae/0xf2
[8027cccd] ksoftirqd+0xfc/0x198
[8027cbd1] ksoftirqd+0x0/0x198
[8022ed5b] kthread+0xd1/0x101
[80257bb8] child_rip+0xa/0x12
[8022ec8a] kthread+0x0/0x101
[80257bae] child_rip+0x0/0x12

I had set the machine to do 1,000 kernel compiles the day before, but
it might have been finished by then (the BUG triggered on a Saturday).
I did this because it was kernel compiles that previously triggered a
hard lockup on -rt kernels.  The machine seems to still be usable, and
the compiles all completed.

The referenced line is:
   /*
    * After this point on the tasklet might be rescheduled
    * on another CPU, but it can only be added to another
    * CPU's tasklet list if we unlock the tasklet (which we
    * dont do yet).
    */
   if (!test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
           WARN_ON(1);

This is a quad Opteron.  Config attached.
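
For readers unfamiliar with the check, the warning fires when the SCHED bit is found already clear at the moment the tasklet is about to run. Below is a rough userspace sketch of what test_and_clear_bit() does, using C11 atomics rather than the kernel's implementation; demo_test_and_clear_bit is a hypothetical stand-in name.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's test_and_clear_bit(): atomically
 * clear bit nr in *word and report whether it was set beforehand. */
bool demo_test_and_clear_bit(unsigned nr, atomic_ulong *word)
{
    unsigned long mask = 1UL << nr;
    unsigned long old = atomic_fetch_and(word, ~mask);
    return (old & mask) != 0;
}
```

The first call on a word with the bit set returns true and clears it. A second call returns false, which is the condition that would reach WARN_ON(1) in the code above.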

--
Robert Crocombe


config_2.6.19.1-rt15
Description: Binary data


Re: realtime-preempt and arm

2006-12-15 Thread Robert Crocombe

[EMAIL PROTECTED]:~$ uname -r
2.6.19.1-rt15_00

And I'm totally thrilled since this is the first -rt kernel that I've
tried and been able to boot since .16-rt29.  Yay!

[EMAIL PROTECTED]:~$ zcat /proc/config.gz | egrep "HZ.*=y"
CONFIG_HZ_1000=y

100 revs; min: 5008 max: 5034 avg: 5015
100 revs; min: 5008 max: 5023 avg: 5010
100 revs; min: 5008 max: 5015 avg: 5009
100 revs; min: 5008 max: 5018 avg: 5009
100 revs; min: 5008 max: 5017 avg: 5009
100 revs; min: 5008 max: 5015 avg: 5009
100 revs; min: 5008 max: 5016 avg: 5009
100 revs; min: 5008 max: 5017 avg: 5009
100 revs; min: 5008 max: 5014 avg: 5009
100 revs; min: 5008 max: 5016 avg: 5009
100 revs; min: 5008 max: 5015 avg: 5009
100 revs; min: 5008 max: 5017 avg: 5009
100 revs; min: 5008 max: 5016 avg: 5009
100 revs; min: 5008 max: 5023 avg: 5010
100 revs; min: 5008 max: 5015 avg: 5009
100 revs; min: 5008 max: 5016 avg: 5009
100 revs; min: 5008 max: 5015 avg: 5009
100 revs; min: 5008 max: 5016 avg: 5009
100 revs; min: 5008 max: 5015 avg: 5009
100 revs; min: 5008 max: 5017 avg: 5009
100 revs; min: 5008 max: 5016 avg: 5009
100 revs; min: 5008 max: 5016 avg: 5009
100 revs; min: 5008 max: 5017 avg: 5009
100 revs; min: 5008 max: 5018 avg: 5009
100 revs; min: 5008 max: 5019 avg: 5009
100 revs; min: 5008 max: 5013 avg: 5009

quad Opteron running x86_64 Fedora Core 5.

--
Robert Crocombe
[EMAIL PROTECTED]


Re: isochronous receives?

2006-12-14 Thread Robert Crocombe

On 12/13/06, Stefan Richter <[EMAIL PROTECTED]> wrote:

How about leaving ohci1394 as it is but document tag_mask better in
libraw1394's inline doxygen(?) comments, and maybe add an enum or macros
to be used as values of raw1394_iso_recv_start's tag_mask argument?

/* can be ORed together */
#define RAW1394_IR_MATCH_TAG_0   1
#define RAW1394_IR_MATCH_TAG_1   2
#define RAW1394_IR_MATCH_TAG_2   4
#define RAW1394_IR_MATCH_TAG_3   8
#define RAW1394_IR_MATCH_ALL_TAGS   -1


Yeah, that's definitely much better.  I guess this would go in
libraw1394's raw1394.h?  Similar to:

--- raw1394.h   2006-11-29 11:54:56.0 -0700
+++ raw1394_modified.h  2006-12-14 11:20:57.0 -0700
@@ -40,6 +40,14 @@
#define RAW1394_RCODE_TYPE_ERROR 0x6
#define RAW1394_RCODE_ADDRESS_ERROR  0x7

+/* can be ORed together */
+#define RAW1394_IR_MATCH_TAG_0  0x1
+#define RAW1394_IR_MATCH_TAG_1  0x2
+#define RAW1394_IR_MATCH_TAG_2  0x4
+#define RAW1394_IR_MATCH_TAG_3  0x8
+#define RAW1394_IR_MATCH_ALL_TAGS   -1
+#define RAW1394_IR_MATCH_TAG(tag)   (1 << (tag))
+
typedef u_int8_t  byte_t;
typedef u_int32_t quadlet_t;
typedef u_int64_t octlet_t;
@@ -273,7 +281,9 @@
 * @handle: libraw1394 handle
 * @start_on_cycle: isochronous cycle number on which to start
 * (-1 if you don't care)
- * @tag_mask: mask of tag fields to match (-1 to receive all packets)
+ * @tag_mask: mask of tag fields to match.  Use the RAW1394_IR_MATCH_*
+ * values for this rather than the literal tag bits: the values are not
+ * equivalent.
 * @sync: not used, reserved for future implementation
 *
 * Returns: 0 on success or -1 on failure (sets errno)

??
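
As a rough illustration of how a caller might combine the proposed values, here is a sketch. The macros are copied from the proposal above and defined locally, since nothing here confirms that libraw1394 ships them; tag_mask_from_tags is a hypothetical helper.

```c
#include <assert.h>

/* Macros as proposed in this thread, defined locally for the sketch. */
#define RAW1394_IR_MATCH_TAG_0      0x1
#define RAW1394_IR_MATCH_TAG_1      0x2
#define RAW1394_IR_MATCH_TAG_2      0x4
#define RAW1394_IR_MATCH_TAG_3      0x8
#define RAW1394_IR_MATCH_ALL_TAGS   -1
#define RAW1394_IR_MATCH_TAG(tag)   (1 << (tag))

/* Hypothetical helper: build a tag_mask that accepts every tag value
 * listed in tags[0..n-1]. */
int tag_mask_from_tags(const unsigned *tags, int n)
{
    int mask = 0;
    for (int i = 0; i < n; i++) {
        assert(tags[i] <= 3);   /* the iso tag field is two bits wide */
        mask |= RAW1394_IR_MATCH_TAG(tags[i]);
    }
    return mask;
}
```

Asking for tags 0 and 3 yields 0x9, which is the value that would then be passed as the tag_mask argument of raw1394_iso_recv_start.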

--
Robert Crocombe
[EMAIL PROTECTED]


Re: isochronous receives?

2006-12-13 Thread Robert Crocombe

On 11/29/06, Keith Curtis <[EMAIL PROTECTED]> wrote:

I never resolved the problem. I turned on the excessive debugging output, but it
didn't print out info about receiving packets or interrupts. My test
app claimed there were no packets received although the bus analyzer
showed lots of packets going by.


Well, I figured it out, finally.  Thankfully (in a way...), it was my
code: I was setting the tag to -1 in a certain spot (which indicates
that you want to see all packets, regardless of their tag), but
unhelpfully changing it to 0 before calling raw1394_iso_recv_start...

...dangit, though.  Looking at the data stream, the tag *is* zero.

Stefan, isn't the line:

   /* match on specified tags */
   contextMatch = tag_mask << 28;

in ohci_iso_recv_start() wrong?  The register looks to work like this.
The tag field is two bits.

if you want to match on 11b, then set bit tag3 (bit 31)
if you want to match on 10b, then set bit tag2 (bit 30)
if you want to match on 01b, then set bit tag1 (bit 29)
if you want to match on 00b, then set bit tag0 (bit 28)

Which makes the shift obviously wrong.  Passing in '3' to match on tag
11b will have you instead set bits 29 and 28, and you will match on
01b and 00b.  Passing in '0' will completely bone you: no bits will be
turned on.  Passing in '-1' to match all bits does work, though.
You'd have to know to pass in 0x8 to match for tag 11b, which is a
skosh counterintuitive and probably not what was intended.
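
The mapping described above can be sketched in a few lines. Both function names are hypothetical: the first mirrors the existing shift, and the second shows what a caller passing the literal two-bit tag value would need instead.

```c
#include <assert.h>
#include <stdint.h>

/* What the existing line does: move the caller's mask into bits 28..31. */
uint32_t context_match_from_mask(int tag_mask)
{
    return (uint32_t)tag_mask << 28;
}

/* Hypothetical alternative: the caller passes the literal 2-bit tag
 * value (0..3) and exactly one of the tag0..tag3 bits gets set. */
uint32_t context_match_from_tag(unsigned tag)
{
    assert(tag <= 3);
    return UINT32_C(1) << (28 + tag);
}
```

With the existing shift, 3 becomes 0x30000000 (tag1 and tag0), 0 becomes 0, -1 becomes 0xF0000000, and 0x8 becomes 0x80000000. The second helper maps tag 3 directly to 0x80000000.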

Here's my crap patch.  It appears to Work For Me(tm).

--- ohci1394.c  2006-12-04 16:52:10.916044780 -0700
+++ modified_ohci1394.c 2006-12-13 07:22:07.613917511 -0700
@@ -1491,7 +1491,18 @@
   reg_write(recv->ohci, recv->ContextControlSet, command);

   /* match on specified tags */
-   contextMatch = tag_mask << 28;
+switch (tag_mask)
+{
+   case -1: contextMatch = tag_mask << 28; break;
+   case 0: contextMatch = (1 << 28); break;
+   case 1: contextMatch = (1 << 29); break;
+   case 2: contextMatch = (1 << 30); break;
+   case 3: contextMatch = (1 << 31); break;
+   default:
+   DBGMSG("Invalid tag_mask %0x, matching all tags",tag_mask);
+   contextMatch = tag_mask << 28;
+   break;
+   }

   if (iso->channel == -1) {
   /* enable multichannel reception */


So nevermind.  I'm totally vindicated and my code is, as always,
flawless.  Cough.

--
Robert Crocombe
[EMAIL PROTECTED]


isochronous receives?

2006-11-28 Thread Robert Crocombe

Keith, et al.,

I am having problems with isochronous receives, and remembered just as
I was getting ready to dig into the source that there was a message
about this stuff.  Lo and behold your message to linux1394-user from
September 7:


I'm trying to receive isochronous streams (using libraw1394 1.2.0), and
I've noticed that if data is transmitted on channel 63, then my app tends
to work fine. If the stream is on a different channel, then I don't see
any isochronous packets at all.  I'm using 2.4.29, I've also tried 2.6.15
with similar results, can't seem to receive channels < 63.


Did you ultimately have any success getting this going?  Funnily
enough, when I tested isochronous stuff in July, I just did iso
transmit since I figured receives *must* be working since everyone has
camcorders and whatnot.  Currently my iso xmit stuff does appear to
be working, but iso receives are not.

I have a Firespy and no reason not to trust it, so I can see the junk
I'm spewing out.  I've tried transmitting on channels 4 and 63 (per
your advice), but neither works for me.  I suppose it could be my
stuff... nah.

--
Robert Crocombe
[EMAIL PROTECTED]


Re: ieee1394: host adapter disappears on 1394 bus reset

2006-11-28 Thread Robert Crocombe

On 11/27/06, Stefan Richter <[EMAIL PROTECTED]> wrote:

Posted writes are still enabled. phys_dma=0 disables only the physical
response unit. You have to change the source if you want to disable
posted writes. See the top of ohci_initialize. Should this be a module
load parameter too?


Er.  I misspoke.  What I need is for write requests directed to
address 0 to be directed to the asynchronous unit so that I can treat
them as regular asynchronous write requests.  As the OHCI 1.1 spec
says:

"Physical requests that are rejected by the PhysicalRequestFilter
shall be sent to the AR Request DMA context if the AR Request DMA
context is enabled". (5.14.2, page 58)

That does appear to be happening: I have an ARM mapping set to begin
at 0 and extend some ways along, and I do receive write requests.  At
first I was simply changing the lines:

reg_write(ohci,OHCI1394_PhyReqFilterHiSet, 0x);
reg_write(ohci,OHCI1394_PhyReqFilterLoSet, 0x);

to be 0x  instead, but then I paid more attention to the
source and saw the phys_dma parameter, which does the same.  Well,
*did*, in 2.6.16.  I see that 2.6.18 doesn't write 0 if !phys_dma, it
just leaves the values alone, but I guess that's okay since they are
set to 0 on reset.  Same difference.

So that's okay.  Uhm, mostly.  You should really see the horrors I
have created in order to be able to have 5 hosts map the same address
range (the custom protocol we're using doesn't use the destination
address at all, so it's 0 for everybody).

So long ways round, I think the phys_dma parameter is the proper thing for me.

And I will try and do some actual thinking about what is happening.  I
was hoping to offload that work to you and simply perform mechanical
changes to the source!  Rats!

--
Robert Crocombe
[EMAIL PROTECTED]


Re: ieee1394: host adapter disappears on 1394 bus reset

2006-11-27 Thread Robert Crocombe

Robert Crocombe wrote:

this is in 2.6.16-rt29 which has proved to be the easiest to provoke.
I actually couldn't get 2.6.18 to break earlier this morning (few
hundred resets).


Okay, I got the problem to occur again with 2.6.18.  I will attach my
config in case you wish to scrutinize for any boneheadedness on my
part.

I provoked the problem both with and without the additional read of
IntMaskSet.  Amazingly, I lost host1 on the bus reset that occurred
after this sequence:

rmmod ohci1394
rmmod ieee1394
make
make modules_install
modprobe ohci1394

which followed my adding the extra register read line.  Here's the
entirety of the host1 stuff (I did a s/.*host[^1].*//g in vim).  I
snipped some of the self ID chatter.

Nov 27 13:06:35 spanky kernel: ieee1394: nodemgr and IRM functionality disabled
Nov 27 13:06:35 spanky kernel: ohci1394: fw-host1: Remapped memory spaces reg 0xc2058000
Nov 27 13:06:36 spanky kernel: ohci1394: fw-host1: Soft reset finished
Nov 27 13:06:36 spanky kernel: ohci1394: fw-host1: Iso contexts reg: 00a8 implemented: 000f
Nov 27 13:06:36 spanky kernel: ohci1394: fw-host1: Iso contexts reg: 0098 implemented: 00ff
Nov 27 13:06:36 spanky kernel: ohci1394: fw-host1: Receive DMA ctx=0 initialized
Nov 27 13:06:36 spanky kernel: ohci1394: fw-host1: Receive DMA ctx=0 initialized
Nov 27 13:06:36 spanky kernel: ohci1394: fw-host1: Transmit DMA ctx=0 initialized
Nov 27 13:06:36 spanky kernel: ohci1394: fw-host1: Transmit DMA ctx=1 initialized
Nov 27 13:06:36 spanky kernel: ohci1394: fw-host1: physUpperBoundOffset=
Nov 27 13:06:36 spanky kernel: ohci1394: fw-host1: OHCI-1394 1.1 (PCI): IRQ=[98]  MMIO=[f9ffe000-f9ffe7ff]  Max Packet=[4096]  IR/IT contexts=[4/8]
Nov 27 13:06:37 spanky kernel: ohci1394: fw-host1: IntEvent: 00020010
Nov 27 13:06:37 spanky kernel: ohci1394: fw-host1: irq_handler: Bus reset requested
Nov 27 13:06:37 spanky kernel: ohci1394: fw-host1: Cancel request received
Nov 27 13:06:37 spanky kernel: ohci1394: fw-host1: Got RQPkt interrupt status=0x8409
Nov 27 13:06:37 spanky kernel: ohci1394: fw-host1: Single packet rcv'd
Nov 27 13:06:37 spanky kernel: ohci1394: fw-host1: IntEvent: 0001
Nov 27 13:06:37 spanky kernel: ohci1394: fw-host1: SelfID interrupt received (phyid 1, not root)
Nov 27 13:06:37 spanky kernel: ohci1394: fw-host1: SelfID packet 0x807fc494 received
Nov 27 13:06:38 spanky kernel: ohci1394: fw-host1: SelfID packet 0x817fc494 received
Nov 27 13:06:38 spanky kernel: ohci1394: fw-host1: SelfID for this node is 0x817fc494
Nov 27 13:06:39 spanky kernel: ohci1394: fw-host1: SelfID packet BLAH
...15 more SelfID...
Nov 27 13:06:40 spanky kernel: ohci1394: fw-host1: SelfID complete
Nov 27 13:06:40 spanky kernel: ohci1394: fw-host1: PhyReqFilter=
Nov 27 13:06:40 spanky kernel: ohci1394: fw-host1: IntEventClear  IntEventSet   04508000 IntMaskSet 838301f3
Nov 27 13:06:40 spanky kernel: ohci1394: fw-host1: IntEvent: 00020010
Nov 27 13:06:40 spanky kernel: ohci1394: fw-host1: irq_handler: Bus reset requested
Nov 27 13:06:40 spanky kernel: ohci1394: fw-host1: Cancel request received
Nov 27 13:06:40 spanky kernel: ohci1394: fw-host1: Got RQPkt interrupt status=0x8409
Nov 27 13:06:40 spanky kernel: ohci1394: fw-host1: Single packet rcv'd
Nov 27 13:06:41 spanky kernel: ohci1394: fw-host1: IntEvent: 0001
Nov 27 13:06:42 spanky kernel: ohci1394: fw-host1: SelfID interrupt received (phyid 1, not root)
Nov 27 13:06:42 spanky kernel: ohci1394: fw-host1: SelfID packet 0x807fc494 received
Nov 27 13:06:42 spanky kernel: ohci1394: fw-host1: SelfID packet 0x817fc496 received
Nov 27 13:06:42 spanky kernel: ohci1394: fw-host1: SelfID for this node is 0x817fc496
Nov 27 13:06:42 spanky kernel: ohci1394: fw-host1: SelfID packet BLAH
...15 more SelfID...
Nov 27 13:06:43 spanky kernel: ohci1394: fw-host1: SelfID complete
Nov 27 13:06:43 spanky kernel: ohci1394: fw-host1: PhyReqFilter=
Nov 27 13:06:44 spanky kernel: ohci1394: fw-host1: IntEventClear  IntEventSet   6ffdc33f IntMaskSet

with the bad IntMaskSet again.

I don't know if the host loss when I didn't have the additional read
is meaningful, but there it is simply:

Nov 27 13:04:39 spanky kernel: ohci1394: fw-host2: SelfID packet
0x823fc4f8 rf8c43f8c
.
.
.
Nov 27 13:06:30 spanky kernel: ohci1394: fw-host2: Soft reset finished

with 2 minutes and ~30 bus resets in between.

Oh, poop.  I didn't mention that I have:

options ieee1394 disable_nodemgr=1
options ohci1394 phys_dma=0

in my /etc/modprobe.conf.  The Linux adapters are functioning as
simulated peripherals to a piece of control hardware that always has a
dest address of 0x   on all packets so I needed to get rid
of posted writes and any bickering over bus master.

--
Robert Crocombe
[EMAIL PROTECTED]


2.6.18_00_config.bz2
Description: BZip2 compressed data


Re: ieee1394: host adapter disappears on 1394 bus reset

2006-11-27 Thread Robert Crocombe

On 11/27/06, Stefan Richter <[EMAIL PROTECTED]> wrote:
But perhaps more importantly, how are the IRQs distributed?

# cat /proc/interrupts


This is almost right after boot.  I generated about 40 bus resets just
to stir things up a little:

           CPU0       CPU1       CPU2       CPU3
  0:      33660      36393      30037      69980    IO-APIC-edge  timer
  1:          0          0          1         10    IO-APIC-edge  i8042
  8:          0          0          0          0    IO-APIC-edge  rtc
  9:          0          0          0          0   IO-APIC-level  acpi
 12:          0          0          0        113    IO-APIC-edge  i8042
 15:  0270686215                                    IO-APIC-edge  ide1
 50:          1          0      11567          7   IO-APIC-level  aic79xx
 58:          0          0          0          0   IO-APIC-level  ehci_hcd:usb1
 66:          0          0          0          0   IO-APIC-level  ohci_hcd:usb2
 74:          0          1          7         80   IO-APIC-level  ohci1394, ohci1394
 82:          7         23         30         28   IO-APIC-level  ohci1394
 90:          2         28         17         71   IO-APIC-level  eth0
 98:          9         27         21       9182   IO-APIC-level  eth1
106:         19         17         20         26   IO-APIC-level  ohci1394
114:         16         26         34         12   IO-APIC-level  ohci1394
233:          0          0         15          0   IO-APIC-level  aic79xx
NMI:        410         78         75         77
LOC:     166733     166657     166542     166432
ERR:          0
MIS:          0

Also:
I couldn't cause the problem when using 4 Fireboard 800s through
several hundred bus resets (usually took <= 40 for the Indigita card)


Please add
reg_read(ohci, OHCI1394_IntMaskSet);
right before hpsb_selfid_complete(host, phyid, isroot);. This will flush
the previous reg_write before hpsb_selfid_complete starts doing
unspeakable things.


Okay, so the code looks like this now:

   DBGMSG("PhyReqFilter=%08x%08x",
  reg_read(ohci,OHCI1394_PhyReqFilterHiSet),
  reg_read(ohci,OHCI1394_PhyReqFilterLoSet));

   reg_read(ohci, OHCI1394_IntMaskSet);

   hpsb_selfid_complete(host, phyid, isroot);

   DBGMSG( "IntEventClear %08x "
   "IntEventSet %08x "
   "IntMaskSet %08x",
   reg_read(ohci, OHCI1394_IntEventClear),
   reg_read(ohci, OHCI1394_IntEventSet),
   reg_read(ohci, OHCI1394_IntMaskSet));

This is in 2.6.16-rt29, which has proved to be the easiest to provoke.
I actually couldn't get 2.6.18 to break earlier this morning (a few
hundred resets).

Okay, I've lost host1 (on the Indigita), but this time the last print
statement is:

Nov 27 10:38:27 spanky kernel: ohci1394: fw-host1: IntEventClear  IntEventSet 04588000 IntMaskSet 818300f3

just like all the other hosts.  I can confirm that no bus reset
handlers are called, and there are another 4,000 lines of statements
from the other hosts after the last from host1.
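
For reading the IntEvent/IntMaskSet values in these logs, a small decoder can help.  What follows is only a sketch: the bit names and positions are as I recall them from the OHCI 1.1 IntEvent register layout, so treat the table as an assumption and check it against the spec or ohci1394.h before relying on it.  decode_int_bits() is a made-up helper name, not something from the driver.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumed OHCI 1.1 IntEvent/IntMask bit layout (from memory, unverified). */
struct int_bit {
	uint32_t mask;
	const char *name;
};

static const struct int_bit int_bits[] = {
	{ 0x00000001u, "reqTxComplete" },
	{ 0x00000002u, "respTxComplete" },
	{ 0x00000004u, "ARRQ" },
	{ 0x00000008u, "ARRS" },
	{ 0x00000010u, "RQPkt" },
	{ 0x00000020u, "RSPkt" },
	{ 0x00000040u, "isochTx" },
	{ 0x00000080u, "isochRx" },
	{ 0x00000100u, "postedWriteErr" },
	{ 0x00000200u, "lockRespErr" },
	{ 0x00008000u, "selfIDComplete2" },
	{ 0x00010000u, "selfIDComplete" },
	{ 0x00020000u, "busReset" },
	{ 0x00040000u, "regAccessFail" },
	{ 0x00080000u, "phy" },
	{ 0x00100000u, "cycleSynch" },
	{ 0x00200000u, "cycle64Seconds" },
	{ 0x00400000u, "cycleLost" },
	{ 0x00800000u, "cycleInconsistent" },
	{ 0x01000000u, "unrecoverableError" },
	{ 0x02000000u, "cycleTooLong" },
	{ 0x04000000u, "phyRegRcvd" },
	{ 0x80000000u, "masterIntEnable" },
};

/*
 * Write the space-separated names of every known bit set in value into
 * buf (truncating if buf is too small) and return how many known bits
 * were found.
 */
static int decode_int_bits(uint32_t value, char *buf, size_t len)
{
	int count = 0;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';

	for (size_t i = 0; i < sizeof int_bits / sizeof int_bits[0]; ++i) {
		if (!(value & int_bits[i].mask))
			continue;
		if (count > 0)
			strncat(buf, " ", len - strlen(buf) - 1);
		strncat(buf, int_bits[i].name, len - strlen(buf) - 1);
		++count;
	}
	return count;
}
```

Under that assumed layout, the 00020010 IntEvent value seen in the logs decodes to RQPkt plus busReset, which at least agrees with the "Got RQPkt interrupt" and "Bus reset requested" lines that follow it.  Calling decode_int_bits(0x818300f3, buf, sizeof buf) on the healthy IntMaskSet value lists the interrupts the driver has enabled.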

--
Robert Crocombe
[EMAIL PROTECTED]


Re: [patch] x86: unify/rewrite SMP TSC sync code

2006-11-27 Thread Robert Crocombe

The difference that Wink reports is tiny compared to that measured on
my Opteron machines:

dual (2.6.17):

[EMAIL PROTECTED]:cyclecounter_test$ ./rdtsc-pref 100
rdtsc:   average ticks=  10
gtod:    average ticks=4296
gtod_us: average ticks=4328

quad (2.6.16-rt29):

[EMAIL PROTECTED]:wink_saville_test$ ./rdtsc-pref 100
rdtsc:   average ticks=  10
gtod:    average ticks=5688
gtod_us: average ticks=5711

I have my own little test that I'll attach, but it gives a similar
result.  Here are the results from the 2x box:

[EMAIL PROTECTED]:cyclecounter_test$ ./timing
Using the cycle counter
Calibrated timer as 2593081969.758825 Hz
4194304 iterations in 0.016 seconds is 0.004 useconds per iteration.

[EMAIL PROTECTED]:cyclecounter_test$ ./timing_gettimeofday
Using gettimeofday
4194304 iterations in 6.793 seconds is 1.620 useconds per iteration.
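
As a cross-check between the two sets of numbers, tick counts can be converted to microseconds.  This is a minimal sketch that assumes the "dual" box and the "2x box" are the same machine and that its TSC runs at the calibrated rate printed above; ticks_to_usec() is a hypothetical helper, not part of either test program.

```c
#include <assert.h>

/* Convert a TSC tick count to microseconds, given the TSC rate in Hz. */
static double ticks_to_usec(double ticks, double tsc_hz)
{
	assert(tsc_hz > 0.0);
	return ticks / tsc_hz * 1e6;
}
```

With the calibrated 2593081969.758825 Hz, 4296 ticks comes to roughly 1.66 useconds, close to the 1.620 useconds per gettimeofday iteration measured above, and 10 ticks comes to roughly 0.004 useconds, in line with the cycle counter result.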

I have used the pthread affinity and/or cpuset, etc. mechanisms to try
and inject some reliability into the measurement.

Using gtod() can amount to a substantial disturbance of the thing to
be measured.  Using rdtsc, things seem reliable so far, and we have an
FPGA (accessed through the PCI bus) that has been programmed to give
access to an 8MHz clock and we do some checks against that.

--
Robert Crocombe
[EMAIL PROTECTED]
#include <stdio.h>            // printf()
#include <stdint.h>           // uint64_t
#include <stdlib.h>           // drand48()
#include <sys/select.h>       // select()
#include <sys/time.h>         // gettimeofday
#include <time.h>
#include <asm-x86_64/msr.h>   // rdtscll()



// Globals


enum
{
	ITERATIONS  = 1 << 22
};

static double seconds_per_tick;


// Prototypes


double gimme_timeofday(void);
double get_time(void);

void selectsleep(unsigned us);
void init(void);


// Definitions


double
gimme_timeofday(void)
{
	struct timeval tv;
	gettimeofday(&tv, 0);
	return tv.tv_sec + 1e-6 * tv.tv_usec;
}


double
get_time(void)
{
	uint64_t t;
	rdtscll(t);
	return t * seconds_per_tick;
}


/**
A good way to simply hang around doing nothing for awhile.
*/

void
selectsleep(unsigned us)
{
	struct timeval tv;
	tv.tv_sec = 0;
	tv.tv_usec = us;
	select(0,0,0,0,&tv);
}

/**
Figure out how fast rdtscll() ticks.  This should be equal to the
frequency of the clock on the processor.  Here's the bad news: I don't
know if rdtscll() always uses the same processor so it may very well be
necessary to set a processor affinity to get really good results over
time.

This piece of code by Mark Hahn from brain.mcmaster.ca/~hahn/.
*/

void
init(void)
{
	double sumx = 0;
	double sumy = 0;
	double sumxx = 0;
	double sumxy = 0;
	double slope;

	// least squares linear regression of ticks onto real time
	// as returned by gettimeofday.

	const unsigned n = 30;

	for (unsigned int i = 0; i < n; ++i)
	{
		double breal,real,ticks;
		uint64_t aticks, bticks;

		breal = gimme_timeofday();
		rdtscll(bticks);

		selectsleep((unsigned)(1 + drand48() * 20));

		rdtscll(aticks);
		ticks = aticks - bticks;
		real = gimme_timeofday() - breal;

		sumx += real;
		sumxx += real * real;
		sumxy += real * ticks;
		sumy += ticks;
	}

	// slope is ticks per second, i.e. the counter frequency in Hz
	slope = ((sumxy - (sumx*sumy) / n) / (sumxx - (sumx*sumx) / n));
	seconds_per_tick = 1.0 / slope;

	printf("Calibrated timer as %.6f Hz\n", slope);
}

int
main(int argc, char *argv[])
{
	printf("Doing stuff\n");

#if 0   // Using rdtscll()
	printf("Using the cycle counter\n");
	init();
	double time_start = gimme_timeofday();
	for (unsigned int i = 0; i < ITERATIONS; ++i)
	{
		double the_time;
		the_time = get_time();
	}
	double time_end = gimme_timeofday();
#else   // using gettimeofday()
	printf("Using gettimeofday\n");
	double time_start = gimme_timeofday();
	for (unsigned int i = 0; i < ITERATIONS; ++i)
	{
		double the_time;
		the_time = gimme_timeofday();
	}
	double time_end = gimme_timeofday();
#endif

	double diff = time_end - time_start;

	double useconds = (diff / ITERATIONS) * 1e6;

	printf("%u iterations in %.3f seconds is %.3f useconds per iteration.\n",
	       ITERATIONS, diff, useconds);

	printf("Done\n");
	return 0;
}
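
The calibration in init() boils down to a least-squares slope of ticks against wall-clock time.  Pulled out on its own it looks roughly like the sketch below; ls_slope() is a hypothetical name for illustration and is not part of the attached program.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Least-squares slope of y against x:
 *   (Sxy - Sx*Sy/n) / (Sxx - Sx*Sx/n)
 * With x = elapsed seconds and y = elapsed ticks, the slope is the
 * counter frequency in Hz.
 */
static double ls_slope(const double *x, const double *y, size_t n)
{
	double sumx = 0, sumy = 0, sumxx = 0, sumxy = 0;

	assert(n >= 2);
	for (size_t i = 0; i < n; ++i) {
		sumx  += x[i];
		sumy  += y[i];
		sumxx += x[i] * x[i];
		sumxy += x[i] * y[i];
	}
	return (sumxy - (sumx * sumy) / n) / (sumxx - (sumx * sumx) / n);
}
```

Feeding it points that lie on y = 3x + 2 gives a slope of 3, which is a quick sanity check that the formula is wired up the same way as in init().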


Re: ieee1394: host adapter disappears on 1394 bus reset

2006-11-27 Thread Robert Crocombe

On 11/22/06, Stefan Richter <[EMAIL PROTECTED]> wrote:

One thing you could try next is to add a debug logging macro which
prints the contents of OHCI1394_IntEventClear, OHCI1394_IntEventSet, and
OHCI1394_IntMaskSet, right after ohci1394's call to
hpsb_selfid_complete. (I'm merely poking in the dark here.)


I think you've got something!  I managed to provoke failure from 3 of
the 5 interfaces in a single burst of reset clicking!  And yes, all 3
failed interfaces are on the Indigita card, and no, the Fireboard has
never failed.

The last thing I see from the failed interfaces is this:

Nov 27 08:25:51 spanky kernel: ohci1394: fw-host3: PhyReqFilter=
Nov 27 08:25:51 spanky kernel: ohci1394: fw-host3: IntEventClear  IntEventSet 6ffdc33f IntMaskSet

which looks very different from the entries by the interfaces that
survive (these are the lines immediately before the one above)

Nov 27 08:25:51 spanky kernel: ohci1394: fw-host4: IntEventClear  IntEventSet 04508000 IntMaskSet 818300f3
Nov 27 08:25:51 spanky kernel:
Nov 27 08:25:51 spanky kernel: ohci1394: fw-host2: IntEventClear  IntEventSet 04508000 IntMaskSet 818300f3
Nov 27 08:25:51 spanky kernel:

I'm not sure if this says anything to you except "hey, don't use those
Indigita cards".  The problem is, I can't get the number of ports I
need using only Fireboards (I think I need 6, and I have 5 PCI slots
but need to use some of the other slots).

Is there further diagnostic poking about that I can do to narrow down
the problem?   Is this something for Indigita?  The card is pretty basic: 4
of the TI TSB82AA2 (Ice Lynx) links behind an IBM/Tundra PCI-X bridge.
I have an Intel quad ethernet card that uses the exact same part
(well, one rev older, actually).  Here's a chunk of my lspci for
completeness' sake:

01:04.0 PCI bridge: IBM PCI-X to PCI-X Bridge (rev 03)
01:06.0 FireWire (IEEE 1394): Texas Instruments TSB82AA2 IEEE-1394b Link Layer Controller (rev 01)
02:04.0 FireWire (IEEE 1394): Texas Instruments TSB82AA2 IEEE-1394b Link Layer Controller (rev 01)
02:05.0 FireWire (IEEE 1394): Texas Instruments TSB82AA2 IEEE-1394b Link Layer Controller (rev 01)
02:06.0 FireWire (IEEE 1394): Texas Instruments TSB82AA2 IEEE-1394b Link Layer Controller (rev 01)
02:07.0 FireWire (IEEE 1394): Texas Instruments TSB82AA2 IEEE-1394b Link Layer Controller (rev 01)

I will also try cramming a machine full of Fireboards and seeing if I
can't get one of them to fail.

--
Robert Crocombe
[EMAIL PROTECTED]

