This is verrrrry strange indeed. The mfence works but the lfence does not.
On top of these I have tried *many* other operations which I thought might
have an effect. You can find the entire code snippet for these tests at the
end of this email. During testing, each 'section' was uncommented
individually, and the result of that test is recorded in a comment above
it.

After doing these tests and scratching my head, I decided to do a test with
my old 82574L driver. I removed the X540 from the test machine and placed
the 82574L card, which I have used previously, in the same PCIe slot (also
using the same ethernet connection). Since the descriptor format for legacy
descriptors is identical in the X540 and 82574L, I used the exact same code
(posted below) as I used on the X540. On this card *no fences* were needed.
This indicates one of the following: A. I'm initializing the X540 in a
manner that somehow makes this behaviour possible. B. The X540 (or maybe my
specific one) has a bug that creates the need for these fences. C. Maybe
it's not a bug but something that needs to be documented. I still find it
very strange that a write fence (sfence) changes how things operate when no
writes are actually being done where I'm fencing.
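
For readers who think in C rather than NASM, here is a minimal sketch of the
legacy receive descriptor and the DD poll that these tests revolve around.
It is not the code that was tested. The names `legacy_rx_desc`,
`RX_STATUS_DD` and `rx_desc_done` are hypothetical, and the field layout is
an assumption based on the legacy descriptor format; it is consistent with
the `+ 8 + 4` status offset used in the assembly at the end of this email.
The acquire fence only marks where an ordering barrier would conventionally
sit between the DD read and the payload read. On x86 it normally compiles
to no instruction at all, so it says nothing about why the X540 needs one.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Legacy RX descriptor, 16 bytes. Layout and names are illustrative
 * assumptions, not copied from any driver. */
struct legacy_rx_desc {
    uint64_t buffer_addr; /* offset 0:  address of the packet buffer */
    uint16_t length;      /* offset 8:  bytes written by the NIC */
    uint16_t checksum;    /* offset 10: packet checksum */
    uint8_t  status;      /* offset 12: bit 0 is DD (descriptor done) */
    uint8_t  errors;      /* offset 13 */
    uint16_t special;     /* offset 14: VLAN tag */
};

#define RX_STATUS_DD 0x01u

/* The assembly polls [desc + 8 + 4], so status must sit at offset 12. */
static_assert(sizeof(struct legacy_rx_desc) == 16, "descriptor is 16 bytes");
static_assert(offsetof(struct legacy_rx_desc, status) == 12, "status at 12");

/* Returns 1 once DD is set, 0 otherwise. The volatile qualifier forces a
 * fresh load on every call; the acquire fence keeps the compiler from
 * moving later payload reads ahead of the DD check. */
static int rx_desc_done(const volatile struct legacy_rx_desc *d)
{
    if ((d->status & RX_STATUS_DD) == 0)
        return 0;
    atomic_thread_fence(memory_order_acquire);
    return 1;
}
```

A caller would spin on `rx_desc_done()` and only then touch the buffer that
`buffer_addr` points at, which is the same order of operations as the
`.lewp` loop followed by the `.spin` loop in the listing.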

Some other tests I have done:

- Have other processors spin and do mfences while the main core does not do
an mfence. This did not make it work, which is expected, as an mfence
should only change behaviour locally.
- Put mfences all over the X540 initialization and prior to doing the DD
polling. This did not fix the problem.

Some things I could think of that would cause this problem:

- It's just a bug in the X540, or mine specifically. (If I'm not too lazy
maybe I'll swap my X540s around between machines and try on another one).
- It's a bug in my initialization, but I would find this strange as my
82574L driver initializes in almost an identical fashion.
- It's a bug in my motherboard/CPU, but only on PCIe cards with 8 or more
lanes, which would explain why it didn't affect the 82574L.

--------- Example code ---------------------

; Get the rx entry
mov rdx, qword [gs:thread_local.x540_rx_ring_base]
mov rax, qword [gs:thread_local.x540_rx_head]
shl rax, 4

; XXX: Temporary, used for the loop counter for testing.
xor ebp, ebp

; Putting an mfence/lfence/sfence here has no effect on the result.

mov rsi, qword [rdx + rax + 0] ; pointer to packet contents

; Putting an mfence/lfence/sfence here has no effect on the result.

.lewp:
; Putting an mfence/lfence/sfence here has no effect on the result.

; Wait until a packet is present here by polling the DD bit
test dword [rdx + rax + 8 + 4], 1
jz   short .lewp

; Without anything here, we fail with nothing getting printed.

; Successful
;mfence

; Failure, never prints
;lfence

; Successful
;sfence

; Successful (the pushes and pops do not change the result, fails without
; rdtsc)
;push rax
;push rdx
;rdtsc
;pop rdx
;pop rax

; Failure, prints out 3 to the screen. Meaning we read the value 3 times
; before it became accurate. On a second attempt it printed 3 as well.
;clflush [rsi + (udp_template_10g.ulen - udp_template_10g)]

; Successful
;wbinvd

; Failure, never prints
; mov rcx, 1024
;.simple_pause:
; dec rcx
; jnz short .simple_pause

; Failure, never prints
; mov rcx, 1024
;.do_some_reads:
; pop r15
; dec rcx
; jnz short .do_some_reads

; Failure, never prints
; mov rcx, 1024
;.do_some_writes:
; push r15
; dec rcx
; jnz short .do_some_writes

; Successful
;invlpg [rsi + (udp_template_10g.ulen - udp_template_10g)]

; Failure, never prints
;prefetch [rsi + (udp_template_10g.ulen - udp_template_10g)]

; Failure, never prints
;lock inc qword [rsp]

; Spinloop and keep track of how many spins we have done in ebp. We spin
; until the packet indicates a UDP length of 0x14 bytes.
.spin:
inc ebp
cmp word [rsi + (udp_template_10g.ulen - udp_template_10g)], 0x1400
jne short .spin

; Print out the value in ebp to the screen (count from the loop above).
mov  edx, ebp
call outhexq

cli
hlt
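
A note on the constant in the spin loop: the compare uses 0x1400 rather than
0x14 because the UDP length field is stored in network byte order, and a
little-endian `cmp word` sees the two bytes swapped. The C sketch below
shows the same check and the same counting convention as `ebp` (the
successful read is included in the count). `read_be16` and `spin_until_len`
are hypothetical helper names, and the `max_spins` bound is an addition so
that the sketch cannot hang the way the "never prints" cases do.

```c
#include <assert.h>
#include <stdint.h>

/* Read a 16-bit big-endian (network order) field from raw packet bytes. */
static uint16_t read_be16(const volatile uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Re-read the field until it equals `expected`. Returns how many reads it
 * took, counting the successful one, or 0 if max_spins reads all missed. */
static uint32_t spin_until_len(const volatile uint8_t *p, uint16_t expected,
                               uint32_t max_spins)
{
    for (uint32_t spins = 1; spins <= max_spins; spins++) {
        if (read_be16(p) == expected)
            return spins;
    }
    return 0;
}
```

With this convention a result of 1 corresponds to the payload being correct
on the first read, and the clflush case above corresponds to a result of 3.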