Re: No bus_space_read_8 on x86 ?

2012-10-23 Thread Carl Delsey

On 10/13/12 03:26, Robert Watson wrote:


On Fri, 12 Oct 2012, Carl Delsey wrote:

Indeed -- and on non-x86, where there are uncached direct map 
segments, and TLB entries that disable caching, reading 2x 32-bit vs 
1x 64-bit have quite different effects in terms of atomicity. Where 
uncached I/Os are being used, those differences may affect semantics 
significantly -- e.g., if your device has a 64-bit memory-mapped 
FIFO or registers, 2x 32-bit gives you two halves of two different 
64-bit values, rather than two halves of the same value.  As device 
drivers depend on those atomicity semantics, we should (at the 
busspace level) offer only the exactly expected semantics, rather 
than trying to patch things up.  If a device driver accessing 64-bit 
fields wants to support doing it using two 32-bit reads, it can 
figure out how to splice it together following 
bus_space_read_region_4().
I wouldn't make any default behaviour for bus_space_read_8 on i386, 
just amd64. My assumption (which may be unjustified) is that by far 
the most common implementations to read a 64-bit register on i386 
would be to read the lower 4 bytes first, followed by the upper 4 
bytes (or vice versa) and then stitch them together.  I think we 
should provide helper functions for these two cases, otherwise I fear 
our code base will be littered with multiple independent 
implementations of this.


Some driver writer who wants to take advantage of these helper 
functions would do something like

#ifdef i386
#definebus_space_read_8bus_space_read_8_lower_first
#endif
otherwise, using bus_space_read_8 won't compile for i386 builds.
If these implementations won't work for their case, they are free to 
write their own implementation or take whatever action is necessary.


I guess my question is, are these cases common enough that it is 
worth helping developers by providing functions that do the double 
read and shifts for them, or do we leave them to deal with it on 
their own at the risk of possibly some duplicated code.


I was thinking we might suggest to developers that they use a KPI that 
specifically captures the underlying semantics, so it's clear they 
understand them.  Untested example:


uint64_t v;

/*
 * On 32-bit systems, read the 64-bit statistic using two 32-bit
 * reads.
 *
 * XXX: This will sometimes lead to a race.
 *
 * XXX: Gosh, I wonder if some word-swapping is needed in the merge?
 */
#ifdef 32-bit
bus_space_read_region_4(space, handle, offset, (uint32_t *)v, 2;
#else
bus_space_read_8(space, handle, offset, v);
#endif

The potential need to word swap, however, suggests that you may be 
right about the error-prone nature of manual merging.


Again, I have to apologize for the delay in replying. I'm still dealing 
with my family emergency some.


I really like Robert's idea of using bus_space_read_region_4 and 
treating the 64 bit variable as an array of two 32 bit variables. That 
eliminates the need to shift. I'll try to incorporate that idea if I can.


Anyhow, after talking to jimharris@ I figured the best thing to do was 
to break this up into separate patches. This first patch is to provide 
bus_space__8 for amd64 only. This way we can get the easy case out 
of the way. I'll follow up with a couple more patches: one to modify the 
cxgbe driver and maybe a few other drivers to use this new function, and 
then a second that makes a stab at how to deal with i386 if we think 
that is worthwhile.


This first patch is at:
http://people.freebsd.org/~jimharris/patches/bus_space_xxx_8.patch

I basically copied the ia64 implementation to use as a base for this 
change. I wasn't quite sure how best to deal with the copyrights. The 
ia64 implementation has about 3 copyright headers on it, which seemed a 
bit ridiculous to copy, but probably is necessary to be strictly legal. 
I compromised and took the copyright from marcel@ since he was the last 
to change the file. Do we have any best practices on how to deal with this?


Thanks for the feedback and thanks in advance for taking the time to 
look at this patch.


Carl


___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: No bus_space_read_8 on x86 ?

2012-10-13 Thread Robert Watson


On Fri, 12 Oct 2012, Carl Delsey wrote:

Indeed -- and on non-x86, where there are uncached direct map segments, and 
TLB entries that disable caching, reading 2x 32-bit vs 1x 64-bit have quite 
different effects in terms of atomicity. Where uncached I/Os are being 
used, those differences may affect semantics significantly -- e.g., if your 
device has a 64-bit memory-mapped FIFO or registers, 2x 32-bit gives you 
two halves of two different 64-bit values, rather than two halves of the 
same value.  As device drivers depend on those atomicity semantics, we 
should (at the busspace level) offer only the exactly expected semantics, 
rather than trying to patch things up.  If a device driver accessing 64-bit 
fields wants to support doing it using two 32-bit reads, it can figure out 
how to splice it together following bus_space_read_region_4().
I wouldn't make any default behaviour for bus_space_read_8 on i386, just 
amd64. My assumption (which may be unjustified) is that by far the most 
common implementations to read a 64-bit register on i386 would be to read the 
lower 4 bytes first, followed by the upper 4 bytes (or vice versa) and then 
stitch them together.  I think we should provide helper functions for these 
two cases, otherwise I fear our code base will be littered with multiple 
independent implementations of this.


Some driver writer who wants to take advantage of these helper functions 
would do something like

#ifdef i386
#definebus_space_read_8bus_space_read_8_lower_first
#endif
otherwise, using bus_space_read_8 won't compile for i386 builds.
If these implementations won't work for their case, they are free to write 
their own implementation or take whatever action is necessary.


I guess my question is, are these cases common enough that it is worth 
helping developers by providing functions that do the double read and shifts 
for them, or do we leave them to deal with it on their own at the risk of 
possibly some duplicated code.


I was thinking we might suggest to developers that they use a KPI that 
specifically captures the underlying semantics, so it's clear they understand 
them.  Untested example:


uint64_t v;

/*
 * On 32-bit systems, read the 64-bit statistic using two 32-bit
 * reads.
 *
 * XXX: This will sometimes lead to a race.
 *
 * XXX: Gosh, I wonder if some word-swapping is needed in the merge?
 */
#ifdef 32-bit
bus_space_read_region_4(space, handle, offset, (uint32_t *)v, 2;
#else
bus_space_read_8(space, handle, offset, v);
#endif

The potential need to word swap, however, suggests that you may be right about 
the error-prone nature of manual merging.


Robert

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: No bus_space_read_8 on x86 ?

2012-10-12 Thread John Baldwin
On Wednesday, October 10, 2012 5:44:09 pm Carl Delsey wrote:
 Sorry for the slow response. I was dealing with a bit of a family 
 emergency. Responses inline below.
 
 On 10/09/12 08:54, John Baldwin wrote:
  On Monday, October 08, 2012 4:59:24 pm Warner Losh wrote:
  On Oct 5, 2012, at 10:08 AM, John Baldwin wrote:
 snip
  I think cxgb* already have an implementation.  For amd64 we should 
  certainly
  have bus_space_*_8(), at least for SYS_RES_MEMORY.  I think they should 
  fail
  for SYS_RES_IOPORT.  I don't think we can force a compile-time error 
  though,
  would just have to return -1 on reads or some such?
 
 Yes. Exactly what I was thinking.
 
  I believe it was because bus reads weren't guaranteed to be atomic on i386.
  don't know if that's still the case or a concern, but it was an 
  intentional omission.
  True.  If you are on a 32-bit system you can read the two 4 byte values and
  then build a 64-bit value.  For 64-bit platforms we should offer 
  bus_read_8()
  however.
 
 I believe there is still no way to perform a 64-bit read on a i386 (or 
 at least without messing with SSE instructions), but if you have to read 
 a 64-bit register, you are stuck with doing two 32-bit reads and 
 concatenating them. I figure we may as well provide an implementation 
 for those who have to do that as well as the implementation for 64-bit.

I think the problem though is that the way you should glue those two 32-bit
reads together is device dependent.  I don't think you can provide a completely
device-neutral bus_read_8() on i386.  We should certainly have it on 64-bit
platforms, but I think drivers that want to work on 32-bit platforms need to
explicitly merge the two words themselves.
 
-- 
John Baldwin
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: No bus_space_read_8 on x86 ?

2012-10-12 Thread Robert Watson


On Fri, 12 Oct 2012, John Baldwin wrote:

I believe it was because bus reads weren't guaranteed to be atomic on 
i386. don't know if that's still the case or a concern, but it was an 
intentional omission.
True.  If you are on a 32-bit system you can read the two 4 byte values 
and then build a 64-bit value.  For 64-bit platforms we should offer 
bus_read_8() however.


I believe there is still no way to perform a 64-bit read on a i386 (or at 
least without messing with SSE instructions), but if you have to read a 
64-bit register, you are stuck with doing two 32-bit reads and 
concatenating them. I figure we may as well provide an implementation for 
those who have to do that as well as the implementation for 64-bit.


I think the problem though is that the way you should glue those two 32-bit 
reads together is device dependent.  I don't think you can provide a 
completely device-neutral bus_read_8() on i386.  We should certainly have it 
on 64-bit platforms, but I think drivers that want to work on 32-bit 
platforms need to explicitly merge the two words themselves.


Indeed -- and on non-x86, where there are uncached direct map segments, and 
TLB entries that disable caching, reading 2x 32-bit vs 1x 64-bit have quite 
different effects in terms of atomicity.  Where uncached I/Os are being used, 
those differences may affect semantics significantly -- e.g., if your device 
has a 64-bit memory-mapped FIFO or registers, 2x 32-bit gives you two halves 
of two different 64-bit values, rather than two halves of the same value.  As 
device drivers depend on those atomicity semantics, we should (at the busspace 
level) offer only the exactly expected semantics, rather than trying to patch 
things up.  If a device driver accessing 64-bit fields wants to support doing 
it using two 32-bit reads, it can figure out how to splice it together 
following bus_space_read_region_4().


Robert
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: No bus_space_read_8 on x86 ?

2012-10-12 Thread Carl Delsey

On 10/12/2012 9:04 AM, Robert Watson wrote:


On Fri, 12 Oct 2012, John Baldwin wrote:

I believe it was because bus reads weren't guaranteed to be atomic 
on i386. don't know if that's still the case or a concern, but it 
was an intentional omission.
True.  If you are on a 32-bit system you can read the two 4 byte 
values and then build a 64-bit value.  For 64-bit platforms we 
should offer bus_read_8() however.


I believe there is still no way to perform a 64-bit read on a i386 
(or at least without messing with SSE instructions), but if you have 
to read a 64-bit register, you are stuck with doing two 32-bit reads 
and concatenating them. I figure we may as well provide an 
implementation for those who have to do that as well as the 
implementation for 64-bit.


I think the problem though is that the way you should glue those two 
32-bit reads together is device dependent.  I don't think you can 
provide a completely device-neutral bus_read_8() on i386.  We should 
certainly have it on 64-bit platforms, but I think drivers that want 
to work on 32-bit platforms need to explicitly merge the two words 
themselves.


Indeed -- and on non-x86, where there are uncached direct map 
segments, and TLB entries that disable caching, reading 2x 32-bit vs 
1x 64-bit have quite different effects in terms of atomicity. Where 
uncached I/Os are being used, those differences may affect semantics 
significantly -- e.g., if your device has a 64-bit memory-mapped FIFO 
or registers, 2x 32-bit gives you two halves of two different 64-bit 
values, rather than two halves of the same value.  As device drivers 
depend on those atomicity semantics, we should (at the busspace level) 
offer only the exactly expected semantics, rather than trying to patch 
things up.  If a device driver accessing 64-bit fields wants to 
support doing it using two 32-bit reads, it can figure out how to 
splice it together following bus_space_read_region_4().
I wouldn't make any default behaviour for bus_space_read_8 on i386, just 
amd64. My assumption (which may be unjustified) is that by far the most 
common implementations to read a 64-bit register on i386 would be to 
read the lower 4 bytes first, followed by the upper 4 bytes (or vice 
versa) and then stitch them together.  I think we should provide helper 
functions for these two cases, otherwise I fear our code base will be 
littered with multiple independent implementations of this.


Some driver writer who wants to take advantage of these helper functions 
would do something like

#ifdef i386
#definebus_space_read_8bus_space_read_8_lower_first
#endif
otherwise, using bus_space_read_8 won't compile for i386 builds.
If these implementations won't work for their case, they are free to 
write their own implementation or take whatever action is necessary.


I guess my question is, are these cases common enough that it is worth 
helping developers by providing functions that do the double read and 
shifts for them, or do we leave them to deal with it on their own at the 
risk of possibly some duplicated code.


Thanks,
Carl

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: No bus_space_read_8 on x86 ?

2012-10-10 Thread Carl Delsey
Sorry for the slow response. I was dealing with a bit of a family 
emergency. Responses inline below.


On 10/09/12 08:54, John Baldwin wrote:

On Monday, October 08, 2012 4:59:24 pm Warner Losh wrote:

On Oct 5, 2012, at 10:08 AM, John Baldwin wrote:

snip

I think cxgb* already have an implementation.  For amd64 we should certainly
have bus_space_*_8(), at least for SYS_RES_MEMORY.  I think they should fail
for SYS_RES_IOPORT.  I don't think we can force a compile-time error though,
would just have to return -1 on reads or some such?


Yes. Exactly what I was thinking.


I believe it was because bus reads weren't guaranteed to be atomic on i386.
don't know if that's still the case or a concern, but it was an intentional 
omission.

True.  If you are on a 32-bit system you can read the two 4 byte values and
then build a 64-bit value.  For 64-bit platforms we should offer bus_read_8()
however.


I believe there is still no way to perform a 64-bit read on a i386 (or 
at least without messing with SSE instructions), but if you have to read 
a 64-bit register, you are stuck with doing two 32-bit reads and 
concatenating them. I figure we may as well provide an implementation 
for those who have to do that as well as the implementation for 64-bit.


Anyhow, it sounds like we are basically in agreement. I'll put together 
a patch and send it out for review.


Thanks,
Carl

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: No bus_space_read_8 on x86 ?

2012-10-09 Thread Warner Losh

On Oct 5, 2012, at 10:08 AM, John Baldwin wrote:

 On Thursday, October 04, 2012 1:20:52 pm Carl Delsey wrote:
 I noticed that the bus_space_*_8 functions are unimplemented for x86. 
 Looking at the code, it seems this is intentional.
 
 Is this done because on 32-bit systems we don't know, in the general 
 case, whether to read the upper or lower 32-bits first?
 
 If that's the reason, I was thinking we could provide two 
 implementations for i386: bus_space_read_8_upper_first and 
 bus_space_read_8_lower_first. For amd64 we would just have bus_space_read_8
 
 Anybody who wants to use bus_space_read_8 in their file would do 
 something like:
 #define BUS_SPACE_8_BYTES LOWER_FIRST
 or
 #define BUS_SPACE_8_BYTES UPPER_FIRST
 whichever is appropriate for their hardware.
 
 This would go in their source file before including bus.h and we would 
 take care of mapping to the correct implementation.
 
 With the prevalence of 64-bit registers these days, if we don't provide 
 an implementation, I expect many drivers will end up rolling their own.
 
 If this seems like a good idea, I'll happily whip up a patch and submit it.
 
 I think cxgb* already have an implementation.  For amd64 we should certainly 
 have bus_space_*_8(), at least for SYS_RES_MEMORY.  I think they should fail 
 for SYS_RES_IOPORT.  I don't think we can force a compile-time error though, 
 would just have to return -1 on reads or some such?

I believe it was because bus reads weren't guaranteed to be atomic on i386.  
don't know if that's still the case or a concern, but it was an intentional 
omission.

Warner
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: No bus_space_read_8 on x86 ?

2012-10-09 Thread John Baldwin
On Monday, October 08, 2012 4:59:24 pm Warner Losh wrote:
 
 On Oct 5, 2012, at 10:08 AM, John Baldwin wrote:
 
  On Thursday, October 04, 2012 1:20:52 pm Carl Delsey wrote:
  I noticed that the bus_space_*_8 functions are unimplemented for x86. 
  Looking at the code, it seems this is intentional.
  
  Is this done because on 32-bit systems we don't know, in the general 
  case, whether to read the upper or lower 32-bits first?
  
  If that's the reason, I was thinking we could provide two 
  implementations for i386: bus_space_read_8_upper_first and 
  bus_space_read_8_lower_first. For amd64 we would just have bus_space_read_8
  
  Anybody who wants to use bus_space_read_8 in their file would do 
  something like:
  #define BUS_SPACE_8_BYTES LOWER_FIRST
  or
  #define BUS_SPACE_8_BYTES UPPER_FIRST
  whichever is appropriate for their hardware.
  
  This would go in their source file before including bus.h and we would 
  take care of mapping to the correct implementation.
  
  With the prevalence of 64-bit registers these days, if we don't provide 
  an implementation, I expect many drivers will end up rolling their own.
  
  If this seems like a good idea, I'll happily whip up a patch and submit it.
  
  I think cxgb* already have an implementation.  For amd64 we should 
  certainly 
  have bus_space_*_8(), at least for SYS_RES_MEMORY.  I think they should 
  fail 
  for SYS_RES_IOPORT.  I don't think we can force a compile-time error 
  though, 
  would just have to return -1 on reads or some such?
 
 I believe it was because bus reads weren't guaranteed to be atomic on i386.
 don't know if that's still the case or a concern, but it was an intentional 
 omission.

True.  If you are on a 32-bit system you can read the two 4 byte values and
then build a 64-bit value.  For 64-bit platforms we should offer bus_read_8()
however.

-- 
John Baldwin
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: No bus_space_read_8 on x86 ?

2012-10-05 Thread John Baldwin
On Thursday, October 04, 2012 1:20:52 pm Carl Delsey wrote:
 I noticed that the bus_space_*_8 functions are unimplemented for x86. 
 Looking at the code, it seems this is intentional.
 
 Is this done because on 32-bit systems we don't know, in the general 
 case, whether to read the upper or lower 32-bits first?
 
 If that's the reason, I was thinking we could provide two 
 implementations for i386: bus_space_read_8_upper_first and 
 bus_space_read_8_lower_first. For amd64 we would just have bus_space_read_8
 
 Anybody who wants to use bus_space_read_8 in their file would do 
 something like:
 #define BUS_SPACE_8_BYTES LOWER_FIRST
 or
 #define BUS_SPACE_8_BYTES UPPER_FIRST
 whichever is appropriate for their hardware.
 
 This would go in their source file before including bus.h and we would 
 take care of mapping to the correct implementation.
 
 With the prevalence of 64-bit registers these days, if we don't provide 
 an implementation, I expect many drivers will end up rolling their own.
 
 If this seems like a good idea, I'll happily whip up a patch and submit it.

I think cxgb* already have an implementation.  For amd64 we should certainly 
have bus_space_*_8(), at least for SYS_RES_MEMORY.  I think they should fail 
for SYS_RES_IOPORT.  I don't think we can force a compile-time error though, 
would just have to return -1 on reads or some such?

-- 
John Baldwin
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


No bus_space_read_8 on x86 ?

2012-10-04 Thread Carl Delsey
I noticed that the bus_space_*_8 functions are unimplemented for x86. 
Looking at the code, it seems this is intentional.


Is this done because on 32-bit systems we don't know, in the general 
case, whether to read the upper or lower 32-bits first?


If that's the reason, I was thinking we could provide two 
implementations for i386: bus_space_read_8_upper_first and 
bus_space_read_8_lower_first. For amd64 we would just have bus_space_read_8


Anybody who wants to use bus_space_read_8 in their file would do 
something like:

#define BUS_SPACE_8_BYTES LOWER_FIRST
or
#define BUS_SPACE_8_BYTES UPPER_FIRST
whichever is appropriate for their hardware.

This would go in their source file before including bus.h and we would 
take care of mapping to the correct implementation.


With the prevalence of 64-bit registers these days, if we don't provide 
an implementation, I expect many drivers will end up rolling their own.


If this seems like a good idea, I'll happily whip up a patch and submit it.
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org