Replies and questions are in-lined below.

Thanks and Regards,
Shaveta

-----Original Message-----
From: edk2-devel [mailto:edk2-devel-boun...@lists.01.org] On Behalf Of Shaveta 
Leekha
Sent: Monday, March 28, 2016 11:20 PM
To: Andrew Fish <af...@apple.com>; Bill Paul <wp...@windriver.com>
Cc: edk2-de...@ml01.01.org
Subject: Re: [edk2] PCIe memory transaction issue



-----Original Message-----
From: edk2-devel [mailto:edk2-devel-boun...@lists.01.org] On Behalf Of Andrew 
Fish
Sent: Monday, March 28, 2016 11:14 PM
To: Bill Paul <wp...@windriver.com>
Cc: edk2-de...@ml01.01.org
Subject: Re: [edk2] PCIe memory transaction issue


> On Mar 28, 2016, at 9:55 AM, Bill Paul <wp...@windriver.com> wrote:
> 
> Of all the gin joints in all the towns in all the world, Bill Paul had 
> to walk into mine at 09:33:57 on Monday 28 March 2016 and say:
> 
>> Of all the gin joints in all the towns in all the world, Shaveta Leekha had
>> to walk into mine at 00:29:39 on Monday 28 March 2016 and say:
>>> Hi,
>>> 
>>> I am facing an issue with PCIe memory transactions.
>>> 
>>> The scenario is:
>>> 
>>> Case 1:
>>> In our system, we have allocated 32-bit memory space to one of the
>>> PCI devices (an E1000 NIC card)
>> 
>> You did not say which Intel PRO/1000 card (vendor/device ID). There 
>> are literally dozens of them. (It's actually not that critical, but 
>> I'm
>> curious.)
>> 
>>> during enumeration and BAR programming. When the NIC card is used to
>>> transmit a ping packet, the local buffer is allocated from 32-bit main
>>> memory space. In this case, the packet is sent out successfully.
>>> 
>>> 
>>> Case 2:
>>> Now, when the NIC card is used to transmit a ping packet and the local
>>> buffer is allocated from 64-bit main memory space, the packet fails to
>>> go out.
>>> 
>>> Doubt 1: Is it possible for this PCI device (the NIC card, in our case)
>>> to access this 64-bit address space when sending the packet out of the
>>> system?
>> 
>> I don't know offhand how the UEFI PRO/1000 driver handles this, but I 
>> know that pretty much all Intel PRO/1000 cards support 64-bit DMA addressing.
>> 
>> Some older PCI cards, like, say, the Intel 82557/8/9 PRO/100 cards, 
>> only support 32-bit addressing. That means that they only accept DMA 
>> source/target addresses that are 32-bits wide. For those, if you have 
>> a 64-bit system, you must use "bounce buffering." That is, the device 
>> can only DMA from addresses within the first 4GB of physical memory.
>> If you have a packet buffer outside that window, then you have to 
>> copy it to a temporary buffer inside the window first (i.e. "bounce"
>> it) and then set up the DMA transfer from that location instead.
>> 
>> This requires you to be able to allocate some storage from specific 
>> physical address regions (i.e. you have to ensure the storage is 
>> inside the 4GB window).
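
In EDK2 terms, that kind of below-4 GB allocation looks roughly like the
sketch below, assuming boot-services context (the AllocateBounceBuffer()
wrapper name is only for illustration):

  #include <Uefi.h>
  #include <Library/UefiBootServicesTableLib.h>   // gBS

  EFI_STATUS
  AllocateBounceBuffer (
    IN  UINTN                 NumberOfBytes,
    OUT EFI_PHYSICAL_ADDRESS  *Buffer
    )
  {
    EFI_PHYSICAL_ADDRESS  Address;
    EFI_STATUS            Status;

    //
    // Ask for pages at or below the 4 GB boundary so a 32-bit DMA master
    // can reach them; packets are then copied ("bounced") through here.
    //
    Address = 0xFFFFFFFF;
    Status  = gBS->AllocatePages (
                     AllocateMaxAddress,
                     EfiBootServicesData,
                     EFI_SIZE_TO_PAGES (NumberOfBytes),
                     &Address
                     );
    if (!EFI_ERROR (Status)) {
      *Buffer = Address;
    }
    return Status;
  }
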
>> 
>> However the PRO/1000 doesn't have this limitation: you can specify 
>> fully qualified 64-bit addresses for both the RX and TX DMA ring base 
>> addresses and the packet buffers in the DMA descriptors, so you never 
>> need bounce buffering. This was true even for the earliest PCI-X
>> PRO/1000 NICs, and is still true for the PCIe ones.
>> 
>> For the base addresses, you have two 32-bit registers: one for the 
>> upper 32 bits and one for the lower 32 bits. You have to initialize 
>> both. Drivers written for 32-bit systems will often hard code the 
>> upper 32 bits of the address fields to 0. If you use that same driver 
>> code on a 64-bit system, DMA transfers will still be initiated, but 
>> the source/target addresses will be wrong.
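
In EDK2 terms, the high/low split for the PRO/1000 TX descriptor ring base
looks roughly like the sketch below. The TDBAL/TDBAH/TDLEN offsets are the
ones listed in the Intel 8254x family datasheet; double-check them against
your exact device, and note that a production UEFI driver would normally go
through PciIo->Mem.Write() rather than raw MMIO. The ProgramTxRingBase() name
is only for illustration.

  #include <Uefi.h>
  #include <Library/BaseLib.h>    // RShiftU64()
  #include <Library/IoLib.h>      // MmioWrite32()

  #define E1000_TDBAL  0x3800     // TX descriptor base address, low 32 bits
  #define E1000_TDBAH  0x3804     // TX descriptor base address, high 32 bits
  #define E1000_TDLEN  0x3808     // TX descriptor ring length in bytes

  VOID
  ProgramTxRingBase (
    IN UINTN                 MmioBase,       // CPU address of the register BAR
    IN EFI_PHYSICAL_ADDRESS  RingDeviceAddr, // DeviceAddress returned by Map()
    IN UINT32                RingBytes
    )
  {
    //
    // Both halves must be written: hard-coding TDBAH to 0 silently truncates
    // 64-bit buffer addresses to the low 4 GB.
    //
    MmioWrite32 (MmioBase + E1000_TDBAL, (UINT32)RingDeviceAddr);
    MmioWrite32 (MmioBase + E1000_TDBAH, (UINT32)RShiftU64 (RingDeviceAddr, 32));
    MmioWrite32 (MmioBase + E1000_TDLEN, RingBytes);
  }
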
>> 
>>> Doubt 2: If a device is allocated 32-bit memory-mapped space from the
>>> 32-bit memory area, can we still use 64-bit memory space for packet
>>> transactions?
>> 
>> Just to clarify: do not confuse the BAR mappings with DMA. They are 
>> two different concepts. I think a 64-bit BAR allows you to map the 
>> device's register bank anywhere within the 64-bit address space, 
>> whereas with a 32-bit BAR you have to map the registers within the 
>> first 4GB of address space (preferably somewhere that doesn't overlap 
>> RAM). However that has nothing to do with how DMA works: even with 
>> the PRO/1000's BARs mapped to a 32-bit region, you should still be 
>> able to perform DMA transfers to/from any 64-bit address.
>> 
>> The BARs use an outbound window, i.e. the host issues outbound read/write
>> requests and the device is the target of those requests.
>> 
>> DMA transfers use an inbound window, i.e. the device issues read/write
>> requests and the host is the target of those requests.
>> 
>> The PRO/100 requires 32-bit addressing for both inbound and outbound 
>> requests.
>> 
>> The PRO/1000 can use 64-bit addressing.
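
In UEFI driver terms, a 64-bit-capable device such as the PRO/1000 also has
to tell the PCI bus driver that it can generate 64-bit (dual address cycle)
transactions, by enabling EFI_PCI_IO_ATTRIBUTE_DUAL_ADDRESS_CYCLE. If I
remember the PciBusDxe behavior correctly, a device without this attribute is
mapped as a 32-bit bus master, which forces bouncing (or failure) for buffers
above 4 GB. A minimal sketch, assuming PciIo is already opened (the
EnableDevice64BitDma() name is only for illustration):

  #include <Uefi.h>
  #include <Protocol/PciIo.h>

  EFI_STATUS
  EnableDevice64BitDma (
    IN EFI_PCI_IO_PROTOCOL  *PciIo
    )
  {
    //
    // Enable memory decode, bus mastering, and dual address cycle so the
    // bus/root-bridge drivers know the device can DMA above 4 GB.
    //
    return PciIo->Attributes (
                    PciIo,
                    EfiPciIoAttributeOperationEnable,
                    EFI_PCI_IO_ATTRIBUTE_MEMORY |
                    EFI_PCI_IO_ATTRIBUTE_BUS_MASTER |
                    EFI_PCI_IO_ATTRIBUTE_DUAL_ADDRESS_CYCLE,
                    NULL
                    );
  }
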
> 
> Oh, sorry, there's something else I forgot to mention:
> 
> In addition to writing the PRO/1000 driver to correctly support 64-bit 
> DMA addressing, it's sometimes necessary to program the PCIe 
> controller itself correctly as well. I actually don't know how you'd 
> do this on Intel IA32 or
> X64 platforms, because it involves low-level chipset initialization 
> which is considered "secret sauce" by Intel.
> 
> But for ARM and PPC SoCs (like those made by Freescale/NXP), I know 
> that you have to program the outbound and inbound window sizes and 
> translation offsets in order for all transfers to work. (I had to do 
> this for the VxWorks drivers for the Freescale/NXP P4080 and T4240 
> PCIe controllers.)
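
The shape of that controller-side setup is roughly as sketched below. The
ConfigureOutboundWindow()/ConfigureInboundWindow() helpers, their parameters,
and AllDramBytes are purely hypothetical placeholders; the real register
sequence comes from the SoC reference manual (ATU/window registers on
Layerscape-class parts).

  //
  // Hypothetical sketch only: names and parameters are placeholders for the
  // SoC-specific window/ATU programming.
  //
  VOID
  ConfigurePcieWindows (
    IN UINT64  CpuMmioBase,   // where the CPU sees PCI memory space
    IN UINT64  PciMmioBase,   // what that region looks like on the PCI bus
    IN UINT64  MmioSize,
    IN UINT64  AllDramBytes
    )
  {
    //
    // Outbound: CPU accesses to the MMIO aperture are forwarded to the PCI
    // bus. This is what makes BAR reads/writes work.
    //
    ConfigureOutboundWindow (CpuMmioBase, PciMmioBase, MmioSize);

    //
    // Inbound: bus-master (DMA) requests from the device are forwarded to
    // system DRAM. If this window only covers the low 4 GB, a 64-bit buffer
    // address handed to the NIC falls outside it and the transfer goes
    // nowhere.
    //
    ConfigureInboundWindow (/* PciBase */ 0, /* CpuBase */ 0, AllDramBytes);
  }
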
> 
> You didn't mention what platform you're using so I don't know if this 
> applies here.
> 

From a PCI ROM point of view the SoC/chipset should not matter, as the PCI IO
protocol produced by the PCI Bus driver abstracts that info from the driver.
Technically speaking the PCI Bus driver is generic and the chipset is
abstracted via the EFI_PCI_ROOT_BRIDGE_IO_PROTOCOL.

Thus all the PCI EFI Driver needs to do is follow the PCI IO rules for doing 
DMA. 

For example:
DMA Bus Master Common Buffer Operation
Call AllocateBuffer() to allocate a common buffer.

[SHAVETA]
In the RootBridgeIoAllocateBuffer() function, the buffer is allocated from
64-bit address space, since main memory (DDR) now lives in the 64-bit address
space.


Call Map() for EfiPciIoOperationBusMasterCommonBuffer.

[SHAVETA]
In the RootBridgeIoMap() function, I am not allocating a bounce buffer below
4 GB to map the transfer to. Instead, the buffer is mapped from one 64-bit
region to another 64-bit region in memory.
Can this be an issue?
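
That is exactly the decision point to check. Paraphrasing the logic a
RootBridgeIoMap()-style implementation typically applies (a simplified
sketch, not the literal upstream code; the MapNeedsBounceBuffer() helper name
is only for illustration, and the real code also consults whether the root
bridge itself supports DMA above 4 GB):

  #include <Uefi.h>
  #include <Protocol/PciRootBridgeIo.h>

  BOOLEAN
  MapNeedsBounceBuffer (
    IN EFI_PHYSICAL_ADDRESS                       HostAddress,
    IN UINTN                                      NumberOfBytes,
    IN EFI_PCI_ROOT_BRIDGE_IO_PROTOCOL_OPERATION  Operation
    )
  {
    //
    // A buffer that sits entirely below 4 GB never needs bouncing.
    // (SIZE_4GB comes from MdePkg's Base.h.)
    //
    if ((HostAddress + NumberOfBytes) <= SIZE_4GB) {
      return FALSE;
    }

    //
    // Above 4 GB, only the 64-bit operation variants may be mapped in place.
    // The plain 32-bit read/write operations must be bounced below 4 GB; a
    // 32-bit common-buffer mapping above 4 GB typically just fails, since
    // copying would defeat the point of a shared buffer.
    //
    return (BOOLEAN)(Operation == EfiPciOperationBusMasterRead   ||
                     Operation == EfiPciOperationBusMasterWrite  ||
                     Operation == EfiPciOperationBusMasterCommonBuffer);
  }

So mapping a 64-bit buffer in place is only correct if the device really ends
up doing 64-bit DMA and the inbound window covers that DRAM; otherwise Map()
is expected to bounce the transfer below 4 GB.
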



Program the DMA Bus Master with the DeviceAddress returned by Map().

[SHAVETA] How is this part done?
My guess is that, in my case, either the DMA bus master is not able to fetch
the buffer properly, or there is some issue in the outbound window
programming. Does that come into the picture here?



The common buffer can now be accessed equally by the processor and the DMA bus 
master. 


Call Unmap().
Call FreeBuffer().
[SHAVETA] Done.
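
Putting these steps together at the PciIo level, a minimal sketch (error
handling trimmed; the CommonBufferDmaExample() name and the 4 KB size are only
for illustration, and step 3 is whatever your device actually expects):

  #include <Uefi.h>
  #include <Protocol/PciIo.h>

  EFI_STATUS
  CommonBufferDmaExample (
    IN EFI_PCI_IO_PROTOCOL  *PciIo
    )
  {
    EFI_STATUS            Status;
    VOID                  *HostAddress;
    EFI_PHYSICAL_ADDRESS  DeviceAddress;
    UINTN                 NumberOfBytes;
    VOID                  *Mapping;

    NumberOfBytes = 4096;

    //
    // 1. AllocateBuffer(): the bus driver picks memory that both the CPU
    //    and the DMA bus master can reach.
    //
    Status = PciIo->AllocateBuffer (
                      PciIo,
                      AllocateAnyPages,
                      EfiBootServicesData,
                      EFI_SIZE_TO_PAGES (NumberOfBytes),
                      &HostAddress,
                      0
                      );
    if (EFI_ERROR (Status)) {
      return Status;
    }

    //
    // 2. Map() for common-buffer operation to get the address the *device*
    //    must use. DeviceAddress is not guaranteed to equal HostAddress.
    //
    Status = PciIo->Map (
                      PciIo,
                      EfiPciIoOperationBusMasterCommonBuffer,
                      HostAddress,
                      &NumberOfBytes,
                      &DeviceAddress,
                      &Mapping
                      );
    if (EFI_ERROR (Status)) {
      goto FreeBuffer;
    }

    //
    // 3. Program the bus master with DeviceAddress (never HostAddress), e.g.
    //    the descriptor ring base registers discussed above.
    //
    // 4. CPU and device now share the buffer. Call PciIo->Flush() after
    //    posting writes if needed, then tear down in reverse order.
    //
    PciIo->Unmap (PciIo, Mapping);

  FreeBuffer:
    PciIo->FreeBuffer (PciIo, EFI_SIZE_TO_PAGES (NumberOfBytes), HostAddress);
    return Status;
  }
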


The most common problem is code that does not properly follow the DMA rules;
because DMA on IA32/X64 is cache coherent, such code seems to work anyway. The
problem is that this code will fail on very large x86 servers and on ARM-based
platforms.

[Shaveta] That's very true. On our Layerscape ARMv8-based platform, I had to
add many cache-maintenance operations in the E1000 driver to keep memory
coherent and make things work.

Is there similar working code for ARM platforms?
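
For reference, the kind of cache maintenance a non-coherent ARM platform
forces on an E1000-style driver typically looks like the sketch below, using
MdePkg's CacheMaintenanceLib (the helper names are only for illustration).
The cleaner long-term fix is to keep all of this behind Map()/Unmap()/Flush()
so the driver itself stays architecture-neutral.

  #include <Uefi.h>
  #include <Library/CacheMaintenanceLib.h>

  //
  // CPU wrote the buffer/descriptor, device will read it (TX path):
  // push dirty cache lines out to DRAM so the device sees current data.
  //
  VOID
  PrepareBufferForDeviceRead (
    IN VOID   *Buffer,
    IN UINTN  Length
    )
  {
    WriteBackDataCacheRange (Buffer, Length);
  }

  //
  // Device wrote the buffer/descriptor, CPU will read it (RX path):
  // discard stale cache lines so the CPU reads what the device wrote.
  //
  VOID
  PrepareBufferForCpuRead (
    IN VOID   *Buffer,
    IN UINTN  Length
    )
  {
    InvalidateDataCacheRange (Buffer, Length);
  }
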

Thanks and Regards,
Shaveta

Thanks,

Andrew Fish



> -Bill
> 
>> -Bill
>> 
>>> Thanks and Regards,
>>> Shaveta
>>> 
>>> 
>>> The resource map for the PCI bridge and the one PCI device on bus 1 is:
>>> 
>>> PciBus: Resource Map for Root Bridge PciRoot(0x0)
>>> Type =   Io16; Base = 0x0;      Length = 0x1000;        Alignment = 0xFFF
>>>    Base = 0x0;  Length = 0x1000;        Alignment = 0xFFF;      Owner = PPB [00|00|00:**]
>>> Type =  Mem32; Base = 0x78000000;       Length = 0x5100000;     Alignment = 0x3FFFFFF
>>>    Base = 0x78000000;   Length = 0x4000000;     Alignment = 0x3FFFFFF;  Owner = PPB [00|00|00:14]
>>>    Base = 0x7C000000;   Length = 0x1000000;     Alignment = 0xFFFFFF;   Owner = PPB [00|00|00:10]
>>>    Base = 0x7D000000;   Length = 0x100000;      Alignment = 0xFFFFF;    Owner = PPB [00|00|00:**]
>>>
>>> PciBus: Resource Map for Bridge [00|00|00]
>>> Type =   Io16; Base = 0x0;      Length = 0x1000;        Alignment = 0xFFF
>>>    Base = 0x0;  Length = 0x20;  Alignment = 0x1F;       Owner = PCI [01|00|00:18]
>>> Type =  Mem32; Base = 0x78000000;       Length = 0x4000000;     Alignment = 0x3FFFFFF
>>> 
>>> gArmPlatformTokenSpaceGuid.PcdPciMmio32Base|0x40000000
>>> 
>>>  gArmPlatformTokenSpaceGuid.PcdPciMmio32Size|0x40000000      # 128M
>>>  gArmPlatformTokenSpaceGuid.PcdPciMemTranslation|0x1400000000
>>>  gArmPlatformTokenSpaceGuid.PcdPciMmio64Base|0x1440000000
>>>  gArmPlatformTokenSpaceGuid.PcdPciMmio64Size|0x40000000
>>> 
> 
> --
> =============================================================================
> -Bill Paul            (510) 749-2329 | Senior Member of Technical Staff,
>                 wp...@windriver.com | Master of Unix-Fu - Wind River Systems
> =============================================================================
>   "I put a dollar in a change machine. Nothing changed." - George Carlin
> =============================================================================

_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel
