The caches are being enabled on the RPI 1 BSP. The same code is executed by the RPI 2 BSP, but it is evidently not sufficient for the cache setup there. I have been reading through this long thread, which is very informative:
https://www.raspberrypi.org/forums/viewtopic.php?f=72&t=98904
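For reference, here is a minimal sketch in C of the bits that this kind of setup usually involves. It is based on my reading of the ARMv7-A architecture documentation, not on the RTEMS BSP code or on the code in the forum thread. All macro and function names are made up for illustration, and the attribute encodings assume the short-descriptor page table format with TEX remap disabled (SCTLR.TRE = 0). It only computes the register and table values; actually writing them needs CP15 accesses and cache/TLB invalidation, which are left out.

```c
#include <assert.h>
#include <stdint.h>

/* SCTLR (CP15 c1, c0, 0) control bits as defined for ARMv7-A. */
#define SCTLR_M (1u << 0)   /* MMU enable */
#define SCTLR_C (1u << 2)   /* data/unified cache enable */
#define SCTLR_Z (1u << 11)  /* branch prediction enable */
#define SCTLR_I (1u << 12)  /* instruction cache enable */

/* First-level 1 MiB section descriptor fields (short-descriptor format). */
#define SECT_TYPE    (2u << 0)               /* bits[1:0] = 0b10: section */
#define SECT_B       (1u << 2)
#define SECT_C       (1u << 3)
#define SECT_AP_RW   (3u << 10)              /* AP[1:0] = 0b11: read/write */
#define SECT_TEX(x)  ((uint32_t)(x) << 12)   /* TEX[2:0] */

/* Normal memory: inner and outer write-back, write-allocate. */
#define SECT_MEM_WBWA (SECT_TEX(1) | SECT_C | SECT_B)
/* Framebuffer-style mapping: outer write-back/write-allocate (TEX = 0b101),
 * inner write-through, no write-allocate (C = 1, B = 0). */
#define SECT_MEM_FB   (SECT_TEX(5) | SECT_C)

/* Hypothetical helper: returns the SCTLR value with MMU, caches and
 * branch prediction enabled, leaving all other bits untouched. */
uint32_t sctlr_with_caches(uint32_t sctlr)
{
  return sctlr | SCTLR_M | SCTLR_C | SCTLR_Z | SCTLR_I;
}

/* Hypothetical helper: builds a section entry in domain 0 for a
 * 1 MiB-aligned physical base address with the given memory attributes. */
uint32_t section_entry(uint32_t phys_base, uint32_t mem_attr)
{
  assert((phys_base & 0xFFFFFu) == 0);  /* sections are 1 MiB aligned */
  return phys_base | mem_attr | SECT_AP_RW | SECT_TYPE;
}
```

For example, section_entry(0x00100000, SECT_MEM_WBWA) describes the second megabyte of RAM as fully cacheable. Domain 0 would still have to be set to client or manager access in the DACR for such an entry to be usable.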
I am starting to understand the setup that is required to enable caches on the RPI 2. For example, this message near the bottom of page 3 gives a good indication of the speedup available by configuring the MMU and caches correctly:

Quote from above thread
------------------------------
Enabling I/D caches and branch prediction, just like the julia demo uses, it takes ~12 seconds, or ~21 fps. It's just one core but also a much smaller loop than the julia demo has.

Enabling the MMU and mapping memory inner/outer write-back, write allocate and the framebuffer inner write-through, no write allocate + outer write-back, write-allocate it takes ~8 seconds, of 32 fps.

PS: 640x480x32 with MMU gets me ~256 fps. Must have a greater L2 cache effect.
-------------------------
End of quote

The person who posted the above comment (mrvn) posted the code here:
https://github.com/mrvn/test/blob/master/mmu.cc

Also, it seems that when the Pi 2 starts up, cores 1-3 are put in a wait loop always accessing the bus. By putting this option in the config.txt file you can put the other cores to sleep, speeding up the code on core 1.

arm_control=0x1000

It would be worth trying that option to see if the benchmark speeds up.

Alan

> On Jun 2, 2015, at 8:05 AM, Hesham ALMatary <heshamelmat...@gmail.com> wrote:
>
> On Tue, Jun 2, 2015 at 12:41 PM, Rohini Kulkarni <krohini1...@gmail.com> wrote:
>> From what I saw, they have to be enabled separately. Cache/mmu are disabled upon reset.
>>
> For the existing Raspberry BSP [1] there's a code for MMU/Cache init,
> however I don't know about Pi2 and where its code is.
>
> [1] https://github.com/RTEMS/rtems/tree/master/c/src/lib/libbsp/arm/raspberrypi
>
>> On 2 Jun 2015 16:59, "Hesham ALMatary" <heshamelmat...@gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> Aren't the MMU/Caches enabled by default for RPi [1]?
>>>
>>> [1] https://github.com/RTEMS/rtems/blob/master/c/src/lib/libbsp/arm/shared/mminit.c
>>>
>>> On Tue, Jun 2, 2015 at 12:18 PM, Joel Sherrill <joel.sherr...@oarcorp.com> wrote:
>>>>
>>>> On June 2, 2015 7:01:21 AM EDT, Rohini Kulkarni <krohini1...@gmail.com> wrote:
>>>>> Dr. Joel,
>>>>>
>>>>> So we can't say something solely on the basis of this result?
>>>>
>>>> I don't think so. If Linux performs the same, then what you did is as good as it gets.
>>>>
>>>> However, if Linux is faster then some setting still isn't right.
>>>>
>>>> You need a reference measurement to have any confidence. It is possible you did something but didn't actually turn the cache (or all the cache) on.
>>>>
>>>>> On 2 Jun 2015 16:28, "Rohini Kulkarni" <krohini1...@gmail.com> wrote:
>>>>>
>>>>> I have not run it under linux on pi2 yet. Will have to run and check the result.
>>>>>
>>>>> On 2 Jun 2015 16:16, "Joel Sherrill" <joel.sherr...@oarcorp.com> wrote:
>>>>>
>>>>> On June 2, 2015 5:58:33 AM EDT, Rohini Kulkarni <krohini1...@gmail.com> wrote:
>>>>>> HI,
>>>>>>
>>>>>> I tried running the dhrystone benchmark with some changes for cache/mmu set up.
>>>>>>
>>>>>> However, the output shows a reduction in performance.
>>>>>> The time to run through the dhrystone has increased from 12 to 13 and dhrystones run per second decreased.
>>>>>>
>>>>>> According to this result, things were better with caches disabled.
>>>>>>
>>>>>> I have been working on this since two days and could not figure out an improvement. Any pointers?
>>>>>
>>>>> How did it do under Linux on the Pi2?
>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> On Thu, May 28, 2015 at 8:41 PM, Rohini Kulkarni <krohini1...@gmail.com> wrote:
>>>>>>
>>>>>> Hi All,
>>>>>>
>>>>>> I have to implement the cache coherency support for Cortex A7. But for A7 MPCore, unlike for A9, I am not able to find any register description for the Snoop Control Unit from the TRM.
>>>>>>
>>>>>> I need help here on how to proceed.
>>>>>>
>>>>>> Additionally for A9 there is a single bit for A9 in the Auxiliary Control Register which enables cache broadcast operations. The register format is different for A7 and again I am unable to find how to achieve the same for A7.
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> On Tue, May 5, 2015 at 10:42 PM, Joel Sherrill <joel.sherr...@oarcorp.com> wrote:
>>>>>>
>>>>>> On 5/5/2015 11:11 AM, Rohini Kulkarni wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am working with the code for bsp hooks. I am referring to existing ARM multicore bsp codes, zync mainly.
>>>>>>
>>>>>> 1. There are existing hooks for the raspberry pi. Where should the code for the Pi2 hooks be added?
>>>>>>
>>>>>> The Pi and Pi2 are remarkably similar so Pi2 should be placed inside the Pi BSP directory. There is already a Pi2 variant of that code built. But we know specific places where there are variances. Depending on the scope of what is different, it can be as simple as a cpp conditional in a .h to select a value or two implementations of a single method and the Makefile.am picking the right file to build based on the board variant.
>>>>>>
>>>>>> The big question to always ask is: Is this specific to the Pi2 and incompatible with the Pi?
>>>>>>
>>>>>> Since the Pi BSP is still missing capabilities, it is likely code common to both will be added this summer. For example, did the mailbox interface change?
>>>>>> I don't know but would guess that it didn't. Each new capability added needs that added.
>>>>>>
>>>>>> And any differences need to be analyzed to pick the least intrusive way to provide alternate implementations. Or enable special code like the Pi2 SMP support which is dependent on --enable-smp and being a Pi2.
>>>>>>
>>>>>> 2. Am I right in understanding that I will have to implement A7 specific functions as have been for A9? I am referring specifically to the arm-a9mpcore-start.h
>>>>>>
>>>>>> Yes.
>>>>>>
>>>>>> If the code is very similar between the a7 and a9, then a discussion on devel@ should occur to decide the best way to minimize duplication.
>>>>>>
>>>>>> If you end up with a7 specific code, you should follow the location and naming patterns already established. That places it in libbsp/arm/shared/... so it can be used by any BSP with the right SMP core.
>>>>>>
>>>>>> I am referring to existing codes to locate and get hold of what needs to be done in the hooks. However, being new to such implementations, I am taking longer to understand the details. Any suggestions that might help here are welcome
>>>>>>
>>>>>> The answer will depend on the factors listed above. When code can be shared, we want to share it across as many BSPs as makes sense. When it is unique to a specific BSP **variant** (e.g. Pi vs Pi2), then you want to find the way to account for the variation in the least intrusive code way possible.
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> On 1 May 2015 12:45, "Rohini Kulkarni" <krohini1...@gmail.com> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Excited to be a part of this edition of GSoC! Thanks to everybody for helping me get here and congratulations to all the participating students!
>>>>>>
>>>>>> So, now getting to work, firstly I wish to know, specifically from my mentors, any changes that must be made to my proposed project or schedule.
>>>>>>
>>>>>> Secondly, are there any specifics for the development blog that we need to create for the project? Over time what is the blog expected to convey.
>>>>>>
>>>>>> Also, I have to create a new wiki page for my project as none exists. I want to know how to add one.
>>>>>>
>>>>>> --
>>>>>>
>>>>>> Rohini Kulkarni
>>>>>>
>>>>>> --
>>>>>> Joel Sherrill, Ph.D.             Director of Research & Development
>>>>>> joel.sherr...@oarcorp.com        On-Line Applications Research
>>>>>> Ask me about RTEMS: a free RTOS  Huntsville AL 35805
>>>>>> Support Available                (256) 722-9985
>>>>>>
>>>>>> --
>>>>>>
>>>>>> Rohini Kulkarni
>>>>>>
>>>>>> --
>>>>>>
>>>>>> Rohini Kulkarni
>>>>>
>>>>> --joel
>>>>
>>>> --joel
>>>> _______________________________________________
>>>> devel mailing list
>>>> devel@rtems.org
>>>> http://lists.rtems.org/mailman/listinfo/devel
>>>
>>> --
>>> Hesham
>
> --
> Hesham
> _______________________________________________
> devel mailing list
> devel@rtems.org
> http://lists.rtems.org/mailman/listinfo/devel