I forgot to mention the number of dhrystones in my previous mail. The original run used 1 million; the new run used 100 million.

On Mon, Jun 22, 2015 at 12:29 AM, Rohini Kulkarni <krohini1...@gmail.com> wrote:

> Hi all,
>
> I have managed to get a significant performance improvement with some
> changes in configurations.
>
> The measured time was for dhrystones reduced from 12 to "too small to be
> measured "
>
> For dhrystones the time was 0.4.
>
> The number of dhrystones per second increased from approximately 83333 to
> 2500000 :)
>
> Thanks!
>
> On Sun, Jun 21, 2015 at 1:32 AM, Rohini Kulkarni <krohini1...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I have added an SMP related post to my blog to define where exactly in
>> the code I need to work. Some feedback to indicate if I am identifying the
>> work area correctly would be very helpful!
>>
>> Thanks!
>> On 18 Jun 2015 03:37, "Rohini Kulkarni" <krohini1...@gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> I have updated my blog to reflect my understanding and attempts for
>>> cache performance issue.
>>>
>>> Lately I have been trying around memory attributes for the
>>> mm_config_table. One set of configurations for cacheable memory (inner and
>>> outer levels) ended up reducing performance further (which I really thought
>>> would improve). So this table set up certainly controls performance.
>>>
>>> The results are not improving after turning on cache. So memory sections
>>> are perhaps not even getting cached.
>>> I get a feeling it has got to do with this mm_config_table.
>>>
>>> Updates from the github code and blog might help in further discussion.
>>>
>>> Link to github code: https://github.com/krohini1593/rtems/tree/rohini
>>>
>>> Link to Blog <http://rohiniwithrpi2.blogspot.in/p/blog-page_3.html>
>>>
>>> Thanks!
>>>
>>> On Mon, Jun 15, 2015 at 8:29 PM, Alan Cudmore <alan.cudm...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>> Some of the code examples may give you some clues. Like this one:
>>>> https://github.com/mrvn/test/blob/master/smp.cc
>>>>
>>>> Or this:
>>>> https://github.com/PeterLemon/RaspberryPi/tree/master/SMP/SMPINIT
>>>>
>>>> If you still can't figure it out, you can always join the
>>>> raspberrypi.org forums and ask on this thread:
>>>> https://www.raspberrypi.org/forums/viewtopic.php?f=72&t=98904
>>>>
>>>> When it comes to the Pi 2 and SMP, you are our RTEMS expert :)
>>>>
>>>> Thanks,
>>>> Alan
>>>>
>>>> On Sat, Jun 13, 2015 at 2:29 PM, Rohini Kulkarni <krohini1...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> This is regarding Pi 2 SMP support. After powering on, the secondary
>>>>> mailboxes read one of their four mailbox registers and wait for a non-zero
>>>>> content to be written. This content is to be the physical address of the
>>>>> location from where the cores are expected to start execution.
>>>>>
>>>>> I am stuck at figuring out this address. How should I go about
>>>>> understanding this?
>>>>>
>>>>> Thanks!
>>>>> On 3 Jun 2015 19:44, "Gedare Bloom" <ged...@gwu.edu> wrote:
>>>>>
>>>>>> On Wed, Jun 3, 2015 at 2:39 AM, Rohini Kulkarni <krohini1...@gmail.com> wrote:
>>>>>> > But, I can't say cache configurations have a role here.
>>>>>> >
>>>>>> > I'll push my code to my github project soon.
>>>>>> >
>>>>>> > P.S. The Pi2 board I possess seems to have broken down. It just isn't
>>>>>> > turning on. Unable to test further. Will order one immediately.
>>>>>> >
>>>>>> Ouch. Make sure you put it in a safe space for development, clear of
>>>>>> threats like moisture, static shock, and cats.
>>>>>>
>>>>>> > On 3 Jun 2015 09:03, "Rohini Kulkarni" <krohini1...@gmail.com> wrote:
>>>>>> >>
>>>>>> >> Hi,
>>>>>> >>
>>>>>> >> Alan, your suggestion has resulted in much improvement
>>>>>> >>
>>>>>> >> arm_control=0x1000
>>>>>> >>
>>>>>> >> This has simply worked! Looks like the other cores were taking up plenty
>>>>>> >> of time.
>>>>>> >> I was aware from references that the other cores run a WFI, but ya, did
>>>>>> >> not get its impact.
>>>>>> >> Time for each dhrystone has reduced to 7 from 13 and the no of dhrystones
>>>>>> >> per second also increased.
>>>>>> >>
>>>>>> >> But this is a change only in the config.txt not actually in the boot code.
>>>>>> >>
>>>>>> >> Thanks
>>>>>> >>
>>>>>> >> Rohini
>>>>>> >>
>>>>>> >> On Wed, Jun 3, 2015 at 7:12 AM, Alan Cudmore <alan.cudm...@gmail.com>
>>>>>> >> wrote:
>>>>>> >>>
>>>>>> >>> The caches are being enabled on the RPI 1 BSP. The same code is being
>>>>>> >>> executed by the RPI 2 BSP, but obviously it’s not sufficient for the cache
>>>>>> >>> setup.
>>>>>> >>> I have been reading through this long thread, and it is very informative:
>>>>>> >>> https://www.raspberrypi.org/forums/viewtopic.php?f=72&t=98904
>>>>>> >>>
>>>>>> >>> I am starting to understand the setup that is required to enable caches
>>>>>> >>> on the RPI 2. For example this message near the bottom of page 3 gives a
>>>>>> >>> good indication of the speedup available by configuring the MMU and caches
>>>>>> >>> correctly:
>>>>>> >>> Quote from above thread
>>>>>> >>> ------------------------------
>>>>>> >>> Enabling I/D caches and branch prediction, just like the julia demo uses,
>>>>>> >>> it takes ~12 seconds, or ~21 fps. It's just one core but also a much smaller
>>>>>> >>> loop than the julia demo has.
>>>>>> >>>
>>>>>> >>> Enabling the MMU and mapping memory inner/outer write-back, write
>>>>>> >>> allocate and the framebuffer inner write-through, no write allocate + outer
>>>>>> >>> write-back, write-allocate it takes ~8 seconds, of 32 fps.
>>>>>> >>>
>>>>>> >>> PS: 640x480x32 with MMU gets me ~256 fps. Must have a greater L2 cache
>>>>>> >>> effect.
>>>>>> >>> -------------------------
>>>>>> >>> End of quote
>>>>>> >>>
>>>>>> >>> The person who posted the above comment (mrvn) posted the code here:
>>>>>> >>> https://github.com/mrvn/test/blob/master/mmu.cc
>>>>>> >>>
>>>>>> >>> Also, it seems that when the Pi 2 starts up, cores 1-3 are put in a wait
>>>>>> >>> loop always accessing the bus. By putting this option in the config.txt file
>>>>>> >>> you can put the other cores to sleep, speeding up the code on core 1.
>>>>>> >>> arm_control=0x1000
>>>>>> >>> It would be worth trying that option to see if the benchmark speeds up.
>>>>>> >>>
>>>>>> >>> Alan
>>>>>> >>>
>>>>>> >>> On Jun 2, 2015, at 8:05 AM, Hesham ALMatary <heshamelmat...@gmail.com>
>>>>>> >>> wrote:
>>>>>> >>>
>>>>>> >>> On Tue, Jun 2, 2015 at 12:41 PM, Rohini Kulkarni <krohini1...@gmail.com>
>>>>>> >>> wrote:
>>>>>> >>>
>>>>>> >>> From what I saw, they have to be enabled separately. Cache/mmu are
>>>>>> >>> disabled upon reset.
>>>>>> >>>
>>>>>> >>> For the existing Raspberry BSP [1] there's a code for MMU/Cache init,
>>>>>> >>> however I don't know about Pi2 and where its code is.
>>>>>> >>>
>>>>>> >>> [1]
>>>>>> >>> https://github.com/RTEMS/rtems/tree/master/c/src/lib/libbsp/arm/raspberrypi
>>>>>> >>>
>>>>>> >>> On 2 Jun 2015 16:59, "Hesham ALMatary" <heshamelmat...@gmail.com> wrote:
>>>>>> >>>
>>>>>> >>> Hi,
>>>>>> >>>
>>>>>> >>> Aren't the MMU/Caches enabled by default for RPi [1]?
>>>>>> >>>
>>>>>> >>> [1]
>>>>>> >>> https://github.com/RTEMS/rtems/blob/master/c/src/lib/libbsp/arm/shared/mminit.c
>>>>>> >>>
>>>>>> >>> On Tue, Jun 2, 2015 at 12:18 PM, Joel Sherrill
>>>>>> >>> <joel.sherr...@oarcorp.com> wrote:
>>>>>> >>>
>>>>>> >>> On June 2, 2015 7:01:21 AM EDT, Rohini Kulkarni <krohini1...@gmail.com>
>>>>>> >>> wrote:
>>>>>> >>>
>>>>>> >>> Dr. Joel,
>>>>>> >>>
>>>>>> >>> So we can't say something solely on the basis of this result?
>>>>>> >>>
>>>>>> >>> I don't think so. If Linux performs the same, then what you did is as
>>>>>> >>> good as it gets.
>>>>>> >>>
>>>>>> >>> However, if Linux is faster then some setting still isn't right.
>>>>>> >>>
>>>>>> >>> You need a reference measurement to have any confidence. It is possible
>>>>>> >>> you did something but didn't actually turn the cache (or all the cache)
>>>>>> >>> on.
>>>>>> >>>
>>>>>> >>> On 2 Jun 2015 16:28, "Rohini Kulkarni" <krohini1...@gmail.com> wrote:
>>>>>> >>>
>>>>>> >>> I have not run it under linux on pi2 yet. Will have to run and check
>>>>>> >>> the result.
>>>>>> >>>
>>>>>> >>> On 2 Jun 2015 16:16, "Joel Sherrill" <joel.sherr...@oarcorp.com> wrote:
>>>>>> >>>
>>>>>> >>> On June 2, 2015 5:58:33 AM EDT, Rohini Kulkarni <krohini1...@gmail.com>
>>>>>> >>> wrote:
>>>>>> >>>
>>>>>> >>> HI,
>>>>>> >>>
>>>>>> >>> I tried running the dhrystone benchmark with some changes for
>>>>>> >>> cache/mmu set up.
>>>>>> >>>
>>>>>> >>> However, the output shows a reduction in performance.
>>>>>> >>> The time to run through the dhrystone has increased from 12 to 13 and
>>>>>> >>> dhrystones run per second decreased.
>>>>>> >>>
>>>>>> >>> According to this result, things were better with caches disabled.
>>>>>> >>>
>>>>>> >>> I have been working on this since two days and could not figure out an
>>>>>> >>> improvement. Any pointers?
>>>>>> >>>
>>>>>> >>> How did it do under Linux on the Pi2?
>>>>>> >>>
>>>>>> >>> Thanks.
>>>>>> >>>
>>>>>> >>> On Thu, May 28, 2015 at 8:41 PM, Rohini Kulkarni
>>>>>> >>> <krohini1...@gmail.com> wrote:
>>>>>> >>>
>>>>>> >>> Hi All,
>>>>>> >>>
>>>>>> >>> I have to implement the cache coherency support for Cortex A7. But for
>>>>>> >>> A7 MPCore, unlike for A9, I am not able to find any register
>>>>>> >>> description for the Snoop Control Unit from the TRM.
>>>>>> >>>
>>>>>> >>> I need help here on how to proceed.
>>>>>> >>>
>>>>>> >>> Additionally for A9 there is a single bit for A9 in the Auxiliary
>>>>>> >>> Control Register which enables cache broadcast operations. The
>>>>>> >>> register format is different for A7 and again I am unable to find how to
>>>>>> >>> achieve the same for A7.
>>>>>> >>>
>>>>>> >>> Thanks!
>>>>>> >>>
>>>>>> >>> On Tue, May 5, 2015 at 10:42 PM, Joel Sherrill
>>>>>> >>> <joel.sherr...@oarcorp.com> wrote:
>>>>>> >>>
>>>>>> >>> On 5/5/2015 11:11 AM, Rohini Kulkarni wrote:
>>>>>> >>>
>>>>>> >>> Hi,
>>>>>> >>>
>>>>>> >>> I am working with the code for bsp hooks. I am referring to existing
>>>>>> >>> ARM multicore bsp codes, zync mainly.
>>>>>> >>>
>>>>>> >>> 1. There are existing hooks for the raspberry pi. Where should the
>>>>>> >>> code for the Pi2 hooks be added?
>>>>>> >>>
>>>>>> >>> The Pi and Pi2 are remarkably similar so Pi2 should be placed inside
>>>>>> >>> the Pi BSP directory.
>>>>>> >>> There is already a Pi2 variant of that code built. But we know
>>>>>> >>> specific places where there are variances. Depending on the scope of
>>>>>> >>> what is different, it can be as simple as a cpp conditional in a .h to
>>>>>> >>> select a value or two implementations of a single method and the
>>>>>> >>> Makefile.am picking the right file to build based on the board variant.
>>>>>> >>>
>>>>>> >>> The big question to always ask is: Is this specific to the Pi2 and
>>>>>> >>> incompatible with the Pi?
>>>>>> >>>
>>>>>> >>> Since the Pi BSP is still missing capabilities, it is likely code
>>>>>> >>> common to both will be added this summer. For example, did the mailbox
>>>>>> >>> interface change? I don't know but would guess that it didn't. Each new
>>>>>> >>> capability added needs that added.
>>>>>> >>>
>>>>>> >>> And any differences need to be analyzed to pick the least intrusive
>>>>>> >>> way to provide alternate implementations. Or enable special code like
>>>>>> >>> the Pi2 SMP support which is dependent on --enable-smp and being a Pi2.
>>>>>> >>>
>>>>>> >>> 2. Am I right in understanding that I will have to implement A7
>>>>>> >>> specific functions as have been for A9? I am referring specifically to
>>>>>> >>> the arm-a9mpcore-start.h
>>>>>> >>>
>>>>>> >>> Yes.
>>>>>> >>>
>>>>>> >>> If the code is very similar between the a7 and a9, then a discussion
>>>>>> >>> on devel@ should occur to decide the best way to minimize duplication.
>>>>>> >>>
>>>>>> >>> If you end up with a7 specific code, you should follow the location
>>>>>> >>> and naming patterns already established. That places it in
>>>>>> >>> libbsp/arm/shared/...
>>>>>> >>> so it can be used by any BSP with the right SMP core.
>>>>>> >>>
>>>>>> >>> I am referring to existing codes to locate and get hold of what needs
>>>>>> >>> to be done in the hooks. However, being new to such implementations, I
>>>>>> >>> am taking longer to understand the details. Any suggestions that might
>>>>>> >>> help here are welcome
>>>>>> >>>
>>>>>> >>> The answer will depend on the factors listed above. When code can
>>>>>> >>> be shared, we want to share it across as many BSPs as makes sense.
>>>>>> >>> When it is unique to a specific BSP **variant** (e.g. Pi vs Pi2), then
>>>>>> >>> you want to find the way to account for the variation in the least
>>>>>> >>> intrusive code way possible.
>>>>>> >>>
>>>>>> >>> Thanks!
>>>>>> >>>
>>>>>> >>> On 1 May 2015 12:45, "Rohini Kulkarni" <krohini1...@gmail.com> wrote:
>>>>>> >>>
>>>>>> >>> Hi,
>>>>>> >>>
>>>>>> >>> Excited to be a part of this edition of GSoC! Thanks to everybody for
>>>>>> >>> helping me get here and congratulations to all the participating
>>>>>> >>> students!
>>>>>> >>>
>>>>>> >>> So, now getting to work, firstly I wish to know, specifically from my
>>>>>> >>> mentors, any changes that must be made to my proposed project or
>>>>>> >>> schedule.
>>>>>> >>>
>>>>>> >>> Secondly, are there any specifics for the development blog that we
>>>>>> >>> need to create for the project? Over time what is the blog expected to
>>>>>> >>> convey.
>>>>>> >>>
>>>>>> >>> Also, I have to create a new wiki page for my project as none exists.
>>>>>> >>> I want to know how to add one.
>>>>>> >>>
>>>>>> >>> --
>>>>>> >>>
>>>>>> >>> Rohini Kulkarni
>>>>>> >>>
>>>>>> >>> -- Joel Sherrill, Ph.D. Director of Research & Development
>>>>>> >>> joel.sherr...@oarcorp.com On-Line Applications Research Ask me about
>>>>>> >>> RTEMS: a free RTOS Huntsville AL 35805 Support Available (256)
>>>>>> >>> 722-9985
>>>>>> >>>
>>>>>> >>> --
>>>>>> >>>
>>>>>> >>> Rohini Kulkarni
>>>>>> >>>
>>>>>> >>> --
>>>>>> >>>
>>>>>> >>> Rohini Kulkarni
>>>>>> >>>
>>>>>> >>> --joel
>>>>>> >>>
>>>>>> >>> --joel
>>>>>> >>> _______________________________________________
>>>>>> >>> devel mailing list
>>>>>> >>> devel@rtems.org
>>>>>> >>> http://lists.rtems.org/mailman/listinfo/devel
>>>>>> >>>
>>>>>> >>> --
>>>>>> >>> Hesham
>>>>>> >>>
>>>>>> >>> --
>>>>>> >>> Hesham
>>>>>> >>> _______________________________________________
>>>>>> >>> devel mailing list
>>>>>> >>> devel@rtems.org
>>>>>> >>> http://lists.rtems.org/mailman/listinfo/devel
>>>>>> >>>
>>>>>> >>
>>>>>> >> --
>>>>>> >> Rohini Kulkarni
>>>>>> >
>>>>>> > _______________________________________________
>>>>>> > devel mailing list
>>>>>> > devel@rtems.org
>>>>>> > http://lists.rtems.org/mailman/listinfo/devel
>>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> devel mailing list
>>>>> devel@rtems.org
>>>>> http://lists.rtems.org/mailman/listinfo/devel
>>>>>
>>>>
>>>
>>> --
>>> Rohini Kulkarni
>>>
>>
>
> --
> Rohini Kulkarni
>

--
Rohini Kulkarni
_______________________________________________ devel mailing list devel@rtems.org http://lists.rtems.org/mailman/listinfo/devel