Re: GSoC 2015: Raspberry Pi 2 Support
Great news on the Pi 2 cache configuration! I am looking forward to a patch to try. Alan On Sun, Jun 21, 2015 at 3:27 PM, Rohini Kulkarni krohini1...@gmail.com wrote: :) There is very little code that had to be added. I need to clean the code and add conditional call for Pi 2. Then I would be ready to submit a patch. On 22 Jun 2015 00:52, Gedare Bloom ged...@gwu.edu wrote: On Sun, Jun 21, 2015 at 3:04 PM, Rohini Kulkarni krohini1...@gmail.com wrote: I missed mentioning the number of dhrystones in the previous mail. Originally it was 1 million. The new number of dhrystones I executed is 100 million. The next thing to do is to figure out what changes are contributing to the performance improvement, and then prepare patches. :) Great work On Mon, Jun 22, 2015 at 12:29 AM, Rohini Kulkarni krohini1...@gmail.com wrote: Hi all, I have managed to get a significant performance improvement with some changes in configurations. The measured time was for dhrystones reduced from 12 to too small to be measured For dhrystones the time was 0.4. The number of dhrystones per second increased from approximately 8 to 250 :) Thanks! On Sun, Jun 21, 2015 at 1:32 AM, Rohini Kulkarni krohini1...@gmail.com wrote: Hi, I have added an SMP related post to my blog to define where exactly in the code I need to work. Some feedback to indicate if I am identifying the work area correctly would be very helpful! Thanks! On 18 Jun 2015 03:37, Rohini Kulkarni krohini1...@gmail.com wrote: Hi all, I have updated my blog to reflect my understanding and attempts for cache performance issue. Lately I have been trying around memory attributes for the mm_config_table. One set of configurations for cacheable memory (inner and outer levels)ended up reducing performance further ( which I really thought would improve). So this table set up certainly controls performance. The results are not improving after turning on cache. So memory sections are perhaps not even getting cached. 
I get a feeling it has got to do with this mm_config_table. Updates from the github code and blog might help in further discussion. Link to github code:https://github.com/krohini1593/rtems/tree/rohini Link to Blog Thanks! On Mon, Jun 15, 2015 at 8:29 PM, Alan Cudmore alan.cudm...@gmail.com wrote: Hi, Some of the code examples may give you some clues. Like this one: https://github.com/mrvn/test/blob/master/smp.cc Or this: https://github.com/PeterLemon/RaspberryPi/tree/master/SMP/SMPINIT If you still can't figure it out, you can always join the raspberrypi.org forums and ask on this thread: https://www.raspberrypi.org/forums/viewtopic.php?f=72t=98904 When it comes to the Pi 2 and SMP, you are our RTEMS expert :) Thanks, Alan On Sat, Jun 13, 2015 at 2:29 PM, Rohini Kulkarni krohini1...@gmail.com wrote: Hi, This is regarding Pi 2 SMP support. After powering on, the secondary mailboxes read one of their four mailbox registers and wait for a non-zero content to be written. This content is to be the physical address of the location from where the cores are expected to start execution. I am stuck at figuring out this address. How should I go about understanding this? Thanks! On 3 Jun 2015 19:44, Gedare Bloom ged...@gwu.edu wrote: On Wed, Jun 3, 2015 at 2:39 AM, Rohini Kulkarni krohini1...@gmail.com wrote: But, I can't say cache configurations have a role here. I'll push my code to my github project soon. P.S. The Pi2 board I possess seems to have broken down. It just isn't turning on. Unable to test further. Will order one immediately. Ouch. Make sure you put it in a safe space for development, clear of threats like moisture, static shock, and cats. On 3 Jun 2015 09:03, Rohini Kulkarni krohini1...@gmail.com wrote: Hi, Alan, your suggestion has resulted in much improvement arm_control=0x1000 This has simply worked! Looks like the other cores were taking up plenty of time. I was aware from references that the other cores run a WFI, but ya, did not get its impact. 
Time for each dhrystone has reduced to 7 from 13 and the no of dhrystones per second also increased. But this is a change only in the config.txt not actually in the boot code. Thanks Rohini On Wed, Jun 3, 2015 at 7:12 AM, Alan Cudmore alan.cudm...@gmail.com wrote: The caches are being enabled on the RPI 1 BSP. The same code is being executed by the RPI 2 BSP, but obviously it’s not sufficient for the cache setup. I have been reading through this long thread, and it is very informative: https://www.raspberrypi.org/forums/viewtopic.php?f=72t=98904 I am starting to understand the setup that is required to enable caches on the RPI 2. For
Re: GSoC 2015: Raspberry Pi 2 Support
Hello, Are these the relevant functions from ~/rtems/cpukit/score/cpu/arm/rtems/score/cpu.h?

_CPU_SMP_Get_current_processor()
_CPU_SMP_Send_interrupt()
_CPU_SMP_Processor_event_broadcast()
_CPU_SMP_Processor_event_receive()

I do not understand how ~/rtems/cpukit/score/cpu/no_cpu/rtems/score/no_cpu.h can be used. Also, am I right that these functions are in addition to what I have already identified? Thanks!

On Sun, Jun 21, 2015 at 12:50 PM, Sebastian Huber sebastian.hu...@embedded-brains.de wrote: Hello Rohini, the CPU functions relevant for SMP are documented in the no_cpu/cpu.h file.

On 20 Jun 2015 at 22:02, Rohini Kulkarni krohini1...@gmail.com wrote: Hi, I have added an SMP related post to my blog to define where exactly in the code I need to work. Some feedback to indicate if I am identifying the work area correctly would be very helpful!

-- Rohini Kulkarni
___ devel mailing list devel@rtems.org http://lists.rtems.org/mailman/listinfo/devel
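As a rough illustration of the first function listed in this message: on ARMv7-A the index of the executing core is normally taken from the low affinity bits of the MPIDR register. The sketch below only shows that idea. The helper names `mpidr_to_core_index` and `read_mpidr` are made up for this example and are not the RTEMS implementation.

```c
#include <stdint.h>

/* Hypothetical helper (not an RTEMS API): extract affinity level 0,
 * i.e. the core number within the cluster, from an MPIDR value.
 * Aff0 occupies bits [7:0]; the upper bits carry cluster and
 * multiprocessing-extension information and are masked off. */
static inline uint32_t mpidr_to_core_index(uint32_t mpidr)
{
  return mpidr & 0xffu;
}

#if defined(__arm__)
/* Read MPIDR through CP15 (c0, c0, 5). Only built for 32-bit ARM
 * targets, so the file still compiles on a host for testing. */
static inline uint32_t read_mpidr(void)
{
  uint32_t mpidr;
  __asm__ volatile ("mrc p15, 0, %0, c0, c0, 5" : "=r" (mpidr));
  return mpidr;
}
#endif
```

On the target, `mpidr_to_core_index(read_mpidr())` would be the kind of value a function like _CPU_SMP_Get_current_processor() is expected to return; keeping the bit extraction separate makes it checkable off-target.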
Re: GSoC 2015: Raspberry Pi 2 Support
On Sun, Jun 21, 2015 at 3:04 PM, Rohini Kulkarni krohini1...@gmail.com wrote: I missed mentioning the number of dhrystones in the previous mail. Originally it was 1 million. The new number of dhrystones I executed is 100 million. The next thing to do is to figure out what changes are contributing to the performance improvement, and then prepare patches. :) Great work On Mon, Jun 22, 2015 at 12:29 AM, Rohini Kulkarni krohini1...@gmail.com wrote: Hi all, I have managed to get a significant performance improvement with some changes in configurations. The measured time was for dhrystones reduced from 12 to too small to be measured For dhrystones the time was 0.4. The number of dhrystones per second increased from approximately 8 to 250 :) Thanks! On Sun, Jun 21, 2015 at 1:32 AM, Rohini Kulkarni krohini1...@gmail.com wrote: Hi, I have added an SMP related post to my blog to define where exactly in the code I need to work. Some feedback to indicate if I am identifying the work area correctly would be very helpful! Thanks! On 18 Jun 2015 03:37, Rohini Kulkarni krohini1...@gmail.com wrote: Hi all, I have updated my blog to reflect my understanding and attempts for cache performance issue. Lately I have been trying around memory attributes for the mm_config_table. One set of configurations for cacheable memory (inner and outer levels)ended up reducing performance further ( which I really thought would improve). So this table set up certainly controls performance. The results are not improving after turning on cache. So memory sections are perhaps not even getting cached. I get a feeling it has got to do with this mm_config_table. Updates from the github code and blog might help in further discussion. Link to github code:https://github.com/krohini1593/rtems/tree/rohini Link to Blog Thanks! On Mon, Jun 15, 2015 at 8:29 PM, Alan Cudmore alan.cudm...@gmail.com wrote: Hi, Some of the code examples may give you some clues. 
Like this one: https://github.com/mrvn/test/blob/master/smp.cc Or this: https://github.com/PeterLemon/RaspberryPi/tree/master/SMP/SMPINIT If you still can't figure it out, you can always join the raspberrypi.org forums and ask on this thread: https://www.raspberrypi.org/forums/viewtopic.php?f=72t=98904 When it comes to the Pi 2 and SMP, you are our RTEMS expert :) Thanks, Alan On Sat, Jun 13, 2015 at 2:29 PM, Rohini Kulkarni krohini1...@gmail.com wrote: Hi, This is regarding Pi 2 SMP support. After powering on, the secondary mailboxes read one of their four mailbox registers and wait for a non-zero content to be written. This content is to be the physical address of the location from where the cores are expected to start execution. I am stuck at figuring out this address. How should I go about understanding this? Thanks! On 3 Jun 2015 19:44, Gedare Bloom ged...@gwu.edu wrote: On Wed, Jun 3, 2015 at 2:39 AM, Rohini Kulkarni krohini1...@gmail.com wrote: But, I can't say cache configurations have a role here. I'll push my code to my github project soon. P.S. The Pi2 board I possess seems to have broken down. It just isn't turning on. Unable to test further. Will order one immediately. Ouch. Make sure you put it in a safe space for development, clear of threats like moisture, static shock, and cats. On 3 Jun 2015 09:03, Rohini Kulkarni krohini1...@gmail.com wrote: Hi, Alan, your suggestion has resulted in much improvement arm_control=0x1000 This has simply worked! Looks like the other cores were taking up plenty of time. I was aware from references that the other cores run a WFI, but ya, did not get its impact. Time for each dhrystone has reduced to 7 from 13 and the no of dhrystones per second also increased. But this is a change only in the config.txt not actually in the boot code. Thanks Rohini On Wed, Jun 3, 2015 at 7:12 AM, Alan Cudmore alan.cudm...@gmail.com wrote: The caches are being enabled on the RPI 1 BSP. 
The same code is being executed by the RPI 2 BSP, but obviously it’s not sufficient for the cache setup. I have been reading through this long thread, and it is very informative: https://www.raspberrypi.org/forums/viewtopic.php?f=72t=98904 I am starting to understand the setup that is required to enable caches on the RPI 2. For example this message near the bottom of page 3 gives a good indication of the speedup available by configuring the MMU and caches correctly: Quote from above thread -- Enabling I/D caches and branch prediction, just like the julia demo uses, it takes ~12 seconds, or ~21 fps. It's just one core but also a much smaller loop than the julia demo has. Enabling the MMU and mapping memory inner/outer write-back, write allocate and the framebuffer inner write-through, no write allocate + outer write-back,
Re: GSoC 2015: Raspberry Pi 2 Support
Hi all, I have managed to get a significant performance improvement with some changes in configurations. The measured time was for dhrystones reduced from 12 to too small to be measured For dhrystones the time was 0.4. The number of dhrystones per second increased from approximately 8 to 250 :) Thanks! On Sun, Jun 21, 2015 at 1:32 AM, Rohini Kulkarni krohini1...@gmail.com wrote: Hi, I have added an SMP related post to my blog to define where exactly in the code I need to work. Some feedback to indicate if I am identifying the work area correctly would be very helpful! Thanks! On 18 Jun 2015 03:37, Rohini Kulkarni krohini1...@gmail.com wrote: Hi all, I have updated my blog to reflect my understanding and attempts for cache performance issue. Lately I have been trying around memory attributes for the mm_config_table. One set of configurations for cacheable memory (inner and outer levels)ended up reducing performance further ( which I really thought would improve). So this table set up certainly controls performance. The results are not improving after turning on cache. So memory sections are perhaps not even getting cached. I get a feeling it has got to do with this mm_config_table. Updates from the github code and blog might help in further discussion. Link to github code:https://github.com/krohini1593/rtems/tree/rohini Link to Blog http://rohiniwithrpi2.blogspot.in/p/blog-page_3.html Thanks! On Mon, Jun 15, 2015 at 8:29 PM, Alan Cudmore alan.cudm...@gmail.com wrote: Hi, Some of the code examples may give you some clues. 
Like this one: https://github.com/mrvn/test/blob/master/smp.cc Or this: https://github.com/PeterLemon/RaspberryPi/tree/master/SMP/SMPINIT If you still can't figure it out, you can always join the raspberrypi.org forums and ask on this thread: https://www.raspberrypi.org/forums/viewtopic.php?f=72t=98904 When it comes to the Pi 2 and SMP, you are our RTEMS expert :) Thanks, Alan On Sat, Jun 13, 2015 at 2:29 PM, Rohini Kulkarni krohini1...@gmail.com wrote: Hi, This is regarding Pi 2 SMP support. After powering on, the secondary mailboxes read one of their four mailbox registers and wait for a non-zero content to be written. This content is to be the physical address of the location from where the cores are expected to start execution. I am stuck at figuring out this address. How should I go about understanding this? Thanks! On 3 Jun 2015 19:44, Gedare Bloom ged...@gwu.edu wrote: On Wed, Jun 3, 2015 at 2:39 AM, Rohini Kulkarni krohini1...@gmail.com wrote: But, I can't say cache configurations have a role here. I'll push my code to my github project soon. P.S. The Pi2 board I possess seems to have broken down. It just isn't turning on. Unable to test further. Will order one immediately. Ouch. Make sure you put it in a safe space for development, clear of threats like moisture, static shock, and cats. On 3 Jun 2015 09:03, Rohini Kulkarni krohini1...@gmail.com wrote: Hi, Alan, your suggestion has resulted in much improvement arm_control=0x1000 This has simply worked! Looks like the other cores were taking up plenty of time. I was aware from references that the other cores run a WFI, but ya, did not get its impact. Time for each dhrystone has reduced to 7 from 13 and the no of dhrystones per second also increased. But this is a change only in the config.txt not actually in the boot code. Thanks Rohini On Wed, Jun 3, 2015 at 7:12 AM, Alan Cudmore alan.cudm...@gmail.com wrote: The caches are being enabled on the RPI 1 BSP. 
The same code is being executed by the RPI 2 BSP, but obviously it’s not sufficient for the cache setup. I have been reading through this long thread, and it is very informative: https://www.raspberrypi.org/forums/viewtopic.php?f=72t=98904 I am starting to understand the setup that is required to enable caches on the RPI 2. For example this message near the bottom of page 3 gives a good indication of the speedup available by configuring the MMU and caches correctly: Quote from above thread -- Enabling I/D caches and branch prediction, just like the julia demo uses, it takes ~12 seconds, or ~21 fps. It's just one core but also a much smaller loop than the julia demo has. Enabling the MMU and mapping memory inner/outer write-back, write allocate and the framebuffer inner write-through, no write allocate + outer write-back, write-allocate it takes ~8 seconds, of 32 fps. PS: 640x480x32 with MMU gets me ~256 fps. Must have a greater L2 cache effect. - End of quote The person who posted the above comment (mrvn) posted the code here: https://github.com/mrvn/test/blob/master/mmu.cc Also, it seems that when the Pi 2 starts up, cores 1-3 are put in a wait loop always accessing the bus. By putting this option
Re: GSoC 2015: Raspberry Pi 2 Support
:) There is very little code that had to be added. I need to clean the code and add conditional call for Pi 2. Then I would be ready to submit a patch. On 22 Jun 2015 00:52, Gedare Bloom ged...@gwu.edu wrote: On Sun, Jun 21, 2015 at 3:04 PM, Rohini Kulkarni krohini1...@gmail.com wrote: I missed mentioning the number of dhrystones in the previous mail. Originally it was 1 million. The new number of dhrystones I executed is 100 million. The next thing to do is to figure out what changes are contributing to the performance improvement, and then prepare patches. :) Great work On Mon, Jun 22, 2015 at 12:29 AM, Rohini Kulkarni krohini1...@gmail.com wrote: Hi all, I have managed to get a significant performance improvement with some changes in configurations. The measured time was for dhrystones reduced from 12 to too small to be measured For dhrystones the time was 0.4. The number of dhrystones per second increased from approximately 8 to 250 :) Thanks! On Sun, Jun 21, 2015 at 1:32 AM, Rohini Kulkarni krohini1...@gmail.com wrote: Hi, I have added an SMP related post to my blog to define where exactly in the code I need to work. Some feedback to indicate if I am identifying the work area correctly would be very helpful! Thanks! On 18 Jun 2015 03:37, Rohini Kulkarni krohini1...@gmail.com wrote: Hi all, I have updated my blog to reflect my understanding and attempts for cache performance issue. Lately I have been trying around memory attributes for the mm_config_table. One set of configurations for cacheable memory (inner and outer levels)ended up reducing performance further ( which I really thought would improve). So this table set up certainly controls performance. The results are not improving after turning on cache. So memory sections are perhaps not even getting cached. I get a feeling it has got to do with this mm_config_table. Updates from the github code and blog might help in further discussion. 
Link to github code:https://github.com/krohini1593/rtems/tree/rohini Link to Blog Thanks! On Mon, Jun 15, 2015 at 8:29 PM, Alan Cudmore alan.cudm...@gmail.com wrote: Hi, Some of the code examples may give you some clues. Like this one: https://github.com/mrvn/test/blob/master/smp.cc Or this: https://github.com/PeterLemon/RaspberryPi/tree/master/SMP/SMPINIT If you still can't figure it out, you can always join the raspberrypi.org forums and ask on this thread: https://www.raspberrypi.org/forums/viewtopic.php?f=72t=98904 When it comes to the Pi 2 and SMP, you are our RTEMS expert :) Thanks, Alan On Sat, Jun 13, 2015 at 2:29 PM, Rohini Kulkarni krohini1...@gmail.com wrote: Hi, This is regarding Pi 2 SMP support. After powering on, the secondary mailboxes read one of their four mailbox registers and wait for a non-zero content to be written. This content is to be the physical address of the location from where the cores are expected to start execution. I am stuck at figuring out this address. How should I go about understanding this? Thanks! On 3 Jun 2015 19:44, Gedare Bloom ged...@gwu.edu wrote: On Wed, Jun 3, 2015 at 2:39 AM, Rohini Kulkarni krohini1...@gmail.com wrote: But, I can't say cache configurations have a role here. I'll push my code to my github project soon. P.S. The Pi2 board I possess seems to have broken down. It just isn't turning on. Unable to test further. Will order one immediately. Ouch. Make sure you put it in a safe space for development, clear of threats like moisture, static shock, and cats. On 3 Jun 2015 09:03, Rohini Kulkarni krohini1...@gmail.com wrote: Hi, Alan, your suggestion has resulted in much improvement arm_control=0x1000 This has simply worked! Looks like the other cores were taking up plenty of time. I was aware from references that the other cores run a WFI, but ya, did not get its impact. Time for each dhrystone has reduced to 7 from 13 and the no of dhrystones per second also increased. 
But this is a change only in the config.txt not actually in the boot code. Thanks Rohini On Wed, Jun 3, 2015 at 7:12 AM, Alan Cudmore alan.cudm...@gmail.com wrote: The caches are being enabled on the RPI 1 BSP. The same code is being executed by the RPI 2 BSP, but obviously it’s not sufficient for the cache setup. I have been reading through this long thread, and it is very informative: https://www.raspberrypi.org/forums/viewtopic.php?f=72t=98904 I am starting to understand the setup that is required to enable caches on the RPI 2. For example this message near the bottom of page 3 gives a good indication of the speedup available by configuring the MMU and caches correctly: Quote from above thread
Re: GSoC 2015: Raspberry Pi 2 Support
I missed mentioning the number of dhrystones in the previous mail. Originally it was 1 million. The new number of dhrystones I executed is 100 million. On Mon, Jun 22, 2015 at 12:29 AM, Rohini Kulkarni krohini1...@gmail.com wrote: Hi all, I have managed to get a significant performance improvement with some changes in configurations. The measured time was for dhrystones reduced from 12 to too small to be measured For dhrystones the time was 0.4. The number of dhrystones per second increased from approximately 8 to 250 :) Thanks! On Sun, Jun 21, 2015 at 1:32 AM, Rohini Kulkarni krohini1...@gmail.com wrote: Hi, I have added an SMP related post to my blog to define where exactly in the code I need to work. Some feedback to indicate if I am identifying the work area correctly would be very helpful! Thanks! On 18 Jun 2015 03:37, Rohini Kulkarni krohini1...@gmail.com wrote: Hi all, I have updated my blog to reflect my understanding and attempts for cache performance issue. Lately I have been trying around memory attributes for the mm_config_table. One set of configurations for cacheable memory (inner and outer levels)ended up reducing performance further ( which I really thought would improve). So this table set up certainly controls performance. The results are not improving after turning on cache. So memory sections are perhaps not even getting cached. I get a feeling it has got to do with this mm_config_table. Updates from the github code and blog might help in further discussion. Link to github code:https://github.com/krohini1593/rtems/tree/rohini Link to Blog http://rohiniwithrpi2.blogspot.in/p/blog-page_3.html Thanks! On Mon, Jun 15, 2015 at 8:29 PM, Alan Cudmore alan.cudm...@gmail.com wrote: Hi, Some of the code examples may give you some clues. 
Like this one: https://github.com/mrvn/test/blob/master/smp.cc Or this: https://github.com/PeterLemon/RaspberryPi/tree/master/SMP/SMPINIT If you still can't figure it out, you can always join the raspberrypi.org forums and ask on this thread: https://www.raspberrypi.org/forums/viewtopic.php?f=72t=98904 When it comes to the Pi 2 and SMP, you are our RTEMS expert :) Thanks, Alan On Sat, Jun 13, 2015 at 2:29 PM, Rohini Kulkarni krohini1...@gmail.com wrote: Hi, This is regarding Pi 2 SMP support. After powering on, the secondary mailboxes read one of their four mailbox registers and wait for a non-zero content to be written. This content is to be the physical address of the location from where the cores are expected to start execution. I am stuck at figuring out this address. How should I go about understanding this? Thanks! On 3 Jun 2015 19:44, Gedare Bloom ged...@gwu.edu wrote: On Wed, Jun 3, 2015 at 2:39 AM, Rohini Kulkarni krohini1...@gmail.com wrote: But, I can't say cache configurations have a role here. I'll push my code to my github project soon. P.S. The Pi2 board I possess seems to have broken down. It just isn't turning on. Unable to test further. Will order one immediately. Ouch. Make sure you put it in a safe space for development, clear of threats like moisture, static shock, and cats. On 3 Jun 2015 09:03, Rohini Kulkarni krohini1...@gmail.com wrote: Hi, Alan, your suggestion has resulted in much improvement arm_control=0x1000 This has simply worked! Looks like the other cores were taking up plenty of time. I was aware from references that the other cores run a WFI, but ya, did not get its impact. Time for each dhrystone has reduced to 7 from 13 and the no of dhrystones per second also increased. But this is a change only in the config.txt not actually in the boot code. Thanks Rohini On Wed, Jun 3, 2015 at 7:12 AM, Alan Cudmore alan.cudm...@gmail.com wrote: The caches are being enabled on the RPI 1 BSP. 
The same code is being executed by the RPI 2 BSP, but obviously it’s not sufficient for the cache setup. I have been reading through this long thread, and it is very informative: https://www.raspberrypi.org/forums/viewtopic.php?f=72t=98904 I am starting to understand the setup that is required to enable caches on the RPI 2. For example this message near the bottom of page 3 gives a good indication of the speedup available by configuring the MMU and caches correctly: Quote from above thread -- Enabling I/D caches and branch prediction, just like the julia demo uses, it takes ~12 seconds, or ~21 fps. It's just one core but also a much smaller loop than the julia demo has. Enabling the MMU and mapping memory inner/outer write-back, write allocate and the framebuffer inner write-through, no write allocate + outer write-back, write-allocate it takes ~8 seconds, of 32 fps. PS: 640x480x32 with MMU gets me ~256 fps. Must have a greater L2 cache effect. - End of quote The
Re: GSoC 2015: Raspberry Pi 2 Support
Hello Rohini, the CPU functions relevant for SMP are documented in the no_cpu/cpu.h file.

On 20 Jun 2015 at 22:02, Rohini Kulkarni krohini1...@gmail.com wrote: Hi, I have added an SMP related post to my blog to define where exactly in the code I need to work. Some feedback to indicate if I am identifying the work area correctly would be very helpful!
Re: GSoC 2015: Raspberry Pi 2 Support
Hi, I have added an SMP related post to my blog to define where exactly in the code I need to work. Some feedback to indicate if I am identifying the work area correctly would be very helpful! Thanks! On 18 Jun 2015 03:37, Rohini Kulkarni krohini1...@gmail.com wrote: Hi all, I have updated my blog to reflect my understanding and attempts for cache performance issue. Lately I have been trying around memory attributes for the mm_config_table. One set of configurations for cacheable memory (inner and outer levels)ended up reducing performance further ( which I really thought would improve). So this table set up certainly controls performance. The results are not improving after turning on cache. So memory sections are perhaps not even getting cached. I get a feeling it has got to do with this mm_config_table. Updates from the github code and blog might help in further discussion. Link to github code:https://github.com/krohini1593/rtems/tree/rohini Link to Blog http://rohiniwithrpi2.blogspot.in/p/blog-page_3.html Thanks! On Mon, Jun 15, 2015 at 8:29 PM, Alan Cudmore alan.cudm...@gmail.com wrote: Hi, Some of the code examples may give you some clues. Like this one: https://github.com/mrvn/test/blob/master/smp.cc Or this: https://github.com/PeterLemon/RaspberryPi/tree/master/SMP/SMPINIT If you still can't figure it out, you can always join the raspberrypi.org forums and ask on this thread: https://www.raspberrypi.org/forums/viewtopic.php?f=72t=98904 When it comes to the Pi 2 and SMP, you are our RTEMS expert :) Thanks, Alan On Sat, Jun 13, 2015 at 2:29 PM, Rohini Kulkarni krohini1...@gmail.com wrote: Hi, This is regarding Pi 2 SMP support. After powering on, the secondary mailboxes read one of their four mailbox registers and wait for a non-zero content to be written. This content is to be the physical address of the location from where the cores are expected to start execution. I am stuck at figuring out this address. How should I go about understanding this? Thanks! 
On 3 Jun 2015 19:44, Gedare Bloom ged...@gwu.edu wrote: On Wed, Jun 3, 2015 at 2:39 AM, Rohini Kulkarni krohini1...@gmail.com wrote: But, I can't say cache configurations have a role here. I'll push my code to my github project soon. P.S. The Pi2 board I possess seems to have broken down. It just isn't turning on. Unable to test further. Will order one immediately. Ouch. Make sure you put it in a safe space for development, clear of threats like moisture, static shock, and cats. On 3 Jun 2015 09:03, Rohini Kulkarni krohini1...@gmail.com wrote: Hi, Alan, your suggestion has resulted in much improvement arm_control=0x1000 This has simply worked! Looks like the other cores were taking up plenty of time. I was aware from references that the other cores run a WFI, but ya, did not get its impact. Time for each dhrystone has reduced to 7 from 13 and the no of dhrystones per second also increased. But this is a change only in the config.txt not actually in the boot code. Thanks Rohini On Wed, Jun 3, 2015 at 7:12 AM, Alan Cudmore alan.cudm...@gmail.com wrote: The caches are being enabled on the RPI 1 BSP. The same code is being executed by the RPI 2 BSP, but obviously it’s not sufficient for the cache setup. I have been reading through this long thread, and it is very informative: https://www.raspberrypi.org/forums/viewtopic.php?f=72t=98904 I am starting to understand the setup that is required to enable caches on the RPI 2. For example this message near the bottom of page 3 gives a good indication of the speedup available by configuring the MMU and caches correctly: Quote from above thread -- Enabling I/D caches and branch prediction, just like the julia demo uses, it takes ~12 seconds, or ~21 fps. It's just one core but also a much smaller loop than the julia demo has. Enabling the MMU and mapping memory inner/outer write-back, write allocate and the framebuffer inner write-through, no write allocate + outer write-back, write-allocate it takes ~8 seconds, of 32 fps. 
PS: 640x480x32 with MMU gets me ~256 fps. Must have a greater L2 cache effect. - End of quote The person who posted the above comment (mrvn) posted the code here: https://github.com/mrvn/test/blob/master/mmu.cc Also, it seems that when the Pi 2 starts up, cores 1-3 are put in a wait loop always accessing the bus. By putting this option in the config.txt file you can put the other cores to sleep, speeding up the code on core 1. arm_control=0x1000 It would be worth trying that option to see if the benchmark speeds up. Alan On Jun 2, 2015, at 8:05 AM, Hesham ALMatary heshamelmat...@gmail.com wrote: On Tue, Jun 2, 2015 at 12:41 PM, Rohini Kulkarni krohini1...@gmail.com wrote: From what I saw, they have
Re: GSoC 2015: Raspberry Pi 2 Support
Hi all, I have updated my blog to reflect my understanding and attempts for cache performance issue. Lately I have been trying around memory attributes for the mm_config_table. One set of configurations for cacheable memory (inner and outer levels)ended up reducing performance further ( which I really thought would improve). So this table set up certainly controls performance. The results are not improving after turning on cache. So memory sections are perhaps not even getting cached. I get a feeling it has got to do with this mm_config_table. Updates from the github code and blog might help in further discussion. Link to github code:https://github.com/krohini1593/rtems/tree/rohini Link to Blog http://rohiniwithrpi2.blogspot.in/p/blog-page_3.html Thanks! On Mon, Jun 15, 2015 at 8:29 PM, Alan Cudmore alan.cudm...@gmail.com wrote: Hi, Some of the code examples may give you some clues. Like this one: https://github.com/mrvn/test/blob/master/smp.cc Or this: https://github.com/PeterLemon/RaspberryPi/tree/master/SMP/SMPINIT If you still can't figure it out, you can always join the raspberrypi.org forums and ask on this thread: https://www.raspberrypi.org/forums/viewtopic.php?f=72t=98904 When it comes to the Pi 2 and SMP, you are our RTEMS expert :) Thanks, Alan On Sat, Jun 13, 2015 at 2:29 PM, Rohini Kulkarni krohini1...@gmail.com wrote: Hi, This is regarding Pi 2 SMP support. After powering on, the secondary mailboxes read one of their four mailbox registers and wait for a non-zero content to be written. This content is to be the physical address of the location from where the cores are expected to start execution. I am stuck at figuring out this address. How should I go about understanding this? Thanks! On 3 Jun 2015 19:44, Gedare Bloom ged...@gwu.edu wrote: On Wed, Jun 3, 2015 at 2:39 AM, Rohini Kulkarni krohini1...@gmail.com wrote: But, I can't say cache configurations have a role here. I'll push my code to my github project soon. P.S. 
The Pi2 board I possess seems to have broken down. It just isn't turning on. Unable to test further. Will order one immediately. Ouch. Make sure you put it in a safe space for development, clear of threats like moisture, static shock, and cats. On 3 Jun 2015 09:03, Rohini Kulkarni krohini1...@gmail.com wrote: Hi, Alan, your suggestion has resulted in much improvement arm_control=0x1000 This has simply worked! Looks like the other cores were taking up plenty of time. I was aware from references that the other cores run a WFI, but ya, did not get its impact. Time for each dhrystone has reduced to 7 from 13 and the no of dhrystones per second also increased. But this is a change only in the config.txt not actually in the boot code. Thanks Rohini On Wed, Jun 3, 2015 at 7:12 AM, Alan Cudmore alan.cudm...@gmail.com wrote: The caches are being enabled on the RPI 1 BSP. The same code is being executed by the RPI 2 BSP, but obviously it’s not sufficient for the cache setup. I have been reading through this long thread, and it is very informative: https://www.raspberrypi.org/forums/viewtopic.php?f=72t=98904 I am starting to understand the setup that is required to enable caches on the RPI 2. For example this message near the bottom of page 3 gives a good indication of the speedup available by configuring the MMU and caches correctly: Quote from above thread -- Enabling I/D caches and branch prediction, just like the julia demo uses, it takes ~12 seconds, or ~21 fps. It's just one core but also a much smaller loop than the julia demo has. Enabling the MMU and mapping memory inner/outer write-back, write allocate and the framebuffer inner write-through, no write allocate + outer write-back, write-allocate it takes ~8 seconds, of 32 fps. PS: 640x480x32 with MMU gets me ~256 fps. Must have a greater L2 cache effect. 
- End of quote The person who posted the above comment (mrvn) posted the code here: https://github.com/mrvn/test/blob/master/mmu.cc Also, it seems that when the Pi 2 starts up, cores 1-3 are put in a wait loop always accessing the bus. By putting this option in the config.txt file you can put the other cores to sleep, speeding up the code on core 1. arm_control=0x1000 It would be worth trying that option to see if the benchmark speeds up. Alan On Jun 2, 2015, at 8:05 AM, Hesham ALMatary heshamelmat...@gmail.com wrote: On Tue, Jun 2, 2015 at 12:41 PM, Rohini Kulkarni krohini1...@gmail.com wrote: From what I saw, they have to be enabled separately. Cache/mmu are disabled upon reset. For the existing Raspberry BSP [1] there's a code for MMU/Cache init, however I don't know about Pi2 and where its code is. [1] https://github.com/RTEMS/rtems/tree/master/c/src/lib/libbsp/arm/raspberrypi On 2
Re: GSoC 2015: Raspberry Pi 2 Support
Hi, Some of the code examples may give you some clues. Like this one: https://github.com/mrvn/test/blob/master/smp.cc Or this: https://github.com/PeterLemon/RaspberryPi/tree/master/SMP/SMPINIT If you still can't figure it out, you can always join the raspberrypi.org forums and ask on this thread: https://www.raspberrypi.org/forums/viewtopic.php?f=72&t=98904 When it comes to the Pi 2 and SMP, you are our RTEMS expert :) Thanks, Alan On Sat, Jun 13, 2015 at 2:29 PM, Rohini Kulkarni krohini1...@gmail.com wrote: Hi, This is regarding Pi 2 SMP support. After powering on, the secondary cores each read one of their four mailbox registers and wait for a non-zero value to be written. This value is the physical address from which the core is expected to start execution. I am stuck at figuring out this address. How should I go about understanding this? Thanks! On 3 Jun 2015 19:44, Gedare Bloom ged...@gwu.edu wrote: On Wed, Jun 3, 2015 at 2:39 AM, Rohini Kulkarni krohini1...@gmail.com wrote: But, I can't say cache configurations have a role here. I'll push my code to my github project soon. P.S. The Pi2 board I possess seems to have broken down. It just isn't turning on. Unable to test further. Will order one immediately. Ouch. Make sure you put it in a safe space for development, clear of threats like moisture, static shock, and cats. On 3 Jun 2015 09:03, Rohini Kulkarni krohini1...@gmail.com wrote: Hi Alan, your suggestion has resulted in a big improvement: arm_control=0x1000 This has simply worked! Looks like the other cores were taking up plenty of time. I was aware from references that the other cores run a WFI, but did not appreciate its impact. The time for each dhrystone has dropped from 13 to 7 and the number of dhrystones per second has also increased. But this is a change only in config.txt, not actually in the boot code. 
Thanks Rohini On Wed, Jun 3, 2015 at 7:12 AM, Alan Cudmore alan.cudm...@gmail.com wrote: The caches are being enabled on the RPI 1 BSP. The same code is being executed by the RPI 2 BSP, but obviously it’s not sufficient for the cache setup. I have been reading through this long thread, and it is very informative: https://www.raspberrypi.org/forums/viewtopic.php?f=72t=98904 I am starting to understand the setup that is required to enable caches on the RPI 2. For example this message near the bottom of page 3 gives a good indication of the speedup available by configuring the MMU and caches correctly: Quote from above thread -- Enabling I/D caches and branch prediction, just like the julia demo uses, it takes ~12 seconds, or ~21 fps. It's just one core but also a much smaller loop than the julia demo has. Enabling the MMU and mapping memory inner/outer write-back, write allocate and the framebuffer inner write-through, no write allocate + outer write-back, write-allocate it takes ~8 seconds, of 32 fps. PS: 640x480x32 with MMU gets me ~256 fps. Must have a greater L2 cache effect. - End of quote The person who posted the above comment (mrvn) posted the code here: https://github.com/mrvn/test/blob/master/mmu.cc Also, it seems that when the Pi 2 starts up, cores 1-3 are put in a wait loop always accessing the bus. By putting this option in the config.txt file you can put the other cores to sleep, speeding up the code on core 1. arm_control=0x1000 It would be worth trying that option to see if the benchmark speeds up. Alan On Jun 2, 2015, at 8:05 AM, Hesham ALMatary heshamelmat...@gmail.com wrote: On Tue, Jun 2, 2015 at 12:41 PM, Rohini Kulkarni krohini1...@gmail.com wrote: From what I saw, they have to be enabled separately. Cache/mmu are disabled upon reset. For the existing Raspberry BSP [1] there's a code for MMU/Cache init, however I don't know about Pi2 and where its code is. 
[1] https://github.com/RTEMS/rtems/tree/master/c/src/lib/libbsp/arm/raspberrypi On 2 Jun 2015 16:59, Hesham ALMatary heshamelmat...@gmail.com wrote: Hi, Aren't the MMU/Caches enabled by default for RPi [1]? [1] https://github.com/RTEMS/rtems/blob/master/c/src/lib/libbsp/arm/shared/mminit.c On Tue, Jun 2, 2015 at 12:18 PM, Joel Sherrill joel.sherr...@oarcorp.com wrote: On June 2, 2015 7:01:21 AM EDT, Rohini Kulkarni krohini1...@gmail.com wrote: Dr. Joel, So we can't say something solely on the basis of this result? I don't think so. If Linux performs the same, then what you did is as good as it gets. However, if Linux is faster then some setting still isn't right. You need a reference measurement to have any confidence. It is possible you did something but didn't actually turn the cache (or all the cache) on. On 2 Jun 2015 16:28, Rohini Kulkarni krohini1...@gmail.com
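For readers following the secondary-core question quoted in this message, here is a minimal sketch in C of what the wake-up write could look like. It assumes the BCM2836 layout in which the ARM-local peripheral block sits at 0x40000000 and each core has four write-set mailbox registers starting at offset 0x80. The helper names and the choice of mailbox 3 are illustrative assumptions, not taken from the thread or from RTEMS, and should be checked against the firmware stub actually in use.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: BCM2836 ARM-local peripherals at 0x40000000, with the per-core
 * mailbox write-set registers laid out contiguously from offset 0x80. */
#define ARM_LOCAL_BASE      0x40000000u
#define MAILBOX_SET_OFFSET  0x80u
#define MAILBOXES_PER_CORE  4u
#define CORE_COUNT          4u

/* Hypothetical helper: address of the write-set register for a given
 * core and mailbox number. */
static uint32_t mailbox_set_addr(uint32_t core, uint32_t mailbox)
{
    assert(core < CORE_COUNT && mailbox < MAILBOXES_PER_CORE);
    return ARM_LOCAL_BASE + MAILBOX_SET_OFFSET
         + (core * MAILBOXES_PER_CORE + mailbox) * 4u;
}

/* Hypothetical helper: hand a waiting core its entry point by writing a
 * non-zero physical address into its mailbox. On real hardware this write
 * would normally be followed by a data barrier (and an SEV if the core
 * waits in WFE); both are omitted here so the sketch also builds on a host. */
static void release_secondary_core(volatile uint32_t *mailbox_set,
                                   uint32_t entry_point)
{
    assert(entry_point != 0u);
    *mailbox_set = entry_point;
}
```

On the target, the pointer passed to release_secondary_core would be (volatile uint32_t *)(uintptr_t)mailbox_set_addr(core, 3), and the entry point would be the physical address of whatever secondary start routine the BSP provides.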
Re: GSoC 2015: Raspberry Pi 2 Support
But, I can't say cache configurations have a role here. I'll push my code to my github project soon. P.S. The Pi2 board I possess seems to have broken down. It just isn't turning on. Unable to test further. Will order one immediately. On 3 Jun 2015 09:03, Rohini Kulkarni krohini1...@gmail.com wrote: Hi, Alan, your suggestion has resulted in much improvement arm_control=0x1000 This has simply worked! Looks like the other cores were taking up plenty of time. I was aware from references that the other cores run a WFI, but ya, did not get its impact. Time for each dhrystone has reduced to 7 from 13 and the no of dhrystones per second also increased. But this is a change only in the config.txt not actually in the boot code. Thanks Rohini On Wed, Jun 3, 2015 at 7:12 AM, Alan Cudmore alan.cudm...@gmail.com wrote: The caches are being enabled on the RPI 1 BSP. The same code is being executed by the RPI 2 BSP, but obviously it’s not sufficient for the cache setup. I have been reading through this long thread, and it is very informative: https://www.raspberrypi.org/forums/viewtopic.php?f=72t=98904 I am starting to understand the setup that is required to enable caches on the RPI 2. For example this message near the bottom of page 3 gives a good indication of the speedup available by configuring the MMU and caches correctly: Quote from above thread -- Enabling I/D caches and branch prediction, just like the julia demo uses, it takes ~12 seconds, or ~21 fps. It's just one core but also a much smaller loop than the julia demo has. Enabling the MMU and mapping memory inner/outer write-back, write allocate and the framebuffer inner write-through, no write allocate + outer write-back, write-allocate it takes ~8 seconds, of 32 fps. PS: 640x480x32 with MMU gets me ~256 fps. Must have a greater L2 cache effect. 
- End of quote The person who posted the above comment (mrvn) posted the code here: https://github.com/mrvn/test/blob/master/mmu.cc Also, it seems that when the Pi 2 starts up, cores 1-3 are put in a wait loop always accessing the bus. By putting this option in the config.txt file you can put the other cores to sleep, speeding up the code on core 1. arm_control=0x1000 It would be worth trying that option to see if the benchmark speeds up. Alan On Jun 2, 2015, at 8:05 AM, Hesham ALMatary heshamelmat...@gmail.com wrote: On Tue, Jun 2, 2015 at 12:41 PM, Rohini Kulkarni krohini1...@gmail.com wrote: From what I saw, they have to be enabled separately. Cache/mmu are disabled upon reset. For the existing Raspberry BSP [1] there's a code for MMU/Cache init, however I don't know about Pi2 and where its code is. [1] https://github.com/RTEMS/rtems/tree/master/c/src/lib/libbsp/arm/raspberrypi On 2 Jun 2015 16:59, Hesham ALMatary heshamelmat...@gmail.com wrote: Hi, Aren't the MMU/Caches enabled by default for RPi [1]? [1] https://github.com/RTEMS/rtems/blob/master/c/src/lib/libbsp/arm/shared/mminit.c On Tue, Jun 2, 2015 at 12:18 PM, Joel Sherrill joel.sherr...@oarcorp.com wrote: On June 2, 2015 7:01:21 AM EDT, Rohini Kulkarni krohini1...@gmail.com wrote: Dr. Joel, So we can't say something solely on the basis of this result? I don't think so. If Linux performs the same, then what you did is as good as it gets. However, if Linux is faster then some setting still isn't right. You need a reference measurement to have any confidence. It is possible you did something but didn't actually turn the cache (or all the cache) on. On 2 Jun 2015 16:28, Rohini Kulkarni krohini1...@gmail.com wrote: I have not run it under linux on pi2 yet. Will have to run and check the result. 
On 2 Jun 2015 16:16, Joel Sherrill joel.sherr...@oarcorp.com wrote: On June 2, 2015 5:58:33 AM EDT, Rohini Kulkarni krohini1...@gmail.com wrote: HI, I tried running the dhrystone benchmark with some changes for cache/mmu set up. However, the output shows a reduction in performance. The time to run through the dhrystone has increased from 12 to 13 and dhrystones run per second decreased. According to this result, things were better with caches disabled. I have been working on this since two days and could not figure out an improvement. Any pointers? How did it do under Linux on the Pi2? Thanks. On Thu, May 28, 2015 at 8:41 PM, Rohini Kulkarni krohini1...@gmail.com wrote: Hi All, I have to implement the cache coherency support for Cortex A7. But for A7 MPCore, unlike for A9, I am not able to find any register description for the Snoop Control Unit from the TRM. I need help here on how to proceed. Additionally for A9 there is a single bit for A9 in the Auxiliary Control Register which enables cache broadcast operations. The register format is different for A7 and again I am unable to find how to achieve the same for A7. Thanks! On Tue, May 5,
Re: GSoC 2015: Raspberry Pi 2 Support
On Wed, Jun 3, 2015 at 2:39 AM, Rohini Kulkarni krohini1...@gmail.com wrote: But, I can't say cache configurations have a role here. I'll push my code to my github project soon. P.S. The Pi2 board I possess seems to have broken down. It just isn't turning on. Unable to test further. Will order one immediately. Ouch. Make sure you put it in a safe space for development, clear of threats like moisture, static shock, and cats. On 3 Jun 2015 09:03, Rohini Kulkarni krohini1...@gmail.com wrote: Hi, Alan, your suggestion has resulted in much improvement arm_control=0x1000 This has simply worked! Looks like the other cores were taking up plenty of time. I was aware from references that the other cores run a WFI, but ya, did not get its impact. Time for each dhrystone has reduced to 7 from 13 and the no of dhrystones per second also increased. But this is a change only in the config.txt not actually in the boot code. Thanks Rohini On Wed, Jun 3, 2015 at 7:12 AM, Alan Cudmore alan.cudm...@gmail.com wrote: The caches are being enabled on the RPI 1 BSP. The same code is being executed by the RPI 2 BSP, but obviously it’s not sufficient for the cache setup. I have been reading through this long thread, and it is very informative: https://www.raspberrypi.org/forums/viewtopic.php?f=72t=98904 I am starting to understand the setup that is required to enable caches on the RPI 2. For example this message near the bottom of page 3 gives a good indication of the speedup available by configuring the MMU and caches correctly: Quote from above thread -- Enabling I/D caches and branch prediction, just like the julia demo uses, it takes ~12 seconds, or ~21 fps. It's just one core but also a much smaller loop than the julia demo has. Enabling the MMU and mapping memory inner/outer write-back, write allocate and the framebuffer inner write-through, no write allocate + outer write-back, write-allocate it takes ~8 seconds, of 32 fps. PS: 640x480x32 with MMU gets me ~256 fps. 
Must have a greater L2 cache effect. - End of quote The person who posted the above comment (mrvn) posted the code here: https://github.com/mrvn/test/blob/master/mmu.cc Also, it seems that when the Pi 2 starts up, cores 1-3 are put in a wait loop always accessing the bus. By putting this option in the config.txt file you can put the other cores to sleep, speeding up the code on core 1. arm_control=0x1000 It would be worth trying that option to see if the benchmark speeds up. Alan On Jun 2, 2015, at 8:05 AM, Hesham ALMatary heshamelmat...@gmail.com wrote: On Tue, Jun 2, 2015 at 12:41 PM, Rohini Kulkarni krohini1...@gmail.com wrote: From what I saw, they have to be enabled separately. Cache/mmu are disabled upon reset. For the existing Raspberry BSP [1] there's a code for MMU/Cache init, however I don't know about Pi2 and where its code is. [1] https://github.com/RTEMS/rtems/tree/master/c/src/lib/libbsp/arm/raspberrypi On 2 Jun 2015 16:59, Hesham ALMatary heshamelmat...@gmail.com wrote: Hi, Aren't the MMU/Caches enabled by default for RPi [1]? [1] https://github.com/RTEMS/rtems/blob/master/c/src/lib/libbsp/arm/shared/mminit.c On Tue, Jun 2, 2015 at 12:18 PM, Joel Sherrill joel.sherr...@oarcorp.com wrote: On June 2, 2015 7:01:21 AM EDT, Rohini Kulkarni krohini1...@gmail.com wrote: Dr. Joel, So we can't say something solely on the basis of this result? I don't think so. If Linux performs the same, then what you did is as good as it gets. However, if Linux is faster then some setting still isn't right. You need a reference measurement to have any confidence. It is possible you did something but didn't actually turn the cache (or all the cache) on. On 2 Jun 2015 16:28, Rohini Kulkarni krohini1...@gmail.com wrote: I have not run it under linux on pi2 yet. Will have to run and check the result. 
Re: GSoC 2015: Raspberry Pi 2 Support
On Tue, Jun 2, 2015 at 9:42 PM, Alan Cudmore alan.cudm...@gmail.com wrote: The caches are being enabled on the RPI 1 BSP. The same code is being executed by the RPI 2 BSP, but obviously it’s not sufficient for the cache setup. I have been reading through this long thread, and it is very informative: https://www.raspberrypi.org/forums/viewtopic.php?f=72&t=98904 I am starting to understand the setup that is required to enable caches on the RPI 2. For example this message near the bottom of page 3 gives a good indication of the speedup available by configuring the MMU and caches correctly: Quote from above thread -- Enabling I/D caches and branch prediction, just like the julia demo uses, it takes ~12 seconds, or ~21 fps. It's just one core but also a much smaller loop than the julia demo has. Enabling the MMU and mapping memory inner/outer write-back, write allocate and the framebuffer inner write-through, no write allocate + outer write-back, write-allocate it takes ~8 seconds, or 32 fps. PS: 640x480x32 with MMU gets me ~256 fps. Must have a greater L2 cache effect. - End of quote The person who posted the above comment (mrvn) posted the code here: https://github.com/mrvn/test/blob/master/mmu.cc Make sure not to copy the code from there, though, as it is GPL3. You may still refer to it to learn from it. ___ devel mailing list devel@rtems.org http://lists.rtems.org/mailman/listinfo/devel
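To make the memory attributes named in the forum quote concrete, here is an independent minimal sketch in C of how 1 MiB section entries with those attributes could be encoded. It assumes the ARMv7-A short-descriptor translation table format with TEX remap disabled. The macro and function names are hypothetical and are not the ones used by the RTEMS mm_config_table or by the code linked above. The domain, access permission and shareability choices are also assumptions made only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* ARMv7-A short-descriptor section entry fields (TEX remap disabled). */
#define SECTION_TYPE    0x2u                     /* bits[1:0] = 0b10: section   */
#define SECTION_B       (1u << 2)                /* bufferable                  */
#define SECTION_C       (1u << 3)                /* cacheable                   */
#define SECTION_AP_RW   (3u << 10)               /* AP[1:0] = 0b11: full access */
#define SECTION_TEX(x)  ((uint32_t)(x) << 12)    /* TEX[2:0]                    */
#define SECTION_S       (1u << 16)               /* shareable                   */

/* Normal memory, inner and outer write-back, write-allocate:
 * TEX = 0b001, C = 1, B = 1. Marked shareable here as an assumption. */
#define ATTR_NORMAL_WBWA  (SECTION_TEX(1) | SECTION_C | SECTION_B | SECTION_S)

/* Framebuffer as in the quote: TEX = 0b1BB selects the outer policy
 * (BB = 0b01, write-back write-allocate) and C,B select the inner policy
 * (0b10, write-through, no write-allocate). */
#define ATTR_FB_WT_WBWA   (SECTION_TEX(5) | SECTION_C)

/* Hypothetical helper: build one section entry for a 1 MiB aligned
 * physical base, domain 0, read/write access. */
static uint32_t section_entry(uint32_t phys_base, uint32_t attrs)
{
    assert((phys_base & 0x000FFFFFu) == 0u);  /* must be 1 MiB aligned */
    return phys_base | attrs | SECTION_AP_RW | SECTION_TYPE;
}
```

With these definitions, section_entry(0x00100000u, ATTR_NORMAL_WBWA) yields 0x00111C0E. Whether a given range really ends up cached also depends on the MMU and cache enable bits being set in the system control register, which is consistent with the concern raised elsewhere in this thread that the sections may not be getting cached at all.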
Re: GSoC 2015: Raspberry Pi 2 Support
Hi Alan, your suggestion has resulted in a big improvement: arm_control=0x1000 This has simply worked! Looks like the other cores were taking up plenty of time. I was aware from references that the other cores run a WFI, but did not appreciate its impact. The time for each dhrystone has dropped from 13 to 7 and the number of dhrystones per second has also increased. But this is a change only in config.txt, not actually in the boot code. Thanks Rohini On Wed, Jun 3, 2015 at 7:12 AM, Alan Cudmore alan.cudm...@gmail.com wrote: The caches are being enabled on the RPI 1 BSP. The same code is being executed by the RPI 2 BSP, but obviously it’s not sufficient for the cache setup. I have been reading through this long thread, and it is very informative: https://www.raspberrypi.org/forums/viewtopic.php?f=72&t=98904 I am starting to understand the setup that is required to enable caches on the RPI 2. For example this message near the bottom of page 3 gives a good indication of the speedup available by configuring the MMU and caches correctly: Quote from above thread -- Enabling I/D caches and branch prediction, just like the julia demo uses, it takes ~12 seconds, or ~21 fps. It's just one core but also a much smaller loop than the julia demo has. Enabling the MMU and mapping memory inner/outer write-back, write allocate and the framebuffer inner write-through, no write allocate + outer write-back, write-allocate it takes ~8 seconds, or 32 fps. PS: 640x480x32 with MMU gets me ~256 fps. Must have a greater L2 cache effect. - End of quote The person who posted the above comment (mrvn) posted the code here: https://github.com/mrvn/test/blob/master/mmu.cc Also, it seems that when the Pi 2 starts up, cores 1-3 are put in a wait loop always accessing the bus. By putting this option in the config.txt file you can put the other cores to sleep, speeding up the code on core 1. arm_control=0x1000 It would be worth trying that option to see if the benchmark speeds up. 
Alan On Jun 2, 2015, at 8:05 AM, Hesham ALMatary heshamelmat...@gmail.com wrote: On Tue, Jun 2, 2015 at 12:41 PM, Rohini Kulkarni krohini1...@gmail.com wrote: From what I saw, they have to be enabled separately. Cache/mmu are disabled upon reset. For the existing Raspberry BSP [1] there's a code for MMU/Cache init, however I don't know about Pi2 and where its code is. [1] https://github.com/RTEMS/rtems/tree/master/c/src/lib/libbsp/arm/raspberrypi On 2 Jun 2015 16:59, Hesham ALMatary heshamelmat...@gmail.com wrote: Hi, Aren't the MMU/Caches enabled by default for RPi [1]? [1] https://github.com/RTEMS/rtems/blob/master/c/src/lib/libbsp/arm/shared/mminit.c On Tue, Jun 2, 2015 at 12:18 PM, Joel Sherrill joel.sherr...@oarcorp.com wrote: On June 2, 2015 7:01:21 AM EDT, Rohini Kulkarni krohini1...@gmail.com wrote: Dr. Joel, So we can't say something solely on the basis of this result? I don't think so. If Linux performs the same, then what you did is as good as it gets. However, if Linux is faster then some setting still isn't right. You need a reference measurement to have any confidence. It is possible you did something but didn't actually turn the cache (or all the cache) on. On 2 Jun 2015 16:28, Rohini Kulkarni krohini1...@gmail.com wrote: I have not run it under linux on pi2 yet. Will have to run and check the result. On 2 Jun 2015 16:16, Joel Sherrill joel.sherr...@oarcorp.com wrote: On June 2, 2015 5:58:33 AM EDT, Rohini Kulkarni krohini1...@gmail.com wrote: HI, I tried running the dhrystone benchmark with some changes for cache/mmu set up. However, the output shows a reduction in performance. The time to run through the dhrystone has increased from 12 to 13 and dhrystones run per second decreased. According to this result, things were better with caches disabled. I have been working on this since two days and could not figure out an improvement. Any pointers? How did it do under Linux on the Pi2? Thanks. 
On Thu, May 28, 2015 at 8:41 PM, Rohini Kulkarni krohini1...@gmail.com wrote: Hi All, I have to implement the cache coherency support for Cortex A7. But for A7 MPCore, unlike for A9, I am not able to find any register description for the Snoop Control Unit from the TRM. I need help here on how to proceed. Additionally for A9 there is a single bit for A9 in the Auxiliary Control Register which enables cache broadcast operations. The register format is different for A7 and again I am unable to find how to achieve the same for A7. Thanks! On Tue, May 5, 2015 at 10:42 PM, Joel Sherrill joel.sherr...@oarcorp.com wrote: On 5/5/2015 11:11 AM, Rohini Kulkarni wrote: Hi, I am working with the code for bsp hooks. I am referring to existing ARM multicore bsp codes, zync mainly. 1. There are existing hooks for the raspberry pi. Where should the code for the Pi2
Re: GSoC 2015: Raspberry Pi 2 Support
From what I saw, they have to be enabled separately. Cache/mmu are disabled upon reset. On 2 Jun 2015 16:59, Hesham ALMatary heshamelmat...@gmail.com wrote: Hi, Aren't the MMU/Caches enabled by default for RPi [1]? [1] https://github.com/RTEMS/rtems/blob/master/c/src/lib/libbsp/arm/shared/mminit.c On Tue, Jun 2, 2015 at 12:18 PM, Joel Sherrill joel.sherr...@oarcorp.com wrote: On June 2, 2015 7:01:21 AM EDT, Rohini Kulkarni krohini1...@gmail.com wrote: Dr. Joel, So we can't say something solely on the basis of this result? I don't think so. If Linux performs the same, then what you did is as good as it gets. However, if Linux is faster then some setting still isn't right. You need a reference measurement to have any confidence. It is possible you did something but didn't actually turn the cache (or all the cache) on. On 2 Jun 2015 16:28, Rohini Kulkarni krohini1...@gmail.com wrote: I have not run it under linux on pi2 yet. Will have to run and check the result. On 2 Jun 2015 16:16, Joel Sherrill joel.sherr...@oarcorp.com wrote: On June 2, 2015 5:58:33 AM EDT, Rohini Kulkarni krohini1...@gmail.com wrote: HI, I tried running the dhrystone benchmark with some changes for cache/mmu set up. However, the output shows a reduction in performance. The time to run through the dhrystone has increased from 12 to 13 and dhrystones run per second decreased. According to this result, things were better with caches disabled. I have been working on this since two days and could not figure out an improvement. Any pointers? How did it do under Linux on the Pi2? Thanks. On Thu, May 28, 2015 at 8:41 PM, Rohini Kulkarni krohini1...@gmail.com wrote: Hi All, I have to implement the cache coherency support for Cortex A7. But for A7 MPCore, unlike for A9, I am not able to find any register description for the Snoop Control Unit from the TRM. I need help here on how to proceed. 
Additionally for A9 there is a single bit for A9 in the Auxiliary Control Register which enables cache broadcast operations. The register format is different for A7 and again I am unable to find how to achieve the same for A7. Thanks! On Tue, May 5, 2015 at 10:42 PM, Joel Sherrill joel.sherr...@oarcorp.com wrote: On 5/5/2015 11:11 AM, Rohini Kulkarni wrote: Hi, I am working with the code for bsp hooks. I am referring to existing ARM multicore bsp codes, zync mainly. 1. There are existing hooks for the raspberry pi. Where should the code for the Pi2 hooks be added? The Pi and Pi2 are remarkably similar so Pi2 should be placed inside the Pi BSP directory. There is already a Pi2 variant of that code built. But we know specific places where there are variances. Depending on the scope of what is different, it can be as simple as a cpp conditional in a .h to select a value or two implementations of a single method and the Makefile.am picking the right file to build based on the board variant. The big question to always ask is: Is this specific to the Pi2 and incompatible with the Pi? Since the Pi BSP is still missing capabilities, it is likely code common to both will be added this summer. For example, did the mailbox interface change? I don't know but would guess that it didn't. Each new capability added needs that added. And any differences need to be analyzed to pick the least intrusive way to provide alternate implementations. Or enable special code like the Pi2 SMP support which is dependent on --enable-smp and being a Pi2. 2. Am I right in understanding that I will have to implement A7 specific functions as have been for A9? I am referring specifically to the arm-a9mpcore-start.h Yes. If the code is very similar between the a7 and a9, then a discussion on devel@ should occur to decide the best way to minimize duplication. If you end up with a7 specific code, you should follow the location and naming patterns already established. 
That places it in libbsp/arm/shared/... so it can be used by any BSP with the right SMP core. I am referring to existing codes to locate and get hold of what needs to be done in the hooks. However, being new to such implementations, I am taking longer to understand the details. Any suggestions that might help here are welcome The answer will depend on the factors listed above. When code can be shared, we want to share it across as many BSPs as makes sense. When it is unique to a specific BSP **variant** (e.g. Pi vs Pi2), then you want to find the way to account for the variation in the least intrusive code way possible. Thanks! On 1 May 2015 12:45, Rohini Kulkarni krohini1...@gmail.com wrote: Hi, Excited to be a part of this edition of GSoC! Thanks to everybody for helping me get here and congratulations to all the participating students! So, now getting to work, firstly I wish to know, specifically from
Re: GSoC 2015: Raspberry Pi 2 Support
On June 2, 2015 7:29:52 AM EDT, Hesham ALMatary heshamelmat...@gmail.com wrote: Hi, Aren't the MMU/Caches enabled by default for RPi [1]? Yes, but I recall that the setup is different on the Pi2 and Alan disabled the code to get it to work at all. [1] https://github.com/RTEMS/rtems/blob/master/c/src/lib/libbsp/arm/shared/mminit.c On Tue, Jun 2, 2015 at 12:18 PM, Joel Sherrill joel.sherr...@oarcorp.com wrote: On June 2, 2015 7:01:21 AM EDT, Rohini Kulkarni krohini1...@gmail.com wrote: Dr. Joel, So we can't say something solely on the basis of this result? I don't think so. If Linux performs the same, then what you did is as good as it gets. However, if Linux is faster then some setting still isn't right. You need a reference measurement to have any confidence. It is possible you did something but didn't actually turn the cache (or all the cache) on. On 2 Jun 2015 16:28, Rohini Kulkarni krohini1...@gmail.com wrote: I have not run it under Linux on the Pi2 yet. Will have to run and check the result. On 2 Jun 2015 16:16, Joel Sherrill joel.sherr...@oarcorp.com wrote: On June 2, 2015 5:58:33 AM EDT, Rohini Kulkarni krohini1...@gmail.com wrote: Hi, I tried running the dhrystone benchmark with some changes to the cache/MMU setup. However, the output shows a reduction in performance. The time to run through the dhrystone has increased from 12 to 13 and the dhrystones run per second decreased. According to this result, things were better with caches disabled. I have been working on this for two days and could not find an improvement. Any pointers? How did it do under Linux on the Pi2? Thanks. On Thu, May 28, 2015 at 8:41 PM, Rohini Kulkarni krohini1...@gmail.com wrote: Hi All, I have to implement the cache coherency support for Cortex A7. But for A7 MPCore, unlike for A9, I am not able to find any register description for the Snoop Control Unit in the TRM. I need help here on how to proceed. 
Re: GSoC 2015: Raspberry Pi 2 Support
Hi, I tried running the dhrystone benchmark with some changes to the cache/MMU setup. However, the output shows a reduction in performance. The time to run through the dhrystone has increased from 12 to 13 and the dhrystones run per second decreased. According to this result, things were better with caches disabled. I have been working on this for two days and could not find an improvement. Any pointers? Thanks. On Thu, May 28, 2015 at 8:41 PM, Rohini Kulkarni krohini1...@gmail.com wrote: Hi All, I have to implement the cache coherency support for Cortex A7. But for A7 MPCore, unlike for A9, I am not able to find any register description for the Snoop Control Unit in the TRM. I need help here on how to proceed. Additionally, for A9 there is a single bit in the Auxiliary Control Register which enables cache broadcast operations. The register format is different for A7 and again I am unable to find how to achieve the same for A7. Thanks! On Tue, May 5, 2015 at 10:42 PM, Joel Sherrill joel.sherr...@oarcorp.com wrote: On 5/5/2015 11:11 AM, Rohini Kulkarni wrote: Hi, I am working with the code for bsp hooks. I am referring to existing ARM multicore BSP code, mainly Zynq. 1. There are existing hooks for the Raspberry Pi. Where should the code for the Pi2 hooks be added? The Pi and Pi2 are remarkably similar so Pi2 should be placed inside the Pi BSP directory. There is already a Pi2 variant of that code built. But we know specific places where there are variances. Depending on the scope of what is different, it can be as simple as a cpp conditional in a .h to select a value or two implementations of a single method and the Makefile.am picking the right file to build based on the board variant. The big question to always ask is: Is this specific to the Pi2 and incompatible with the Pi? Since the Pi BSP is still missing capabilities, it is likely code common to both will be added this summer. For example, did the mailbox interface change? 
I don't know but would guess that it didn't. Each new capability added needs that added. And any differences need to be analyzed to pick the least intrusive way to provide alternate implementations. Or enable special code like the Pi2 SMP support which is dependent on --enable-smp and being a Pi2. 2. Am I right in understanding that I will have to implement A7 specific functions as have been for A9? I am referring specifically to the arm-a9mpcore-start.h Yes. If the code is very similar between the a7 and a9, then a discussion on devel@ should occur to decide the best way to minimize duplication. If you end up with a7 specific code, you should follow the location and naming patterns already established. That places it in libbsp/arm/shared/... so it can be used by any BSP with the right SMP core. I am referring to existing codes to locate and get hold of what needs to be done in the hooks. However, being new to such implementations, I am taking longer to understand the details. Any suggestions that might help here are welcome The answer will depend on the factors listed above. When code can be shared, we want to share it across as many BSPs as makes sense. When it is unique to a specific BSP **variant** (e.g. Pi vs Pi2), then you want to find the way to account for the variation in the least intrusive code way possible. Thanks! On 1 May 2015 12:45, Rohini Kulkarni krohini1...@gmail.com wrote: Hi, Excited to be a part of this edition of GSoC! Thanks to everybody for helping me get here and congratulations to all the participating students! So, now getting to work, firstly I wish to know, specifically from my mentors, any changes that must be made to my proposed project or schedule. Secondly, are there any specifics for the development blog that we need to create for the project? Over time what is the blog expected to convey. Also, I have to create a new wiki page for my project as none exists. I want to know how to add one. -- Rohini Kulkarni -- Joel Sherrill, Ph.D. 
Director of Research developmentjoel.sherr...@oarcorp.comOn-Line Applications Research Ask me about RTEMS: a free RTOS Huntsville AL 35805 Support Available(256) 722-9985 -- Rohini Kulkarni -- Rohini Kulkarni ___ devel mailing list devel@rtems.org http://lists.rtems.org/mailman/listinfo/devel
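The comparison in the message above comes down to one ratio, the number of runs divided by the elapsed time. The sketch below only illustrates that arithmetic. The function names are hypothetical and not part of the Dhrystone sources, and the times are assumed to be in seconds, which the message does not state.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Dhrystones per second = completed runs / elapsed seconds.
 * Returns -1.0 when the elapsed time is zero or negative, i.e. too small
 * to measure, so the caller knows to raise the run count and try again.
 */
double dhrystones_per_second(unsigned long runs, double elapsed_seconds)
{
    if (elapsed_seconds <= 0.0) {
        return -1.0;
    }
    return (double)runs / elapsed_seconds;
}

/* Print the rate for one measurement (hypothetical reporting helper). */
void report_dhrystone_rate(const char *label, unsigned long runs,
                           double elapsed_seconds)
{
    double rate;

    assert(label != NULL);
    rate = dhrystones_per_second(runs, elapsed_seconds);

    if (rate < 0.0) {
        printf("%s: time too small to measure, raise the run count\n", label);
    } else {
        printf("%s: %.1f dhrystones/s\n", label, rate);
    }
}
```

Calling report_dhrystone_rate() once for the cache-off run and once for the cache-on run, with the same run count, makes the direction of the change explicit. A fixed run count that takes 13 instead of 12 always yields a lower rate.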
Re: GSoC 2015: Raspberry Pi 2 Support
On Tue, Jun 2, 2015 at 12:41 PM, Rohini Kulkarni krohini1...@gmail.com wrote:

From what I saw, they have to be enabled separately. Cache/MMU are disabled upon reset.

For the existing Raspberry Pi BSP [1] there is code for MMU/cache init; however, I don't know about the Pi2 and where its code is.

[1] https://github.com/RTEMS/rtems/tree/master/c/src/lib/libbsp/arm/raspberrypi

On 2 Jun 2015 16:59, Hesham ALMatary heshamelmat...@gmail.com wrote:

Hi, aren't the MMU/caches enabled by default for the RPi [1]?

[1] https://github.com/RTEMS/rtems/blob/master/c/src/lib/libbsp/arm/shared/mminit.c

On Tue, Jun 2, 2015 at 12:18 PM, Joel Sherrill joel.sherr...@oarcorp.com wrote:

On June 2, 2015 7:01:21 AM EDT, Rohini Kulkarni krohini1...@gmail.com wrote:

Dr. Joel, so we can't say something solely on the basis of this result?

I don't think so. If Linux performs the same, then what you did is as good as it gets. However, if Linux is faster, then some setting still isn't right. You need a reference measurement to have any confidence. It is possible you did something but didn't actually turn the cache (or all the cache) on.

On 2 Jun 2015 16:28, Rohini Kulkarni krohini1...@gmail.com wrote:

I have not run it under Linux on the Pi2 yet. I will have to run it and check the result.
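The exchange above raises two points: the MMU and caches are off after reset and are enabled separately, and a change may not actually have turned on all of the caches. On ARMv7-A these enables are the M, C and I bits of the System Control Register (SCTLR). The sketch below works on a plain integer so it can be compiled anywhere. The function names are hypothetical, and on the target the value would be read and written through CP15 rather than passed in.

```c
#include <stdint.h>

/* ARMv7-A System Control Register (SCTLR) enable bits. */
#define SCTLR_M (1u << 0)   /* MMU enable */
#define SCTLR_C (1u << 2)   /* data and unified cache enable */
#define SCTLR_I (1u << 12)  /* instruction cache enable */

/* Return the SCTLR value with MMU, data cache and instruction cache enabled. */
uint32_t sctlr_enable_mmu_and_caches(uint32_t sctlr)
{
    return sctlr | SCTLR_M | SCTLR_C | SCTLR_I;
}

/* Return nonzero only if all three enable bits are set in a value read back. */
int sctlr_mmu_and_caches_enabled(uint32_t sctlr)
{
    const uint32_t required = SCTLR_M | SCTLR_C | SCTLR_I;

    return (sctlr & required) == required;
}
```

Reading the register back and checking it with sctlr_mmu_and_caches_enabled() is a cheap way to rule out the "not all the cache is on" case. It says nothing about the memory attributes in the translation table, which also decide whether a region is cached.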
Re: GSoC 2015: Raspberry Pi 2 Support
On June 2, 2015 5:58:33 AM EDT, Rohini Kulkarni krohini1...@gmail.com wrote:

Hi, I tried running the Dhrystone benchmark with some changes to the cache/MMU setup. However, the output shows a reduction in performance: the time to run through the benchmark increased from 12 to 13, and the number of dhrystones run per second decreased. According to this result, things were better with the caches disabled. I have been working on this for two days and could not find an improvement. Any pointers?

How did it do under Linux on the Pi2?

Thanks.

--joel
RPI2 Cache configuration Was: Re: GSoC 2015: Raspberry Pi 2 Support
On Thu, May 28, 2015 at 10:11 AM, Rohini Kulkarni krohini1...@gmail.com wrote:

Hi All, I have to implement cache coherency support for the Cortex-A7. For the A7 MPCore, unlike the A9, I cannot find any register description for the Snoop Control Unit in the TRM. I need help on how to proceed.

Based on 10 minutes of searching through the online TRM, I guess(?) you need to set the pins for the ACE configuration as described in http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0464e/BABJECBF.html

See also: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0464f/BABBAAII.html

Additionally, for the A9 there is a single bit in the Auxiliary Control Register that enables cache broadcast operations. The register format is different for the A7, and again I cannot find how to achieve the same thing there.

I think this is also controlled through the ACE.

Thanks!
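For the Auxiliary Control Register (ACTLR) question above, here is a sketch of the bit positions as this editor understands the ARM TRMs. The Cortex-A9 has an FW bit (bit 0) for cache and TLB maintenance broadcast and an SMP bit (bit 6). The Cortex-A7 keeps only the SMP bit at bit 6. Treat these positions as assumptions to be checked against the TRM revision in use. The function names are hypothetical, and the thread does not settle whether the ACE configuration pins also matter on the Pi 2.

```c
#include <stdint.h>

/* ACTLR bits as this sketch understands the TRMs (assumptions, verify). */
#define ACTLR_SMP   (1u << 6)  /* Cortex-A7 and Cortex-A9: join coherency */
#define A9_ACTLR_FW (1u << 0)  /* Cortex-A9 only: broadcast maintenance ops */

/* Cortex-A7: only the SMP bit is set by this sketch. */
uint32_t a7_actlr_enable_coherency(uint32_t actlr)
{
    return actlr | ACTLR_SMP;
}

/* Cortex-A9: SMP plus the separate broadcast (FW) bit. */
uint32_t a9_actlr_enable_coherency(uint32_t actlr)
{
    return actlr | ACTLR_SMP | A9_ACTLR_FW;
}
```

On the target, the ACTLR value would be read and written through CP15, before the caches and MMU are enabled.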
Re: GSoC 2015: Raspberry Pi 2 Support
Hi All,

I have to implement cache coherency support for the Cortex-A7. For the A7 MPCore, unlike the A9, I cannot find any register description for the Snoop Control Unit in the TRM. I need help on how to proceed.

Additionally, for the A9 there is a single bit in the Auxiliary Control Register that enables cache broadcast operations. The register format is different for the A7, and again I cannot find how to achieve the same thing there.

Thanks!

-- Rohini Kulkarni
Re: GSoC 2015: Raspberry Pi 2 Support
On 5/5/2015 11:11 AM, Rohini Kulkarni wrote:

Hi, I am working with the code for the BSP hooks. I am referring to existing ARM multicore BSP code, mainly the zynq.

1. There are existing hooks for the Raspberry Pi. Where should the code for the Pi2 hooks be added?

The Pi and Pi2 are remarkably similar, so Pi2 code should be placed inside the Pi BSP directory. There is already a Pi2 variant of that code built. But we know specific places where there are variances. Depending on the scope of what is different, it can be as simple as a cpp conditional in a .h to select a value, or two implementations of a single method with the Makefile.am picking the right file to build based on the board variant.

The big question to always ask is: Is this specific to the Pi2 and incompatible with the Pi?

Since the Pi BSP is still missing capabilities, it is likely that code common to both will be added this summer. For example, did the mailbox interface change? I don't know but would guess that it didn't. Each new capability added needs that question asked. And any differences need to be analyzed to pick the least intrusive way to provide alternate implementations. Or enable special code like the Pi2 SMP support, which is dependent on --enable-smp and being a Pi2.

2. Am I right in understanding that I will have to implement A7-specific functions as has been done for the A9? I am referring specifically to arm-a9mpcore-start.h.

Yes. If the code is very similar between the A7 and A9, then a discussion on devel@ should occur to decide the best way to minimize duplication. If you end up with A7-specific code, you should follow the location and naming patterns already established. That places it in libbsp/arm/shared/... so it can be used by any BSP with the right SMP core.

I am referring to existing code to locate and get hold of what needs to be done in the hooks. However, being new to such implementations, I am taking longer to understand the details. Any suggestions that might help here are welcome.

The answer will depend on the factors listed above. When code can be shared, we want to share it across as many BSPs as makes sense. When it is unique to a specific BSP **variant** (e.g. Pi vs Pi2), then you want to find the way to account for the variation in the least intrusive code way possible.

Thanks!

-- Joel Sherrill, Ph.D. Director of Research Development joel.sherr...@oarcorp.com On-Line Applications Research Ask me about RTEMS: a free RTOS Huntsville AL 35805 Support Available (256) 722-9985
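The "cpp conditional in a .h to select a value" option described above might look roughly like the sketch below. EXAMPLE_BSP_IS_RPI2 and the other EXAMPLE_ names are hypothetical stand-ins, not the macros the Raspberry Pi BSP actually uses. RTEMS_SMP is assumed here to be the symbol defined for an --enable-smp build. The peripheral base addresses are the usual BCM2835 and BCM2836 values.

```c
#include <stdint.h>

/*
 * EXAMPLE_BSP_IS_RPI2 is a hypothetical macro standing in for whatever the
 * build system defines for the Pi 2 variant. It defaults to the original Pi.
 */
#ifndef EXAMPLE_BSP_IS_RPI2
#define EXAMPLE_BSP_IS_RPI2 0
#endif

/* Variant-specific values selected in one place. */
#if EXAMPLE_BSP_IS_RPI2
#define EXAMPLE_PERIPHERAL_BASE 0x3F000000u  /* BCM2836 (Pi 2) */
#define EXAMPLE_CORE_COUNT      4u
#else
#define EXAMPLE_PERIPHERAL_BASE 0x20000000u  /* BCM2835 (Pi) */
#define EXAMPLE_CORE_COUNT      1u
#endif

/* SMP-only code needs both an SMP build and the Pi 2 variant. */
#if defined(RTEMS_SMP) && EXAMPLE_BSP_IS_RPI2
#define EXAMPLE_USE_SMP_START 1
#else
#define EXAMPLE_USE_SMP_START 0
#endif

/* Code shared by both variants only ever sees the selected values. */
uint32_t example_peripheral_base(void) { return EXAMPLE_PERIPHERAL_BASE; }
uint32_t example_core_count(void)      { return EXAMPLE_CORE_COUNT; }
int      example_use_smp_start(void)   { return EXAMPLE_USE_SMP_START; }
```

Keeping the conditionals in one header leaves the shared code free of #if blocks. When the difference grows beyond a few values, the other option from the message applies: two source files, with the build picking one per variant.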
Re: GSoC 2015: Raspberry Pi 2 Support
On Fri, May 1, 2015 at 3:15 AM, Rohini Kulkarni krohini1...@gmail.com wrote:

Hi, Excited to be a part of this edition of GSoC! Thanks to everybody for helping me get here and congratulations to all the participating students! So, now getting to work, firstly I wish to know, specifically from my mentors, any changes that must be made to my proposed project or schedule.

Get yourself an rpi2 working first, and start to prototype the code for SMP support by copying from another BSP like the zynq.

Secondly, are there any specifics for the development blog that we need to create for the project? Over time, what is the blog expected to convey?

The blog should help you to disseminate your design notes, progress, and instructions on how to reproduce your testing. The blog is a useful place for you to put lots of details about your project to have a single consolidated record about it.

Also, I have to create a new wiki page for my project as none exists. I want to know how to add one.

You can create a TracLink to a wiki page by editing the Tracking Table, and then click the link to the page. You should create it under the GSoC/2015/ folder with a suitable name for your project, e.g. /GSoC/2015/RaspberryPi2Support

Gedare

-- Rohini Kulkarni