Thanks for the help, Bas! I guess I found the problem:
In my code, I used PRU0 simply as a timer, with the code

    LOOP:
        WAIT(250 cycles)
        XOUT 14,r5,4    // transfer register r5 from PRU0 to PRU1
        JMP LOOP

In the meantime, PRU1 did some tasks, including multiplication using XOUT/XIN 0,r25,1 and similar instructions, and finally should have stalled at the instruction XIN 14,r5,4 in order to synchronize with PRU0.

However, if the timing is initially not right, it can happen that PRU0 waits for the other PRU while blocking the XCHG port with its XOUT 14,... command. If PRU1 now wants to retrieve the result of a multiplication, e.g. by executing XIN 0,r26,4, it has to wait until PRU0 frees the XCHG port. PRU0, in turn, waits up to 1024 cycles for PRU1 to accept its XOUT request and keeps the port blocked the whole time, so PRU1 can never reach that section of the code (the XIN 14,r5,4) in time. In this case the two PRUs block each other and the program runs about 1000 times slower!

Also, the checks that are performed to ensure a proper transfer through the XCHG port are quite basic: it seems to me that for a successful transfer between the two PRUs, one only needs one PRU that is willing to write (launching XOUT 14,...) and the other willing to read (XIN 14,...). The actual registers to be read or written, and the amount of data, do not have to match between the two commands. If they don't match, I don't know what data is actually written, but at least neither of the PRUs stalls for 1024 cycles.
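A rough way to see the effect described above is a toy cycle count in Python. This is only a sketch under assumptions taken from this post, not a model of the real hardware: PRU0's XOUT 14 holds the XCHG port until PRU1 accepts it or 1024 cycles pass, an XIN 0 on PRU1 stalls while the port is held, and after a timeout PRU0 simply runs its loop again. The names pru1_loop_times, next_attempt, before_mac and after_mac are made up for the illustration.

```python
PERIOD = 250    # cycles PRU0 waits between XOUT 14 attempts (from the post)
TIMEOUT = 1024  # assumed max cycles an unanswered XOUT 14 holds the port


def next_attempt(xout_at, now):
    """Skip PRU0 attempts that already timed out before cycle `now`."""
    while xout_at + TIMEOUT <= now:
        xout_at += TIMEOUT + PERIOD
    return xout_at


def pru1_loop_times(before_mac, after_mac, passes):
    """Return the cycle at which PRU1 completes each loop pass.

    before_mac: cycles of PRU1 work before its XIN 0 (MAC read)
    after_mac:  cycles of PRU1 work between the XIN 0 and the XIN 14
    """
    t = 0             # cycle at which PRU1 starts the current pass
    xout_at = PERIOD  # cycle at which PRU0 issues its next XOUT 14
    finished = []
    for _ in range(passes):
        mac_at = t + before_mac
        xout_at = next_attempt(xout_at, mac_at)
        if mac_at >= xout_at:
            # PRU0 holds the port: the XIN 0 stalls until the XOUT times out,
            # then PRU0 goes around its loop and waits PERIOD cycles again.
            mac_at = xout_at + TIMEOUT
            xout_at = mac_at + PERIOD
        sync_at = mac_at + after_mac  # PRU1 reaches XIN 14
        xout_at = next_attempt(xout_at, sync_at)
        t = max(sync_at, xout_at)     # transfer happens once both are ready
        xout_at = t + PERIOD
        finished.append(t)
    return finished


print(pru1_loop_times(100, 10, 3))  # MAC read lands before PRU0's XOUT
print(pru1_loop_times(300, 10, 3))  # MAC read lands after PRU0's XOUT
```

In this simplified model, a pass costs 250 cycles as long as the MAC read happens before PRU0 grabs the port, and 1524 cycles (1024 + 250 + 250) when it happens afterwards, because every pass then pays one full timeout. The real slowdown depends on how many port accesses PRU1 makes per pass.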