Re: [Openocd-development] [PATCH] buf_set_buf around 30% speed increase
Hello,

On 05.02.2011 10:43, Øyvind Harboe wrote:
> What sort of CPU did you run the tests on?

Which test? The target CPU/MCU or my system CPU?

> Let me know when the patch is ready to be committed. I suppose
> it could need a bit of cool-off.

I think it's fine.

Regards,

Mathias
___
Openocd-development mailing list
Openocd-development@lists.berlios.de
https://lists.berlios.de/mailman/listinfo/openocd-development
Re: [Openocd-development] [PATCH] buf_set_buf around 30% speed increase
On Tue, Feb 8, 2011 at 9:09 AM, Mathias K. kes...@freenet.de wrote:
> Hello,
>
> On 05.02.2011 10:43, Øyvind Harboe wrote:
>> What sort of CPU did you run the tests on?
>
> Which test? The target CPU/MCU or my system CPU?

System CPU.

>> Let me know when the patch is ready to be committed. I suppose
>> it could need a bit of cool-off.
>
> I think it's fine.

OK. There has been a lot of discussion back and forth about the wonders of optimization, but you're the only one who's submitted a patch, so I'll commit that first.

--
Øyvind Harboe
Can Zylin Consulting help on your project?
US toll free 1-866-980-3434 / International +47 51 87 40 27
http://www.zylin.com/zy1000.html
ARM7 ARM9 ARM11 XScale Cortex JTAG debugger and flash programmer
Re: [Openocd-development] [PATCH] buf_set_buf around 30% speed increase
This code should be better optimized.

But, shouldn't it be:

    sb = src_start / 8;
    db = dst_start / 8;
    sq = src_start % 8;
    dq = dst_start % 8;
    src += sb;
    dst += db;

2011/2/8 Øyvind Harboe oyvind.har...@zylin.com
> Merged.
>
> Thanks!

--
Best Regards, SimonQian
http://www.SimonQian.com
Re: [Openocd-development] [PATCH] buf_set_buf around 30% speed increase
On Tue, Feb 8, 2011 at 11:29 AM, simon qian simonqian.open...@gmail.com wrote:
> This code should be better optimized.

Patches welcome!

> But, shouldn't it be:
>
>     sb = src_start / 8;
>     db = dst_start / 8;
>     sq = src_start % 8;
>     dq = dst_start % 8;
>     src += sb;
>     dst += db;

Isn't that in the master branch? I committed the wrong version and then the correct one (hopefully!).

--
Øyvind Harboe
Re: [Openocd-development] [PATCH] buf_set_buf around 30% speed increase
I checked the master branch; it was fixed an hour ago.

2011/2/8 Øyvind Harboe oyvind.har...@zylin.com
> On Tue, Feb 8, 2011 at 11:48 AM, simon qian simonqian.open...@gmail.com wrote:
>> It's not in the branch. Here is the patch.
>
> Are you sure?
>
> I tried to apply your patch and it failed. Could you rebase your
> branch on top of the master branch?

--
Best Regards, SimonQian
http://www.SimonQian.com
Re: [Openocd-development] [PATCH] buf_set_buf around 30% speed increase
What impact would it have to make this an inline fn?

How often are unsigned src_start, unsigned dst_start, unsigned len constants that could be dealt with through constant propagation and code elimination?

--
Øyvind Harboe
Re: [Openocd-development] [PATCH] buf_set_buf around 30% speed increase
Hi all!

This seems a good idea, and you can even improve the copy by doing something like this (pseudo-code only):

    void* buf_set_buf(const void *_src, unsigned src_start,
            void *_dst, unsigned dst_start, unsigned len)
    {
        /* Are src and dst bit aligned? */
        if ((dst_start % 8) != (src_start % 8)) {
            /* No - bit to bit copy */
            return ;
        }

        /* Manage non-byte data at the beginning */
        while (((dst_start % 8) != 0) && len) {
            /* bit to bit copy */
        }

        /* we've got 2 byte-aligned buffers ;) */
        memcpy...

        /* Manage non-byte data at the end */
        while (len) {
            /* bit to bit copy */
        }
    }
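The pseudo-code above could be filled in roughly as follows. This is only a sketch under stated assumptions: the names buf_set_buf_sketch and copy_bit are made up for illustration and are not part of OpenOCD, bits are numbered LSB-first within each byte as in the patch, and the two buffers must not overlap because memcpy is used for the middle part.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy a single bit from bit index s of src to bit index d of dst
 * (LSB-first bit numbering, as in the patch under discussion). */
static void copy_bit(const uint8_t *src, unsigned s, uint8_t *dst, unsigned d)
{
	if ((src[s / 8] >> (s % 8)) & 1)
		dst[d / 8] |= (uint8_t)(1u << (d % 8));
	else
		dst[d / 8] &= (uint8_t)~(1u << (d % 8));
}

/* Hypothetical name. Head bits, then whole bytes via memcpy, then tail bits. */
void *buf_set_buf_sketch(const void *_src, unsigned src_start,
		void *_dst, unsigned dst_start, unsigned len)
{
	const uint8_t *src = _src;
	uint8_t *dst = _dst;

	assert(src != NULL && dst != NULL);

	/* Different bit offsets inside the byte: only a bit-by-bit copy works. */
	if ((src_start % 8) != (dst_start % 8)) {
		for (unsigned i = 0; i < len; i++)
			copy_bit(src, src_start + i, dst, dst_start + i);
		return _dst;
	}

	/* Head: copy single bits until both positions reach a byte boundary. */
	while ((dst_start % 8) != 0 && len > 0) {
		copy_bit(src, src_start++, dst, dst_start++);
		len--;
	}

	/* Middle: both positions are byte aligned, copy whole bytes. */
	unsigned nbytes = len / 8;
	memcpy(dst + dst_start / 8, src + src_start / 8, nbytes);
	src_start += nbytes * 8;
	dst_start += nbytes * 8;
	len -= nbytes * 8;

	/* Tail: fewer than 8 bits are left. */
	while (len > 0) {
		copy_bit(src, src_start++, dst, dst_start++);
		len--;
	}

	return _dst;
}
```

Compared with the merged patch, this variant would also take the fast path when both start offsets share the same non-zero bit position, or when len is not a multiple of 8.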
Re: [Openocd-development] [PATCH] buf_set_buf around 30% speed increase
Hello,

On 07.02.2011 09:09, Øyvind Harboe wrote:
> What impact would it have to make this an inline fn?

I think there is no need to declare this big function as inline. This will only increase the code size.

I see some functions in the jtag/interface.c file with a very small body that could be declared as inline because they are called very often:

    tap_set_state_impl
    tap_get_state
    tap_set_end_state
    tap_get_end_state

Regards,

Mathias
Re: [Openocd-development] [PATCH] buf_set_buf around 30% speed increase
On Monday 07 February 2011 09:09:36 Øyvind Harboe wrote:
> What impact would it have to make this an inline fn?
...

Inlined or not, this function could be faster.

Even with inlining and constant propagation, I don't think gcc is smart enough to replace

    buf_set_buf(_src, 0, dst, 0, 64);

by

    memcpy(dst, _src, 8);

Marc
Re: [Openocd-development] [PATCH] buf_set_buf around 30% speed increase
On Mon, Feb 7, 2011 at 9:50 AM, Marc Pignat marc.pig...@hevs.ch wrote:
> On Monday 07 February 2011 09:09:36 Øyvind Harboe wrote:
>> What impact would it have to make this an inline fn?
> ...
>
> Inlined or not, this function could be faster.
>
> Even with inlining and constant propagation, I don't think gcc is
> smart enough to replace
>
>     buf_set_buf(_src, 0, dst, 0, 64);
>
> by
>
>     memcpy(dst, _src, 8);

Sure it is! Just set up an if() statement that it can optimize and execute memcpy() in that case!

With -O3 it can do a lot of constant propagation and figure out that an if() is always taken, especially if the args to if() are constant.

--
Øyvind Harboe
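The if() idea could look something like the sketch below. The names buf_set_buf_fast and buf_set_buf_general are hypothetical and chosen only for this illustration; the general routine stands in for the existing bit-by-bit loop. Whether a given compiler and optimization level actually folds the condition away is an assumption that would have to be checked in the generated assembly.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the out-of-line general routine (plain bit-by-bit loop). */
void *buf_set_buf_general(const void *_src, unsigned src_start,
		void *_dst, unsigned dst_start, unsigned len)
{
	const uint8_t *src = _src;
	uint8_t *dst = _dst;
	unsigned src_idx = src_start, dst_idx = dst_start;

	for (unsigned i = 0; i < len; i++) {
		if ((src[src_idx / 8] >> (src_idx % 8)) & 1)
			dst[dst_idx / 8] |= (uint8_t)(1u << (dst_idx % 8));
		else
			dst[dst_idx / 8] &= (uint8_t)~(1u << (dst_idx % 8));
		dst_idx++;
		src_idx++;
	}
	return _dst;
}

/* Hypothetical inline front end. When src_start, dst_start and len are
 * compile-time constants at the call site, the condition below is a
 * constant expression after inlining, so an optimizing compiler can keep
 * only one of the two branches. */
static inline void *buf_set_buf_fast(const void *src, unsigned src_start,
		void *dst, unsigned dst_start, unsigned len)
{
	assert(src != NULL && dst != NULL);

	if ((src_start % 8) == 0 && (dst_start % 8) == 0 && (len % 8) == 0) {
		/* Byte-aligned case: a plain byte copy is enough. */
		memcpy((uint8_t *)dst + dst_start / 8,
			(const uint8_t *)src + src_start / 8, len / 8);
		return dst;
	}
	return buf_set_buf_general(src, src_start, dst, dst_start, len);
}
```

A call such as buf_set_buf_fast(_src, 0, dst, 0, 64) would then be a candidate for reduction to an 8-byte memcpy, while calls with run-time offsets keep the check and fall back to the general routine when needed.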
Re: [Openocd-development] [PATCH] buf_set_buf around 30% speed increase
On 02/07/2011 09:37 AM, Øyvind Harboe wrote:
> On Mon, Feb 7, 2011 at 9:27 AM, Mathias K. kes...@freenet.de wrote:
>> Hello,
>>
>> On 07.02.2011 09:09, Øyvind Harboe wrote:
>>> What impact would it have to make this an inline fn?
>>
>> I think there is no need to declare this big function as inline.
>> This will only increase the code size.
>
> To the point where it matters? I think clarity and performance
> matters much more than code-size, right?

On what kind of system? On embedded CPUs with small caches, inlining will usually slow down the code because less of it fits inside the code cache. On those systems, -Os is usually faster than the more aggressive optimization levels that do more inlining.

I vote to keep the code readable and only inline when there is a real *noticeable* gain, not just because profiling shows that it is faster.

(That does not mean I object to the original patch: speeding up the implementation by optimizing the code is fine, as long as it does not hamper maintainability.)

cu
Michael
Re: [Openocd-development] [PATCH] buf_set_buf around 30% speed increase
What sort of CPU did you run the tests on?

Let me know when the patch is ready to be committed. I suppose it could need a bit of cool-off.

--
Øyvind Harboe
[Openocd-development] [PATCH] buf_set_buf around 30% speed increase
Hello,

this patch increases the speed of the buf_set_buf function by around 30%.

Regards,

Mathias

diff --git a/src/helper/binarybuffer.c b/src/helper/binarybuffer.c
index 3a16cce..e789e6f 100644
--- a/src/helper/binarybuffer.c
+++ b/src/helper/binarybuffer.c
@@ -133,19 +133,34 @@ void* buf_set_buf(const void *_src, unsigned src_start,
 {
 	const uint8_t *src = _src;
 	uint8_t *dst = _dst;
+	unsigned sb,db,sq,dq;
+
+	sb = src_start / 8;
+	db = dst_start / 8;
+	sq = src_start % 8;
+	dq = dst_start % 8;
 
-	unsigned src_idx = src_start, dst_idx = dst_start;
 	for (unsigned i = 0; i < len; i++)
 	{
-		if (((src[src_idx / 8] >> (src_idx % 8)) & 1) == 1)
-			dst[dst_idx / 8] |= 1 << (dst_idx % 8);
+		if (((*src >> (sq&7)) & 1) == 1)
+			*dst |= 1 << (dq&7);
 		else
-			dst[dst_idx / 8] &= ~(1 << (dst_idx % 8));
-		dst_idx++;
-		src_idx++;
+			*dst &= ~(1 << (dq&7));
+
+		if ( sq++ == 7 )
+		{
+			sq = 0;
+			src++;
+		}
+
+		if ( dq++ == 7 )
+		{
+			dq = 0;
+			dst++;
+		}
 	}
 
-	return dst;
+	return (uint8_t*)_dst;
 }
 
 uint32_t flip_u32(uint32_t value, unsigned int num)
Re: [Openocd-development] [PATCH] buf_set_buf around 30% speed increase
On Fri, Feb 4, 2011 at 5:21 PM, Mathias K. kes...@freenet.de wrote:
> Hello,
>
> this patch increase the speed of the buf_set_buf function around 30%.

How do you arrive at 30%?

What overall impact does this have?

--
Øyvind Harboe
Re: [Openocd-development] [PATCH] buf_set_buf around 30% speed increase
Hello,

okay, the patch has a little bug: I did not set the correct start pointer of the input and output buffer.

I have also checked the input of this function, and in many cases a simple byte copy is possible. I have added this check now, so where possible the buffer is copied byte by byte and not bit by bit.

With byte boundary input the test looks like this:

buf_set_buf 0x0200 iteration test:
runtime (seconds): old: 6.828559 new: 0.436191 diff: 6.392368
runtime (seconds): old: 6.853636 new: 0.430389 diff: 6.423247
runtime (seconds): old: 6.794985 new: 0.423065 diff: 6.371920

Without:

buf_set_buf 0x0200 iteration test:
runtime (seconds): old: 6.370869 new: 5.552624 diff: 0.818245
runtime (seconds): old: 6.420730 new: 5.665887 diff: 0.754843
runtime (seconds): old: 6.583306 new: 5.599021 diff: 0.984285

Regards,

Mathias

diff --git a/src/helper/binarybuffer.c b/src/helper/binarybuffer.c
index 3a16cce..e789e6f 100644
--- a/src/helper/binarybuffer.c
+++ b/src/helper/binarybuffer.c
@@ -133,19 +133,34 @@ void* buf_set_buf(const void *_src, unsigned src_start,
 {
 	const uint8_t *src = _src;
 	uint8_t *dst = _dst;
+	unsigned sb,db,sq,dq;
+
+	sb = src_start / 8;
+	db = dst_start / 8;
+	sq = src_start % 8;
+	dq = dst_start % 8;
 
-	unsigned src_idx = src_start, dst_idx = dst_start;
 	for (unsigned i = 0; i < len; i++)
 	{
-		if (((src[src_idx / 8] >> (src_idx % 8)) & 1) == 1)
-			dst[dst_idx / 8] |= 1 << (dst_idx % 8);
+		if (((*src >> (sq&7)) & 1) == 1)
+			*dst |= 1 << (dq&7);
 		else
-			dst[dst_idx / 8] &= ~(1 << (dst_idx % 8));
-		dst_idx++;
-		src_idx++;
+			*dst &= ~(1 << (dq&7));
+
+		if ( sq++ == 7 )
+		{
+			sq = 0;
+			src++;
+		}
+
+		if ( dq++ == 7 )
+		{
+			dq = 0;
+			dst++;
+		}
 	}
 
-	return dst;
+	return (uint8_t*)_dst;
 }
 
 uint32_t flip_u32(uint32_t value, unsigned int num)
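The test program behind the timings above was not posted. As an assumption about how such numbers could be collected, a minimal harness might time a fixed number of calls with clock() from <time.h>. The names time_buf_copy, buf_copy_fn and copy_bits_baseline are hypothetical; the baseline is the pre-patch loop from the diff.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Signature shared by the old and the new buf_set_buf variants. */
typedef void *(*buf_copy_fn)(const void *, unsigned, void *, unsigned, unsigned);

/* Pre-patch bit-by-bit loop, usable as the "old" side of a comparison. */
void *copy_bits_baseline(const void *_src, unsigned src_start,
		void *_dst, unsigned dst_start, unsigned len)
{
	const uint8_t *src = _src;
	uint8_t *dst = _dst;
	unsigned src_idx = src_start, dst_idx = dst_start;

	for (unsigned i = 0; i < len; i++) {
		if ((src[src_idx / 8] >> (src_idx % 8)) & 1)
			dst[dst_idx / 8] |= (uint8_t)(1u << (dst_idx % 8));
		else
			dst[dst_idx / 8] &= (uint8_t)~(1u << (dst_idx % 8));
		dst_idx++;
		src_idx++;
	}
	return _dst;
}

/* Returns the CPU time in seconds spent on `iterations` calls of fn. */
double time_buf_copy(buf_copy_fn fn, unsigned src_start, unsigned dst_start,
		unsigned len, unsigned long iterations)
{
	static uint8_t src[64], dst[64];
	static volatile uint8_t sink;

	/* Both bit ranges must fit into the 64-byte scratch buffers. */
	assert(src_start + len <= sizeof(src) * 8);
	assert(dst_start + len <= sizeof(dst) * 8);

	for (unsigned i = 0; i < sizeof(src); i++)
		src[i] = (uint8_t)(i * 37u + 11u);

	clock_t t0 = clock();
	for (unsigned long n = 0; n < iterations; n++)
		fn(src, src_start, dst, dst_start, len);
	clock_t t1 = clock();

	/* Read the result so the copies cannot be dropped as dead stores. */
	sink = dst[0];
	(void)sink;

	return (double)(t1 - t0) / CLOCKS_PER_SEC;
}
```

Running it once with byte-aligned arguments (for example 0, 0, 64) and once with unaligned ones (for example 3, 5, 61) for each implementation would give the same kind of old/new pairs as in the tables above.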
Re: [Openocd-development] [PATCH] buf_set_buf around 30% speed increase
sorry, wrong file ...

diff --git a/src/helper/binarybuffer.c b/src/helper/binarybuffer.c
index 3a16cce..08e149a 100644
--- a/src/helper/binarybuffer.c
+++ b/src/helper/binarybuffer.c
@@ -133,19 +133,48 @@ void* buf_set_buf(const void *_src, unsigned src_start,
 {
 	const uint8_t *src = _src;
 	uint8_t *dst = _dst;
+	unsigned i,sb,db,sq,dq, lb,lq;
+
+	sb = src_start / 8;
+	db = dst_start / 8;
+	sq = src_start % 8;
+	dq = dst_start % 8;
+	lb = len / 8;
+	lq = len % 8;
+
+	src += sb;
+	dst += db;
+
+	/* check if both buffers are on byte boundary and
+	 * len is a multiple of 8bit so we can simple copy
+	 * the buffer */
+	if ( (sq == 0) && (dq == 0) && (lq == 0) )
+	{
+		for (i = 0; i < lb; i++)
+			*dst++ = *src++;
+		return (uint8_t*)_dst;
+	}
 
-	unsigned src_idx = src_start, dst_idx = dst_start;
-	for (unsigned i = 0; i < len; i++)
+	/* fallback to slow bit copy */
+	for (i = 0; i < len; i++)
 	{
-		if (((src[src_idx / 8] >> (src_idx % 8)) & 1) == 1)
-			dst[dst_idx / 8] |= 1 << (dst_idx % 8);
+		if (((*src >> (sq&7)) & 1) == 1)
+			*dst |= 1 << (dq&7);
 		else
-			dst[dst_idx / 8] &= ~(1 << (dst_idx % 8));
-		dst_idx++;
-		src_idx++;
+			*dst &= ~(1 << (dq&7));
+		if ( sq++ == 7 )
+		{
+			sq = 0;
+			src++;
+		}
+		if ( dq++ == 7 )
+		{
+			dq = 0;
+			dst++;
+		}
 	}
 
-	return dst;
+	return (uint8_t*)_dst;
 }
 
 uint32_t flip_u32(uint32_t value, unsigned int num)
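Since the first version of the patch missed the start-pointer adjustment, a small cross-check of the old loop against the patched logic may be useful. The sketch below is an illustration only: buf_set_buf_old, buf_set_buf_patched and outputs_match are hypothetical names, and the two functions are transcribed from the removed and added sides of the diff above rather than taken from the OpenOCD tree.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Removed side of the diff: index-based bit-by-bit copy. */
void *buf_set_buf_old(const void *_src, unsigned src_start,
		void *_dst, unsigned dst_start, unsigned len)
{
	const uint8_t *src = _src;
	uint8_t *dst = _dst;
	unsigned src_idx = src_start, dst_idx = dst_start;

	for (unsigned i = 0; i < len; i++) {
		if ((src[src_idx / 8] >> (src_idx % 8)) & 1)
			dst[dst_idx / 8] |= (uint8_t)(1u << (dst_idx % 8));
		else
			dst[dst_idx / 8] &= (uint8_t)~(1u << (dst_idx % 8));
		dst_idx++;
		src_idx++;
	}
	return _dst;
}

/* Added side of the diff: byte-copy fast path plus pointer-based bit loop. */
void *buf_set_buf_patched(const void *_src, unsigned src_start,
		void *_dst, unsigned dst_start, unsigned len)
{
	const uint8_t *src = _src;
	uint8_t *dst = _dst;
	unsigned i, sq = src_start % 8, dq = dst_start % 8;

	src += src_start / 8;
	dst += dst_start / 8;

	/* Both buffers byte aligned and len a multiple of 8: copy bytes. */
	if (sq == 0 && dq == 0 && (len % 8) == 0) {
		for (i = 0; i < len / 8; i++)
			*dst++ = *src++;
		return _dst;
	}

	/* Fallback: slow bit copy. */
	for (i = 0; i < len; i++) {
		if ((*src >> sq) & 1)
			*dst |= (uint8_t)(1u << dq);
		else
			*dst &= (uint8_t)~(1u << dq);
		if (sq++ == 7) { sq = 0; src++; }
		if (dq++ == 7) { dq = 0; dst++; }
	}
	return _dst;
}

/* Returns 1 if both versions leave identical destination buffers. */
int outputs_match(unsigned src_start, unsigned dst_start, unsigned len)
{
	uint8_t src[16], a[16], b[16];

	/* Both bit ranges must fit into the 16-byte scratch buffers. */
	assert(src_start + len <= sizeof(src) * 8);
	assert(dst_start + len <= sizeof(a) * 8);

	for (unsigned i = 0; i < sizeof(src); i++)
		src[i] = (uint8_t)(i * 29u + 7u);
	memset(a, 0x5A, sizeof(a));
	memset(b, 0x5A, sizeof(b));

	buf_set_buf_old(src, src_start, a, dst_start, len);
	buf_set_buf_patched(src, src_start, b, dst_start, len);

	return memcmp(a, b, sizeof(a)) == 0;
}
```

Sweeping outputs_match over a range of start offsets and lengths exercises both the fast path and the fallback, including non-zero start offsets, which is where the first version went wrong.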