The special case for s_pitch == 2 saves about 270 ms system time
(2120 -> 1850 ms) with a 16x30 font.
Compared to what? How much is the function call overhead?
Your version of the inline code inserted after an if (idx==2) in
bit_putcs against my version of the inline code.
cu,
knut
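
For readers wondering why a 16x30 font corresponds to s_pitch == 2: the sketch below assumes that s_pitch is the number of bytes in one row of a glyph bitmap stored at one bit per pixel, so the font width is rounded up to whole bytes. The helper name glyph_row_bytes is hypothetical and is not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): bytes needed for one row
 * of a glyph that is stored at 1 bit per pixel, rounded up to whole bytes.
 */
static inline uint32_t glyph_row_bytes(uint32_t font_width)
{
    assert(font_width > 0);
    return (font_width + 7) / 8;
}
```

Under that assumption an 8-pixel-wide font takes the s_pitch == 1 path and a 16-pixel-wide font takes the s_pitch == 2 path discussed above.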
Hi,
On Wed, 31 Aug 2005, Knut Petersen wrote:
> I added the multiply back because gcc (v. 3.3.4) does generate the fastest
> code if I write it this way.
The multiply is not generally faster, so your version may be the fastest,
but in other situations it will be a lot slower. My version is
Hi Roman!
+static inline void __fb_pad_aligned_buffer(u8 *dst, u32 d_pitch, u8 *src,
+					u32 s_pitch, u32 height)
+{
+	int i, j;
+
+	if (likely(s_pitch==1))
+		for(i=0; i < height; i++)
+			dst[d_pitch*i] = src[i];
I added the multiply back
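
A minimal userspace sketch of the idea in the quoted fragment, for illustration only. The s_pitch == 1 fast path follows the lines shown above; the general path is an assumption about the part of the patch that is cut off here, and the kernel's likely() hint and u8/u32 types are replaced by plain C equivalents. The name pad_aligned_buffer_sketch is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Copy `height` rows of `s_pitch` bytes each from a tightly packed source
 * into a destination whose rows are `d_pitch` bytes apart.
 */
static inline void pad_aligned_buffer_sketch(uint8_t *dst, uint32_t d_pitch,
                                             const uint8_t *src,
                                             uint32_t s_pitch, uint32_t height)
{
    uint32_t i, j;

    assert(dst != NULL && src != NULL);

    if (s_pitch == 1) {
        /* Fast path from the quoted fragment: one byte per row. */
        for (i = 0; i < height; i++)
            dst[d_pitch * i] = src[i];
        return;
    }

    /* Assumed general path: byte-wise copy of each row, then advance. */
    for (i = 0; i < height; i++) {
        for (j = 0; j < s_pitch; j++)
            dst[j] = src[j];
        src += s_pitch;
        dst += d_pitch;
    }
}
```

Both paths write only the first s_pitch bytes of each destination row and leave the padding bytes untouched.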
Hi,
On Wed, 31 Aug 2005, Knut Petersen wrote:
> +static inline void __fb_pad_aligned_buffer(u8 *dst, u32 d_pitch, u8 *src,
> +					u32 s_pitch, u32 height)
> +{
> +	int i, j;
> +
> +	if (likely(s_pitch==1))
> +		for(i=0; i < height; i++)
> +			dst[d_pitch*i] =
Hi,
On Wed, 31 Aug 2005, Antonino A. Daplas wrote:
> Roman, okay if you have a 'Signed-off-by' line?
Okay.
bye, Roman
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at
Something like below, which has the advantage that there is still only
one implementation of the function
True, that's a great advantage.
and if it's still slower, we really need to check the compiler
Please have a look at the following patch. It takes your idea of
inlining but moves
Roman Zippel wrote:
> Hi,
>
> On Wed, 31 Aug 2005, Knut Petersen wrote:
>
>> How could I make it an inline function? It is used in console/bitblit.c,
>> nvidia/nvidia.c, riva/fbdev.c and softcursor.c.
>
> Something like below, which has the advantage that there is still only
> one
Knut Petersen wrote:
> This trivial patch gives a performance boost to the framebuffer console
>
> Constructing the bitmaps that are given to the bitblit functions of the
> framebuffer drivers is time consuming. Here we avoid a call to the slow
> fb_pad_aligned_buffer().
> The patch replaces
Knut Petersen wrote:
> fb_pad_aligned_buffer() is also slower for those cases. But does anybody
> use such fonts?
Yes, there are 16x30 fonts out there in the wild.
Tony
Hi,
On Wed, 31 Aug 2005, Knut Petersen wrote:
> How could I make it an inline function? It is used in console/bitblit.c,
> nvidia/nvidia.c, riva/fbdev.c and softcursor.c.
Something like below, which has the advantage that there is still only
one implementation of the function and if it's still
Hi Roman,
Could you try the patch below, for a few extra cycles you might want to
make it an inline function.
No, it does not help. If there is any difference, it is too small to be
measured on my system ... and my system does run at 1000 Hz.
After 2.6.12 fb_pad_aligned_buffer() was
On Tue, 30 Aug 2005, Knut Petersen wrote:
> > Probably you can make it even faster by avoiding the multiplication, like
> >
> > unsigned int offset = 0;
> > for (i = 0; i < image.height; i++) {
> >         dst[offset] = src[i];
> >         offset += pitch;
> > }
>
> More than two decades ago I
Hi,
On Tue, 30 Aug 2005, Knut Petersen wrote:
> > Probably you can make it even faster by avoiding the multiplication, like
> >
> > unsigned int offset = 0;
> > for (i = 0; i < image.height; i++) {
> >         dst[offset] = src[i];
> >         offset += pitch;
> > }
> >
>
> More than two
Probably you can make it even faster by avoiding the multiplication, like
unsigned int offset = 0;
for (i = 0; i < image.height; i++) {
        dst[offset] = src[i];
        offset += pitch;
}
More than two decades ago I learned to avoid mul and imul. Use shifts,
add and lea
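
To make the two loop forms under discussion concrete, here is a small userspace sketch that writes the same bytes both ways: once with the index computed by a multiply, once with a running offset. The function names store_column_mul and store_column_add are hypothetical and the code is not taken from the kernel. It only shows that the two forms are equivalent in effect; which one is faster depends on the compiler and CPU, as the thread itself points out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Form A: destination index computed with a multiply on every iteration. */
static inline void store_column_mul(uint8_t *dst, const uint8_t *src,
                                    unsigned int pitch, unsigned int height)
{
    unsigned int i;

    assert(dst != NULL && src != NULL);
    for (i = 0; i < height; i++)
        dst[pitch * i] = src[i];
}

/* Form B: running offset, one addition per iteration instead of a multiply. */
static inline void store_column_add(uint8_t *dst, const uint8_t *src,
                                    unsigned int pitch, unsigned int height)
{
    unsigned int i, offset = 0;

    assert(dst != NULL && src != NULL);
    for (i = 0; i < height; i++) {
        dst[offset] = src[i];
        offset += pitch;
    }
}
```

An optimizing compiler may turn form A into form B on its own (strength reduction), which is one reason measured differences between the two can be small or go either way.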
On Tue, 30 Aug 2005, Knut Petersen wrote:
> linux/drivers/video/console/bitblit.c
> --- linuxorig/drivers/video/console/bitblit.c 2005-08-29 01:41:01.0 +0200
> +++ linux/drivers/video/console/bitblit.c 2005-08-30 17:19:57.0 +0200
> @@ -114,7 +114,7 @@ static void bit_putcs(struct
This trivial patch gives a performance boost to the framebuffer console
Constructing the bitmaps that are given to the bitblit functions of the
framebuffer drivers is time consuming. Here we avoid a call to the slow
fb_pad_aligned_buffer().
The patch replaces that call with a simple but much