Re: [PATCH] [RFC] drm/radeon/kms: don't require up to 64k allocations.

2009-09-23 Thread Michel Dänzer
On Wed, 2009-09-23 at 16:56 +1000, Dave Airlie wrote: 
> From: Dave Airlie <airl...@redhat.com>
>
> This avoids needing to do a kmalloc > PAGE_SIZE for the main
> indirect buffer chunk, it adds an accessor for all reads from
> the chunk and caches a single page at a time for subsequent
> reads.

FWIW, this works on my PowerBook but seems to drop x11perf -aa10text
numbers from about 370k/s to about 315k/s.
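
For readers following along, the accessor described in the quoted patch summary can be modelled in user space roughly as below. This is only a sketch under stated assumptions, not the patch itself: `struct ib_reader`, `ib_reader_init`, `ib_get_value` and the `copies` counter are hypothetical names, a 4 KiB page is assumed, and `memcpy` stands in for the copy from user space (which can fail in the kernel).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_WORDS 1024u /* assumed: 4 KiB page / 4-byte dwords */

/* Hypothetical stand-in for the parser state; not the driver's struct. */
struct ib_reader {
	const uint32_t *user;      /* models the user-space chunk */
	uint32_t length_dw;        /* chunk length in dwords */
	uint32_t kpage[PAGE_WORDS];/* the single cached page */
	long cached_pg;            /* page index held in kpage, -1 if none */
	unsigned copies;           /* page copies done so far (illustration only) */
};

static void ib_reader_init(struct ib_reader *r, const uint32_t *user,
			   uint32_t length_dw)
{
	r->user = user;
	r->length_dw = length_dw;
	r->cached_pg = -1;
	r->copies = 0;
}

/* Return dword idx of the chunk, refilling the one-page cache on a miss. */
static uint32_t ib_get_value(struct ib_reader *r, uint32_t idx)
{
	uint32_t pg = idx / PAGE_WORDS;

	assert(idx < r->length_dw);
	if ((long)pg != r->cached_pg) {
		uint32_t start = pg * PAGE_WORDS;
		uint32_t n = r->length_dw - start;

		if (n > PAGE_WORDS)
			n = PAGE_WORDS;
		/* memcpy models the copy from user space */
		memcpy(r->kpage, r->user + start, n * sizeof(uint32_t));
		r->cached_pg = (long)pg;
		r->copies++;
	}
	return r->kpage[idx % PAGE_WORDS];
}
```

In this model every read that lands on a different page than the previous one costs a full page copy, including reads that go back to an earlier page.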


-- 
Earthling Michel Dänzer   |http://www.vmware.com
Libre software enthusiast |  Debian, X and DRI developer

--
Come build with us! The BlackBerry® Developer Conference in SF, CA
is the only developer event you need to attend this year. Jumpstart your
developing skills, take BlackBerry mobile applications to market and stay
ahead of the curve. Join us from November 9-12, 2009. Register now!
http://p.sf.net/sfu/devconf
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: [PATCH] [RFC] drm/radeon/kms: don't require up to 64k allocations.

2009-09-23 Thread Dave Airlie

> > From: Dave Airlie <airl...@redhat.com>
> >
> > This avoids needing to do a kmalloc > PAGE_SIZE for the main
> > indirect buffer chunk, it adds an accessor for all reads from
> > the chunk and caches a single page at a time for subsequent
> > reads.
>
> FWIW, this works on my PowerBook but seems to drop x11perf -aa10text
> numbers from about 370k/s to about 315k/s.

Yeah, I'm unsure how to bring the speed back up. Two ideas I have are:
1) won't help you, but on PCI/PCIE we can use the IB to read from and avoid
copying the chunk, since it's all cache coherent
2) keep a bitmap of the pages, copy to the kpage and then to the IB; then
the IB copy from user can just copy any pages we didn't hit.

But what we are doing now is inherently broken: getting order 5
pages once the system is running is not something we'd expect to work. I
expect this solution is still faster than the vmalloc overhead.
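
Idea 2 might look roughly like the following user-space model. It is a sketch under assumptions rather than driver code: `struct ib_copy_state`, `ib_fetch_page` and `ib_copy_remaining` are hypothetical names, the chunk is assumed to be a whole number of 4 KiB pages (at most 16), and `memcpy` stands in for both the copy from user space and the write to the IB.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_WORDS 1024u /* assumed: 4 KiB pages of 4-byte dwords */
#define MAX_PAGES  16u   /* assumed upper bound, fits the 16-bit bitmap */

/* Hypothetical state for the bitmap scheme; not the driver's struct. */
struct ib_copy_state {
	const uint32_t *user;       /* models the user-space chunk */
	uint32_t *ib;               /* models the indirect buffer */
	uint32_t npages;            /* chunk size in whole pages */
	uint32_t kpage[PAGE_WORDS]; /* page the parser reads from */
	uint16_t copied;            /* bit n set: page n is already in the IB */
};

/* Parser path: pull one page through kpage and forward it to the IB. */
static const uint32_t *ib_fetch_page(struct ib_copy_state *s, uint32_t pg)
{
	assert(pg < s->npages && s->npages <= MAX_PAGES);
	memcpy(s->kpage, s->user + pg * PAGE_WORDS, sizeof(s->kpage));
	memcpy(s->ib + pg * PAGE_WORDS, s->kpage, sizeof(s->kpage));
	s->copied |= (uint16_t)(1u << pg);
	return s->kpage;
}

/* Final pass: copy only pages the parser never fetched; returns how many. */
static unsigned ib_copy_remaining(struct ib_copy_state *s)
{
	unsigned n = 0;

	for (uint32_t pg = 0; pg < s->npages; pg++) {
		if (s->copied & (1u << pg))
			continue;
		memcpy(s->ib + pg * PAGE_WORDS, s->user + pg * PAGE_WORDS,
		       PAGE_WORDS * sizeof(uint32_t));
		s->copied |= (uint16_t)(1u << pg);
		n++;
	}
	return n;
}
```

The point of the bitmap is that pages already forwarded by the parser path are not read from user space a second time.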

Dave.



Re: [PATCH] [RFC] drm/radeon/kms: don't require up to 64k allocations.

2009-09-23 Thread Pauli Nieminen
2009/9/23 Dave Airlie <airl...@linux.ie>

> > > From: Dave Airlie <airl...@redhat.com>
> > >
> > > This avoids needing to do a kmalloc > PAGE_SIZE for the main
> > > indirect buffer chunk, it adds an accessor for all reads from
> > > the chunk and caches a single page at a time for subsequent
> > > reads.
> >
> > FWIW, this works on my PowerBook but seems to drop x11perf -aa10text
> > numbers from about 370k/s to about 315k/s.
>
> Yeah, I'm unsure how to bring the speed back up. Two ideas I have are:
> 1) won't help you, but on PCI/PCIE we can use the IB to read from and avoid
> copying the chunk, since it's all cache coherent
> 2) keep a bitmap of the pages, copy to the kpage and then to the IB; then
> the IB copy from user can just copy any pages we didn't hit.
>
> But what we are doing now is inherently broken: getting order 5
> pages once the system is running is not something we'd expect to work. I
> expect this solution is still faster than the vmalloc overhead.
>
> Dave.


Maybe forward-only iteration could perform better for this parsing. My idea
would be to rewrite radeon_get_ib_value like this and adjust the parser
accordingly. It could also quite easily copy the skipped pages directly
to the IB.

static inline u32 radeon_get_ib_next_value(struct radeon_cs_parser *p,
					   unsigned skip)
{
	int i = p->chunk_ib_idx;

	skip += 1; /* advance to next element */
	if (unlikely(p->pg_offset + skip >= (PAGE_SIZE/4))) {
		/* this could be a function call to reduce code size because
		   this is a relatively infrequent operation */
		unsigned pages_to_skip;

		/* consume the rest of the current page, then whole pages */
		skip -= (PAGE_SIZE/4) - p->pg_offset;
		pages_to_skip = skip/(PAGE_SIZE/4);
		skip -= pages_to_skip*(PAGE_SIZE/4);
		p->pg_offset = 0;
		p->pg_idx += pages_to_skip + 1;

		if (DRM_COPY_FROM_USER(p->chunks[i].kpage,
				       p->chunks[i].user_ptr + (p->pg_idx * PAGE_SIZE),
				       PAGE_SIZE))
			return 0; /* Should we fail whole parsing here? */
	}

	p->pg_offset += skip;

	return p->chunks[i].kpage[p->pg_offset];
}
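
The page-crossing arithmetic in the function above can be checked outside the kernel with a small user-space model. The sketch below is an assumption-laden stand-in, not driver code: `struct fwd_parser`, `fwd_load_page`, `fwd_first_value` and `fwd_next_value` are hypothetical names, `PAGE_WORDS` replaces `PAGE_SIZE/4` for 4 KiB pages, `memcpy` replaces `DRM_COPY_FROM_USER` (so the failure path is not modelled), and the buffer is assumed to be a whole number of pages.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_WORDS 1024u /* stands in for PAGE_SIZE/4 with 4 KiB pages */

/* Hypothetical stand-in for the relevant radeon_cs_parser fields. */
struct fwd_parser {
	const uint32_t *user;       /* models user_ptr; whole pages only */
	uint32_t kpage[PAGE_WORDS]; /* currently cached page */
	uint32_t pg_idx;            /* index of the cached page */
	uint32_t pg_offset;         /* dword offset of the last value returned */
};

/* memcpy models the copy from user space for page pg_idx. */
static void fwd_load_page(struct fwd_parser *p)
{
	memcpy(p->kpage, p->user + (size_t)p->pg_idx * PAGE_WORDS,
	       sizeof(p->kpage));
}

/* Start iteration: cache page 0 and return the first dword. */
static uint32_t fwd_first_value(struct fwd_parser *p, const uint32_t *user)
{
	p->user = user;
	p->pg_idx = 0;
	p->pg_offset = 0;
	fwd_load_page(p);
	return p->kpage[0];
}

/* Skip `skip` dwords, then return the next one (same arithmetic as above). */
static uint32_t fwd_next_value(struct fwd_parser *p, unsigned skip)
{
	skip += 1; /* advance to next element */
	if (p->pg_offset + skip >= PAGE_WORDS) {
		unsigned pages_to_skip;

		skip -= PAGE_WORDS - p->pg_offset;
		pages_to_skip = skip / PAGE_WORDS;
		skip -= pages_to_skip * PAGE_WORDS;
		p->pg_offset = 0;
		p->pg_idx += pages_to_skip + 1;
		fwd_load_page(p);
	}
	p->pg_offset += skip;
	assert(p->pg_offset < PAGE_WORDS);
	return p->kpage[p->pg_offset];
}
```

With a buffer whose dwords hold their own index, calling `fwd_first_value` and then `fwd_next_value` with assorted skips should return the running index each time, including skips that jump over one or more whole pages.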