Re: [PATCH] [RFC] drm/radeon/kms: don't require up to 64k allocations.
On Wed, 2009-09-23 at 16:56 +1000, Dave Airlie wrote:
> From: Dave Airlie airl...@redhat.com
>
> This avoids needing to do a kmalloc > PAGE_SIZE for the main indirect
> buffer chunk, it adds an accessor for all reads from the chunk and
> caches a single page at a time for subsequent reads.

FWIW, this works on my PowerBook but seems to drop x11perf -aa10text
numbers from about 370k/s to about 315k/s.

--
Earthling Michel Dänzer | http://www.vmware.com
Libre software enthusiast | Debian, X and DRI developer

--
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel
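For readers following the thread, the accessor the patch summary describes (every read goes through one function that keeps a single page of the chunk cached) might look roughly like the user-space sketch below. It is not the patch itself: struct ib_reader, ib_reader_init, ib_reader_get and the SKETCH_ constants are hypothetical names, a 4 KiB page is assumed, and memcpy stands in for the kernel's copy from user space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE   4096u                    /* assumed page size */
#define SKETCH_PAGE_DWORDS (SKETCH_PAGE_SIZE / 4u)  /* dwords per page */

/* Hypothetical stand-in for the parser state; not the driver's struct. */
struct ib_reader {
    const uint32_t *user_ptr;            /* stands in for the user-space chunk */
    size_t length_dw;                    /* chunk length in dwords */
    uint32_t kpage[SKETCH_PAGE_DWORDS];  /* the single cached page */
    long cached_pg;                      /* index of the cached page, -1 if none */
    unsigned copies;                     /* page copies done, for illustration */
};

static void ib_reader_init(struct ib_reader *r, const uint32_t *src,
                           size_t length_dw)
{
    r->user_ptr = src;
    r->length_dw = length_dw;
    r->cached_pg = -1;
    r->copies = 0;
}

/* Read dword idx into *out. Returns 0 on success, -1 if idx is out of range. */
static int ib_reader_get(struct ib_reader *r, size_t idx, uint32_t *out)
{
    size_t pg = idx / SKETCH_PAGE_DWORDS;

    if (idx >= r->length_dw)
        return -1;

    if ((long)pg != r->cached_pg) {
        size_t first = pg * SKETCH_PAGE_DWORDS;
        size_t n = r->length_dw - first;

        if (n > SKETCH_PAGE_DWORDS)
            n = SKETCH_PAGE_DWORDS;
        /* memcpy stands in for the copy from user space */
        memcpy(r->kpage, r->user_ptr + first, n * sizeof(uint32_t));
        r->cached_pg = (long)pg;
        r->copies++;
    }

    *out = r->kpage[idx % SKETCH_PAGE_DWORDS];
    return 0;
}
```

Sequential reads cost one page copy per 1024 dwords in this sketch, while a parser that jumps back and forth between pages pays a copy on every switch, which is one plausible contributor to the slowdown reported above.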
Re: [PATCH] [RFC] drm/radeon/kms: don't require up to 64k allocations.
> > From: Dave Airlie airl...@redhat.com
> >
> > This avoids needing to do a kmalloc > PAGE_SIZE for the main indirect
> > buffer chunk, it adds an accessor for all reads from the chunk and
> > caches a single page at a time for subsequent reads.
>
> FWIW, this works on my PowerBook but seems to drop x11perf -aa10text
> numbers from about 370k/s to about 315k/s.

Yeah, I'm unsure how to bring the speed back up. Two ideas I have are:

1) won't help you, but on PCI/PCIE we can use the IB to read from and
   avoid copying the chunk, since it's all cache coherent

2) keep a bitmap of the pages, copy to the kpage and then to the IB;
   then the IB copy from user can just copy any pages we don't hit.

But what we are doing now is inherently broken: getting order 5 pages
once the system is running is not something we'd expect to work. I
expect this solution is still faster than the vmalloc overhead.

Dave.
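One possible reading of idea 2 above, as a user-space sketch: each page the parser touches is copied to the kpage and from there into the IB, a bit is set for it, and a final pass copies only the pages whose bit is still clear. All names here (struct ib_copy_state, ib_copy_init, ib_get, ib_copy_remaining, the SKETCH_ constants) are hypothetical, memcpy stands in for the copy from user space, and a 32-bit mask is assumed to be enough for the page count. The caller is assumed to pass only in-range indices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_DWORDS 1024u  /* assumed: 4 KiB pages, 4-byte dwords */

/* Hypothetical state; assumes the chunk spans at most 32 pages. */
struct ib_copy_state {
    const uint32_t *user_ptr;            /* stands in for the user-space chunk */
    uint32_t *ib;                        /* destination indirect buffer */
    size_t length_dw;                    /* chunk length in dwords */
    uint32_t kpage[SKETCH_PAGE_DWORDS];  /* single cached page */
    uint32_t copied_mask;                /* bit n set: page n is already in ib */
    long cached_pg;                      /* page held in kpage, -1 if none */
};

static void ib_copy_init(struct ib_copy_state *s, const uint32_t *user_ptr,
                         uint32_t *ib, size_t length_dw)
{
    s->user_ptr = user_ptr;
    s->ib = ib;
    s->length_dw = length_dw;
    s->copied_mask = 0;
    s->cached_pg = -1;
}

/* Number of valid dwords in page pg (the last page may be partial). */
static size_t page_dwords(const struct ib_copy_state *s, size_t pg)
{
    size_t n = s->length_dw - pg * SKETCH_PAGE_DWORDS;

    return n > SKETCH_PAGE_DWORDS ? SKETCH_PAGE_DWORDS : n;
}

/* Parser read: on a page miss, copy user -> kpage -> ib and mark the page. */
static uint32_t ib_get(struct ib_copy_state *s, size_t idx)
{
    size_t pg = idx / SKETCH_PAGE_DWORDS;

    if ((long)pg != s->cached_pg) {
        size_t first = pg * SKETCH_PAGE_DWORDS;
        size_t bytes = page_dwords(s, pg) * sizeof(uint32_t);

        memcpy(s->kpage, s->user_ptr + first, bytes);  /* "copy from user" */
        memcpy(s->ib + first, s->kpage, bytes);        /* kpage -> IB */
        s->copied_mask |= 1u << pg;
        s->cached_pg = (long)pg;
    }
    return s->kpage[idx % SKETCH_PAGE_DWORDS];
}

/* Final pass: copy the pages the parser never touched. Returns their count. */
static unsigned ib_copy_remaining(struct ib_copy_state *s)
{
    size_t npages = (s->length_dw + SKETCH_PAGE_DWORDS - 1) / SKETCH_PAGE_DWORDS;
    unsigned copied = 0;

    for (size_t pg = 0; pg < npages; pg++) {
        size_t first = pg * SKETCH_PAGE_DWORDS;

        if (s->copied_mask & (1u << pg))
            continue;
        memcpy(s->ib + first, s->user_ptr + first,
               page_dwords(s, pg) * sizeof(uint32_t));
        s->copied_mask |= 1u << pg;
        copied++;
    }
    return copied;
}
```

With this layout, the final copy from user space only pays for pages the parser skipped entirely; whether that recovers the lost x11perf throughput is not something the sketch can show.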
Re: [PATCH] [RFC] drm/radeon/kms: don't require up to 64k allocations.
2009/9/23 Dave Airlie airl...@linux.ie:
> > > From: Dave Airlie airl...@redhat.com
> > >
> > > This avoids needing to do a kmalloc > PAGE_SIZE for the main indirect
> > > buffer chunk, it adds an accessor for all reads from the chunk and
> > > caches a single page at a time for subsequent reads.
> >
> > FWIW, this works on my PowerBook but seems to drop x11perf -aa10text
> > numbers from about 370k/s to about 315k/s.
>
> Yeah I'm unsure how to bring back up the speed, two ideas I have are:
>
> 1) won't help you but on PCI/PCIE we can use the IB to read from and
> avoid copying the chunk since its all cache coherent
>
> 2) keep a bitmap of the pages, copy to the kpage and then to IB, then
> the IB copy from user can just copy any pages we don't hit.
>
> But since what we are doing now is inherently broken, getting order 5
> pages once the system is running is not something we'd expect to work.
> I expect this solution is still faster than the vmalloc overhead.
>
> Dave.

Maybe forward-only iteration would perform better for this parsing. My
idea would be to rewrite radeon_get_ib_value like this and adjust the
parser accordingly. It could also quite easily copy the skipped pages
directly to the IB here.

static inline u32 radeon_get_ib_next_value(struct radeon_cs_parser *p,
                                           unsigned skip)
{
        int i = p->chunk_ib_idx;

        skip += 1; /* advance to next element */
        if (unlikely(p->pg_offset + skip >= (PAGE_SIZE/4))) {
                /* this could be a function call to reduce code size,
                 * because this is a relatively infrequent operation */
                unsigned pages_to_skip;

                skip -= (PAGE_SIZE/4) - p->pg_offset;
                pages_to_skip = skip/(PAGE_SIZE/4);
                skip -= pages_to_skip*(PAGE_SIZE/4);
                p->pg_offset = 0;
                p->pg_idx += pages_to_skip + 1;
                if (DRM_COPY_FROM_USER(p->chunks[i].kpage,
                                       p->chunks[i].user_ptr +
                                       (p->pg_idx * PAGE_SIZE),
                                       PAGE_SIZE))
                        return 0; /* Should we fail whole parsing here? */
        }
        p->pg_offset += skip;
        return p->chunks[i].kpage[p->pg_offset];
}
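The page-crossing arithmetic in the function above can be exercised outside the kernel with a small user-space model. The sketch below keeps the same skip and pages_to_skip steps but is otherwise an assumption: struct fwd_parser, fwd_init, fwd_next and fwd_load_page are hypothetical names, memcpy replaces DRM_COPY_FROM_USER, a 4 KiB page is assumed, and an explicit bounds check returns an error instead of 0 (one possible answer to the "should we fail" question, not a decision from the thread).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_DWORDS 1024u  /* assumed: PAGE_SIZE/4 with 4 KiB pages */

/* Hypothetical user-space model of the forward-only parser state. */
struct fwd_parser {
    const uint32_t *user_ptr;            /* stands in for the user-space chunk */
    size_t length_dw;                    /* chunk length in dwords */
    uint32_t kpage[SKETCH_PAGE_DWORDS];  /* single cached page */
    size_t pg_idx;                       /* index of the cached page */
    size_t pg_offset;                    /* current dword within that page */
};

/* Copy page pg_idx into kpage; memcpy stands in for the copy from user. */
static int fwd_load_page(struct fwd_parser *p)
{
    size_t first = p->pg_idx * SKETCH_PAGE_DWORDS;
    size_t n;

    if (first >= p->length_dw)
        return -1;
    n = p->length_dw - first;
    if (n > SKETCH_PAGE_DWORDS)
        n = SKETCH_PAGE_DWORDS;
    memcpy(p->kpage, p->user_ptr + first, n * sizeof(uint32_t));
    return 0;
}

/* Position the parser on dword 0 of the chunk. */
static int fwd_init(struct fwd_parser *p, const uint32_t *src, size_t length_dw)
{
    p->user_ptr = src;
    p->length_dw = length_dw;
    p->pg_idx = 0;
    p->pg_offset = 0;
    return fwd_load_page(p);
}

/* Advance by skip + 1 dwords and read the value there.
 * Returns 0 on success, -1 if the new position is past the end. */
static int fwd_next(struct fwd_parser *p, size_t skip, uint32_t *out)
{
    skip += 1; /* advance to next element */
    if (p->pg_idx * SKETCH_PAGE_DWORDS + p->pg_offset + skip >= p->length_dw)
        return -1;

    if (p->pg_offset + skip >= SKETCH_PAGE_DWORDS) {
        size_t pages_to_skip;

        skip -= SKETCH_PAGE_DWORDS - p->pg_offset;  /* finish current page */
        pages_to_skip = skip / SKETCH_PAGE_DWORDS;  /* whole pages jumped over */
        skip -= pages_to_skip * SKETCH_PAGE_DWORDS;
        p->pg_offset = 0;
        p->pg_idx += pages_to_skip + 1;
        if (fwd_load_page(p))
            return -1;
    }
    p->pg_offset += skip;
    *out = p->kpage[p->pg_offset];
    return 0;
}
```

Comparing each returned value against a direct index into the source buffer, including skips that land exactly on a page boundary and skips that jump over whole pages, is a cheap way to check the arithmetic before adapting the parser.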