* Eric Anholt <[EMAIL PROTECTED]> wrote:

> > The APIs would be:
> >
> >   int io_resource_init_mapping(struct resource *res);
> >   void io_resource_free_mapping(struct resource *res);
> >   void * io_resource_map(struct resource *res, pfn_t pfn, unsigned long offset);
> >   void io_resource_unmap(struct resource *res, void *kaddr);
> >
> > Note how simple and consistent it all gets: IO resources already
> > know their physical location and their size limits. Being able to
> > cache an ioremap in a mapping [and being able to use atomic kmaps on
> > 32-bit] is a relatively simple and natural extension to the concept.
> >
> > i think that would be quite acceptable - and the APIs could just
> > transparently work on it. This would also allow the PCI code to
> > automatically unmap any cached mappings from resources, when the
> > driver deinitializes.
> >
> > Linus, Jesse, what do you think?
> >
> > i think we need to finalize the API names and their abstraction
> > level, and then could even merge those APIs into v2.6.28 on a fast
> > path, to enable you to use it. It does not interact with anything
> > else so it should be safe to do.
>
> This API needs the cacheability control, which I don't see in it
> currently. [...]

yes, these two should do the trick:

  int io_resource_init_mapping_wc(struct resource *res);
  int io_resource_init_mapping_wb(struct resource *res);

> Second, we need to know when we're doing a mapping whether we're
> affected by atomic scheduling restrictions. Right now our plan has
> been to try doing page-by-page
> io_map_atomic_wc()/copy_from_user_inatomic()/io_unmap_atomic(), and if
> we fail at that at some point (map returns NULL or we get a partial
> completion from copy_from_user_inatomic) then fall back to io_map_wc()
> and copy_from_user() the whole thing at once. That gets us good
> performance on both x86 with highmem and x86-64, and not too shabby
> performance on x86 non-highmem.

that gets ugly very fast. I think we should not use atomic kmaps but
NR_CPUS _fixmaps_ with a per CPU array of mutexes (this is basically
atomic kmaps but without the preemption restrictions). We could
take/drop the mutex, and statistically you'll stay on the same CPU and
won't ever contend on that lock in practice.

> Also, while it's rare, there have been graphics cards (looking at you,
> S3) where BARs were expensive for some reason and they stuffed both
> the framebuffer and registers into one PCI BAR, where you want the FB
> to be WC and the registers to be UC. Not sure if they would be
> supportable with this API or not. And if it's not, I'm not sure how
> much we care to design for them, but it's something to potentially
> consider.

yes, this is a weakness of this API - you cannot mix multiple
cacheability domains within the same BAR.

and that can happen on non-graphics as well: some storage controller
that has regular control registers in one portion of the BAR, which
all need to be consistently accessed via UC and properly POST-ed -
while it could also have some large mailbox structure at the end of
the BAR, which could be mapped either cacheable or perhaps WC.

So ...
i guess we can go back to the io_mapping API proposed by Keith, but
make it fixmap + mutex based rather than atomic kmap based - for good
32-bit performance. (and the fixmap would not be used on 64-bit at all)

> Finally, I'm confused by the pfn and offset args to io_resource_map,
> when I expected something parallel to ioremap but with our resource
> arg added.

ok.

	Ingo

--
_______________________________________________
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel
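
For illustration only, here is a rough user-space sketch of the "NR_CPUS fixmaps with a per CPU array of mutexes" idea from the message above. It is not kernel code and not the io_mapping API itself: every name in it (model_io_mapping, model_io_map and so on) is hypothetical, a pthread mutex stands in for the per-CPU mutex, a plain pointer stands in for the fixmap slot, and the caller passes the CPU number explicitly instead of reading it from the scheduler.

```c
/* User-space model only: none of these names are kernel APIs. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MODEL_NR_CPUS   4      /* assumed CPU count for the model */
#define MODEL_PAGE_SIZE 4096   /* assumed page size for the model */

/* One mapping slot per CPU, each guarded by its own mutex. */
struct model_slot {
    pthread_mutex_t lock;
    unsigned char *window;     /* page currently "mapped", NULL when idle */
};

struct model_io_mapping {
    unsigned char *base;       /* stands in for the memory behind a BAR */
    size_t size;
    struct model_slot slots[MODEL_NR_CPUS];
};

static int model_io_mapping_init(struct model_io_mapping *m,
                                 unsigned char *base, size_t size)
{
    m->base = base;
    m->size = size;
    for (int i = 0; i < MODEL_NR_CPUS; i++) {
        m->slots[i].window = NULL;
        if (pthread_mutex_init(&m->slots[i].lock, NULL) != 0) {
            /* Undo the mutexes that were already initialized. */
            while (i-- > 0)
                pthread_mutex_destroy(&m->slots[i].lock);
            return -1;
        }
    }
    return 0;
}

static void model_io_mapping_free(struct model_io_mapping *m)
{
    for (int i = 0; i < MODEL_NR_CPUS; i++)
        pthread_mutex_destroy(&m->slots[i].lock);
}

/*
 * Map the page containing 'offset' through the slot of 'cpu'.
 * Taking a mutex may sleep, so the caller is under no atomic
 * restriction while it holds the returned pointer.
 */
static void *model_io_map(struct model_io_mapping *m, int cpu, size_t offset)
{
    if (cpu < 0 || cpu >= MODEL_NR_CPUS || offset >= m->size)
        return NULL;

    struct model_slot *s = &m->slots[cpu];
    pthread_mutex_lock(&s->lock);
    s->window = m->base + (offset & ~(size_t)(MODEL_PAGE_SIZE - 1));
    return s->window;
}

/* Release the slot taken by model_io_map() for the same 'cpu'. */
static void model_io_unmap(struct model_io_mapping *m, int cpu)
{
    assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
    struct model_slot *s = &m->slots[cpu];
    assert(s->window != NULL);
    s->window = NULL;
    pthread_mutex_unlock(&s->lock);
}
```

In this model the CPU number only selects a slot; the mutex is what protects it. Two users contend only when they pick the same slot at the same time, which is the case the message expects to be rare in practice.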