Richard Mills <r...@utk.edu> writes:

> On Wed, Jun 3, 2015 at 6:04 PM, Jed Brown <j...@jedbrown.org> wrote:
>
>> Have you heard anything back about whether move_pages() will work?
>
> move_pages() will work to move pages between MCDRAM and DRAM right
> now,

Great!

> but it screws up memkind's partitioning of the heap (it won't be aware
> that the pages have been moved).

Then memkind is stupid or the kernel isn't exposing the correct
information to memkind.  Tell them to not be lazy and do it right.

> Jed, I'm with you in thinking that, ultimately, there actually needs to be
> a way to make these kinds of decisions based on global information.  We
> don't have that right now.  But if we get some smart allocator (and
> migrator) that gives us, say malloc_use_oracle() to always make the good
> decision,

The oracle has to see into the future.  move_pages() is so much more
powerful.

> we still should have something like a PetscAdvMalloc() that provides a
> context to allow us to pass advice to this smart allocator to provide
> hints about how it will be accessed, whatever.

What does the caller know?  What good is the context if we always pass
I_HAVE_NO_IDEA?

> In a lot of cases, simple size-based allocation is probably the way to go.
> An option to do automatic size-based placement is even in the latest
> memkind sources on github now, but it will do that for the entire
> application.

That's crude; I'd rather have each library use its own threshold.

> I'd like to be able to restrict this to only the PETSc portion: Maybe
> a code that uses PETSc also needs to allocate some enormous lookup
> tables that are big but have accesses that are really latency- rather
> than bandwidth-sensitive.  Or, to be specific to a code I actually
> know, I believe that in PFLOTRAN there are some pretty large
> allocations required for auxiliary variables that don't need to go in
> high-bandwidth memory, though we will want all of the large PETSc
> objects to go in there.

Fine.  That involves a couple lines of code.  Go into PetscMallocAlign
and add the ability to use memkind.  Add a run-time option to control
the threshold.  Done.

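For concreteness, here is a rough sketch of what that size threshold could look like. It is illustrative only and is not the actual PetscMallocAlign code: the names sketch_malloc, sketch_free, choose_tier, and the SKETCH_HBW_THRESHOLD environment variable are all hypothetical, and the 1 MiB default is an arbitrary assumption. The optional USE_MEMKIND path assumes memkind's hbwmalloc.h interface (hbw_posix_memalign, hbw_free); without it, both tiers fall back to plain posix_memalign so the sketch still builds.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdlib.h>
#ifdef USE_MEMKIND
#include <hbwmalloc.h> /* assumption: memkind's high-bandwidth interface */
#endif

#define SKETCH_ALIGN 64 /* hypothetical alignment, in bytes */

typedef enum { TIER_DEFAULT = 0, TIER_HBW = 1 } Tier;

/* The tier is kept next to the pointer so free() can dispatch correctly. */
typedef struct { void *ptr; Tier tier; } Block;

/* Run-time threshold in bytes, read from a hypothetical environment
 * variable. Defaults to 1 MiB (arbitrary); an unparsable value reads as 0,
 * which sends every allocation to the high-bandwidth tier. */
static size_t hbw_threshold(void)
{
  const char *s = getenv("SKETCH_HBW_THRESHOLD");
  if (!s || !*s) return (size_t)1 << 20;
  return (size_t)strtoull(s, NULL, 10);
}

/* The whole policy: large allocations go to high-bandwidth memory. */
static Tier choose_tier(size_t size, size_t threshold)
{
  return size >= threshold ? TIER_HBW : TIER_DEFAULT;
}

/* Returns 0 on success, an errno-style value otherwise. */
static int sketch_malloc(size_t size, Block *out)
{
  assert(out != NULL);
  out->ptr  = NULL;
  out->tier = choose_tier(size, hbw_threshold());
#ifdef USE_MEMKIND
  if (out->tier == TIER_HBW)
    return hbw_posix_memalign(&out->ptr, SKETCH_ALIGN, size);
#endif
  /* Without memkind the tier is recorded but the memory is ordinary DRAM. */
  return posix_memalign(&out->ptr, SKETCH_ALIGN, size);
}

static void sketch_free(Block *b)
{
  assert(b != NULL);
#ifdef USE_MEMKIND
  if (b->tier == TIER_HBW) { hbw_free(b->ptr); b->ptr = NULL; return; }
#endif
  free(b->ptr);
  b->ptr = NULL;
}
```

With memkind installed this would presumably be built with something like -DUSE_MEMKIND -lmemkind. The point of the sketch is that the policy is one comparison; anything more elaborate has to beat it.
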
If you want complexity to bleed into the library (and necessarily into user code if given any power at all), I think you need to demonstrate a tangible benefit that cannot be obtained by something simpler. Consider the simple and dumb threshold above to be the null hypothesis. This is just my opinion. Feel free to make a branch with whatever you prefer.