Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-06-10 Thread Richard Mills
Hi Barry,

I like your suggestion and I'll give this implementation a try.  I've used
some experimental tools that interpose on memory allocation calls and then
track the accesses to give similar information, but having what you suggest
implemented in PETSc would be easier and more useful in a lot of ways.

What we really need is dynamically updated priorities for what arrays get
placed in the high-bandwidth memory.  This sort of tracking might enable a
reasonable way to estimate these priorities.  (This only tells us about
PETSc's memory and doesn't solve the "global" problem, but it's a start.)

I have to think about it a bit more, but I still believe that using
something like move_pages(2) will preclude the use of a heap manager for the
high-bandwidth memory.  Maybe we don't need one.  If we do, then, yes, I
think we can deal with the inability to move an array between the different
types of memory while keeping the same virtual address because we can just
switch the ->array pointer.

I'll plan to implement the very simple (threshold-based) placement approach
and the tracking you suggest, and then evaluate whether the simple approach
seems adequate or whether it would be worthwhile to support more complex
options.
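
Something like the following is what I have in mind -- just a rough sketch,
assuming we add the cntread/cntwrite fields you suggest to _p_Vec (the exact
increment and reporting points will need care):

  /* hypothetical additions to struct _p_Vec */
  PetscInt cntread;   /* incremented in VecGetArrayRead() */
  PetscInt cntwrite;  /* incremented in VecGetArray() */

  /* in VecGetArray(Vec v, ...) */
  v->cntwrite++;

  /* in VecGetArrayRead(Vec v, ...) */
  v->cntread++;

  /* in VecDestroy(), before the storage is freed: report name and counts */
  PetscPrintf(PETSC_COMM_SELF, "Vec %s: read cnt %D, write cnt %D\n",
              ((PetscObject)v)->name, v->cntread, v->cntwrite);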

--Richard

On Wed, Jun 3, 2015 at 7:39 PM, Barry Smith  wrote:

>
>   Richard,
>
>If the code does not use VecSetValues() then one could measure the
> "importance" of each vector by counting two numbers, the number of times
> VecGetArray() is called on the vector and the number of times
> VecGetArrayRead() is called. We don't currently measure this but you could
> add cntread and cntwrite fields to _p_Vec and have VecGetArray[Read]()
> increment them. Then in VecDestroy() just have the vector print its name
> and the cnts. It would be interesting to see how many vectors there are,
> for example in src/ts/examples/tutorials (or a subdirectory), and what the
> distribution of these cnts is.
>
>Barry
>
> The reason this is unreliable when VecSetValues() is used is that EACH
> VecSetValues() calls VecGetArray() which will result in artificially high
> write cnts when each one represents only accessing a tiny part of the
> vector.
>
>
> > On Jun 3, 2015, at 9:26 PM, Barry Smith  wrote:
> >
> >
> >   To follow up on this, going back to my "advice object" to malloc being
> a living object as opposed to just some flags. In the case where different
> vectors may have very different "importances" at different times in the
> runtime of the simulation one could "switch" some vectors from using slow
> to faster memory when one knows the code is switching to a different phase
> where the vector "importances" are different.
> >
> >  Barry
> >
> >  Note that even if Intel cannot provide a way to "switch" a memory
> address between fast and slow, it doesn't really matter from the PETSc point
> of view, since inside any particular PETSc vector we could switch the
> ->array pointer to a different memory location (and copy stuff over if
> needed) when changing a vector from important to unimportant or the
> opposite. (since no code outside the vector object knows what the pointer
> is).
> >
> >
> >> On Jun 3, 2015, at 9:18 PM, Barry Smith  wrote:
> >>
> >>
> >>> On Jun 3, 2015, at 8:55 PM, Richard Mills  wrote:
> >>>
> >>> Ha, yes.  I'll try this out, but I do wonder what people's thoughts
> are on the best way to "tag" an object like a Vec or Mat for some
> particular treatment of its placement in memory.  Does doing this at the
> level of a Mat or Vec (e.g., VecSetAdvMallocCtx() ) sound appropriate?  We
> could actually make this a part of any PetscObject, but I think that's not
> necessary.
> >>
> >> No idea.
> >>
> >> Perhaps, and this is just nonsense off the top of my head, if you had
> some measure of the importance of a vector (or matrix; I would start with
> vectors for simplicity and since we have more of them) based on how often
> its values would be "accessed". So a vector that you know is only used
> "once in a while" gets a lower "importance" than one that gets used "very
> often". Of course determining these vectors importances may be difficult.
> You could do it experimentally, add some code that measures how often each
> vector gets its values "accessed (whatever that means)/read write" and see
> if there is some distribution (do this for a nontrivial TS example) where
> some vectors are accessed often and others rarely. Now place the often
> "accessed" vectors in faster memory and see how much faster the code is.
> >>
> >> Barry
> >>
> >> A related note is that "we" are not particularly careful about
> "reusing" work vectors; say a code has ten different work vectors for
> different phases of the computation; now imagine a careful "global
> analysis" that determined it could get away with three work vectors (since
> only at most three had relevant values at any one time), now pop those
> three work vectors into faster memory where the ten previous work vectors
> could not fit. Obvi

Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-06-04 Thread Richard Mills
On Wed, Jun 3, 2015 at 8:44 PM, Barry Smith  wrote:

>
> > On Jun 3, 2015, at 10:35 PM, Jed Brown  wrote:
> >
> > Barry Smith  writes:
> > [...]
> > A smart Congress would say "redefine 'beat us' to something that matters
> > and stop wasting your time on vanity".
>
>   Two words that will never be next to each other: smart congress
>

Why would someone who is actually smart be in congress, when their
intelligence gives them so many other options?

--Richard


Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-06-03 Thread Barry Smith

> On Jun 3, 2015, at 10:35 PM, Jed Brown  wrote:
> 
> Barry Smith  writes:
> 
>>  It is OUR job as PETSc developers to hide that complexity from the
>>  "most people" who would be driven away from HPC because of it. 
> 
> Absolutely.  So now the question becomes "what benefit can this have,
> predicated on not letting the complexity bleed onto the user?"
> 
>>  Thus if Richard proposed changing VecCreate() to VecCreate(MPI_Comm,
>>  Crazy Intel specific Memkind options, Vec *x); we would reject
>>  it. He is not even coming close to proposing that, in fact he is not
> proposing anything, he is just asking for advice on how to run some
>>  experiments to see if the Phi crazy memory shit can be beneficial to
>>  some PETSc apps.
> 
> And my advice is to start with the simplest thing possible.
> 
> I'm also expressing skepticism that a more sophisticated solution that
> _does not bleed complexity on the user_ is capable of substantially
> beating the simple thing across a meaningful range of applications.

  There you go again with "meaningful range of applications". Why can't you get 
it through your head that if cosmology science advances at all from exascale 
(which I doubt it will, speaking of unethical bastards) then all of exascale is 
like totally worthwhile :-)

> 
>>   Says the man who suggested the PetscThreadComm stuff in PETSc that
>>   was recently removed because it was too complicated and had too
>>   (no) benefits :-)
> 
> Yes, I was trying to solve a problem that didn't need to be solved.  My
> mistake.
> 
>>   The story to Congress is: "China might beat us if you don't give us
>>   money", any other effect is third order at best.
> 
> A smart Congress would say "redefine 'beat us' to something that matters
> and stop wasting your time on vanity".

  Two words that will never be next to each other: smart congress




Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-06-03 Thread Barry Smith

> On Jun 3, 2015, at 10:28 PM, Jed Brown  wrote:
> 
> Barry Smith  writes:
>>  Sure but the super high-end (DOE LCF centers) focus allows (actually
>>  loves) the need for "super brittle" stuff.  
> 
> Job security.
> 
>>  It is not the bread and butter of PETSc but if they (silly ASCR) are
>>  willing to foot our bills to do our bread and butter by pandering to
>>  the "super brittle" high-end what's the harm in pandering (aside
>>  from our souls) since we are not actually doing the work :-)
> 
> I have ethical objections to ruining the careers of scientists capable
> of doing important things.

  You are not. The few unethical bastards who have hitched their train to 
exascale will do their stuff regardless of whether you exist or not, while the 
vast majority who have not hitched their train to exascale benefit from the 
work you do that they can utilize to do science. The poor unfortunate souls 
who are told "you must use x, y or z because of X, Y, or Z" and get screwed 
are not screwed by you, they are screwed by the snake oil salesmen, and all you 
can do is warn them about the snake oil salesmen, which you do. Better to use a 
little bit of the exascale money to push it in a better direction than to allow 
all of that money to move things in the wrong direction; just because you can't 
FIX exascale doesn't mean it is unethical to use some of the money to move it 
slightly in a better direction.







Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-06-03 Thread Jed Brown
Barry Smith  writes:

>   It is OUR job as PETSc developers to hide that complexity from the
>   "most people" who would be driven away from HPC because of it. 

Absolutely.  So now the question becomes "what benefit can this have,
predicated on not letting the complexity bleed onto the user?"

>   Thus if Richard proposed changing VecCreate() to VecCreate(MPI_Comm,
>   Crazy Intel specific Memkind options, Vec *x); we would reject
>   it. He is not even coming close to proposing that, in fact he is not
> proposing anything, he is just asking for advice on how to run some
>   experiments to see if the Phi crazy memory shit can be beneficial to
>   some PETSc apps.

And my advice is to start with the simplest thing possible.

I'm also expressing skepticism that a more sophisticated solution that
_does not bleed complexity on the user_ is capable of substantially
beating the simple thing across a meaningful range of applications.

>Says the man who suggested the PetscThreadComm stuff in PETSc that
>was recently removed because it was too complicated and had too
>(no) benefits :-)

Yes, I was trying to solve a problem that didn't need to be solved.  My
mistake.

>The story to Congress is: "China might beat us if you don't give us
>money", any other effect is third order at best.

A smart Congress would say "redefine 'beat us' to something that matters
and stop wasting your time on vanity".




Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-06-03 Thread Jed Brown
Barry Smith  writes:
>   Sure but the super high-end (DOE LCF centers) focus allows (actually
>   loves) the need for "super brittle" stuff.  

Job security.

>   It is not the bread and butter of PETSc but if they (silly ASCR) are
>   willing to foot our bills to do our bread and butter by pandering to
>   the "super brittle" high-end what's the harm in pandering (aside
>   from our souls) since we are not actually doing the work :-)

I have ethical objections to ruining the careers of scientists capable
of doing important things.




Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-06-03 Thread Barry Smith

> On Jun 3, 2015, at 10:08 PM, Jed Brown  wrote:
> 
> Barry Smith  writes:
>>  Even if it "helps" in only 30 percent of applications that is still
>>  a good thing (and a great thing politically). Then it becomes an
>>  issue of education and proper profiling tools to tell people for
>>  their apps that it won't work; so the other 70% is not "confused".
> 
> How much does it have to help those 30% if the complexity contributes to
> driving 30% of potential new users away from HPC?

  It is OUR job as PETSc developers to hide that complexity from the "most 
people" who would be driven away from HPC because of it. Thus if Richard 
proposed changing VecCreate() to VecCreate(MPI_Comm, Crazy Intel specific 
Memkind options, Vec *x); we would reject it. He is not even coming close to 
proposing that, in fact he is not proposing anything, he is just asking for 
advice on how to run some experiments to see if the Phi crazy memory shit can 
be beneficial to some PETSc apps.


> 
> I'm in favor of doing the simplest thing until presented with
> overwhelming evidence that the complicated thing is necessary.

   Says the man who suggested the PetscThreadComm stuff in PETSc that was 
recently removed because it was too complicated and had too (no) benefits :-)

>  I
> understand that this doesn't win grants; you have to say that the simple
> thing that has been working will never work at exascale.
> 
>>  Note that Marc Snir today told me that it is perfectly fine if the
>>  "largest computing systems", i.e. the LCFs can only provide useful
>>  performance for a small subset of all possible applications.
> 
> Even when that small subset does not contain the primary apps used to
> sell the machines to Congress.  It's just too difficult to have a
> consistent story.

   The story to Congress is: "China might beat us if you don't give us money", 
any other effect is third order at best.




Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-06-03 Thread Jed Brown
Jeff Hammond  writes:

> On Wed, Jun 3, 2015 at 9:58 PM, Jed Brown  wrote:
>> Jeff Hammond  writes:
>>> The beauty of git/github is one can make branches to try out anything
>>> they want even if Jed thinks that he knows better than Intel how to
>>> write system software for Intel's hardware.
>>
>> I'm objecting to the interface.  I think that if they try to get memkind
>> merged into the existing libnuma project, they'll see similar
>> resistance.  It is essential for low-level interfaces to create
>> foundations that can be reliably built upon, not gushing wounds that
>> bleed complexity into everything built on top.
>
> Step 1: Commit a change associated with the new interface function.

This response is not germane to my point above.  I also never asked for
a new interface.

> Step 2: Commit a change implementing the new interface function.
> Step 3: File a pull request.
>
>>> This link is equivalent to pushing the "Fork" button on Github's
>>> memkind page: https://github.com/memkind/memkind#fork-destination-box.
>>> I'm sure that the memkind developers would be willing to review your
>>> pull request once you've implemented memkind_move_pages().
>>
>> 1. I cannot test it because I don't have access to the hardware.
>
> The memkind library itself was developed entirely without access to
> the hardware to which you refer, so this complaint is not relevant.

The interesting case here is testing failure modes in the face of
resource exhaustion, which doesn't seem to have been addressed in a
serious way by memkind and requires other trickery to test without
MCDRAM.  Also, the performance effects are relevant.  But I don't want
anything else in memkind because I don't want to use memkind for
anything ever.

>> 2. I think memkind is solving the wrong problem in the wrong way.
>
> It is more correct to say it is solving a different problem than the
> one you care about.  memkind is the correct way to solve the problem
> it is trying to solve.  Please stop equating your disagreement with
> the problem statement as evidence that the solution is terrible.

This is pedantry.  Is there a clear statement of what problem memkind
solves?

  The memkind library is a user extensible heap manager built on top of
  jemalloc which enables control of memory characteristics and a
  partitioning of the heap between kinds of memory.

This is just a low-level statement about what it does and I would argue
it doesn't even do this in a useful way, because it acts entirely at
allocation time and assumes the caller is omniscient.

>> 3. According to Richard, the mature move_pages(2) interface has been
>> implemented.  That's what I wanted, so I'll just use that -- memkind
>> dependency gone.
>
> Does this mean that you will stop complaining about memkind, since it
> is not directly relevant to your life?  I would like that.

Yes, as soon as people stop telling me that I should use memkind and
stop asking to put it into packages I interact with, I'll ignore it like
countless other projects that are irrelevant to what I do.  But if, like
OpenMP, the turd keeps landing in my breakfast, I'm probably going to
mention that it's impolite to keep pooping in my breakfast.




Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-06-03 Thread Barry Smith

> On Jun 3, 2015, at 10:04 PM, Jeff Hammond  wrote:
> 
> If everyone would just indent with tabs, we could just set the indent
> spacing with our editors ;-)

  Ah, heresy, kill him!


> 
> On Wed, Jun 3, 2015 at 10:01 PM, Barry Smith  wrote:
>> 
>>> On Jun 3, 2015, at 9:58 PM, Jeff Hammond  wrote:
>>> 
>>> http://git.mpich.org/mpich.git/blob/HEAD:/src/mpi/init/init.c
>>> https://github.com/open-mpi/ompi/blob/master/ompi/mpi/c/init.c
>> 
>>  As I said, super insane :-)
>> 
>>  Barry
>> 
>>  I'm just having fun here; I do believe that 2 is the ultimate correct 
>> indentation but I can always run a preprocessor to fix their code before I 
>> use it :-)
>> 
>>> 
>>> Jeff
>>> 
>>> On Wed, Jun 3, 2015 at 9:43 PM, Barry Smith  wrote:
 
 Jeff,
 
  Ahh, from this page, it is definitively clear that the Intel people have 
 their heads totally up their asses
 
 formatted source code with astyle --style=linux --indent=spaces=4 -y -S
 
 when everyone knows that any indent that is not 2 characters is totally 
 insane :-)
 
 Barry
 
 
> On Jun 3, 2015, at 9:37 PM, Jeff Hammond  wrote:
> 
>>> but it screws up memkind's partitioning of the heap (it won't be aware
>>> that the pages have been moved).
>> 
>> Then memkind is stupid or the kernel isn't exposing the correct
>> information to memkind.  Tell them to not be lazy and do it right.
> 
> The beauty of git/github is one can make branches to try out anything
> they want even if Jed thinks that he knows better than Intel how to
> write system software for Intel's hardware.
> 
> This link is equivalent to pushing the "Fork" button on Github's
> memkind page: https://github.com/memkind/memkind#fork-destination-box.
> I'm sure that the memkind developers would be willing to review your
> pull request once you've implemented memkind_move_pages().
> 
> Jeff
> 
> --
> Jeff Hammond
> jeff.scie...@gmail.com
> http://jeffhammond.github.io/
 
>>> 
>>> 
>>> 
>>> --
>>> Jeff Hammond
>>> jeff.scie...@gmail.com
>>> http://jeffhammond.github.io/
>> 
> 
> 
> 
> -- 
> Jeff Hammond
> jeff.scie...@gmail.com
> http://jeffhammond.github.io/



Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-06-03 Thread Jed Brown
Barry Smith  writes:
>   Even if it "helps" in only 30 percent of applications that is still
>   a good thing (and a great thing politically). Then it becomes an
>   issue of education and proper profiling tools to tell people for
>   their apps that it won't work; so the other 70% is not "confused".

How much does it have to help those 30% if the complexity contributes to
driving 30% of potential new users away from HPC?

I'm in favor of doing the simplest thing until presented with
overwhelming evidence that the complicated thing is necessary.  I
understand that this doesn't win grants; you have to say that the simple
thing that has been working will never work at exascale.

>   Note that Marc Snir today told me that it is perfectly fine if the
>   "largest computing systems", i.e. the LCFs can only provide useful
>   performance for a small subset of all possible applications.

Even when that small subset does not contain the primary apps used to
sell the machines to Congress.  It's just too difficult to have a
consistent story.




Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-06-03 Thread Jeff Hammond
If everyone would just indent with tabs, we could just set the indent
spacing with our editors ;-)

On Wed, Jun 3, 2015 at 10:01 PM, Barry Smith  wrote:
>
>> On Jun 3, 2015, at 9:58 PM, Jeff Hammond  wrote:
>>
>> http://git.mpich.org/mpich.git/blob/HEAD:/src/mpi/init/init.c
>> https://github.com/open-mpi/ompi/blob/master/ompi/mpi/c/init.c
>
>   As I said, super insane :-)
>
>   Barry
>
>   I'm just having fun here; I do believe that 2 is the ultimate correct 
> indentation but I can always run a preprocessor to fix their code before I 
> use it :-)
>
>>
>> Jeff
>>
>> On Wed, Jun 3, 2015 at 9:43 PM, Barry Smith  wrote:
>>>
>>>  Jeff,
>>>
>>>   Ahh, from this page, it is definitively clear that the Intel people have 
>>> their heads totally up their asses
>>>
>>> formatted source code with astyle --style=linux --indent=spaces=4 -y -S
>>>
>>> when everyone knows that any indent that is not 2 characters is totally 
>>> insane :-)
>>>
>>>  Barry
>>>
>>>
 On Jun 3, 2015, at 9:37 PM, Jeff Hammond  wrote:

>> but it screws up memkind's partitioning of the heap (it won't be aware
>> that the pages have been moved).
>
> Then memkind is stupid or the kernel isn't exposing the correct
> information to memkind.  Tell them to not be lazy and do it right.

 The beauty of git/github is one can make branches to try out anything
 they want even if Jed thinks that he knows better than Intel how to
 write system software for Intel's hardware.

 This link is equivalent to pushing the "Fork" button on Github's
 memkind page: https://github.com/memkind/memkind#fork-destination-box.
 I'm sure that the memkind developers would be willing to review your
 pull request once you've implemented memkind_move_pages().

 Jeff

 --
 Jeff Hammond
 jeff.scie...@gmail.com
 http://jeffhammond.github.io/
>>>
>>
>>
>>
>> --
>> Jeff Hammond
>> jeff.scie...@gmail.com
>> http://jeffhammond.github.io/
>



-- 
Jeff Hammond
jeff.scie...@gmail.com
http://jeffhammond.github.io/


Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-06-03 Thread Jeff Hammond
On Wed, Jun 3, 2015 at 9:58 PM, Jed Brown  wrote:
> Jeff Hammond  writes:
>> The beauty of git/github is one can make branches to try out anything
>> they want even if Jed thinks that he knows better than Intel how to
>> write system software for Intel's hardware.
>
> I'm objecting to the interface.  I think that if they try to get memkind
> merged into the existing libnuma project, they'll see similar
> resistance.  It is essential for low-level interfaces to create
> foundations that can be reliably built upon, not gushing wounds that
> bleed complexity into everything built on top.

Step 1: Commit a change associated with the new interface function.
Step 2: Commit a change implementing the new interface function.
Step 3: File a pull request.

>> This link is equivalent to pushing the "Fork" button on Github's
>> memkind page: https://github.com/memkind/memkind#fork-destination-box.
>> I'm sure that the memkind developers would be willing to review your
>> pull request once you've implemented memkind_move_pages().
>
> 1. I cannot test it because I don't have access to the hardware.

The memkind library itself was developed entirely without access to
the hardware to which you refer, so this complaint is not relevant.

> 2. I think memkind is solving the wrong problem in the wrong way.

It is more correct to say it is solving a different problem than the
one you care about.  memkind is the correct way to solve the problem
it is trying to solve.  Please stop equating your disagreement with
the problem statement as evidence that the solution is terrible.

> 3. According to Richard, the mature move_pages(2) interface has been
> implemented.  That's what I wanted, so I'll just use that -- memkind
> dependency gone.

Does this mean that you will stop complaining about memkind, since it
is not directly relevant to your life?  I would like that.

Jeff

-- 
Jeff Hammond
jeff.scie...@gmail.com
http://jeffhammond.github.io/


Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-06-03 Thread Barry Smith

> On Jun 3, 2015, at 9:58 PM, Jeff Hammond  wrote:
> 
> http://git.mpich.org/mpich.git/blob/HEAD:/src/mpi/init/init.c
> https://github.com/open-mpi/ompi/blob/master/ompi/mpi/c/init.c

  As I said, super insane :-)

  Barry

  I'm just having fun here; I do believe that 2 is the ultimate correct 
indentation but I can always run a preprocessor to fix their code before I use 
it :-)

> 
> Jeff
> 
> On Wed, Jun 3, 2015 at 9:43 PM, Barry Smith  wrote:
>> 
>>  Jeff,
>> 
>>   Ahh, from this page, it is definitively clear that the Intel people have 
>> their heads totally up their asses
>> 
>> formatted source code with astyle --style=linux --indent=spaces=4 -y -S
>> 
>> when everyone knows that any indent that is not 2 characters is totally 
>> insane :-)
>> 
>>  Barry
>> 
>> 
>>> On Jun 3, 2015, at 9:37 PM, Jeff Hammond  wrote:
>>> 
> but it screws up memkind's partitioning of the heap (it won't be aware
> that the pages have been moved).
 
 Then memkind is stupid or the kernel isn't exposing the correct
 information to memkind.  Tell them to not be lazy and do it right.
>>> 
>>> The beauty of git/github is one can make branches to try out anything
>>> they want even if Jed thinks that he knows better than Intel how to
>>> write system software for Intel's hardware.
>>> 
>>> This link is equivalent to pushing the "Fork" button on Github's
>>> memkind page: https://github.com/memkind/memkind#fork-destination-box.
>>> I'm sure that the memkind developers would be willing to review your
>>> pull request once you've implemented memkind_move_pages().
>>> 
>>> Jeff
>>> 
>>> --
>>> Jeff Hammond
>>> jeff.scie...@gmail.com
>>> http://jeffhammond.github.io/
>> 
> 
> 
> 
> -- 
> Jeff Hammond
> jeff.scie...@gmail.com
> http://jeffhammond.github.io/



Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-06-03 Thread Barry Smith

> On Jun 3, 2015, at 9:51 PM, Jed Brown  wrote:
> 
> Barry Smith  writes:
>>  Perhaps, and this is just nonsense off the top of my head, if you
>>  had some measure of the importance of a vector (or matrix; I would
>>  start with vectors for simplicity and since we have more of them)
>>  based on how often its values would be "accessed". So a vector that
>>  you know is only used "once in a while" gets a lower "importance"
>>  than one that gets used "very often". Of course determining these
>>  vectors' importances may be difficult. You could do it
>>  experimentally, add some code that measures how often each vector
>>  gets its values "accessed (whatever that means)/read write" and see
>>  if there is some distribution (do this for a nontrivial TS example)
>>  where some vectors are accessed often and others rarely. 
> 
> This is what I termed profile-guided and it's very accurate (you have
> global space-time information), but super brittle when
> resource-constrained.

  Sure but the super high-end (DOE LCF centers) focus allows (actually loves)  
the need for "super brittle" stuff.  It is not the bread and butter of PETSc 
but if they (silly ASCR) are willing to foot our bills to do our bread and 
butter by pandering to the "super brittle" high-end what's the harm in 
pandering (aside from our souls) since we are not actually doing the work :-)


> 
> Note that in the case of Krylov solvers, the first vectors in the Krylov
> space are accessed far more than later vectors (e.g., the 30th vector is
> accessed once per 30 iterations versus the first vector which is
> accessed every iteration).  Simple greedy allocation is great for this
> case.
> 
> It's terrible in other cases, a simple case of which is two solvers
> where the first is cheap (or solved only rarely) and the second is
> solved repeatedly at great expense.  Nested solvers are one such
> example.  But you don't know which one is more expensive except in
> retrospect, and this can even change as nonlinearities evolve.



Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-06-03 Thread Jeff Hammond
http://git.mpich.org/mpich.git/blob/HEAD:/src/mpi/init/init.c
https://github.com/open-mpi/ompi/blob/master/ompi/mpi/c/init.c

Jeff

On Wed, Jun 3, 2015 at 9:43 PM, Barry Smith  wrote:
>
>   Jeff,
>
>Ahh, from this page, it is definitively clear that the Intel people have 
> their heads totally up their asses
>
> formatted source code with astyle --style=linux --indent=spaces=4 -y -S
>
> when everyone knows that any indent that is not 2 characters is totally 
> insane :-)
>
>   Barry
>
>
>> On Jun 3, 2015, at 9:37 PM, Jeff Hammond  wrote:
>>
 but it screws up memkind's partitioning of the heap (it won't be aware
 that the pages have been moved).
>>>
>>> Then memkind is stupid or the kernel isn't exposing the correct
>>> information to memkind.  Tell them to not be lazy and do it right.
>>
>> The beauty of git/github is one can make branches to try out anything
>> they want even if Jed thinks that he knows better than Intel how to
>> write system software for Intel's hardware.
>>
>> This link is equivalent to pushing the "Fork" button on Github's
>> memkind page: https://github.com/memkind/memkind#fork-destination-box.
>> I'm sure that the memkind developers would be willing to review your
>> pull request once you've implemented memkind_move_pages().
>>
>> Jeff
>>
>> --
>> Jeff Hammond
>> jeff.scie...@gmail.com
>> http://jeffhammond.github.io/
>



-- 
Jeff Hammond
jeff.scie...@gmail.com
http://jeffhammond.github.io/


Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-06-03 Thread Jed Brown
Jeff Hammond  writes:
> The beauty of git/github is one can make branches to try out anything
> they want even if Jed thinks that he knows better than Intel how to
> write system software for Intel's hardware.

I'm objecting to the interface.  I think that if they try to get memkind
merged into the existing libnuma project, they'll see similar
resistance.  It is essential for low-level interfaces to create
foundations that can be reliably built upon, not gushing wounds that
bleed complexity into everything built on top.

> This link is equivalent to pushing the "Fork" button on Github's
> memkind page: https://github.com/memkind/memkind#fork-destination-box.
> I'm sure that the memkind developers would be willing to review your
> pull request once you've implemented memkind_move_pages().

1. I cannot test it because I don't have access to the hardware.

2. I think memkind is solving the wrong problem in the wrong way.

3. According to Richard, the mature move_pages(2) interface has been
implemented.  That's what I wanted, so I'll just use that -- memkind
dependency gone.
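
For reference, driving move_pages(2) directly takes only a few lines.  A
sketch (the target NUMA node is a placeholder; on real hardware the MCDRAM
node number would have to be discovered, e.g., via libnuma):

  #include <numaif.h>   /* move_pages(); link with -lnuma */
  #include <stdint.h>
  #include <stdlib.h>
  #include <unistd.h>

  /* Migrate the pages backing buf[0..len) to NUMA node 'node'. */
  static long migrate_to_node(void *buf, size_t len, int node)
  {
    long   pagesize = sysconf(_SC_PAGESIZE);
    char  *start    = (char *)((uintptr_t)buf & ~((uintptr_t)pagesize - 1));
    size_t npages   = ((char *)buf + len - start + pagesize - 1) / pagesize;
    void **pages    = malloc(npages * sizeof(void *));
    int   *nodes    = malloc(npages * sizeof(int));
    int   *status   = malloc(npages * sizeof(int));
    long   rc;
    for (size_t i = 0; i < npages; i++) {
      pages[i] = start + i * pagesize;
      nodes[i] = node;  /* e.g., the MCDRAM node */
    }
    rc = move_pages(0, npages, pages, nodes, status, MPOL_MF_MOVE);
    free(pages); free(nodes); free(status);
    return rc;
  }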




Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-06-03 Thread Jed Brown
Barry Smith  writes:
>   Perhaps, and this is just nonsense off the top of my head, if you
>   had some measure of the importance of a vector (or matrix; I would
>   start with vectors for simplicity and since we have more of them)
>   based on how often its values would be "accessed". So a vector that
>   you know is only used "once in a while" gets a lower "importance"
>   than one that gets used "very often". Of course determining these
>   vectors' importances may be difficult. You could do it
>   experimentally, add some code that measures how often each vector
>   gets its values "accessed (whatever that means)/read write" and see
>   if there is some distribution (do this for a nontrivial TS example)
>   where some vectors are accessed often and others rarely. 

This is what I termed profile-guided and it's very accurate (you have
global space-time information), but super brittle when
resource-constrained.

Note that in the case of Krylov solvers, the first vectors in the Krylov
space are accessed far more than later vectors (e.g., the 30th vector is
accessed once per 30 iterations versus the first vector which is
accessed every iteration).  Simple greedy allocation is great for this
case.

It's terrible in other cases, a simple case of which is two solvers
where the first is cheap (or solved only rarely) and the second is
solved repeatedly at great expense.  Nested solvers are one such
example.  But you don't know which one is more expensive except in
retrospect, and this can even change as nonlinearities evolve.




Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-06-03 Thread Barry Smith

  Jeff,

   Ahh, from this page, it is definitively clear that the Intel people have 
their heads totally up their asses 

formatted source code with astyle --style=linux --indent=spaces=4 -y -S

when everyone knows that any indent that is not 2 characters is totally insane 
:-)

  Barry


> On Jun 3, 2015, at 9:37 PM, Jeff Hammond  wrote:
> 
>>> but it screws up memkind's partitioning of the heap (it won't be aware
>>> that the pages have been moved).
>> 
>> Then memkind is stupid or the kernel isn't exposing the correct
>> information to memkind.  Tell them to not be lazy and do it right.
> 
> The beauty of git/github is one can make branches to try out anything
> they want even if Jed thinks that he knows better than Intel how to
> write system software for Intel's hardware.
> 
> This link is equivalent to pushing the "Fork" button on Github's
> memkind page: https://github.com/memkind/memkind#fork-destination-box.
> I'm sure that the memkind developers would be willing to review your
> pull request once you've implemented memkind_move_pages().
> 
> Jeff
> 
> -- 
> Jeff Hammond
> jeff.scie...@gmail.com
> http://jeffhammond.github.io/



Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-06-03 Thread Jed Brown
Richard Mills  writes:
>> > but it screws up memkind's partitioning of the heap (it won't be aware
>> > that the pages have been moved).
>>
>> Then memkind is stupid or the kernel isn't exposing the correct
>> information to memkind.  Tell them to not be lazy and do it right.
>>
>
> I believe that it really comes down to a problem with what the Linux kernel
> allows right now.  To do this "right" we need to hack the kernel.  Memkind
> is working within the constraints of what the kernel currently does.

What exactly is memkind trying to do?  Does it somehow discover the
number of "compute" processes running on the node and partition their
allocation from MCDRAM?  Surely not because that would be as comically
naive as Blue Gene partitioning memory at boot, but what *does* it do
about other processes?  If you spawn new processes, can they use MCDRAM?
How much?  How is memkind budgeting affected by a user's direct use of
mmap or shm_open?  When a process exits, does memkind in the remaining
processes know that more MCDRAM is available?




Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-06-03 Thread Jeff Hammond
>> but it screws up memkind's partitioning of the heap (it won't be aware
>> that the pages have been moved).
>
> Then memkind is stupid or the kernel isn't exposing the correct
> information to memkind.  Tell them to not be lazy and do it right.

The beauty of git/github is one can make branches to try out anything
they want even if Jed thinks that he knows better than Intel how to
write system software for Intel's hardware.

This link is equivalent to pushing the "Fork" button on Github's
memkind page: https://github.com/memkind/memkind#fork-destination-box.
I'm sure that the memkind developers would be willing to review your
pull request once you've implemented memkind_move_pages().

Jeff

-- 
Jeff Hammond
jeff.scie...@gmail.com
http://jeffhammond.github.io/


Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-06-03 Thread Richard Mills
On Wed, Jun 3, 2015 at 7:28 PM, Jed Brown  wrote:

> Barry Smith  writes:
> >   Richard has access to the hardware
>
> Is this true?  Or he will have hardware "soon"?
>

Yes, I finally have access to hardware.  It's a bit hard to get time on,
and it's flaky because it is from the initial tape-in, but it's here and
I've run on it.  That's what prompted me to bring this thread up again.


>
> >   and is not going to "lie to us" that "oh it helps so much" because
> >   he knows that you will test it yourself and see that he is lying.
>
> I'm not at all worried about him lying, but I'm concerned about being
> able to sample across a sufficiently broad range of apps/configurations.
> Maybe he can run some PETSc examples and PFLOTRAN, which is a good
> start, but may not be running in the appropriately memory-constrained
> circumstances of a package with particles like pTatin, for example.  We
> care not just about the highs but also about the confusing corners that
> users will undoubtedly encounter.
>

Good point, Jed.  And I anticipate a lot of confusing corners.

--Richard


Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-06-03 Thread Barry Smith

> On Jun 3, 2015, at 9:28 PM, Jed Brown  wrote:
> 
> Barry Smith  writes:
>>  Richard has access to the hardware
> 
> Is this true?  Or he will have hardware "soon"?
> 
>>  and is not going to "lie to us" that "oh it helps so much" because
>>  he knows that you will test it yourself and see that he is lying. 
> 
> I'm not at all worried about him lying, but I'm concerned about being
> able to sample across a sufficiently broad range of apps/configurations.
> Maybe he can run some PETSc examples and PFLOTRAN, which is a good
> start, but may not be running in the appropriately memory-constrained
> circumstances of a package with particles like pTatin, for example.  We
> care not just about the highs but also about the confusing corners that
> users will undoubtedly encounter.

  Even if it "helps" in only 30 percent of applications that is still a good 
thing (and a great thing politically). Then it becomes an issue of education 
and proper profiling tools to tell people for their apps that it won't work; so 
the other 70% is not "confused".

  Note that Marc Snir today told me that it is perfectly fine if the "largest 
computing systems", i.e. the LCFs can only provide useful performance for a 
small subset of all possible applications.

  Barry







Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-06-03 Thread Jed Brown
Barry Smith  writes:
>   Richard has access to the hardware

Is this true?  Or he will have hardware "soon"?

>   and is not going to "lie to us" that "oh it helps so much" because
>   he knows that you will test it yourself and see that he is lying. 

I'm not at all worried about him lying, but I'm concerned about being
able to sample across a sufficiently broad range of apps/configurations.
Maybe he can run some PETSc examples and PFLOTRAN, which is a good
start, but may not be running in the appropriately memory-constrained
circumstances of a package with particles like pTatin, for example.  We
care not just about the highs but also about the confusing corners that
users will undoubtedly encounter.




Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-06-03 Thread Barry Smith

   To follow up on this, going back to my "advice object" to malloc being a 
living object as opposed to just some flags. In the case where different 
vectors may have very different "importances" at different times in the runtime 
of the simulation one could "switch" some vectors from using slow to faster 
memory when one knows the code is switching to a different phase where the 
vector "importances" are different.

  Barry

  Note that even if Intel cannot provide a way to "switch" a memory address 
between fast and slow, it doesn't really matter from the PETSc point of view, 
since inside any particular PETSc vector we could switch the ->array 
pointer to a different memory location (and copy stuff over if needed) when 
changing a vector from important to unimportant or the opposite. (since no code 
outside the vector object knows what the pointer is).


> On Jun 3, 2015, at 9:18 PM, Barry Smith  wrote:
> 
> 
>> On Jun 3, 2015, at 8:55 PM, Richard Mills  wrote:
>> 
>> Ha, yes.  I'll try this out, but I do wonder what people's thoughts are on 
>> the best way to "tag" an object like a Vec or Mat for some particular 
>> treatment of its placement in memory.  Does doing this at the level of a Mat 
>> or Vec (e.g., VecSetAdvMallocCtx() ) sound appropriate?  We could actually 
>> make this a part of any PetscObject, but I think that's not necessary.
> 
>  No idea.
> 
>  Perhaps, and this is just nonsense off the top of my head, if you had some 
> measure of the importance of a vector (or matrix; I would start with vectors 
> for simplicity and since we have more of them) based on how often its values 
> would be "accessed". So a vector that you know is only used "once in a while" 
> gets a lower "importance" than one that gets used "very often". Of course 
> determining these vectors' importances may be difficult. You could do it 
> experimentally, add some code that measures how often each vector gets its 
> values "accessed (whatever that means)/read write" and see if there is some 
> distribution (do this for a nontrivial TS example) where some vectors are 
> accessed often and others rarely. Now place the often "accessed" vectors in 
> faster memory and see how much faster the code is.
> 
>  Barry
> 
> A related note is that "we" are not particularly careful about "reusing" work 
> vectors; say a code has ten different work vectors for different phases of 
> the computation; now imagine a careful "global analysis" that determined it 
> could get away with three work vectors (since only at most three had relevant 
> values at any one time), now pop those three work vectors into faster memory 
> where the ten previous work vectors could not fit. Obviously I am being 
> extreme here to make a point that careful memory decisions could potentially 
> make a difference in complicated codes (and all we are about are complicated 
> codes).
> 
> 
> 
> 
>> 
>> --Richard
>> 
>> On Wed, Jun 3, 2015 at 6:50 PM, Barry Smith  wrote:
>> 
>>  The beauty of git/bitbucket is one can make branches to try out anything 
>> they want even if some cranky old conservative PETSc developer thinks it is 
>> worse than consorting with the devil.
>> 
>>   As I said before I think that "additional argument" to advised_malloc 
>> should be a living object which one can change over time as opposed to just 
>> a "flag" type argument that only effects the malloc at malloc time. Of 
>> course the "living part" can be implemented later.
>> 
>>   Barry
>> 
>> Yes, Jed has already transformed himself into a cranky old conservative 
>> PETSc developer
>> 
>> 
>>> On Jun 3, 2015, at 7:33 PM, Richard Mills  wrote:
>>> 
>>> Hi Folks,
>>> 
>>> It's been a while, but I'd like to pick up this discussion of adding a 
>>> context to memory allocations again.
>>> 
>>> The immediate motivation I have is that I'd like to support use of the 
>>> memkind library (https://github.com/memkind/memkind), though adding a 
>>> context to PetscMallocN() (or making some other interface, say 
>>> PetscAdvMalloc() or whatever) could have much broader utility than simply 
>>> memkind support (which Jed doesn't like anyway, and I share some of his 
>>> concerns).  For the sake of having a concrete example, I'll discuss memkind 
>>> here.
>>> 
>>> Memkind's memkind_malloc() works like malloc() but takes a memkind_t 
>>> argument to specify some desired property of the memory being allocated.  
>>> For example,
>>> 
>>> hugetlb_str = (char *)memkind_malloc(MEMKIND_HUGETLB, size);
>>> 
>>> returns a pointer to memory allocated using huge pages, and
>>> 
>>> hbw_preferred_str = (char *)memkind_malloc(MEMKIND_HBW_PREFERRED, size);
>>> 
>>> allocates memory from a high-bandwidth region if it's available and 
>>> elsewhere if not (specifying MEMKIND_HBW will insist on the allocation 
>>> coming from high-bandwidth memory, failing if it's not available).
>>> 
>>> It should be straightforward to add a variant of PetscMalloc() that accepts 
>>> a context: I'll call this Pe

Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-06-03 Thread Barry Smith

> On Jun 3, 2015, at 8:55 PM, Richard Mills  wrote:
> 
> Ha, yes.  I'll try this out, but I do wonder what people's thoughts are on 
> the best way to "tag" an object like a Vec or Mat for some particular 
> treatment of its placement in memory.  Does doing this at the level of a Mat 
> or Vec (e.g., VecSetAdvMallocCtx() ) sound appropriate?  We could actually 
> make this a part of any PetscObject, but I think that's not necessary.

  No idea.

  Perhaps, and this is just nonsense off the top of my head, if you had some 
measure of the importance of a vector (or matrix; I would start with vectors 
for simplicity and since we have more of them) based on how often its values 
would be "accessed". So a vector that you know is only used "once in a while" 
gets a lower "importance" than one that gets used "very often". Of course 
determining these vectors' importances may be difficult. You could do it 
experimentally, add some code that measures how often each vector gets its 
values "accessed (whatever that means)/read write" and see if there is some 
distribution (do this for a nontrivial TS example) where some vectors are 
accessed often and others rarely. Now place the often "accessed" vectors in 
faster memory and see how much faster the code is.

  Barry

A related note is that "we" are not particularly careful about "reusing" work 
vectors; say a code has ten different work vectors for different phases of the 
computation; now imagine a careful "global analysis" that determined it could 
get away with three work vectors (since only at most three had relevant values 
at any one time), now pop those three work vectors into faster memory where the 
ten previous work vectors could not fit. Obviously I am being extreme here to 
make a point that careful memory decisions could potentially make a difference 
in complicated codes (and all we are about are complicated codes).




> 
> --Richard
> 
> On Wed, Jun 3, 2015 at 6:50 PM, Barry Smith  wrote:
> 
>   The beauty of git/bitbucket is one can make branches to try out anything 
> they want even if some cranky old conservative PETSc developer thinks it is 
> worse than consorting with the devil.
> 
>As I said before I think that "additional argument" to advised_malloc 
> should be a living object which one can change over time as opposed to just a 
> "flag" type argument that only effects the malloc at malloc time. Of course 
> the "living part" can be implemented later.
> 
>Barry
> 
> Yes, Jed has already transformed himself into a cranky old conservative PETSc 
> developer
> 
> 
> > On Jun 3, 2015, at 7:33 PM, Richard Mills  wrote:
> >
> > Hi Folks,
> >
> > It's been a while, but I'd like to pick up this discussion of adding a 
> > context to memory allocations again.
> >
> > The immediate motivation I have is that I'd like to support use of the 
> > memkind library (https://github.com/memkind/memkind), though adding a 
> > context to PetscMallocN() (or making some other interface, say 
> > PetscAdvMalloc() or whatever) could have much broader utility than simply 
> > memkind support (which Jed doesn't like anyway, and I share some of his 
> > concerns).  For the sake of having a concrete example, I'll discuss memkind 
> > here.
> >
> > Memkind's memkind_malloc() works like malloc() but takes a memkind_t 
> > argument to specify some desired property of the memory being allocated.  
> > For example,
> >
> >  hugetlb_str = (char *)memkind_malloc(MEMKIND_HUGETLB, size);
> >
> > returns a pointer to memory allocated using huge pages, and
> >
> >  hbw_preferred_str = (char *)memkind_malloc(MEMKIND_HBW_PREFERRED, size);
> >
> > allocates memory from a high-bandwidth region if it's available and 
> > elsewhere if not (specifying MEMKIND_HBW will insist on the allocation 
> > coming from high-bandwidth memory, failing if it's not available).
> >
> > It should be straightforward to add a variant of PetscMalloc() that accepts 
> > a context: I'll call this PetscAdvMalloc(), for now, though we can come up 
> > with a better name later.  This will allow passing on the memkind_t via 
> > this context to the underlying memkind allocator, and we can have some 
> > mechanism to set a default context (in the case of Memkind, this is likely 
> > MEMKIND_DEFAULT) that gets used when plain PetscMalloc() gets called.
> >
> > Of course, we'll need some way to ensure that the "advanced malloc" gets 
> > used to allocated the critical data structures.  As a low-level way to 
> > start, it may make sense to simply add a way to stash a context in Vec and 
> > Mat objects.  Maybe have VecSetAdvMallocCtx(), and if that context gets 
> > set, then PetscAdvMalloc() is used for the allocations associated with the 
> > contents of that object.  It would probably be better to eventually have a 
> > higher-level way to do this, e.g., support standard settings in the options 
> > database that PETSc uses to construct the appropriate arguments to 
> > underlying allocators that are su

Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-06-03 Thread Richard Mills
On Wed, Jun 3, 2015 at 6:54 PM, Jed Brown  wrote:

> Richard Mills  writes:
>
> > On Wed, Jun 3, 2015 at 6:04 PM, Jed Brown  wrote:
> >
> >> Have you heard anything back about whether move_pages() will work?
> >>
> >
> > move_pages() will work to move pages between MCDRAM and DRAM right
> > now,
>
> Great!
>
> > but it screws up memkind's partitioning of the heap (it won't be aware
> > that the pages have been moved).
>
> Then memkind is stupid or the kernel isn't exposing the correct
> information to memkind.  Tell them to not be lazy and do it right.
>

I believe that it really comes down to a problem with what the Linux kernel
allows right now.  To do this "right" we need to hack the kernel.  Memkind
is working within the constraints of what the kernel currently does.


>
> > Jed, I'm with you in thinking that, ultimately, there actually needs to
> be
> > a way to make these kinds of decisions based on global information.  We
> > don't have that right now.  But if we get some smart allocator (and
> > migrator) that gives us, say malloc_use_oracle() to always make the good
> > decision,
>
> The oracle has to see into the future.  move_pages() is so much more
> powerful.
>
> > we still should have something like a PetscAdvMalloc() that provides a
> > context to allow us to pass advice to this smart allocator to provide
> > hints about how it will be accessed, whatever.
>
> What does the caller know?  What good is the context if we always pass
> I_HAVE_NO_IDEA?
>
> > In a lot of cases, simple size-based allocation is probably the way to
> go.
> > An option to do automatic size-based placement is even in the latest
> > memkind sources on github now, but it will do that for the entire
> > application.
>
> That's crude; I'd rather have each library use its own threshold.
>
> > I'd like to be able to restrict this to only the PETSc portion: Maybe
> > a code that uses PETSc also needs to allocate some enormous lookup
> > tables that are big but have accesses that are really latency- rather
> > than bandwidth-sensitive.  Or, to be specific to a code I actually
> > know, I believe that in PFLOTRAN there are some pretty large
> > allocations required for auxiliary variables that don't need to go in
> > high-bandwidth memory, though we will want all of the large PETSc
> > objects to go in there.
>
> Fine.  That involves a couple lines of code.  Go into PetscMallocAlign
> and add the ability to use memkind.  Add a run-time option to control
> the threshold.  Done.
>

Hmm.  That's a simpler solution that may be better.  I'm not sure that it
will always be the best thing to do, but in cases where it is appropriate,
that simple option sounds like something we should support.

I assume you'd also like an option to specify that the allocation should
fail if high bandwidth memory cannot be allocated, to avoid seeing very
confusing performance.
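
To make sure I understand the null hypothesis, something like this (a sketch
only; the function and option names are invented here, not existing PETSc
API, though the memkind calls themselves are real):

  #include <memkind.h>

  /* imagined to be set from run-time options, e.g. -malloc_hbw_threshold
     and -malloc_hbw_required (both names hypothetical) */
  static size_t hbw_threshold = 1048576;
  static int    hbw_required  = 0;

  /* size-based placement inside a PetscMallocAlign()-like routine:
     allocations at or above the threshold prefer high-bandwidth memory */
  static int AdvMallocAlign(size_t size, size_t alignment, void **result)
  {
    memkind_t kind = MEMKIND_DEFAULT;
    if (size >= hbw_threshold)
      kind = hbw_required ? MEMKIND_HBW : MEMKIND_HBW_PREFERRED;
    return memkind_posix_memalign(kind, result, alignment, size);
  }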


>
> If you want complexity to bleed into the library (and necessarily into
> user code if given any power at all), I think you need to demonstrate a
> tangible benefit that cannot be obtained by something simpler.  Consider
> the simple and dumb threshold above to be the null hypothesis.
>
> This is just my opinion.  Feel free to make a branch with whatever you
> prefer.
>


Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-06-03 Thread Barry Smith

> On Jun 3, 2015, at 9:00 PM, Jed Brown  wrote:
> 
> Barry Smith  writes:
>> Yes, Jed has already transformed himself into a cranky old conservative 
>> PETSc developer
> 
> Is disinclination to spend effort on something with negative expected
> value "conservative"?
> 
> Actually, it's almost the definition.  But if you spend time on
> legitimately high-risk things, you should expect that with high
> probability, they will be a failure.  Thus, it's essential to be
> prepared to declare failure rather than lobbying for success (e.g.,
> merging) without conclusive data.  Declaring failure in this case may be
> hard without access to the hardware to be able to push all the design
> corners.
  
  Richard has access to the hardware and is not going to "lie to us" that "oh 
it helps so much" because he knows that you will test it yourself and see that 
he is lying. So should we support some 3rd party that wants money (from ASCR) 
to prove (in a publication) that using memkind is a good idea? Absolutely not. 
But should we support Richard in trying some experiments? I don't see the downside.




Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-06-03 Thread Jed Brown
Barry Smith  writes:
> Yes, Jed has already transformed himself into a cranky old conservative PETSc 
> developer

Is disinclination to spend effort on something with negative expected
value "conservative"?

Actually, it's almost the definition.  But if you spend time on
legitimately high-risk things, you should expect that with high
probability, they will be a failure.  Thus, it's essential to be
prepared to declare failure rather than lobbying for success (e.g.,
merging) without conclusive data.  Declaring failure in this case may be
hard without access to the hardware to be able to push all the design
corners.




Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-06-03 Thread Richard Mills
Ha, yes.  I'll try this out, but I do wonder what people's thoughts are on
the best way to "tag" an object like a Vec or Mat for some particular
treatment of its placement in memory.  Does doing this at the level of a
Mat or Vec (e.g., VecSetAdvMallocCtx() ) sound appropriate?  We could
actually make this a part of any PetscObject, but I think that's not
necessary.
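
To make the question concrete, the shape I'm imagining (every name here is
hypothetical, not existing PETSc API):

  /* hypothetical context carried by an object to advise its allocations */
  typedef struct {
    memkind_t kind;   /* e.g., MEMKIND_HBW_PREFERRED for this object's data */
    /* room to grow into Barry's "living" advice object: priorities, hints */
  } PetscAdvMallocCtx;

  /* stash the context in the Vec; PetscAdvMalloc() would consult it when
     the ->array storage is allocated (advctx is an imagined _p_Vec field) */
  PetscErrorCode VecSetAdvMallocCtx(Vec v, PetscAdvMallocCtx *ctx)
  {
    v->advctx = ctx;
    return 0;
  }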

--Richard

On Wed, Jun 3, 2015 at 6:50 PM, Barry Smith  wrote:

>
>   The beauty of git/bitbucket is one can make branches to try out anything
> they want even if some cranky old conservative PETSc developer thinks it is
> worse than consorting with the devil.
>
>As I said before I think that "additional argument" to advised_malloc
> should be a living object which one can change over time as opposed to just
> a "flag" type argument that only effects the malloc at malloc time. Of
> course the "living part" can be implemented later.
>
>Barry
>
> Yes, Jed has already transformed himself into a cranky old conservative
> PETSc developer
>
>
> > On Jun 3, 2015, at 7:33 PM, Richard Mills  wrote:
> >
> > Hi Folks,
> >
> > It's been a while, but I'd like to pick up this discussion of adding a
> context to memory allocations again.
> >
> > The immediate motivation I have is that I'd like to support use of the
> memkind library (https://github.com/memkind/memkind), though adding a
> context to PetscMallocN() (or making some other interface, say
> PetscAdvMalloc() or whatever) could have much broader utility than simply
> memkind support (which Jed doesn't like anyway, and I share some of his
> concerns).  For the sake of having a concrete example, I'll discuss memkind
> here.
> >
> > Memkind's memkind_malloc() works like malloc() but takes a memkind_t
> argument to specify some desired property of the memory being allocated.
> For example,
> >
> >  hugetlb_str = (char *)memkind_malloc(MEMKIND_HUGETLB, size);
> >
> > returns a pointer to memory allocated using huge pages, and
> >
> >  hbw_preferred_str = (char *)memkind_malloc(MEMKIND_HBW_PREFERRED, size);
> >
> > allocates memory from a high-bandwidth region if it's available and
> elsewhere if not (specifying MEMKIND_HBW will insist on the allocation
> coming from high-bandwidth memory, failing if it's not available).
> >
> > It should be straightforward to add a variant of PetscMalloc() that
> accepts a context: I'll call this PetscAdvMalloc(), for now, though we can
> come up with a better name later.  This will allow passing on the memkind_t
> via this context to the underlying memkind allocator, and we can have some
> mechanism to set a default context (in the case of Memkind, this is likely
> MEMKIND_DEFAULT) that gets used when plain PetscMalloc() gets called.
> >
> > Of course, we'll need some way to ensure that the "advanced malloc" gets
> used to allocated the critical data structures.  As a low-level way to
> start, it may make sense to simply add a way to stash a context in Vec and
> Mat objects.  Maybe have VecSetAdvMallocCtx(), and if that context gets
> set, then PetscAdvMalloc() is used for the allocations associated with the
> contents of that object.  It would probably be better to eventually have a
> higher-level way to do this, e.g., support standard settings in the options
> database that PETSc uses to construct the appropriate arguments to
> underlying allocators that are supported, but I think just adding a way to
> set this context directly is an appropriate first step.
> >
> > Does this sound like a reasonable thing for me to prototype, or are
> others thinking something very different?  Please let me know.  I'm getting
> more access to early systems I can experiment on, and I'd really like to
> move forward on trying things with high bandwidth memory (imperfect as our
> APIs for using it are).
> >
> > Best regards,
> > Richard
> >
> >
> > On Wed, Apr 29, 2015 at 11:10 PM, Richard Mills  wrote:
> > On Wed, Apr 29, 2015 at 1:28 PM, Barry Smith  wrote:
> >
> >   Forget about the issue of "changing" PetscMallocN() or adding a new
> interface instead, that is a minor syntax and annoyance issue:
> >
> >   The question is "is it worth exploring adding a context for certain
> memory allocations that would allow us to "do" various things to the memory
> and "indicate" properties of the memory"? I think, though I agree with Jed
> that it could be fraught with difficulties, that it is worthwhile playing
> around with this.
> >
> >   Barry
> >
> >
> > I vote "yes".  One might want to, say
> >
> > * Give hints via something like madvise() on how/when the memory might
> be accessed.
> > * Specify a preferred "kind" of memory (and behavior if the preferred
> kind is not available, or perhaps even specify a priority on how hard to
> try to get the preferred memory kind)
> > * Specify something like a preference to interleave allocation blocks
> between different kinds of memory
> >
> > I'm sure we can come up with plenty of other possibilities, some of
> > which might actually be useful, many of which will be useful only for
> > very contrived cases, and some that are not useful today but may become
> > useful as memory systems evolve.

Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-06-03 Thread Jed Brown
Richard Mills  writes:

> On Wed, Jun 3, 2015 at 6:04 PM, Jed Brown  wrote:
>
>> Have you heard anything back about whether move_pages() will work?
>>
>
> move_pages() will work to move pages between MCDRAM and DRAM right
> now, 

Great!

> but it screws up memkind's partitioning of the heap (it won't be aware
> that the pages have been moved). 

Then memkind is stupid or the kernel isn't exposing the correct
information to memkind.  Tell them to not be lazy and do it right.

> Jed, I'm with you in thinking that, ultimately, there actually needs to be
> a way to make these kinds of decisions based on global information.  We
> don't have that right now.  But if we get some smart allocator (and
> migrator) that gives us, say malloc_use_oracle() to always make the good
> decision, 

The oracle has to see into the future.  move_pages() is so much more
powerful.

> we still should have something like a PetscAdvMalloc() that provides a
> context to allow us to pass advice to this smart allocator to provide
> hints about how it will be accessed, whatever.

What does the caller know?  What good is the context if we always pass
I_HAVE_NO_IDEA?

> In a lot of cases, simple size-based allocation is probably the way to go.
> An option to do automatic size-based placement is even in the latest
> memkind sources on github now, but it will do that for the entire
> application.  

That's crude; I'd rather have each library use its own threshold.

> I'd like to be able to restrict this to only the PETSc portion: Maybe
> a code that uses PETSc also needs to allocate some enormous lookup
> tables that are big but have accesses that are really latency- rather
> than bandwidth-sensitive.  Or, to be specific to a code I actually
> know, I believe that in PFLOTRAN there are some pretty large
> allocations required for auxiliary variables that don't need to go in
> high-bandwidth memory, though we will want all of the large PETSc
> objects to go in there.

Fine.  That involves a couple lines of code.  Go into PetscMallocAlign
and add the ability to use memkind.  Add a run-time option to control
the threshold.  Done.
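
For concreteness, a minimal sketch of that "couple lines" (assuming memkind
is available; the threshold variable and option name here are hypothetical,
not existing PETSc code):

  #include <memkind.h>
  #include <stdlib.h>

  /* threshold settable via a run-time option, e.g. -malloc_hbw_threshold */
  static size_t hbw_threshold = 1048576;  /* 1 MiB */

  void *MallocAlignSketch(size_t size)
  {
    if (size >= hbw_threshold) {
      void *p = memkind_malloc(MEMKIND_HBW_PREFERRED, size);
      if (p) return p;                    /* large: prefer HBM, fall back */
    }
    return malloc(size);                  /* small stuff stays in DRAM */
  }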

If you want complexity to bleed into the library (and necessarily into
user code if given any power at all), I think you need to demonstrate a
tangible benefit that cannot be obtained by something simpler.  Consider
the simple and dumb threshold above to be the null hypothesis.

This is just my opinion.  Feel free to make a branch with whatever you
prefer.




Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-06-03 Thread Barry Smith

  The beauty of git/bitbucket is one can make branches to try out anything they 
want even if some cranky old conservative PETSc developer thinks it is worse 
than consorting with the devil.

   As I said before I think that "additional argument" to advised_malloc should 
be a living object which one can change over time as opposed to just a "flag" 
type argument that only affects the malloc at malloc time. Of course the 
"living part" can be implemented later.

   Barry

Yes, Jed has already transformed himself into a cranky old conservative PETSc 
developer


> On Jun 3, 2015, at 7:33 PM, Richard Mills  wrote:
> 
> Hi Folks,
> 
> It's been a while, but I'd like to pick up this discussion of adding a 
> context to memory allocations again.
> 
> The immediate motivation I have is that I'd like to support use of the 
> memkind library (https://github.com/memkind/memkind), though adding a context 
> to PetscMallocN() (or making some other interface, say PetscAdvMalloc() or 
> whatever) could have much broader utility than simply memkind support (which 
> Jed doesn't like anyway, and I share some of his concerns).  For the sake of 
> having a concrete example, I'll discuss memkind here.
> 
> Memkind's memkind_malloc() works like malloc() but takes a memkind_t argument 
> to specify some desired property of the memory being allocated.  For example, 
> 
>  hugetlb_str = (char *)memkind_malloc(MEMKIND_HUGETLB, size);
> 
> returns a pointer to memory allocated using huge pages, and 
> 
>  hbw_preferred_str = (char *)memkind_malloc(MEMKIND_HBW_PREFERRED, size);
> 
> allocates memory from a high-bandwidth region if it's available and elsewhere 
> if not (specifying MEMKIND_HBW will insist on the allocation coming from 
> high-bandwidth memory, failing if it's not available).
> 
> It should be straightforward to add a variant of PetscMalloc() that accepts a 
> context: I'll call this PetscAdvMalloc(), for now, though we can come up with 
> a better name later.  This will allow passing on the memkind_t via this 
> context to the underlying memkind allocator, and we can have some mechanism 
> to set a default context (in the case of Memkind, this is likely 
> MEMKIND_DEFAULT) that gets used when plain PetscMalloc() gets called.
> 
> Of course, we'll need some way to ensure that the "advanced malloc" gets used 
> to allocate the critical data structures.  As a low-level way to start, it 
> may make sense to simply add a way to stash a context in Vec and Mat objects. 
>  Maybe have VecSetAdvMallocCtx(), and if that context gets set, then 
> PetscAdvMalloc() is used for the allocations associated with the contents of 
> that object.  It would probably be better to eventually have a higher-level 
> way to do this, e.g., support standard settings in the options database that 
> PETSc uses to construct the appropriate arguments to underlying allocators 
> that are supported, but I think just adding a way to set this context 
> directly is an appropriate first step.
>   
> Does this sound like a reasonable thing for me to prototype, or are others 
> thinking something very different?  Please let me know.  I'm getting more 
> access to early systems I can experiment on, and I'd really like to move 
> forward on trying things with high bandwidth memory (imperfect as our APIs 
> for using it are).
> 
> Best regards,
> Richard
> 
> 
> On Wed, Apr 29, 2015 at 11:10 PM, Richard Mills  wrote:
> On Wed, Apr 29, 2015 at 1:28 PM, Barry Smith  wrote:
> 
>   Forget about the issue of "changing" PetscMallocN() or adding a new 
> interface instead, that is a minor syntax and annoyance issue:
> 
>   The question is "is it worth exploring adding a context for certain memory 
> allocations that would allow us to "do" various things to the memory and 
> "indicate" properties of the memory"? I think, though I agree with Jed that 
> it could be fraught with difficulties, that it is worthwhile playing around 
> with this.
> 
>   Barry
> 
> 
> I vote "yes".  One might want to, say
> 
> * Give hints via something like madvise() on how/when the memory might be 
> accessed.
> * Specify a preferred "kind" of memory (and behavior if the preferred kind is 
> not available, or perhaps even specify a priority on how hard to try to get 
> the preferred memory kind)
> * Specify something like a preference to interleave allocation blocks between 
> different kinds of memory
> 
> I'm sure we can come up with plenty of other possibilities, some of which 
> might actually be useful, many of which will be useful only for very 
> contrived cases, and some that are not useful today but may become useful as 
> memory systems evolve.
> 
> --Richard
> 



Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-06-03 Thread Richard Mills
On Wed, Jun 3, 2015 at 6:04 PM, Jed Brown  wrote:

> Richard Mills  writes:
>
> > It's been a while, but I'd like to pick up this discussion of adding a
> > context to memory allocations again.
>
> Have you heard anything back about whether move_pages() will work?
>

move_pages() will work to move pages between MCDRAM and DRAM right now, but
it screws up memkind's partitioning of the heap (it won't be aware that the
pages have been moved).  (Which calls to mind the question I raised
somewhere back in this thread of whether we even need a heap manager for
the large allocations.)


>
> > hbw_preferred_str = (char *)memkind_malloc(MEMKIND_HBW_PREFERRED, size);
>
> How much would you prefer it?  If we stupidly ask for HBM in VecCreate_*
> and MatCreate_*, then our users will see catastrophic performance drops
> at magic sizes and will have support questions like "I swapped these two
> independent lines and my code ran 5x faster".  Then they'll hack the
> source by writing
>
>   if (moon_is_waxing() && operator_holding_tongue_in_right_cheek()) {
> policy = MEMKIND_HBW_PREFERRED;
>   }
>
> eventually making all decisions based on nonlocal information, ignoring
> the advice parameter.
>
> Then they'll get smart and register their own malloc so they don't have
> to hack the library.  Then they'll try to couple their application with
> another that does the same thing and now they have to write a new malloc
> that makes a new set of decisions in light of the fact that multiple
> libraries are being coupled.
>
> I think we can agree that this is madness.  Where do you draw the line
> and say that crappy performance is just reality?
>
> It's hard for me not to feel like the proposed system will be such a
> nightmarish maintenance burden with such little benefit over a simple
> size-based allocation that it would be better for everyone if it doesn't
> exist.
>

Jed, I'm with you in thinking that, ultimately, there actually needs to be
a way to make these kinds of decisions based on global information.  We
don't have that right now.  But if we get some smart allocator (and
migrator) that gives us, say malloc_use_oracle() to always make the good
decision, we still should have something like a PetscAdvMalloc() that
provides a context to allow us to pass advice to this smart allocator to
provide hints about how it will be accessed, whatever.

I know you don't like the memkind model, and I'm not thrilled with it
either (though it's what I've got to work with right now), but the
interface changes I'm proposing are applicable to other approaches.


> For example, we've already established that small allocations should
> generally go in DRAM because they're either cached or not prefetched and
> thus limited by latency instead of bandwidth.  Large allocations that
> get used a lot should go in HBM so long as they fit.  Since we can't
> determine "used a lot" or "fit" from any information possibly available
> in the calling scope, there's literally no useful advice we can provide
> at that point.  So don't try, just set a dumb threshold (crude tuning
> parameter) or implement a profile-guided allocation policy (brittle).
>

In a lot of cases, simple size-based allocation is probably the way to go.
An option to do automatic size-based placement is even in the latest
memkind sources on github now, but it will do that for the entire
application.  I'd like to be able to restrict this to only the PETSc
portion: Maybe a code that uses PETSc also needs to allocate some enormous
lookup tables that are big but have accesses that are really latency-
rather than bandwidth-sensitive.  Or, to be specific to a code I actually
know, I believe that in PFLOTRAN there are some pretty large allocations
required for auxiliary variables that don't need to go in high-bandwidth
memory, though we will want all of the large PETSc objects to go in there.


>
> Or ignore all this nonsense, implement move_pages(), and we'll have PETSc
> track accesses so we can balance the pages once the app gets going.
>
> > Of course, we'll need some way to ensure that the "advanced malloc"
>
> I thought AdvMalloc was short for AdvisedMalloc.
>

Oh, hey, I do like "Advised" better.

--Richard


Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-06-03 Thread Jed Brown
Richard Mills  writes:

> It's been a while, but I'd like to pick up this discussion of adding a
> context to memory allocations again.

Have you heard anything back about whether move_pages() will work?

> hbw_preferred_str = (char *)memkind_malloc(MEMKIND_HBW_PREFERRED, size);

How much would you prefer it?  If we stupidly ask for HBM in VecCreate_*
and MatCreate_*, then our users will see catastrophic performance drops
at magic sizes and will have support questions like "I swapped these two
independent lines and my code ran 5x faster".  Then they'll hack the
source by writing

  if (moon_is_waxing() && operator_holding_tongue_in_right_cheek()) {
policy = MEMKIND_HBW_PREFERRED;
  }

eventually making all decisions based on nonlocal information, ignoring
the advice parameter.

Then they'll get smart and register their own malloc so they don't have
to hack the library.  Then they'll try to couple their application with
another that does the same thing and now they have to write a new malloc
that makes a new set of decisions in light of the fact that multiple
libraries are being coupled.

I think we can agree that this is madness.  Where do you draw the line
and say that crappy performance is just reality?

It's hard for me not to feel like the proposed system will be such a
nightmarish maintenance burden with such little benefit over a simple
size-based allocation that it would be better for everyone if it doesn't
exist.

For example, we've already established that small allocations should
generally go in DRAM because they're either cached or not prefetched and
thus limited by latency instead of bandwidth.  Large allocations that
get used a lot should go in HBM so long as they fit.  Since we can't
determine "used a lot" or "fit" from any information possibly available
in the calling scope, there's literally no useful advice we can provide
at that point.  So don't try, just set a dumb threshold (crude tuning
parameter) or implement a profile-guided allocation policy (brittle).

Or ignore all this nonsense, implement move_pages(), and we'll have PETSc
track accesses so we can balance the pages once the app gets going.
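
(For reference, a sketch of the move_pages(2) call being discussed; on KNL
the MCDRAM appears as a NUMA node, and the target node number is something
you would query from the actual topology, so treat it as an assumption:)

  #define _GNU_SOURCE
  #include <numaif.h>   /* move_pages; link with -lnuma */
  #include <unistd.h>
  #include <stdlib.h>

  long migrate_to_node(void *buf, size_t bytes, int node)
  {
    size_t psz = (size_t)sysconf(_SC_PAGESIZE);
    size_t n   = (bytes + psz - 1) / psz;
    void **pages  = malloc(n * sizeof(void *));
    int   *nodes  = malloc(n * sizeof(int));
    int   *status = malloc(n * sizeof(int));
    for (size_t i = 0; i < n; i++) {
      pages[i] = (char *)buf + i * psz;   /* one entry per page */
      nodes[i] = node;                    /* desired node for that page */
    }
    long rc = move_pages(0, n, pages, nodes, status, MPOL_MF_MOVE);
    free(pages); free(nodes); free(status);
    return rc;                            /* 0 on success */
  }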

> Of course, we'll need some way to ensure that the "advanced malloc" 

I thought AdvMalloc was short for AdvisedMalloc.




Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-06-03 Thread Richard Mills
Hi Folks,

It's been a while, but I'd like to pick up this discussion of adding a
context to memory allocations again.

The immediate motivation I have is that I'd like to support use of the
memkind library (https://github.com/memkind/memkind), though adding a
context to PetscMallocN() (or making some other interface, say
PetscAdvMalloc() or whatever) could have much broader utility than simply
memkind support (which Jed doesn't like anyway, and I share some of his
concerns).  For the sake of having a concrete example, I'll discuss memkind
here.

Memkind's memkind_malloc() works like malloc() but takes a memkind_t
argument to specify some desired property of the memory being allocated.
For example,

 hugetlb_str = (char *)memkind_malloc(MEMKIND_HUGETLB, size);

returns a pointer to memory allocated using huge pages, and

 hbw_preferred_str = (char *)memkind_malloc(MEMKIND_HBW_PREFERRED, size);

allocates memory from a high-bandwidth region if it's available and
elsewhere if not (specifying MEMKIND_HBW will insist on the allocation
coming from high-bandwidth memory, failing if it's not available).
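
Assembled into a minimal complete program (link with -lmemkind), those calls
look like the following sketch:

  #include <memkind.h>
  #include <stdio.h>

  int main(void)
  {
    size_t size = 1 << 20;
    char *hbw_str = (char *)memkind_malloc(MEMKIND_HBW_PREFERRED, size);
    if (!hbw_str) { fprintf(stderr, "allocation failed\n"); return 1; }
    hbw_str[0] = 'x';  /* first touch faults the page into the chosen memory */
    memkind_free(MEMKIND_HBW_PREFERRED, hbw_str);
    return 0;
  }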

It should be straightforward to add a variant of PetscMalloc() that accepts
a context: I'll call this PetscAdvMalloc(), for now, though we can come up
with a better name later.  This will allow passing on the memkind_t via
this context to the underlying memkind allocator, and we can have some
mechanism to set a default context (in the case of Memkind, this is likely
MEMKIND_DEFAULT) that gets used when plain PetscMalloc() gets called.

Of course, we'll need some way to ensure that the "advanced malloc" gets
used to allocate the critical data structures.  As a low-level way to
start, it may make sense to simply add a way to stash a context in Vec and
Mat objects.  Maybe have VecSetAdvMallocCtx(), and if that context gets
set, then PetscAdvMalloc() is used for the allocations associated with the
contents of that object.  It would probably be better to eventually have a
higher-level way to do this, e.g., support standard settings in the options
database that PETSc uses to construct the appropriate arguments to
underlying allocators that are supported, but I think just adding a way to
set this context directly is an appropriate first step.
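
A rough sketch of the shape this could take (the names are the hypothetical
ones floated in this thread, not an existing PETSc interface):

  #include <petscvec.h>
  #include <memkind.h>

  typedef struct {
    memkind_t kind;  /* e.g. MEMKIND_HBW_PREFERRED or MEMKIND_DEFAULT */
  } PetscAdvMallocCtx;

  PetscErrorCode PetscAdvMalloc(size_t size, PetscAdvMallocCtx *ctx, void **result)
  {
    memkind_t kind = ctx ? ctx->kind : MEMKIND_DEFAULT;
    *result = memkind_malloc(kind, size);
    return *result ? 0 : PETSC_ERR_MEM;  /* sketch-level error handling */
  }

  /* stash a context in a Vec; its array allocations would then use it */
  PetscErrorCode VecSetAdvMallocCtx(Vec v, PetscAdvMallocCtx *ctx);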

Does this sound like a reasonable thing for me to prototype, or are others
thinking something very different?  Please let me know.  I'm getting more
access to early systems I can experiment on, and I'd really like to move
forward on trying things with high bandwidth memory (imperfect as our APIs
for using it are).

Best regards,
Richard


On Wed, Apr 29, 2015 at 11:10 PM, Richard Mills  wrote:

> On Wed, Apr 29, 2015 at 1:28 PM, Barry Smith  wrote:
>
>>
>>   Forget about the issue of "changing" PetscMallocN() or adding a new
>> interface instead, that is a minor syntax and annoyance issue:
>>
>>   The question is "is it worth exploring adding a context for certain
>> memory allocations that would allow us to "do" various things to the memory
>> and "indicate" properties of the memory"? I think, though I agree with Jed
>> that it could be fraught with difficulties, that it is worthwhile playing
>> around with this.
>>
>>   Barry
>>
>>
> I vote "yes".  One might want to, say
>
> * Give hints via something like madvise() on how/when the memory might be
> accessed.
> * Specify a preferred "kind" of memory (and behavior if the preferred kind
> is not available, or perhaps even specify a priority on how hard to try to
> get the preferred memory kind)
> * Specify something like a preference to interleave allocation blocks
> between different kinds of memory
>
> I'm sure we can come up with plenty of other possibilities, some of which
> might actually be useful, many of which will be useful only for very
> contrived cases, and some that are not useful today but may become useful
> as memory systems evolve.
>
> --Richard
>


Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-04-29 Thread Richard Mills
On Tue, Apr 28, 2015 at 10:47 PM, Jed Brown  wrote:

> Richard Mills  writes:
>
> [...]
> > I think many users are going to want more control than what something like
> > AutoHBW provides, but, as you say, a lot of the time one will only care
> > about the substantial allocations for things like matrices and vectors,
> > and these also tend to be long lived--plenty of codes will do something
> > like allocate a matrix for Jacobians once and keep it around for the
> > lifetime of the run.  Maybe we should consider not using a heap manager for
> > these allocations, then.  For allocations above some specified threshold,
> > perhaps we (PETSc) should simply do the appropriate mmap() and mbind()
> > calls to allocate the pages we need in the desired type of memory, and then
> > we could use things like move_pages() if/when appropriate (yes, I know
> > we don't yet have a good way to make such decisions).  This would mean
> > PETSc getting more into the lower level details of memory management, but
> > maybe this is appropriate (and unavoidable) as more kinds of
> > user-addressable memory get introduced.  I think it is actually less
> > horrible than it sounds, because, really, we would just want to do this
> > for the largest allocations.  (And this is somewhat analogous to how many
> > malloc() implementations work, anyway: Use sbrk() for the small stuff, and
> > mmap() for the big stuff.)
>
> I say just use malloc (or posix_memalign) for everything.  PETSc can't
> do a better job of the fancy stuff and these normal functions are
> perfectly sufficient.
>
> >> That is a regression relative to move_pages.  Just make move_pages work.
> >> That's the granularity I've been asking for all along.
> >
> > Cannot practically be done using a heap manager system like memkind.  But
> > we can do this if we do our own mmap() calls, as discussed above.
>
> In practice, we would still use malloc(), but set mallopt
> M_MMAP_THRESHOLD if needed and call move_pages.  The reality is that
> with 4 KiB pages, it doesn't even matter if your "large" allocation is
> not page aligned.  The first and last page don't matter--they're small
> enough to be inexpensive to re-fetch from DRAM and don't use up that
> much extra space if you map them into MCDRAM.
>

Hmm.  That may be a pretty good solution for DRAM vs. MCDRAM.  What about
when we further complicate things by adding some large pool of NVRAM?  One
might want some sufficiently large arrays to go into MCDRAM, but other
large arrays to go to NVRAM or DRAM.  I guess we can still do the
appropriate move_pages() to get things into the right places, but I can
also see wanting to do things like use a much larger page size for giant
data sets going into NVRAM (which you won't be able to do without a copy to
a different mapped region).  And if there are these and other
complications... then maybe we should be using a heap manager like
memkind.  It would simplify quite a few things EXCEPT we'd have to deal
with the virtual address changing when we want to change "kind" of memory.
But maybe this would not be so bad, using an approach like Karli outlined.

--Richard


Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-04-29 Thread Richard Mills
On Wed, Apr 29, 2015 at 11:10 PM, Richard Mills  wrote:

> On Wed, Apr 29, 2015 at 1:28 PM, Barry Smith  wrote:
>
>>
>>   Forget about the issue of "changing" PetscMallocN() or adding a new
>> interface instead, that is a minor syntax and annoyance issue:
>>
>>   The question is "is it worth exploring adding a context for certain
>> memory allocations that would allow us to "do" various things to the memory
>> and "indicate" properties of the memory"? I think, though I agree with Jed
>> that it could be fraught with difficulties, that it is worthwhile playing
>> around with this.
>>
>>   Barry
>>
>>
> I vote "yes".  One might want to, say
>
> * Give hints via something like madvise() on how/when the memory might be
> accessed.
> * Specify a preferred "kind" of memory (and behavior if the preferred kind
> is not available, or perhaps even specify a priority on how hard to try to
> get the preferred memory kind)
> * Specify something like a preference to interleave allocation blocks
> between different kinds of memory
>

Let me add to the list of things we might want to do:

  * Specify that "huge pages" be used.
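
(Two routes that request could take, as sketches: memkind's dedicated kind,
or a transparent-huge-page hint on an existing anonymous mapping.)

  #include <stddef.h>
  #include <memkind.h>
  #include <sys/mman.h>

  void huge_page_examples(size_t size)
  {
    /* explicit hugetlbfs-backed allocation through memkind */
    void *a = memkind_malloc(MEMKIND_HUGETLB, size);
    memkind_free(MEMKIND_HUGETLB, a);

    /* or: hint transparent huge pages for an anonymous mapping */
    void *b = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    madvise(b, size, MADV_HUGEPAGE);
    munmap(b, size);
  }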

--Richard


> I'm sure we can come up with plenty of other possibilities, some of which
> might actually be useful, many of which will be useful only for very
> contrived cases, and some that are not useful today but may become useful
> as memory systems evolve.
>
> --Richard
>


Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-04-29 Thread Richard Mills
On Wed, Apr 29, 2015 at 1:28 PM, Barry Smith  wrote:

>
>   Forget about the issue of "changing" PetscMallocN() or adding a new
> interface instead, that is a minor syntax and annoyance issue:
>
>   The question is "is it worth exploring adding a context for certain
> memory allocations that would allow us to "do" various things to the memory
> and "indicate" properties of the memory"? I think, though I agree with Jed
> that it could be fraught with difficulties, that it is worthwhile playing
> around with this.
>
>   Barry
>
>
I vote "yes".  One might want to, say

* Give hints via something like madvise() on how/when the memory might be
accessed.
* Specify a preferred "kind" of memory (and behavior if the preferred kind
is not available, or perhaps even specify a priority on how hard to try to
get the preferred memory kind)
* Specify something like a preference to interleave allocation blocks
between different kinds of memory

I'm sure we can come up with plenty of other possibilities, some of which
might actually be useful, many of which will be useful only for very
contrived cases, and some that are not useful today but may become useful
as memory systems evolve.
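
(Sketches of the first and third items with standard Linux calls, assuming a
page-aligned buffer p of len bytes; the preferred-kind case is the
memkind_malloc(MEMKIND_HBW_PREFERRED, ...) call shown earlier in the thread.)

  #include <stddef.h>
  #include <sys/mman.h>   /* madvise */
  #include <numaif.h>     /* mbind; link with -lnuma */

  void hint_examples(void *p, size_t len)
  {
    /* access-pattern hint, in the spirit of madvise() above */
    madvise(p, len, MADV_SEQUENTIAL);

    /* interleave pages across NUMA nodes 0 and 1 (bits of the nodemask) */
    unsigned long nodemask = 0x3;
    mbind(p, len, MPOL_INTERLEAVE, &nodemask, sizeof(nodemask) * 8, 0);
  }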

--Richard


Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-04-29 Thread Richard Mills
On Wed, Apr 29, 2015 at 2:39 AM, Karl Rupp  wrote:

> Hi,
>
> > (...)
>
>>
>> If we want to move data from one memory kind to another, I believe that
>> we need to be able to deal with the virtual address changing.  Yes, this
>> is a pain because extra bookkeeping is involved.  Maybe we don't want to
>> bother with supporting something like this in PETSc.  But I don't know
>> of any good way around this.  I have discussed with Chris the idea of
>> adding support for asynchronously copying pages between different kinds
>> of memory (maybe have a memdup() analog to strdup()) and he had some
>> ideas about how this might be done efficiently.  But, again, I don't
>> know of a good way to move data to a different memory kind while keeping
>> the same virtual address.  If I'm misunderstanding something about what
>> is possible with Linux (or other *nix), please let me know--I'd really
>> like to be wrong on this.
>>
>
> let me bring up another spin of this thought: Currently we have related
> issues with managing memory on GPUs. The way we address this topic there is
> that we have a plain host-buffer, and a buffer allocated on the GPU. A
> separate flag keeps track of which buffer holds the most recent data (host,
> GPU, or both). What if we extend this system slightly such that we can also
> deal with HBM?
>
> Benefits:
>  - Changes to code base mainly in *GetArrayReadWrite(), returning the
> 'correct' buffer.
>  - Command line options as well as APIs for enabling/disabling HBM can be
> easily provided.
>  - DRAM fallback always available, even if HBM exhausted.
>  - Similar code and logic for dealing with HBM and GPUs.
>
> Disadvantages:
>  - Depending on the actual implementation, we may need extra memory (data
> duplication in HBM and DRAM). Since DRAM >> HBM, this may not be a big
> issue.
>  - Some parts of PETSc allocate memory directly rather than using standard
> types. These will not use HBM then. May not be performance-critical,
> though...
>  - Asynchronous copies between DRAM and HBM remain tricky.
>  - 'Politics': The approach is not as fancy as writing heap managers and
> other low-level stuff, so it's harder to sell to e.g. program managers.
>

I like several things about this proposal, and I think it might especially
make sense for systems with very large amounts of NVRAM, and the example
that Barry was talking about in which one might want to somehow mark an
allocation as being a target for eviction if needed.  At the expense of
data duplication, it also helps address problems that might arise if one
tries to access a data structure that is in the middle of being copied from
one memory kind to another.

--Richard


>
> Best regards,
> Karli
>
>


Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-04-29 Thread Barry Smith

  Forget about the issue of "changing" PetscMallocN() or adding a new interface 
instead, that is a minor syntax and annoyance issue:  

  The question is "is it worth exploring adding a context for certain memory 
allocations that would allow us to "do" various things to the memory and 
"indicate" properties of the memory"? I think, though I agree with Jed that it 
could be fraught with difficulties, that it is worthwhile playing around with 
this.

  Barry


> On Apr 29, 2015, at 3:17 PM, Matthew Knepley  wrote:
> 
> On Thu, Apr 30, 2015 at 6:13 AM, Barry Smith  wrote:
> 
> > On Apr 29, 2015, at 12:29 AM, Richard Mills  wrote:
> >
> > I think this is maybe getting away from the heart of the discussion, but 
> > anyway: Barry, you are talking about being able to mark certain allocated 
> > regions as "optional" and then have those allocations disappear, right?  I 
> > could see, say, having such an "optional" allocation that is backed by a 
> > file on some sort of "fast" storage (some super-fancy SSD or NVRAM) and 
> > whose pages can be evicted if the memory is needed.  I don't know that I 
> > like a pointer to that just being nullified if the eviction occurs, though. 
> >  For a case like this, I'd like to do something like have a Vec and only 
> > have this happen to the array of values if no "get" is outstanding.  This 
> > puts us back with an array to the numerical values that could get swapped 
> > around.
> 
>There are many possibilities from "just release this memory" to "if you 
> really need to you can move this to slower memory".
> 
>For example if after a DMRestoreLocalVector() the array memory could be 
> marked as "release if you need the space" then on the next DMGetLocalVector() 
> it would check if the memory had been released and, if so, allocate it again. If it 
> had not been released then it would just use it.
> 
>Clearly you can't just mark memory that is being passed around randomly in 
> user code as "release if need the space" but you can for carefully controlled 
> memory.
> 
> I still see no convincing rationale for changing the PetscMalloc interface. 
> If, as you say, few people use it, then there
> is no reason to change it. We can just change our internal interface, and 
> leave that top level interface alone. Moreover,
> since none of the interface changes are very specific I think it needs time 
> to be shaken out. If at the end, things get
> faster and more understandable, we can replace PetscMalloc.
> 
>Matt
>  
> 
>   Barry
> 
> >
> > On Tue, Apr 28, 2015 at 8:44 PM, Jed Brown  wrote:
> > Barry Smith  writes:
> > >   The special malloc would need to save the locations at which it set
> > >   the addresses and then switch the address to NULL. Then the code
> > >   that used those locations would have to know that they may
> > >   be set to NULL and hence check them before use.
> >
> > And then PetscMalloc(...,&tmp); foo->data = tmp; creates SEGV at some
> > unpredictable time.  Awesome!
> >
> > >   I am not saying this particular thing would be practical or not,
> > >   just that if we had a concept of a malloc context for each malloc
> > >   there are many games we could try that we couldn't try otherwise and
> > >   this is just one of them.
> >
> > I'm not convinced, except in the case of mixing in madvise hints.
> >
> 
> 
> 
> 
> -- 
> What most experimenters take for granted before they begin their experiments 
> is infinitely more interesting than any results to which their experiments 
> lead.
> -- Norbert Wiener



Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-04-29 Thread Matthew Knepley
On Thu, Apr 30, 2015 at 6:13 AM, Barry Smith  wrote:

>
> > On Apr 29, 2015, at 12:29 AM, Richard Mills  wrote:
> >
> > I think this is maybe getting away from the heart of the discussion, but
> anyway: Barry, you are talking about being able to mark certain allocated
> regions as "optional" and then have those allocations disappear, right?  I
> could see, say, having such an "optional" allocation that is backed by a
> file on some sort of "fast" storage (some super-fancy SSD or NVRAM) and
> whose pages can be evicted if the memory is needed.  I don't know that I
> like a pointer to that just being nullified if the eviction occurs,
> though.  For a case like this, I'd like to do something like have a Vec and
> only have this happen to the array of values if no "get" is outstanding.
> This puts us back with an array to the numerical values that could get
> swapped around.
>
>There are many possibilities from "just release this memory" to "if you
> really need to you can move this to slower memory".
>
>For example if after a DMRestoreLocalVector() the array memory could be
> marked as "release if you need the space" then on the next
> DMGetLocalVector() it would check if the memory had been released and,
> if so, allocate it again. If it had not been released then it would just use it.
>
>Clearly you can't just mark memory that is being passed around randomly
> in user code as "release if need the space" but you can for carefully
> controlled memory.


I still see no convincing rationale for changing the PetscMalloc interface.
If, as you say, few people use it, then there
is no reason to change it. We can just change our internal interface, and
leave that top level interface alone. Moreover,
since none of the interface changes are very specific I think it needs time
to be shaken out. If at the end, things get
faster and more understandable, we can replace PetscMalloc.

   Matt


>
>   Barry
>
> >
> > On Tue, Apr 28, 2015 at 8:44 PM, Jed Brown  wrote:
> > Barry Smith  writes:
> > >   The special malloc would need to save the locations at which it set
> > >   the addresses and then switch the address to NULL. Then the code
> > >   that used those locations would have to know that they may
> > >   be set to NULL and hence check them before use.
> >
> > And then PetscMalloc(...,&tmp); foo->data = tmp; creates SEGV at some
> > unpredictable time.  Awesome!
> >
> > >   I am not saying this particular thing would be practical or not,
> > >   just that if we had a concept of a malloc context for each malloc
> > >   there are many games we could try that we couldn't try otherwise and
> > >   this is just one of them.
> >
> > I'm not convinced, except in the case of mixing in madvise hints.
> >
>
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener


Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-04-29 Thread Barry Smith

> On Apr 29, 2015, at 12:29 AM, Richard Mills  wrote:
> 
> I think this is maybe getting away from the heart of the discussion, but 
> anyway: Barry, you are talking about being able to mark certain allocated 
> regions as "optional" and then have those allocations disappear, right?  I 
> could see, say, having such an "optional" allocation that is backed by a file 
> on some sort of "fast" storage (some super-fancy SSD or NVRAM) and whose 
> pages can be evicted if the memory is needed.  I don't know that I like a 
> pointer to that just being nullified if the eviction occurs, though.  For a 
> case like this, I'd like to do something like have a Vec and only have this 
> happen to the array of values if no "get" is outstanding.  This puts us back 
> with an array to the numerical values that could get swapped around.

   There are many possibilities from "just release this memory" to "if you 
really need to you can move this to slower memory". 

   For example if after a DMRestoreLocalVector() the array memory could be 
marked as "release if you need the space" then on the next DMGetLocalVector() 
it would check if the memory had been released and, if so, allocate it again. If it 
had not been released then it would just use it.

   Clearly you can't just mark memory that is being passed around randomly in 
user code as "release if need the space" but you can for carefully controlled 
memory.
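
   One Linux primitive that could realize the "release if you need the space"
marking is madvise(MADV_FREE) (Linux 4.5+); this is only a sketch of the idea,
not a full design:

  #include <stddef.h>
  #include <sys/mman.h>

  /* after Restore: the kernel is free to lazily reclaim these pages */
  void mark_releasable(void *array, size_t len)
  {
    madvise(array, len, MADV_FREE);
  }

  /* Before the next Get, a real implementation would have to track whether
     reclamation happened (reclaimed pages read back as zero) and, if so,
     treat the contents as invalid and refill them. */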

  Barry

> 
> On Tue, Apr 28, 2015 at 8:44 PM, Jed Brown  wrote:
> Barry Smith  writes:
> >   The special malloc would need to save the locations at which it set
> >   the addresses and then switch the address to NULL. Then the code
> >   that used those locations would have to know that they may
> >   be set to NULL and hence check them before use.
> 
> And then PetscMalloc(...,&tmp); foo->data = tmp; creates SEGV at some
> unpredictable time.  Awesome!
> 
> >   I am not saying this particular thing would be practical or not,
> >   just that if we had a concept of a malloc context for each malloc
> >   there are many games we could try that we couldn't try otherwise and
> >   this is just one of them.
> 
> I'm not convinced, except in the case of mixing in madvise hints.
> 



Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-04-29 Thread Barry Smith

> On Apr 28, 2015, at 10:44 PM, Jed Brown  wrote:
> 
> Barry Smith  writes:
>>  The special malloc would need to save the locations at which it set
>>  the addresses and then switch the address to NULL. Then the code
>>  that used those locations would have to know that they may
>>  be set to NULL and hence check them before use.
> 
> And then PetscMalloc(...,&tmp); foo->data = tmp; creates SEGV at some
> unpredictable time.  Awesome!

   Obviously it is a controlled malloc that has to be used properly. If you
know that you are getting some unreliable location you cannot write this type
of code, nor would you. And since we are using the malloc in our own code,
users rarely need to use it directly.

> 
>>  I am not saying this particular thing would be practical or not,
>>  just that if we had a concept of a malloc context for each malloc
>>  there are many games we could try that we couldn't try otherwise and
>>  this is just one of them.
> 
> I'm not convinced, except in the case of mixing in madvise hints.



Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-04-29 Thread Karl Rupp

Hi,

> (...)


If we want to move data from one memory kind to another, I believe that
we need to be able to deal with the virtual address changing.  Yes, this
is a pain because extra bookkeeping is involved.  Maybe we don't want to
bother with supporting something like this in PETSc.  But I don't know
of any good way around this.  I have discussed with Chris the idea of
adding support for asynchronously copying pages between different kinds
of memory (maybe have a memdup() analog to strdup()) and he had some
ideas about how this might be done efficiently.  But, again, I don't
know of a good way to move data to a different memory kind while keeping
the same virtual address.  If I'm misunderstanding something about what
is possible with Linux (or other *nix), please let me know--I'd really
like to be wrong on this.


let me bring up another spin of this thought: Currently we have related 
issues with managing memory on GPUs. The way we address this topic there 
is that we have a plain host-buffer, and a buffer allocated on the GPU. 
A separate flag keeps track of which buffer holds the most recent data 
(host, GPU, or both). What if we extend this system slightly such that 
we can also deal with HBM?


Benefits:
 - Changes to code base mainly in *GetArrayReadWrite(), returning the 
'correct' buffer.
 - Command line options as well as APIs for enabling/disabling HBM can 
be easily provided.

 - DRAM fallback always available, even if HBM exhausted.
 - Similar code and logic for dealing with HBM and GPUs.

Disadvantages:
 - Depending on the actual implementation, we may need extra memory 
(data duplication in HBM and DRAM). Since DRAM >> HBM, this may not be a 
big issue.
 - Some parts of PETSc allocate memory directly rather than using 
standard types. These will not use HBM then. May not be 
performance-critical, though...

 - Asynchronous copies between DRAM and HBM remain tricky.
 - 'Politics': The approach is not as fancy as writing heap managers 
and other low-level stuff, so it's harder to sell to e.g. program managers.
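
A rough sketch of the data structure this implies (hypothetical names,
mirroring the GPU case):

  #include <stddef.h>

  typedef enum {VALID_DRAM, VALID_HBM, VALID_BOTH} BufferValid;

  typedef struct {
    double      *dram;   /* plain host buffer, always present */
    double      *hbm;    /* HBM buffer, NULL if HBM is exhausted */
    BufferValid  valid;  /* which copy holds the most recent data */
    size_t       n;
  } DualBuffer;

  /* GetArrayWrite-style accessor: hand out HBM when available */
  double *GetArrayWrite(DualBuffer *b)
  {
    if (b->hbm) { b->valid = VALID_HBM; return b->hbm; }
    b->valid = VALID_DRAM;  /* DRAM fallback, even if HBM exhausted */
    return b->dram;
  }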


Best regards,
Karli



Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-04-28 Thread Jed Brown
Richard Mills  writes:

>> Really?  That's what I'm asking for.
>
> Yes, I am ~ 99% sure that this is the case, but I will double-check to make
> sure.

Thanks.

>> For small allocations, it doesn't matter where the memory is located
>> because it's either in cache or it's not.  From what I hear, KNL's
>> MCDRAM won't improve latency, so all such allocations may as well go in
>> DRAM anyway.  So all I care about are substantial allocations, like
>> matrix and vector data.  It's not expensive to allocate those so they align
>> with page boundaries (provided they are big enough; coarse grids don't
>> matter).
>
> Yes, MCDRAM won't help with latency, only bandwidth, so for small
> allocations it won't matter.  Following reasoning like what you have above,
> a colleague on my team recently developed an "AutoHBW" tool for users who
> don't want to modify their code at all.  A user can specify a size
> threshold above which allocations should come from MCDRAM, and then the
> tool interposes on the malloc() (or other allocator) calls to put the small
> stuff in DRAM and the big stuff in MCDRAM.

What's the point?  If you can fit all the "large" allocations in MCDRAM,
can't you just fit everything in MCDRAM?  Is that so bad?

> I think many users are going to want more control than what something like
> AutoHBW provides, but, as you say, a lot of the time one will only care
> about the substantial allocations for things like matrices and vectors,
> and these also tend to be long lived--plenty of codes will do something
> like allocate a matrix for Jacobians once and keep it around for the
> lifetime of the run.  Maybe we should consider not using a heap manager for
> these allocations, then.  For allocations above some specified threshold,
> perhaps we (PETSc) should simply do the appropriate mmap() and mbind()
> calls to allocate the pages we need in the desired type of memory, and then
> we could use things like move_pages() if/when appropriate (yes, I know
> we don't yet have a good way to make such decisions).  This would mean
> PETSc getting more into the lower level details of memory management, but
> maybe this is appropriate (and unavoidable) as more kinds of
> user-addressable memory get introduced.  I think it is actually less horrible
> than it sounds, because, really, we would just want to do this for the
> largest allocations.  (And this is somewhat analogous to how many malloc()
> implementations work, anyway: Use sbrk() for the small stuff, and mmap()
> for the big stuff.)

I say just use malloc (or posix_memalign) for everything.  PETSc can't
do a better job of the fancy stuff and these normal functions are
perfectly sufficient.

>> That is a regression relative to move_pages.  Just make move_pages work.
>> That's the granularity I've been asking for all along.
>
> Cannot practically be done using a heap manager system like memkind.  But
> we can do this if we do our own mmap() calls, as discussed above.

In practice, we would still use malloc(), but set mallopt
M_MMAP_THRESHOLD if needed and call move_pages.  The reality is that
with 4 KiB pages, it doesn't even matter if your "large" allocation is
not page aligned.  The first and last page don't matter--they're small
enough to be inexpensive to re-fetch from DRAM and don't use up that
much extra space if you map them into MCDRAM.
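
(Concretely, the knob is a single glibc call; a sketch, after which each
large allocation is its own mapping whose pages move_pages() can migrate:)

  #include <malloc.h>   /* mallopt; glibc-specific */
  #include <stdlib.h>

  int main(void)
  {
    mallopt(M_MMAP_THRESHOLD, 1 << 20);   /* mmap anything >= 1 MiB */
    double *x = malloc((size_t)1 << 26);  /* large: backed by its own mmap */
    /* ... pages of x can now be migrated between DRAM and MCDRAM ... */
    free(x);
    return 0;
  }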




Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-04-28 Thread Richard Mills
I think this is maybe getting away from the heart of the discussion, but
anyway: Barry, you are talking about being able to mark certain allocated
regions as "optional" and then have those allocations disappear, right?  I
could see, say, having such an "optional" allocation that is backed by a
file on some sort of "fast" storage (some super-fancy SSD or NVRAM) and
whose pages can be evicted if the memory is needed.  I don't know that I
like a pointer to that just being nullified if the eviction occurs,
though.  For a case like this, I'd like to do something like have a Vec and
only have this happen to the array of values if no "get" is outstanding.
This puts us back with an array to the numerical values that could get
swapped around.

On Tue, Apr 28, 2015 at 8:44 PM, Jed Brown  wrote:

> Barry Smith  writes:
> >   The special malloc would need to save the locations at which it set
> >   the addresses and then switch the address to NULL. Then the code
> >   that used those locations would have to know that they may
> >   be set to NULL and hence check them before use.
>
> And then PetscMalloc(...,&tmp); foo->data = tmp; creates SEGV at some
> unpredictable time.  Awesome!
>
> >   I am not saying this particular thing would be practical or not,
> >   just that if we had a concept of a malloc context for each malloc
> >   there are many games we could try that we couldn't try otherwise and
> >   this is just one of them.
>
> I'm not convinced, except in the case of mixing in madvise hints.
>


Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-04-28 Thread Richard Mills
On Tue, Apr 28, 2015 at 9:35 AM, Jed Brown  wrote:

> Richard Mills  writes:

[...]
>
> Linux provides the mbind(2) and move_pages(2) system calls that enable the
> > user to modify the backing physical pages of virtual address ranges
> within
> > the NUMA architecture, so these can be used to move physical pages
> between
> > NUMA nodes (and high bandwidth on-package memory will be treated as a
> NUMA
> > node).  (A user on a KNL system could actually use move_pages(2) to move
> > between DRAM and MCDRAM, I believe.)
>
> Really?  That's what I'm asking for.
>

Yes, I am ~ 99% sure that this is the case, but I will double-check to make
sure.


>
> > But Linux doesn't provide an equivalent way for a user to change the
> > page size of the backing physical pages of an address range, so it's
> > not possible to implement the above memkind_convert() with what Linux
> > currently provides.
>
> For small allocations, it doesn't matter where the memory is located
> because it's either in cache or it's not.  From what I hear, KNL's
> MCDRAM won't improve latency, so all such allocations may as well go in
> DRAM anyway.  So all I care about are substantial allocations, like
> matrix and vector data.  It's not expensive to allocate those so they align
> with page boundaries (provided they are big enough; coarse grids don't
> matter).
>

Yes, MCDRAM won't help with latency, only bandwidth, so for small
allocations it won't matter.  Following reasoning like what you have above,
a colleague on my team recently developed an "AutoHBW" tool for users who
don't want to modify their code at all.  A user can specify a size
threshold above which allocations should come from MCDRAM, and then the
tool interposes on the malloc() (or other allocator) calls to put the small
stuff in DRAM and the big stuff in MCDRAM.

I think many users are going to want more control than what something like
AutoHBW provides, but, as you say, a lot of the time one will only care
about the substantial allocations for things like matrices and vectors,
and these also tend to be long lived--plenty of codes will do something
like allocate a matrix for Jacobians once and keep it around for the
lifetime of the run.  Maybe we should consider not using a heap manager for
these allocations, then.  For allocations above some specified threshold,
perhaps we (PETSc) should simply do the appropriate mmap() and mbind()
calls to allocate the pages we need in the desired type of memory, and then
we could use things like move_pages() if/when appropriate (yes, I know
we don't yet have a good way to make such decisions).  This would mean
PETSc getting more into the lower level details of memory management, but
maybe this is appropriate (and unavoidable) as more kinds of
user-addressable memory get introduced.  I think it is actually less horrible
than it sounds, because, really, we would just want to do this for the
largest allocations.  (And this is somewhat analogous to how many malloc()
implementations work, anyway: Use sbrk() for the small stuff, and mmap()
for the big stuff.)


>
> > If we want to move data from one memory kind to another, I believe that
> we
> > need to be able to deal with the virtual address changing.
>
> That is a regression relative to move_pages.  Just make move_pages work.
> That's the granularity I've been asking for all along.
>

Cannot practically be done using a heap manager system like memkind.  But
we can do this if we do our own mmap() calls, as discussed above.


>
> > Yes, this is a pain because extra bookkeeping is involved.  Maybe we
> > don't want to bother with supporting something like this in PETSc.
> > But I don't know of any good way around this.  I have discussed with
> > Chris the idea of adding support for asynchronously copying pages
> > between different kinds of memory (maybe have a memdup() analog to
> > strdup()) and he had some ideas about how this might be done
> > efficiently.  But, again, I don't know of a good way to move data to a
> > different memory kind while keeping the same virtual address.  If I'm
> > misunderstanding something about what is possible with Linux (or other
> > *nix), please let me know--I'd really like to be wrong on this.
>
> Moving memory at page granularity is all you can do.  The hardware
> doesn't support virtual-physical mapping at different granularity, so
> there is no way to preserve address without affecting everything sharing
> that page.  But "memkinds" only matter for large allocations.
>
> Is it a showstopper to have different addresses and do full copies?
> It's more of a mess with threads (requires extra
> synchronization/coordination), but it's sometimes (maybe often)
> feasible.  It's certainly ugly and a debugging nightmare (e.g., you'll
> set a location watchpoint and not see where it was modified because it
> was copied out to a different kind).  We'll also need a system for
> eviction.
>


Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-04-28 Thread Jed Brown
Barry Smith  writes:
>   The special malloc would need to save the locations at which it set
>   the addresses and then switch the address to NULL. Then the code
>   that used those locations would have to know that they may
>   be set to NULL and hence check them before use.

And then PetscMalloc(...,&tmp); foo->data = tmp; creates SEGV at some
unpredictable time.  Awesome!

>   I am not saying this particular thing would be practical or not,
>   just that if we had a concept of a malloc context for each malloc
>   there are many games we could try that we couldn't try otherwise and
>   this is just one of them.

I'm not convinced, except in the case of mixing in madvise hints.




Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-04-28 Thread Barry Smith

> On Apr 28, 2015, at 5:04 PM, Jed Brown  wrote:
> 
> Barry Smith  writes:
>>> How do you communicate to the accessor that the memory has been freed?
>>> 
>>  Accessor? What is an accessor?
> 
> The code that accesses the memory behind the pointer (via the pointer or
> otherwise).

  The special malloc would need to save the locations at which it set the 
addresses and then switch the address to NULL. Then the code that used those 
locations would have to know that they may be set to NULL and hence 
check them before use.

  I am not saying this particular thing would be practical or not, just that if 
we had a concept of a malloc context for each malloc there are many games we 
could try that we couldn't try otherwise and this is just one of them.

  Barry




Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-04-28 Thread Jed Brown
Barry Smith  writes:
>> How do you communicate to the accessor that the memory has been freed?
>> 
>   Accessor? What is an accessor?

The code that accesses the memory behind the pointer (via the pointer or
otherwise).




Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-04-28 Thread Barry Smith

> On Apr 28, 2015, at 10:31 AM, Jed Brown  wrote:
> 
> Barry Smith  writes:
>>  For things like we talked about the other day. Malloc zones,
>>   maybe some indication that it is ok that the runtime take back
>>  this memory if available memory is running low, 
> 
> How do you communicate to the accessor that the memory has been freed?
> 
  Accessor? What is an accessor?


>>  the ability to turn off read or all access to a malloc zone so that
>>  another library cannot corrupt the data, etc. When I said
>>  independent of memkind I meant having nothing to do with memkind.
> 
> Sure, and I'm not opposed to the concept, but I'd like it to somehow be
> based on information that the caller can use and have semantics that are
> implementable.  I'm also not wild about the global variables like
> PETSC_COMM_WORLD (whose use is pretty much always wrong in library
> code), so would like to know how a relevant context would be plumbed
> into the caller's scope.
> 
>>  You are correct that this involves lots of nonlocal information or information 
>> that is not yet known, so the argument cannot be simple flags but must be a 
>> context that can be modified at later times.  Crude example
>> 
>>  malloc(zone1, n,&x);
>>   
>>  ZoneSetReadOnly(zone1);
> 
> This is implementable, just somewhat expensive.
> 
>>  ZoneSetAvailableIfNeeded(zone1);
> 
> I don't know what semantics this could have that wouldn't be a
> programming disaster.
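
(For what it's worth, the ZoneSetReadOnly() half is a thin wrapper over
mprotect(2) if zones are mmap()ed and hence page-aligned; a sketch using
Barry's hypothetical names:)

  #include <stddef.h>
  #include <sys/mman.h>

  typedef struct { void *base; size_t len; } Zone;

  Zone ZoneCreate(size_t len)
  {
    Zone z;
    z.len  = len;
    z.base = mmap(NULL, len, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return z;
  }

  int ZoneSetReadOnly(Zone *z)
  {
    /* any later write through a pointer into the zone now faults */
    return mprotect(z->base, z->len, PROT_READ);
  }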



Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-04-28 Thread Jed Brown
Richard Mills  writes:

>> I'm at a loss for words to express how disgusting this is.
>>
>
> Ha ha!  Yeah, I don't like it either.  Chris and I were just thinking about
> what we could do if we wanted to not break the existing API.  

But it DOES BREAK THE EXISTING API!  If you make this change, ALL
EXISTING CODE IS BROKEN and yet broken in a way that the compiler cannot
warn about.  This is literally the worst possible thing.

>> What did Chris say when you asked him about making memkind "suck less"?
>> (Using shorthand to avoid retyping my previous long emails with
>> constructive suggestions.)
>>
>
> I had some pretty good discussions with Chris.  He's a very reasonable guy,
> actually (and unfortunately has just moved to another project, so someone
> else is going to have to take over memkind ownership).  I summarize the
> main points (the ones I can recall, anyway) below:
>
> 1) Easy one first: Regarding my wish for a call to accurately query the
> amount of available high-bandwidth memory (MCDRAM), there is currently a
> memkind_get_size() API but it has the shortcomings of being expensive and
> not taking into account the heap's free pool (just the memory that the OS
> knows to be available).  It should be possible to get around the expense of
> the call with some caching and to include the free pool accounting.  Don't
> know if any work has been done on this one, yet.

I don't think this is very useful for multi-process or threaded code
(i.e., all code that might run on KNL) due to race conditions.  Suppose
that 1% of processes get allocation kinds mixed up due to the race
condition and then run 5x slower for the memory-bound phases of the
application.  Have fun load balancing that.  If you want reproducible
performance and/or to avoid this load-balancing disaster, you need either
to solve the packing problem in a deterministic way or to adaptively
modify the policy so that you can fix the low-quality allocations due to
race conditions.

> 2) Regarding the desire to be able to move pages between kinds of memory
> while keeping the same virtual address:  This is tough to implement in a
> way that will give decent performance.  I guess that what we'd really like
> to have would be an API like
>
>   int memkind_convert(memkind_t kind, void *ptr, size_t size);
>
> but the problem with the above is that if the physical backing of a
> virtual address is being changed, then a POSIX system call has to be made.

This interface is too fine-grained in my opinion.

> Linux provides the mbind(2) and move_pages(2) system calls that enable the
> user to modify the backing physical pages of virtual address ranges within
> the NUMA architecture, so these can be used to move physical pages between
> NUMA nodes (and high bandwidth on-package memory will be treated as a NUMA
> node).  (A user on a KNL system could actually use move_pages(2) to move
> between DRAM and MCDRAM, I believe.)  

Really?  That's what I'm asking for.

> But Linux doesn't provide an equivalent way for a user to change the
> page size of the backing physical pages of an address range, so it's
> not possible to implement the above memkind_convert() with what Linux
> currently provides.

For small allocations, it doesn't matter where the memory is located
because it's either in cache or it's not.  From what I hear, KNL's
MCDRAM won't improve latency, so all such allocations may as well go in
DRAM anyway.  So all I care about are substantial allocations, like
matrix and vector data.  It's not expensive to allocate those so they align
with page boundaries (provided they are big enough; coarse grids don't
matter).

> If we want to move data from one memory kind to another, I believe that we
> need to be able to deal with the virtual address changing.  

That is a regression relative to move_pages.  Just make move_pages work.
That's the granularity I've been asking for all along.

> Yes, this is a pain because extra bookkeeping is involved.  Maybe we
> don't want to bother with supporting something like this in PETSc.
> But I don't know of any good way around this.  I have discussed with
> Chris the idea of adding support for asynchronously copying pages
> between different kinds of memory (maybe have a memdup() analog to
> strdup()) and he had some ideas about how this might be done
> efficiently.  But, again, I don't know of a good way to move data to a
> different memory kind while keeping the same virtual address.  If I'm
> misunderstanding something about what is possible with Linux (or other
> *nix), please let me know--I'd really like to be wrong on this.

Moving memory at page granularity is all you can do.  The hardware
doesn't support virtual-physical mapping at different granularity, so
there is no way to preserve address without affecting everything sharing
that page.  But "memkinds" only matter for large allocations.

Is it a showstopper to have different addresses and do full copies?
It's more of a mess with threads (requires extra

Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-04-28 Thread Jed Brown
Richard Mills  writes:
> I really like Barry's proposal to add this context.  I can think of other
> things that could go into that context, too, like hints about how the
> memory will be used (passed to the OS via madvise(2), for instance).  

I like this better.  And "memkind" should really be an enhancement to
posix_madvise.
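
For reference, a minimal standalone illustration of the posix_madvise(3) call 
mentioned here (plain POSIX; nothing memkind-specific):

#include <stdlib.h>
#include <sys/mman.h>

int main(void)
{
  size_t len = 1 << 20;
  void  *buf;

  if (posix_memalign(&buf, 4096, len)) return 1; /* page-aligned allocation */
  /* Hint that the buffer will be accessed sequentially; the OS may adjust
   * read-ahead or placement accordingly.  Purely advisory. */
  posix_madvise(buf, len, POSIX_MADV_SEQUENTIAL);
  free(buf);
  return 0;
}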




Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-04-28 Thread Jed Brown
Barry Smith  writes:
>   For things like we talked about the other day. Malloc zones,
>    maybe some indication that it is ok that the runtime take back
>   this memory if available memory is running low, 

How do you communicate to the accessor that the memory has been freed?

>   the ability to turn off read or all access to a malloc zone so that
>   another library cannot corrupt the data, etc. When I said
>   independent of memkind I meant having nothing to do with memkind.

Sure, and I'm not opposed to the concept, but I'd like it to somehow be
based on information that the caller can use and have semantics that are
implementable.  I'm also not wild about the global variables like
PETSC_COMM_WORLD (whose use is pretty much always wrong in library
code), so would like to know how a relevant context would be plumbed
into the caller's scope.

>   You are correct that this involves lots of nonlocal information or information 
> that is not yet known, so the argument cannot be simple flags but must be a 
> context that can be modified at later times.  Crude example
>
>   malloc(zone1, n,&x);
>
>   ZoneSetReadOnly(zone1);

This is implementable, just somewhat expensive.

>   ZoneSetAvailableIfNeeded(zone1);

I don't know what semantics this could have that wouldn't be a
programming disaster.




Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-04-28 Thread Barry Smith

  PetscObject x; it would be problematic if the address x ever changed, because 
copies of that address could be stored all over the place (as references to 
that object).  But for the data within an object, such as the array of 
numerical values in a vector or matrix, or the indices in an IS, etc., there is 
generally only a single copy of that address, so (except when a Get is 
outstanding) at least in theory that memory can be swapped around without 
affecting the user (ahh, the power of abstraction :-).  You could write some 
very simple test code such as a VecChangeMemory() that allocates new array 
space and copies the values over, or you can even do as we do with GPUs and 
have multiple array spaces allocated (in different kinds) and have 
VecGetArray() return pointers to different ones depending on "something".
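
  A rough sketch of what such a VecChangeMemory() might look like (this is a 
hypothetical routine, not part of PETSc, and PetscMalloc1 stands in for a 
kind-specific allocator):

#include <petscvec.h>

PetscErrorCode VecChangeMemory(Vec v)
{
  PetscErrorCode     ierr;
  const PetscScalar *old;
  PetscScalar       *fresh;
  PetscInt           i,n;

  PetscFunctionBegin;
  ierr = VecGetLocalSize(v,&n);CHKERRQ(ierr);
  ierr = PetscMalloc1(n,&fresh);CHKERRQ(ierr);   /* the "new kind" of memory */
  ierr = VecGetArrayRead(v,&old);CHKERRQ(ierr);
  for (i=0; i<n; i++) fresh[i] = old[i];         /* copy the values over */
  ierr = VecRestoreArrayRead(v,&old);CHKERRQ(ierr);
  ierr = VecReplaceArray(v,fresh);CHKERRQ(ierr); /* swap in the new array */
  PetscFunctionReturn(0);
}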

  Barry



> On Apr 28, 2015, at 1:38 AM, Richard Mills  wrote:
> 
> On Mon, Apr 27, 2015 at 12:38 PM, Jed Brown  wrote:
> Richard Mills  writes:
> > I think it is possible to add the memkind support without breaking all of
> > the interfaces used throughout PETSc for PetscMalloc(), etc.  I recently
> > sat with Chris Cantalupo, the main memkind developer, and walked him
> > through PETSc's allocation routines, and we came up with the following: The
> > imalloc() function pointer could have an implementation something like
> >
> > PetscErrorCode PetscMemkindMalloc(size_t size, const char *func, const char
> > *file, void **result)
> > {
> >   struct memkind *kind;
> >   int err;
> >
> >   if (*result == NULL) {
> >     kind = MEMKIND_DEFAULT;
> >   } else {
> >     kind = (struct memkind *)(*result);
> 
> I'm at a loss for words to express how disgusting this is.
> 
> Ha ha!  Yeah, I don't like it either.  Chris and I were just thinking about 
> what we could do if we wanted to not break the existing API.  But one of my 
> favorite things about PETSc is that developers are never afraid to make 
> wholesale changes to things.
>  
> 
> > This gives us (1) a method of passing the kind of memory without modifying
> > the petsc allocation routine calling sequence,
> 
> Nonsense, it just dodges the compiler's ability to tell you about the
> memory errors that it creates at every place where PetscMalloc is
> called!
> 
> 
> What did Chris say when you asked him about making memkind "suck less"?
> (Using shorthand to avoid retyping my previous long emails with
> constructive suggestions.)
>  
> I had some pretty good discussions with Chris.  He's a very reasonable guy, 
> actually (and unfortunately has just moved to another project, so someone 
> else is going to have to take over memkind ownership).  I summarize the main 
> points (the ones I can recall, anyway) below:
> 
> 1) Easy one first: Regarding my wish for a call to accurately query the 
> amount of available high-bandwidth memory (MCDRAM), there is currently a 
> memkind_get_size() API but it has the shortcomings of being expensive and not 
> taking into account the heap's free pool (just the memory that the OS knows 
> to be available).  It should be possible to get around the expense of the 
> call with some caching and to include the free pool accounting.  Don't know 
> if any work has been done on this one, yet.
> 
> 2) Regarding the desire to be able to move pages between kinds of memory 
> while keeping the same virtual address:  This is tough to implement in a way 
> that will give decent performance.  I guess that what we'd really like to 
> have would be an API like
> 
>   int memkind_convert(memkind_t kind, void *ptr, size_t size);
> 
> but the problem with the above is that if the physical backing of a 
> virtual address is being changed, then a POSIX system call has to be made.  
> This also means that a heap management system tracking properties of virtual 
> address ranges for reuse after freeing will require *making a system call to 
> query the properties at the time of the free*.  This kills a lot of the 
> reason for using a heap manager in the first place: avoiding the expense of 
> repeated system calls (otherwise we'd just use mmap() for everything) by 
> reusing memory already obtained from the kernel.
> 
> Linux provides the mbind(2) and move_pages(2) system calls that enable the 
> user to modify the backing physical pages of virtual address ranges within 
> the NUMA architecture, so these can be used to move physical pages between 
> NUMA nodes (and high bandwidth on-package memory will be treated as a NUMA 
> node).  (A user on a KNL system could actually use move_pages(2) to move 
> between DRAM and MCDRAM, I believe.)  But Linux doesn't provide an equivalent 
> way for a user to change the page size of the backing physical pages of an 
> address range, so it's not possible to implement the above memkind_convert() 
> with what Linux currently provides.
> 
> If we want to move data from one memory kind to another, I believe that we 
> need to be able to deal with the virtual address changing.

Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-04-27 Thread Richard Mills
On Mon, Apr 27, 2015 at 1:56 PM, Barry Smith  wrote:

>
> > On Apr 27, 2015, at 2:46 PM, Jed Brown  wrote:
> >
> > Barry Smith  writes:
> >>MPI_Comm argument?  PETSc users rarely need to call PetscMalloc()
> >>themselves and if they do call it then they should know the
> >>properties of the memory they are allocating. Most users won't
> >>even notice the change.
> >
> > I think that's an exaggeration, but what are you going to use for the
> > "kind" parameter?  The "correct" value depends on a ton of non-local
> > information.
> >
> >>   Note that I'd like to add this argument independent of memkind.
>
>   For things like we talked about the other day. Malloc zones,  maybe
> some indication that it is ok that the runtime take back this memory if
> available memory is running low, the ability to turn off read or all access
> to a malloc zone so that another library cannot corrupt the data, etc. When
> I said independent of memkind I meant having nothing to do with memkind.
>
>   You are correct that this involves lots of nonlocal information or
> information that is not yet known, so the argument cannot be simple flags
> but must be a context that can be modified at later times.  Crude example
>
>   malloc(zone1, n,&x);
>
>   ZoneSetReadOnly(zone1);
>   ..
>   ZoneSetAvailableIfNeeded(zone1);
>
>
I really like Barry's proposal to add this context.  I can think of other
things that could go into that context, too, like hints about how the
memory will be used (passed to the OS via madvise(2), for instance).  Sure,
most of the time people won't want to pass such hints and can simply ignore
this, but it is consistent with the PETSc philosophy of exposing various
details to users who care while making a reasonable choice for those who
don't.


>
>
>   Barry
>
>
> >
> > What are you going to use it for?  If the allocation is small enough,
> > it'll probably be resident in cache and if it falls out, the lower
> > latency to DRAM will be better than HBM.  As it gets bigger, provided it
> > gets enough use, then HBM becomes the right place, but later it's too
> > big and you have to go back to DRAM.
> >
> > What happens if memory of the kind requested is unavailable?  Error, or
> > does the implementation try to find a different kind?  If there are
> > several memory kinds, what order is used when checking?
>
>
The questions in the second paragraph may be something worth letting the
user decide, either through some global preference or through particular
entries in the context.


Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-04-27 Thread Richard Mills
On Mon, Apr 27, 2015 at 12:38 PM, Jed Brown  wrote:

> Richard Mills  writes:
> > I think it is possible to add the memkind support without breaking all of
> > the interfaces used throughout PETSc for PetscMalloc(), etc.  I recently
> > sat with Chris Cantalupo, the main memkind developer, and walked him
> > through PETSc's allocation routines, and we came up with the following: The
> > imalloc() function pointer could have an implementation something like
> >
> > PetscErrorCode PetscMemkindMalloc(size_t size, const char *func, const char
> > *file, void **result)
> > {
> >   struct memkind *kind;
> >   int err;
> >
> >   if (*result == NULL) {
> >     kind = MEMKIND_DEFAULT;
> >   } else {
> >     kind = (struct memkind *)(*result);
>
> I'm at a loss for words to express how disgusting this is.
>

Ha ha!  Yeah, I don't like it either.  Chris and I were just thinking about
what we could do if we wanted to not break the existing API.  But one of my
favorite things about PETSc is that developers are never afraid to make
wholesale changes to things.


>
> > This gives us (1) a method of passing the kind of memory without
> modifying
> > the petsc allocation routine calling sequence,
>
> Nonsense, it just dodges the compiler's ability to tell you about the
> memory errors that it creates at every place where PetscMalloc is
> called!
>
>
> What did Chris say when you asked him about making memkind "suck less"?
> (Using shorthand to avoid retyping my previous long emails with
> constructive suggestions.)
>

I had some pretty good discussions with Chris.  He's a very reasonable guy,
actually (and unfortunately has just moved to another project, so someone
else is going to have to take over memkind ownership).  I summarize the
main points (the ones I can recall, anyway) below:

1) Easy one first: Regarding my wish for a call to accurately query the
amount of available high-bandwidth memory (MCDRAM), there is currently a
memkind_get_size() API but it has the shortcomings of being expensive and
not taking into account the heap's free pool (just the memory that the OS
knows to be available).  It should be possible to get around the expense of
the call with some caching and to include the free pool accounting.  Don't
know if any work has been done on this one, yet.

2) Regarding the desire to be able to move pages between kinds of memory
while keeping the same virtual address:  This is tough to implement in a
way that will give decent performance.  I guess that what we'd really like
to have would be an API like

  int memkind_convert(memkind_t kind, void *ptr, size_t size);

but the problem with the above is that if the physical backing of a
virtual address is being changed, then a POSIX system call has to be made.
This also means that a heap management system tracking properties of
virtual address ranges for reuse after freeing will require *making a
system call to query the properties at the time of the free*.  This kills a
lot of the reason for using a heap manager in the first place: avoiding the
expense of repeated system calls (otherwise we'd just use mmap() for
everything) by reusing memory already obtained from the kernel.

Linux provides the mbind(2) and move_pages(2) system calls that enable the
user to modify the backing physical pages of virtual address ranges within
the NUMA architecture, so these can be used to move physical pages between
NUMA nodes (and high bandwidth on-package memory will be treated as a NUMA
node).  (A user on a KNL system could actually use move_pages(2) to move
between DRAM and MCDRAM, I believe.)  But Linux doesn't provide an
equivalent way for a user to change the page size of the backing physical
pages of an address range, so it's not possible to implement the above
memkind_convert() with what Linux currently provides.
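
For concreteness, a rough sketch of that move_pages(2) usage: migrate the 
pages backing an existing allocation to another NUMA node (e.g., the node 
exposing MCDRAM) while every virtual address stays fixed.  Linux-specific, 
link with -lnuma; the helper name migrate_to_node is invented:

#define _GNU_SOURCE
#include <stdlib.h>
#include <unistd.h>
#include <numaif.h> /* move_pages(), MPOL_MF_MOVE */

static long migrate_to_node(void *buf, size_t bytes, int target_node)
{
  long   psz    = sysconf(_SC_PAGESIZE);
  size_t npages = (bytes + psz - 1)/psz;
  void **pages  = malloc(npages*sizeof(void*));
  int   *nodes  = malloc(npages*sizeof(int));
  int   *status = malloc(npages*sizeof(int));
  long   rc     = -1;

  if (pages && nodes && status) {
    for (size_t i = 0; i < npages; i++) {
      pages[i] = (char*)buf + i*psz; /* one entry per page of the buffer */
      nodes[i] = target_node;
    }
    /* pid 0 = this process; per-page results are reported in status[] */
    rc = move_pages(0, npages, pages, nodes, status, MPOL_MF_MOVE);
  }
  free(pages); free(nodes); free(status);
  return rc;
}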

If we want to move data from one memory kind to another, I believe that we
need to be able to deal with the virtual address changing.  Yes, this is a
pain because extra bookkeeping is involved.  Maybe we don't want to bother
with supporting something like this in PETSc.  But I don't know of any good
way around this.  I have discussed with Chris the idea of adding support
for asynchronously copying pages between different kinds of memory (maybe
have a memdup() analog to strdup()) and he had some ideas about how this
might be done efficiently.  But, again, I don't know of a good way to move
data to a different memory kind while keeping the same virtual address.  If
I'm misunderstanding something about what is possible with Linux (or other
*nix), please let me know--I'd really like to be wrong on this.
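
Absent that, a minimal sketch of the address-changing alternative using 
memkind's existing API (the helper name memkind_move is invented; the caller 
must update every copy of the old pointer, which is exactly the extra 
bookkeeping described above):

#include <string.h>
#include <memkind.h>

static void *memkind_move(memkind_t dst_kind, memkind_t src_kind,
                          void *old, size_t size)
{
  void *fresh = memkind_malloc(dst_kind, size); /* allocate in the new kind */
  if (!fresh) return NULL;                      /* e.g., MCDRAM exhausted */
  memcpy(fresh, old, size);                     /* copy; the address changes */
  memkind_free(src_kind, old);
  return fresh;
}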

Say that a library is eventually made available that can process all of the
nonlocal information to make reasonable recommendations about where various
data structures should be placed (or, hell, say that there is just an
oracle we can consult about this), but there isn't a good way to do this
while keeping the same virtual address.

Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-04-27 Thread Barry Smith

> On Apr 27, 2015, at 2:46 PM, Jed Brown  wrote:
> 
> Barry Smith  writes:
>>MPI_Comm argument?  PETSc users rarely need to call PetscMalloc()
>>themselves and if they do call it then they should know the
>>properties of the memory they are allocating. Most users won't
>>even notice the change.
> 
> I think that's an exaggeration, but what are you going to use for the
> "kind" parameter?  The "correct" value depends on a ton of non-local
> information. 
> 
>>   Note that I'd like to add this argument independent of memkind.

  For things like we talked about the other day. Malloc zones,  maybe some 
indication that it is ok that the runtime take back this memory if available 
memory is running low, the ability to turn off read or all access to a malloc 
zone so that another library cannot corrupt the data, etc. When I said 
independent of memkind I meant having nothing to do with memkind.

  You are correct that this involves lots of nonlocal information or information 
that is not yet known, so the argument cannot be simple flags but must be a 
context that can be modified at later times.  Crude example

  malloc(zone1, n,&x);
   
  ZoneSetReadOnly(zone1);
  ..
  ZoneSetAvailableIfNeeded(zone1);
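
  A sketch of how the ZoneSetReadOnly() piece might be implemented (the names 
come from the crude example above; the mechanism, mprotect(2) on page-aligned 
allocations, is only one guess at an implementation):

#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>

typedef struct { void *base; size_t len; } Zone;

static int ZoneMalloc(Zone *z, size_t n, void **x)
{
  long psz = sysconf(_SC_PAGESIZE);
  z->len = ((n + psz - 1)/psz)*psz;            /* round up to whole pages */
  if (posix_memalign(&z->base, (size_t)psz, z->len)) return 1;
  *x = z->base;
  return 0;
}

static int ZoneSetReadOnly(Zone *z)
{
  /* Writes through any pointer into the zone now fault, so another library
   * cannot silently corrupt the data.  Restore with PROT_READ|PROT_WRITE
   * before writing or freeing. */
  return mprotect(z->base, z->len, PROT_READ);
}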



  Barry


> 
> What are you going to use it for?  If the allocation is small enough,
> it'll probably be resident in cache and if it falls out, the lower
> latency to DRAM will be better than HBM.  As it gets bigger, provided it
> gets enough use, then HBM becomes the right place, but later it's too
> big and you have to go back to DRAM.
> 
> What happens if memory of the kind requested is unavailable?  Error, or
> does the implementation try to find a different kind?  If there are
> several memory kinds, what order is used when checking?



Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-04-27 Thread Jed Brown
Barry Smith  writes:
> MPI_Comm argument?  PETSc users rarely need to call PetscMalloc()
> themselves and if they do call it then they should know the
> properties of the memory they are allocating. Most users won't
> even notice the change.

I think that's an exaggeration, but what are you going to use for the
"kind" parameter?  The "correct" value depends on a ton of non-local
information.

>Note that I'd like to add this argument independent of memkind.

What are you going to use it for?  If the allocation is small enough,
it'll probably be resident in cache and if it falls out, the lower
latency to DRAM will be better than HBM.  As it gets bigger, provided it
gets enough use, then HBM becomes the right place, but later it's too
big and you have to go back to DRAM.

What happens if memory of the kind requested is unavailable?  Error, or
does the implementation try to find a different kind?  If there are
several memory kinds, what order is used when checking?




Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-04-27 Thread Jed Brown
Richard Mills  writes:
> I think it is possible to add the memkind support without breaking all of
> the interfaces used throughout PETSc for PetscMalloc(), etc.  I recently
> sat with Chris Cantalupo, the main memkind developer, and walked him
> through PETSc's allocation routines, and we came up with the following: The
> imalloc() function pointer could have an implementation something like
>
> PetscErrorCode PetscMemkindMalloc(size_t size, const char *func, const char
> *file, void **result)
> {
>   struct memkind *kind;
>   int err;
>
>   if (*result == NULL) {
>     kind = MEMKIND_DEFAULT;
>   } else {
>     kind = (struct memkind *)(*result);

I'm at a loss for words to express how disgusting this is.

> This gives us (1) a method of passing the kind of memory without modifying
> the petsc allocation routine calling sequence, 

Nonsense, it just dodges the compiler's ability to tell you about the
memory errors that it creates at every place where PetscMalloc is
called!


What did Chris say when you asked him about making memkind "suck less"?
(Using shorthand to avoid retyping my previous long emails with
constructive suggestions.)

> and (2) support a fallback code path for legacy applications that will
> not set the pointer to NULL.  Or am I missing something?





Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-04-27 Thread Barry Smith

> On Apr 27, 2015, at 2:30 PM, Matthew Knepley  wrote:
> 
> On Tue, Apr 28, 2015 at 5:26 AM, Barry Smith  wrote:
> 
> > On Apr 27, 2015, at 1:51 PM, Richard Mills  wrote:
> >
> > All,
> >
> > I'd like to add support for the allocators provided by the 'memkind' 
> > library (https://github.com/memkind/memkind).  I've discussed memkind a 
> > little bit with some of you off-list.  Briefly, memkind provides a "user 
> > extensible heap manager built on top of jemalloc which enables control of 
> > memory characteristics and a partitioning of the heap between kinds of 
> > memory".  The immediate motivation is to support placement of critical data 
> > structures into the high bandwidth on-package memory that will be available 
> > with Intel's "Knights Landing" generation of Xeon Phi processor (like on 
> > the upcoming NERSC Cori machine), but what the library provides is more 
> > general, and it can also be used for placing data in memory such as 
> > nonvolatile RAM (NVRAM), which will be appearing in more systems.
> >
> > I'm with Jed in thinking that, ideally, PETSc (or its users) shouldn't have 
> > to make decisions about the optimal way to solve the "packing problem" of 
> > what should go into high-bandwidth memory.  (In fact, I think this is a 
> > really interesting research problem that relates to some work on 
> > memory-adaptation in scientific applications that I did back when I was 
> > doing my Ph.D. research, e.g., 
> > http://www.climatemodeling.org/~rmills/pubs/JGC_mmlib_2007.pdf.)  However, 
> > right now I'd like to take the baby step of coming up with a mechanism to 
> > simply tag PETSc objects with a kind of memory that is preferred, and then 
> > having associated allocations reflect that preference (or requirement, if 
> > the user wants allocations to fail if such memory is not available).  Later 
> > we can worry about how to move data structures in and out of a kind of 
> > memory.
> >
> > It might make sense to add an option for certain PETSc classes--Mat and Vec 
> > are the most obvious here--to prefer allocations in a certain kind of 
> > memory.  Or, would it make more sense to add such an option at the 
> > PetscObject level?
> >
> > I think it is possible to add the memkind support without breaking all of 
> > the interfaces used throughout PETSc for PetscMalloc(), etc.
> 
>   I don't think having this as a goal is useful at all! Just break the 
> current interface; add an abstract "memkind" argument to all PetscMalloc() 
> and Free() calls that indicates any additional information about the memory 
> requested. By making it "abstract" it will always just be there and on 
> systems without any special memory options it just doesn't do anything.
> 
> Since Malloc() is so pervasive, I think it would be useful to have a two-level 
> interface here. The standard Malloc() would call your advanced PlacedMalloc(), 
> and anyone could call that function, but I think it's just cruel to make 
> everyone allocating memory give arguments they do not understand or need.

MPI_Comm argument?  PETSc users rarely need to call PetscMalloc() 
themselves and if they do call it then they should know the properties of the 
memory they are allocating. Most users won't even notice the change. 

   Note that I'd like to add this argument independent of memkind. 

   Barry

> 
>   Matt
>  
>Barry
> 
>    Note: it is not clear to me how this could be helpful on its own because I 
> agree with Jed: how is the user, when creating the object, supposed to know 
> the optimal place to put it?  For more complex objects it may be that 
> different parts of the object would be stored in different types of memory, 
> etc.
> 
> 
> >  I recently sat with Chris Cantalupo, the main memkind developer, and 
> > walked him through PETSc's allocation routines, and we came up with the 
> > following: The imalloc() function pointer could have an implementation 
> > something like
> >
> > PetscErrorCode PetscMemkindMalloc(size_t size, const char *func, const char 
> > *file, void **result)
> > {
> >   struct memkind *kind;
> >   int err;
> >
> >   if (*result == NULL) {
> >     kind = MEMKIND_DEFAULT;
> >   } else {
> >     kind = (struct memkind *)(*result);
> >   }
> >
> >   err = memkind_posix_memalign(kind, result, 16, size);
> >   return PosixErrToPetscErr(err);
> > }
> >
> > and ifree will look something like:
> >
> > PetscErrorCode PetscMemkindFree(void *ptr, int a, const char *func, const 
> > char *file)
> > {
> >   memkind_free(0, ptr);
> >   return 0;
> > }
> >
> >
> > This gives us (1) a method of passing the kind of memory without modifying 
> > the petsc allocation routine calling sequence, and (2) support for a 
> > fallback code path for legacy applications that will not set the pointer to 
> > NULL.  Or am I missing something?
> >
> > Thoughts?  I'd like to hash out something soon and start writing some code.

Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-04-27 Thread Matthew Knepley
On Tue, Apr 28, 2015 at 5:26 AM, Barry Smith  wrote:

>
> > On Apr 27, 2015, at 1:51 PM, Richard Mills  wrote:
> >
> > All,
> >
> > I'd like to add support for the allocators provided by the 'memkind'
> library (https://github.com/memkind/memkind).  I've discussed memkind a
> little bit with some of you off-list.  Briefly, memkind provides a "user
> extensible heap manager built on top of jemalloc which enables control of
> memory characteristics and a partitioning of the heap between kinds of
> memory".  The immediate motivation is to support placement of critical data
> structures into the high bandwidth on-package memory that will be available
> with Intel's "Knights Landing" generation of Xeon Phi processor (like on
> the upcoming NERSC Cori machine), but what the library provides is more
> general, and it can also be used for placing data in memory such as
> nonvolatile RAM (NVRAM), which will be appearing in more systems.
> >
> > I'm with Jed in thinking that, ideally, PETSc (or its users) shouldn't
> have to make decisions about the optimal way to solve the "packing problem"
> of what should go into high-bandwidth memory.  (In fact, I think this is a
> really interesting research problem that relates to some work on
> memory-adaptation in scientific applications that I did back when I was
> doing my Ph.D. research, e.g.,
> http://www.climatemodeling.org/~rmills/pubs/JGC_mmlib_2007.pdf.)
> However, right now I'd like to take the baby step of coming up with a
> mechanism to simply tag PETSc objects with a kind of memory that is
> preferred, and then having associated allocations reflect that preference
> (or requirement, if the user wants allocations to fail if such memory is
> not available).  Later we can worry about how to move data structures in
> and out of a kind of memory.
> >
> > It might make sense to add an option for certain PETSc classes--Mat and
> Vec are the most obvious here--to prefer allocations in a certain kind of
> memory.  Or, would it make more sense to add such an option at the
> PetscObject level?
> >
> > I think it is possible to add the memkind support without breaking all
> of the interfaces used throughout PETSc for PetscMalloc(), etc.
>
>   I don't think having this as a goal is useful at all! Just break the
> current interface; add an abstract "memkind" argument to all PetscMalloc()
> and Free() calls that indicates any additional information about the memory
> requested. By making it "abstract" it will always just be there and on
> systems without any special memory options it just doesn't do anything.
>

Since Malloc() is so pervasive, I think it would be useful to have a two-level
interface here. The standard Malloc() would call your advanced PlacedMalloc(),
and anyone could call that function, but I think it's just cruel to make
everyone allocating memory give arguments they do not understand or need.
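
A skeleton of that two-level split (all names hypothetical; a real version 
would dispatch on the placement context instead of ignoring it):

#include <stdlib.h>

typedef int PetscErrorCode;                 /* stand-in for the real typedef */
typedef struct PetscMemPlace PetscMemPlace; /* opaque placement context */

/* Advanced interface: callers who understand placement pass a context. */
static PetscErrorCode PetscPlacedMalloc(PetscMemPlace *place, size_t size,
                                        void **result)
{
  (void)place;                /* sketch only: placement context unused */
  *result = malloc(size);
  return *result ? 0 : 1;
}

/* Standard interface: unchanged signature, default placement. */
static PetscErrorCode PetscMallocPlain(size_t size, void **result)
{
  return PetscPlacedMalloc(NULL, size, result);
}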

  Matt


>Barry
>
>    Note: it is not clear to me how this could be helpful on its own
> because I agree with Jed: how is the user, when creating the object, supposed
> to know the optimal place to put it?  For more complex objects it
> may be that different parts of the object would be stored in different
> types of memory, etc.
>
>
> >  I recently sat with Chris Cantalupo, the main memkind developer, and
> walked him through PETSc's allocation routines, and we came up with the
> following: The imalloc() function pointer could have an implementation
> something like
> >
> > PetscErrorCode PetscMemkindMalloc(size_t size, const char *func, const
> > char *file, void **result)
> > {
> >   struct memkind *kind;
> >   int err;
> >
> >   if (*result == NULL) {
> >     kind = MEMKIND_DEFAULT;
> >   } else {
> >     kind = (struct memkind *)(*result);
> >   }
> >
> >   err = memkind_posix_memalign(kind, result, 16, size);
> >   return PosixErrToPetscErr(err);
> > }
> >
> > and ifree will look something like:
> >
> > PetscErrorCode PetscMemkindFree(void *ptr, int a, const char *func, const
> > char *file)
> > {
> >   memkind_free(0, ptr);
> >   return 0;
> > }
> >
> >
> > This gives us (1) a method of passing the kind of memory without
> > modifying the petsc allocation routine calling sequence, and (2) support a
> > fallback code path for legacy applications that will not set the pointer to
> > NULL.  Or am I missing something?
> >
> > Thoughts?  I'd like to hash out something soon and start writing some
> code.
> >
> > --Richard
>
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener


Re: [petsc-dev] Adding support memkind allocators in PETSc

2015-04-27 Thread Barry Smith

> On Apr 27, 2015, at 1:51 PM, Richard Mills  wrote:
> 
> All,
> 
> I'd like to add support for the allocators provided by the 'memkind' library 
> (https://github.com/memkind/memkind).  I've discussed memkind a little bit 
> with some of you off-list.  Briefly, memkind provides a "user extensible heap 
> manager built on top of jemalloc which enables control of memory 
> characteristics and a partitioning of the heap between kinds of memory".  The 
> immediate motivation is to support placement of critical data structures into 
> the high bandwidth on-package memory that will be available with Intel's 
> "Knights Landing" generation of Xeon Phi processor (like on the upcoming 
> NERSC Cori machine), but what the library provides is more general, and it 
> can also be used for placing data in memory such as nonvolatile RAM (NVRAM), 
> which will be appearing in more systems.
> 
> I'm with Jed in thinking that, ideally, PETSc (or its users) shouldn't have 
> to make decisions about the optimal way to solve the "packing problem" of 
> what should go into high-bandwidth memory.  (In fact, I think this is a 
> really interesting research problem that relates to some work on 
> memory-adaptation in scientific applications that I did back when I was doing 
> my Ph.D. research, e.g., 
> http://www.climatemodeling.org/~rmills/pubs/JGC_mmlib_2007.pdf.)  However, 
> right now I'd like to take the baby step of coming up with a mechanism to 
> simply tag PETSc objects with a kind of memory that is preferred, and then 
> having associated allocations reflect that preference (or requirement, if the 
> user wants allocations to fail if such memory is not available).  Later we 
> can worry about how to move data structures in and out of a kind of memory.
> 
> It might make sense to add an option for certain PETSc classes--Mat and Vec 
> are the most obvious here--to prefer allocations in a certain kind of memory. 
>  Or, would it make more sense to add such an option at the PetscObject level?
> 
> I think it is possible to add the memkind support without breaking all of the 
> interfaces used throughout PETSc for PetscMalloc(), etc.

  I don't think having this as a goal is useful at all! Just break the current 
interface; add an abstract "memkind" argument to all PetscMalloc() and Free() 
calls that indicates any additional information about the memory requested. By 
making it "abstract" it will always just be there and on systems without any 
special memory options it just doesn't do anything.

   Barry

   Note: it is not clear to me how this could be helpful on its own because I 
agree with Jed: how is the user, when creating the object, supposed to know the 
optimal place to put it?  For more complex objects it may be that different 
parts of the object would be stored in different types of memory, etc.


>  I recently sat with Chris Cantalupo, the main memkind developer, and walked 
> him through PETSc's allocation routines, and we came up with the following: 
> The imalloc() function pointer could have an implementation something like
> 
> PetscErrorCode PetscMemkindMalloc(size_t size, const char *func, const char 
> *file, void **result)
> {
>   struct memkind *kind;
>   int err;
>
>   if (*result == NULL) {
>     kind = MEMKIND_DEFAULT;
>   } else {
>     kind = (struct memkind *)(*result);
>   }
>
>   err = memkind_posix_memalign(kind, result, 16, size);
>   return PosixErrToPetscErr(err);
> }
>
> and ifree will look something like:
>
> PetscErrorCode PetscMemkindFree(void *ptr, int a, const char *func, const char 
> *file)
> {
>   memkind_free(0, ptr);
>   return 0;
> }
> 
>  
> This gives us (1) a method of passing the kind of memory without modifying 
> the petsc allocation routine calling sequence, and (2) support a fallback 
> code path for legacy applications that will not set the pointer to NULL.  Or 
> am I missing something?
> 
> Thoughts?  I'd like to hash out something soon and start writing some code.
> 
> --Richard