On Thu, May 14, 2015 at 05:51:19PM +1000, Benjamin Herrenschmidt wrote:
> On Thu, 2015-05-14 at 09:39 +0200, Vlastimil Babka wrote:
> > On 05/14/2015 01:38 AM, Benjamin Herrenschmidt wrote:
> > > On Wed, 2015-05-13 at 16:10 +0200, Vlastimil Babka wrote:
> > >> Sorry for reviving oldish thread...
>
On Thu, 2015-05-14 at 09:39 +0200, Vlastimil Babka wrote:
> On 05/14/2015 01:38 AM, Benjamin Herrenschmidt wrote:
> > On Wed, 2015-05-13 at 16:10 +0200, Vlastimil Babka wrote:
> >> Sorry for reviving oldish thread...
> >
> > Well, that's actually appreciated since this is constructive discussion
>
On 05/14/2015 01:38 AM, Benjamin Herrenschmidt wrote:
On Wed, 2015-05-13 at 16:10 +0200, Vlastimil Babka wrote:
Sorry for reviving oldish thread...
Well, that's actually appreciated since this is constructive discussion
of the kind I was hoping to trigger initially :-) I'll look at
I hoped s
On Wed, 2015-05-13 at 16:10 +0200, Vlastimil Babka wrote:
> Sorry for reviving oldish thread...
Well, that's actually appreciated since this is constructive discussion
of the kind I was hoping to trigger initially :-) I'll look at
ZONE_MOVABLE, I wasn't aware of its existence.
Don't we still have
Sorry for reviving oldish thread...
On 04/28/2015 01:54 AM, Benjamin Herrenschmidt wrote:
On Mon, 2015-04-27 at 11:48 -0500, Christoph Lameter wrote:
On Mon, 27 Apr 2015, Rik van Riel wrote:
Why would we want to avoid the sane approach that makes this thing
work with the fewest required chang
On Tue, Apr 28, 2015 at 09:18:55AM -0500, Christoph Lameter wrote:
> On Mon, 27 Apr 2015, Jerome Glisse wrote:
>
> > > is the mechanism that DAX relies on in the VM.
> >
> > > Which would require far more changes than you seem to think. First, using
> > > MIXED|PFNMAP means we lose any kind of memory
On Mon, 27 Apr 2015, Jerome Glisse wrote:
> > is the mechanism that DAX relies on in the VM.
>
> Which would require far more changes than you seem to think. First, using
> MIXED|PFNMAP means we lose any kind of memory accounting and forget about
> memcg too. Second, it means we would need to set
On Mon, 2015-04-27 at 11:48 -0500, Christoph Lameter wrote:
> On Mon, 27 Apr 2015, Rik van Riel wrote:
>
> > Why would we want to avoid the sane approach that makes this thing
> > work with the fewest required changes to core code?
>
> Because new ZONEs are a pretty invasive change to the memory m
On Mon, Apr 27, 2015 at 02:26:04PM -0500, Christoph Lameter wrote:
> On Mon, 27 Apr 2015, Jerome Glisse wrote:
>
> > > We can drop the DAX name and just talk about mapping to external memory if
> > > that confuses the issue.
> >
> > DAX is for direct access block layer (X is for the cool name fact
On 04/27/2015 03:26 PM, Christoph Lameter wrote:
> DAX is about directly accessing memory. It is made for the purpose of
> serving as a block device for a filesystem right now but it can easily be
> used as a way to map any external memory into a processes space using the
> abstraction of a block
On Mon, 27 Apr 2015, Jerome Glisse wrote:
> > We can drop the DAX name and just talk about mapping to external memory if
> > that confuses the issue.
>
> DAX is for direct access block layer (X is for the cool name factor)
> there is zero code inside DAX that would be useful to us. Because it
> i
On Mon, Apr 27, 2015 at 11:51:51AM -0500, Christoph Lameter wrote:
> On Mon, 27 Apr 2015, Jerome Glisse wrote:
>
> > > Well lets avoid that. Access to device memory comparable to what the
> > > drivers do today by establishing page table mappings or a generalization
> > > of DAX approaches would b
On Mon, 27 Apr 2015, Jerome Glisse wrote:
> > Well lets avoid that. Access to device memory comparable to what the
> > drivers do today by establishing page table mappings or a generalization
> > of DAX approaches would be the most straightforward way of implementing it
> > and would build based o
On Mon, 27 Apr 2015, Rik van Riel wrote:
> Why would we want to avoid the sane approach that makes this thing
> work with the fewest required changes to core code?
Because new ZONEs are a pretty invasive change to the memory management and
because there are other ways to handle references to devi
On Mon, Apr 27, 2015 at 11:17:43AM -0500, Christoph Lameter wrote:
> On Mon, 27 Apr 2015, Jerome Glisse wrote:
>
> > > Improvements to the general code would be preferred instead of
> > > having specialized solutions for a particular hardware alone. If the
> > > general code can then handle the s
On Mon, 27 Apr 2015, Paul E. McKenney wrote:
> I would instead look on this as a way to try out use of hardware migration
> hints, which could lead to hardware vendors providing similar hints for
> node-to-node migrations. At that time, the benefits could be provided
> all the functionality relyi
On 04/27/2015 12:17 PM, Christoph Lameter wrote:
> On Mon, 27 Apr 2015, Jerome Glisse wrote:
>
>>> Improvements to the general code would be preferred instead of
>>> having specialized solutions for a particular hardware alone. If the
>>> general code can then handle the special coprocessor situa
On Mon, 27 Apr 2015, Jerome Glisse wrote:
> > Improvements to the general code would be preferred instead of
> > having specialized solutions for a particular hardware alone. If the
> > general code can then handle the special coprocessor situation then we
> > avoid a lot of code development.
>
>
On Mon, Apr 27, 2015 at 10:08:29AM -0500, Christoph Lameter wrote:
> On Sat, 25 Apr 2015, Paul E. McKenney wrote:
>
> > Would you have a URL or other pointer to this code?
>
> linux/mm/migrate.c
Ah, I thought you were calling out something not yet in mainline.
> > > > Without modifying a single
On Mon, Apr 27, 2015 at 10:08:29AM -0500, Christoph Lameter wrote:
> On Sat, 25 Apr 2015, Paul E. McKenney wrote:
>
> > Would you have a URL or other pointer to this code?
>
> linux/mm/migrate.c
>
> > > > Without modifying a single line of mm code, the only way to do this is
> > > > to
> > > >
On Sat, 25 Apr 2015, Paul E. McKenney wrote:
> Would you have a URL or other pointer to this code?
linux/mm/migrate.c
> > > Without modifying a single line of mm code, the only way to do this is to
> > > either unmap from the cpu page table the range being migrated or to
> > > mprotect
> > > it
On Sat, Apr 25, 2015 at 01:32:39PM +1000, Benjamin Herrenschmidt wrote:
> On Fri, 2015-04-24 at 22:32 -0400, Rik van Riel wrote:
> > > The result would be that the kernel would allocate only
> > migratable
> > > pages within the CCAD device's memory, and even then only if
> > > me
On Fri, Apr 24, 2015 at 10:49:28AM -0500, Christoph Lameter wrote:
> On Fri, 24 Apr 2015, Paul E. McKenney wrote:
>
> > can deliver, but where the cost of full-fledge hand tuning cannot be
> > justified.
> >
> > You seem to believe that this latter category is the empty set, which
> > I must confe
On Fri, Apr 24, 2015 at 03:00:18PM -0500, Christoph Lameter wrote:
> On Fri, 24 Apr 2015, Jerome Glisse wrote:
>
> > > Still no answer as to why is that not possible with the current scheme?
> > > You keep on talking about pointers and I keep on responding that this is a
> > > matter of making the
On Fri, Apr 24, 2015 at 11:09:36AM -0400, Jerome Glisse wrote:
> On Fri, Apr 24, 2015 at 07:57:38AM -0700, Paul E. McKenney wrote:
> > On Fri, Apr 24, 2015 at 09:12:07AM -0500, Christoph Lameter wrote:
> > > On Thu, 23 Apr 2015, Paul E. McKenney wrote:
> > >
> > > >
> > > > DAX
> > > >
> > > >
On Fri, 2015-04-24 at 22:32 -0400, Rik van Riel wrote:
> > The result would be that the kernel would allocate only
> migratable
> > pages within the CCAD device's memory, and even then only if
> > memory was otherwise exhausted.
>
> Does it make sense to allocate the device's pag
On 04/21/2015 05:44 PM, Paul E. McKenney wrote:
> AUTONUMA
>
> The Linux kernel's autonuma facility supports migrating both
> memory and processes to promote NUMA memory locality. It was
> accepted into 3.13 and is available in RHEL 7.0 and SLES 12.
> It is enabled by the
On Fri, 2015-04-24 at 11:58 -0500, Christoph Lameter wrote:
> On Fri, 24 Apr 2015, Jerome Glisse wrote:
>
> > > What exactly is the more advanced version's benefit? What are the features
> > > that the other platforms do not provide?
> >
> > Transparent access to device memory from the CPU, you ca
On Fri, 24 Apr 2015, Jerome Glisse wrote:
> > Still no answer as to why is that not possible with the current scheme?
> > You keep on talking about pointers and I keep on responding that this is a
> > matter of making the address space compatible on both sides.
>
> So if do that in a naive way, ho
On Fri, Apr 24, 2015 at 01:56:45PM -0500, Christoph Lameter wrote:
> On Fri, 24 Apr 2015, Jerome Glisse wrote:
>
> > > Right this is how things work and you could improve on that. Stay with the
> > > scheme. Why would that not work if you map things the same way in both
> > > environments if both
On Fri, 24 Apr 2015, Jerome Glisse wrote:
> > Right this is how things work and you could improve on that. Stay with the
> > scheme. Why would that not work if you map things the same way in both
> > environments if both the accelerator and host processor can access each
> > others memory?
>
> Again
On 04/23/2015 07:22 PM, Jerome Glisse wrote:
On Thu, Apr 23, 2015 at 09:20:55AM -0500, Christoph Lameter wrote:
On Thu, 23 Apr 2015, Benjamin Herrenschmidt wrote:
There are hooks in glibc where you can replace the memory
management of the apps if you want that.
We don't control the app. Le
On Fri, Apr 24, 2015 at 11:58:39AM -0500, Christoph Lameter wrote:
> On Fri, 24 Apr 2015, Jerome Glisse wrote:
>
> > > What exactly is the more advanced version's benefit? What are the features
> > > that the other platforms do not provide?
> >
> > Transparent access to device memory from the CPU,
On Fri, 24 Apr 2015, Jerome Glisse wrote:
> > What exactly is the more advanced version's benefit? What are the features
> > that the other platforms do not provide?
>
> Transparent access to device memory from the CPU, you can map any of the GPU
> memory inside the CPU and have the whole cache co
On Fri, Apr 24, 2015 at 11:03:52AM -0500, Christoph Lameter wrote:
> On Fri, 24 Apr 2015, Jerome Glisse wrote:
>
> > On Fri, Apr 24, 2015 at 09:29:12AM -0500, Christoph Lameter wrote:
> > > On Thu, 23 Apr 2015, Jerome Glisse wrote:
> > >
> > > > No, this has not been solved properly. Today's solution
On 04/24/2015 10:30 AM, Christoph Lameter wrote:
> On Thu, 23 Apr 2015, Paul E. McKenney wrote:
>
>> If by "entire industry" you mean everyone who might want to use hardware
>> acceleration, for example, including mechanical computer-aided design,
>> I am skeptical.
>
> The industry designs GPUs
On 04/24/2015 11:49 AM, Christoph Lameter wrote:
> On Fri, 24 Apr 2015, Paul E. McKenney wrote:
>
>> can deliver, but where the cost of full-fledge hand tuning cannot be
>> justified.
>>
>> You seem to believe that this latter category is the empty set, which
>> I must confess does greatly surpris
On Fri, 24 Apr 2015, Jerome Glisse wrote:
> On Fri, Apr 24, 2015 at 09:29:12AM -0500, Christoph Lameter wrote:
> > On Thu, 23 Apr 2015, Jerome Glisse wrote:
> >
> > > No, this has not been solved properly. Today's solution is doing an explicit
> > > copy again and again when complex data struct a
On Fri, Apr 24, 2015 at 09:30:40AM -0500, Christoph Lameter wrote:
> On Thu, 23 Apr 2015, Paul E. McKenney wrote:
>
> > If by "entire industry" you mean everyone who might want to use hardware
> > acceleration, for example, including mechanical computer-aided design,
> > I am skeptical.
>
> The i
On 04/24/2015 10:01 AM, Christoph Lameter wrote:
> On Thu, 23 Apr 2015, Paul E. McKenney wrote:
>
>>> As far as I know Jerome is talking about HPC loads and high performance
>>> GPU processing. This is the same use case.
>>
>> The difference is sensitivity to latency. You have latency-sensitive
On Fri, 24 Apr 2015, Paul E. McKenney wrote:
> > DAX is a mechanism to access memory not managed by the kernel and is the
> > successor to XIP. It just happens to be needed for persistent memory.
> > Fundamentally any driver can provide an mmapped interface to allow access
> > to a device's memory.
On Fri, 24 Apr 2015, Paul E. McKenney wrote:
> can deliver, but where the cost of full-fledge hand tuning cannot be
> justified.
>
> You seem to believe that this latter category is the empty set, which
> I must confess does greatly surprise me.
If compromises are already being made, the
On Fri, Apr 24, 2015 at 07:57:38AM -0700, Paul E. McKenney wrote:
> On Fri, Apr 24, 2015 at 09:12:07AM -0500, Christoph Lameter wrote:
> > On Thu, 23 Apr 2015, Paul E. McKenney wrote:
> >
> > >
> > > DAX
> > >
> > > DAX is a mechanism for providing direct-memory access to
> > > high-speed non-
On Fri, Apr 24, 2015 at 09:29:12AM -0500, Christoph Lameter wrote:
> On Thu, 23 Apr 2015, Jerome Glisse wrote:
>
> > No, this has not been solved properly. Today's solution is doing an explicit
> > copy again and again when complex data structures are involved (list, tree,
> > ...); this is extremely te
On Fri, Apr 24, 2015 at 09:12:07AM -0500, Christoph Lameter wrote:
> On Thu, 23 Apr 2015, Paul E. McKenney wrote:
>
> >
> > DAX
> >
> > DAX is a mechanism for providing direct-memory access to
> > high-speed non-volatile (AKA "persistent") memory. Good
> > introductions to DAX may be
On Thu, 23 Apr 2015, Paul E. McKenney wrote:
> If by "entire industry" you mean everyone who might want to use hardware
> acceleration, for example, including mechanical computer-aided design,
> I am skeptical.
The industry designs GPUs with super-fast special RAM and accelerators
with special r
On Thu, 23 Apr 2015, Jerome Glisse wrote:
> No, this has not been solved properly. Today's solution is doing an explicit
> copy again and again when complex data structures are involved (list, tree,
> ...); this is extremely tedious and hard to debug. So today's solutions often
> restrict themselves to easy
On Fri, Apr 24, 2015 at 09:01:47AM -0500, Christoph Lameter wrote:
> On Thu, 23 Apr 2015, Paul E. McKenney wrote:
>
> > > As far as I know Jerome is talking about HPC loads and high performance
> > > GPU processing. This is the same use case.
> >
> > The difference is sensitivity to latency. You
On Thu, 23 Apr 2015, Paul E. McKenney wrote:
>
> DAX
>
> DAX is a mechanism for providing direct-memory access to
> high-speed non-volatile (AKA "persistent") memory. Good
> introductions to DAX may be found in the following LWN
> articles:
DAX is a mechanism to access me
On Thu, 23 Apr 2015, Jerome Glisse wrote:
> The NUMA code we have today for the CPU case exists because it does make
> a difference, but you keep trying to restrict GPU users to a workload
> that is specific. Go talk to people doing physics, biology, data
> mining, CAD; most of them do not care about laten
On Thu, 23 Apr 2015, Austin S Hemmelgarn wrote:
Looking at this whole conversation, all I see is two different views on how to
present the asymmetric multiprocessing arrangements that have become
commonplace in today's systems to userspace. Your model favors performance,
while CAPI favors simpl
On Thu, 23 Apr 2015, Paul E. McKenney wrote:
> > As far as I know Jerome is talking about HPC loads and high performance
> > GPU processing. This is the same use case.
>
> The difference is sensitivity to latency. You have latency-sensitive
> HPC workloads, and Jerome is talking about HPC worklo
On Thu, 2015-04-23 at 09:10 -0500, Christoph Lameter wrote:
> On Thu, 23 Apr 2015, Benjamin Herrenschmidt wrote:
>
> > > Anyone
> > > wanting performance (and that is the prime reason to use a GPU) would
> > > switch this off because the latencies are otherwise not controllable and
> > > those ma
On Thu, 2015-04-23 at 11:25 -0400, Austin S Hemmelgarn wrote:
> Looking at this whole conversation, all I see is two different views on
> how to present the asymmetric multiprocessing arrangements that have
> become commonplace in today's systems to userspace. Your model favors
> performance, w
And another update, again diffs followed by the full document. The
diffs are against the version at https://lkml.org/lkml/2015/4/22/235.
Thanx, Paul
diff --git a/Devi
On Thu, Apr 23, 2015 at 09:12:38AM -0500, Christoph Lameter wrote:
> On Wed, 22 Apr 2015, Paul E. McKenney wrote:
>
> > Agreed, the use case that Jerome is thinking of differs from yours.
> > You would not (and should not) tolerate things like page faults because
> > it would destroy your worst-ca
On Thu, Apr 23, 2015 at 09:20:55AM -0500, Christoph Lameter wrote:
> On Thu, 23 Apr 2015, Benjamin Herrenschmidt wrote:
>
> > > There are hooks in glibc where you can replace the memory
> > > management of the apps if you want that.
> >
> > We don't control the app. Let's say we are doing a plugin
On Thu, Apr 23, 2015 at 09:38:15AM -0500, Christoph Lameter wrote:
> On Thu, 23 Apr 2015, Benjamin Herrenschmidt wrote:
[ . . . ]
> > It might not be *your* model based on *your* application but that doesn't
> > mean
> > it's not there, and isn't relevant.
>
> Sadly this is the way that an enti
On 04/22/2015 01:14 PM, Christoph Lameter wrote:
> On Wed, 22 Apr 2015, Jerome Glisse wrote:
>
>> Glibc hooks will not work; this is about having the same address space on
>> CPU and GPU/accelerator while allowing backing memory to be regular
>> system memory or device memory, all this in a transparent
On Thu, Apr 23, 2015 at 09:38:15AM -0500, Christoph Lameter wrote:
> On Thu, 23 Apr 2015, Benjamin Herrenschmidt wrote:
[...]
> > You have something in memory, whether you got it via malloc, mmap'ing a
> > file,
> > shmem with some other application, ... and you want to work on it with the
> > c
On 04/21/2015 08:50 PM, Christoph Lameter wrote:
> On Tue, 21 Apr 2015, Jerome Glisse wrote:
>> So big use case here, let say you have an application that rely on a
>> scientific library that do matrix computation. Your application simply
>> use malloc and give pointer to this scientific library.
On Thu, Apr 23, 2015 at 09:10:13AM -0500, Christoph Lameter wrote:
> On Thu, 23 Apr 2015, Benjamin Herrenschmidt wrote:
>
> > > Anyone
> > > wanting performance (and that is the prime reason to use a GPU) would
> > > switch this off because the latencies are otherwise not controllable and
> > > t
On 2015-04-23 10:25, Christoph Lameter wrote:
On Thu, 23 Apr 2015, Benjamin Herrenschmidt wrote:
They are via MMIO space. The big differences here are that via CAPI the
memory can be fully cachable and thus have the same characteristics as
normal memory from the processor point of view, and the
On Thu, 23 Apr 2015, Benjamin Herrenschmidt wrote:
> In fact I'm quite surprised, what we want to achieve is the most natural
> way from an application perspective.
Well, the most natural thing would be if the beast would just do what I
tell it in plain English. But then I would not have my job an
On Thu, 23 Apr 2015, Benjamin Herrenschmidt wrote:
> They are via MMIO space. The big differences here are that via CAPI the
> memory can be fully cachable and thus have the same characteristics as
> normal memory from the processor point of view, and the device shares
> the MMU with the host.
>
>
On Thu, 23 Apr 2015, Benjamin Herrenschmidt wrote:
> > There are hooks in glibc where you can replace the memory
> > management of the apps if you want that.
>
> We don't control the app. Let's say we are doing a plugin for libfoo
> which accelerates "foo" using GPUs.
There are numerous examples
On Wed, 22 Apr 2015, Paul E. McKenney wrote:
> Agreed, the use case that Jerome is thinking of differs from yours.
> You would not (and should not) tolerate things like page faults because
> it would destroy your worst-case response times. I believe that Jerome
> is more interested in throughput
On Thu, 23 Apr 2015, Benjamin Herrenschmidt wrote:
> > Anyone
> > wanting performance (and that is the prime reason to use a GPU) would
> > switch this off because the latencies are otherwise not controllable and
> > those may impact performance severely. There are typically multiple
> > parallel
On Wed, 2015-04-22 at 13:17 -0500, Christoph Lameter wrote:
>
> > But again let me stress that application that want to be in control will
> > stay in control. If you want to make the decission yourself about where
> > things should end up then nothing in all we are proposing will preclude
> > you
On Wed, 2015-04-22 at 12:14 -0500, Christoph Lameter wrote:
>
> > Bottom line is we want today's anonymous, shared, or file-mapped memory
> > to stay the only kinds of memory that exist, and we want to choose the
> > backing store of each of those kinds for better placement depending
> > on how memory is
On Wed, 2015-04-22 at 11:16 -0500, Christoph Lameter wrote:
> On Wed, 22 Apr 2015, Paul E. McKenney wrote:
>
> > I completely agree that some critically important use cases, such as
> > yours, will absolutely require that the application explicitly choose
> > memory placement and have the memory s
On Wed, 2015-04-22 at 10:25 -0500, Christoph Lameter wrote:
> On Wed, 22 Apr 2015, Benjamin Herrenschmidt wrote:
>
> > Right, it doesn't look at all like what we want.
>
> It's definitely a way to map memory that is outside of the kernel-managed
> pool into a user space process. For that matter an
On Wed, Apr 22, 2015 at 12:14:50PM -0500, Christoph Lameter wrote:
> On Wed, 22 Apr 2015, Jerome Glisse wrote:
>
> > Glibc hooks will not work; this is about having the same address space on
> > CPU and GPU/accelerator while allowing backing memory to be regular
> > system memory or device memory, all
On Wed, Apr 22, 2015 at 01:17:58PM -0500, Christoph Lameter wrote:
> On Wed, 22 Apr 2015, Jerome Glisse wrote:
>
> > Now if you have the exact same address space, then the structures you have on
> > the CPU are viewed in exactly the same way on the GPU, and you can start
> > porting libraries to leverage the GPU
On Wed, 22 Apr 2015, Jerome Glisse wrote:
> Now if you have the exact same address space, then the structures you have on
> the CPU are viewed in exactly the same way on the GPU, and you can start
> porting libraries to leverage the GPU without having to change a single line of
> code inside many, many appli
On Wed, 22 Apr 2015, Jerome Glisse wrote:
> Glibc hooks will not work; this is about having the same address space on
> CPU and GPU/accelerator while allowing backing memory to be regular
> system memory or device memory, all this in a transparent manner to
> userspace programs and libraries.
If you cont
On Wed, Apr 22, 2015 at 11:16:49AM -0500, Christoph Lameter wrote:
> On Wed, 22 Apr 2015, Paul E. McKenney wrote:
>
> > I completely agree that some critically important use cases, such as
> > yours, will absolutely require that the application explicitly choose
> > memory placement and have the m
On Wed, Apr 22, 2015 at 10:25:37AM -0500, Christoph Lameter wrote:
> On Wed, 22 Apr 2015, Benjamin Herrenschmidt wrote:
>
> > Right, it doesn't look at all like what we want.
>
> It's definitely a way to map memory that is outside of the kernel-managed
> pool into a user space process. For that ma
On Wed, 22 Apr 2015, Paul E. McKenney wrote:
> I completely agree that some critically important use cases, such as
> yours, will absolutely require that the application explicitly choose
> memory placement and have the memory stay there.
Most of what you are trying to do here is already there
On Wed, 22 Apr 2015, Benjamin Herrenschmidt wrote:
> Right, it doesn't look at all like what we want.
It's definitely a way to map memory that is outside of the kernel-managed
pool into a user space process. For that matter any device driver could be
doing this as well. The point is that we alread
On Tue, 21 Apr 2015, Paul E. McKenney wrote:
> Ben will correct me if I am wrong, but I do not believe that we are
> looking for persistent memory in this case.
DAX is a way of mapping special memory into user space. Persistence is one
possible use case. It's like the XIP that you IBMers know from z
On Wed, Apr 22, 2015 at 11:01:26AM +1000, Benjamin Herrenschmidt wrote:
> On Tue, 2015-04-21 at 19:50 -0500, Christoph Lameter wrote:
>
> > With a filesystem the migration can be controlled by the application. It
> can copy stuff whenever it wants to. Having the OS do that behind my back
> > is n
On Tue, Apr 21, 2015 at 07:50:02PM -0500, Christoph Lameter wrote:
> On Tue, 21 Apr 2015, Jerome Glisse wrote:
[ . . . ]
> > Paul is working on a platform that is more advanced than the one HMM tries
> > to address, and I believe the x86 platform will not have functionality
> > such as CAPI, at least
On Tue, Apr 21, 2015 at 07:46:07PM -0400, Jerome Glisse wrote:
> On Tue, Apr 21, 2015 at 02:44:45PM -0700, Paul E. McKenney wrote:
> > Hello!
> >
> > We have some interest in hardware on devices that is cache-coherent
> > with main memory, and in migrating memory between host memory and
> > device
On Tue, 2015-04-21 at 17:57 -0700, Paul E. McKenney wrote:
> On Wed, Apr 22, 2015 at 10:42:52AM +1000, Benjamin Herrenschmidt wrote:
> > On Tue, 2015-04-21 at 18:49 -0500, Christoph Lameter wrote:
> > > On Tue, 21 Apr 2015, Paul E. McKenney wrote:
> > >
> > > > Thoughts?
> > >
> > > Use DAX for m
On Tue, 2015-04-21 at 19:50 -0500, Christoph Lameter wrote:
> With a filesystem the migration can be controlled by the application. It
> can copy stuff whenever it wants to. Having the OS do that behind my back
> is not something that feels safe and secure.
But this is not something the user wants
On Wed, Apr 22, 2015 at 10:42:52AM +1000, Benjamin Herrenschmidt wrote:
> On Tue, 2015-04-21 at 18:49 -0500, Christoph Lameter wrote:
> > On Tue, 21 Apr 2015, Paul E. McKenney wrote:
> >
> > > Thoughts?
> >
> > Use DAX for memory instead of the other approaches? That way it is
> > explicitly clea
On Tue, 21 Apr 2015, Jerome Glisse wrote:
> Memory on this device should not be considered as something special
> (even if it is). More below.
Uhh?
> So big use case here, let say you have an application that rely on a
> scientific library that do matrix computation. Your application simply
> us
On Tue, 2015-04-21 at 18:49 -0500, Christoph Lameter wrote:
> On Tue, 21 Apr 2015, Paul E. McKenney wrote:
>
> > Thoughts?
>
> Use DAX for memory instead of the other approaches? That way it is
> explicitly clear what information is put on the CAPI device.
Care to elaborate on what DAX is ?
> >
On Tue, 2015-04-21 at 19:46 -0400, Jerome Glisse wrote:
> On Tue, Apr 21, 2015 at 02:44:45PM -0700, Paul E. McKenney wrote:
> > Hello!
> >
> > We have some interest in hardware on devices that is cache-coherent
> > with main memory, and in migrating memory between host memory and
> > device memory
On Tue, Apr 21, 2015 at 06:49:29PM -0500, Christoph Lameter wrote:
> On Tue, 21 Apr 2015, Paul E. McKenney wrote:
>
> > Thoughts?
>
> Use DAX for memory instead of the other approaches? That way it is
> explicitly clear what information is put on the CAPI device.
>
Memory on this device should
On Tue, 21 Apr 2015, Paul E. McKenney wrote:
> Thoughts?
Use DAX for memory instead of the other approaches? That way it is
explicitly clear what information is put on the CAPI device.
> Although such a device will provide CPU's with cache-coherent
Maybe call this coprocessor like IBM doe
On Tue, Apr 21, 2015 at 02:44:45PM -0700, Paul E. McKenney wrote:
> Hello!
>
> We have some interest in hardware on devices that is cache-coherent
> with main memory, and in migrating memory between host memory and
> device memory. We believe that we might not be the only ones looking
> ahead to
Hello!
We have some interest in hardware on devices that is cache-coherent
with main memory, and in migrating memory between host memory and
device memory. We believe that we might not be the only ones looking
ahead to hardware like this, so please see below for a draft of some
approaches that we