On Jan 14, 2014, at 10:39, Tom Lane wrote:
> James Bottomley writes:
>> The current mechanism for coherency between a userspace cache and the
>> in-kernel page cache is mmap ... that's the only way you get the same
>> page in both currently.
>
> Right.
>
>> glibc used to have an implementatio
On Tue, 2014-01-14 at 15:39 +0100, Hannu Krosing wrote:
> On 01/14/2014 09:39 AM, Claudio Freire wrote:
> > On Tue, Jan 14, 2014 at 5:08 AM, Hannu Krosing
> > wrote:
> >> Again, as said above the linux file system is doing fine. What we
> >> want is a few ways to interact with it to let it do eve
On Mon, 2014-01-13 at 19:48 -0500, Trond Myklebust wrote:
> On Jan 13, 2014, at 19:03, Hannu Krosing wrote:
>
> > On 01/13/2014 09:53 PM, Trond Myklebust wrote:
> >> On Jan 13, 2014, at 15:40, Andres Freund wrote:
> >>
> >>> On 2014-01-13 15:15:16 -0500, Robert Haas wrote:
> On Mon, Jan 13
On Tue, Jan 14, 2014 at 02:26:25AM +0100, Andres Freund wrote:
> On 2014-01-13 17:13:51 -0800, James Bottomley wrote:
> > a file into a user provided buffer, thus obtaining a page cache entry
> > and a copy in their userspace buffer, then insert the page of the user
> > buffer back into the page ca
On Mon, Jan 13, 2014 at 03:24:38PM -0800, Josh Berkus wrote:
> On 01/13/2014 02:26 PM, Mel Gorman wrote:
> > Really?
> >
> > zone_reclaim_mode is often a complete disaster unless the workload is
> > partitioned to fit within NUMA nodes. On older kernels enabling it would
> > sometimes cause massiv
On Tue 14-01-14 11:11:28, Heikki Linnakangas wrote:
> On 01/14/2014 12:26 AM, Mel Gorman wrote:
> >On Mon, Jan 13, 2014 at 03:15:16PM -0500, Robert Haas wrote:
> >>The other thing that comes to mind is the kernel's caching behavior.
> >>We've talked a lot over the years about the difficulties of ge
On Tue 14-01-14 09:08:40, Hannu Krosing wrote:
> >>> Effectively you end up with buffered read/write that's also mapped into
> >>> the page cache. It's a pretty awful way to hack around mmap.
> >> Well, the problem is that you can't really use mmap() for the things we
> >> do. Postgres' durability
Trond Myklebust writes:
> On Jan 14, 2014, at 10:39, Tom Lane wrote:
>> "Don't be aggressive" isn't good enough. The prohibition on early write
>> has to be absolute, because writing a dirty page before we've done
>> whatever else we need to do results in a corrupt database. It has to
>> be tre
On Tue, Jan 14, 2014 at 12:42 PM, Trond Myklebust wrote:
>> James Bottomley writes:
>>> The current mechanism for coherency between a userspace cache and the
>>> in-kernel page cache is mmap ... that's the only way you get the same
>>> page in both currently.
>>
>> Right.
>>
>>> glibc used to hav
James Bottomley writes:
> The current mechanism for coherency between a userspace cache and the
> in-kernel page cache is mmap ... that's the only way you get the same
> page in both currently.
Right.
> glibc used to have an implementation of read/write in terms of mmap, so
> it should be possib
On Tue, Jan 14, 2014 at 5:00 AM, Jan Kara wrote:
> I thought that instead of injecting pages into pagecache for aging as you
> describe in 3), you would mark pages as volatile (i.e. for reclaim by
> kernel) through vrange() syscall. Next time you need the page, you check
> whether the kernel recla
On Tue, Jan 14, 2014 at 3:39 AM, Claudio Freire wrote:
> On Tue, Jan 14, 2014 at 5:08 AM, Hannu Krosing wrote:
>> Again, as said above the linux file system is doing fine. What we
>> want is a few ways to interact with it to let it do even better when
>> working with postgresql by telling it some
On Tue, Jan 14, 2014 at 11:39 AM, Hannu Krosing wrote:
> On 01/14/2014 09:39 AM, Claudio Freire wrote:
>> On Tue, Jan 14, 2014 at 5:08 AM, Hannu Krosing wrote:
>>> Again, as said above the linux file system is doing fine. What we
>>> want is a few ways to interact with it to let it do even better
First off, I want to give a +1 on everything in the recent posts
from Heikki and Hannu.
Jan Kara wrote:
> Now the aging of pages marked as volatile as it is currently
> implemented needn't be perfect for your needs but you still have
> time to influence what gets implemented... Actually develope
On 01/14/2014 09:39 AM, Claudio Freire wrote:
> On Tue, Jan 14, 2014 at 5:08 AM, Hannu Krosing wrote:
>> Again, as said above the linux file system is doing fine. What we
>> want is a few ways to interact with it to let it do even better when
>> working with postgresql by telling it some stuff it
On Tue, Jan 14, 2014 at 5:08 AM, Hannu Krosing wrote:
> Again, as said above the linux file system is doing fine. What we
> want is a few ways to interact with it to let it do even better when
> working with postgresql by telling it some stuff it otherwise would
> have to second guess and by somet
On 01/14/2014 03:44 AM, Dave Chinner wrote:
> On Tue, Jan 14, 2014 at 02:26:25AM +0100, Andres Freund wrote:
>> On 2014-01-13 17:13:51 -0800, James Bottomley wrote:
>>> a file into a user provided buffer, thus obtaining a page cache entry
>>> and a copy in their userspace buffer, then insert the pa
On 01/13/2014 05:30 PM, Dave Chinner wrote:
> On Mon, Jan 13, 2014 at 03:24:38PM -0800, Josh Berkus wrote:
> No matter what default NUMA allocation policy we set, there will be
> an application for which that behaviour is wrong. As such, we've had
> tools for setting application specific NUMA polic
On 2014-01-13 17:13:51 -0800, James Bottomley wrote:
> a file into a user provided buffer, thus obtaining a page cache entry
> and a copy in their userspace buffer, then insert the page of the user
> buffer back into the page cache as the page cache page ... that's right,
> isn't it postgress peopl
On Jan 13, 2014, at 19:03, Hannu Krosing wrote:
> On 01/13/2014 09:53 PM, Trond Myklebust wrote:
>> On Jan 13, 2014, at 15:40, Andres Freund wrote:
>>
>>> On 2014-01-13 15:15:16 -0500, Robert Haas wrote:
On Mon, Jan 13, 2014 at 1:51 PM, Kevin Grittner wrote:
> I notice, Josh, that yo
On Mon 13-01-14 22:36:06, Mel Gorman wrote:
> On Mon, Jan 13, 2014 at 06:27:03PM -0200, Claudio Freire wrote:
> > On Mon, Jan 13, 2014 at 5:23 PM, Jim Nasby wrote:
> > > On 1/13/14, 2:19 PM, Claudio Freire wrote:
> > >>
> > >> On Mon, Jan 13, 2014 at 5:15 PM, Robert Haas
> > >> wrote:
> > >>>
> >
The issue with O_DIRECT is actually a much more general issue ---
namely, database programmers that for various reasons decide they
don't want to go down the O_DIRECT route, but then care about
performance. PostgreSQL is not the only database that has had this
issue.
There are two papers at this
On Jan 13, 2014, at 16:03, Robert Haas wrote:
> On Mon, Jan 13, 2014 at 3:53 PM, Trond Myklebust wrote:
>> O_DIRECT was specifically designed to solve the problem of double buffering
>> between applications and the kernel. Why are you not able to use that in
>> these situations?
>
> O_DIRECT
On Mon 13-01-14 22:26:45, Mel Gorman wrote:
> The flipside is also meant to hold true. If you know data will be needed
> in the near future then posix_fadvise(POSIX_FADV_WILLNEED). Glancing at
> the implementation it does a forced read-ahead on the range of pages of
> interest. It doesn't look like
On Mon, 2014-01-13 at 22:12 +0100, Andres Freund wrote:
> On 2014-01-13 12:34:35 -0800, James Bottomley wrote:
> > On Mon, 2014-01-13 at 14:32 -0600, Jim Nasby wrote:
> > > Well, if we were to collaborate with the kernel community on this then
> > > presumably we can do better than that for evictio
On 1/13/14, 4:47 PM, Jan Kara wrote:
Note to postgres guys: I think you should have a look at the proposed
'vrange' system call. The latest posting is here:
http://www.spinics.net/lists/linux-mm/msg67328.html. It contains a rather
detailed description of the feature. And if the feature looks good
On 1/13/14, 4:44 PM, Andres Freund wrote:
> > One major use case is transplanting a page coming from postgres'
> >buffers into the kernel's buffercache because the latter has a much
> >better chance of properly allocating system resources across independent
> >applications running.
>
>If you wa
On 01/13/2014 09:53 PM, Trond Myklebust wrote:
> On Jan 13, 2014, at 15:40, Andres Freund wrote:
>
>> On 2014-01-13 15:15:16 -0500, Robert Haas wrote:
>>> On Mon, Jan 13, 2014 at 1:51 PM, Kevin Grittner wrote:
I notice, Josh, that you didn't mention the problems many people
have run int
On Mon, Jan 13, 2014 at 11:38:44PM +0100, Jan Kara wrote:
> On Mon 13-01-14 22:26:45, Mel Gorman wrote:
> > The flipside is also meant to hold true. If you know data will be needed
> > in the near future then posix_fadvise(POSIX_FADV_WILLNEED). Glancing at
> > the implementation it does a forced re
On 2014-01-13 14:19:56 -0800, James Bottomley wrote:
> > Frequently mmap()/madvise()/munmap()ing 8kb chunks has
> > horrible consequences for performance/scalability - very quickly you
> > contend on locks in the kernel.
>
> Is this because of problems in the mmap_sem?
It's been a while since I
On 2014-01-13 12:34:35 -0800, James Bottomley wrote:
> On Mon, 2014-01-13 at 14:32 -0600, Jim Nasby wrote:
> > Well, if we were to collaborate with the kernel community on this then
> > presumably we can do better than that for eviction... even to the
> > extent of "here's some data from this range
On Jan 13, 2014, at 15:40, Andres Freund wrote:
> On 2014-01-13 15:15:16 -0500, Robert Haas wrote:
>> On Mon, Jan 13, 2014 at 1:51 PM, Kevin Grittner wrote:
>>> I notice, Josh, that you didn't mention the problems many people
>>> have run into with Transparent Huge Page defrag and with NUMA
>>>
On Mon, 2014-01-13 at 14:32 -0600, Jim Nasby wrote:
> On 1/13/14, 2:27 PM, Claudio Freire wrote:
> > On Mon, Jan 13, 2014 at 5:23 PM, Jim Nasby wrote:
> >> On 1/13/14, 2:19 PM, Claudio Freire wrote:
> >>>
> >>> On Mon, Jan 13, 2014 at 5:15 PM, Robert Haas
> >>> wrote:
>
> On a related n
On Mon, Jan 13, 2014 at 3:53 PM, Trond Myklebust wrote:
> O_DIRECT was specifically designed to solve the problem of double buffering
> between applications and the kernel. Why are you not able to use that in
> these situations?
O_DIRECT was apparently designed by a deranged monkey on some seri
On 2014-01-13 15:53:36 -0500, Trond Myklebust wrote:
> > I've wondered before if there wouldn't be a chance for postgres to say
> > "my dear OS, that the file range 0-8192 of file x contains y, no need to
> > reread" and do that when we evict a page from s_b but I never dared to
> > actually propos