Re: [PATCH v5 4/5] dax: fix mapping lifetime handling, convert to __pfn_t + kmap_atomic_pfn_t()

2015-08-13 Thread Boaz Harrosh
On 08/13/2015 06:21 PM, Dan Williams wrote: > On Wed, Aug 12, 2015 at 11:26 PM, Boaz Harrosh wrote: <> > > Hmm, that's not the same block layer I've been working with for the > past several years: > > $ mount /dev/pmem0 /mnt > $ echo namespace0.0

Re: RFC: prepare for struct scatterlist entries without page backing

2015-08-13 Thread Boaz Harrosh
On 08/13/2015 05:40 PM, Christoph Hellwig wrote: > On Wed, Aug 12, 2015 at 03:42:47PM +0300, Boaz Harrosh wrote: >> The support I have suggested and submitted for zone-less sections. >> (In my add_persistent_memory() patchset) >> >> Would work perfectly well and transpar

Re: [PATCH v5 2/5] allow mapping page-less memremaped areas into KVA

2015-08-13 Thread Boaz Harrosh
On 08/13/2015 05:48 PM, Boaz Harrosh wrote: <> > There is already an object that holds a relationship of physical > to Kernel-virtual. It is called a memory-section. Why not just > widen its definition? > BTW: Regarding the "widen its definition" I was thinking of

Re: [PATCH v5 2/5] allow mapping page-less memremaped areas into KVA

2015-08-13 Thread Boaz Harrosh
On 08/13/2015 05:41 PM, Christoph Hellwig wrote: > On Thu, Aug 13, 2015 at 04:23:38PM +0300, Boaz Harrosh wrote: >>> DAX as is races against pmem unbind. A synchronization cost must >>> be paid somewhere to make sure the memremap() mapping is still valid. >> >

Re: [PATCH v5 2/5] allow mapping page-less memremaped areas into KVA

2015-08-13 Thread Boaz Harrosh
On 08/13/2015 05:37 PM, Christoph Hellwig wrote: > Hi Boaz, > > can you please fix your quoting? I read down about 10 pages but still > couldn't find a comment from you. For now I gave up on this mail. > Sorry here: > +void *kmap_atomic_pfn_t(__pfn_t pfn) > +{ > + struct page *page = __pf

Re: [PATCH v5 2/5] allow mapping page-less memremaped areas into KVA

2015-08-13 Thread Boaz Harrosh
On 08/13/2015 03:57 PM, Dan Williams wrote: <> > This is explicitly addressed in the changelog, repeated here: > >> The __pfn_t to resource lookup is indeed inefficient walking of a linked >> list, >> but there are two mitigating factors: >> >> 1/ The number of persistent memory ranges is bounded

Re: [PATCH v5 4/5] dax: fix mapping lifetime handling, convert to __pfn_t + kmap_atomic_pfn_t()

2015-08-12 Thread Boaz Harrosh
On 08/13/2015 06:01 AM, Dan Williams wrote: > The primary source for non-page-backed page-frames to enter the system > is via the pmem driver's ->direct_access() method. The pfns returned by > the top-level bdev_direct_access() may be passed to any other subsystem > in the kernel and those sub-sys

Re: [PATCH v5 2/5] allow mapping page-less memremaped areas into KVA

2015-08-12 Thread Boaz Harrosh
On 08/13/2015 06:01 AM, Dan Williams wrote: > Introduce a type that encapsulates a page-frame-number that can also be > used to encode other information. This other information is the > traditional "page_link" encoding in a scatterlist, but can also denote > "device memory". Where "device memory"

Re: regression introduced by "block: Add support for DAX reads/writes to block devices"

2015-08-12 Thread Boaz Harrosh
On 08/13/2015 12:11 AM, Jeff Moyer wrote: > Boaz Harrosh writes: > >> On 08/07/2015 11:41 PM, Jeff Moyer wrote: >> <> >>> >>>> We need to cope with the case where the end of a partition isn't on a >>>> page boundary though. >>

Re: RFC: prepare for struct scatterlist entries without page backing

2015-08-12 Thread Boaz Harrosh
On 08/12/2015 10:05 AM, Christoph Hellwig wrote: > Dan Williams started to look into addressing I/O to and from > Persistent Memory in his series from June: > > http://thread.gmane.org/gmane.linux.kernel.cross-arch/27944 > > I've started looking into DMA mapping of these SGLs specifically i

Re: [PATCH, RFC 2/2] dax: use range_lock instead of i_mmap_lock

2015-08-12 Thread Boaz Harrosh
On 08/12/2015 12:48 AM, Dave Chinner wrote: > On Tue, Aug 11, 2015 at 04:51:22PM +, Wilcox, Matthew R wrote: >> The race that you're not seeing is page fault vs page fault. Two >> threads each attempt to store a byte to different locations on the >> same page. With a read-mutex to exclude tru

Re: [PATCH, RFC 2/2] dax: use range_lock instead of i_mmap_lock

2015-08-12 Thread Boaz Harrosh
On 08/11/2015 11:26 PM, Kirill A. Shutemov wrote: > On Tue, Aug 11, 2015 at 07:17:12PM +0300, Boaz Harrosh wrote: >> On 08/11/2015 06:28 PM, Kirill A. Shutemov wrote: >>> We also used lock_page() to make sure we shoot out all pages as we don't >>> exclude page fault

Re: [PATCH, RFC 2/2] dax: use range_lock instead of i_mmap_lock

2015-08-11 Thread Boaz Harrosh
less pain. (And is much more efficient say in hot path like application already did fallocate for performance) Thanks Boaz > -Original Message- > From: Boaz Harrosh [mailto:b...@plexistor.com] > Sent: Tuesday, August 11, 2015 7:31 AM > To: Jan Kara; Dave Chinner > Cc: Kiril

Re: [PATCH, RFC 2/2] dax: use range_lock instead of i_mmap_lock

2015-08-11 Thread Boaz Harrosh
On 08/11/2015 06:28 PM, Kirill A. Shutemov wrote: <> >> Hi Jan. So you got me confused above. You say: >> "DAX which needs exclusive access to the page given range in the page >> cache" >> >> but DAX and page-cache are mutually exclusive. I guess you meant the VMA >> range, or the inode->mapp

Re: [PATCH, RFC 2/2] dax: use range_lock instead of i_mmap_lock

2015-08-11 Thread Boaz Harrosh
On 08/11/2015 04:50 PM, Jan Kara wrote: > On Tue 11-08-15 19:37:08, Dave Chinner wrote: The patch below tries to recover some scalability for DAX by introducing per-mapping range lock. >>> >>> So this grows noticeably (3 longs if I'm right) struct address_space and >>> thus struct inode j

Re: [PATCH, RFC 2/2] dax: use range_lock instead of i_mmap_lock

2015-08-11 Thread Boaz Harrosh
On 08/11/2015 12:37 PM, Dave Chinner wrote: > On Tue, Aug 11, 2015 at 10:19:09AM +0200, Jan Kara wrote: >> On Mon 10-08-15 18:14:24, Kirill A. Shutemov wrote: >>> As we don't have struct pages for DAX memory, Matthew had to find a >>> replacement for lock_page() to avoid fault vs. truncate races.

Re: linux-next: error when fetching the osd tree

2015-08-10 Thread Boaz Harrosh
On 08/07/2015 04:24 AM, Joe Perches wrote: > On Fri, 2015-08-07 at 09:01 +1000, Stephen Rothwell wrote: >> Hi Boaz, >> >> Fetching the osd tree >> (git://git.open-osd.org/linux-open-osd.git#linux-next) for the past few >> days has produced this error: >> >> connection reset by peer > > And mail se

Re: regression introduced by "block: Add support for DAX reads/writes to block devices"

2015-08-10 Thread Boaz Harrosh
On 08/07/2015 11:41 PM, Jeff Moyer wrote: <> > >> We need to cope with the case where the end of a partition isn't on a >> page boundary though. > > Well, that's usually done by falling back to buffered I/O. I gave that > a try and panicked the box. :) I'll keep looking into it, but probably >

Re: regression introduced by "block: Add support for DAX reads/writes to block devices"

2015-08-09 Thread Boaz Harrosh
On 08/06/2015 11:34 PM, Dave Chinner wrote: > On Thu, Aug 06, 2015 at 10:52:47AM +0300, Boaz Harrosh wrote: >> On 08/06/2015 06:24 AM, Dave Chinner wrote: >>> On Wed, Aug 05, 2015 at 09:42:54PM -0400, Linda Knippers wrote: >>>> On 08/05/2015 06:01 PM, Dave Chinner wro

Re: regression introduced by "block: Add support for DAX reads/writes to block devices"

2015-08-06 Thread Boaz Harrosh
On 08/06/2015 06:24 AM, Dave Chinner wrote: > On Wed, Aug 05, 2015 at 09:42:54PM -0400, Linda Knippers wrote: >> On 08/05/2015 06:01 PM, Dave Chinner wrote: >>> On Wed, Aug 05, 2015 at 04:19:08PM -0400, Jeff Moyer wrote: <> I sat down with Linda to look into it, and the problem is that mk

Re: [PATCH] fs: dax: do not build on ARC or SH

2015-07-13 Thread Boaz Harrosh
On 07/13/2015 12:52 PM, Geert Uytterhoeven wrote: > From: Arnd Bergmann > > The DAX implementation relies on the architecture to provide a working > copy_user_page() function, as reported by Michael Ellerman's kisskb > build bot: > > fs/dax.c: error: implicit declaration of function 'copy_user_p

Re: [PATCH] mm: avoid setting up anonymous pages into file mapping

2015-07-05 Thread Boaz Harrosh
On 07/05/2015 06:44 PM, Kirill A. Shutemov wrote: >> Again that could mean a theoretical regression for some in-tree driver, >> do you know of any such driver? > > I did very little testing with the patch: boot kvm with Fedora and run > trinity there for a while. More testing is required. > It s

Re: [PATCH] mm: avoid setting up anonymous pages into file mapping

2015-07-05 Thread Boaz Harrosh
On 07/03/2015 05:07 PM, Kirill A. Shutemov wrote: > Reading page fault handler code I've noticed that under right > circumstances kernel would map anonymous pages into file mappings: > if the VMA doesn't have vm_ops->fault() and the VMA wasn't fully > populated on ->mmap(), kernel would handle page

Re: [PATCH v2 5/6] block: Add support for DAX reads/writes to block devices

2015-07-05 Thread Boaz Harrosh
On 07/03/2015 05:40 PM, Matthew Wilcox wrote: > If a block device supports the ->direct_access methods, bypass the normal > DIO path and use DAX to go straight to memcpy() instead of allocating > a DIO and a BIO. > I can't remember the details but I'm not sure it is safe for mmap to go through pa

Re: [PATCH v2 3/6] ext4: Use ext4_get_block_write() for DAX

2015-07-05 Thread Boaz Harrosh
On 07/03/2015 10:07 PM, Theodore Ts'o wrote: > On Fri, Jul 03, 2015 at 02:48:24PM -0400, Matthew Wilcox wrote: >> >> At boot, I "modprobe pmem". > > Is there a reason why it's important to build and load pmem as a > module? If I use CONFIG_BLK_DEV_PMEM=y (which is more convenient > given how I la

Re: [PATCH v2 2/6] dax: Use copy_from_iter_nocache

2015-07-05 Thread Boaz Harrosh
On 07/03/2015 05:40 PM, Matthew Wilcox wrote: > From: Matthew Wilcox > > When userspace does a write, there's no need for the written data to > pollute the CPU cache. This matches the original XIP code. > > Signed-off-by: Matthew Wilcox > --- > fs/dax.c | 2 +- > 1 file changed, 1 insertion(+

Re: [PATCH v2 1/6] dax: Add block size note to documentation

2015-07-05 Thread Boaz Harrosh
On 07/04/2015 08:03 AM, Christoph Hellwig wrote: > On Fri, Jul 03, 2015 at 10:40:38AM -0400, Matthew Wilcox wrote: >> From: Matthew Wilcox >> >> For block devices which are small enough, mkfs will default to creating >> a filesystem with block sizes smaller than page size. > > This seems like an

Re: [PATCH] SCSI-OSD: Delete an unnecessary check before the function call "put_disk"

2015-06-28 Thread Boaz Harrosh
detected by using the Coccinelle software. > > Signed-off-by: Markus Elfring ACK-by: Boaz Harrosh > --- > drivers/scsi/osd/osd_uld.c | 4 +--- > 1 file changed, 1 insertion(+), 3 deletions(-) > > diff --git a/drivers/scsi/osd/osd_uld.c b/drivers/scsi/osd/osd_uld.c > i

Re: [PATCH v2 4/4] arch, x86: cache management apis for persistent memory

2015-06-01 Thread Boaz Harrosh
Forgot one thing On 06/01/2015 02:39 PM, Boaz Harrosh wrote: >> +static inline void persistent_copy(void *dst, const void *src, size_t n) Could we please make this memcpy_persistent Same as: copy_from_user_nocache The generic name of what it does first then the special ov

Re: [PATCH v2 4/4] arch, x86: cache management apis for persistent memory

2015-06-01 Thread Boaz Harrosh
On 05/30/2015 09:59 PM, Dan Williams wrote: > From: Ross Zwisler > > Based on an original patch by Ross Zwisler [1]. > > Writes to persistent memory have the potential to be posted to cpu > cache, cpu write buffers, and platform write buffers (memory controller) > before being committed to persi

Re: linux-next: hang while trying to fetch the osd tree

2015-05-31 Thread Boaz Harrosh
On 05/28/2015 11:39 AM, Boaz Harrosh wrote: > On 05/28/2015 03:47 AM, Stephen Rothwell wrote: >> Hi Boaz, >> >> Trying to fetch the osd tree >> (git://git.open-osd.org/linux-open-osd.git#linux-next) just hangs after >> connecting. :-( >> > > Yes I

Re: linux-next: hang while trying to fetch the osd tree

2015-05-28 Thread Boaz Harrosh
On 05/28/2015 03:47 AM, Stephen Rothwell wrote: > Hi Boaz, > > Trying to fetch the osd tree > (git://git.open-osd.org/linux-open-osd.git#linux-next) just hangs after > connecting. :-( > Yes I've seen that too. I will wait until Sunday (5/31) and if it does not come back up I will switch to anot

Re: [PATCH 0/8] ARM: mvebu: Add support for RAID6 PQ offloading

2015-05-27 Thread Boaz Harrosh
On 05/26/2015 07:31 PM, Dan Williams wrote: > [ adding Boaz as this discussion has implications for ore_raid ] > <> >> You're not talking about deprecating it, you're talking about removing >> it entirely. > > True, and adding more users makes that removal more difficult. I'm > willing to help o

Re: [Linux-nvdimm] [GIT PULL] PMEM driver for v4.1

2015-05-27 Thread Boaz Harrosh
On 05/27/2015 11:11 AM, Christoph Hellwig wrote: > On Wed, May 27, 2015 at 11:10:21AM +0300, Boaz Harrosh wrote: >> Hu funny I just looked and I see with ./check auto I get >> generic/018 1s ... [not run] defragmentation not supported for fstype "m1fs" >> generic

Re: [Linux-nvdimm] [GIT PULL] PMEM driver for v4.1

2015-05-27 Thread Boaz Harrosh
On 05/26/2015 10:31 PM, Matthew Wilcox wrote: > On Tue, May 26, 2015 at 11:41:41AM +0300, Boaz Harrosh wrote: >> I would please like to help. What is the breakage you >> see with DAX. >> >> I'm routinely testing with DAX so it is a surprise, >> Though I

Re: [Linux-nvdimm] [GIT PULL] PMEM driver for v4.1

2015-05-26 Thread Boaz Harrosh
On 05/25/2015 09:16 PM, Matthew Wilcox wrote: <> > > Ingo, this sucks. You collapsed all of the separate patches into a > single "add new driver" patch, which makes it impossible to bisect which > of the recent changes broke xfstests. Please don't do this again. > Matthew hi Below is a splito

Re: [RFC] block: remove never-modified global variable

2015-05-20 Thread Boaz Harrosh
On 05/19/2015 08:58 PM, Brian Norris wrote: > On Tue, May 19, 2015 at 10:55:32AM -0700, Brian Norris wrote: <> > > Or rather, just make the above line: > > return ERR_PTR(err); > Yes my thoughts too, thanks sorry about that. Thanks Boaz >>> } > > Brian > -- To unsubscribe from this

Re: [RFC] block: remove never-modified global variable

2015-05-19 Thread Boaz Harrosh
On 05/19/2015 10:19 AM, Boaz Harrosh wrote: <> > But specially now that you are unconditionally printing it. It is better > to just combine the two statements. See suggested patch below: > Actually we can even do better: diff --git a/block/partitions/check.c b/block/partition

Re: [RFC] block: remove never-modified global variable

2015-05-19 Thread Boaz Harrosh
Signed-off-by: Brian Norris > Cc: Christoph Hellwig > Cc: Boaz Harrosh Reviewed-by: Boaz Harrosh I have also tested it by returning error from read of sector zero. And the print prints. Of course. No one ever turns it off. Some comments > Cc: Jens Axboe > --- > Only compile tested f

Re: [FYI] tux3: Core changes

2015-05-18 Thread Boaz Harrosh
On 05/18/2015 05:20 AM, Rik van Riel wrote: > On 05/17/2015 09:26 AM, Boaz Harrosh wrote: >> On 05/14/2015 03:59 PM, Rik van Riel wrote: >>> On 05/14/2015 04:26 AM, Daniel Phillips wrote: >>>> Hi Rik, >> <> >>> >>> The issue is that th

Re: [FYI] tux3: Core changes

2015-05-17 Thread Boaz Harrosh
On 05/14/2015 03:59 PM, Rik van Riel wrote: > On 05/14/2015 04:26 AM, Daniel Phillips wrote: >> Hi Rik, <> > > The issue is that things like ptrace, AIO, infiniband > RDMA, and other direct memory access subsystems can take > a reference to page A, which Tux3 clones into a new page B > when the pr

Re: [PATCH 1/3] pmem: Initial version of persistent memory driver

2015-05-07 Thread Boaz Harrosh
On 05/07/2015 10:26 AM, Christoph Hellwig wrote: > On Mon, May 04, 2015 at 10:43:01AM -0600, Ross Zwisler wrote: >>> Yes, if CONFIG_DEBUG_BLOCK_EXT_DEVT isn't set that code doesn't work at all. >> >> I can't figure out a use case that breaks when using dynamically allocated >> minors without CONFIG

Re: [PATCH v2] Support for write stream IDs

2015-05-06 Thread Boaz Harrosh
On 05/06/2015 01:09 AM, Martin K. Petersen wrote: >> "Jens" == Jens Axboe writes: <> > > The only sensible solution is for the kernel to manage the stream > IDs. And for them to be plentiful. The storage device is free to ignore > them, do LRU or whatever it pleases to manage them if it has a

Re: [PATCH 18/79] exofs: switch to {simple,page}_symlink_inode_operations

2015-05-06 Thread Boaz Harrosh
On 05/05/2015 08:21 AM, Al Viro wrote: > From: Al Viro > > Signed-off-by: Al Viro ACK-by: Boaz Harrosh Thanks Al so much nicer (And safer) Boaz > --- > fs/exofs/Kbuild| 2 +- > fs/exofs/exofs.h | 4 > fs/exofs/inode.c | 9 + > fs/exofs/na

Re: [PATCH 1/1 linux-next] exofs: convert simple_str to kstr

2015-04-30 Thread Boaz Harrosh
On 04/29/2015 08:58 PM, Fabian Frederick wrote: > replace obsolete function. > > Signed-off-by: Fabian Frederick Thanks. ACK-by: Boaz Harrosh Are you pushing all these through some tree, or You need that I push it? Maybe push all these changes through some central place, like Andr

Re: [PATCH] scatterlist: enable sg chaining for all architectures

2015-04-29 Thread Boaz Harrosh
On 04/29/2015 05:15 AM, James Bottomley wrote: > > Perhaps the best thing to do is just fix target and call it quits? > Right! drivers write code for sg_chaining and on ARCHs that do not support it the code just works. Only the max_sg is smaller and the chaining code never kicks in and is dead c

Re: Using pmem from a driver exposing a memory mapping (mmap) to userspace

2015-04-29 Thread Boaz Harrosh
On 04/28/2015 06:35 PM, Mathieu Desnoyers wrote: > Hi! > > I'm currently adapting lttng-modules to use DAX and pmem. > It will allow LTTng buffers to be recovered after a kernel > crash. I've moved pretty much all struct page pointers to > page frame numbers, as I remember being told that pmem does

Re: [RFC][PATCHSET] non-recursive link_path_walk() and reducing stack footprint

2015-04-21 Thread Boaz Harrosh
On 04/21/2015 06:45 PM, Al Viro wrote: > On Tue, Apr 21, 2015 at 05:12:01PM +0200, Richard Weinberger wrote: > >> I'm pretty sure we can kill it. I had the plan to rip it out during this >> merge window >> along with other broken UML stuff but I was too late to ask on the UML >> mailinglist >> i

Re: [PATCH 01/21] e820, efi: add ACPI 6.0 persistent memory types

2015-04-19 Thread Boaz Harrosh
" E820_PRAM vs standard "Persistent I/O Memory" > E820_PMEM. > > Cc: Andy Lutomirski > Cc: Boaz Harrosh > Cc: H. Peter Anvin > Cc: Jens Axboe > Cc: Ingo Molnar > Cc: Christoph Hellwig > Signed-off-by: Dan Williams > Reviewed-by: Ross Zwisler

Re: [GIT PULL] PMEM driver for v4.1

2015-04-14 Thread Boaz Harrosh
On 04/14/2015 03:41 PM, Ingo Molnar wrote: > > * Christoph Hellwig wrote: > >> On Mon, Apr 13, 2015 at 12:45:32PM +0200, Ingo Molnar wrote: >>> Btw., what's the future design plan here? Enable struct page backing, >>> or provide special codepaths for all DAX uses like the special pte >>> based

Re: [Linux-nvdimm] [GIT PULL] PMEM driver for v4.1

2015-04-13 Thread Boaz Harrosh
On 04/13/2015 08:19 PM, Christoph Hellwig wrote: > On Mon, Apr 13, 2015 at 02:11:56PM +0300, Yigal Korman wrote: >> mlock() > > DAX files always are in-memory so this just sounds like an oversight. > method. Yes mlock on DAX can just return true, but mlock implies MAP_POPULATE. Which means "I wo

Re: [GIT PULL] PMEM driver for v4.1

2015-04-13 Thread Boaz Harrosh
On 04/13/2015 03:35 PM, Ingo Molnar wrote: > <> > > How does splice work with DAX files? AFAICS vmsplice() won't work, as > it uses get_user_pages(), which needs struct page backing. Also, how > will f_op->sendpage() work? That too needs page backing. > I'm not sure about f_op->sendpage(). I

Re: [PATCH A+B] pmem: Add prints at module load and unload

2015-04-13 Thread Boaz Harrosh
On 04/13/2015 03:36 PM, Greg KH wrote: > if you are relying on kernel log messages for > specific things to happen in your system, you are doing it wrong as they > can change and disappear in any future kernel release, they are NOT an > api. > Again. I am not doing anything with these messages. Y

Re: Regression caused by using node_to_bdi()

2015-04-13 Thread Boaz Harrosh
On 04/13/2015 03:21 PM, Jan Kara wrote: <> >> -struct backing_dev_info *inode_to_bdi(struct inode *inode); >> +struct backing_dev_info *__inode_to_bdi(struct inode *inode); >> + >> +static inline >> +struct backing_dev_info *inode_to_bdi(struct inode *inode) >> +{ >> +if (!inode || !inode->i_sb

Re: Regression caused by using node_to_bdi()

2015-04-13 Thread Boaz Harrosh
On 04/13/2015 01:22 PM, Zhao Lei wrote: <> > A new bad news: > This patch make filesystem unstable. > Rrrr yes sorry Lei. Why this boots my systems is not clear this is not what I intended to write. Here is what I meant to write (replacing the old one): diff --git a/fs/fs-writeback.c b/fs/f

Re: [GIT PULL] PMEM driver for v4.1

2015-04-13 Thread Boaz Harrosh
On 04/13/2015 01:45 PM, Ingo Molnar wrote: > > * Christoph Hellwig wrote: > >> On Mon, Apr 13, 2015 at 11:33:09AM +0200, Ingo Molnar wrote: >>> Limitations: this is a regular block device, and since the pmem areas >>> are not struct page backed, they are invisible to the rest of the >>> system

Re: [PATCH A+B] pmem: Add prints at module load and unload

2015-04-13 Thread Boaz Harrosh
On 04/13/2015 12:05 PM, Greg KH wrote: > On Tue, Apr 07, 2015 at 06:46:15PM +0300, Boaz Harrosh wrote: >> Hi Christoph, Ingo >> >> It is important in the lab for postmortem analysis to know if >> pmem driver loaded and/or unloaded. And the return code from this >

Re: Regression caused by using node_to_bdi()

2015-04-12 Thread Boaz Harrosh
On 04/12/2015 02:33 PM, Boaz Harrosh wrote: > On 04/10/2015 02:25 PM, Zhao Lei wrote: >> Hi, Christoph Hellwig >> <> >> >> Is there some way to speed up it(inline, or some access some variant >> in struct directly, ...)? >> > > Christoph hi >

Re: Regression caused by using node_to_bdi()

2015-04-12 Thread Boaz Harrosh
On 04/10/2015 02:25 PM, Zhao Lei wrote: > Hi, Christoph Hellwig > > resend: + cc lkml, linux-fsdevel > > Since there is no response for my last mail, I worry that some problem in > the mail system, please allow me to resend it. > > I found regression in v4.0-rc1 caused by this patch: > Author:

[PATCH 1B] pmem: Add prints at module load and unload

2015-04-07 Thread Boaz Harrosh
m modprobe: [ +0.000537] pmem: init 2 devices => 0 ... Printed by pmem modprobe -r: [ +0.000537] pmem: exit Signed-off-by: Boaz Harrosh --- drivers/block/pmem.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/block/pmem.c b/drivers/block/pmem.c index 988f384..44d3f33 100644

[PATCH 1A] pmem: Add prints at pmem_probe/remove

2015-04-07 Thread Boaz Harrosh
: [ +16.299145] pmem pmem.1.auto: remove [ +0.011155] pmem pmem.0.auto: remove Signed-off-by: Boaz Harrosh --- drivers/block/pmem.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/block/pmem.c b/drivers/block/pmem.c index 988f384..36017f1 100644 --- a/drivers/block/pmem.c +++ b

[PATCH A+B] pmem: Add prints at module load and unload

2015-04-07 Thread Boaz Harrosh
mem.0.auto: remove Signed-off-by: Boaz Harrosh --- [PATCH 1B] pmem: Add prints at module load/unload When debugging people's systems it is helpful to see what went on. The load and unload of pmem is an important event. The importance is the number of loaded devices and error status.

Re: [Linux-nvdimm] [PATCH] pmem: Add prints at module load and unload

2015-04-07 Thread Boaz Harrosh
On 04/07/2015 06:19 PM, Christoph Hellwig wrote: > On Sun, Apr 05, 2015 at 11:50:06AM +0300, Boaz Harrosh wrote: >> [ +0.000537] pmem: init 2 devices => 0 >> >> So I have all the information. And I know the driver was actually loaded >> successfully on the expecte

Re: Why not build kernel with -O3

2015-04-07 Thread Boaz Harrosh
On 04/07/2015 09:43 AM, Mike Galbraith wrote: > On Tue, 2015-04-07 at 11:37 +0800, Pengfei Yuan wrote: >> Hi, >> >> I have conducted some experiments to compare kernels built with -O2 >> and -O3. Here are the results: >> >> Application Performance O2 Performance O3 Improvement >> Apache

Re: [PATCH v2] x86: Revert E820_PRAM change in e820_end_pfn()

2015-04-06 Thread Boaz Harrosh
E820_PRAM ranges. But E820_PRAM ranges will have the possibility for struct-page. That said I have tested with this patch + struct-page and Tested-by: Boaz Harrosh Comments below ... > Revert the change made to account > E820_PRAM as RAM in e820.c in the commit. > > Signed-off

Re: [Linux-nvdimm] [PATCH 1/2] x86: add support for the non-standard protected e820 type

2015-04-06 Thread Boaz Harrosh
On 04/05/2015 11:06 PM, Yinghai Lu wrote: > On Sun, Apr 5, 2015 at 2:18 AM, Boaz Harrosh wrote: <> >> Hi Yinghai, Toshi >> >> In my old patches I did not have these updates as well, and everything >> was very much usable, for a long time. >> >

Re: [Linux-nvdimm] [PATCH 1/2] x86: add support for the non-standard protected e820 type

2015-04-05 Thread Boaz Harrosh
On 04/03/2015 08:12 PM, Yinghai Lu wrote: > On Fri, Apr 3, 2015 at 9:14 AM, Toshi Kani wrote: >> On Wed, 2015-04-01 at 09:12 +0200, Christoph Hellwig wrote: >> : >>> @@ -748,7 +758,7 @@ u64 __init early_reserve_e820(u64 size, u64 align) >>> /* >>> * Find the highest page frame number we have

Re: [Linux-nvdimm] [PATCH] pmem: Add prints at module load and unload

2015-04-05 Thread Boaz Harrosh
On 04/02/2015 07:44 PM, Christoph Hellwig wrote: > On Thu, Apr 02, 2015 at 09:01:14AM -0700, Dan Williams wrote: If anything I think these should be dev_dbg(). >>> >>> We do not have a dev at any of this point, and it does not >>> belong to any specific device. >> >> Ah, true this is prior to

Re: [Linux-nvdimm] [PATCH] pmem: Add prints at module load and unload

2015-04-02 Thread Boaz Harrosh
On 04/02/2015 06:39 PM, Dan Williams wrote: > On Thu, Apr 2, 2015 at 8:31 AM, Boaz Harrosh wrote: >> Hi Christoph, Ingo >> >> Please consider this small patch below just a small print at module >> load/unload so to know at user systems how things progressed. >> A

[PATCH] pmem: Add prints at module load and unload

2015-04-02 Thread Boaz Harrosh
Hi Christoph, Ingo Please consider this small patch below just a small print at module load/unload so to know at user systems how things progressed. As it is now, we know nothing. For any other disk kind we have two tons of prints. --- From: Boaz Harrosh Date: Thu, 2 Apr 2015 16:43:48 +0300

Re: [PATCH] SQUASHME: Fixes to e820 handling of pmem

2015-04-02 Thread Boaz Harrosh
On 04/02/2015 12:30 PM, Christoph Hellwig wrote: > On Wed, Apr 01, 2015 at 05:25:22PM +0300, Boaz Harrosh wrote: >> pfn = PFN_DOWN(ei->addr + ei->size); >> >> -switch (ei->type) { >> -case E820_RAM: >> -

Re: [PATCH 2/2] pmem: add a driver for persistent memory

2015-04-01 Thread Boaz Harrosh
This patch contains the initial driver from Ross Zwisler, with > various changes from Boaz Harrosh and me. > > Signed-off-by: Ross Zwisler > [hch: convert to use a platform_device for discovery, fix partition > support, merged various patches from Boaz] > Signed-off-by: Chris

[PATCH] SQUASHME: Fixes to e820 handling of pmem

2015-04-01 Thread Boaz Harrosh
ate. (Actually it will be the opposite right). Can we actually define swap on a /dev/pmemX ? ;-) Signed-off-by: Boaz Harrosh --- Documentation/kernel-parameters.txt | 6 ++ arch/x86/kernel/e820.c | 20 +--- 2 files changed, 15 insertions(+), 11 deletions(-) d

Re: another pmem variant V2

2015-04-01 Thread Boaz Harrosh
On 03/31/2015 07:16 PM, Christoph Hellwig wrote: > On Tue, Mar 31, 2015 at 06:14:15PM +0300, Boaz Harrosh wrote: >> We can not accept it as is right now. > > Who is we? > >> We have conducted further tests. And it messes up NUMA. > > Only you if you use the memma

Re: [Linux-nvdimm] another pmem variant V2

2015-04-01 Thread Boaz Harrosh
On 04/01/2015 10:50 AM, Ingo Molnar wrote: > > * Dan Williams wrote: > >> On Tue, Mar 31, 2015 at 10:24 AM, Christoph Hellwig wrote: >>> On Tue, Mar 31, 2015 at 06:44:56PM +0200, Ingo Molnar wrote: I'd be fine with that too - mind sending an updated series? >>> >>> I will send an updated o

Re: [Linux-nvdimm] [PATCH 4/6] SQUSHME: pmem: Micro cleaning

2015-03-31 Thread Boaz Harrosh
On 03/31/2015 06:30 PM, Dan Williams wrote: > On Tue, Mar 31, 2015 at 8:24 AM, Boaz Harrosh wrote: >> On 03/31/2015 06:17 PM, Dan Williams wrote: >>> On Tue, Mar 31, 2015 at 6:27 AM, Boaz Harrosh wrote: >>>> >>>> Some error checks had unlikely so

Re: [Linux-nvdimm] [PATCH 4/6] SQUSHME: pmem: Micro cleaning

2015-03-31 Thread Boaz Harrosh
On 03/31/2015 06:17 PM, Dan Williams wrote: > On Tue, Mar 31, 2015 at 6:27 AM, Boaz Harrosh wrote: >> >> Some error checks had unlikely some did not. Put unlikely >> on all error handling paths. >> (I like unlikely for error paths specially for readability) > >

Re: another pmem variant V2

2015-03-31 Thread Boaz Harrosh
On 03/31/2015 12:25 PM, Christoph Hellwig wrote: > On Thu, Mar 26, 2015 at 06:57:47PM +0200, Boaz Harrosh wrote: <> > > Any news? I'd really like to resend this ASAP to get it into 4.1.. Hi Christoph I hate to be bearer of bad news but we have a problem with the e820 pat

[RFC] SQUASHME: pmem: Split up pmem_probe from pmem_alloc

2015-03-31 Thread Boaz Harrosh
On 03/31/2015 01:25 PM, Boaz Harrosh wrote: <> > > And one last issue. I have some configuration "hardness" with the > memmap=nn!aa Kernel command line API, it was better for me with the > pmem map= module param. Will you be OK if I split pmem_probe() into > calling

Re: [Linux-nvdimm] [PATCH] SQUASHME: Streamline pmem.c

2015-03-31 Thread Boaz Harrosh
On 03/27/2015 01:31 AM, Dan Williams wrote: > On Thu, Mar 26, 2015 at 10:02 AM, Boaz Harrosh wrote: >> >> Christoph why did you choose the fat and ugly version of >> pmem.c beats me. > > Boaz, I am so very tired of your snide commentary. It severely > detracts fr

[PATCH 6/6] SQUASHME: pmem: Remove "... based on brd.c" + Copyright

2015-03-31 Thread Boaz Harrosh
The driver is no longer similar to brd.c. Christoph completely changed the 2nd half of the patch and I completely removed the 1st half. pmem is its own thing. Also added Copyright of Christoph and me. Feel free to remove if it is not so. Signed-off-by: Boaz Harrosh --- drivers/block/pmem.c

[PATCH 5/6] SQUASHME: pmem: Remove SECTOR_SHIFT

2015-03-31 Thread Boaz Harrosh
Remove SECTOR_SHIFT. It is defined in 6 other places in the Kernel. I do not like a new one. 9 is used throughout, including block core. I do not like pmem to blaspheme more than needed. Signed-off-by: Boaz Harrosh --- drivers/block/pmem.c | 6 +- 1 file changed, 1 insertion(+), 5

[PATCH 4/6] SQUSHME: pmem: Micro cleaning

2015-03-31 Thread Boaz Harrosh
Some error checks had unlikely some did not. Put unlikely on all error handling paths. (I like unlikely for error paths specially for readability) Also use bio_data_dir() to extract away the READA flag Signed-off-by: Boaz Harrosh --- drivers/block/pmem.c | 16 +++- 1 file changed

[PATCH 2/6] SQUASHME: pmem: Remove getgeo

2015-03-31 Thread Boaz Harrosh
Remove getgeo. It is not needed for modern fdisk and was never needed for libgparted and cfdisk. Signed-off-by: Boaz Harrosh --- drivers/block/pmem.c | 10 -- 1 file changed, 10 deletions(-) diff --git a/drivers/block/pmem.c b/drivers/block/pmem.c index 545b13b..dcb524f 100644 --- a

[PATCH 3/6] SQUASHME: pmem: Streamline pmem driver

2015-03-31 Thread Boaz Harrosh
does these checks and I did not see these checks done in other drivers. Signed-off-by: Boaz Harrosh --- drivers/block/pmem.c | 112 ++- 1 file changed, 22 insertions(+), 90 deletions(-) diff --git a/drivers/block/pmem.c b/drivers/block/pmem.c index

[PATCH 1/6] SQUASHME: Don't let e820_PMEM sections

2015-03-31 Thread Boaz Harrosh
need this patch for now. I have booting problems on some machines, when a contiguous pmem range crosses a NUMA boundary. Signed-off-by: Boaz Harrosh --- arch/x86/kernel/e820.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c index

[SQUASHME 0/6] Streamline of Initial pmem submission

2015-03-31 Thread Boaz Harrosh
On 03/31/2015 12:25 PM, Christoph Hellwig wrote: > > Any news? I'd really like to resend this ASAP to get it into 4.1.. > Hi Christoph, list. I'm sending in one SQUASHME patch to the first e820 patch, and a few other patches to the initial pmem.c driver submission. Please squash those patches,

Re: another pmem variant V2

2015-03-31 Thread Boaz Harrosh
On 03/31/2015 01:25 PM, Boaz Harrosh wrote: > On 03/31/2015 12:25 PM, Christoph Hellwig wrote: <> > The problem I see is that if I state a memmap=nn!aa that crosses a NUMA > boundary then the machine will not boot. > So BTW for sure I need that "don't merge E82

Re: another pmem variant V2

2015-03-31 Thread Boaz Harrosh
On 03/31/2015 12:25 PM, Christoph Hellwig wrote: > On Thu, Mar 26, 2015 at 06:57:47PM +0200, Boaz Harrosh wrote: >> On 03/26/2015 10:32 AM, Christoph Hellwig wrote: >>> Here is another version of the same trivial pmem driver, because two >>> obviously aren't enou

Re: Should implementations of ->direct_access be allowed to sleep?

2015-03-29 Thread Boaz Harrosh
On 03/29/2015 11:02 AM, Boaz Harrosh wrote: > On 03/26/2015 09:32 PM, Dave Chinner wrote: <> > I think that ->direct_access should not be any different than > any other block-device access, i.e. be allowed to sleep. > BTW: Matthew you yourself have said that after a page-load of

Re: Should implementations of ->direct_access be allowed to sleep?

2015-03-29 Thread Boaz Harrosh
On 03/26/2015 09:32 PM, Dave Chinner wrote: <> >> I'm leaning towards the latter. But I'm not sure what GFP flags to >> recommend that brd use ... GFP_NOWAIT | __GFP_ZERO, perhaps? > > What, so we get random IO failures under memory pressure? > > I really think we should allow .direct_access to

Re: another pmem variant V2

2015-03-26 Thread Boaz Harrosh
On 03/26/2015 07:18 PM, Christoph Hellwig wrote: > On Thu, Mar 26, 2015 at 06:57:47PM +0200, Boaz Harrosh wrote: >> For one this auto discovery of yours is very (very) nice but is a bit >> inconvenient. Before I would reserve a big chunk on each NUMA range >> on Kernel's

[PATCH] SQUASHME: Streamline pmem.c

2015-03-26 Thread Boaz Harrosh
is used throughout, including block core. I do not like pmem to blaspheme more than needed. * More style stuff ... Please squash into your initial submission Signed-off-by: Boaz Harrosh --- drivers/block/pmem.c | 137 +++ 1 file changed, 28

Re: another pmem variant V2

2015-03-26 Thread Boaz Harrosh
On 03/26/2015 10:32 AM, Christoph Hellwig wrote: > Here is another version of the same trivial pmem driver, because two > obviously aren't enough. The first patch is the same pmem driver > that Ross posted a short time ago, just modified to use platform_devices > to find the persistent memory regi

Re: [Linux-nvdimm] [PATCH 2/3] x86: add a is_e820_ram() helper

2015-03-26 Thread Boaz Harrosh
On 03/26/2015 06:02 PM, Dan Williams wrote: > On Thu, Mar 26, 2015 at 8:49 AM, Boaz Harrosh wrote: >> On 03/26/2015 11:34 AM, Christoph Hellwig wrote: >>> +/* >>> + * This is a non-standardized way to represent ADR or NVDIMM regions that >>> + * persist ov

Re: [PATCH 2/3] x86: add a is_e820_ram() helper

2015-03-26 Thread Boaz Harrosh
On 03/26/2015 11:34 AM, Christoph Hellwig wrote: <> Please re-post this patch standalone, because git am on this will give me the wrong title and commit message. Small comments ... > From: Christoph Hellwig > Date: Wed, 25 Mar 2015 12:24:11 +0100 > Subject: x86: add support for the non-standard

Re: [Linux-nvdimm] [PATCH 1/3] pmem: Initial version of persistent memory driver

2015-03-26 Thread Boaz Harrosh
On 03/26/2015 04:12 PM, Dan Williams wrote: > On Thu, Mar 26, 2015 at 1:32 AM, Christoph Hellwig wrote: >> From: Ross Zwisler >> Dan, something is broken with your mailer program; it keeps dropping the CC when sending replies. For example, both Ross and I, who were on CC, got dropped, Jens Axboe tho

Re: [PATCH 1/8] pmem: Initial version of persistent memory driver

2015-03-26 Thread Boaz Harrosh
On 03/26/2015 06:00 AM, Elliott, Robert (Server Storage) wrote: > > >> -Original Message- >> From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel- >> ow...@vger.kernel.org] On Behalf Of Andy Lutomirski >> Sent: Wednesday, March 18, 2015 1:07 PM >

Re: [RFC PATCH 0/7] evacuate struct page from the block layer

2015-03-24 Thread Boaz Harrosh
On 03/23/2015 05:19 PM, Rik van Riel wrote: >>> Michael Tsirkin and I have been doing some thinking about what >>> it would take to allocate struct pages per 2MB area permanently, >>> and allocate additional struct pages for 4kB pages on demand, >>> when a 2MB area is broken up into 4kB pages. >> >

Re: [RFC PATCH 0/7] evacuate struct page from the block layer

2015-03-22 Thread Boaz Harrosh
On 03/22/2015 07:22 PM, Dan Williams wrote: > On Sun, Mar 22, 2015 at 10:06 AM, Boaz Harrosh wrote: <> >> >> Moving to pfn's only means that all this unnamed code above that >> "relies on struct page being PAGE_SIZE" is now not allowed to >> interf
