On 03/11/2015 04:56, Xiao Guangrong wrote: > > > On 11/03/2015 05:12 AM, Paolo Bonzini wrote: >> >> >> On 02/11/2015 10:13, Xiao Guangrong wrote: >>> Currently, file_ram_alloc() only works on directory - it creates a file >>> under @path and do mmap on it >>> >>> This patch tries to allow it to work on file directly, if @path is a >>> directory it works as before, otherwise it treats @path as the target >>> file then directly allocate memory from it >>> >>> Signed-off-by: Xiao Guangrong <guangrong.x...@linux.intel.com> >>> --- >>> exec.c | 80 >>> ++++++++++++++++++++++++++++++++++++++++++------------------------ >>> 1 file changed, 51 insertions(+), 29 deletions(-) >>> >>> diff --git a/exec.c b/exec.c >>> index 9075f4d..db0fdaf 100644 >>> --- a/exec.c >>> +++ b/exec.c >>> @@ -1174,14 +1174,60 @@ void qemu_mutex_unlock_ramlist(void) >>> } >>> >>> #ifdef __linux__ >>> +static bool path_is_dir(const char *path) >>> +{ >>> + struct stat fs; >>> + >>> + return stat(path, &fs) == 0 && S_ISDIR(fs.st_mode); >>> +} >>> + >>> +static int open_ram_file_path(RAMBlock *block, const char *path, >>> size_t size) >>> +{ >>> + char *filename; >>> + char *sanitized_name; >>> + char *c; >>> + int fd; >>> + >>> + if (!path_is_dir(path)) { >>> + int flags = (block->flags & RAM_SHARED) ? O_RDWR : O_RDONLY; >>> + >>> + flags |= O_EXCL; >>> + return open(path, flags); >>> + } >>> + >>> + /* Make name safe to use with mkstemp by replacing '/' with '_'. */ >>> + sanitized_name = g_strdup(memory_region_name(block->mr)); >>> + for (c = sanitized_name; *c != '\0'; c++) { >>> + if (*c == '/') { >>> + *c = '_'; >>> + } >>> + } >>> + filename = g_strdup_printf("%s/qemu_back_mem.%s.XXXXXX", path, >>> + sanitized_name); >>> + g_free(sanitized_name); >>> + fd = mkstemp(filename); >>> + if (fd >= 0) { >>> + unlink(filename); >>> + /* >>> + * ftruncate is not supported by hugetlbfs in older >>> + * hosts, so don't bother bailing out on errors. >>> + * If anything goes wrong with it under other filesystems, >>> + * mmap will fail. >>> + */ >>> + if (ftruncate(fd, size)) { >>> + perror("ftruncate"); >>> + } >>> + } >>> + g_free(filename); >>> + >>> + return fd; >>> +} >>> + >>> static void *file_ram_alloc(RAMBlock *block, >>> ram_addr_t memory, >>> const char *path, >>> Error **errp) >>> { >>> - char *filename; >>> - char *sanitized_name; >>> - char *c; >>> void *area; >>> int fd; >>> uint64_t pagesize; >>> @@ -1212,38 +1258,14 @@ static void *file_ram_alloc(RAMBlock *block, >>> goto error; >>> } >>> >>> - /* Make name safe to use with mkstemp by replacing '/' with '_'. */ >>> - sanitized_name = g_strdup(memory_region_name(block->mr)); >>> - for (c = sanitized_name; *c != '\0'; c++) { >>> - if (*c == '/') >>> - *c = '_'; >>> - } >>> - >>> - filename = g_strdup_printf("%s/qemu_back_mem.%s.XXXXXX", path, >>> - sanitized_name); >>> - g_free(sanitized_name); >>> + memory = ROUND_UP(memory, pagesize); >>> >>> - fd = mkstemp(filename); >>> + fd = open_ram_file_path(block, path, memory); >>> if (fd < 0) { >>> error_setg_errno(errp, errno, >>> "unable to create backing store for path >>> %s", path); >>> - g_free(filename); >>> goto error; >>> } >>> - unlink(filename); >>> - g_free(filename); >>> - >>> - memory = ROUND_UP(memory, pagesize); >>> - >>> - /* >>> - * ftruncate is not supported by hugetlbfs in older >>> - * hosts, so don't bother bailing out on errors. >>> - * If anything goes wrong with it under other filesystems, >>> - * mmap will fail. >>> - */ >>> - if (ftruncate(fd, memory)) { >>> - perror("ftruncate"); >>> - } >>> >>> area = qemu_ram_mmap(fd, memory, pagesize, block->flags & >>> RAM_SHARED); >>> if (area == MAP_FAILED) { >>> >> >> I was going to send tomorrow a pull request for a similar patch, >> "backends/hostmem-file: Allow to specify full pathname for backing file". >> >> The main difference seems to be your usage of O_EXCL. Can you explain >> why you added it? > > It' used if we pass a block device as a NVDIMM backend memory: > O_EXCL can be used without O_CREAT if pathname refers to a block > device. If the block device > is in use by the system (e.g., mounted), open() fails with the error EBUSY
That makes sense, but I think it's better to be consistent with the handling of block devices. Block devices do not use O_EXCL when QEMU opens them; I guess in principle it would also be possible to share a single pmem backend between multiple guests. Paolo