date:20160619

[PATCH v2 10/24] fs: cifs: Replace CURRENT_TIME by current_time()

2016-06-19 Thread Deepa Dinamani

CURRENT_TIME macro is not appropriate for filesystems as it
doesn't use the right granularity for filesystem timestamps.
Use current_time() instead.

This is also in preparation for the patch that transitions
vfs timestamps to use 64 bit time and hence make them
y2038 safe.

CURRENT_TIME macro will be deleted before merging the
aforementioned change.

Change signature of helper cifs_all_info_to_fattr since it
now needs both super_block and cifs_sb_info.

Note: The inode timestamps read from the server are assumed
to have correct granularity and range.

Signed-off-by: Deepa Dinamani 
Cc: Steve French 
---
 fs/cifs/inode.c | 20 +++-
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c
index 9b3d92e..721809e 100644
--- a/fs/cifs/inode.c
+++ b/fs/cifs/inode.c
@@ -320,9 +320,9 @@ cifs_create_dfs_fattr(struct cifs_fattr *fattr, struct 
super_block *sb)
fattr->cf_mode = S_IFDIR | S_IXUGO | S_IRWXU;
fattr->cf_uid = cifs_sb->mnt_uid;
fattr->cf_gid = cifs_sb->mnt_gid;
-   fattr->cf_atime = CURRENT_TIME;
-   fattr->cf_ctime = CURRENT_TIME;
-   fattr->cf_mtime = CURRENT_TIME;
+   ktime_get_real_ts(>cf_mtime);
+   fattr->cf_mtime = timespec_trunc(fattr->cf_mtime, sb->s_time_gran);
+   fattr->cf_atime = fattr->cf_ctime = fattr->cf_mtime;
fattr->cf_nlink = 2;
fattr->cf_flags |= CIFS_FATTR_DFS_REFERRAL;
 }
@@ -584,9 +584,10 @@ static int cifs_sfu_mode(struct cifs_fattr *fattr, const 
unsigned char *path,
 /* Fill a cifs_fattr struct with info from FILE_ALL_INFO */
 static void
 cifs_all_info_to_fattr(struct cifs_fattr *fattr, FILE_ALL_INFO *info,
-  struct cifs_sb_info *cifs_sb, bool adjust_tz,
+  struct super_block *sb, bool adjust_tz,
   bool symlink)
 {
+   struct cifs_sb_info *cifs_sb = CIFS_SB(sb);
struct cifs_tcon *tcon = cifs_sb_master_tcon(cifs_sb);
 
memset(fattr, 0, sizeof(*fattr));
@@ -596,8 +597,10 @@ cifs_all_info_to_fattr(struct cifs_fattr *fattr, 
FILE_ALL_INFO *info,
 
if (info->LastAccessTime)
fattr->cf_atime = cifs_NTtimeToUnix(info->LastAccessTime);
-   else
-   fattr->cf_atime = CURRENT_TIME;
+   else {
+   ktime_get_real_ts(>cf_atime);
+   fattr->cf_atime = timespec_trunc(fattr->cf_atime, 
sb->s_time_gran);
+   }
 
fattr->cf_ctime = cifs_NTtimeToUnix(info->ChangeTime);
fattr->cf_mtime = cifs_NTtimeToUnix(info->LastWriteTime);
@@ -657,7 +660,6 @@ cifs_get_file_info(struct file *filp)
FILE_ALL_INFO find_data;
struct cifs_fattr fattr;
struct inode *inode = file_inode(filp);
-   struct cifs_sb_info *cifs_sb = CIFS_SB(inode->i_sb);
struct cifsFileInfo *cfile = filp->private_data;
struct cifs_tcon *tcon = tlink_tcon(cfile->tlink);
struct TCP_Server_Info *server = tcon->ses->server;
@@ -669,7 +671,7 @@ cifs_get_file_info(struct file *filp)
rc = server->ops->query_file_info(xid, tcon, >fid, _data);
switch (rc) {
case 0:
-   cifs_all_info_to_fattr(, _data, cifs_sb, false,
+   cifs_all_info_to_fattr(, _data, inode->i_sb, false,
   false);
break;
case -EREMOTE:
@@ -751,7 +753,7 @@ cifs_get_inode_info(struct inode **inode, const char 
*full_path,
}
 
if (!rc) {
-   cifs_all_info_to_fattr(, data, cifs_sb, adjust_tz,
+   cifs_all_info_to_fattr(, data, sb, adjust_tz,
   symlink);
} else if (rc == -EREMOTE) {
cifs_create_dfs_fattr(, sb);
-- 
1.9.1

[PATCH v2 01/24] vfs: Add current_time() api

2016-06-19 Thread Deepa Dinamani

current_fs_time() is used for inode timestamps.

Change the signature of the function to take inode pointer
instead of superblock as per Linus's suggestion.

Also, move the api under vfs as per the discussion on the
thread: https://lkml.org/lkml/2016/6/9/36 . As per Arnd's
suggestion on the thread, changing the function name.

current_fs_time() will be deleted after all the references
to it are replaced by current_time().

Note that timespec_trunc() will also be moved to fs/inode.c
in a separate patch when this will need to be revamped for
bounds checking purposes.

Signed-off-by: Deepa Dinamani 
---
 fs/inode.c | 15 +++
 include/linux/fs.h |  1 +
 2 files changed, 16 insertions(+)

diff --git a/fs/inode.c b/fs/inode.c
index 4ccbc21..75c98cc 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -2041,3 +2041,18 @@ void inode_nohighmem(struct inode *inode)
mapping_set_gfp_mask(inode->i_mapping, GFP_USER);
 }
 EXPORT_SYMBOL(inode_nohighmem);
+
+/**
+ * current_time - Return FS time
+ * @inode: inode.
+ *
+ * Return the current time truncated to the time granularity supported by
+ * the fs.
+ */
+struct timespec current_time(struct inode *inode)
+{
+   struct timespec now = current_kernel_time();
+
+   return timespec_trunc(now, inode->i_sb->s_time_gran);
+}
+EXPORT_SYMBOL(current_time);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index dcd087e..21a8afe 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1461,6 +1461,7 @@ struct super_block {
 };
 
 extern struct timespec current_fs_time(struct super_block *sb);
+extern struct timespec current_time(struct inode *inode);
 
 static inline struct timespec current_fs_time_sec(struct super_block *sb)
 {
-- 
1.9.1

[PATCH v2 10/24] fs: cifs: Replace CURRENT_TIME by current_time()

2016-06-19 Thread Deepa Dinamani

CURRENT_TIME macro is not appropriate for filesystems as it
doesn't use the right granularity for filesystem timestamps.
Use current_time() instead.

This is also in preparation for the patch that transitions
vfs timestamps to use 64 bit time and hence make them
y2038 safe.

CURRENT_TIME macro will be deleted before merging the
aforementioned change.

Change signature of helper cifs_all_info_to_fattr since it
now needs both super_block and cifs_sb_info.

Note: The inode timestamps read from the server are assumed
to have correct granularity and range.

Signed-off-by: Deepa Dinamani 
Cc: Steve French 
---
 fs/cifs/inode.c | 20 +++-
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c
index 9b3d92e..721809e 100644
--- a/fs/cifs/inode.c
+++ b/fs/cifs/inode.c
@@ -320,9 +320,9 @@ cifs_create_dfs_fattr(struct cifs_fattr *fattr, struct 
super_block *sb)
fattr->cf_mode = S_IFDIR | S_IXUGO | S_IRWXU;
fattr->cf_uid = cifs_sb->mnt_uid;
fattr->cf_gid = cifs_sb->mnt_gid;
-   fattr->cf_atime = CURRENT_TIME;
-   fattr->cf_ctime = CURRENT_TIME;
-   fattr->cf_mtime = CURRENT_TIME;
+   ktime_get_real_ts(>cf_mtime);
+   fattr->cf_mtime = timespec_trunc(fattr->cf_mtime, sb->s_time_gran);
+   fattr->cf_atime = fattr->cf_ctime = fattr->cf_mtime;
fattr->cf_nlink = 2;
fattr->cf_flags |= CIFS_FATTR_DFS_REFERRAL;
 }
@@ -584,9 +584,10 @@ static int cifs_sfu_mode(struct cifs_fattr *fattr, const 
unsigned char *path,
 /* Fill a cifs_fattr struct with info from FILE_ALL_INFO */
 static void
 cifs_all_info_to_fattr(struct cifs_fattr *fattr, FILE_ALL_INFO *info,
-  struct cifs_sb_info *cifs_sb, bool adjust_tz,
+  struct super_block *sb, bool adjust_tz,
   bool symlink)
 {
+   struct cifs_sb_info *cifs_sb = CIFS_SB(sb);
struct cifs_tcon *tcon = cifs_sb_master_tcon(cifs_sb);
 
memset(fattr, 0, sizeof(*fattr));
@@ -596,8 +597,10 @@ cifs_all_info_to_fattr(struct cifs_fattr *fattr, 
FILE_ALL_INFO *info,
 
if (info->LastAccessTime)
fattr->cf_atime = cifs_NTtimeToUnix(info->LastAccessTime);
-   else
-   fattr->cf_atime = CURRENT_TIME;
+   else {
+   ktime_get_real_ts(>cf_atime);
+   fattr->cf_atime = timespec_trunc(fattr->cf_atime, 
sb->s_time_gran);
+   }
 
fattr->cf_ctime = cifs_NTtimeToUnix(info->ChangeTime);
fattr->cf_mtime = cifs_NTtimeToUnix(info->LastWriteTime);
@@ -657,7 +660,6 @@ cifs_get_file_info(struct file *filp)
FILE_ALL_INFO find_data;
struct cifs_fattr fattr;
struct inode *inode = file_inode(filp);
-   struct cifs_sb_info *cifs_sb = CIFS_SB(inode->i_sb);
struct cifsFileInfo *cfile = filp->private_data;
struct cifs_tcon *tcon = tlink_tcon(cfile->tlink);
struct TCP_Server_Info *server = tcon->ses->server;
@@ -669,7 +671,7 @@ cifs_get_file_info(struct file *filp)
rc = server->ops->query_file_info(xid, tcon, >fid, _data);
switch (rc) {
case 0:
-   cifs_all_info_to_fattr(, _data, cifs_sb, false,
+   cifs_all_info_to_fattr(, _data, inode->i_sb, false,
   false);
break;
case -EREMOTE:
@@ -751,7 +753,7 @@ cifs_get_inode_info(struct inode **inode, const char 
*full_path,
}
 
if (!rc) {
-   cifs_all_info_to_fattr(, data, cifs_sb, adjust_tz,
+   cifs_all_info_to_fattr(, data, sb, adjust_tz,
   symlink);
} else if (rc == -EREMOTE) {
cifs_create_dfs_fattr(, sb);
-- 
1.9.1

[PATCH v2 00/24] Delete CURRENT_TIME and CURRENT_TIME_SEC macros

2016-06-19 Thread Deepa Dinamani

The series is aimed at getting rid of CURRENT_TIME and CURRENT_TIME_SEC macros.
The macros are not y2038 safe. There is no plan to transition them into being
y2038 safe.
ktime_get_* api's can be used in their place. And, these are y2038 safe.

Thanks to Arnd Bergmann for all the guidance and discussions.

Patches 2-4 were mostly generated using coccinelle scripts.

All filesystem timestamps use current_fs_time() for right granularity as
mentioned in the respective commit texts of patches. This has a changed
signature, renamed to current_time() and moved to the fs/inode.c.

This series also serves as a preparatory series to transition vfs to 64 bit
timestamps as outlined here: https://lkml.org/lkml/2016/2/12/104 .

As per Linus's suggestion in https://lkml.org/lkml/2016/5/24/663 , all the
inode timestamp changes have been squashed into a single patch. Also,
current_time() now is used as a single generic vfs filesystem timestamp api.
It also takes struct inode* as argument instead of struct super_block*.
Posting all patches together in a bigger series so that the big picture is
clear.

As per the suggestion in https://lwn.net/Articles/672598/, CURRENT_TIME macro
bug fixes are being handled in a series separate from transitioning vfs to use.

Changes from v1:
* Change current_fs_time(struct super_block *) to current_time(struct inode *)

* Note that change to add time64_to_tm() is already part of John's
  kernel tree: https://lkml.org/lkml/2016/6/17/875 .

Deepa Dinamani (24):
  vfs: Add current_time() api
  fs: Replace CURRENT_TIME with current_time() for inode timestamps
  fs: Replace CURRENT_TIME_SEC with current_time() for inode timestamps
  fs: Replace current_fs_time() with current_time()
  fs: jfs: Replace CURRENT_TIME_SEC by current_time()
  fs: ext4: Use current_time() for inode timestamps
  fs: ubifs: Replace CURRENT_TIME_SEC with current_time
  fs: btrfs: Use ktime_get_real_ts for root ctime
  fs: udf: Replace CURRENT_TIME with current_time()
  fs: cifs: Replace CURRENT_TIME by current_time()
  fs: cifs: Replace CURRENT_TIME with ktime_get_real_ts()
  fs: cifs: Replace CURRENT_TIME by get_seconds
  fs: f2fs: Use ktime_get_real_seconds for sit_info times
  drivers: staging: lustre: Replace CURRENT_TIME with current_time()
  fs: ocfs2: Use time64_t to represent orphan scan times
  fs: ocfs2: Replace CURRENT_TIME with ktime_get_real_seconds()
  audit: Use timespec64 to represent audit timestamps
  fs: nfs: Make nfs boot time y2038 safe
  fnic: Use time64_t to represent trace timestamps
  block: Replace CURRENT_TIME with ktime_get_real_ts
  libceph: Replace CURRENT_TIME with ktime_get_real_ts
  fs: ceph: Replace current_fs_time for request stamp
  time: Delete CURRENT_TIME_SEC and CURRENT_TIME macro
  time: Delete current_fs_time() function

 arch/powerpc/platforms/cell/spufs/inode.c  |  2 +-
 arch/s390/hypfs/inode.c|  4 ++--
 drivers/block/rbd.c|  2 +-
 drivers/char/sonypi.c  |  2 +-
 drivers/infiniband/hw/qib/qib_fs.c |  2 +-
 drivers/misc/ibmasm/ibmasmfs.c |  2 +-
 drivers/oprofile/oprofilefs.c  |  2 +-
 drivers/platform/x86/sony-laptop.c |  2 +-
 drivers/scsi/fnic/fnic_trace.c |  4 ++--
 drivers/scsi/fnic/fnic_trace.h |  2 +-
 drivers/staging/lustre/lustre/llite/llite_lib.c| 16 ++---
 drivers/staging/lustre/lustre/llite/namei.c|  4 ++--
 drivers/staging/lustre/lustre/mdc/mdc_reint.c  |  6 ++---
 .../lustre/lustre/obdclass/linux/linux-obdo.c  |  6 ++---
 drivers/staging/lustre/lustre/obdclass/obdo.c  |  6 ++---
 drivers/staging/lustre/lustre/osc/osc_io.c |  2 +-
 drivers/usb/core/devio.c   | 18 +++---
 drivers/usb/gadget/function/f_fs.c |  8 +++
 drivers/usb/gadget/legacy/inode.c  |  2 +-
 fs/9p/vfs_inode.c  |  2 +-
 fs/adfs/inode.c|  2 +-
 fs/affs/amigaffs.c |  6 ++---
 fs/affs/inode.c|  2 +-
 fs/attr.c  |  2 +-
 fs/autofs4/inode.c |  2 +-
 fs/autofs4/root.c  |  6 ++---
 fs/bad_inode.c |  2 +-
 fs/bfs/dir.c   | 14 +--
 fs/binfmt_misc.c   |  2 +-
 fs/btrfs/file.c|  6 ++---
 fs/btrfs/inode.c   | 22 -
 fs/btrfs/ioctl.c   |  8 +++
 fs/btrfs/root-tree.c   |  3 ++-
 fs/btrfs/transaction.c |  4 ++--
 fs/btrfs/xattr.c   |  2 +-
 fs/ceph/file.c

[PATCH v2 09/24] fs: udf: Replace CURRENT_TIME with current_time()

2016-06-19 Thread Deepa Dinamani

CURRENT_TIME is not y2038 safe.

CURRENT_TIME macro is also not appropriate for filesystems
as it doesn't use the right granularity for filesystem
timestamps.

Logical Volume Integrity format is described to have the
same timestamp format for "Recording Date and time" as
the other [a,c,m]timestamps.
Hence using current_time() instead here promises to
maintain the same granularity as other timestamps.

This is also in preparation for the patch that transitions
vfs timestamps to use 64 bit time and hence make them
y2038 safe. As part of the effort current_fs_time() will be
extended to do range checks.

Signed-off-by: Deepa Dinamani 
Cc: Jan Kara 
Reviewed-by: Jan Kara 
---
 fs/udf/super.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/udf/super.c b/fs/udf/super.c
index 4942549..134c63a 100644
--- a/fs/udf/super.c
+++ b/fs/udf/super.c
@@ -1986,6 +1986,7 @@ static void udf_open_lvid(struct super_block *sb)
struct buffer_head *bh = sbi->s_lvid_bh;
struct logicalVolIntegrityDesc *lvid;
struct logicalVolIntegrityDescImpUse *lvidiu;
+   struct timespec ts;
 
if (!bh)
return;
@@ -1997,8 +1998,9 @@ static void udf_open_lvid(struct super_block *sb)
mutex_lock(>s_alloc_mutex);
lvidiu->impIdent.identSuffix[0] = UDF_OS_CLASS_UNIX;
lvidiu->impIdent.identSuffix[1] = UDF_OS_ID_LINUX;
+   ktime_get_real_ts();
udf_time_to_disk_stamp(>recordingDateAndTime,
-   CURRENT_TIME);
+   timespec_trunc(ts, sb->s_time_gran));
lvid->integrityType = cpu_to_le32(LVID_INTEGRITY_TYPE_OPEN);
 
lvid->descTag.descCRC = cpu_to_le16(
@@ -2019,6 +2021,7 @@ static void udf_close_lvid(struct super_block *sb)
struct buffer_head *bh = sbi->s_lvid_bh;
struct logicalVolIntegrityDesc *lvid;
struct logicalVolIntegrityDescImpUse *lvidiu;
+   struct timespec ts;
 
if (!bh)
return;
@@ -2030,7 +2033,9 @@ static void udf_close_lvid(struct super_block *sb)
mutex_lock(>s_alloc_mutex);
lvidiu->impIdent.identSuffix[0] = UDF_OS_CLASS_UNIX;
lvidiu->impIdent.identSuffix[1] = UDF_OS_ID_LINUX;
-   udf_time_to_disk_stamp(>recordingDateAndTime, CURRENT_TIME);
+   ktime_get_real_ts();
+   udf_time_to_disk_stamp(>recordingDateAndTime,
+   timespec_trunc(ts, sb->s_time_gran));
if (UDF_MAX_WRITE_VERSION > le16_to_cpu(lvidiu->maxUDFWriteRev))
lvidiu->maxUDFWriteRev = cpu_to_le16(UDF_MAX_WRITE_VERSION);
if (sbi->s_udfrev > le16_to_cpu(lvidiu->minUDFReadRev))
-- 
1.9.1

[PATCH v2 06/24] fs: ext4: Use current_time() for inode timestamps

2016-06-19 Thread Deepa Dinamani

CURRENT_TIME_SEC and CURRENT_TIME are not y2038 safe.
current_time() will be transitioned to be y2038 safe
along with vfs.

current_time() returns timestamps according to the
granularities set in the super_block.
The granularity check in ext4_current_time() to call
current_time() or CURRENT_TIME_SEC is not required.
Use current_time() directly to obtain timestamps
unconditionally, and remove ext4_current_time().

Quota files are assumed to be on the same filesystem.
Hence, use current_time() for these files as well.

Signed-off-by: Deepa Dinamani 
Cc: "Theodore Ts'o" 
Cc: Andreas Dilger 
Cc: linux-e...@vger.kernel.org
---
 fs/ext4/acl.c |  2 +-
 fs/ext4/ext4.h|  6 --
 fs/ext4/extents.c | 10 +-
 fs/ext4/ialloc.c  |  2 +-
 fs/ext4/inline.c  |  4 ++--
 fs/ext4/inode.c   |  6 +++---
 fs/ext4/ioctl.c   |  8 
 fs/ext4/namei.c   | 24 +---
 fs/ext4/xattr.c   |  2 +-
 9 files changed, 30 insertions(+), 34 deletions(-)

diff --git a/fs/ext4/acl.c b/fs/ext4/acl.c
index c6601a4..733e2f24 100644
--- a/fs/ext4/acl.c
+++ b/fs/ext4/acl.c
@@ -197,7 +197,7 @@ __ext4_set_acl(handle_t *handle, struct inode *inode, int 
type,
if (error < 0)
return error;
else {
-   inode->i_ctime = ext4_current_time(inode);
+   inode->i_ctime = current_time(inode);
ext4_mark_inode_dirty(handle, inode);
if (error == 0)
acl = NULL;
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index b84aa1c..14e5cf4 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1523,12 +1523,6 @@ static inline struct ext4_inode_info *EXT4_I(struct 
inode *inode)
return container_of(inode, struct ext4_inode_info, vfs_inode);
 }
 
-static inline struct timespec ext4_current_time(struct inode *inode)
-{
-   return (inode->i_sb->s_time_gran < NSEC_PER_SEC) ?
-   current_fs_time(inode->i_sb) : CURRENT_TIME_SEC;
-}
-
 static inline int ext4_valid_inum(struct super_block *sb, unsigned long ino)
 {
return ino == EXT4_ROOT_INO ||
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 2a2eef9..2584317 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4722,7 +4722,7 @@ retry:
map.m_lblk += ret;
map.m_len = len = len - ret;
epos = (loff_t)map.m_lblk << inode->i_blkbits;
-   inode->i_ctime = ext4_current_time(inode);
+   inode->i_ctime = current_time(inode);
if (new_size) {
if (epos > new_size)
epos = new_size;
@@ -4850,7 +4850,7 @@ static long ext4_zero_range(struct file *file, loff_t 
offset,
}
/* Now release the pages and zero block aligned part of pages */
truncate_pagecache_range(inode, start, end - 1);
-   inode->i_mtime = inode->i_ctime = ext4_current_time(inode);
+   inode->i_mtime = inode->i_ctime = current_time(inode);
 
ret = ext4_alloc_file_blocks(file, lblk, max_blocks, new_size,
 flags, mode);
@@ -4875,7 +4875,7 @@ static long ext4_zero_range(struct file *file, loff_t 
offset,
goto out_dio;
}
 
-   inode->i_mtime = inode->i_ctime = ext4_current_time(inode);
+   inode->i_mtime = inode->i_ctime = current_time(inode);
if (new_size) {
ext4_update_inode_size(inode, new_size);
} else {
@@ -5574,7 +5574,7 @@ int ext4_collapse_range(struct inode *inode, loff_t 
offset, loff_t len)
up_write(_I(inode)->i_data_sem);
if (IS_SYNC(inode))
ext4_handle_sync(handle);
-   inode->i_mtime = inode->i_ctime = ext4_current_time(inode);
+   inode->i_mtime = inode->i_ctime = current_time(inode);
ext4_mark_inode_dirty(handle, inode);
 
 out_stop:
@@ -5684,7 +5684,7 @@ int ext4_insert_range(struct inode *inode, loff_t offset, 
loff_t len)
/* Expand file to avoid data loss if there is error while shifting */
inode->i_size += len;
EXT4_I(inode)->i_disksize += len;
-   inode->i_mtime = inode->i_ctime = ext4_current_time(inode);
+   inode->i_mtime = inode->i_ctime = current_time(inode);
ret = ext4_mark_inode_dirty(handle, inode);
if (ret)
goto out_stop;
diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
index 1e4b0b7..8b3d58f 100644
--- a/fs/ext4/ialloc.c
+++ b/fs/ext4/ialloc.c
@@ -1039,7 +1039,7 @@ got:
/* This is the optimal IO size (for stat), not the fs block size */
inode->i_blocks = 0;
inode->i_mtime = inode->i_atime = inode->i_ctime = ei->i_crtime =
-  ext4_current_time(inode);
+

[PATCH v2 09/24] fs: udf: Replace CURRENT_TIME with current_time()

2016-06-19 Thread Deepa Dinamani

CURRENT_TIME is not y2038 safe.

CURRENT_TIME macro is also not appropriate for filesystems
as it doesn't use the right granularity for filesystem
timestamps.

Logical Volume Integrity format is described to have the
same timestamp format for "Recording Date and time" as
the other [a,c,m]timestamps.
Hence using current_time() instead here promises to
maintain the same granularity as other timestamps.

This is also in preparation for the patch that transitions
vfs timestamps to use 64 bit time and hence make them
y2038 safe. As part of the effort current_fs_time() will be
extended to do range checks.

Signed-off-by: Deepa Dinamani 
Cc: Jan Kara 
Reviewed-by: Jan Kara 
---
 fs/udf/super.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/udf/super.c b/fs/udf/super.c
index 4942549..134c63a 100644
--- a/fs/udf/super.c
+++ b/fs/udf/super.c
@@ -1986,6 +1986,7 @@ static void udf_open_lvid(struct super_block *sb)
struct buffer_head *bh = sbi->s_lvid_bh;
struct logicalVolIntegrityDesc *lvid;
struct logicalVolIntegrityDescImpUse *lvidiu;
+   struct timespec ts;
 
if (!bh)
return;
@@ -1997,8 +1998,9 @@ static void udf_open_lvid(struct super_block *sb)
mutex_lock(>s_alloc_mutex);
lvidiu->impIdent.identSuffix[0] = UDF_OS_CLASS_UNIX;
lvidiu->impIdent.identSuffix[1] = UDF_OS_ID_LINUX;
+   ktime_get_real_ts();
udf_time_to_disk_stamp(>recordingDateAndTime,
-   CURRENT_TIME);
+   timespec_trunc(ts, sb->s_time_gran));
lvid->integrityType = cpu_to_le32(LVID_INTEGRITY_TYPE_OPEN);
 
lvid->descTag.descCRC = cpu_to_le16(
@@ -2019,6 +2021,7 @@ static void udf_close_lvid(struct super_block *sb)
struct buffer_head *bh = sbi->s_lvid_bh;
struct logicalVolIntegrityDesc *lvid;
struct logicalVolIntegrityDescImpUse *lvidiu;
+   struct timespec ts;
 
if (!bh)
return;
@@ -2030,7 +2033,9 @@ static void udf_close_lvid(struct super_block *sb)
mutex_lock(>s_alloc_mutex);
lvidiu->impIdent.identSuffix[0] = UDF_OS_CLASS_UNIX;
lvidiu->impIdent.identSuffix[1] = UDF_OS_ID_LINUX;
-   udf_time_to_disk_stamp(>recordingDateAndTime, CURRENT_TIME);
+   ktime_get_real_ts();
+   udf_time_to_disk_stamp(>recordingDateAndTime,
+   timespec_trunc(ts, sb->s_time_gran));
if (UDF_MAX_WRITE_VERSION > le16_to_cpu(lvidiu->maxUDFWriteRev))
lvidiu->maxUDFWriteRev = cpu_to_le16(UDF_MAX_WRITE_VERSION);
if (sbi->s_udfrev > le16_to_cpu(lvidiu->minUDFReadRev))
-- 
1.9.1

[PATCH v2 06/24] fs: ext4: Use current_time() for inode timestamps

2016-06-19 Thread Deepa Dinamani

CURRENT_TIME_SEC and CURRENT_TIME are not y2038 safe.
current_time() will be transitioned to be y2038 safe
along with vfs.

current_time() returns timestamps according to the
granularities set in the super_block.
The granularity check in ext4_current_time() to call
current_time() or CURRENT_TIME_SEC is not required.
Use current_time() directly to obtain timestamps
unconditionally, and remove ext4_current_time().

Quota files are assumed to be on the same filesystem.
Hence, use current_time() for these files as well.

Signed-off-by: Deepa Dinamani 
Cc: "Theodore Ts'o" 
Cc: Andreas Dilger 
Cc: linux-e...@vger.kernel.org
---
 fs/ext4/acl.c |  2 +-
 fs/ext4/ext4.h|  6 --
 fs/ext4/extents.c | 10 +-
 fs/ext4/ialloc.c  |  2 +-
 fs/ext4/inline.c  |  4 ++--
 fs/ext4/inode.c   |  6 +++---
 fs/ext4/ioctl.c   |  8 
 fs/ext4/namei.c   | 24 +---
 fs/ext4/xattr.c   |  2 +-
 9 files changed, 30 insertions(+), 34 deletions(-)

diff --git a/fs/ext4/acl.c b/fs/ext4/acl.c
index c6601a4..733e2f24 100644
--- a/fs/ext4/acl.c
+++ b/fs/ext4/acl.c
@@ -197,7 +197,7 @@ __ext4_set_acl(handle_t *handle, struct inode *inode, int 
type,
if (error < 0)
return error;
else {
-   inode->i_ctime = ext4_current_time(inode);
+   inode->i_ctime = current_time(inode);
ext4_mark_inode_dirty(handle, inode);
if (error == 0)
acl = NULL;
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index b84aa1c..14e5cf4 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1523,12 +1523,6 @@ static inline struct ext4_inode_info *EXT4_I(struct 
inode *inode)
return container_of(inode, struct ext4_inode_info, vfs_inode);
 }
 
-static inline struct timespec ext4_current_time(struct inode *inode)
-{
-   return (inode->i_sb->s_time_gran < NSEC_PER_SEC) ?
-   current_fs_time(inode->i_sb) : CURRENT_TIME_SEC;
-}
-
 static inline int ext4_valid_inum(struct super_block *sb, unsigned long ino)
 {
return ino == EXT4_ROOT_INO ||
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 2a2eef9..2584317 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4722,7 +4722,7 @@ retry:
map.m_lblk += ret;
map.m_len = len = len - ret;
epos = (loff_t)map.m_lblk << inode->i_blkbits;
-   inode->i_ctime = ext4_current_time(inode);
+   inode->i_ctime = current_time(inode);
if (new_size) {
if (epos > new_size)
epos = new_size;
@@ -4850,7 +4850,7 @@ static long ext4_zero_range(struct file *file, loff_t 
offset,
}
/* Now release the pages and zero block aligned part of pages */
truncate_pagecache_range(inode, start, end - 1);
-   inode->i_mtime = inode->i_ctime = ext4_current_time(inode);
+   inode->i_mtime = inode->i_ctime = current_time(inode);
 
ret = ext4_alloc_file_blocks(file, lblk, max_blocks, new_size,
 flags, mode);
@@ -4875,7 +4875,7 @@ static long ext4_zero_range(struct file *file, loff_t 
offset,
goto out_dio;
}
 
-   inode->i_mtime = inode->i_ctime = ext4_current_time(inode);
+   inode->i_mtime = inode->i_ctime = current_time(inode);
if (new_size) {
ext4_update_inode_size(inode, new_size);
} else {
@@ -5574,7 +5574,7 @@ int ext4_collapse_range(struct inode *inode, loff_t 
offset, loff_t len)
up_write(_I(inode)->i_data_sem);
if (IS_SYNC(inode))
ext4_handle_sync(handle);
-   inode->i_mtime = inode->i_ctime = ext4_current_time(inode);
+   inode->i_mtime = inode->i_ctime = current_time(inode);
ext4_mark_inode_dirty(handle, inode);
 
 out_stop:
@@ -5684,7 +5684,7 @@ int ext4_insert_range(struct inode *inode, loff_t offset, 
loff_t len)
/* Expand file to avoid data loss if there is error while shifting */
inode->i_size += len;
EXT4_I(inode)->i_disksize += len;
-   inode->i_mtime = inode->i_ctime = ext4_current_time(inode);
+   inode->i_mtime = inode->i_ctime = current_time(inode);
ret = ext4_mark_inode_dirty(handle, inode);
if (ret)
goto out_stop;
diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
index 1e4b0b7..8b3d58f 100644
--- a/fs/ext4/ialloc.c
+++ b/fs/ext4/ialloc.c
@@ -1039,7 +1039,7 @@ got:
/* This is the optimal IO size (for stat), not the fs block size */
inode->i_blocks = 0;
inode->i_mtime = inode->i_atime = inode->i_ctime = ei->i_crtime =
-  ext4_current_time(inode);
+  current_time(inode);

Re: [RFC][PATCH] ftracetest: Fix hist unsupported result in hist selftests

2016-06-19 Thread Namhyung Kim

On Fri, Jun 17, 2016 at 05:28:47PM -0400, Steven Rostedt wrote:
> Ah, due to traveling I never got around to finishing this. What about
> this patch?
> 
> From 7bf19b58ba02e66014efce6c051acba2c6cbd861 Mon Sep 17 00:00:00 2001
> From: "Steven Rostedt (Red Hat)" 
> Date: Mon, 23 May 2016 15:06:30 -0400
> Subject: [PATCH] ftracetest: Fix hist unsupported result in hist selftests
> 
> When histograms are not configured in the kernel, the ftracetest histogram
> selftests should return "unsupported" and not "Failed". To detect this, the
> test scripts have:
> 
>  FEATURE=`grep hist events/sched/sched_process_fork/trigger`
>  if [ -z "$FEATURE" ]; then
>  echo "hist trigger is not supported"
>  exit_unsupported
>  fi
> 
> The problem is that '-e' is in effect and any error will cause the program
> to terminate. The grep for 'hist' fails, because it is not compiled it (thus
> unsupported), but because grep has an error code for failing to find the
> string, it causes the program to terminate, and is marked as a failed test.
> 
> Namhyung Kim recommended to test for the "hist" file located in
> events/sched/sched_process_fork/hist instead, as it is more inline with the
> other checks. As the hist file is only created if the histogram feature is
> enabled, that is a valid check.
> 
> Link: http://lkml.kernel.org/r/20160523151538.4ea9c...@gandalf.local.home
> 
> Suggested-by: Namhyung Kim 
> Fixes: 76929ab51f0ee ("kselftests/ftrace: Add hist trigger testcases")
> Signed-off-by: Steven Rostedt 

Acked-by: Namhyung Kim 

Thanks,
Namhyung


> ---
>  .../testing/selftests/ftrace/test.d/trigger/trigger-hist-mod.tc  | 9 
> -
>  tools/testing/selftests/ftrace/test.d/trigger/trigger-hist.tc| 9 
> -
>  .../testing/selftests/ftrace/test.d/trigger/trigger-multihist.tc | 9 
> -
>  3 files changed, 12 insertions(+), 15 deletions(-)
> 
> diff --git 
> a/tools/testing/selftests/ftrace/test.d/trigger/trigger-hist-mod.tc 
> b/tools/testing/selftests/ftrace/test.d/trigger/trigger-hist-mod.tc
> index c2b61c4fda11..0bf5085281f3 100644
> --- a/tools/testing/selftests/ftrace/test.d/trigger/trigger-hist-mod.tc
> +++ b/tools/testing/selftests/ftrace/test.d/trigger/trigger-hist-mod.tc
> @@ -23,15 +23,14 @@ if [ ! -f events/sched/sched_process_fork/trigger ]; then
>  exit_unsupported
>  fi
>  
> -reset_tracer
> -do_reset
> -
> -FEATURE=`grep hist events/sched/sched_process_fork/trigger`
> -if [ -z "$FEATURE" ]; then
> +if [ ! -f events/sched/sched_process_fork/hist ]; then
>  echo "hist trigger is not supported"
>  exit_unsupported
>  fi
>  
> +reset_tracer
> +do_reset
> +
>  echo "Test histogram with execname modifier"
>  
>  echo 'hist:keys=common_pid.execname' > 
> events/sched/sched_process_fork/trigger
> diff --git a/tools/testing/selftests/ftrace/test.d/trigger/trigger-hist.tc 
> b/tools/testing/selftests/ftrace/test.d/trigger/trigger-hist.tc
> index b2902d42a537..a00184cd9c95 100644
> --- a/tools/testing/selftests/ftrace/test.d/trigger/trigger-hist.tc
> +++ b/tools/testing/selftests/ftrace/test.d/trigger/trigger-hist.tc
> @@ -23,15 +23,14 @@ if [ ! -f events/sched/sched_process_fork/trigger ]; then
>  exit_unsupported
>  fi
>  
> -reset_tracer
> -do_reset
> -
> -FEATURE=`grep hist events/sched/sched_process_fork/trigger`
> -if [ -z "$FEATURE" ]; then
> +if [ ! -f events/sched/sched_process_fork/hist ]; then
>  echo "hist trigger is not supported"
>  exit_unsupported
>  fi
>  
> +reset_tracer
> +do_reset
> +
>  echo "Test histogram basic tigger"
>  
>  echo 'hist:keys=parent_pid:vals=child_pid' > 
> events/sched/sched_process_fork/trigger
> diff --git 
> a/tools/testing/selftests/ftrace/test.d/trigger/trigger-multihist.tc 
> b/tools/testing/selftests/ftrace/test.d/trigger/trigger-multihist.tc
> index 03c4a46561fc..3478b00ead57 100644
> --- a/tools/testing/selftests/ftrace/test.d/trigger/trigger-multihist.tc
> +++ b/tools/testing/selftests/ftrace/test.d/trigger/trigger-multihist.tc
> @@ -23,15 +23,14 @@ if [ ! -f events/sched/sched_process_fork/trigger ]; then
>  exit_unsupported
>  fi
>  
> -reset_tracer
> -do_reset
> -
> -FEATURE=`grep hist events/sched/sched_process_fork/trigger`
> -if [ -z "$FEATURE" ]; then
> +if [ ! -f events/sched/sched_process_fork/hist ]; then
>  echo "hist trigger is not supported"
>  exit_unsupported
>  fi
>  
> +reset_tracer
> +do_reset
> +
>  reset_trigger
>  
>  echo "Test histogram multiple tiggers"
> -- 
> 1.9.3
>

Re: [RFC][PATCH] ftracetest: Fix hist unsupported result in hist selftests

2016-06-19 Thread Namhyung Kim

On Fri, Jun 17, 2016 at 05:28:47PM -0400, Steven Rostedt wrote:
> Ah, due to traveling I never got around to finishing this. What about
> this patch?
> 
> From 7bf19b58ba02e66014efce6c051acba2c6cbd861 Mon Sep 17 00:00:00 2001
> From: "Steven Rostedt (Red Hat)" 
> Date: Mon, 23 May 2016 15:06:30 -0400
> Subject: [PATCH] ftracetest: Fix hist unsupported result in hist selftests
> 
> When histograms are not configured in the kernel, the ftracetest histogram
> selftests should return "unsupported" and not "Failed". To detect this, the
> test scripts have:
> 
>  FEATURE=`grep hist events/sched/sched_process_fork/trigger`
>  if [ -z "$FEATURE" ]; then
>  echo "hist trigger is not supported"
>  exit_unsupported
>  fi
> 
> The problem is that '-e' is in effect and any error will cause the program
> to terminate. The grep for 'hist' fails, because it is not compiled it (thus
> unsupported), but because grep has an error code for failing to find the
> string, it causes the program to terminate, and is marked as a failed test.
> 
> Namhyung Kim recommended to test for the "hist" file located in
> events/sched/sched_process_fork/hist instead, as it is more inline with the
> other checks. As the hist file is only created if the histogram feature is
> enabled, that is a valid check.
> 
> Link: http://lkml.kernel.org/r/20160523151538.4ea9c...@gandalf.local.home
> 
> Suggested-by: Namhyung Kim 
> Fixes: 76929ab51f0ee ("kselftests/ftrace: Add hist trigger testcases")
> Signed-off-by: Steven Rostedt 

Acked-by: Namhyung Kim 

Thanks,
Namhyung


> ---
>  .../testing/selftests/ftrace/test.d/trigger/trigger-hist-mod.tc  | 9 
> -
>  tools/testing/selftests/ftrace/test.d/trigger/trigger-hist.tc| 9 
> -
>  .../testing/selftests/ftrace/test.d/trigger/trigger-multihist.tc | 9 
> -
>  3 files changed, 12 insertions(+), 15 deletions(-)
> 
> diff --git 
> a/tools/testing/selftests/ftrace/test.d/trigger/trigger-hist-mod.tc 
> b/tools/testing/selftests/ftrace/test.d/trigger/trigger-hist-mod.tc
> index c2b61c4fda11..0bf5085281f3 100644
> --- a/tools/testing/selftests/ftrace/test.d/trigger/trigger-hist-mod.tc
> +++ b/tools/testing/selftests/ftrace/test.d/trigger/trigger-hist-mod.tc
> @@ -23,15 +23,14 @@ if [ ! -f events/sched/sched_process_fork/trigger ]; then
>  exit_unsupported
>  fi
>  
> -reset_tracer
> -do_reset
> -
> -FEATURE=`grep hist events/sched/sched_process_fork/trigger`
> -if [ -z "$FEATURE" ]; then
> +if [ ! -f events/sched/sched_process_fork/hist ]; then
>  echo "hist trigger is not supported"
>  exit_unsupported
>  fi
>  
> +reset_tracer
> +do_reset
> +
>  echo "Test histogram with execname modifier"
>  
>  echo 'hist:keys=common_pid.execname' > 
> events/sched/sched_process_fork/trigger
> diff --git a/tools/testing/selftests/ftrace/test.d/trigger/trigger-hist.tc 
> b/tools/testing/selftests/ftrace/test.d/trigger/trigger-hist.tc
> index b2902d42a537..a00184cd9c95 100644
> --- a/tools/testing/selftests/ftrace/test.d/trigger/trigger-hist.tc
> +++ b/tools/testing/selftests/ftrace/test.d/trigger/trigger-hist.tc
> @@ -23,15 +23,14 @@ if [ ! -f events/sched/sched_process_fork/trigger ]; then
>  exit_unsupported
>  fi
>  
> -reset_tracer
> -do_reset
> -
> -FEATURE=`grep hist events/sched/sched_process_fork/trigger`
> -if [ -z "$FEATURE" ]; then
> +if [ ! -f events/sched/sched_process_fork/hist ]; then
>  echo "hist trigger is not supported"
>  exit_unsupported
>  fi
>  
> +reset_tracer
> +do_reset
> +
>  echo "Test histogram basic tigger"
>  
>  echo 'hist:keys=parent_pid:vals=child_pid' > 
> events/sched/sched_process_fork/trigger
> diff --git 
> a/tools/testing/selftests/ftrace/test.d/trigger/trigger-multihist.tc 
> b/tools/testing/selftests/ftrace/test.d/trigger/trigger-multihist.tc
> index 03c4a46561fc..3478b00ead57 100644
> --- a/tools/testing/selftests/ftrace/test.d/trigger/trigger-multihist.tc
> +++ b/tools/testing/selftests/ftrace/test.d/trigger/trigger-multihist.tc
> @@ -23,15 +23,14 @@ if [ ! -f events/sched/sched_process_fork/trigger ]; then
>  exit_unsupported
>  fi
>  
> -reset_tracer
> -do_reset
> -
> -FEATURE=`grep hist events/sched/sched_process_fork/trigger`
> -if [ -z "$FEATURE" ]; then
> +if [ ! -f events/sched/sched_process_fork/hist ]; then
>  echo "hist trigger is not supported"
>  exit_unsupported
>  fi
>  
> +reset_tracer
> +do_reset
> +
>  reset_trigger
>  
>  echo "Test histogram multiple tiggers"
> -- 
> 1.9.3
>

Re: [PATCH] tracing: Add *iter check for NULL

2016-06-19 Thread Namhyung Kim

On Fri, Jun 17, 2016 at 02:24:57PM -0400, Steven Rostedt wrote:
> I tried your patch and it works until you remove the module and try
> reading the trace again. As I said, you left out later processing. This
> should not exit on error. Below is a patch I wrote, and it works well.
> 
> I'll add you as reported by.
> 
> Thanks!

Acked-by: Namhyung Kim 

Thanks,
Namhyung


> diff --git a/kernel/trace/trace_printk.c b/kernel/trace/trace_printk.c
> index f96f0383f6c6..ad1d6164e946 100644
> --- a/kernel/trace/trace_printk.c
> +++ b/kernel/trace/trace_printk.c
> @@ -36,6 +36,10 @@ struct trace_bprintk_fmt {
>  static inline struct trace_bprintk_fmt *lookup_format(const char *fmt)
>  {
>   struct trace_bprintk_fmt *pos;
> +
> + if (!fmt)
> + return ERR_PTR(-EINVAL);
> +
>   list_for_each_entry(pos, _bprintk_fmt_list, list) {
>   if (!strcmp(pos->fmt, fmt))
>   return pos;
> @@ -57,7 +61,8 @@ void hold_module_trace_bprintk_format(const char **start, 
> const char **end)
>   for (iter = start; iter < end; iter++) {
>   struct trace_bprintk_fmt *tb_fmt = lookup_format(*iter);
>   if (tb_fmt) {
> - *iter = tb_fmt->fmt;
> + if (!IS_ERR(tb_fmt))
> + *iter = tb_fmt->fmt;
>   continue;
>   }
>

Re: [PATCH] tracing: Add *iter check for NULL

2016-06-19 Thread Namhyung Kim

On Fri, Jun 17, 2016 at 02:24:57PM -0400, Steven Rostedt wrote:
> I tried your patch and it works until you remove the module and try
> reading the trace again. As I said, you left out later processing. This
> should not exit on error. Below is a patch I wrote, and it works well.
> 
> I'll add you as reported by.
> 
> Thanks!

Acked-by: Namhyung Kim 

Thanks,
Namhyung


> diff --git a/kernel/trace/trace_printk.c b/kernel/trace/trace_printk.c
> index f96f0383f6c6..ad1d6164e946 100644
> --- a/kernel/trace/trace_printk.c
> +++ b/kernel/trace/trace_printk.c
> @@ -36,6 +36,10 @@ struct trace_bprintk_fmt {
>  static inline struct trace_bprintk_fmt *lookup_format(const char *fmt)
>  {
>   struct trace_bprintk_fmt *pos;
> +
> + if (!fmt)
> + return ERR_PTR(-EINVAL);
> +
>   list_for_each_entry(pos, _bprintk_fmt_list, list) {
>   if (!strcmp(pos->fmt, fmt))
>   return pos;
> @@ -57,7 +61,8 @@ void hold_module_trace_bprintk_format(const char **start, 
> const char **end)
>   for (iter = start; iter < end; iter++) {
>   struct trace_bprintk_fmt *tb_fmt = lookup_format(*iter);
>   if (tb_fmt) {
> - *iter = tb_fmt->fmt;
> + if (!IS_ERR(tb_fmt))
> + *iter = tb_fmt->fmt;
>   continue;
>   }
>

Re: [LKP] [lkp] [mm] 5c0a85fad9: unixbench.score -6.3% regression

2016-06-19 Thread Minchan Kim

On Fri, Jun 17, 2016 at 12:26:51PM -0700, Huang, Ying wrote:
> Minchan Kim  writes:
> 
> > On Thu, Jun 16, 2016 at 03:27:44PM -0700, Huang, Ying wrote:
> >> Minchan Kim  writes:
> >> 
> >> > On Thu, Jun 16, 2016 at 07:52:26AM +0800, Huang, Ying wrote:
> >> >> "Kirill A. Shutemov"  writes:
> >> >> 
> >> >> > On Tue, Jun 14, 2016 at 05:57:28PM +0900, Minchan Kim wrote:
> >> >> >> On Wed, Jun 08, 2016 at 11:58:11AM +0300, Kirill A. Shutemov wrote:
> >> >> >> > On Wed, Jun 08, 2016 at 04:41:37PM +0800, Huang, Ying wrote:
> >> >> >> > > "Huang, Ying"  writes:
> >> >> >> > > 
> >> >> >> > > > "Kirill A. Shutemov"  writes:
> >> >> >> > > >
> >> >> >> > > >> On Mon, Jun 06, 2016 at 10:27:24AM +0800, kernel test robot 
> >> >> >> > > >> wrote:
> >> >> >> > > >>> 
> >> >> >> > > >>> FYI, we noticed a -6.3% regression of unixbench.score due to 
> >> >> >> > > >>> commit:
> >> >> >> > > >>> 
> >> >> >> > > >>> commit 5c0a85fad949212b3e059692deecdeed74ae7ec7 ("mm: make 
> >> >> >> > > >>> faultaround produce old ptes")
> >> >> >> > > >>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> >> >> >> > > >>>  master
> >> >> >> > > >>> 
> >> >> >> > > >>> in testcase: unixbench
> >> >> >> > > >>> on test machine: lituya: 16 threads Haswell High-end Desktop 
> >> >> >> > > >>> (i7-5960X 3.0G) with 16G memory
> >> >> >> > > >>> with following parameters: 
> >> >> >> > > >>> cpufreq_governor=performance/nr_task=1/test=shell8
> >> >> >> > > >>> 
> >> >> >> > > >>> 
> >> >> >> > > >>> Details are as below:
> >> >> >> > > >>> -->
> >> >> >> > > >>> 
> >> >> >> > > >>> 
> >> >> >> > > >>> =
> >> >> >> > > >>> compiler/cpufreq_governor/kconfig/nr_task/rootfs/tbox_group/test/testcase:
> >> >> >> > > >>>   
> >> >> >> > > >>> gcc-4.9/performance/x86_64-rhel/1/debian-x86_64-2015-02-07.cgz/lituya/shell8/unixbench
> >> >> >> > > >>> 
> >> >> >> > > >>> commit: 
> >> >> >> > > >>>   4b50bcc7eda4d3cc9e3f2a0aa60e590fedf728c5
> >> >> >> > > >>>   5c0a85fad949212b3e059692deecdeed74ae7ec7
> >> >> >> > > >>> 
> >> >> >> > > >>> 4b50bcc7eda4d3cc 5c0a85fad949212b3e059692de 
> >> >> >> > > >>>  -- 
> >> >> >> > > >>>fail:runs  %reproductionfail:runs
> >> >> >> > > >>>| | |
> >> >> >> > > >>>   3:4  -75%:4 
> >> >> >> > > >>> kmsg.DHCP/BOOTP:Reply_not_for_us,op[#]xid[#]
> >> >> >> > > >>>  %stddev %change %stddev
> >> >> >> > > >>>  \  |\  
> >> >> >> > > >>>  14321 .  0%  -6.3%  13425 .  0%  unixbench.score
> >> >> >> > > >>>1996897 .  0%  -6.1%1874635 .  0%  
> >> >> >> > > >>> unixbench.time.involuntary_context_switches
> >> >> >> > > >>>  1.721e+08 .  0%  -6.2%  1.613e+08 .  0%  
> >> >> >> > > >>> unixbench.time.minor_page_faults
> >> >> >> > > >>> 758.65 .  0%  -3.0% 735.86 .  0%  
> >> >> >> > > >>> unixbench.time.system_time
> >> >> >> > > >>> 387.66 .  0%  +5.4% 408.49 .  0%  
> >> >> >> > > >>> unixbench.time.user_time
> >> >> >> > > >>>5950278 .  0%  -6.2%5583456 .  0%  
> >> >> >> > > >>> unixbench.time.voluntary_context_switches
> >> >> >> > > >>
> >> >> >> > > >> That's weird.
> >> >> >> > > >>
> >> >> >> > > >> I don't understand why the change would reduce number or 
> >> >> >> > > >> minor faults.
> >> >> >> > > >> It should stay the same on x86-64. Rise of user_time is 
> >> >> >> > > >> puzzling too.
> >> >> >> > > >
> >> >> >> > > > unixbench runs in fixed time mode.  That is, the total time to 
> >> >> >> > > > run
> >> >> >> > > > unixbench is fixed, but the work done varies.  So the 
> >> >> >> > > > minor_page_faults
> >> >> >> > > > change may reflect only the work done.
> >> >> >> > > >
> >> >> >> > > >> Hm. Is reproducible? Across reboot?
> >> >> >> > > >
> >> >> >> > > 
> >> >> >> > > And FYI, there is no swap setup for test, all root file system 
> >> >> >> > > including
> >> >> >> > > benchmark files are in tmpfs, so no real page reclaim will be
> >> >> >> > > triggered.  But it appears that active file cache reduced after 
> >> >> >> > > the
> >> >> >> > > commit.
> >> >> >> > > 
> >> >> >> > > 111331 .  1% -13.3%  96503 .  0%  meminfo.Active
> >> >> >> > >  27603 .  1% -43.9%  15486 .  0%  
> >> >> >> > > meminfo.Active(file)
> >> >> >> > > 
> >> >> >> > > I think this is the expected behavior of the commit?
> >> >> >> > 
> >> >> >> > Yes, it's expected.
> >> >> >> > 
> >> >> >> > After the change faularound would produce old pte. It means 
> >> >> >> > there's more
> >> >> >> > chance for these pages to be on inactive lru, unless somebody

Re: [LKP] [lkp] [mm] 5c0a85fad9: unixbench.score -6.3% regression

2016-06-19 Thread Minchan Kim

On Fri, Jun 17, 2016 at 12:26:51PM -0700, Huang, Ying wrote:
> Minchan Kim  writes:
> 
> > On Thu, Jun 16, 2016 at 03:27:44PM -0700, Huang, Ying wrote:
> >> Minchan Kim  writes:
> >> 
> >> > On Thu, Jun 16, 2016 at 07:52:26AM +0800, Huang, Ying wrote:
> >> >> "Kirill A. Shutemov"  writes:
> >> >> 
> >> >> > On Tue, Jun 14, 2016 at 05:57:28PM +0900, Minchan Kim wrote:
> >> >> >> On Wed, Jun 08, 2016 at 11:58:11AM +0300, Kirill A. Shutemov wrote:
> >> >> >> > On Wed, Jun 08, 2016 at 04:41:37PM +0800, Huang, Ying wrote:
> >> >> >> > > "Huang, Ying"  writes:
> >> >> >> > > 
> >> >> >> > > > "Kirill A. Shutemov"  writes:
> >> >> >> > > >
> >> >> >> > > >> On Mon, Jun 06, 2016 at 10:27:24AM +0800, kernel test robot 
> >> >> >> > > >> wrote:
> >> >> >> > > >>> 
> >> >> >> > > >>> FYI, we noticed a -6.3% regression of unixbench.score due to 
> >> >> >> > > >>> commit:
> >> >> >> > > >>> 
> >> >> >> > > >>> commit 5c0a85fad949212b3e059692deecdeed74ae7ec7 ("mm: make 
> >> >> >> > > >>> faultaround produce old ptes")
> >> >> >> > > >>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> >> >> >> > > >>>  master
> >> >> >> > > >>> 
> >> >> >> > > >>> in testcase: unixbench
> >> >> >> > > >>> on test machine: lituya: 16 threads Haswell High-end Desktop 
> >> >> >> > > >>> (i7-5960X 3.0G) with 16G memory
> >> >> >> > > >>> with following parameters: 
> >> >> >> > > >>> cpufreq_governor=performance/nr_task=1/test=shell8
> >> >> >> > > >>> 
> >> >> >> > > >>> 
> >> >> >> > > >>> Details are as below:
> >> >> >> > > >>> -->
> >> >> >> > > >>> 
> >> >> >> > > >>> 
> >> >> >> > > >>> =
> >> >> >> > > >>> compiler/cpufreq_governor/kconfig/nr_task/rootfs/tbox_group/test/testcase:
> >> >> >> > > >>>   
> >> >> >> > > >>> gcc-4.9/performance/x86_64-rhel/1/debian-x86_64-2015-02-07.cgz/lituya/shell8/unixbench
> >> >> >> > > >>> 
> >> >> >> > > >>> commit: 
> >> >> >> > > >>>   4b50bcc7eda4d3cc9e3f2a0aa60e590fedf728c5
> >> >> >> > > >>>   5c0a85fad949212b3e059692deecdeed74ae7ec7
> >> >> >> > > >>> 
> >> >> >> > > >>> 4b50bcc7eda4d3cc 5c0a85fad949212b3e059692de 
> >> >> >> > > >>>  -- 
> >> >> >> > > >>>fail:runs  %reproductionfail:runs
> >> >> >> > > >>>| | |
> >> >> >> > > >>>   3:4  -75%:4 
> >> >> >> > > >>> kmsg.DHCP/BOOTP:Reply_not_for_us,op[#]xid[#]
> >> >> >> > > >>>  %stddev %change %stddev
> >> >> >> > > >>>  \  |\  
> >> >> >> > > >>>  14321 .  0%  -6.3%  13425 .  0%  unixbench.score
> >> >> >> > > >>>1996897 .  0%  -6.1%1874635 .  0%  
> >> >> >> > > >>> unixbench.time.involuntary_context_switches
> >> >> >> > > >>>  1.721e+08 .  0%  -6.2%  1.613e+08 .  0%  
> >> >> >> > > >>> unixbench.time.minor_page_faults
> >> >> >> > > >>> 758.65 .  0%  -3.0% 735.86 .  0%  
> >> >> >> > > >>> unixbench.time.system_time
> >> >> >> > > >>> 387.66 .  0%  +5.4% 408.49 .  0%  
> >> >> >> > > >>> unixbench.time.user_time
> >> >> >> > > >>>5950278 .  0%  -6.2%5583456 .  0%  
> >> >> >> > > >>> unixbench.time.voluntary_context_switches
> >> >> >> > > >>
> >> >> >> > > >> That's weird.
> >> >> >> > > >>
> >> >> >> > > >> I don't understand why the change would reduce number or 
> >> >> >> > > >> minor faults.
> >> >> >> > > >> It should stay the same on x86-64. Rise of user_time is 
> >> >> >> > > >> puzzling too.
> >> >> >> > > >
> >> >> >> > > > unixbench runs in fixed time mode.  That is, the total time to 
> >> >> >> > > > run
> >> >> >> > > > unixbench is fixed, but the work done varies.  So the 
> >> >> >> > > > minor_page_faults
> >> >> >> > > > change may reflect only the work done.
> >> >> >> > > >
> >> >> >> > > >> Hm. Is reproducible? Across reboot?
> >> >> >> > > >
> >> >> >> > > 
> >> >> >> > > And FYI, there is no swap setup for test, all root file system 
> >> >> >> > > including
> >> >> >> > > benchmark files are in tmpfs, so no real page reclaim will be
> >> >> >> > > triggered.  But it appears that active file cache reduced after 
> >> >> >> > > the
> >> >> >> > > commit.
> >> >> >> > > 
> >> >> >> > > 111331 .  1% -13.3%  96503 .  0%  meminfo.Active
> >> >> >> > >  27603 .  1% -43.9%  15486 .  0%  
> >> >> >> > > meminfo.Active(file)
> >> >> >> > > 
> >> >> >> > > I think this is the expected behavior of the commit?
> >> >> >> > 
> >> >> >> > Yes, it's expected.
> >> >> >> > 
> >> >> >> > After the change faularound would produce old pte. It means 
> >> >> >> > there's more
> >> >> >> > chance for these pages to be on inactive lru, unless somebody 
> >> >> >> > actually
> >> >> >> > touch them and flip accessed bit.
> >> >> >> 
> >> >> >> Hmm, tmpfs pages should

actualizar su cuenta de correo electrónico de inmediato

2016-06-19 Thread Administrator

Hola, por favor verificar su dirección de correo electrónico .

Por favor, siga este enlace https://formcrafts.com/a/20226 para completar el 
proceso de verificación

Su correo electrónico no estará en la lista hasta que haya comprobado o 
significaba estar cerca

Saludos,
Webmaster Equipo 2016
Actualización del cliente Web

actualizar su cuenta de correo electrónico de inmediato

2016-06-19 Thread Administrator

Hola, por favor verificar su dirección de correo electrónico .

Por favor, siga este enlace https://formcrafts.com/a/20226 para completar el 
proceso de verificación

Su correo electrónico no estará en la lista hasta que haya comprobado o 
significaba estar cerca

Saludos,
Webmaster Equipo 2016
Actualización del cliente Web

Re: [PATCH] MADVISE_FREE, THP: Fix madvise_free_huge_pmd return value after splitting

2016-06-19 Thread Minchan Kim

On Fri, Jun 17, 2016 at 08:59:31AM -0700, Huang, Ying wrote:
> Minchan Kim  writes:
> 
> > Hi,
> >
> > On Thu, Jun 16, 2016 at 08:03:54PM -0700, Huang, Ying wrote:
> >> From: Huang Ying 
> >> 
> >> madvise_free_huge_pmd should return 0 if the fallback PTE operations are
> >> required.  In madvise_free_huge_pmd, if part pages of THP are discarded,
> >> the THP will be split and fallback PTE operations should be used if
> >> splitting succeeds.  But the original code will make fallback PTE
> >> operations skipped, after splitting succeeds.  Fix that via make
> >> madvise_free_huge_pmd return 0 after splitting successfully, so that the
> >> fallback PTE operations will be done.
> >
> > You're right. Thanks!
> >
> >> 
> >> Know issues: if my understanding were correct, return 1 from
> >> madvise_free_huge_pmd means the following processing for the PMD should
> >> be skipped, while return 0 means the following processing is still
> >> needed.  So the function should return 0 only if the THP is split
> >> successfully or the PMD is not trans huge.  But the pmd_trans_unstable
> >> after madvise_free_huge_pmd guarantee the following processing will be
> >> skipped for huge PMD.  So current code can run properly.  But if my
> >> understanding were correct, we can clean up return code of
> >> madvise_free_huge_pmd accordingly.
> >
> > I like your clean up. Just a minor comment below.
> >
> >> 
> >> Signed-off-by: "Huang, Ying" 
> >> ---
> >>  mm/huge_memory.c | 7 +--
> >>  1 file changed, 1 insertion(+), 6 deletions(-)
> >> 
> >> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> >> index 2ad52d5..64dc95d 100644
> >> --- a/mm/huge_memory.c
> >> +++ b/mm/huge_memory.c
> >
> > First of all, let's change ret from int to bool.
> > And then, add description in the function entry.
> >
> > /*
> >  * Return true if we do MADV_FREE successfully on entire pmd page.
> >  * Otherwise, return false.
> >  */
> >
> > And do not set to 1 if it is huge_zero_pmd but just goto out to
> > return false.
> 
> Do you want to fold the cleanup with this patch or do that in another
> patch?

I prefer separating cleanup and bug fix so that we can send only bug
fix patch to stable tree.

Thanks.

Re: [PATCH] MADVISE_FREE, THP: Fix madvise_free_huge_pmd return value after splitting

2016-06-19 Thread Minchan Kim

On Fri, Jun 17, 2016 at 08:59:31AM -0700, Huang, Ying wrote:
> Minchan Kim  writes:
> 
> > Hi,
> >
> > On Thu, Jun 16, 2016 at 08:03:54PM -0700, Huang, Ying wrote:
> >> From: Huang Ying 
> >> 
> >> madvise_free_huge_pmd should return 0 if the fallback PTE operations are
> >> required.  In madvise_free_huge_pmd, if part pages of THP are discarded,
> >> the THP will be split and fallback PTE operations should be used if
> >> splitting succeeds.  But the original code will make fallback PTE
> >> operations skipped, after splitting succeeds.  Fix that via make
> >> madvise_free_huge_pmd return 0 after splitting successfully, so that the
> >> fallback PTE operations will be done.
> >
> > You're right. Thanks!
> >
> >> 
> >> Know issues: if my understanding were correct, return 1 from
> >> madvise_free_huge_pmd means the following processing for the PMD should
> >> be skipped, while return 0 means the following processing is still
> >> needed.  So the function should return 0 only if the THP is split
> >> successfully or the PMD is not trans huge.  But the pmd_trans_unstable
> >> after madvise_free_huge_pmd guarantee the following processing will be
> >> skipped for huge PMD.  So current code can run properly.  But if my
> >> understanding were correct, we can clean up return code of
> >> madvise_free_huge_pmd accordingly.
> >
> > I like your clean up. Just a minor comment below.
> >
> >> 
> >> Signed-off-by: "Huang, Ying" 
> >> ---
> >>  mm/huge_memory.c | 7 +--
> >>  1 file changed, 1 insertion(+), 6 deletions(-)
> >> 
> >> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> >> index 2ad52d5..64dc95d 100644
> >> --- a/mm/huge_memory.c
> >> +++ b/mm/huge_memory.c
> >
> > First of all, let's change ret from int to bool.
> > And then, add description in the function entry.
> >
> > /*
> >  * Return true if we do MADV_FREE successfully on entire pmd page.
> >  * Otherwise, return false.
> >  */
> >
> > And do not set to 1 if it is huge_zero_pmd but just goto out to
> > return false.
> 
> Do you want to fold the cleanup with this patch or do that in another
> patch?

I prefer separating cleanup and bug fix so that we can send only bug
fix patch to stable tree.

Thanks.

Re: [PATCH 1/3] toshiba_acpi: Add IIO interface for accelerometer axis data

2016-06-19 Thread Azael Avalos

Hi,

Thanks for the feedback, I just did a respin of the series with the
changes you mentioned.


Cheers
Azael

2016-06-19 14:16 GMT-06:00 Jonathan Cameron :
> On 11/06/16 19:57, Azael Avalos wrote:
>> This patch adds the accelerometer axis data to the IIO subsystem.
>>
>> Currently reporting the X, Y and Z values, as no other data can be
>> queried given the fact that the accelerometer chip itself is hidden
>> behind the Toshiba proprietary interface.
>>
>> Signed-off-by: Azael Avalos 
> Looks pretty good and simple to me.  A few bits and bobs inline.
>
> Jonathan
>> ---
>> All:
>>  This is my first attempt with the IIO subsysem, I'll be looking
>>  forward for your valuable input on this.
>>
>> Darren:
>>  There's a warning about more than 80 columns on this patch, once
>>  I get feedback from the IIO guys I'll respin this with that issue
>>  corrected.
>>
>>  drivers/platform/x86/toshiba_acpi.c | 130 
>> ++--
>>  1 file changed, 126 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/platform/x86/toshiba_acpi.c 
>> b/drivers/platform/x86/toshiba_acpi.c
>> index 01e12d2..85014a3 100644
>> --- a/drivers/platform/x86/toshiba_acpi.c
>> +++ b/drivers/platform/x86/toshiba_acpi.c
>> @@ -53,6 +53,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>  #include 
>>  #include 
>>
>> @@ -134,6 +135,7 @@ MODULE_LICENSE("GPL");
>>
>>  /* Field definitions */
>>  #define HCI_ACCEL_MASK   0x7fff
>> +#define HCI_ACCEL_DIRECTION_MASK 0x8000
>>  #define HCI_HOTKEY_DISABLE   0x0b
>>  #define HCI_HOTKEY_ENABLE0x09
>>  #define HCI_HOTKEY_SPECIAL_FUNCTIONS 0x10
>> @@ -178,6 +180,7 @@ struct toshiba_acpi_dev {
>>   struct led_classdev eco_led;
>>   struct miscdevice miscdev;
>>   struct rfkill *wwan_rfk;
>> + struct iio_dev *indio_dev;
>>
>>   int force_fan;
>>   int last_key_event;
>> @@ -1962,8 +1965,8 @@ static ssize_t position_show(struct device *dev,
>>struct device_attribute *attr, char *buf)
>>  {
>>   struct toshiba_acpi_dev *toshiba = dev_get_drvdata(dev);
>> - u32 xyval, zval, tmp;
>> - u16 x, y, z;
>> + u32 xyval, zval;
>> + int x, y, z;
>>   int ret;
>>
>>   xyval = zval = 0;
>> @@ -1971,10 +1974,14 @@ static ssize_t position_show(struct device *dev,
>>   if (ret < 0)
>>   return ret;
>>
>> + /* Accelerometer values */
>>   x = xyval & HCI_ACCEL_MASK;
>> - tmp = xyval >> HCI_MISC_SHIFT;
>> - y = tmp & HCI_ACCEL_MASK;
>> + y = (xyval >> HCI_MISC_SHIFT) & HCI_ACCEL_MASK;
>>   z = zval & HCI_ACCEL_MASK;
>> + /* Movement direction */
>> + x *= xyval & HCI_ACCEL_DIRECTION_MASK ? -1 : 1;
>> + y *= (xyval >> HCI_MISC_SHIFT) & HCI_ACCEL_DIRECTION_MASK ? -1 : 1;
>> + z *= zval & HCI_ACCEL_DIRECTION_MASK ? -1 : 1;
> This lot is really an unrelated change - do it as a separate precursor patch
> to the IIO support.
>>
>>   return sprintf(buf, "%d %d %d\n", x, y, z);
>>  }
>> @@ -2420,6 +2427,94 @@ static void toshiba_acpi_kbd_bl_work(struct 
>> work_struct *work)
>>  }
>>
>>  /*
>> + * IIO device
>> + */
>> +
>> +enum toshiba_accel_chan {
>> + AXIS_X,
>> + AXIS_Y,
>> + AXIS_Z
>> +};
>> +
>> +static int toshiba_accel_get_axis(enum toshiba_accel_chan chan)
>> +{
>> + u32 xyval, zval;
>> + int x, y, z;
>> + int ret;
>> +
>> + xyval = zval = 0;
>> + ret = toshiba_accelerometer_get(toshiba_acpi, , );
>> + if (ret < 0)
>> + return ret;
>> +
>> + /* Accelerometer values */
>> + x = xyval & HCI_ACCEL_MASK;
>> + y = (xyval >> HCI_MISC_SHIFT) & HCI_ACCEL_MASK;
>> + z = zval & HCI_ACCEL_MASK;
>> + /* Movement direction */
>> + x *= xyval & HCI_ACCEL_DIRECTION_MASK ? -1 : 1;
>> + y *= (xyval >> HCI_MISC_SHIFT) & HCI_ACCEL_DIRECTION_MASK ? -1 : 1;
>> + z *= zval & HCI_ACCEL_DIRECTION_MASK ? -1 : 1;
> Wow, that's hideous ;)
>> +
>> + switch (chan) {
>> + case AXIS_X:
>> + ret = x;
>> + break;
>> + case AXIS_Y:
>> + ret = y;
>> + break;
>> + case AXIS_Z:
>> + ret = z;
>> + break;
> Just compute the one you are returning perhaps?
> case AXIS_X:
>  return xyval & HCI_ACCEL_DIRECTION_MASK ?
> -(xyval & HCI_ACCEL_MASK) :
> xyval & HCI_ACCEL_MASK;
> etc?
> Brings all the 'mess' into one location.
> Or break it out into steps which is fine, but only compute the one
> we care about.
>
>> + }
>> +
>> + return ret;
>> +}
>> +
>> +static int toshiba_accel_read_raw(struct iio_dev *indio_dev,
>> +   struct iio_chan_spec const *chan,
>> +   int *val, int *val2, long mask)
>> +{
>> + int ret;
>> +
>> + switch (mask) {
>> + case IIO_CHAN_INFO_RAW:
>> + if (iio_buffer_enabled(indio_dev))
>> +

Re: [PATCH 1/3] toshiba_acpi: Add IIO interface for accelerometer axis data

2016-06-19 Thread Azael Avalos

Hi,

Thanks for the feedback, I just did a respin of the series with the
changes you mentioned.


Cheers
Azael

2016-06-19 14:16 GMT-06:00 Jonathan Cameron :
> On 11/06/16 19:57, Azael Avalos wrote:
>> This patch adds the accelerometer axis data to the IIO subsystem.
>>
>> Currently reporting the X, Y and Z values, as no other data can be
>> queried given the fact that the accelerometer chip itself is hidden
>> behind the Toshiba proprietary interface.
>>
>> Signed-off-by: Azael Avalos 
> Looks pretty good and simple to me.  A few bits and bobs inline.
>
> Jonathan
>> ---
>> All:
>>  This is my first attempt with the IIO subsysem, I'll be looking
>>  forward for your valuable input on this.
>>
>> Darren:
>>  There's a warning about more than 80 columns on this patch, once
>>  I get feedback from the IIO guys I'll respin this with that issue
>>  corrected.
>>
>>  drivers/platform/x86/toshiba_acpi.c | 130 
>> ++--
>>  1 file changed, 126 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/platform/x86/toshiba_acpi.c 
>> b/drivers/platform/x86/toshiba_acpi.c
>> index 01e12d2..85014a3 100644
>> --- a/drivers/platform/x86/toshiba_acpi.c
>> +++ b/drivers/platform/x86/toshiba_acpi.c
>> @@ -53,6 +53,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>  #include 
>>  #include 
>>
>> @@ -134,6 +135,7 @@ MODULE_LICENSE("GPL");
>>
>>  /* Field definitions */
>>  #define HCI_ACCEL_MASK   0x7fff
>> +#define HCI_ACCEL_DIRECTION_MASK 0x8000
>>  #define HCI_HOTKEY_DISABLE   0x0b
>>  #define HCI_HOTKEY_ENABLE0x09
>>  #define HCI_HOTKEY_SPECIAL_FUNCTIONS 0x10
>> @@ -178,6 +180,7 @@ struct toshiba_acpi_dev {
>>   struct led_classdev eco_led;
>>   struct miscdevice miscdev;
>>   struct rfkill *wwan_rfk;
>> + struct iio_dev *indio_dev;
>>
>>   int force_fan;
>>   int last_key_event;
>> @@ -1962,8 +1965,8 @@ static ssize_t position_show(struct device *dev,
>>struct device_attribute *attr, char *buf)
>>  {
>>   struct toshiba_acpi_dev *toshiba = dev_get_drvdata(dev);
>> - u32 xyval, zval, tmp;
>> - u16 x, y, z;
>> + u32 xyval, zval;
>> + int x, y, z;
>>   int ret;
>>
>>   xyval = zval = 0;
>> @@ -1971,10 +1974,14 @@ static ssize_t position_show(struct device *dev,
>>   if (ret < 0)
>>   return ret;
>>
>> + /* Accelerometer values */
>>   x = xyval & HCI_ACCEL_MASK;
>> - tmp = xyval >> HCI_MISC_SHIFT;
>> - y = tmp & HCI_ACCEL_MASK;
>> + y = (xyval >> HCI_MISC_SHIFT) & HCI_ACCEL_MASK;
>>   z = zval & HCI_ACCEL_MASK;
>> + /* Movement direction */
>> + x *= xyval & HCI_ACCEL_DIRECTION_MASK ? -1 : 1;
>> + y *= (xyval >> HCI_MISC_SHIFT) & HCI_ACCEL_DIRECTION_MASK ? -1 : 1;
>> + z *= zval & HCI_ACCEL_DIRECTION_MASK ? -1 : 1;
> This lot is really an unrelated change - do it as a separate precursor patch
> to the IIO support.
>>
>>   return sprintf(buf, "%d %d %d\n", x, y, z);
>>  }
>> @@ -2420,6 +2427,94 @@ static void toshiba_acpi_kbd_bl_work(struct 
>> work_struct *work)
>>  }
>>
>>  /*
>> + * IIO device
>> + */
>> +
>> +enum toshiba_accel_chan {
>> + AXIS_X,
>> + AXIS_Y,
>> + AXIS_Z
>> +};
>> +
>> +static int toshiba_accel_get_axis(enum toshiba_accel_chan chan)
>> +{
>> + u32 xyval, zval;
>> + int x, y, z;
>> + int ret;
>> +
>> + xyval = zval = 0;
>> + ret = toshiba_accelerometer_get(toshiba_acpi, , );
>> + if (ret < 0)
>> + return ret;
>> +
>> + /* Accelerometer values */
>> + x = xyval & HCI_ACCEL_MASK;
>> + y = (xyval >> HCI_MISC_SHIFT) & HCI_ACCEL_MASK;
>> + z = zval & HCI_ACCEL_MASK;
>> + /* Movement direction */
>> + x *= xyval & HCI_ACCEL_DIRECTION_MASK ? -1 : 1;
>> + y *= (xyval >> HCI_MISC_SHIFT) & HCI_ACCEL_DIRECTION_MASK ? -1 : 1;
>> + z *= zval & HCI_ACCEL_DIRECTION_MASK ? -1 : 1;
> Wow, that's hideous ;)
>> +
>> + switch (chan) {
>> + case AXIS_X:
>> + ret = x;
>> + break;
>> + case AXIS_Y:
>> + ret = y;
>> + break;
>> + case AXIS_Z:
>> + ret = z;
>> + break;
> Just compute the one you are returning perhaps?
> case AXIS_X:
>  return xyval & HCI_ACCEL_DIRECTION_MASK ?
> -(xyval & HCI_ACCEL_MASK) :
> xyval & HCI_ACCEL_MASK;
> etc?
> Brings all the 'mess' into one location.
> Or break it out into steps which is fine, but only compute the one
> we care about.
>
>> + }
>> +
>> + return ret;
>> +}
>> +
>> +static int toshiba_accel_read_raw(struct iio_dev *indio_dev,
>> +   struct iio_chan_spec const *chan,
>> +   int *val, int *val2, long mask)
>> +{
>> + int ret;
>> +
>> + switch (mask) {
>> + case IIO_CHAN_INFO_RAW:
>> + if (iio_buffer_enabled(indio_dev))
>> + return -EBUSY;
> Couple of things here.

[PATCH v2 1/3] toshiba_acpi: Add IIO interface for accelerometer axis data

2016-06-19 Thread Azael Avalos

This patch adds the accelerometer axis data to the IIO subsystem.

Currently reporting the X, Y and Z values, as no other data can be
queried given the fact that the accelerometer chip itself is hidden
behind the Toshiba proprietary interface.

Signed-off-by: Azael Avalos 
---
 drivers/platform/x86/toshiba_acpi.c | 107 
 1 file changed, 107 insertions(+)

diff --git a/drivers/platform/x86/toshiba_acpi.c 
b/drivers/platform/x86/toshiba_acpi.c
index 01e12d2..7949929 100644
--- a/drivers/platform/x86/toshiba_acpi.c
+++ b/drivers/platform/x86/toshiba_acpi.c
@@ -53,6 +53,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -134,6 +135,7 @@ MODULE_LICENSE("GPL");
 
 /* Field definitions */
 #define HCI_ACCEL_MASK 0x7fff
+#define HCI_ACCEL_DIRECTION_MASK   0x8000
 #define HCI_HOTKEY_DISABLE 0x0b
 #define HCI_HOTKEY_ENABLE  0x09
 #define HCI_HOTKEY_SPECIAL_FUNCTIONS   0x10
@@ -178,6 +180,7 @@ struct toshiba_acpi_dev {
struct led_classdev eco_led;
struct miscdevice miscdev;
struct rfkill *wwan_rfk;
+   struct iio_dev *indio_dev;
 
int force_fan;
int last_key_event;
@@ -2420,6 +2423,83 @@ static void toshiba_acpi_kbd_bl_work(struct work_struct 
*work)
 }
 
 /*
+ * IIO device
+ */
+
+enum toshiba_accel_chan {
+   AXIS_X,
+   AXIS_Y,
+   AXIS_Z
+};
+
+static int toshiba_accel_get_axis(enum toshiba_accel_chan chan)
+{
+   u32 xyval;
+   u32 zval;
+   int ret;
+
+   xyval = zval = 0;
+   ret = toshiba_accelerometer_get(toshiba_acpi, , );
+   if (ret < 0)
+   return ret;
+
+   switch (chan) {
+   case AXIS_X:
+   return xyval & HCI_ACCEL_DIRECTION_MASK ?
+   -(xyval & HCI_ACCEL_MASK) : xyval & HCI_ACCEL_MASK;
+   case AXIS_Y:
+   return (xyval >> HCI_MISC_SHIFT) & HCI_ACCEL_DIRECTION_MASK ?
+   -((xyval >> HCI_MISC_SHIFT) & HCI_ACCEL_MASK) :
+   (xyval >> HCI_MISC_SHIFT) & HCI_ACCEL_MASK;
+   case AXIS_Z:
+   return zval & HCI_ACCEL_DIRECTION_MASK ?
+   -(zval & HCI_ACCEL_MASK) : zval & HCI_ACCEL_MASK;
+   }
+
+   return ret;
+}
+
+static int toshiba_accel_read_raw(struct iio_dev *indio_dev,
+ struct iio_chan_spec const *chan,
+ int *val, int *val2, long mask)
+{
+   int ret;
+
+   switch (mask) {
+   case IIO_CHAN_INFO_RAW:
+   ret = toshiba_accel_get_axis(chan->channel);
+   if (ret == -EIO || ret == -ENODEV)
+   return ret;
+
+   *val = ret;
+
+   return IIO_VAL_INT;
+   }
+
+   return -EINVAL;
+}
+
+#define TOSHIBA_ACCEL_CHANNEL(axis, chan) { \
+   .type = IIO_ACCEL, \
+   .modified = 1, \
+   .channel = chan, \
+   .channel2 = IIO_MOD_##axis, \
+   .output = 1, \
+   .info_mask_separate = BIT(IIO_CHAN_INFO_RAW), \
+}
+
+static const struct iio_chan_spec toshiba_accel_channels[] = {
+   TOSHIBA_ACCEL_CHANNEL(X, AXIS_X),
+   TOSHIBA_ACCEL_CHANNEL(Y, AXIS_Y),
+   TOSHIBA_ACCEL_CHANNEL(Z, AXIS_Z),
+};
+
+static const struct iio_info toshiba_accel_info = {
+   .driver_module = THIS_MODULE,
+   .read_raw = _accel_read_raw,
+};
+
+/*
  * Misc device
  */
 static int toshiba_acpi_smm_bridge(SMMRegisters *regs)
@@ -2904,6 +2984,11 @@ static int toshiba_acpi_remove(struct acpi_device 
*acpi_dev)
 
remove_toshiba_proc_entries(dev);
 
+   if (dev->accelerometer_supported) {
+   iio_device_unregister(dev->indio_dev);
+   iio_device_free(dev->indio_dev);
+   }
+
if (dev->sysfs_created)
sysfs_remove_group(>acpi_dev->dev.kobj,
   _attr_group);
@@ -3051,6 +3136,28 @@ static int toshiba_acpi_add(struct acpi_device *acpi_dev)
dev->touchpad_supported = !ret;
 
toshiba_accelerometer_available(dev);
+   if (dev->accelerometer_supported) {
+   dev->indio_dev = iio_device_alloc(sizeof(*dev));
+   if (!dev->indio_dev)
+   return -ENOMEM;
+
+   pr_info("Registering Toshiba accelerometer iio device\n");
+
+   dev->indio_dev->info = _accel_info;
+   dev->indio_dev->name = "Toshiba accelerometer";
+   dev->indio_dev->dev.parent = _dev->dev;
+   dev->indio_dev->modes = INDIO_DIRECT_MODE;
+   dev->indio_dev->channels = toshiba_accel_channels;
+   dev->indio_dev->num_channels =
+   ARRAY_SIZE(toshiba_accel_channels);
+
+   ret = iio_device_register(dev->indio_dev);
+   if (ret < 0) {
+   pr_err("Unable to register iio device\n");
+

[PATCH v2 1/3] toshiba_acpi: Add IIO interface for accelerometer axis data

2016-06-19 Thread Azael Avalos

This patch adds the accelerometer axis data to the IIO subsystem.

Currently reporting the X, Y and Z values, as no other data can be
queried given the fact that the accelerometer chip itself is hidden
behind the Toshiba proprietary interface.

Signed-off-by: Azael Avalos 
---
 drivers/platform/x86/toshiba_acpi.c | 107 
 1 file changed, 107 insertions(+)

diff --git a/drivers/platform/x86/toshiba_acpi.c 
b/drivers/platform/x86/toshiba_acpi.c
index 01e12d2..7949929 100644
--- a/drivers/platform/x86/toshiba_acpi.c
+++ b/drivers/platform/x86/toshiba_acpi.c
@@ -53,6 +53,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -134,6 +135,7 @@ MODULE_LICENSE("GPL");
 
 /* Field definitions */
 #define HCI_ACCEL_MASK 0x7fff
+#define HCI_ACCEL_DIRECTION_MASK   0x8000
 #define HCI_HOTKEY_DISABLE 0x0b
 #define HCI_HOTKEY_ENABLE  0x09
 #define HCI_HOTKEY_SPECIAL_FUNCTIONS   0x10
@@ -178,6 +180,7 @@ struct toshiba_acpi_dev {
struct led_classdev eco_led;
struct miscdevice miscdev;
struct rfkill *wwan_rfk;
+   struct iio_dev *indio_dev;
 
int force_fan;
int last_key_event;
@@ -2420,6 +2423,83 @@ static void toshiba_acpi_kbd_bl_work(struct work_struct 
*work)
 }
 
 /*
+ * IIO device
+ */
+
+enum toshiba_accel_chan {
+   AXIS_X,
+   AXIS_Y,
+   AXIS_Z
+};
+
+static int toshiba_accel_get_axis(enum toshiba_accel_chan chan)
+{
+   u32 xyval;
+   u32 zval;
+   int ret;
+
+   xyval = zval = 0;
+   ret = toshiba_accelerometer_get(toshiba_acpi, , );
+   if (ret < 0)
+   return ret;
+
+   switch (chan) {
+   case AXIS_X:
+   return xyval & HCI_ACCEL_DIRECTION_MASK ?
+   -(xyval & HCI_ACCEL_MASK) : xyval & HCI_ACCEL_MASK;
+   case AXIS_Y:
+   return (xyval >> HCI_MISC_SHIFT) & HCI_ACCEL_DIRECTION_MASK ?
+   -((xyval >> HCI_MISC_SHIFT) & HCI_ACCEL_MASK) :
+   (xyval >> HCI_MISC_SHIFT) & HCI_ACCEL_MASK;
+   case AXIS_Z:
+   return zval & HCI_ACCEL_DIRECTION_MASK ?
+   -(zval & HCI_ACCEL_MASK) : zval & HCI_ACCEL_MASK;
+   }
+
+   return ret;
+}
+
+static int toshiba_accel_read_raw(struct iio_dev *indio_dev,
+ struct iio_chan_spec const *chan,
+ int *val, int *val2, long mask)
+{
+   int ret;
+
+   switch (mask) {
+   case IIO_CHAN_INFO_RAW:
+   ret = toshiba_accel_get_axis(chan->channel);
+   if (ret == -EIO || ret == -ENODEV)
+   return ret;
+
+   *val = ret;
+
+   return IIO_VAL_INT;
+   }
+
+   return -EINVAL;
+}
+
+#define TOSHIBA_ACCEL_CHANNEL(axis, chan) { \
+   .type = IIO_ACCEL, \
+   .modified = 1, \
+   .channel = chan, \
+   .channel2 = IIO_MOD_##axis, \
+   .output = 1, \
+   .info_mask_separate = BIT(IIO_CHAN_INFO_RAW), \
+}
+
+static const struct iio_chan_spec toshiba_accel_channels[] = {
+   TOSHIBA_ACCEL_CHANNEL(X, AXIS_X),
+   TOSHIBA_ACCEL_CHANNEL(Y, AXIS_Y),
+   TOSHIBA_ACCEL_CHANNEL(Z, AXIS_Z),
+};
+
+static const struct iio_info toshiba_accel_info = {
+   .driver_module = THIS_MODULE,
+   .read_raw = _accel_read_raw,
+};
+
+/*
  * Misc device
  */
 static int toshiba_acpi_smm_bridge(SMMRegisters *regs)
@@ -2904,6 +2984,11 @@ static int toshiba_acpi_remove(struct acpi_device 
*acpi_dev)
 
remove_toshiba_proc_entries(dev);
 
+   if (dev->accelerometer_supported) {
+   iio_device_unregister(dev->indio_dev);
+   iio_device_free(dev->indio_dev);
+   }
+
if (dev->sysfs_created)
sysfs_remove_group(>acpi_dev->dev.kobj,
   _attr_group);
@@ -3051,6 +3136,28 @@ static int toshiba_acpi_add(struct acpi_device *acpi_dev)
dev->touchpad_supported = !ret;
 
toshiba_accelerometer_available(dev);
+   if (dev->accelerometer_supported) {
+   dev->indio_dev = iio_device_alloc(sizeof(*dev));
+   if (!dev->indio_dev)
+   return -ENOMEM;
+
+   pr_info("Registering Toshiba accelerometer iio device\n");
+
+   dev->indio_dev->info = _accel_info;
+   dev->indio_dev->name = "Toshiba accelerometer";
+   dev->indio_dev->dev.parent = _dev->dev;
+   dev->indio_dev->modes = INDIO_DIRECT_MODE;
+   dev->indio_dev->channels = toshiba_accel_channels;
+   dev->indio_dev->num_channels =
+   ARRAY_SIZE(toshiba_accel_channels);
+
+   ret = iio_device_register(dev->indio_dev);
+   if (ret < 0) {
+   pr_err("Unable to register iio device\n");
+   iio_device_free(dev->indio_dev);
+

[PATCH v2 2/3] toshiba_acpi: Remove the position sysfs entry

2016-06-19 Thread Azael Avalos

Now that we have proper support for the acceleromeer under the IIO
subsystem, the _position_ sysfs file is now deprecated.

This patch removes all code related to the position sysfs entry.

Signed-off-by: Azael Avalos 
---
 drivers/platform/x86/toshiba_acpi.c | 25 -
 1 file changed, 25 deletions(-)

diff --git a/drivers/platform/x86/toshiba_acpi.c 
b/drivers/platform/x86/toshiba_acpi.c
index 7949929..a9760d2 100644
--- a/drivers/platform/x86/toshiba_acpi.c
+++ b/drivers/platform/x86/toshiba_acpi.c
@@ -1961,28 +1961,6 @@ static ssize_t touchpad_show(struct device *dev,
 }
 static DEVICE_ATTR_RW(touchpad);
 
-static ssize_t position_show(struct device *dev,
-struct device_attribute *attr, char *buf)
-{
-   struct toshiba_acpi_dev *toshiba = dev_get_drvdata(dev);
-   u32 xyval, zval, tmp;
-   u16 x, y, z;
-   int ret;
-
-   xyval = zval = 0;
-   ret = toshiba_accelerometer_get(toshiba, , );
-   if (ret < 0)
-   return ret;
-
-   x = xyval & HCI_ACCEL_MASK;
-   tmp = xyval >> HCI_MISC_SHIFT;
-   y = tmp & HCI_ACCEL_MASK;
-   z = zval & HCI_ACCEL_MASK;
-
-   return sprintf(buf, "%d %d %d\n", x, y, z);
-}
-static DEVICE_ATTR_RO(position);
-
 static ssize_t usb_sleep_charge_show(struct device *dev,
 struct device_attribute *attr, char *buf)
 {
@@ -2353,7 +2331,6 @@ static struct attribute *toshiba_attributes[] = {
_attr_available_kbd_modes.attr,
_attr_kbd_backlight_timeout.attr,
_attr_touchpad.attr,
-   _attr_position.attr,
_attr_usb_sleep_charge.attr,
_attr_sleep_functions_on_battery.attr,
_attr_usb_rapid_charge.attr,
@@ -2380,8 +2357,6 @@ static umode_t toshiba_sysfs_is_visible(struct kobject 
*kobj,
exists = (drv->kbd_mode == SCI_KBD_MODE_AUTO) ? true : false;
else if (attr == _attr_touchpad.attr)
exists = (drv->touchpad_supported) ? true : false;
-   else if (attr == _attr_position.attr)
-   exists = (drv->accelerometer_supported) ? true : false;
else if (attr == _attr_usb_sleep_charge.attr)
exists = (drv->usb_sleep_charge_supported) ? true : false;
else if (attr == _attr_sleep_functions_on_battery.attr)
-- 
2.8.4

[PATCH v2 0/3] toshiba_acpi: Accelerometer updates

2016-06-19 Thread Azael Avalos

These series of patches update the accelerometer axis data
reporting to use the IIO subsystem, deprecating the custom
position sysfs entry, and finally bumping the driver version
to 0.24.

Changes since v1:
- Small format and style changes
- Changed the iio code according to feedback from Jonathan Cameron

Azael Avalos (3):
  toshiba_acpi: Add IIO interface for accelerometer axis data
  toshiba_acpi: Remove the position sysfs entry
  toshiba_acpi: Bump driver version and update copyright year

 drivers/platform/x86/toshiba_acpi.c | 136 +---
 1 file changed, 109 insertions(+), 27 deletions(-)

-- 
2.8.4

[PATCH v2 3/3] toshiba_acpi: Bump driver version and update copyright year

2016-06-19 Thread Azael Avalos

After several fixes, and added support for more features (WWAN,
Cooling Method and IIO accelometer axis data), bump the driver
version to 0.24.

Also update the copyright year.

Signed-off-by: Azael Avalos 
---
 drivers/platform/x86/toshiba_acpi.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/platform/x86/toshiba_acpi.c 
b/drivers/platform/x86/toshiba_acpi.c
index a9760d2..9c8f349 100644
--- a/drivers/platform/x86/toshiba_acpi.c
+++ b/drivers/platform/x86/toshiba_acpi.c
@@ -4,7 +4,7 @@
  *  Copyright (C) 2002-2004 John Belmonte
  *  Copyright (C) 2008 Philip Langdale
  *  Copyright (C) 2010 Pierre Ducroquet
- *  Copyright (C) 2014-2015 Azael Avalos
+ *  Copyright (C) 2014-2016 Azael Avalos
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -31,7 +31,7 @@
 
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
-#define TOSHIBA_ACPI_VERSION   "0.23"
+#define TOSHIBA_ACPI_VERSION   "0.24"
 #define PROC_INTERFACE_VERSION 1
 
 #include 
-- 
2.8.4

[PATCH v2 2/3] toshiba_acpi: Remove the position sysfs entry

2016-06-19 Thread Azael Avalos

Now that we have proper support for the acceleromeer under the IIO
subsystem, the _position_ sysfs file is now deprecated.

This patch removes all code related to the position sysfs entry.

Signed-off-by: Azael Avalos 
---
 drivers/platform/x86/toshiba_acpi.c | 25 -
 1 file changed, 25 deletions(-)

diff --git a/drivers/platform/x86/toshiba_acpi.c 
b/drivers/platform/x86/toshiba_acpi.c
index 7949929..a9760d2 100644
--- a/drivers/platform/x86/toshiba_acpi.c
+++ b/drivers/platform/x86/toshiba_acpi.c
@@ -1961,28 +1961,6 @@ static ssize_t touchpad_show(struct device *dev,
 }
 static DEVICE_ATTR_RW(touchpad);
 
-static ssize_t position_show(struct device *dev,
-struct device_attribute *attr, char *buf)
-{
-   struct toshiba_acpi_dev *toshiba = dev_get_drvdata(dev);
-   u32 xyval, zval, tmp;
-   u16 x, y, z;
-   int ret;
-
-   xyval = zval = 0;
-   ret = toshiba_accelerometer_get(toshiba, , );
-   if (ret < 0)
-   return ret;
-
-   x = xyval & HCI_ACCEL_MASK;
-   tmp = xyval >> HCI_MISC_SHIFT;
-   y = tmp & HCI_ACCEL_MASK;
-   z = zval & HCI_ACCEL_MASK;
-
-   return sprintf(buf, "%d %d %d\n", x, y, z);
-}
-static DEVICE_ATTR_RO(position);
-
 static ssize_t usb_sleep_charge_show(struct device *dev,
 struct device_attribute *attr, char *buf)
 {
@@ -2353,7 +2331,6 @@ static struct attribute *toshiba_attributes[] = {
_attr_available_kbd_modes.attr,
_attr_kbd_backlight_timeout.attr,
_attr_touchpad.attr,
-   _attr_position.attr,
_attr_usb_sleep_charge.attr,
_attr_sleep_functions_on_battery.attr,
_attr_usb_rapid_charge.attr,
@@ -2380,8 +2357,6 @@ static umode_t toshiba_sysfs_is_visible(struct kobject 
*kobj,
exists = (drv->kbd_mode == SCI_KBD_MODE_AUTO) ? true : false;
else if (attr == _attr_touchpad.attr)
exists = (drv->touchpad_supported) ? true : false;
-   else if (attr == _attr_position.attr)
-   exists = (drv->accelerometer_supported) ? true : false;
else if (attr == _attr_usb_sleep_charge.attr)
exists = (drv->usb_sleep_charge_supported) ? true : false;
else if (attr == _attr_sleep_functions_on_battery.attr)
-- 
2.8.4

[PATCH v2 0/3] toshiba_acpi: Accelerometer updates

2016-06-19 Thread Azael Avalos

These series of patches update the accelerometer axis data
reporting to use the IIO subsystem, deprecating the custom
position sysfs entry, and finally bumping the driver version
to 0.24.

Changes since v1:
- Small format and style changes
- Changed the iio code according to feedback from Jonathan Cameron

Azael Avalos (3):
  toshiba_acpi: Add IIO interface for accelerometer axis data
  toshiba_acpi: Remove the position sysfs entry
  toshiba_acpi: Bump driver version and update copyright year

 drivers/platform/x86/toshiba_acpi.c | 136 +---
 1 file changed, 109 insertions(+), 27 deletions(-)

-- 
2.8.4

[PATCH v2 3/3] toshiba_acpi: Bump driver version and update copyright year

2016-06-19 Thread Azael Avalos

After several fixes, and added support for more features (WWAN,
Cooling Method and IIO accelometer axis data), bump the driver
version to 0.24.

Also update the copyright year.

Signed-off-by: Azael Avalos 
---
 drivers/platform/x86/toshiba_acpi.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/platform/x86/toshiba_acpi.c 
b/drivers/platform/x86/toshiba_acpi.c
index a9760d2..9c8f349 100644
--- a/drivers/platform/x86/toshiba_acpi.c
+++ b/drivers/platform/x86/toshiba_acpi.c
@@ -4,7 +4,7 @@
  *  Copyright (C) 2002-2004 John Belmonte
  *  Copyright (C) 2008 Philip Langdale
  *  Copyright (C) 2010 Pierre Ducroquet
- *  Copyright (C) 2014-2015 Azael Avalos
+ *  Copyright (C) 2014-2016 Azael Avalos
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -31,7 +31,7 @@
 
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
-#define TOSHIBA_ACPI_VERSION   "0.23"
+#define TOSHIBA_ACPI_VERSION   "0.24"
 #define PROC_INTERFACE_VERSION 1
 
 #include 
-- 
2.8.4

Re: [PATCH 1/2] net: ethernet: bcmsysport: use phydev from struct net_device

2016-06-19 Thread David Miller

From: Philippe Reynes 
Date: Sun, 19 Jun 2016 20:39:08 +0200

> The private structure contain a pointer to phydev, but the structure
> net_device already contain such pointer. So we can remove the pointer
> phydev in the private structure, and update the driver to use the
> one contained in struct net_device.
> 
> Signed-off-by: Philippe Reynes 

Applied.

Re: [PATCH 2/2] net: ethernet: bcmsysport: use phy_ethtool_{get|set}_link_ksettings

2016-06-19 Thread David Miller

From: Philippe Reynes 
Date: Sun, 19 Jun 2016 20:39:09 +0200

> There are two generics functions phy_ethtool_{get|set}_link_ksettings,
> so we can use them instead of defining the same code in the driver.
> 
> Signed-off-by: Philippe Reynes 

Applied.

Re: [PATCH 1/2] net: ethernet: bcmsysport: use phydev from struct net_device

2016-06-19 Thread David Miller

From: Philippe Reynes 
Date: Sun, 19 Jun 2016 20:39:08 +0200

> The private structure contain a pointer to phydev, but the structure
> net_device already contain such pointer. So we can remove the pointer
> phydev in the private structure, and update the driver to use the
> one contained in struct net_device.
> 
> Signed-off-by: Philippe Reynes 

Applied.

Re: [PATCH 2/2] net: ethernet: bcmsysport: use phy_ethtool_{get|set}_link_ksettings

2016-06-19 Thread David Miller

From: Philippe Reynes 
Date: Sun, 19 Jun 2016 20:39:09 +0200

> There are two generics functions phy_ethtool_{get|set}_link_ksettings,
> so we can use them instead of defining the same code in the driver.
> 
> Signed-off-by: Philippe Reynes 

Applied.

Re: [PATCH 5/7] random: replace non-blocking pool with a Chacha20-based CRNG

2016-06-19 Thread Theodore Ts'o

On Wed, Jun 15, 2016 at 10:59:08PM +0800, Herbert Xu wrote:
> I think you should be accessing this through the crypto API rather
> than going direct.  We already have at least one accelerated
> implementation of chacha20 and there may well be more of them
> in future.  Going through the crypto API means that you will
> automatically pick up the best implementation for the platform.

While there are some benefits of going through the crypto API, there
are some downsides as well:

A) Unlike using ChaCha20 in cipher mode, only need the keystream, and
we don't need to XOR the output with plaintext.  We could supply a
dummy zero-filled buffer to archive the same result, but now the
"accelerated" version is having to do an extra memory reference.  Even
if the L1 cache is big enough so that we're not going all the way out
to DRAM, we're putting additional pressure the D cache.

B) The anti-backtracking feature involves taking the existing key and
XOR'ing it with unsued output from the keystream.  We can't do that
using the Crypto API without keeping our own copy of the key, and then
calling setkey --- which means yet more extra memory references.

C) Simply compiling in the Crypto layer and the ChaCha20 generic
handling (all of which is doing extra work which we would then be
undoing in the random layer --- and I haven't included the extra code
in the random driver needed interface with the crypto layer) costs an
extra 20k.  That's roughly the amount of extra kernel bloat that the
Linux kernel grew in its allnoconfig from version to version from 3.0
to 3.16.  I don't have the numbers from the more recent kernels, but
roughly speaking, we would be responsible for **all** of the extra
kernel bloat (and if there was any extra kernel bloat, we would
helping to double it) in the kernel release where this code would go
in.  I suspect the folks involved with the kernel tinificaiton efforts
wouldn't exactly be pleased with this.

Yes, I understand the argument that the networking stack is now
requiring the crypto layer --- but not all IOT devices may necessarily
require the IP stack (they might be using some alternate wireless
communications stack) and I'd much rather not make things worse.

The final thing is that it's not at all clear that the accelerated
implementation is all that important anyway.  Consider the following
two results using the unaccelerated ChaCha20:

% dd if=/dev/urandom bs=4M count=32 of=/dev/null
32+0 records in
32+0 records out
134217728 bytes (134 MB, 128 MiB) copied, 1.18647 s, 113 MB/s

% dd if=/dev/urandom bs=32 count=4194304 of=/dev/null
4194304+0 records in
4194304+0 records out
134217728 bytes (134 MB, 128 MiB) copied, 7.08294 s, 18.9 MB/s

So in both cases, we are reading 128M from the CRNG.  In the first
case, we see the sort of speed we would get if we were using the CRNG
for some illegitimate, such as "dd if=/dev/urandom of=/dev/sdX bs=4M"
(because they were too lazy to type "apt-get install nwipe").

In the second case, we see the use of /dev/urandom in a much more
reasonable, proper, real-world use case for /de/urandom, which is some
userspace process needing a 256 bit session key for a TLS connection,
or some such.  In this case, we see that the other overheads of
providing the anti-backtracking protection, system call overhead,
etc., completely dominate the speed of the core crypto primitive.

So even if the AVX optimized is 100% faster than the generic version,
it would change the time needed to create a 256 byte session key from
1.68 microseconds to 1.55 microseconds.  And this is ignoring the
extra overhead needed to set up AVX, the fact that this will require
the kernel to do extra work doing the XSAVE and XRESTORE because of
the use of the AVX registers, etc.

The bottom line is that optimized ChaCha20 optimizations might be
great for bulk encryption, but for the purposes of generating 256 byte
session keys, I don't think the costs outweigh the benefits.

Cheers,

- Ted

Re: [PATCH 5/7] random: replace non-blocking pool with a Chacha20-based CRNG

2016-06-19 Thread Theodore Ts'o

On Wed, Jun 15, 2016 at 10:59:08PM +0800, Herbert Xu wrote:
> I think you should be accessing this through the crypto API rather
> than going direct.  We already have at least one accelerated
> implementation of chacha20 and there may well be more of them
> in future.  Going through the crypto API means that you will
> automatically pick up the best implementation for the platform.

While there are some benefits of going through the crypto API, there
are some downsides as well:

A) Unlike using ChaCha20 in cipher mode, only need the keystream, and
we don't need to XOR the output with plaintext.  We could supply a
dummy zero-filled buffer to archive the same result, but now the
"accelerated" version is having to do an extra memory reference.  Even
if the L1 cache is big enough so that we're not going all the way out
to DRAM, we're putting additional pressure the D cache.

B) The anti-backtracking feature involves taking the existing key and
XOR'ing it with unsued output from the keystream.  We can't do that
using the Crypto API without keeping our own copy of the key, and then
calling setkey --- which means yet more extra memory references.

C) Simply compiling in the Crypto layer and the ChaCha20 generic
handling (all of which is doing extra work which we would then be
undoing in the random layer --- and I haven't included the extra code
in the random driver needed interface with the crypto layer) costs an
extra 20k.  That's roughly the amount of extra kernel bloat that the
Linux kernel grew in its allnoconfig from version to version from 3.0
to 3.16.  I don't have the numbers from the more recent kernels, but
roughly speaking, we would be responsible for **all** of the extra
kernel bloat (and if there was any extra kernel bloat, we would
helping to double it) in the kernel release where this code would go
in.  I suspect the folks involved with the kernel tinificaiton efforts
wouldn't exactly be pleased with this.

Yes, I understand the argument that the networking stack is now
requiring the crypto layer --- but not all IOT devices may necessarily
require the IP stack (they might be using some alternate wireless
communications stack) and I'd much rather not make things worse.

The final thing is that it's not at all clear that the accelerated
implementation is all that important anyway.  Consider the following
two results using the unaccelerated ChaCha20:

% dd if=/dev/urandom bs=4M count=32 of=/dev/null
32+0 records in
32+0 records out
134217728 bytes (134 MB, 128 MiB) copied, 1.18647 s, 113 MB/s

% dd if=/dev/urandom bs=32 count=4194304 of=/dev/null
4194304+0 records in
4194304+0 records out
134217728 bytes (134 MB, 128 MiB) copied, 7.08294 s, 18.9 MB/s

So in both cases, we are reading 128M from the CRNG.  In the first
case, we see the sort of speed we would get if we were using the CRNG
for some illegitimate, such as "dd if=/dev/urandom of=/dev/sdX bs=4M"
(because they were too lazy to type "apt-get install nwipe").

In the second case, we see the use of /dev/urandom in a much more
reasonable, proper, real-world use case for /de/urandom, which is some
userspace process needing a 256 bit session key for a TLS connection,
or some such.  In this case, we see that the other overheads of
providing the anti-backtracking protection, system call overhead,
etc., completely dominate the speed of the core crypto primitive.

So even if the AVX optimized is 100% faster than the generic version,
it would change the time needed to create a 256 byte session key from
1.68 microseconds to 1.55 microseconds.  And this is ignoring the
extra overhead needed to set up AVX, the fact that this will require
the kernel to do extra work doing the XSAVE and XRESTORE because of
the use of the AVX registers, etc.

The bottom line is that optimized ChaCha20 optimizations might be
great for bulk encryption, but for the purposes of generating 256 byte
session keys, I don't think the costs outweigh the benefits.

Cheers,

- Ted

[PATCH 0/3] nvme: Don't add namespaces for locked drives

2016-06-19 Thread Jethro Beekman

Hi all,

If an NVMe drive is locked with ATA Security, most commands sent to the drive 
will fail. This includes commands sent by the kernel upon discovery to probe 
for partitions. The failing happens in such a way that trying to do anything 
with the drive (e.g. sending an unlock command; unloading the nvme module) is 
basically impossible with the high default command timeout.

This patch adds a check to see if the drive is locked, and if it is, its 
namespaces are not initialized. It is expected that userspace will send the 
proper "security send/unlock" command and then reset the controller. Userspace 
tools are available at [1].

This is my first kernel patch so please let me know if you have any feedback.

I intend to also submit a future patch that tracks ATA Security commands sent 
from userspace and remembers the password so it can be submitted to a locked 
drive upon pm_resume. (still WIP)

Jethro Beekman

[1] https://github.com/jethrogb/nvme-ata-security

Jethro Beekman (3):
  nvme: When scanning namespaces, make sure the drive is not locked
  nvme: Add function for NVMe security receive command
  nvme: Check if drive is locked using ATA Security

 drivers/nvme/host/core.c | 70 ++
 1 file changed, 70 insertions(+)

-- 
2.9.0

[PATCH 3/3] nvme: Check if drive is locked using ATA Security

2016-06-19 Thread Jethro Beekman

Signed-off-by: Jethro Beekman 
---
 drivers/nvme/host/core.c | 49 +-
 1 file changed, 48 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index da027ed..0164122 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1389,10 +1389,57 @@ int nvme_security_recv(struct nvme_ctrl *dev, u8 
protocol, void *buf,
return nvme_submit_sync_cmd(dev->admin_q, , buf, len);
 }
 
+#define OACS_SECURITY (1<<0)
+#define SCSI_SECURITY_PROTOCOL_ATA_SECURITY 0xef
+#define ATA_SECURITY_LOCKED 0x4
+
 static bool nvme_security_is_locked(struct nvme_ctrl *ctrl,
struct nvme_id_ctrl *id)
 {
-   return false;
+   int err;
+   unsigned int i;
+   bool found;
+   u8 protocols[256+8]; /* 8 byte hdr + max number of possible protocols */
+   u8 ata_security[16];
+   u16 n;
+
+   /* security commands supported? */
+   if (!(le16_to_cpu(id->oacs) & OACS_SECURITY))
+   return false;
+
+   /* list security protocols */
+   err = nvme_security_recv(ctrl, 0, protocols, sizeof(protocols));
+   if (err) {
+   dev_warn(ctrl->device, "nvme_security_recv returned error 
%xh\n",
+   err);
+   return false;
+   }
+
+   /* find ata security protocol */
+   n = be16_to_cpup((__be16 *)(protocols+6));
+   if (n >= 256) {
+   dev_warn(ctrl->device, "security info protocol returned more 
than 256 protocols\n");
+   return false;
+   }
+   found = false;
+   for (i = 0; i <= n; i++) {
+   if (protocols[8+i] == SCSI_SECURITY_PROTOCOL_ATA_SECURITY) {
+   found = true;
+   break;
+   }
+   }
+   if (!found)
+   return false;
+
+   /* do ata security identify */
+   err = nvme_security_recv(ctrl, SCSI_SECURITY_PROTOCOL_ATA_SECURITY,
+   ata_security, sizeof(ata_security))
+   if (err) {
+   dev_warn(ctrl->device, "nvme_security_recv returned error 
%xh\n",
+   err);
+   return false;
+   }
+   return ata_security[1] == 0xe && (ata_security[9]_SECURITY_LOCKED);
 }
 
 void nvme_scan_namespaces(struct nvme_ctrl *ctrl)
-- 
2.9.0

[PATCH 2/3] nvme: Add function for NVMe security receive command

2016-06-19 Thread Jethro Beekman

Signed-off-by: Jethro Beekman 
---
 drivers/nvme/host/core.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 3a0d48c..da027ed 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1377,6 +1377,18 @@ static void __nvme_scan_namespaces(struct nvme_ctrl 
*ctrl, unsigned nn)
}
 }
 
+int nvme_security_recv(struct nvme_ctrl *dev, u8 protocol, void *buf,
+   unsigned int len)
+{
+   struct nvme_command c = { };
+
+   c.common.opcode = nvme_admin_security_recv;
+   c.common.cdw10[0] = cpu_to_le32(((u32)protocol)<<24);
+   c.common.cdw10[1] = cpu_to_le32(len);
+
+   return nvme_submit_sync_cmd(dev->admin_q, , buf, len);
+}
+
 static bool nvme_security_is_locked(struct nvme_ctrl *ctrl,
struct nvme_id_ctrl *id)
 {
-- 
2.9.0

[PATCH 1/3] nvme: When scanning namespaces, make sure the drive is not locked

2016-06-19 Thread Jethro Beekman

Signed-off-by: Jethro Beekman 
---
 drivers/nvme/host/core.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 643f457..3a0d48c 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1377,6 +1377,12 @@ static void __nvme_scan_namespaces(struct nvme_ctrl 
*ctrl, unsigned nn)
}
 }
 
+static bool nvme_security_is_locked(struct nvme_ctrl *ctrl,
+   struct nvme_id_ctrl *id)
+{
+   return false;
+}
+
 void nvme_scan_namespaces(struct nvme_ctrl *ctrl)
 {
struct nvme_id_ctrl *id;
@@ -1385,6 +1391,11 @@ void nvme_scan_namespaces(struct nvme_ctrl *ctrl)
if (nvme_identify_ctrl(ctrl, ))
return;
 
+   if (nvme_security_is_locked(ctrl, id)) {
+   nvme_remove_namespaces(ctrl);
+   return;
+   }
+
mutex_lock(>namespaces_mutex);
nn = le32_to_cpu(id->nn);
if (ctrl->vs >= NVME_VS(1, 1) &&
-- 
2.9.0

[PATCH 2/3] nvme: Add function for NVMe security receive command

2016-06-19 Thread Jethro Beekman

Signed-off-by: Jethro Beekman 
---
 drivers/nvme/host/core.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 3a0d48c..da027ed 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1377,6 +1377,18 @@ static void __nvme_scan_namespaces(struct nvme_ctrl 
*ctrl, unsigned nn)
}
 }
 
+int nvme_security_recv(struct nvme_ctrl *dev, u8 protocol, void *buf,
+   unsigned int len)
+{
+   struct nvme_command c = { };
+
+   c.common.opcode = nvme_admin_security_recv;
+   c.common.cdw10[0] = cpu_to_le32(((u32)protocol)<<24);
+   c.common.cdw10[1] = cpu_to_le32(len);
+
+   return nvme_submit_sync_cmd(dev->admin_q, , buf, len);
+}
+
 static bool nvme_security_is_locked(struct nvme_ctrl *ctrl,
struct nvme_id_ctrl *id)
 {
-- 
2.9.0

[PATCH 1/3] nvme: When scanning namespaces, make sure the drive is not locked

2016-06-19 Thread Jethro Beekman

Signed-off-by: Jethro Beekman 
---
 drivers/nvme/host/core.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 643f457..3a0d48c 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1377,6 +1377,12 @@ static void __nvme_scan_namespaces(struct nvme_ctrl 
*ctrl, unsigned nn)
}
 }
 
+static bool nvme_security_is_locked(struct nvme_ctrl *ctrl,
+   struct nvme_id_ctrl *id)
+{
+   return false;
+}
+
 void nvme_scan_namespaces(struct nvme_ctrl *ctrl)
 {
struct nvme_id_ctrl *id;
@@ -1385,6 +1391,11 @@ void nvme_scan_namespaces(struct nvme_ctrl *ctrl)
if (nvme_identify_ctrl(ctrl, ))
return;
 
+   if (nvme_security_is_locked(ctrl, id)) {
+   nvme_remove_namespaces(ctrl);
+   return;
+   }
+
mutex_lock(>namespaces_mutex);
nn = le32_to_cpu(id->nn);
if (ctrl->vs >= NVME_VS(1, 1) &&
-- 
2.9.0

[PATCH 0/3] nvme: Don't add namespaces for locked drives

2016-06-19 Thread Jethro Beekman

Hi all,

If an NVMe drive is locked with ATA Security, most commands sent to the drive 
will fail. This includes commands sent by the kernel upon discovery to probe 
for partitions. The failing happens in such a way that trying to do anything 
with the drive (e.g. sending an unlock command; unloading the nvme module) is 
basically impossible with the high default command timeout.

This patch adds a check to see if the drive is locked, and if it is, its 
namespaces are not initialized. It is expected that userspace will send the 
proper "security send/unlock" command and then reset the controller. Userspace 
tools are available at [1].

This is my first kernel patch so please let me know if you have any feedback.

I intend to also submit a future patch that tracks ATA Security commands sent 
from userspace and remembers the password so it can be submitted to a locked 
drive upon pm_resume. (still WIP)

Jethro Beekman

[1] https://github.com/jethrogb/nvme-ata-security

Jethro Beekman (3):
  nvme: When scanning namespaces, make sure the drive is not locked
  nvme: Add function for NVMe security receive command
  nvme: Check if drive is locked using ATA Security

 drivers/nvme/host/core.c | 70 ++
 1 file changed, 70 insertions(+)

-- 
2.9.0

[PATCH 3/3] nvme: Check if drive is locked using ATA Security

2016-06-19 Thread Jethro Beekman

Signed-off-by: Jethro Beekman 
---
 drivers/nvme/host/core.c | 49 +-
 1 file changed, 48 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index da027ed..0164122 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1389,10 +1389,57 @@ int nvme_security_recv(struct nvme_ctrl *dev, u8 
protocol, void *buf,
return nvme_submit_sync_cmd(dev->admin_q, , buf, len);
 }
 
+#define OACS_SECURITY (1<<0)
+#define SCSI_SECURITY_PROTOCOL_ATA_SECURITY 0xef
+#define ATA_SECURITY_LOCKED 0x4
+
 static bool nvme_security_is_locked(struct nvme_ctrl *ctrl,
struct nvme_id_ctrl *id)
 {
-   return false;
+   int err;
+   unsigned int i;
+   bool found;
+   u8 protocols[256+8]; /* 8 byte hdr + max number of possible protocols */
+   u8 ata_security[16];
+   u16 n;
+
+   /* security commands supported? */
+   if (!(le16_to_cpu(id->oacs) & OACS_SECURITY))
+   return false;
+
+   /* list security protocols */
+   err = nvme_security_recv(ctrl, 0, protocols, sizeof(protocols));
+   if (err) {
+   dev_warn(ctrl->device, "nvme_security_recv returned error 
%xh\n",
+   err);
+   return false;
+   }
+
+   /* find ata security protocol */
+   n = be16_to_cpup((__be16 *)(protocols+6));
+   if (n >= 256) {
+   dev_warn(ctrl->device, "security info protocol returned more 
than 256 protocols\n");
+   return false;
+   }
+   found = false;
+   for (i = 0; i <= n; i++) {
+   if (protocols[8+i] == SCSI_SECURITY_PROTOCOL_ATA_SECURITY) {
+   found = true;
+   break;
+   }
+   }
+   if (!found)
+   return false;
+
+   /* do ata security identify */
+   err = nvme_security_recv(ctrl, SCSI_SECURITY_PROTOCOL_ATA_SECURITY,
+   ata_security, sizeof(ata_security))
+   if (err) {
+   dev_warn(ctrl->device, "nvme_security_recv returned error 
%xh\n",
+   err);
+   return false;
+   }
+   return ata_security[1] == 0xe && (ata_security[9]_SECURITY_LOCKED);
 }
 
 void nvme_scan_namespaces(struct nvme_ctrl *ctrl)
-- 
2.9.0

[PATCH 1/2] net: ethernet: bgmac: use phydev from struct net_device

2016-06-19 Thread Philippe Reynes

The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phydev in the private structure, and update the driver to use the
one contained in struct net_device.

Signed-off-by: Philippe Reynes 
---
 drivers/net/ethernet/broadcom/bgmac.c |   17 ++---
 drivers/net/ethernet/broadcom/bgmac.h |1 -
 2 files changed, 6 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bgmac.c 
b/drivers/net/ethernet/broadcom/bgmac.c
index 0ee34cc..90ed244 100644
--- a/drivers/net/ethernet/broadcom/bgmac.c
+++ b/drivers/net/ethernet/broadcom/bgmac.c
@@ -1320,7 +1320,7 @@ static int bgmac_open(struct net_device *net_dev)
}
napi_enable(>napi);
 
-   phy_start(bgmac->phy_dev);
+   phy_start(net_dev->phydev);
 
netif_carrier_on(net_dev);
return 0;
@@ -1332,7 +1332,7 @@ static int bgmac_stop(struct net_device *net_dev)
 
netif_carrier_off(net_dev);
 
-   phy_stop(bgmac->phy_dev);
+   phy_stop(net_dev->phydev);
 
napi_disable(>napi);
bgmac_chip_intrs_off(bgmac);
@@ -1370,12 +1370,10 @@ static int bgmac_set_mac_address(struct net_device 
*net_dev, void *addr)
 
 static int bgmac_ioctl(struct net_device *net_dev, struct ifreq *ifr, int cmd)
 {
-   struct bgmac *bgmac = netdev_priv(net_dev);
-
if (!netif_running(net_dev))
return -EINVAL;
 
-   return phy_mii_ioctl(bgmac->phy_dev, ifr, cmd);
+   return phy_mii_ioctl(net_dev->phydev, ifr, cmd);
 }
 
 static const struct net_device_ops bgmac_netdev_ops = {
@@ -1518,7 +1516,7 @@ static int bgmac_get_settings(struct net_device *net_dev,
 {
struct bgmac *bgmac = netdev_priv(net_dev);
 
-   return phy_ethtool_gset(bgmac->phy_dev, cmd);
+   return phy_ethtool_gset(net_dev->phydev, cmd);
 }
 
 static int bgmac_set_settings(struct net_device *net_dev,
@@ -1526,7 +1524,7 @@ static int bgmac_set_settings(struct net_device *net_dev,
 {
struct bgmac *bgmac = netdev_priv(net_dev);
 
-   return phy_ethtool_sset(bgmac->phy_dev, cmd);
+   return phy_ethtool_sset(net_dev->phydev, cmd);
 }
 
 static void bgmac_get_drvinfo(struct net_device *net_dev,
@@ -1563,7 +1561,7 @@ static int bgmac_mii_write(struct mii_bus *bus, int 
mii_id, int regnum,
 static void bgmac_adjust_link(struct net_device *net_dev)
 {
struct bgmac *bgmac = netdev_priv(net_dev);
-   struct phy_device *phy_dev = bgmac->phy_dev;
+   struct phy_device *phy_dev = net_dev->phydev;
bool update = false;
 
if (phy_dev->link) {
@@ -1607,8 +1605,6 @@ static int bgmac_fixed_phy_register(struct bgmac *bgmac)
return err;
}
 
-   bgmac->phy_dev = phy_dev;
-
return err;
 }
 
@@ -1653,7 +1649,6 @@ static int bgmac_mii_register(struct bgmac *bgmac)
err = PTR_ERR(phy_dev);
goto err_unregister_bus;
}
-   bgmac->phy_dev = phy_dev;
 
return err;
 
diff --git a/drivers/net/ethernet/broadcom/bgmac.h 
b/drivers/net/ethernet/broadcom/bgmac.h
index 853d72b..99beb18 100644
--- a/drivers/net/ethernet/broadcom/bgmac.h
+++ b/drivers/net/ethernet/broadcom/bgmac.h
@@ -441,7 +441,6 @@ struct bgmac {
struct net_device *net_dev;
struct napi_struct napi;
struct mii_bus *mii_bus;
-   struct phy_device *phy_dev;
 
/* DMA */
struct bgmac_dma_ring tx_ring[BGMAC_MAX_TX_RINGS];
-- 
1.7.4.4

[PATCH 1/2] net: ethernet: bgmac: use phydev from struct net_device

2016-06-19 Thread Philippe Reynes

The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phydev in the private structure, and update the driver to use the
one contained in struct net_device.

Signed-off-by: Philippe Reynes 
---
 drivers/net/ethernet/broadcom/bgmac.c |   17 ++---
 drivers/net/ethernet/broadcom/bgmac.h |1 -
 2 files changed, 6 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bgmac.c 
b/drivers/net/ethernet/broadcom/bgmac.c
index 0ee34cc..90ed244 100644
--- a/drivers/net/ethernet/broadcom/bgmac.c
+++ b/drivers/net/ethernet/broadcom/bgmac.c
@@ -1320,7 +1320,7 @@ static int bgmac_open(struct net_device *net_dev)
}
napi_enable(>napi);
 
-   phy_start(bgmac->phy_dev);
+   phy_start(net_dev->phydev);
 
netif_carrier_on(net_dev);
return 0;
@@ -1332,7 +1332,7 @@ static int bgmac_stop(struct net_device *net_dev)
 
netif_carrier_off(net_dev);
 
-   phy_stop(bgmac->phy_dev);
+   phy_stop(net_dev->phydev);
 
napi_disable(>napi);
bgmac_chip_intrs_off(bgmac);
@@ -1370,12 +1370,10 @@ static int bgmac_set_mac_address(struct net_device 
*net_dev, void *addr)
 
 static int bgmac_ioctl(struct net_device *net_dev, struct ifreq *ifr, int cmd)
 {
-   struct bgmac *bgmac = netdev_priv(net_dev);
-
if (!netif_running(net_dev))
return -EINVAL;
 
-   return phy_mii_ioctl(bgmac->phy_dev, ifr, cmd);
+   return phy_mii_ioctl(net_dev->phydev, ifr, cmd);
 }
 
 static const struct net_device_ops bgmac_netdev_ops = {
@@ -1518,7 +1516,7 @@ static int bgmac_get_settings(struct net_device *net_dev,
 {
struct bgmac *bgmac = netdev_priv(net_dev);
 
-   return phy_ethtool_gset(bgmac->phy_dev, cmd);
+   return phy_ethtool_gset(net_dev->phydev, cmd);
 }
 
 static int bgmac_set_settings(struct net_device *net_dev,
@@ -1526,7 +1524,7 @@ static int bgmac_set_settings(struct net_device *net_dev,
 {
struct bgmac *bgmac = netdev_priv(net_dev);
 
-   return phy_ethtool_sset(bgmac->phy_dev, cmd);
+   return phy_ethtool_sset(net_dev->phydev, cmd);
 }
 
 static void bgmac_get_drvinfo(struct net_device *net_dev,
@@ -1563,7 +1561,7 @@ static int bgmac_mii_write(struct mii_bus *bus, int 
mii_id, int regnum,
 static void bgmac_adjust_link(struct net_device *net_dev)
 {
struct bgmac *bgmac = netdev_priv(net_dev);
-   struct phy_device *phy_dev = bgmac->phy_dev;
+   struct phy_device *phy_dev = net_dev->phydev;
bool update = false;
 
if (phy_dev->link) {
@@ -1607,8 +1605,6 @@ static int bgmac_fixed_phy_register(struct bgmac *bgmac)
return err;
}
 
-   bgmac->phy_dev = phy_dev;
-
return err;
 }
 
@@ -1653,7 +1649,6 @@ static int bgmac_mii_register(struct bgmac *bgmac)
err = PTR_ERR(phy_dev);
goto err_unregister_bus;
}
-   bgmac->phy_dev = phy_dev;
 
return err;
 
diff --git a/drivers/net/ethernet/broadcom/bgmac.h 
b/drivers/net/ethernet/broadcom/bgmac.h
index 853d72b..99beb18 100644
--- a/drivers/net/ethernet/broadcom/bgmac.h
+++ b/drivers/net/ethernet/broadcom/bgmac.h
@@ -441,7 +441,6 @@ struct bgmac {
struct net_device *net_dev;
struct napi_struct napi;
struct mii_bus *mii_bus;
-   struct phy_device *phy_dev;
 
/* DMA */
struct bgmac_dma_ring tx_ring[BGMAC_MAX_TX_RINGS];
-- 
1.7.4.4

[PATCH] rcutorture: Remove outdated config option description

2016-06-19 Thread SeongJae Park

CONFIG_RCU_TORTURE_TEST_RUNNABLE has removed by commit 4e9a073f60367
("torture: Remove CONFIG_RCU_TORTURE_TEST_RUNNABLE, simplify code")
entirely but the document has not updated.  This commit updates the
document to remove the description for the config option and adding a
description for the alternative module parameter.

Signed-off-by: SeongJae Park 
---
 Documentation/RCU/torture.txt | 20 +---
 1 file changed, 5 insertions(+), 15 deletions(-)

diff --git a/Documentation/RCU/torture.txt b/Documentation/RCU/torture.txt
index 118e7c1..4bd9d86 100644
--- a/Documentation/RCU/torture.txt
+++ b/Documentation/RCU/torture.txt
@@ -10,21 +10,6 @@ status messages via printk(), which can be examined via the 
dmesg
 command (perhaps grepping for "torture").  The test is started
 when the module is loaded, and stops when the module is unloaded.
 
-CONFIG_RCU_TORTURE_TEST_RUNNABLE
-
-It is also possible to specify CONFIG_RCU_TORTURE_TEST=y, which will
-result in the tests being loaded into the base kernel.  In this case,
-the CONFIG_RCU_TORTURE_TEST_RUNNABLE config option is used to specify
-whether the RCU torture tests are to be started immediately during
-boot or whether the /proc/sys/kernel/rcutorture_runnable file is used
-to enable them.  This /proc file can be used to repeatedly pause and
-restart the tests, regardless of the initial state specified by the
-CONFIG_RCU_TORTURE_TEST_RUNNABLE config option.
-
-You will normally -not- want to start the RCU torture tests during boot
-(and thus the default is CONFIG_RCU_TORTURE_TEST_RUNNABLE=n), but doing
-this can sometimes be useful in finding boot-time bugs.
-
 
 MODULE PARAMETERS
 
@@ -164,6 +149,11 @@ test_no_idle_hzWhether or not to test the ability of 
RCU to operate in
idle CPUs.  Boolean parameter, "1" to test, "0" otherwise.
Defaults to omitting this test.
 
+torture_runnable  Start rcutorture at boot time in the case where the
+ module is built into the kernel, otherwise wait for
+ torture_runnable to be set via sysfs before starting.
+ By default it will begin once the module is loaded.
+
 torture_type   The type of RCU to test, with string values as follows:
 
"rcu":  rcu_read_lock(), rcu_read_unlock() and call_rcu(),
-- 
1.9.1

[PATCH] rcutorture: Remove outdated config option description

2016-06-19 Thread SeongJae Park

CONFIG_RCU_TORTURE_TEST_RUNNABLE has removed by commit 4e9a073f60367
("torture: Remove CONFIG_RCU_TORTURE_TEST_RUNNABLE, simplify code")
entirely but the document has not updated.  This commit updates the
document to remove the description for the config option and adding a
description for the alternative module parameter.

Signed-off-by: SeongJae Park 
---
 Documentation/RCU/torture.txt | 20 +---
 1 file changed, 5 insertions(+), 15 deletions(-)

diff --git a/Documentation/RCU/torture.txt b/Documentation/RCU/torture.txt
index 118e7c1..4bd9d86 100644
--- a/Documentation/RCU/torture.txt
+++ b/Documentation/RCU/torture.txt
@@ -10,21 +10,6 @@ status messages via printk(), which can be examined via the 
dmesg
 command (perhaps grepping for "torture").  The test is started
 when the module is loaded, and stops when the module is unloaded.
 
-CONFIG_RCU_TORTURE_TEST_RUNNABLE
-
-It is also possible to specify CONFIG_RCU_TORTURE_TEST=y, which will
-result in the tests being loaded into the base kernel.  In this case,
-the CONFIG_RCU_TORTURE_TEST_RUNNABLE config option is used to specify
-whether the RCU torture tests are to be started immediately during
-boot or whether the /proc/sys/kernel/rcutorture_runnable file is used
-to enable them.  This /proc file can be used to repeatedly pause and
-restart the tests, regardless of the initial state specified by the
-CONFIG_RCU_TORTURE_TEST_RUNNABLE config option.
-
-You will normally -not- want to start the RCU torture tests during boot
-(and thus the default is CONFIG_RCU_TORTURE_TEST_RUNNABLE=n), but doing
-this can sometimes be useful in finding boot-time bugs.
-
 
 MODULE PARAMETERS
 
@@ -164,6 +149,11 @@ test_no_idle_hzWhether or not to test the ability of 
RCU to operate in
idle CPUs.  Boolean parameter, "1" to test, "0" otherwise.
Defaults to omitting this test.
 
+torture_runnable  Start rcutorture at boot time in the case where the
+ module is built into the kernel, otherwise wait for
+ torture_runnable to be set via sysfs before starting.
+ By default it will begin once the module is loaded.
+
 torture_type   The type of RCU to test, with string values as follows:
 
"rcu":  rcu_read_lock(), rcu_read_unlock() and call_rcu(),
-- 
1.9.1

Re: [PULL] seccomp update (next)

2016-06-19 Thread Andy Lutomirski

On Sat, Jun 18, 2016 at 3:21 AM, Andy Lutomirski  wrote:
> On Jun 18, 2016 12:02 AM, "Kees Cook"  wrote:
>>
>> On Fri, Jun 17, 2016 at 11:55 AM, Andy Lutomirski  
>> wrote:
>> > On Fri, Jun 17, 2016 at 12:15 AM, James Morris  wrote:
>> >> On Tue, 14 Jun 2016, Kees Cook wrote:
>> >>
>> >>> Hi,
>> >>>
>> >>> Please pull these seccomp changes for next. These have been tested by
>> >>> myself and Andy, and close a long-standing issue with seccomp where 
>> >>> tracers
>> >>> could change the syscall out from under seccomp.
>> >>
>> >> Pulled to security -next.
>> >
>> > As a heads up: I think this doesn't quite close the hole on x86.  Consider:
>> >
>> > 64-bit task arranges to be traced by a 32-bit task (or presumably a
>> > 64-bit task that calls ptrace via int80).
>> >
>> > Tracer does PTRACE_SYSCALL.
>> >
>> > Tracee does a normal syscall.
>> >
>> > Tracer writes tracee's orig_ax, thus invoking this thing in
>> > arch/x86/kernel/ptrace.c:
>> >
>> > if (syscall_get_nr(child, regs) >= 0)
>> > child->thread.status |= TS_COMPAT;
>> >
>> > Tracer resumes and gets confused.
>> >
>> > I think the right fix is to just delete:
>> >
>> > if (syscall_get_nr(child, regs) >= 0)
>> > child->thread.status |= TS_COMPAT;
>> >
>> > from ptrace.c.   The comment above it is garbage, too.
>>
>> I'm perfectly happy to see it removed. I can't make sense of the comment. :)
>>
>> That said, the only confusion I see is pretty minor. The arch is saved
>> before the tracer could force TS_COMPAT, so nothing confused is handed
>> to seccomp (the first time). And the syscall will continue to be
>> looked up on sys_call_table not ia32_sys_call_table.
>
> Hmm, right, but...
>
>>
>> The only thing I see is if the tracer has also added a
>> SECCOMP_RET_TRACE filter, after which the recheck will reload all the
>> seccomp info, including the arch. And at this point, a sensible filter
>> will reject a non-matching architecture.
>>
>
> Yes for some filters, but others might have different logic for
> different arches, at which point there's a minor bypass.
>
>> Maybe I'm missing something more?
>>
>
> You can also use this to do a 64-bit syscall with in_compat_syscall()
> returning true, which could cause issues for audit and maybe some
> ioctl handlers irrespective of this patch series.  I'll see about
> getting it fixed in x86/urgent.

Hi Kees and James-

On further consideration:

(a) that TS_COMPAT thing is highly, highly buggy -- much buggier and
more confusing than I thought, and not in the way I thought.

(b) it doesn't interfere with seccomp at all, so fixing it is not at
all a prerequisite for these changes.


-- 
Andy Lutomirski
AMA Capital Management, LLC

Re: [PULL] seccomp update (next)

2016-06-19 Thread Andy Lutomirski

On Sat, Jun 18, 2016 at 3:21 AM, Andy Lutomirski  wrote:
> On Jun 18, 2016 12:02 AM, "Kees Cook"  wrote:
>>
>> On Fri, Jun 17, 2016 at 11:55 AM, Andy Lutomirski  
>> wrote:
>> > On Fri, Jun 17, 2016 at 12:15 AM, James Morris  wrote:
>> >> On Tue, 14 Jun 2016, Kees Cook wrote:
>> >>
>> >>> Hi,
>> >>>
>> >>> Please pull these seccomp changes for next. These have been tested by
>> >>> myself and Andy, and close a long-standing issue with seccomp where 
>> >>> tracers
>> >>> could change the syscall out from under seccomp.
>> >>
>> >> Pulled to security -next.
>> >
>> > As a heads up: I think this doesn't quite close the hole on x86.  Consider:
>> >
>> > 64-bit task arranges to be traced by a 32-bit task (or presumably a
>> > 64-bit task that calls ptrace via int80).
>> >
>> > Tracer does PTRACE_SYSCALL.
>> >
>> > Tracee does a normal syscall.
>> >
>> > Tracer writes tracee's orig_ax, thus invoking this thing in
>> > arch/x86/kernel/ptrace.c:
>> >
>> > if (syscall_get_nr(child, regs) >= 0)
>> > child->thread.status |= TS_COMPAT;
>> >
>> > Tracer resumes and gets confused.
>> >
>> > I think the right fix is to just delete:
>> >
>> > if (syscall_get_nr(child, regs) >= 0)
>> > child->thread.status |= TS_COMPAT;
>> >
>> > from ptrace.c.   The comment above it is garbage, too.
>>
>> I'm perfectly happy to see it removed. I can't make sense of the comment. :)
>>
>> That said, the only confusion I see is pretty minor. The arch is saved
>> before the tracer could force TS_COMPAT, so nothing confused is handed
>> to seccomp (the first time). And the syscall will continue to be
>> looked up on sys_call_table not ia32_sys_call_table.
>
> Hmm, right, but...
>
>>
>> The only thing I see is if the tracer has also added a
>> SECCOMP_RET_TRACE filter, after which the recheck will reload all the
>> seccomp info, including the arch. And at this point, a sensible filter
>> will reject a non-matching architecture.
>>
>
> Yes for some filters, but others might have different logic for
> different arches, at which point there's a minor bypass.
>
>> Maybe I'm missing something more?
>>
>
> You can also use this to do a 64-bit syscall with in_compat_syscall()
> returning true, which could cause issues for audit and maybe some
> ioctl handlers irrespective of this patch series.  I'll see about
> getting it fixed in x86/urgent.

Hi Kees and James-

On further consideration:

(a) that TS_COMPAT thing is highly, highly buggy -- much buggier and
more confusing than I thought, and not in the way I thought.

(b) it doesn't interfere with seccomp at all, so fixing it is not at
all a prerequisite for these changes.


-- 
Andy Lutomirski
AMA Capital Management, LLC

Re: [PATCH] x86/ptrace: Remove questionable TS_COMPAT usage in ptrace

2016-06-19 Thread Andy Lutomirski

On Sun, Jun 19, 2016 at 2:19 PM, Oleg Nesterov  wrote:
> Let me first thank Pedro who has already replied!
>
> And I have to admit I will need to re-read his explanations after
> sleep to (try to) convince myself I fully understans the problems ;)
> Too late for me.
>
> Right now I have nothing to add, but
>
> On 06/18, Andy Lutomirski wrote:
>>
>> @@ -922,16 +922,7 @@ static int putreg32(struct task_struct *child, unsigned 
>> regno, u32 value)
>>   R32(esp, sp);
>>
>>   case offsetof(struct user32, regs.orig_eax):
>> - /*
>> -  * A 32-bit debugger setting orig_eax means to restore
>> -  * the state of the task restarting a 32-bit syscall.
>> -  * Make sure we interpret the -ERESTART* codes correctly
>> -  * in case the task is not actually still sitting at the
>> -  * exit from a 32-bit syscall with TS_COMPAT still set.
>> -  */
>>   regs->orig_ax = value;
>> - if (syscall_get_nr(child, regs) >= 0)
>> - task_thread_info(child)->status |= TS_COMPAT;
>
> I agree it would be nice to remove this code, but then it is not clear
> how/when we should sign-extend regs->ax..
>
> And this leads to another question, why do we actually need to set/clear
> TS_COMPAT in set_personality_ia32() ??

I have no idea.  Legacy junk?  Maybe so audit sees execution of a
64-bit task as a 64-bit sys_execve return even if the task was 32-bit
before execve?

--Andy

Re: [PATCH] x86/ptrace: Remove questionable TS_COMPAT usage in ptrace

2016-06-19 Thread Andy Lutomirski

On Sun, Jun 19, 2016 at 2:19 PM, Oleg Nesterov  wrote:
> Let me first thank Pedro who has already replied!
>
> And I have to admit I will need to re-read his explanations after
> sleep to (try to) convince myself I fully understans the problems ;)
> Too late for me.
>
> Right now I have nothing to add, but
>
> On 06/18, Andy Lutomirski wrote:
>>
>> @@ -922,16 +922,7 @@ static int putreg32(struct task_struct *child, unsigned 
>> regno, u32 value)
>>   R32(esp, sp);
>>
>>   case offsetof(struct user32, regs.orig_eax):
>> - /*
>> -  * A 32-bit debugger setting orig_eax means to restore
>> -  * the state of the task restarting a 32-bit syscall.
>> -  * Make sure we interpret the -ERESTART* codes correctly
>> -  * in case the task is not actually still sitting at the
>> -  * exit from a 32-bit syscall with TS_COMPAT still set.
>> -  */
>>   regs->orig_ax = value;
>> - if (syscall_get_nr(child, regs) >= 0)
>> - task_thread_info(child)->status |= TS_COMPAT;
>
> I agree it would be nice to remove this code, but then it is not clear
> how/when we should sign-extend regs->ax..
>
> And this leads to another question, why do we actually need to set/clear
> TS_COMPAT in set_personality_ia32() ??

I have no idea.  Legacy junk?  Maybe so audit sees execution of a
64-bit task as a 64-bit sys_execve return even if the task was 32-bit
before execve?

--Andy

Re: [PATCH] x86/ptrace: Remove questionable TS_COMPAT usage in ptrace

2016-06-19 Thread Oleg Nesterov

Let me first thank Pedro who has already replied!

And I have to admit I will need to re-read his explanations after
sleep to (try to) convince myself I fully understans the problems ;)
Too late for me.

Right now I have nothing to add, but

On 06/18, Andy Lutomirski wrote:
>
> @@ -922,16 +922,7 @@ static int putreg32(struct task_struct *child, unsigned 
> regno, u32 value)
>   R32(esp, sp);
>  
>   case offsetof(struct user32, regs.orig_eax):
> - /*
> -  * A 32-bit debugger setting orig_eax means to restore
> -  * the state of the task restarting a 32-bit syscall.
> -  * Make sure we interpret the -ERESTART* codes correctly
> -  * in case the task is not actually still sitting at the
> -  * exit from a 32-bit syscall with TS_COMPAT still set.
> -  */
>   regs->orig_ax = value;
> - if (syscall_get_nr(child, regs) >= 0)
> - task_thread_info(child)->status |= TS_COMPAT;

I agree it would be nice to remove this code, but then it is not clear
how/when we should sign-extend regs->ax..

And this leads to another question, why do we actually need to set/clear
TS_COMPAT in set_personality_ia32() ??

Oleg.

Re: [PATCH] x86/ptrace: Remove questionable TS_COMPAT usage in ptrace

2016-06-19 Thread Oleg Nesterov

Let me first thank Pedro who has already replied!

And I have to admit I will need to re-read his explanations after
sleep to (try to) convince myself I fully understans the problems ;)
Too late for me.

Right now I have nothing to add, but

On 06/18, Andy Lutomirski wrote:
>
> @@ -922,16 +922,7 @@ static int putreg32(struct task_struct *child, unsigned 
> regno, u32 value)
>   R32(esp, sp);
>  
>   case offsetof(struct user32, regs.orig_eax):
> - /*
> -  * A 32-bit debugger setting orig_eax means to restore
> -  * the state of the task restarting a 32-bit syscall.
> -  * Make sure we interpret the -ERESTART* codes correctly
> -  * in case the task is not actually still sitting at the
> -  * exit from a 32-bit syscall with TS_COMPAT still set.
> -  */
>   regs->orig_ax = value;
> - if (syscall_get_nr(child, regs) >= 0)
> - task_thread_info(child)->status |= TS_COMPAT;

I agree it would be nice to remove this code, but then it is not clear
how/when we should sign-extend regs->ax..

And this leads to another question, why do we actually need to set/clear
TS_COMPAT in set_personality_ia32() ??

Oleg.

Re: [PATCH] staging: dgnc: add __exit macro to dgnc_driver.c

2016-06-19 Thread Joshua Houghton

On Saturday, 18 June 2016 19:42:07 UTC Joshua Houghton wrote:
> Add the __exit macro to the dgnc_cleanup_module(void) function
> in dgnc_driver.c
> 
> Signed-off-by: Joshua Houghton 
> ---
>  drivers/staging/dgnc/dgnc_driver.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/staging/dgnc/dgnc_driver.c
> b/drivers/staging/dgnc/dgnc_driver.c index af2e835..2c3fb2a 100644
> --- a/drivers/staging/dgnc/dgnc_driver.c
> +++ b/drivers/staging/dgnc/dgnc_driver.c
> @@ -158,7 +158,7 @@ static void cleanup(bool sysfiles)
>   *
>   * Module unload.  This is where it all ends.
>   */
> -static void dgnc_cleanup_module(void)
> +static void __exit dgnc_cleanup_module(void)
>  {
>   cleanup(true);
>   pci_unregister_driver(_driver);

Forgot to CC in Greg Kroah-Hartman

Re: [PATCH] staging: dgnc: add __exit macro to dgnc_driver.c

2016-06-19 Thread Joshua Houghton

On Saturday, 18 June 2016 19:42:07 UTC Joshua Houghton wrote:
> Add the __exit macro to the dgnc_cleanup_module(void) function
> in dgnc_driver.c
> 
> Signed-off-by: Joshua Houghton 
> ---
>  drivers/staging/dgnc/dgnc_driver.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/staging/dgnc/dgnc_driver.c
> b/drivers/staging/dgnc/dgnc_driver.c index af2e835..2c3fb2a 100644
> --- a/drivers/staging/dgnc/dgnc_driver.c
> +++ b/drivers/staging/dgnc/dgnc_driver.c
> @@ -158,7 +158,7 @@ static void cleanup(bool sysfiles)
>   *
>   * Module unload.  This is where it all ends.
>   */
> -static void dgnc_cleanup_module(void)
> +static void __exit dgnc_cleanup_module(void)
>  {
>   cleanup(true);
>   pci_unregister_driver(_driver);

Forgot to CC in Greg Kroah-Hartman

Re: [Intel-gfx] Bad flicker on skylake HQD due to code in the 4.7 merge window

2016-06-19 Thread James Bottomley

On Fri, 2016-06-17 at 16:06 -0700, James Bottomley wrote:
> On Fri, 2016-06-17 at 16:34 +0300, Jani Nikula wrote:
> > On Fri, 17 Jun 2016, Daniel Vetter  wrote:
> > > On Thu, Jun 16, 2016 at 03:42:12PM -0700, James Bottomley wrote:
> > > > On Thu, 2016-06-16 at 14:29 -0700, James Bottomley wrote:
> > > > > On Thu, 2016-06-16 at 23:24 +0200, Daniel Vetter wrote:
> > > > > > I guess we'll need the bisect on this one to make progress.
> > > > > 
> > > > > Sigh, I was afraid that might be the next step.
> > > > 
> > > > OK, I have a curious data point.  I assumed the problem would
> > > > be
> > > > somewhere in the drm update, so I started bisecting that at the
> > > > top. 
> > > >  However, the top most commit:
> > > > 
> > > > commit 1d6da87a3241deb13d073c4125d19ed0e5a0c62c
> > > > Merge: 1f40c49 a39ed68
> > > > Author: Linus Torvalds 
> > > > Date:   Mon May 23 11:48:48 2016 -0700
> > > > 
> > > > Merge branch 'drm-next' of
> > > > git://people.freedesktop.org/~airlied/linux
> > > > 
> > > > Isn't actually bad.  There's no flicker here, so whatever
> > > > caused
> > > > the
> > > > problem came from some update after this.
> > > 
> > > There was a fixes pull after this. Might be worth it to restrict
> > > to
> > > just
> > > the i915 changes, which are just
> > > 5b4fd5bb1230cd037..157d2c7fad0863222
> > > 
> > > Looking at those nothing seems to stick out which might explain
> > > what's
> > > happening for you.
> 
> OK, so just on the firmware, the system seems less flickery with the
> new 1.4.3 UEFI, so I'm starting to think it is a Skylake errata 
> issue.  The flicker isn't gone for good, but seems to be reboot 
> dependent (it's there in some boots, but gone on a reboot).
> 
> > This should be easy enough to try before bisecting:
> > http://patchwork.freedesktop.org/patch/msgid/1466162081-12042-1-git
> > -s
> > end-email-mika.kah...@intel.com
> 
> Applying this didn't seem to make a difference: still there on some 
> and gone on other reboots.

OK, my candidate bad commit is this one:

commit a05628195a0d9f3173dd9aa76f482aef692e46ee
Author: Ville Syrjälä 
Date:   Mon Apr 11 10:23:51 2016 +0300

drm/i915: Get panel_type from OpRegion panel details

After being more careful about waiting to identify flicker, this one
seems to be the one the bisect finds.  I'm now running v4.7-rc3 with
this one reverted and am currently seeing no flicker problems.  It is,
however, early days because the flicker can hide for long periods, so I
'll wait until Monday evening and a few reboots before declaring
victory.

James

Re: [Intel-gfx] Bad flicker on skylake HQD due to code in the 4.7 merge window

2016-06-19 Thread James Bottomley

On Fri, 2016-06-17 at 16:06 -0700, James Bottomley wrote:
> On Fri, 2016-06-17 at 16:34 +0300, Jani Nikula wrote:
> > On Fri, 17 Jun 2016, Daniel Vetter  wrote:
> > > On Thu, Jun 16, 2016 at 03:42:12PM -0700, James Bottomley wrote:
> > > > On Thu, 2016-06-16 at 14:29 -0700, James Bottomley wrote:
> > > > > On Thu, 2016-06-16 at 23:24 +0200, Daniel Vetter wrote:
> > > > > > I guess we'll need the bisect on this one to make progress.
> > > > > 
> > > > > Sigh, I was afraid that might be the next step.
> > > > 
> > > > OK, I have a curious data point.  I assumed the problem would
> > > > be
> > > > somewhere in the drm update, so I started bisecting that at the
> > > > top. 
> > > >  However, the top most commit:
> > > > 
> > > > commit 1d6da87a3241deb13d073c4125d19ed0e5a0c62c
> > > > Merge: 1f40c49 a39ed68
> > > > Author: Linus Torvalds 
> > > > Date:   Mon May 23 11:48:48 2016 -0700
> > > > 
> > > > Merge branch 'drm-next' of
> > > > git://people.freedesktop.org/~airlied/linux
> > > > 
> > > > Isn't actually bad.  There's no flicker here, so whatever
> > > > caused
> > > > the
> > > > problem came from some update after this.
> > > 
> > > There was a fixes pull after this. Might be worth it to restrict
> > > to
> > > just
> > > the i915 changes, which are just
> > > 5b4fd5bb1230cd037..157d2c7fad0863222
> > > 
> > > Looking at those nothing seems to stick out which might explain
> > > what's
> > > happening for you.
> 
> OK, so just on the firmware, the system seems less flickery with the
> new 1.4.3 UEFI, so I'm starting to think it is a Skylake errata 
> issue.  The flicker isn't gone for good, but seems to be reboot 
> dependent (it's there in some boots, but gone on a reboot).
> 
> > This should be easy enough to try before bisecting:
> > http://patchwork.freedesktop.org/patch/msgid/1466162081-12042-1-git
> > -s
> > end-email-mika.kah...@intel.com
> 
> Applying this didn't seem to make a difference: still there on some 
> and gone on other reboots.

OK, my candidate bad commit is this one:

commit a05628195a0d9f3173dd9aa76f482aef692e46ee
Author: Ville Syrjälä 
Date:   Mon Apr 11 10:23:51 2016 +0300

drm/i915: Get panel_type from OpRegion panel details

After being more careful about waiting to identify flicker, this one
seems to be the one the bisect finds.  I'm now running v4.7-rc3 with
this one reverted and am currently seeing no flicker problems.  It is,
however, early days because the flicker can hide for long periods, so I
'll wait until Monday evening and a few reboots before declaring
victory.

James

Re: [PATCH] x86/ptrace: Remove questionable TS_COMPAT usage in ptrace

2016-06-19 Thread Andy Lutomirski

On Sat, Jun 18, 2016 at 10:02 AM, Andy Lutomirski  wrote:
> On Jun 18, 2016 6:56 AM, "Pedro Alves"  wrote:
>>
>> On 06/18/2016 11:21 AM, Andy Lutomirski wrote:
>> > A 32-bit tracer can set a tracee's TS_COMPAT flag by poking orig_ax.
>> >
>> > - If the tracee is stopped in a 32-bit syscall, this has no effect
>> >   as TS_COMPAT will already be set.
>> >
>> > - If the tracee is stopped on entry to a 64-bit syscall, this could
>> >   cause problems: in_compat_syscall() etc will be out of sync with
>> >   the actual syscall table in use.  I can imagine this bre aking
>> >   audit.  (It can't meaningfully break seccomp, as a malicious
>> >   tracer can already fully bypass seccomp.)  I could also imagine it
>> >   subtly confusing the implementations of a few syscalls.
>> >
>> >  - If the tracee is stopped in a non-syscall context, then TS_COMPAT
>> >will end up set in user mode, which isn't supposed to happen.  The 
>> > results
>> >are likely to be similar to the 64-bit syscall entry case.
>> >
>> > Fix it by deleting the code.
>> >
>> > Here's my understanding of the previous intent.  Suppose a
>> > 32-bit tracee makes a syscall that returns one of the -ERESTART
>> > codes.  A 32-bit tracer saves away its register state.  The tracee
>> > resumes, returns from the system call, and gets stopped again for a
>> > non-syscall reason (e.g. a signal).  Then the tracer tries to roll
>> > back the tracee's state by writing all of the saved registers back.
>> >
>> > The result of this sequence of events will be that the tracee's
>> > registers' low bits match what they were at the end of the syscall
>> > but TS_COMPAT will be clear.  This will cause syscall_get_error() to
>> > return a *positive* number, because we zero-extend registers poked
>> > by 32-bit tracers instead of sign-extending them.  As a result, with
>> > this change, syscall restart won't happen, whereas, historically, it
>> > would have happened.
>> >
>> > As far as I can tell, this corner case isn't very important, and I
>>
>> I believe it's actually very much very important for gdb, for restoring
>> the inferior state when the user calls a function in the inferior, with:
>>
>>  (gdb) print foo()
>>
>> Some background here:
>>
>>  
>> http://linux-kernel.vger.kernel.narkive.com/fqrh4xKK/patch-x86-ptrace-sign-extend-eax-with-orig-eax-0
>
> Yuck.  I should have dug in to the history.  Why not just
> unconditionally sign-extend eax when set by a 32-bit tracer?
>
> Do you know how to acquire a copy of erestartsys-trap.c?  The old
> links appear to be broken.
>
> Also, while I have your attention: when gdb restores old state like
> this, does it do it with individual calls to PTRACE_POKEUSER or does
> it use SETREGSET or similar to do it all at once?  I'm asking because
> I have some other code (fsgsbase) that's on hold until I can figure
> out how to keep it from breaking gdb if and when gdb writes to fs and
> fs_base.

OK, I studied this a bit.  What a mess!  Here's what's going on AFAICT:

1. For reasons that probably made sense, the kernel delivers signals
and triggers ptrace before handling syscall restart.  This means that
-ERESTART_RESTARTBLOCK, etc is visible to userspace.  We could
plausibly get away with changing that, but it seems quite risky.

2. As a result of (1), gdb (quite reasonably) expects that it can
snapshot user state on signal delivery, adjust regs to call a
function, and then restore user state.

3. Presumably as a result of (2), we do syscall restart if indicated
by the register state on ptrace resume even if we're *not* resuming a
syscall.

4. Also as a result of (1), gdb expects that writing -1 to orig_eax
via POKEUSER or similar will *disable* syscall restart, which is
necessary to get function calling on syscall exit to work.

The combination of (1) and (4) means that we need to skip syscall
restart if orig_eax == -1 (in a 32-bit signed sense).  The combination
of (1) and (2) means that we need to enable syscall restart if
orig_eax > 0 (in a 32-bit signed sense) and eax contains a
-ERESTARTxyz code (again in a signed sense).  And, for extra fun, we
need to make sure that, if we set __NR_restart_syscall, we set the
correct value based on the syscall that we're restarting.

The latter bit is a mess and is probably broken on current kernels for
64-bit gdb attached to a 32-bit process.  (Is it?  All of this stuff
is a bit of a pain to test.)

Here's my proposal:

Step 1: for 4.7 and for -stable, introduce TS_I386_REGS_POKED.  Set it
in putreg32.  Use it in syscall_get_error, get_nr_restart_syscall,
etc.  Clear it in do_signal.  This gets rid of the arch confusion
while (hopefully) preserving current behavior.

Step 2: Optionally, for 4.8, consider cleaning up the whole mess.
First, change putreg32 to sign-extend orig_eax and eax and just get
rid of the TS_COMPAT checks in syscall_get_nr and syscall_get_error.
Then we either change the 64-bit -ERESTART_BLOCK to a different number
or

Re: [PATCH] x86/ptrace: Remove questionable TS_COMPAT usage in ptrace

2016-06-19 Thread Andy Lutomirski

On Sat, Jun 18, 2016 at 10:02 AM, Andy Lutomirski  wrote:
> On Jun 18, 2016 6:56 AM, "Pedro Alves"  wrote:
>>
>> On 06/18/2016 11:21 AM, Andy Lutomirski wrote:
>> > A 32-bit tracer can set a tracee's TS_COMPAT flag by poking orig_ax.
>> >
>> > - If the tracee is stopped in a 32-bit syscall, this has no effect
>> >   as TS_COMPAT will already be set.
>> >
>> > - If the tracee is stopped on entry to a 64-bit syscall, this could
>> >   cause problems: in_compat_syscall() etc will be out of sync with
>> >   the actual syscall table in use.  I can imagine this bre aking
>> >   audit.  (It can't meaningfully break seccomp, as a malicious
>> >   tracer can already fully bypass seccomp.)  I could also imagine it
>> >   subtly confusing the implementations of a few syscalls.
>> >
>> >  - If the tracee is stopped in a non-syscall context, then TS_COMPAT
>> >will end up set in user mode, which isn't supposed to happen.  The 
>> > results
>> >are likely to be similar to the 64-bit syscall entry case.
>> >
>> > Fix it by deleting the code.
>> >
>> > Here's my understanding of the previous intent.  Suppose a
>> > 32-bit tracee makes a syscall that returns one of the -ERESTART
>> > codes.  A 32-bit tracer saves away its register state.  The tracee
>> > resumes, returns from the system call, and gets stopped again for a
>> > non-syscall reason (e.g. a signal).  Then the tracer tries to roll
>> > back the tracee's state by writing all of the saved registers back.
>> >
>> > The result of this sequence of events will be that the tracee's
>> > registers' low bits match what they were at the end of the syscall
>> > but TS_COMPAT will be clear.  This will cause syscall_get_error() to
>> > return a *positive* number, because we zero-extend registers poked
>> > by 32-bit tracers instead of sign-extending them.  As a result, with
>> > this change, syscall restart won't happen, whereas, historically, it
>> > would have happened.
>> >
>> > As far as I can tell, this corner case isn't very important, and I
>>
>> I believe it's actually very much very important for gdb, for restoring
>> the inferior state when the user calls a function in the inferior, with:
>>
>>  (gdb) print foo()
>>
>> Some background here:
>>
>>  
>> http://linux-kernel.vger.kernel.narkive.com/fqrh4xKK/patch-x86-ptrace-sign-extend-eax-with-orig-eax-0
>
> Yuck.  I should have dug in to the history.  Why not just
> unconditionally sign-extend eax when set by a 32-bit tracer?
>
> Do you know how to acquire a copy of erestartsys-trap.c?  The old
> links appear to be broken.
>
> Also, while I have your attention: when gdb restores old state like
> this, does it do it with individual calls to PTRACE_POKEUSER or does
> it use SETREGSET or similar to do it all at once?  I'm asking because
> I have some other code (fsgsbase) that's on hold until I can figure
> out how to keep it from breaking gdb if and when gdb writes to fs and
> fs_base.

OK, I studied this a bit.  What a mess!  Here's what's going on AFAICT:

1. For reasons that probably made sense, the kernel delivers signals
and triggers ptrace before handling syscall restart.  This means that
-ERESTART_RESTARTBLOCK, etc is visible to userspace.  We could
plausibly get away with changing that, but it seems quite risky.

2. As a result of (1), gdb (quite reasonably) expects that it can
snapshot user state on signal delivery, adjust regs to call a
function, and then restore user state.

3. Presumably as a result of (2), we do syscall restart if indicated
by the register state on ptrace resume even if we're *not* resuming a
syscall.

4. Also as a result of (1), gdb expects that writing -1 to orig_eax
via POKEUSER or similar will *disable* syscall restart, which is
necessary to get function calling on syscall exit to work.

The combination of (1) and (4) means that we need to skip syscall
restart if orig_eax == -1 (in a 32-bit signed sense).  The combination
of (1) and (2) means that we need to enable syscall restart if
orig_eax > 0 (in a 32-bit signed sense) and eax contains a
-ERESTARTxyz code (again in a signed sense).  And, for extra fun, we
need to make sure that, if we set __NR_restart_syscall, we set the
correct value based on the syscall that we're restarting.

The latter bit is a mess and is probably broken on current kernels for
64-bit gdb attached to a 32-bit process.  (Is it?  All of this stuff
is a bit of a pain to test.)

Here's my proposal:

Step 1: for 4.7 and for -stable, introduce TS_I386_REGS_POKED.  Set it
in putreg32.  Use it in syscall_get_error, get_nr_restart_syscall,
etc.  Clear it in do_signal.  This gets rid of the arch confusion
while (hopefully) preserving current behavior.

Step 2: Optionally, for 4.8, consider cleaning up the whole mess.
First, change putreg32 to sign-extend orig_eax and eax and just get
rid of the TS_COMPAT checks in syscall_get_nr and syscall_get_error.
Then we either change the 64-bit -ERESTART_BLOCK to a different number
or encode the desired restart bitness in

Re: [PATCH v2 0/6] x86: Rewrite switch_to()

2016-06-19 Thread Brian Gerst

On Sat, Jun 18, 2016 at 4:56 PM, Brian Gerst  wrote:
> This patch set simplifies the switch_to() code, by moving the stack switch
> code out of line into an asm stub before calling __switch_to().  This ends
> up being more readable, and using the C calling convention instead of
> clobbering all registers improves code generation.  It also allows newly
> forked processes to construct a special stack frame to seamlessly flow
> to ret_from_fork, instead of using a test and branch, or an unbalanced
> call/ret.
>
> Changes from v1:
> - Added struct inactive_task_frame
> - Added comments about kernel threads returning to userspace
> - Cleaned up some incorrect uses of thread.sp

I forgot to also add:
- Rearranged inactive stack frame so that BP (frame pointer) is in the
natural position right below the return address.  This should take
care of unwinding issues Josh raised.

--
Brian Gerst

Re: [PATCH v2 0/6] x86: Rewrite switch_to()

2016-06-19 Thread Brian Gerst

On Sat, Jun 18, 2016 at 4:56 PM, Brian Gerst  wrote:
> This patch set simplifies the switch_to() code, by moving the stack switch
> code out of line into an asm stub before calling __switch_to().  This ends
> up being more readable, and using the C calling convention instead of
> clobbering all registers improves code generation.  It also allows newly
> forked processes to construct a special stack frame to seamlessly flow
> to ret_from_fork, instead of using a test and branch, or an unbalanced
> call/ret.
>
> Changes from v1:
> - Added struct inactive_task_frame
> - Added comments about kernel threads returning to userspace
> - Cleaned up some incorrect uses of thread.sp

I forgot to also add:
- Rearranged inactive stack frame so that BP (frame pointer) is in the
natural position right below the return address.  This should take
care of unwinding issues Josh raised.

--
Brian Gerst

Re: [PATCH v2 5/6] x86: Pass kernel thread parameters in fork_frame

2016-06-19 Thread Brian Gerst

On Sun, Jun 19, 2016 at 5:28 PM, Andy Lutomirski  wrote:
> On Sat, Jun 18, 2016 at 1:56 PM, Brian Gerst  wrote:
>> Instead of setting up a fake pt_regs context, put the kernel thread
>> function pointer and arg into the unused callee-restored registers
>> of struct fork_frame.
>
> This seems generally okay.
>
>>
>> Signed-off-by: Brian Gerst 
>
>> @@ -146,19 +147,12 @@ int copy_thread_tls(unsigned long clone_flags, 
>> unsigned long sp,
>
>
>> if (unlikely(p->flags & PF_KTHREAD)) {
>> /* kernel thread */
>> memset(childregs, 0, sizeof(struct pt_regs));
>> -   frame->ret_addr = (unsigned long) ret_from_kernel_thread;
>> -   task_user_gs(p) = __KERNEL_STACK_CANARY;
>> -   childregs->ds = __USER_DS;
>> -   childregs->es = __USER_DS;
>> -   childregs->fs = __KERNEL_PERCPU;
>
> Is the idea that do_execve promises to initialize all these fields to
> something sensible if the kernel thread in question tries to return to
> user mode?
>
> --Andy

Yes, do_execve() should be setting the full pt_regs.

--
Brian Gerst

Re: [PATCH v2 5/6] x86: Pass kernel thread parameters in fork_frame

2016-06-19 Thread Brian Gerst

On Sun, Jun 19, 2016 at 5:28 PM, Andy Lutomirski  wrote:
> On Sat, Jun 18, 2016 at 1:56 PM, Brian Gerst  wrote:
>> Instead of setting up a fake pt_regs context, put the kernel thread
>> function pointer and arg into the unused callee-restored registers
>> of struct fork_frame.
>
> This seems generally okay.
>
>>
>> Signed-off-by: Brian Gerst 
>
>> @@ -146,19 +147,12 @@ int copy_thread_tls(unsigned long clone_flags, 
>> unsigned long sp,
>
>
>> if (unlikely(p->flags & PF_KTHREAD)) {
>> /* kernel thread */
>> memset(childregs, 0, sizeof(struct pt_regs));
>> -   frame->ret_addr = (unsigned long) ret_from_kernel_thread;
>> -   task_user_gs(p) = __KERNEL_STACK_CANARY;
>> -   childregs->ds = __USER_DS;
>> -   childregs->es = __USER_DS;
>> -   childregs->fs = __KERNEL_PERCPU;
>
> Is the idea that do_execve promises to initialize all these fields to
> something sensible if the kernel thread in question tries to return to
> user mode?
>
> --Andy

Yes, do_execve() should be setting the full pt_regs.

--
Brian Gerst

Re: [RFC PATCH] vhost, mm: make sure that oom_reaper doesn't reap memory read by vhost

2016-06-19 Thread Michal Hocko

On Sun 19-06-16 23:35:43, Michal Hocko wrote:
> On Sat 18-06-16 03:09:02, Michael S. Tsirkin wrote:
> > On Fri, Jun 17, 2016 at 11:00:17AM +0200, Michal Hocko wrote:
[...]
> > >  /*
> > > + * A safe variant of __get_user for for use_mm() users to have a
> > > + * gurantee that the address space wasn't reaped in the background
> > > + */
> > > +#define __get_user_mm(mm, x, ptr)\
> > > +({   \
> > > + int ___gu_err = __get_user(x, ptr); \
> > > + if (!___gu_err && test_bit(MMF_UNSTABLE, >flags))   \
> > 
> > test_bit is somewhat expensive. See my old mail
> > x86/bitops: implement __test_bit
> 
> Do you have a msg_id?

Found it
http://lkml.kernel.org/r/1440776707-22016-1-git-send-email-...@redhat.com
-- 
Michal Hocko
SUSE Labs

Re: [RFC PATCH] vhost, mm: make sure that oom_reaper doesn't reap memory read by vhost

2016-06-19 Thread Michal Hocko

On Sun 19-06-16 23:35:43, Michal Hocko wrote:
> On Sat 18-06-16 03:09:02, Michael S. Tsirkin wrote:
> > On Fri, Jun 17, 2016 at 11:00:17AM +0200, Michal Hocko wrote:
[...]
> > >  /*
> > > + * A safe variant of __get_user for for use_mm() users to have a
> > > + * gurantee that the address space wasn't reaped in the background
> > > + */
> > > +#define __get_user_mm(mm, x, ptr)\
> > > +({   \
> > > + int ___gu_err = __get_user(x, ptr); \
> > > + if (!___gu_err && test_bit(MMF_UNSTABLE, >flags))   \
> > 
> > test_bit is somewhat expensive. See my old mail
> > x86/bitops: implement __test_bit
> 
> Do you have a msg_id?

Found it
http://lkml.kernel.org/r/1440776707-22016-1-git-send-email-...@redhat.com
-- 
Michal Hocko
SUSE Labs

Re: [RFC PATCH] vhost, mm: make sure that oom_reaper doesn't reap memory read by vhost

2016-06-19 Thread Michal Hocko

On Sat 18-06-16 03:09:02, Michael S. Tsirkin wrote:
> On Fri, Jun 17, 2016 at 11:00:17AM +0200, Michal Hocko wrote:
[...]
> > It seems that vhost usage would suffer from this problem because
> > it reads from the userspace to get (status) flags and makes some
> > decisions based on the read value.  I do not understand the code so I
> > couldn't evaluate whether that would lead to some real problems so I
> > conservatively assumed it wouldn't handle that gracefully.
> 
> Getting an error from __get_user and friends is handled gracefully.
> Getting zero instead of a real value will cause userspace
> memory corruption.

OK, thanks for the confirmation! I will add this to the changelog. I
assume that the memory corruption could "leak out" of the mm we just
read from, right? I am asking because the mm and all its users will die
by SIGKILL so they will not "see" the corruption. I am not familiar with the
vhost transfer model but I guess it wouldn't be uncommon if the target
memory could be a shared object (e.g. tmpfs or a regular file) so it
would outlive the mm.

[...]

> > @@ -1713,7 +1713,7 @@ bool vhost_enable_notify(struct vhost_dev *dev, 
> > struct vhost_virtqueue *vq)
> > /* They could have slipped one in as we were doing that: make
> >  * sure it's written, then check again. */
> > smp_mb();
> > -   r = __get_user(avail_idx, >avail->idx);
> > +   r = __get_user_mm(dev->mm,avail_idx, >avail->idx);
> 
> space after , pls

sure

> 
> > if (r) {
> > vq_err(vq, "Failed to check avail idx at %p: %d\n",
> >>avail->idx, r);
> > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > index 6d81a1eb974a..2b00ac7faa18 100644
> > --- a/include/linux/sched.h
> > +++ b/include/linux/sched.h
> > @@ -513,6 +513,7 @@ static inline int get_dumpable(struct mm_struct *mm)
> >  #define MMF_RECALC_UPROBES 20  /* MMF_HAS_UPROBES can be wrong */
> >  #define MMF_OOM_REAPED 21  /* mm has been already reaped */
> >  #define MMF_OOM_NOT_REAPABLE   22  /* mm couldn't be reaped */
> > +#define MMF_UNSTABLE   23  /* mm is unstable for 
> > copy_from_user */
> >  
> >  #define MMF_INIT_MASK  (MMF_DUMPABLE_MASK | 
> > MMF_DUMP_FILTER_MASK)
> >  
> > diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
> > index 349557825428..b1f314fca3c8 100644
> > --- a/include/linux/uaccess.h
> > +++ b/include/linux/uaccess.h
> > @@ -76,6 +76,28 @@ static inline unsigned long 
> > __copy_from_user_nocache(void *to,
> >  #endif /* ARCH_HAS_NOCACHE_UACCESS */
> >  
> >  /*
> > + * A safe variant of __get_user for for use_mm() users to have a
> > + * gurantee that the address space wasn't reaped in the background
> > + */
> > +#define __get_user_mm(mm, x, ptr)  \
> > +({ \
> > +   int ___gu_err = __get_user(x, ptr); \
> > +   if (!___gu_err && test_bit(MMF_UNSTABLE, >flags))   \
> 
> test_bit is somewhat expensive. See my old mail
>   x86/bitops: implement __test_bit

Do you have a msg_id?

> I dropped it as virtio just switched to simple &/| for features,
> but we might need something like this now.

Is this such a hot path that something like this would make a visible
difference? 

> 
> 
> 
> > +   ___gu_err = -EFAULT;\
> > +   ___gu_err;  \
> > +})
> > +
> > +/* similar to __get_user_mm */
> > +static inline __must_check long __copy_from_user_mm(struct mm_struct *mm,
> > +   void *to, const void __user * from, unsigned long n)
> > +{
> > +   long ret = __copy_from_user(to, from, n);
> > +   if (!ret && test_bit(MMF_UNSTABLE, >flags))
> > +   return -EFAULT;
> > +   return ret;

And I've just noticed that this is not correct. We need 
if ((ret >= 0) && test_bit(MMF_UNSTABLE, >flags))

[...]

> > diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> > index 6303bc7caeda..3fa43e96a59b 100644
> > --- a/mm/oom_kill.c
> > +++ b/mm/oom_kill.c
> > @@ -506,6 +506,12 @@ static bool __oom_reap_task(struct task_struct *tsk)
> > goto mm_drop;
> > }
> >  
> > +   /*
> > +* Tell all users of get_user_mm/copy_from_user_mm that the content
> > +* is no longer stable.
> > +*/
> > +   set_bit(MMF_UNSTABLE, >flags);
> > +
> 
> do we need some kind of barrier after this?

Well I believe we don't because unmapping the memory will likely
imply memory barriers on the way.

> 
> and if yes - does flag read need a barrier before it too?

A good question. I was basically assuming the same as above. If we didn't fault
then the oom reaper wouldn't touch that memory and so we are safe even when
we see the outdated mm flags, if the memory was reaped then we have to page
fault and that should imply memory barrier AFAIU.

Does that make sense?

> 
> > tlb_gather_mmu(, mm, 0, -1);
> > for (vma = mm->mmap ; vma; vma =

Re: [RFC PATCH] vhost, mm: make sure that oom_reaper doesn't reap memory read by vhost

2016-06-19 Thread Michal Hocko

On Sat 18-06-16 03:09:02, Michael S. Tsirkin wrote:
> On Fri, Jun 17, 2016 at 11:00:17AM +0200, Michal Hocko wrote:
[...]
> > It seems that vhost usage would suffer from this problem because
> > it reads from the userspace to get (status) flags and makes some
> > decisions based on the read value.  I do not understand the code so I
> > couldn't evaluate whether that would lead to some real problems so I
> > conservatively assumed it wouldn't handle that gracefully.
> 
> Getting an error from __get_user and friends is handled gracefully.
> Getting zero instead of a real value will cause userspace
> memory corruption.

OK, thanks for the confirmation! I will add this to the changelog. I
assume that the memory corruption could "leak out" of the mm we just
read from, right? I am asking because the mm and all its users will die
by SIGKILL so they will not "see" the corruption. I am not familiar with the
vhost transfer model but I guess it wouldn't be uncommon if the target
memory could be a shared object (e.g. tmpfs or a regular file) so it
would outlive the mm.

[...]

> > @@ -1713,7 +1713,7 @@ bool vhost_enable_notify(struct vhost_dev *dev, 
> > struct vhost_virtqueue *vq)
> > /* They could have slipped one in as we were doing that: make
> >  * sure it's written, then check again. */
> > smp_mb();
> > -   r = __get_user(avail_idx, >avail->idx);
> > +   r = __get_user_mm(dev->mm,avail_idx, >avail->idx);
> 
> space after , pls

sure

> 
> > if (r) {
> > vq_err(vq, "Failed to check avail idx at %p: %d\n",
> >>avail->idx, r);
> > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > index 6d81a1eb974a..2b00ac7faa18 100644
> > --- a/include/linux/sched.h
> > +++ b/include/linux/sched.h
> > @@ -513,6 +513,7 @@ static inline int get_dumpable(struct mm_struct *mm)
> >  #define MMF_RECALC_UPROBES 20  /* MMF_HAS_UPROBES can be wrong */
> >  #define MMF_OOM_REAPED 21  /* mm has been already reaped */
> >  #define MMF_OOM_NOT_REAPABLE   22  /* mm couldn't be reaped */
> > +#define MMF_UNSTABLE   23  /* mm is unstable for 
> > copy_from_user */
> >  
> >  #define MMF_INIT_MASK  (MMF_DUMPABLE_MASK | 
> > MMF_DUMP_FILTER_MASK)
> >  
> > diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
> > index 349557825428..b1f314fca3c8 100644
> > --- a/include/linux/uaccess.h
> > +++ b/include/linux/uaccess.h
> > @@ -76,6 +76,28 @@ static inline unsigned long 
> > __copy_from_user_nocache(void *to,
> >  #endif /* ARCH_HAS_NOCACHE_UACCESS */
> >  
> >  /*
> > + * A safe variant of __get_user for for use_mm() users to have a
> > + * gurantee that the address space wasn't reaped in the background
> > + */
> > +#define __get_user_mm(mm, x, ptr)  \
> > +({ \
> > +   int ___gu_err = __get_user(x, ptr); \
> > +   if (!___gu_err && test_bit(MMF_UNSTABLE, >flags))   \
> 
> test_bit is somewhat expensive. See my old mail
>   x86/bitops: implement __test_bit

Do you have a msg_id?

> I dropped it as virtio just switched to simple &/| for features,
> but we might need something like this now.

Is this such a hot path that something like this would make a visible
difference? 

> 
> 
> 
> > +   ___gu_err = -EFAULT;\
> > +   ___gu_err;  \
> > +})
> > +
> > +/* similar to __get_user_mm */
> > +static inline __must_check long __copy_from_user_mm(struct mm_struct *mm,
> > +   void *to, const void __user * from, unsigned long n)
> > +{
> > +   long ret = __copy_from_user(to, from, n);
> > +   if (!ret && test_bit(MMF_UNSTABLE, >flags))
> > +   return -EFAULT;
> > +   return ret;

And I've just noticed that this is not correct. We need 
if ((ret >= 0) && test_bit(MMF_UNSTABLE, >flags))

[...]

> > diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> > index 6303bc7caeda..3fa43e96a59b 100644
> > --- a/mm/oom_kill.c
> > +++ b/mm/oom_kill.c
> > @@ -506,6 +506,12 @@ static bool __oom_reap_task(struct task_struct *tsk)
> > goto mm_drop;
> > }
> >  
> > +   /*
> > +* Tell all users of get_user_mm/copy_from_user_mm that the content
> > +* is no longer stable.
> > +*/
> > +   set_bit(MMF_UNSTABLE, >flags);
> > +
> 
> do we need some kind of barrier after this?

Well I believe we don't because unmapping the memory will likely
imply memory barriers on the way.

> 
> and if yes - does flag read need a barrier before it too?

A good question. I was basically assuming the same as above. If we didn't fault
then the oom reaper wouldn't touch that memory and so we are safe even when
we see the outdated mm flags, if the memory was reaped then we have to page
fault and that should imply memory barrier AFAIU.

Does that make sense?

> 
> > tlb_gather_mmu(, mm, 0, -1);
> > for (vma = mm->mmap ; vma; vma =

Re: [PATCH v2 5/6] x86: Pass kernel thread parameters in fork_frame

2016-06-19 Thread Andy Lutomirski

On Sat, Jun 18, 2016 at 1:56 PM, Brian Gerst  wrote:
> Instead of setting up a fake pt_regs context, put the kernel thread
> function pointer and arg into the unused callee-restored registers
> of struct fork_frame.

This seems generally okay.

>
> Signed-off-by: Brian Gerst 

> @@ -146,19 +147,12 @@ int copy_thread_tls(unsigned long clone_flags, unsigned 
> long sp,


> if (unlikely(p->flags & PF_KTHREAD)) {
> /* kernel thread */
> memset(childregs, 0, sizeof(struct pt_regs));
> -   frame->ret_addr = (unsigned long) ret_from_kernel_thread;
> -   task_user_gs(p) = __KERNEL_STACK_CANARY;
> -   childregs->ds = __USER_DS;
> -   childregs->es = __USER_DS;
> -   childregs->fs = __KERNEL_PERCPU;

Is the idea that do_execve promises to initialize all these fields to
something sensible if the kernel thread in question tries to return to
user mode?

--Andy

Re: [PATCH v2 5/6] x86: Pass kernel thread parameters in fork_frame

2016-06-19 Thread Andy Lutomirski

On Sat, Jun 18, 2016 at 1:56 PM, Brian Gerst  wrote:
> Instead of setting up a fake pt_regs context, put the kernel thread
> function pointer and arg into the unused callee-restored registers
> of struct fork_frame.

This seems generally okay.

>
> Signed-off-by: Brian Gerst 

> @@ -146,19 +147,12 @@ int copy_thread_tls(unsigned long clone_flags, unsigned 
> long sp,


> if (unlikely(p->flags & PF_KTHREAD)) {
> /* kernel thread */
> memset(childregs, 0, sizeof(struct pt_regs));
> -   frame->ret_addr = (unsigned long) ret_from_kernel_thread;
> -   task_user_gs(p) = __KERNEL_STACK_CANARY;
> -   childregs->ds = __USER_DS;
> -   childregs->es = __USER_DS;
> -   childregs->fs = __KERNEL_PERCPU;

Is the idea that do_execve promises to initialize all these fields to
something sensible if the kernel thread in question tries to return to
user mode?

--Andy

Re: [PATCH v2 4/6] x86: Rewrite switch_to() code

2016-06-19 Thread Andy Lutomirski

On Sat, Jun 18, 2016 at 1:56 PM, Brian Gerst  wrote:
> Move the low-level context switch code to an out-of-line asm stub instead of
> using complex inline asm.  This allows constructing a new stack frame for the
> child process to make it seamlessly flow to ret_from_fork without an extra
> test and branch in __switch_to().  It also improves code generation for
> __schedule() by using the C calling convention instead of clobbering all
> registers.

This is great.  The conflict with my stack code is minimal enough that
I won't bother rebasing unless someone specifically wants me to.

Acked-by: Andy Lutomirski 

--Andy

Re: [PATCH v2 4/6] x86: Rewrite switch_to() code

2016-06-19 Thread Andy Lutomirski

On Sat, Jun 18, 2016 at 1:56 PM, Brian Gerst  wrote:
> Move the low-level context switch code to an out-of-line asm stub instead of
> using complex inline asm.  This allows constructing a new stack frame for the
> child process to make it seamlessly flow to ret_from_fork without an extra
> test and branch in __switch_to().  It also improves code generation for
> __schedule() by using the C calling convention instead of clobbering all
> registers.

This is great.  The conflict with my stack code is minimal enough that
I won't bother rebasing unless someone specifically wants me to.

Acked-by: Andy Lutomirski 

--Andy

Re: [PATCH v2 3/6] x86: Add struct inactive_task_frame

2016-06-19 Thread Andy Lutomirski

On Sat, Jun 18, 2016 at 1:56 PM, Brian Gerst  wrote:
> Add struct inactive_task_frame, which defines the layout of the stack for
> a sleeping process.  For now, the only defined field is the BP register
> (frame pointer).

I like this version.

Reviewed-by: Andy Lutomirski

Re: [PATCH v2 3/6] x86: Add struct inactive_task_frame

2016-06-19 Thread Andy Lutomirski

On Sat, Jun 18, 2016 at 1:56 PM, Brian Gerst  wrote:
> Add struct inactive_task_frame, which defines the layout of the stack for
> a sleeping process.  For now, the only defined field is the BP register
> (frame pointer).

I like this version.

Reviewed-by: Andy Lutomirski

Re: Let me know about regressions in 4.7 (was: Re: Linux 4.7-rc1)

2016-06-19 Thread Pavel Machek

Hi!

> On 29.05.2016 19:00, Linus Torvalds wrote:
> > [???] Anyway, enough blathering. Go out and test. [???]
> 
> And if you find any regressions in 4.7 pre-releases let me know via
> regressi...@leemhuis.info I'll try to compile them into a list and post
> it on LKML once a week similar to how Rafael did it until a few years
> ago. My plan is to do it for 4.7 and maybe 4.8. Don't expect me to do it
> for the next few years or so, as I fear kernel regression tracking
> might turn out to be quite a lot of work -- and it's work I'll be doing
> in my spare time for fun, just because I think someone should do it.

Heh, you are a brave soul :-).

I wonder if it might make sense to set up an alias
(regressi...@kernel.org) where these could be cc-ed? Perhaps we get a
next volunteer after you...

Pavel


-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

Re: Let me know about regressions in 4.7 (was: Re: Linux 4.7-rc1)

2016-06-19 Thread Pavel Machek

Hi!

> On 29.05.2016 19:00, Linus Torvalds wrote:
> > [???] Anyway, enough blathering. Go out and test. [???]
> 
> And if you find any regressions in 4.7 pre-releases let me know via
> regressi...@leemhuis.info I'll try to compile them into a list and post
> it on LKML once a week similar to how Rafael did it until a few years
> ago. My plan is to do it for 4.7 and maybe 4.8. Don't expect me to do it
> for the next few years or so, as I fear kernel regression tracking
> might turn out to be quite a lot of work -- and it's work I'll be doing
> in my spare time for fun, just because I think someone should do it.

Heh, you are a brave soul :-).

I wonder if it might make sense to set up an alias
(regressi...@kernel.org) where these could be cc-ed? Perhaps we get a
next volunteer after you...

Pavel


-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

Re: [PATCH 2/2] net: ethernet: bcmsysport: use phy_ethtool_{get|set}_link_ksettings

2016-06-19 Thread Florian Fainelli

Le 19/06/2016 11:39, Philippe Reynes a écrit :
> There are two generics functions phy_ethtool_{get|set}_link_ksettings,
> so we can use them instead of defining the same code in the driver.
> 
> Signed-off-by: Philippe Reynes 

Reviewed-by: Florian Fainelli 
-- 
Florian

Re: [PATCH 2/2] net: ethernet: bcmsysport: use phy_ethtool_{get|set}_link_ksettings

2016-06-19 Thread Florian Fainelli

Le 19/06/2016 11:39, Philippe Reynes a écrit :
> There are two generics functions phy_ethtool_{get|set}_link_ksettings,
> so we can use them instead of defining the same code in the driver.
> 
> Signed-off-by: Philippe Reynes 

Reviewed-by: Florian Fainelli 
-- 
Florian

Re: [PATCH 1/2] net: ethernet: bcmsysport: use phydev from struct net_device

2016-06-19 Thread Florian Fainelli

Le 19/06/2016 11:39, Philippe Reynes a écrit :
> The private structure contain a pointer to phydev, but the structure
> net_device already contain such pointer. So we can remove the pointer
> phydev in the private structure, and update the driver to use the
> one contained in struct net_device.
> 
> Signed-off-by: Philippe Reynes 

Reviewed-by: Florian Fainelli 
-- 
Florian

Re: [PATCH 1/2] net: ethernet: bcmsysport: use phydev from struct net_device

2016-06-19 Thread Florian Fainelli

Le 19/06/2016 11:39, Philippe Reynes a écrit :
> The private structure contain a pointer to phydev, but the structure
> net_device already contain such pointer. So we can remove the pointer
> phydev in the private structure, and update the driver to use the
> one contained in struct net_device.
> 
> Signed-off-by: Philippe Reynes 

Reviewed-by: Florian Fainelli 
-- 
Florian

[PATCH 2/2] net: ethernet: bgmac: use phy_ethtool_{get|set}_link_ksettings

2016-06-19 Thread Philippe Reynes

There are two generics functions phy_ethtool_{get|set}_link_ksettings,
so we can use them instead of defining the same code in the driver.

Signed-off-by: Philippe Reynes 
---
 drivers/net/ethernet/broadcom/bgmac.c |   20 ++--
 1 files changed, 2 insertions(+), 18 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bgmac.c 
b/drivers/net/ethernet/broadcom/bgmac.c
index 90ed244..e6e74ca 100644
--- a/drivers/net/ethernet/broadcom/bgmac.c
+++ b/drivers/net/ethernet/broadcom/bgmac.c
@@ -1511,22 +1511,6 @@ static void bgmac_get_ethtool_stats(struct net_device 
*dev,
}
 }
 
-static int bgmac_get_settings(struct net_device *net_dev,
- struct ethtool_cmd *cmd)
-{
-   struct bgmac *bgmac = netdev_priv(net_dev);
-
-   return phy_ethtool_gset(net_dev->phydev, cmd);
-}
-
-static int bgmac_set_settings(struct net_device *net_dev,
- struct ethtool_cmd *cmd)
-{
-   struct bgmac *bgmac = netdev_priv(net_dev);
-
-   return phy_ethtool_sset(net_dev->phydev, cmd);
-}
-
 static void bgmac_get_drvinfo(struct net_device *net_dev,
  struct ethtool_drvinfo *info)
 {
@@ -1538,9 +1522,9 @@ static const struct ethtool_ops bgmac_ethtool_ops = {
.get_strings= bgmac_get_strings,
.get_sset_count = bgmac_get_sset_count,
.get_ethtool_stats  = bgmac_get_ethtool_stats,
-   .get_settings   = bgmac_get_settings,
-   .set_settings   = bgmac_set_settings,
.get_drvinfo= bgmac_get_drvinfo,
+   .get_link_ksettings = phy_ethtool_get_link_ksettings,
+   .set_link_ksettings = phy_ethtool_set_link_ksettings,
 };
 
 /**
-- 
1.7.4.4

[PATCH 2/2] net: ethernet: bgmac: use phy_ethtool_{get|set}_link_ksettings

2016-06-19 Thread Philippe Reynes

There are two generics functions phy_ethtool_{get|set}_link_ksettings,
so we can use them instead of defining the same code in the driver.

Signed-off-by: Philippe Reynes 
---
 drivers/net/ethernet/broadcom/bgmac.c |   20 ++--
 1 files changed, 2 insertions(+), 18 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bgmac.c 
b/drivers/net/ethernet/broadcom/bgmac.c
index 90ed244..e6e74ca 100644
--- a/drivers/net/ethernet/broadcom/bgmac.c
+++ b/drivers/net/ethernet/broadcom/bgmac.c
@@ -1511,22 +1511,6 @@ static void bgmac_get_ethtool_stats(struct net_device 
*dev,
}
 }
 
-static int bgmac_get_settings(struct net_device *net_dev,
- struct ethtool_cmd *cmd)
-{
-   struct bgmac *bgmac = netdev_priv(net_dev);
-
-   return phy_ethtool_gset(net_dev->phydev, cmd);
-}
-
-static int bgmac_set_settings(struct net_device *net_dev,
- struct ethtool_cmd *cmd)
-{
-   struct bgmac *bgmac = netdev_priv(net_dev);
-
-   return phy_ethtool_sset(net_dev->phydev, cmd);
-}
-
 static void bgmac_get_drvinfo(struct net_device *net_dev,
  struct ethtool_drvinfo *info)
 {
@@ -1538,9 +1522,9 @@ static const struct ethtool_ops bgmac_ethtool_ops = {
.get_strings= bgmac_get_strings,
.get_sset_count = bgmac_get_sset_count,
.get_ethtool_stats  = bgmac_get_ethtool_stats,
-   .get_settings   = bgmac_get_settings,
-   .set_settings   = bgmac_set_settings,
.get_drvinfo= bgmac_get_drvinfo,
+   .get_link_ksettings = phy_ethtool_get_link_ksettings,
+   .set_link_ksettings = phy_ethtool_set_link_ksettings,
 };
 
 /**
-- 
1.7.4.4

Re: [PATCH v5 0/7] /dev/random - a new approach

2016-06-19 Thread Sandy Harris

On Sun, Jun 19, 2016 at 3:36 PM, Pavel Machek  wrote:

>> The following patch set provides a different approach to /dev/random ...
>
> Dunno. It is very similar to existing rng, AFAICT.

I do not think so. A lot of the basic principles are the same of course,
but Stephan is suggesting some real changes. On the other hand, I'm
not sure all of them are good ideas & Ted has already incorporated
some into the driver, so it is debatable how much here is really useful.

> And at the very least, constants in existing RNG could be tuned
> to provide "entropy at the boot time".

No, this is a rather hard problem & just tweaking definitely will
not solve it. Ted's patches, Stephan's, mine, the grsecurity
stuff and the kernel hardening project all have things that
might help, but as far as I can see there is no complete
in-kernel solution yet.

Closest thing I have seen to a solution are Denker's suggestions at:
http://www.av8n.com/computer/htm/secure-random.htm#sec-boot-image

Those, though, require changes to build & installation methods
& it might be hard to get distros & device vendors to do it.

> So IMO this should be re-done as tweaks to existing design, not as
> completely new RNG.

I agree, & I think Stephan has already done some of that.

Re: [PATCH v5 0/7] /dev/random - a new approach

2016-06-19 Thread Sandy Harris

On Sun, Jun 19, 2016 at 3:36 PM, Pavel Machek  wrote:

>> The following patch set provides a different approach to /dev/random ...
>
> Dunno. It is very similar to existing rng, AFAICT.

I do not think so. A lot of the basic principles are the same of course,
but Stephan is suggesting some real changes. On the other hand, I'm
not sure all of them are good ideas & Ted has already incorporated
some into the driver, so it is debatable how much here is really useful.

> And at the very least, constants in existing RNG could be tuned
> to provide "entropy at the boot time".

No, this is a rather hard problem & just tweaking definitely will
not solve it. Ted's patches, Stephan's, mine, the grsecurity
stuff and the kernel hardening project all have things that
might help, but as far as I can see there is no complete
in-kernel solution yet.

Closest thing I have seen to a solution are Denker's suggestions at:
http://www.av8n.com/computer/htm/secure-random.htm#sec-boot-image

Those, though, require changes to build & installation methods
& it might be hard to get distros & device vendors to do it.

> So IMO this should be re-done as tweaks to existing design, not as
> completely new RNG.

I agree, & I think Stephan has already done some of that.

[PATCH] checkpatch: remove obsolete CONFIG_EXPERIMENTAL checks

2016-06-19 Thread Ruslan Bilovol

Config EXPERIMENTAL has been removed from kernel in 2013
(see 3d374d0: "final removal of CONFIG_EXPERIMENTAL"),
there is no any reason to do these checks now.

Signed-off-by: Ruslan Bilovol 
---
 scripts/checkpatch.pl | 13 -
 1 file changed, 13 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 4904ced..4ede92c 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -2670,13 +2670,6 @@ sub process {
#print "is_start<$is_start> is_end<$is_end> 
length<$length>\n";
}
 
-# discourage the addition of CONFIG_EXPERIMENTAL in Kconfig.
-   if ($realfile =~ /Kconfig/ &&
-   $line =~ /.\s*depends on\s+.*\bEXPERIMENTAL\b/) {
-   WARN("CONFIG_EXPERIMENTAL",
-"Use of CONFIG_EXPERIMENTAL is deprecated. For 
alternatives, see https://lkml.org/lkml/2012/10/23/580\n;);
-   }
-
 # discourage the use of boolean for type definition attributes of Kconfig 
options
if ($realfile =~ /Kconfig/ &&
$line =~ /^\+\s*\bboolean\b/) {
@@ -3042,12 +3035,6 @@ sub process {
}
}
 
-# discourage the addition of CONFIG_EXPERIMENTAL in #if(def).
-   if ($line =~ /^\+\s*\#\s*if.*\bCONFIG_EXPERIMENTAL\b/) {
-   WARN("CONFIG_EXPERIMENTAL",
-"Use of CONFIG_EXPERIMENTAL is deprecated. For 
alternatives, see https://lkml.org/lkml/2012/10/23/580\n;);
-   }
-
 # check for RCS/CVS revision markers
if ($rawline =~ /^\+.*\$(Revision|Log|Id)(?:\$|)/) {
WARN("CVS_KEYWORD",
-- 
1.9.1

[PATCH] checkpatch: remove obsolete CONFIG_EXPERIMENTAL checks

2016-06-19 Thread Ruslan Bilovol

Config EXPERIMENTAL has been removed from kernel in 2013
(see 3d374d0: "final removal of CONFIG_EXPERIMENTAL"),
there is no any reason to do these checks now.

Signed-off-by: Ruslan Bilovol 
---
 scripts/checkpatch.pl | 13 -
 1 file changed, 13 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 4904ced..4ede92c 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -2670,13 +2670,6 @@ sub process {
#print "is_start<$is_start> is_end<$is_end> 
length<$length>\n";
}
 
-# discourage the addition of CONFIG_EXPERIMENTAL in Kconfig.
-   if ($realfile =~ /Kconfig/ &&
-   $line =~ /.\s*depends on\s+.*\bEXPERIMENTAL\b/) {
-   WARN("CONFIG_EXPERIMENTAL",
-"Use of CONFIG_EXPERIMENTAL is deprecated. For 
alternatives, see https://lkml.org/lkml/2012/10/23/580\n;);
-   }
-
 # discourage the use of boolean for type definition attributes of Kconfig 
options
if ($realfile =~ /Kconfig/ &&
$line =~ /^\+\s*\bboolean\b/) {
@@ -3042,12 +3035,6 @@ sub process {
}
}
 
-# discourage the addition of CONFIG_EXPERIMENTAL in #if(def).
-   if ($line =~ /^\+\s*\#\s*if.*\bCONFIG_EXPERIMENTAL\b/) {
-   WARN("CONFIG_EXPERIMENTAL",
-"Use of CONFIG_EXPERIMENTAL is deprecated. For 
alternatives, see https://lkml.org/lkml/2012/10/23/580\n;);
-   }
-
 # check for RCS/CVS revision markers
if ($rawline =~ /^\+.*\$(Revision|Log|Id)(?:\$|)/) {
WARN("CVS_KEYWORD",
-- 
1.9.1

Re: [PATCH 1/3] toshiba_acpi: Add IIO interface for accelerometer axis data

2016-06-19 Thread Jonathan Cameron

On 11/06/16 19:57, Azael Avalos wrote:
> This patch adds the accelerometer axis data to the IIO subsystem.
> 
> Currently reporting the X, Y and Z values, as no other data can be
> queried given the fact that the accelerometer chip itself is hidden
> behind the Toshiba proprietary interface.
> 
> Signed-off-by: Azael Avalos 
Looks pretty good and simple to me.  A few bits and bobs inline.

Jonathan
> ---
> All:
>  This is my first attempt with the IIO subsysem, I'll be looking
>  forward for your valuable input on this.
> 
> Darren:
>  There's a warning about more than 80 columns on this patch, once
>  I get feedback from the IIO guys I'll respin this with that issue
>  corrected.
> 
>  drivers/platform/x86/toshiba_acpi.c | 130 
> ++--
>  1 file changed, 126 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/platform/x86/toshiba_acpi.c 
> b/drivers/platform/x86/toshiba_acpi.c
> index 01e12d2..85014a3 100644
> --- a/drivers/platform/x86/toshiba_acpi.c
> +++ b/drivers/platform/x86/toshiba_acpi.c
> @@ -53,6 +53,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  
> @@ -134,6 +135,7 @@ MODULE_LICENSE("GPL");
>  
>  /* Field definitions */
>  #define HCI_ACCEL_MASK   0x7fff
> +#define HCI_ACCEL_DIRECTION_MASK 0x8000
>  #define HCI_HOTKEY_DISABLE   0x0b
>  #define HCI_HOTKEY_ENABLE0x09
>  #define HCI_HOTKEY_SPECIAL_FUNCTIONS 0x10
> @@ -178,6 +180,7 @@ struct toshiba_acpi_dev {
>   struct led_classdev eco_led;
>   struct miscdevice miscdev;
>   struct rfkill *wwan_rfk;
> + struct iio_dev *indio_dev;
>  
>   int force_fan;
>   int last_key_event;
> @@ -1962,8 +1965,8 @@ static ssize_t position_show(struct device *dev,
>struct device_attribute *attr, char *buf)
>  {
>   struct toshiba_acpi_dev *toshiba = dev_get_drvdata(dev);
> - u32 xyval, zval, tmp;
> - u16 x, y, z;
> + u32 xyval, zval;
> + int x, y, z;
>   int ret;
>  
>   xyval = zval = 0;
> @@ -1971,10 +1974,14 @@ static ssize_t position_show(struct device *dev,
>   if (ret < 0)
>   return ret;
>  
> + /* Accelerometer values */
>   x = xyval & HCI_ACCEL_MASK;
> - tmp = xyval >> HCI_MISC_SHIFT;
> - y = tmp & HCI_ACCEL_MASK;
> + y = (xyval >> HCI_MISC_SHIFT) & HCI_ACCEL_MASK;
>   z = zval & HCI_ACCEL_MASK;
> + /* Movement direction */
> + x *= xyval & HCI_ACCEL_DIRECTION_MASK ? -1 : 1;
> + y *= (xyval >> HCI_MISC_SHIFT) & HCI_ACCEL_DIRECTION_MASK ? -1 : 1;
> + z *= zval & HCI_ACCEL_DIRECTION_MASK ? -1 : 1;
This lot is really an unrelated change - do it as a separate precursor patch
to the IIO support.
>  
>   return sprintf(buf, "%d %d %d\n", x, y, z);
>  }
> @@ -2420,6 +2427,94 @@ static void toshiba_acpi_kbd_bl_work(struct 
> work_struct *work)
>  }
>  
>  /*
> + * IIO device
> + */
> +
> +enum toshiba_accel_chan {
> + AXIS_X,
> + AXIS_Y,
> + AXIS_Z
> +};
> +
> +static int toshiba_accel_get_axis(enum toshiba_accel_chan chan)
> +{
> + u32 xyval, zval;
> + int x, y, z;
> + int ret;
> +
> + xyval = zval = 0;
> + ret = toshiba_accelerometer_get(toshiba_acpi, , );
> + if (ret < 0)
> + return ret;
> +
> + /* Accelerometer values */
> + x = xyval & HCI_ACCEL_MASK;
> + y = (xyval >> HCI_MISC_SHIFT) & HCI_ACCEL_MASK;
> + z = zval & HCI_ACCEL_MASK;
> + /* Movement direction */
> + x *= xyval & HCI_ACCEL_DIRECTION_MASK ? -1 : 1;
> + y *= (xyval >> HCI_MISC_SHIFT) & HCI_ACCEL_DIRECTION_MASK ? -1 : 1;
> + z *= zval & HCI_ACCEL_DIRECTION_MASK ? -1 : 1;
Wow, that's hideous ;)
> +
> + switch (chan) {
> + case AXIS_X:
> + ret = x;
> + break;
> + case AXIS_Y:
> + ret = y;
> + break;
> + case AXIS_Z:
> + ret = z;
> + break;
Just compute the one you are returning perhaps?
case AXIS_X:
 return xyval & HCI_ACCEL_DIRECTION_MASK ?
-(xyval & HCI_ACCEL_MASK) :
xyval & HCI_ACCEL_MASK;
etc?
Brings all the 'mess' into one location.
Or break it out into steps which is fine, but only compute the one
we care about.
 
> + }
> +
> + return ret;
> +}
> +
> +static int toshiba_accel_read_raw(struct iio_dev *indio_dev,
> +   struct iio_chan_spec const *chan,
> +   int *val, int *val2, long mask)
> +{
> + int ret;
> +
> + switch (mask) {
> + case IIO_CHAN_INFO_RAW:
> + if (iio_buffer_enabled(indio_dev))
> + return -EBUSY;
Couple of things here.
* you aren't supporting buffered reads (pushed data flows) so
very unlikely the buffer would be enabled.
* if you were you'd need to be holding indio_dev->mlock to
avoid races around entering buffered mode mid way through this
function.  Note we have the claim_direct functions to

Re: linux-next: manual merge of the staging tree with the staging.current tree

2016-06-19 Thread Jonathan Cameron

On 14/06/16 06:04, Stephen Rothwell wrote:
> Hi Greg,
> 
> Today's linux-next merge of the staging tree got a conflict in:
> 
>   drivers/iio/industrialio-trigger.c
> 
> between commit:
> 
>   995438233579 ("iio: Fix error handling in iio_trigger_attach_poll_func")
> 
> from the staging.current tree and commit:
> 
>   ef2d71d6b7fb ("iio: triggers: Make trigger ops structure explicitly non 
> optional.")
> 
> from the staging tree.
> 
> I fixed it up (see below) and can carry the fix as necessary. This
> is now fixed as far as linux-next is concerned, but any non trivial
> conflicts should be mentioned to your upstream maintainer when your tree
> is submitted for merging.  You may also want to consider cooperating
> with the maintainer of the conflicting tree to minimise any particularly
> complex conflicts.
> 
Thanks Stephen,

Looks great.

Jonathan

Re: [PATCH 1/3] toshiba_acpi: Add IIO interface for accelerometer axis data

2016-06-19 Thread Jonathan Cameron

On 11/06/16 19:57, Azael Avalos wrote:
> This patch adds the accelerometer axis data to the IIO subsystem.
> 
> Currently reporting the X, Y and Z values, as no other data can be
> queried given the fact that the accelerometer chip itself is hidden
> behind the Toshiba proprietary interface.
> 
> Signed-off-by: Azael Avalos 
Looks pretty good and simple to me.  A few bits and bobs inline.

Jonathan
> ---
> All:
>  This is my first attempt with the IIO subsysem, I'll be looking
>  forward for your valuable input on this.
> 
> Darren:
>  There's a warning about more than 80 columns on this patch, once
>  I get feedback from the IIO guys I'll respin this with that issue
>  corrected.
> 
>  drivers/platform/x86/toshiba_acpi.c | 130 
> ++--
>  1 file changed, 126 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/platform/x86/toshiba_acpi.c 
> b/drivers/platform/x86/toshiba_acpi.c
> index 01e12d2..85014a3 100644
> --- a/drivers/platform/x86/toshiba_acpi.c
> +++ b/drivers/platform/x86/toshiba_acpi.c
> @@ -53,6 +53,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  
> @@ -134,6 +135,7 @@ MODULE_LICENSE("GPL");
>  
>  /* Field definitions */
>  #define HCI_ACCEL_MASK   0x7fff
> +#define HCI_ACCEL_DIRECTION_MASK 0x8000
>  #define HCI_HOTKEY_DISABLE   0x0b
>  #define HCI_HOTKEY_ENABLE0x09
>  #define HCI_HOTKEY_SPECIAL_FUNCTIONS 0x10
> @@ -178,6 +180,7 @@ struct toshiba_acpi_dev {
>   struct led_classdev eco_led;
>   struct miscdevice miscdev;
>   struct rfkill *wwan_rfk;
> + struct iio_dev *indio_dev;
>  
>   int force_fan;
>   int last_key_event;
> @@ -1962,8 +1965,8 @@ static ssize_t position_show(struct device *dev,
>struct device_attribute *attr, char *buf)
>  {
>   struct toshiba_acpi_dev *toshiba = dev_get_drvdata(dev);
> - u32 xyval, zval, tmp;
> - u16 x, y, z;
> + u32 xyval, zval;
> + int x, y, z;
>   int ret;
>  
>   xyval = zval = 0;
> @@ -1971,10 +1974,14 @@ static ssize_t position_show(struct device *dev,
>   if (ret < 0)
>   return ret;
>  
> + /* Accelerometer values */
>   x = xyval & HCI_ACCEL_MASK;
> - tmp = xyval >> HCI_MISC_SHIFT;
> - y = tmp & HCI_ACCEL_MASK;
> + y = (xyval >> HCI_MISC_SHIFT) & HCI_ACCEL_MASK;
>   z = zval & HCI_ACCEL_MASK;
> + /* Movement direction */
> + x *= xyval & HCI_ACCEL_DIRECTION_MASK ? -1 : 1;
> + y *= (xyval >> HCI_MISC_SHIFT) & HCI_ACCEL_DIRECTION_MASK ? -1 : 1;
> + z *= zval & HCI_ACCEL_DIRECTION_MASK ? -1 : 1;
This lot is really an unrelated change - do it as a separate precursor patch
to the IIO support.
>  
>   return sprintf(buf, "%d %d %d\n", x, y, z);
>  }
> @@ -2420,6 +2427,94 @@ static void toshiba_acpi_kbd_bl_work(struct 
> work_struct *work)
>  }
>  
>  /*
> + * IIO device
> + */
> +
> +enum toshiba_accel_chan {
> + AXIS_X,
> + AXIS_Y,
> + AXIS_Z
> +};
> +
> +static int toshiba_accel_get_axis(enum toshiba_accel_chan chan)
> +{
> + u32 xyval, zval;
> + int x, y, z;
> + int ret;
> +
> + xyval = zval = 0;
> + ret = toshiba_accelerometer_get(toshiba_acpi, , );
> + if (ret < 0)
> + return ret;
> +
> + /* Accelerometer values */
> + x = xyval & HCI_ACCEL_MASK;
> + y = (xyval >> HCI_MISC_SHIFT) & HCI_ACCEL_MASK;
> + z = zval & HCI_ACCEL_MASK;
> + /* Movement direction */
> + x *= xyval & HCI_ACCEL_DIRECTION_MASK ? -1 : 1;
> + y *= (xyval >> HCI_MISC_SHIFT) & HCI_ACCEL_DIRECTION_MASK ? -1 : 1;
> + z *= zval & HCI_ACCEL_DIRECTION_MASK ? -1 : 1;
Wow, that's hideous ;)
> +
> + switch (chan) {
> + case AXIS_X:
> + ret = x;
> + break;
> + case AXIS_Y:
> + ret = y;
> + break;
> + case AXIS_Z:
> + ret = z;
> + break;
Just compute the one you are returning perhaps?
case AXIS_X:
 return xyval & HCI_ACCEL_DIRECTION_MASK ?
-(xyval & HCI_ACCEL_MASK) :
xyval & HCI_ACCEL_MASK;
etc?
Brings all the 'mess' into one location.
Or break it out into steps which is fine, but only compute the one
we care about.
 
> + }
> +
> + return ret;
> +}
> +
> +static int toshiba_accel_read_raw(struct iio_dev *indio_dev,
> +   struct iio_chan_spec const *chan,
> +   int *val, int *val2, long mask)
> +{
> + int ret;
> +
> + switch (mask) {
> + case IIO_CHAN_INFO_RAW:
> + if (iio_buffer_enabled(indio_dev))
> + return -EBUSY;
Couple of things here.
* you aren't supporting buffered reads (pushed data flows) so
very unlikely the buffer would be enabled.
* if you were you'd need to be holding indio_dev->mlock to
avoid races around entering buffered mode mid way through this
function.  Note we have the claim_direct functions to handle this
case

Re: linux-next: manual merge of the staging tree with the staging.current tree

2016-06-19 Thread Jonathan Cameron

On 14/06/16 06:04, Stephen Rothwell wrote:
> Hi Greg,
> 
> Today's linux-next merge of the staging tree got a conflict in:
> 
>   drivers/iio/industrialio-trigger.c
> 
> between commit:
> 
>   995438233579 ("iio: Fix error handling in iio_trigger_attach_poll_func")
> 
> from the staging.current tree and commit:
> 
>   ef2d71d6b7fb ("iio: triggers: Make trigger ops structure explicitly non 
> optional.")
> 
> from the staging tree.
> 
> I fixed it up (see below) and can carry the fix as necessary. This
> is now fixed as far as linux-next is concerned, but any non trivial
> conflicts should be mentioned to your upstream maintainer when your tree
> is submitted for merging.  You may also want to consider cooperating
> with the maintainer of the conflicting tree to minimise any particularly
> complex conflicts.
> 
Thanks Stephen,

Looks great.

Jonathan

Re: [PATCH 0/4] ASoC: tpa6130a2: Add support for multiple instances

2016-06-19 Thread Sebastian Reichel

Hi,

On Sat, Jun 18, 2016 at 12:26:27AM -0300, Helen Koike wrote:
> The current tpa6130a2 driver supports only a single instance.
> This patch series add support for multiple instances by removing the global
> variable that holds the instance.
> This is performed by using the component API, regmap, the
> snd_soc_{info,put,get}_volsw API and DAPM.
> 
> This patch series also touches code from the Nokia RX51 which I didn't tested 
> (as
> I am testing the tpa6130a2 in another board that is not upstream).
> I would appreciate is if someone who possesses the Nokia RX51 (n900) could 
> please
> test the code.
> 
> This patch series is based on 
> git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next
> Available at 
> https://git.collabora.com/cgit/user/koike/linux.git/log/?h=sound/review/tpa6130a2

Locking at the resulting tpa6130a2.c I suggest to add a patch
removing the useless "goto err_gpio" in tpa6130a2_probe().

Also switching to gpiod_get() safes some more lines of code.
It may make sense to postpone this to 4.9, though (this change
involves removing the gpio from the platform data and the rx51
will become DT only in 4.8).

-- Sebastian


signature.asc
Description: PGP signature

Re: [PATCH 0/4] ASoC: tpa6130a2: Add support for multiple instances

2016-06-19 Thread Sebastian Reichel

Hi,

On Sat, Jun 18, 2016 at 12:26:27AM -0300, Helen Koike wrote:
> The current tpa6130a2 driver supports only a single instance.
> This patch series add support for multiple instances by removing the global
> variable that holds the instance.
> This is performed by using the component API, regmap, the
> snd_soc_{info,put,get}_volsw API and DAPM.
> 
> This patch series also touches code from the Nokia RX51 which I didn't tested 
> (as
> I am testing the tpa6130a2 in another board that is not upstream).
> I would appreciate is if someone who possesses the Nokia RX51 (n900) could 
> please
> test the code.
> 
> This patch series is based on 
> git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next
> Available at 
> https://git.collabora.com/cgit/user/koike/linux.git/log/?h=sound/review/tpa6130a2

Locking at the resulting tpa6130a2.c I suggest to add a patch
removing the useless "goto err_gpio" in tpa6130a2_probe().

Also switching to gpiod_get() safes some more lines of code.
It may make sense to postpone this to 4.9, though (this change
involves removing the gpio from the platform data and the rx51
will become DT only in 4.8).

-- Sebastian


signature.asc
Description: PGP signature

Re: [PATCH 4/4] ASoC: tpa6130a2: Add DAPM support

2016-06-19 Thread Sebastian Reichel

Hi,

On Sun, Jun 19, 2016 at 11:39:15AM +0200, Lars-Peter Clausen wrote:
> On 06/19/2016 01:21 AM, Sebastian Reichel wrote:
> > On Sat, Jun 18, 2016 at 12:26:31AM -0300, Helen Koike wrote:
> >> Add DAPM support and updated rx51 accordingly.
> >> As a consequence:
> >> - the exported function tpa6130a2_stereo_enable is not needed anymore
> >> - the mutex is dealt in the DAPM
> >> - the power state is tracked by the DAPM
> > 
> > This breaks headphone on N900. So far I only checked, that
> > tpa6130a2_power_event() is not called. I guess the DAPM
> > graph is wrong.
> 
> Can you try with
> 
> + {"TPA6130A2 HPLEFT", NULL, "LLOUT"},
> + {"TPA6130A2 HPRIGHT", NULL, "RLOUT"}
> 
> being replaced with
> 
> +   {"TPA6130A2 LEFTIN", NULL, "LLOUT"},
> +   {"TPA6130A2 RIGHTIN", NULL, "RLOUT"},

With that change:

Tested-By: Sebastian Reichel 
Reviewed-By: Sebastian Reichel 

-- Sebastian


signature.asc
Description: PGP signature

Re: [PATCH 4/4] ASoC: tpa6130a2: Add DAPM support

2016-06-19 Thread Sebastian Reichel

Hi,

On Sun, Jun 19, 2016 at 11:39:15AM +0200, Lars-Peter Clausen wrote:
> On 06/19/2016 01:21 AM, Sebastian Reichel wrote:
> > On Sat, Jun 18, 2016 at 12:26:31AM -0300, Helen Koike wrote:
> >> Add DAPM support and updated rx51 accordingly.
> >> As a consequence:
> >> - the exported function tpa6130a2_stereo_enable is not needed anymore
> >> - the mutex is dealt in the DAPM
> >> - the power state is tracked by the DAPM
> > 
> > This breaks headphone on N900. So far I only checked, that
> > tpa6130a2_power_event() is not called. I guess the DAPM
> > graph is wrong.
> 
> Can you try with
> 
> + {"TPA6130A2 HPLEFT", NULL, "LLOUT"},
> + {"TPA6130A2 HPRIGHT", NULL, "RLOUT"}
> 
> being replaced with
> 
> +   {"TPA6130A2 LEFTIN", NULL, "LLOUT"},
> +   {"TPA6130A2 RIGHTIN", NULL, "RLOUT"},

With that change:

Tested-By: Sebastian Reichel 
Reviewed-By: Sebastian Reichel 

-- Sebastian


signature.asc
Description: PGP signature

Re: [RFC] capabilities: add capability cgroup controller

2016-06-19 Thread serge

apologies for top posting, this phone doesn't support inline)

Where are you preventing less privileged tasks from limiting the caps of a more 
privileged task?  It looks like you are relying on the cgroupfs for that?

Overall I'm not a fan of this for several reasons.  Can you tell us precisely 
what your use case is?
On 6/18/16 14:31 Topi Miettinen wrote:
Add a new cgroup controller for enforcement of and monitoring of
capabilities in the cgroup.

Test case (boot to rdshell);
BusyBox v1.22.1 (Debian 1:1.22.0-19) built-in shell (ash)
Enter 'help' for a list of built-in commands.

(initramfs) cd /sys/fs
(initramfs) mount -t cgroup2 cgroup cgroup
(initramfs) cd cgroup
(initramfs) echo +capability > cgroup.subtree_control
(initramfs) mkdir test; cd test
(initramfs) ls
capability.bounding_set  cgroup.controllers   cgroup.procs
capability.used  cgroup.eventscgroup.subtree_control
(initramfs) sh

BusyBox v1.22.1 (Debian 1:1.22.0-19) built-in shell (ash)
Enter 'help' for a list of built-in commands.

(initramfs) echo $$ >cgroup.procs
(initramfs) cat capability.used

(initramfs) mknod /dev/z1 c 1 2
(initramfs) cat capability.used
0800
(initramfs) exit
(initramfs) echo  > capability.bounding_set
(initramfs) sh

BusyBox v1.22.1 (Debian 1:1.22.0-19) built-in shell (ash)
Enter 'help' for a list of built-in commands.

(initramfs) echo $$ >cgroup.procs
(initramfs) mknod /dev/z2 c 1 2
mknod: /dev/z2: Operation not permitted
(initramfs) exit

Signed-off-by: Topi Miettinen 
---
 include/linux/capability_cgroup.h |   7 ++
 include/linux/cgroup_subsys.h |   4 +
 init/Kconfig  |   6 ++
 kernel/capability.c   |   2 +
 security/Makefile |   1 +
 security/capability_cgroup.c  | 216 ++
 6 files changed, 236 insertions(+)
 create mode 100644 include/linux/capability_cgroup.h
 create mode 100644 security/capability_cgroup.c

diff --git a/include/linux/capability_cgroup.h 
b/include/linux/capability_cgroup.h
new file mode 100644
index 000..c03b58d
--- /dev/null
+++ b/include/linux/capability_cgroup.h
@@ -0,0 +1,7 @@
+#ifdef CONFIG_CGROUP_CAPABILITY
+void capability_cgroup_update_used(int cap);
+#else
+static inline void capability_cgroup_update_used(int cap)
+{
+}
+#endif
diff --git a/include/linux/cgroup_subsys.h b/include/linux/cgroup_subsys.h
index 0df0336a..a5161d0 100644
--- a/include/linux/cgroup_subsys.h
+++ b/include/linux/cgroup_subsys.h
@@ -56,6 +56,10 @@ SUBSYS(hugetlb)
 SUBSYS(pids)
 #endif
 
+#if IS_ENABLED(CONFIG_CGROUP_CAPABILITY)
+SUBSYS(capability)
+#endif
+
 /*
  * The following subsystems are not supported on the default hierarchy.
  */
diff --git a/init/Kconfig b/init/Kconfig
index f755a60..098ce66 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1141,6 +1141,12 @@ config CGROUP_PERF
 
  Say N if unsure.
 
+config CGROUP_CAPABILITY
+   bool "Capability controller"
+   help
+ Provides a simple controller for enforcement of and monitoring of
+ capabilities in the cgroup.
+
 config CGROUP_DEBUG
bool "Example controller"
default n
diff --git a/kernel/capability.c b/kernel/capability.c
index 45432b5..b57d7f9 100644
--- a/kernel/capability.c
+++ b/kernel/capability.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 /*
@@ -380,6 +381,7 @@ bool ns_capable(struct user_namespace *ns, int cap)
}
 
if (security_capable(current_cred(), ns, cap) == 0) {
+   capability_cgroup_update_used(cap);
current->flags |= PF_SUPERPRIV;
return true;
}
diff --git a/security/Makefile b/security/Makefile
index f2d71cd..2bb04f1 100644
--- a/security/Makefile
+++ b/security/Makefile
@@ -25,6 +25,7 @@ obj-$(CONFIG_SECURITY_APPARMOR)   += apparmor/
 obj-$(CONFIG_SECURITY_YAMA)+= yama/
 obj-$(CONFIG_SECURITY_LOADPIN) += loadpin/
 obj-$(CONFIG_CGROUP_DEVICE)+= device_cgroup.o
+obj-$(CONFIG_CGROUP_CAPABILITY)+= capability_cgroup.o
 
 # Object integrity file lists
 subdir-$(CONFIG_INTEGRITY) += integrity
diff --git a/security/capability_cgroup.c b/security/capability_cgroup.c
new file mode 100644
index 000..6e03fce
--- /dev/null
+++ b/security/capability_cgroup.c
@@ -0,0 +1,216 @@
+/*
+ * Capability cgroup
+ *
+ * Copyright 2016 Topi Miettinen
+ *
+ * This file is subject to the terms and conditions of the GNU General
+ * Public License.  See the file COPYING in the main directory of the
+ * Linux distribution for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static DEFINE_MUTEX(capcg_mutex);
+
+struct capcg_cgroup {
+   struct cgroup_subsys_state css;
+   kernel_cap_t cap_bset; /* Capability bounding set */
+   kernel_cap_t cap_used; /* Capabilities actually used */
+};
+
+static inline

Re: [RFC] capabilities: add capability cgroup controller

2016-06-19 Thread serge

apologies for top posting, this phone doesn't support inline)

Where are you preventing less privileged tasks from limiting the caps of a more 
privileged task?  It looks like you are relying on the cgroupfs for that?

Overall I'm not a fan of this for several reasons.  Can you tell us precisely 
what your use case is?
On 6/18/16 14:31 Topi Miettinen wrote:
Add a new cgroup controller for enforcement of and monitoring of
capabilities in the cgroup.

Test case (boot to rdshell);
BusyBox v1.22.1 (Debian 1:1.22.0-19) built-in shell (ash)
Enter 'help' for a list of built-in commands.

(initramfs) cd /sys/fs
(initramfs) mount -t cgroup2 cgroup cgroup
(initramfs) cd cgroup
(initramfs) echo +capability > cgroup.subtree_control
(initramfs) mkdir test; cd test
(initramfs) ls
capability.bounding_set  cgroup.controllers   cgroup.procs
capability.used  cgroup.eventscgroup.subtree_control
(initramfs) sh

BusyBox v1.22.1 (Debian 1:1.22.0-19) built-in shell (ash)
Enter 'help' for a list of built-in commands.

(initramfs) echo $$ >cgroup.procs
(initramfs) cat capability.used

(initramfs) mknod /dev/z1 c 1 2
(initramfs) cat capability.used
0800
(initramfs) exit
(initramfs) echo  > capability.bounding_set
(initramfs) sh

BusyBox v1.22.1 (Debian 1:1.22.0-19) built-in shell (ash)
Enter 'help' for a list of built-in commands.

(initramfs) echo $$ >cgroup.procs
(initramfs) mknod /dev/z2 c 1 2
mknod: /dev/z2: Operation not permitted
(initramfs) exit

Signed-off-by: Topi Miettinen 
---
 include/linux/capability_cgroup.h |   7 ++
 include/linux/cgroup_subsys.h |   4 +
 init/Kconfig  |   6 ++
 kernel/capability.c   |   2 +
 security/Makefile |   1 +
 security/capability_cgroup.c  | 216 ++
 6 files changed, 236 insertions(+)
 create mode 100644 include/linux/capability_cgroup.h
 create mode 100644 security/capability_cgroup.c

diff --git a/include/linux/capability_cgroup.h 
b/include/linux/capability_cgroup.h
new file mode 100644
index 000..c03b58d
--- /dev/null
+++ b/include/linux/capability_cgroup.h
@@ -0,0 +1,7 @@
+#ifdef CONFIG_CGROUP_CAPABILITY
+void capability_cgroup_update_used(int cap);
+#else
+static inline void capability_cgroup_update_used(int cap)
+{
+}
+#endif
diff --git a/include/linux/cgroup_subsys.h b/include/linux/cgroup_subsys.h
index 0df0336a..a5161d0 100644
--- a/include/linux/cgroup_subsys.h
+++ b/include/linux/cgroup_subsys.h
@@ -56,6 +56,10 @@ SUBSYS(hugetlb)
 SUBSYS(pids)
 #endif
 
+#if IS_ENABLED(CONFIG_CGROUP_CAPABILITY)
+SUBSYS(capability)
+#endif
+
 /*
  * The following subsystems are not supported on the default hierarchy.
  */
diff --git a/init/Kconfig b/init/Kconfig
index f755a60..098ce66 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1141,6 +1141,12 @@ config CGROUP_PERF
 
  Say N if unsure.
 
+config CGROUP_CAPABILITY
+   bool "Capability controller"
+   help
+ Provides a simple controller for enforcement of and monitoring of
+ capabilities in the cgroup.
+
 config CGROUP_DEBUG
bool "Example controller"
default n
diff --git a/kernel/capability.c b/kernel/capability.c
index 45432b5..b57d7f9 100644
--- a/kernel/capability.c
+++ b/kernel/capability.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 /*
@@ -380,6 +381,7 @@ bool ns_capable(struct user_namespace *ns, int cap)
}
 
if (security_capable(current_cred(), ns, cap) == 0) {
+   capability_cgroup_update_used(cap);
current->flags |= PF_SUPERPRIV;
return true;
}
diff --git a/security/Makefile b/security/Makefile
index f2d71cd..2bb04f1 100644
--- a/security/Makefile
+++ b/security/Makefile
@@ -25,6 +25,7 @@ obj-$(CONFIG_SECURITY_APPARMOR)   += apparmor/
 obj-$(CONFIG_SECURITY_YAMA)+= yama/
 obj-$(CONFIG_SECURITY_LOADPIN) += loadpin/
 obj-$(CONFIG_CGROUP_DEVICE)+= device_cgroup.o
+obj-$(CONFIG_CGROUP_CAPABILITY)+= capability_cgroup.o
 
 # Object integrity file lists
 subdir-$(CONFIG_INTEGRITY) += integrity
diff --git a/security/capability_cgroup.c b/security/capability_cgroup.c
new file mode 100644
index 000..6e03fce
--- /dev/null
+++ b/security/capability_cgroup.c
@@ -0,0 +1,216 @@
+/*
+ * Capability cgroup
+ *
+ * Copyright 2016 Topi Miettinen
+ *
+ * This file is subject to the terms and conditions of the GNU General
+ * Public License.  See the file COPYING in the main directory of the
+ * Linux distribution for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static DEFINE_MUTEX(capcg_mutex);
+
+struct capcg_cgroup {
+   struct cgroup_subsys_state css;
+   kernel_cap_t cap_bset; /* Capability bounding set */
+   kernel_cap_t cap_used; /* Capabilities actually used */
+};
+
+static inline struct capcg_cgroup

Re: [PATCH 2/3] staging: luster: Checkpatch Cleanup

2016-06-19 Thread Craig Inches

I did build the kernel 1 for each change, then a final with all changes
applied.

Apologies if I missed something,

Craig

On Sat, Jun 18, 2016 at 07:02:35PM -0700, Greg Kroah-Hartman wrote:
> On Sat, Jun 18, 2016 at 10:25:55PM +0100, Craig Inches wrote:
> > Macros with complex values should be enclosed in parenthesis
> > 
> > Signed-off-by: Craig Inches 
> > ---
> >  drivers/staging/lustre/lustre/include/cl_object.h | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> ALWAYS test build your patches, to not do so just makes maintainers
> grumpy...
> 
> Remember, checkpatch is a _hint_, it's not always right.
> 
> greg k-h

Re: [PATCH 2/3] staging: luster: Checkpatch Cleanup

2016-06-19 Thread Craig Inches

I did build the kernel 1 for each change, then a final with all changes
applied.

Apologies if I missed something,

Craig

On Sat, Jun 18, 2016 at 07:02:35PM -0700, Greg Kroah-Hartman wrote:
> On Sat, Jun 18, 2016 at 10:25:55PM +0100, Craig Inches wrote:
> > Macros with complex values should be enclosed in parenthesis
> > 
> > Signed-off-by: Craig Inches 
> > ---
> >  drivers/staging/lustre/lustre/include/cl_object.h | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> ALWAYS test build your patches, to not do so just makes maintainers
> grumpy...
> 
> Remember, checkpatch is a _hint_, it's not always right.
> 
> greg k-h

Re: [PATCH 1/1] iio: fix config watermark initial value

2016-06-19 Thread Jonathan Cameron

On 16/06/16 17:19, Lars-Peter Clausen wrote:
> On 06/16/2016 06:16 PM, Jonathan Cameron wrote:
>>
>>
>> On 16 June 2016 12:56:11 BST, Daniel Baluta  wrote:
>>> On Mon, Mar 28, 2016 at 1:02 PM, Jonathan Cameron 
>>> wrote:
 On 24/03/16 09:23, Lars-Peter Clausen wrote:
> On 03/24/2016 10:09 AM, Irina Tirdea wrote:
>> config structure is set to 0 when updating the buffers, so by
>> default config->watermark will be 0. When computing the minimum
>> between config->watermark and the buffer->watermark or
>> insert_buffer-watermark, this will always be 0 regardless of the
>> value set by the user for the buffer.
>>
>> Set as initial value for config->watermark the maximum allowed
>> value so that the minimum value will always be set from one of the
>> buffers.
>>
>> Signed-off-by: Irina Tirdea 
>
> Looks good. This bug was my fault, sorry.
>
> Fixes: f0566c0c405d ("iio: Set device watermark based on watermark
>>> of all
> attached buffers")
 Applied to the fixes-togreg-post-rc1 branch of iio.git and marked for
>>> stable.

 Thanks,

 Jonathan
>
>> ---
>>  drivers/iio/industrialio-buffer.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/iio/industrialio-buffer.c
>>> b/drivers/iio/industrialio-buffer.c
>> index b976332..90462fc 100644
>> --- a/drivers/iio/industrialio-buffer.c
>> +++ b/drivers/iio/industrialio-buffer.c
>> @@ -653,6 +653,7 @@ static int iio_verify_update(struct iio_dev
>>> *indio_dev,
>>  unsigned int modes;
>>
>>  memset(config, 0, sizeof(*config));
>> +config->watermark = ~0;
>>
>>  /*
>>   * If there is just one buffer and we are removing it there is
>>> nothing
>>
>

 --
>>>
>>>
>>> Hi Jonathan,
>>>
>>> After a discussion with Laurentiu (Cc'ed), we noticed that this patch
>>> is not in the latest IIO (testing, togreg) tree.
>>>
>>> What is the path of an IIO fixes patch?
>>>
>>> thanks,
>>> Daniel.
>> Should be fixes-togreg then staging-linus then mainline.
> 
> To add to that once it is in mainline it will trickle down again through
> staging/next to iio/togreg. So there might be a fair amount of pipeline
> delay until it arrives in iio/{togreg,testing}.
> 
It's now in staging-next which is the upstream for togreg.
I've been a bit snowed under this week so it might be a little while yet
before I get my next pull request out to Greg and then do a fast
forward merge which will pull that change in.

Thanks,

Jonathan

Re: [PATCH 1/1] iio: fix config watermark initial value

2016-06-19 Thread Jonathan Cameron

On 16/06/16 17:19, Lars-Peter Clausen wrote:
> On 06/16/2016 06:16 PM, Jonathan Cameron wrote:
>>
>>
>> On 16 June 2016 12:56:11 BST, Daniel Baluta  wrote:
>>> On Mon, Mar 28, 2016 at 1:02 PM, Jonathan Cameron 
>>> wrote:
 On 24/03/16 09:23, Lars-Peter Clausen wrote:
> On 03/24/2016 10:09 AM, Irina Tirdea wrote:
>> config structure is set to 0 when updating the buffers, so by
>> default config->watermark will be 0. When computing the minimum
>> between config->watermark and the buffer->watermark or
>> insert_buffer-watermark, this will always be 0 regardless of the
>> value set by the user for the buffer.
>>
>> Set as initial value for config->watermark the maximum allowed
>> value so that the minimum value will always be set from one of the
>> buffers.
>>
>> Signed-off-by: Irina Tirdea 
>
> Looks good. This bug was my fault, sorry.
>
> Fixes: f0566c0c405d ("iio: Set device watermark based on watermark
>>> of all
> attached buffers")
 Applied to the fixes-togreg-post-rc1 branch of iio.git and marked for
>>> stable.

 Thanks,

 Jonathan
>
>> ---
>>  drivers/iio/industrialio-buffer.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/iio/industrialio-buffer.c
>>> b/drivers/iio/industrialio-buffer.c
>> index b976332..90462fc 100644
>> --- a/drivers/iio/industrialio-buffer.c
>> +++ b/drivers/iio/industrialio-buffer.c
>> @@ -653,6 +653,7 @@ static int iio_verify_update(struct iio_dev
>>> *indio_dev,
>>  unsigned int modes;
>>
>>  memset(config, 0, sizeof(*config));
>> +config->watermark = ~0;
>>
>>  /*
>>   * If there is just one buffer and we are removing it there is
>>> nothing
>>
>

 --
>>>
>>>
>>> Hi Jonathan,
>>>
>>> After a discussion with Laurentiu (Cc'ed), we noticed that this patch
>>> is not in the latest IIO (testing, togreg) tree.
>>>
>>> What is the path of an IIO fixes patch?
>>>
>>> thanks,
>>> Daniel.
>> Should be fixes-togreg then staging-linus then mainline.
> 
> To add to that once it is in mainline it will trickle down again through
> staging/next to iio/togreg. So there might be a fair amount of pipeline
> delay until it arrives in iio/{togreg,testing}.
> 
It's now in staging-next which is the upstream for togreg.
I've been a bit snowed under this week so it might be a little while yet
before I get my next pull request out to Greg and then do a fast
forward merge which will pull that change in.

Thanks,

Jonathan

Re: [RFC] regmap: Add regmap_pipe_read API

2016-06-19 Thread Jonathan Cameron

On 17/06/16 09:05, Lars-Peter Clausen wrote:
> On 06/17/2016 09:04 AM, Crestez Dan Leonard wrote:
>> On 06/16/2016 06:43 PM, Geert Uytterhoeven wrote:
>>> Hi Leonard,
>>>
>>> On Thu, Jun 16, 2016 at 5:24 PM, Crestez Dan Leonard
>>>  wrote:
 The regmap API usually assumes that bulk read operations will read a
 range of registers but some I2C/SPI devices have certain registers for
 which a such a read operation will return data from an internal FIFO
 instead. Add an explicit API to support bulk read with pipe rather than
 range semantics.
>>>
>>> Please settle on either "fifo" or "pipe", instead of mixing both.
>>> Personally, I prefer the former.
>>
>> Well, it doesn't have to be a fifo. The device can return data from some
>> other kind of buffer (maybe a stack). I can adjust the documentation to
>> clarify.
>>
>> I considered naming it something like regmap_multi_read_one_reg or
>> something but regmap_pipe_read sounds reasonable and short.
> 
> stream might be another option, but pipe is ok in my opinion.
> 
I'm not fussy on naming, but definitely support having the functionality!

Jonathan

Re: [RFC] regmap: Add regmap_pipe_read API

2016-06-19 Thread Jonathan Cameron

On 17/06/16 09:05, Lars-Peter Clausen wrote:
> On 06/17/2016 09:04 AM, Crestez Dan Leonard wrote:
>> On 06/16/2016 06:43 PM, Geert Uytterhoeven wrote:
>>> Hi Leonard,
>>>
>>> On Thu, Jun 16, 2016 at 5:24 PM, Crestez Dan Leonard
>>>  wrote:
 The regmap API usually assumes that bulk read operations will read a
 range of registers but some I2C/SPI devices have certain registers for
 which a such a read operation will return data from an internal FIFO
 instead. Add an explicit API to support bulk read with pipe rather than
 range semantics.
>>>
>>> Please settle on either "fifo" or "pipe", instead of mixing both.
>>> Personally, I prefer the former.
>>
>> Well, it doesn't have to be a fifo. The device can return data from some
>> other kind of buffer (maybe a stack). I can adjust the documentation to
>> clarify.
>>
>> I considered naming it something like regmap_multi_read_one_reg or
>> something but regmap_pipe_read sounds reasonable and short.
> 
> stream might be another option, but pipe is ok in my opinion.
> 
I'm not fussy on naming, but definitely support having the functionality!

Jonathan

< 1 2 3 4 5 6 >

301 - 400 of 592 matches

Mail list logo